Re: Encode UTF-8 optimizations

2016-11-01 Thread pali
Hi! New Encode 2.87 with lots of fixes for Encode.xs and Encode::MIME::Header was released. Can you sync/import it into blead?

Re: Encode UTF-8 optimizations

2016-10-27 Thread pali
On Sunday 25 September 2016 10:49:41 Karl Williamson wrote:
> On 09/25/2016 04:06 AM, p...@cpan.org wrote:
> >On Thursday 01 September 2016 09:30:08 p...@cpan.org wrote:
> >>On Wednesday 31 August 2016 21:27:37 Karl Williamson wrote:
> >>>We may change Encode in blead too, since it already differs

Re: Encode UTF-8 optimizations

2016-09-25 Thread pali
On Thursday 01 September 2016 09:30:08 p...@cpan.org wrote:
> On Wednesday 31 August 2016 21:27:37 Karl Williamson wrote:
> > We may change Encode in blead too, since it already differs from
> > cpan. I'll have to get Sawyer's opinion on that. But the next
> > step is for me to fix Devel::PPPort

Re: Encode UTF-8 optimizations

2016-08-31 Thread pali
On Monday 29 August 2016 17:00:00 Karl Williamson wrote:
> If you'd be willing to test this out, especially the performance
> parts that would be great!
[snip]
> There are 2 experimental performance commits. If you want to see if
> they actually improve performance by doing a before/after compare

Re: Encode UTF-8 optimizations

2016-08-25 Thread pali
On Wednesday 24 August 2016 22:49:21 Karl Williamson wrote:
> On 08/22/2016 02:47 PM, p...@cpan.org wrote:
>
> snip
>
> >I added some tests for overlong sequences. Only for ASCII platforms, tests
> >for EBCDIC
> >are missing (sorry, I do not have access to any EBCDIC platform for testing).

Re: Encode UTF-8 optimizations

2016-08-22 Thread pali
(this only applies for strict UTF-8)
On Monday 22 August 2016 23:19:51 Karl Williamson wrote:
> The code could be tweaked to call UTF8_IS_SUPER first, but I'm
> asserting that an optimizing compiler will see that any call to
> is_utf8_char_slow() is pointless, and will optimize it out. Such

Re: Encode UTF-8 optimizations

2016-08-22 Thread Karl Williamson
On 08/22/2016 03:19 PM, Karl Williamson wrote:
On 08/22/2016 02:47 PM, p...@cpan.org wrote:
> And I think you misunderstand when is_utf8_char_slow() is called. It is
> called only when the next byte in the input indicates that the only
> legal UTF-8 that might follow would be for a code point

Re: Encode UTF-8 optimizations

2016-08-22 Thread Karl Williamson
On 08/22/2016 02:47 PM, p...@cpan.org wrote:
> And I think you misunderstand when is_utf8_char_slow() is called. It is
> called only when the next byte in the input indicates that the only
> legal UTF-8 that might follow would be for a code point that is at least
> U+20, almost twice as

Re: Encode UTF-8 optimizations

2016-08-22 Thread Karl Williamson
On 08/22/2016 07:05 AM, p...@cpan.org wrote:
> On Sunday 21 August 2016 08:49:08 Karl Williamson wrote:
> > On 08/21/2016 02:34 AM, p...@cpan.org wrote:
> > > On Sunday 21 August 2016 03:10:40 Karl Williamson wrote:
> > > > Top posting. Attached is my alternative patch. It effectively uses a different

Re: Encode UTF-8 optimizations

2016-08-22 Thread pali
On Sunday 21 August 2016 08:49:08 Karl Williamson wrote:
> On 08/21/2016 02:34 AM, p...@cpan.org wrote:
> >On Sunday 21 August 2016 03:10:40 Karl Williamson wrote:
> >>Top posting.
> >>
> >>Attached is my alternative patch. It effectively uses a different
> >>algorithm to avoid decoding the input

Re: Encode UTF-8 optimizations

2016-08-21 Thread pali
On Sunday 21 August 2016 03:10:40 Karl Williamson wrote:
> Top posting.
>
> Attached is my alternative patch. It effectively uses a different
> algorithm to avoid decoding the input into code points, and to copy
> all spans of valid input at once, instead of character at a time.
>
> And it uses

Re: Encode UTF-8 optimizations

2016-08-20 Thread Karl Williamson
On 08/20/2016 08:33 PM, Aristotle Pagaltzis wrote:
> * Karl Williamson [2016-08-21 03:12]:
> > That should be done anyway to make sure we've got less buggy Unicode
> > handling code available to older modules.
> I think you meant “available to older perls”?
Yes, thanks

Re: Encode UTF-8 optimizations

2016-08-20 Thread Aristotle Pagaltzis
* Karl Williamson [2016-08-21 03:12]:
> That should be done anyway to make sure we've got less buggy Unicode
> handling code available to older modules.
I think you meant “available to older perls”?

Re: Encode UTF-8 optimizations

2016-08-20 Thread Karl Williamson
Top posting. Attached is my alternative patch. It effectively uses a different algorithm to avoid decoding the input into code points, and to copy all spans of valid input at once, instead of character at a time. And it uses only currently available functions. Any of these that are
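The span-copying idea described in the message above (validate without decoding into code points, then copy each run of valid input in one go) can be illustrated with a minimal sketch. This is not the patch from the thread and not Encode.xs code; it is written in Python for brevity, the function names are hypothetical, and the choice to substitute U+FFFD for each malformed byte is an assumption made only for this example.

```python
# Illustrative sketch only: hypothetical names, not Encode's implementation.
REPLACEMENT = "\ufffd".encode("utf-8")  # assumed policy: one U+FFFD per bad byte


def valid_sequence_length(buf: bytes, i: int) -> int:
    """Length of the strict UTF-8 sequence starting at buf[i], or 0 if malformed.

    Works on raw bytes; no code point is ever computed.
    """
    b0 = buf[i]
    if b0 < 0x80:
        return 1
    # (continuation bytes needed, allowed range of the first continuation byte)
    if 0xC2 <= b0 <= 0xDF:
        need, lo, hi = 1, 0x80, 0xBF
    elif b0 == 0xE0:
        need, lo, hi = 2, 0xA0, 0xBF  # excludes overlong 3-byte forms
    elif b0 == 0xED:
        need, lo, hi = 2, 0x80, 0x9F  # excludes surrogates U+D800..U+DFFF
    elif 0xE1 <= b0 <= 0xEF:
        need, lo, hi = 2, 0x80, 0xBF
    elif b0 == 0xF0:
        need, lo, hi = 3, 0x90, 0xBF  # excludes overlong 4-byte forms
    elif 0xF1 <= b0 <= 0xF3:
        need, lo, hi = 3, 0x80, 0xBF
    elif b0 == 0xF4:
        need, lo, hi = 3, 0x80, 0x8F  # excludes code points above U+10FFFF
    else:
        return 0  # 0x80..0xC1 and 0xF5..0xFF never start a valid sequence
    if i + need >= len(buf):
        return 0  # sequence is truncated
    if not lo <= buf[i + 1] <= hi:
        return 0
    for k in range(2, need + 1):
        if not 0x80 <= buf[i + k] <= 0xBF:
            return 0
    return need + 1


def copy_valid_spans(src: bytes) -> bytes:
    """Copy src, moving each maximal run of valid input with a single slice."""
    out = bytearray()
    span_start = 0
    i = 0
    while i < len(src):
        length = valid_sequence_length(src, i)
        if length:
            i += length  # still inside a valid span; nothing is copied yet
            continue
        out += src[span_start:i]  # flush the whole valid span at once
        out += REPLACEMENT
        i += 1
        span_start = i
    out += src[span_start:]  # flush the trailing valid span
    return bytes(out)
```

For fully valid input the loop never copies anything until the final slice, which is where the saving over character-at-a-time copying would come from.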

Re: Encode UTF-8 optimizations

2016-08-19 Thread pali
On Thursday 18 August 2016 23:06:27 Karl Williamson wrote:
> On 08/12/2016 09:31 AM, p...@cpan.org wrote:
> >On Thursday 11 August 2016 17:41:23 Karl Williamson wrote:
> >>On 07/09/2016 05:12 PM, p...@cpan.org wrote:
> >>>Hi! As we know utf8::encode() does not provide correct UTF-8 encoding

Re: Encode UTF-8 optimizations

2016-08-18 Thread Karl Williamson
On 08/12/2016 09:31 AM, p...@cpan.org wrote:
> On Thursday 11 August 2016 17:41:23 Karl Williamson wrote:
> > On 07/09/2016 05:12 PM, p...@cpan.org wrote:
> > > Hi! As we know utf8::encode() does not provide correct UTF-8 encoding
> > > and Encode::encode("UTF-8", ...) should be used instead. Also opening file

Re: Encode UTF-8 optimizations

2016-08-12 Thread pali
On Thursday 11 August 2016 17:41:23 Karl Williamson wrote:
> On 07/09/2016 05:12 PM, p...@cpan.org wrote:
> >Hi! As we know utf8::encode() does not provide correct UTF-8 encoding
> >and Encode::encode("UTF-8", ...) should be used instead. Also opening
> >file should be done by :encoding(UTF-8)

Re: Encode UTF-8 optimizations

2016-08-11 Thread Karl Williamson
On 07/09/2016 05:12 PM, p...@cpan.org wrote:
> Hi! As we know utf8::encode() does not provide correct UTF-8 encoding
> and Encode::encode("UTF-8", ...) should be used instead. Also opening
> file should be done by :encoding(UTF-8) layer instead :utf8. But UTF-8
> strict implementation in Encode module