Re: [Python-Dev] UTF-8 Decoder

2009-04-27 Thread Antoine Pitrou
Jeroen Ruigrok van der Werven writes:
>
> So on medium and large datasets the decoder of Bjoern is very interesting,
> but the tiny case (just Bjoern's name) is quite a bit slower. The other
> cases seem more typical of what the average use in Python would be.
Keep in mind wh
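For anyone who wants to reproduce the tiny-versus-large comparison locally, here is a minimal sketch that times only Python's built-in UTF-8 decoder with the standard `timeit` module. It does not include Bjoern's DFA decoder, and the helper name `time_decode`, the sample strings, and the repeat counts are all assumptions chosen for illustration, not the inputs used in the benchmarks discussed in this thread.

```python
import timeit


def time_decode(data: bytes, number: int = 1000) -> float:
    """Return the average seconds per call of data.decode('utf-8')."""
    total = timeit.timeit(lambda: data.decode("utf-8"), number=number)
    return total / number


# Hypothetical inputs: a tiny string (a name) and a larger repeated text.
tiny = "Björn Höhrmann".encode("utf-8")
large = ("Björn Höhrmann " * 10_000).encode("utf-8")

for label, data in (("tiny", tiny), ("large", large)):
    per_call = time_decode(data, number=200)
    print(f"{label}: {len(data)} bytes, {per_call * 1e6:.2f} us per call")
```

On tiny inputs the fixed per-call overhead (argument handling, object allocation) tends to dominate, so the numbers for the two cases are best compared separately rather than averaged together.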

Re: [Python-Dev] UTF-8 Decoder

2009-04-27 Thread Jeroen Ruigrok van der Werven
-On [20090414 16:43], Antoine Pitrou (solip...@pitrou.net) wrote:
>If you have some time on your hands, you could try benchmarking it against
>Python 3.1's (py3k) decoder. There are two cases to consider:
Bjoern actually did it himself already:
http://bjoern.hoehrmann.de/utf-8/decoder/dfa/#perfor

Re: [Python-Dev] UTF-8 Decoder

2009-04-14 Thread Antoine Pitrou
Jeroen Ruigrok van der Werven writes:
>
> This got posted on the Unicode list. Does it seem interesting for Python
> itself? The UTF-8 to UTF-16 transcoding might be.
>
> http://bjoern.hoehrmann.de/utf-8/decoder/dfa/
If you have some time on your hands, you could try benchmarking

[Python-Dev] UTF-8 Decoder

2009-04-13 Thread Jeroen Ruigrok van der Werven
[Note: I haven't looked thoroughly at our handling yet, hence the question.]
This got posted on the Unicode list. Does it seem interesting for Python
itself? The UTF-8 to UTF-16 transcoding might be.
http://bjoern.hoehrmann.de/utf-8/decoder/dfa/
--
Jeroen Ruigrok van der Werven / as