Jeroen Ruigrok van der Werven in-nomine.org> writes:
>
> So on medium and large datasets Bjoern's decoder is very interesting,
> but the tiny case (just Bjoern's name) is a tad slower. The other
> cases seem more typical of what the average use in Python would be.
Keep in mind wh
On [20090414 16:43], Antoine Pitrou (solip...@pitrou.net) wrote:
>If you have some time on your hands, you could try benchmarking it against
>Python 3.1's (py3k) decoder. There are two cases to consider:
Bjoern actually did it himself already:
http://bjoern.hoehrmann.de/utf-8/decoder/dfa/#perfor
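
For anyone who wants to reproduce the Python side of such a comparison, here is a minimal sketch that times the built-in UTF-8 decoder on tiny, medium, and large inputs using the standard `timeit` module. The sample strings, their sizes, and the helper names `time_decode` and `run_benchmark` are assumptions made for illustration; they are not the datasets or harness used in Bjoern's benchmark.

```python
import timeit

# Hypothetical sample inputs; the sizes are illustrative assumptions only.
SAMPLES = {
    "tiny": "Björn Höhrmann".encode("utf-8"),
    "medium": ("Björn Höhrmann " * 1_000).encode("utf-8"),
    "large": ("Björn Höhrmann " * 100_000).encode("utf-8"),
}


def time_decode(data, repeat=5, number=100):
    """Return the best per-call time in seconds for data.decode('utf-8')."""
    timings = timeit.repeat(
        lambda: data.decode("utf-8"), repeat=repeat, number=number
    )
    # The minimum over the repeats is the measurement least disturbed by noise.
    return min(timings) / number


def run_benchmark(samples=SAMPLES):
    """Map each sample name to (size in bytes, best seconds per decode)."""
    return {
        name: (len(data), time_decode(data)) for name, data in samples.items()
    }


if __name__ == "__main__":
    for name, (size, seconds) in run_benchmark().items():
        print(f"{name:>6}: {size:>9} bytes, {seconds * 1e6:10.2f} us per decode")
```

Dividing the time by the input size gives a per-byte figure, which makes the tiny case easier to compare against the larger ones, since fixed call overhead dominates there.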
Jeroen Ruigrok van der Werven in-nomine.org> writes:
>
> This got posted on the Unicode list. Does it seem interesting for Python
> itself? The UTF-8 to UTF-16 transcoding might be.
>
> http://bjoern.hoehrmann.de/utf-8/decoder/dfa/
If you have some time on your hands, you could try benchmarking
[Note: I haven't looked thoroughly at our handling yet, which is why I raise
the question.]
This got posted on the Unicode list. Does it seem interesting for Python
itself? The UTF-8 to UTF-16 transcoding might be.
http://bjoern.hoehrmann.de/utf-8/decoder/dfa/
--
Jeroen Ruigrok van der Werven / as