Hye-Shik Chang wrote:
> On 10/5/05, M.-A. Lemburg <[EMAIL PROTECTED]> wrote:
>
>>Of course, a C version could use the same approach as
>>the unicodedatabase module: that of compressed lookup
>>tables...
>>
>>   http://aggregate.org/TechPub/lcpc2002.pdf
>>
>>genccodec.py anyone ?
>>
>
>
> I had written a test codec for single byte character sets to evaluate
> algorithms to use in CJKCodecs once before (it's not a direct
> implementation of what you've mentioned, though). I just ported it
> to unicodeobject (as attached).

Thanks. Please upload the patch to SF.

Looks like we now have two competing patches: yours and the
one written by Walter.

So far you've only compared decoding strings into Unicode, and
the two patches seem to be similar in performance there. Do they
differ in encoding performance ?

> It showed relatively good results compared to the
> charmap codecs:
>
> % python ./Lib/timeit.py -s "s='a'*1024*1024; u=unicode(s)"
> "s.decode('iso8859-1')"
> 10 loops, best of 3: 96.7 msec per loop
> % ./python ./Lib/timeit.py -s "s='a'*1024*1024; u=unicode(s)"
> "s.decode('iso8859_10_fc')"
> 10 loops, best of 3: 22.7 msec per loop
> % ./python ./Lib/timeit.py -s "s='a'*1024*1024; u=unicode(s)"
> "s.decode('utf-8')"
> 100 loops, best of 3: 18.9 msec per loop
>
> (Note that it doesn't contain any documentation nor good error
> handling yet. :-)
>
>
> Hye-Shik

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 05 2005)
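
For the encoding question above, here is a minimal sketch of how both directions could be timed side by side. It is not taken from either patch. It assumes a current Python 3 interpreter (the quoted runs used Python 2's unicode()), and time_codec is a hypothetical helper name. The test codec 'iso8859_10_fc' only exists with Hye-Shik's patch applied, so the stock 'iso8859-10' charmap codec and 'utf-8' stand in for it here; the payload is also smaller than the 1 MB used in the quoted runs.

```python
import timeit


def time_codec(codec, size=64 * 1024, repeat=3, number=5):
    """Return (decode_seconds, encode_seconds) per call, best of `repeat`.

    Hypothetical helper: it mirrors the "best of 3" style of the quoted
    timeit runs, but measures encoding as well as decoding.
    """
    data = b"a" * size            # ASCII-only payload, as in the quoted runs
    text = data.decode("ascii")   # the same content as a Unicode string

    decode_best = min(
        timeit.repeat(lambda: data.decode(codec), repeat=repeat, number=number)
    ) / number
    encode_best = min(
        timeit.repeat(lambda: text.encode(codec), repeat=repeat, number=number)
    ) / number
    return decode_best, encode_best


if __name__ == "__main__":
    # Stock codecs only; add the patched codec name here when it is available.
    for codec in ("iso8859-10", "utf-8"):
        dec, enc = time_codec(codec)
        print("%-12s decode %8.3f msec   encode %8.3f msec"
              % (codec, dec * 1e3, enc * 1e3))
```

An all-'a' payload only exercises the ASCII range of the mapping, so a fair comparison of the two patches would probably also want input that covers the upper half of the character set.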