2013/2/11 Eleytherios Stamatogiannakis <est...@gmail.com>

> Right now we are using PyPy's "codecs.utf_8_encode" and
> "codecs.utf_8_decode" to do this conversion.

It's the most direct way to use the utf-8 conversion functions.

> Is there a faster way to do these conversions (encoding, decoding) in
> PyPy? Does CPython do something more clever than PyPy, like storing
> unicodes with full ASCII char content, in an ASCII representation?

Over the years, utf-8 conversions have been heavily optimized in CPython:
allocate short buffers on the stack, use aligned reads, quick check for
ascii-only content (data & 0x80808080)... PyPy does none of these things.

But I tried some "timeit" runs, and PyPy is often faster than CPython, and
never much slower.

Do your strings have many non-ascii characters?
What's the len(utf8)/len(unicode) ratio?

--
Amaury Forgeot d'Arc
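The len(utf8)/len(unicode) ratio and the "timeit" comparison mentioned above can be checked with a small script along these lines. This is only a sketch, assuming Python 3 syntax (on a Python 2 interpreter the division and byte handling would need adjusting). The helper names utf8_ratio, is_ascii_wordwise and time_roundtrip are made up for illustration, and is_ascii_wordwise only mimics the idea behind the (data & 0x80808080) check in pure Python; it is not how CPython implements it.

```python
import codecs
import timeit


def utf8_ratio(text):
    """Return len(utf8) / len(unicode); 1.0 means the text is pure ASCII."""
    if not text:
        return 1.0  # avoid dividing by zero for the empty string
    encoded, _ = codecs.utf_8_encode(text)
    return len(encoded) / len(text)


def is_ascii_wordwise(data):
    """Illustrate the (data & 0x80808080) idea: test four bytes at a time
    for any byte that has its high bit set."""
    full = len(data) - len(data) % 4
    for i in range(0, full, 4):
        word = int.from_bytes(data[i:i + 4], "little")
        if word & 0x80808080:
            return False
    # The remaining 0-3 tail bytes are checked one by one.
    return all(b < 0x80 for b in data[full:])


def time_roundtrip(text, number=1000):
    """Time utf_8_encode and utf_8_decode separately; returns seconds."""
    encoded, _ = codecs.utf_8_encode(text)
    enc = timeit.timeit(lambda: codecs.utf_8_encode(text), number=number)
    dec = timeit.timeit(lambda: codecs.utf_8_decode(encoded), number=number)
    return enc, dec


if __name__ == "__main__":
    # Hypothetical sample data: one ASCII-only string, one mixed string.
    samples = {
        "ascii": "hello world " * 100,
        "mixed": "h\u00e9llo w\u00f6rld " * 100,
    }
    for name, text in samples.items():
        enc, dec = time_roundtrip(text)
        print(name, "ratio=%.3f" % utf8_ratio(text),
              "encode=%.6fs" % enc, "decode=%.6fs" % dec)
```

Running the same file under both CPython and PyPy gives a rough side-by-side comparison. The timings depend heavily on string length and on the share of non-ascii characters, so it is best to use samples that resemble the real data.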
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
http://mail.python.org/mailman/listinfo/pypy-dev