On Thu, Nov 6, 2008 at 10:26 PM, Stefan Behnel <[EMAIL PROTECTED]> wrote:

> I may be biased since I've been working on the lxml XML library for quite a
> while now, but may I ask why you use unicode strings and Py_UNICODE
> internally, instead of a UTF-8 encoded byte buffer?

The tree is built by a pure Python parser, even though the tree itself
is written in Cython. The parser already hands me the strings as unicode
objects, so it's easiest to just store a pointer to each PyObject. I
don't know whether that costs or gains any performance, but I've found
it more convenient. If there's a better way, I'd be happy to hear it.
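
To make the two options concrete, here is a rough sketch in plain Python
rather than Cython. It is not the actual tree code: the class names
RefNode and Utf8Node and the attribute "tag" are made up for
illustration, and it only shows where the copies and conversions would
happen, not how fast either approach is.

```python
class RefNode:
    """Hypothetical node that keeps the parser's string object by reference."""
    __slots__ = ("tag",)

    def __init__(self, tag):
        # No copy and no conversion: just one more reference to the same object.
        self.tag = tag


class Utf8Node:
    """Hypothetical node that stores a UTF-8 encoded byte buffer instead."""
    __slots__ = ("_tag",)

    def __init__(self, tag):
        # One encode per node when the string comes in from the parser.
        self._tag = tag.encode("utf-8")

    @property
    def tag(self):
        # One decode every time Python code asks for the string back.
        return self._tag.decode("utf-8")


# Stands in for a string handed over by the pure Python parser.
name = "".join(["para", "graph"])

ref_node = RefNode(name)
utf8_node = Utf8Node(name)

print(ref_node.tag is name)    # same object is handed back
print(utf8_node.tag == name)   # equal value, rebuilt from the byte buffer
```

With the reference approach the node returns the very object the parser
produced, while the byte-buffer approach pays for an encode on the way
in and a decode on each read from Python. Whether the buffer wins
overall presumably depends on how much of the work stays at the C level.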

I'll take a look this evening at how Cython deals with interning strings.

-Aaron
