[Neal Norwitz]
> In import.c starting around line 1210 (I removed a bunch of code that
> doesn't matter for the problem):
>
> if (PyUnicode_Check(v)) {
>     copy = PyUnicode_Encode(PyUnicode_AS_UNICODE(v),
>                             PyUnicode_GET_SIZE(v),
>                             Py_FileSystemDefaultEncoding, NULL);
>     v = copy;
> }
> len = PyString_GET_SIZE(v);
> if (len + 2 + namelen + MAXSUFFIXSIZE >= buflen) {
>     Py_XDECREF(copy);
>     continue; /* Too long */
> }
> strcpy(buf, PyString_AS_STRING(v));
>
> ***
> So if v is originally unicode, then copy is unicode from the second
> line, right?
No. An encoded unicode string is of type str, and PyUnicode_Encode()
returns an encoded string. Like so:
>>> u"\u1122".encode('utf-8')
'\xe1\x84\xa2'
>>> type(_)
<type 'str'>
> Then we assign v to copy, so v is still unicode.
Almost ;-)
> Then later on we do PyString_GET_SIZE and PyString_AS_STRING. That doesn't
> work, does it? What am I missing?
The conceptual type of the object returned by PyUnicode_Encode().
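
For what it's worth, the same flow can be sketched in Python. This is only an illustration of the idea, not the code in import.c: as_path_bytes is a made-up helper name, and sys.getfilesystemencoding() is used as an assumed stand-in for Py_FileSystemDefaultEncoding. It should behave the same on Python 2.6+ (where bytes is an alias for str) and on Python 3.3+ (where the encoded result is a bytes object).

```python
# Sketch only: mirrors the flow of the import.c excerpt in Python.
# as_path_bytes is a hypothetical helper, not anything in CPython.
import sys


def as_path_bytes(v, encoding=None):
    """Return v as an encoded byte string, encoding it first if it is text."""
    if encoding is None:
        # Assumed stand-in for Py_FileSystemDefaultEncoding; fall back to UTF-8.
        encoding = sys.getfilesystemencoding() or "utf-8"
    if isinstance(v, type(u"")):    # the PyUnicode_Check(v) branch
        copy = v.encode(encoding)   # the encoded result is a byte string
        v = copy                    # v no longer refers to a unicode object
    return v


path = as_path_bytes(u"\u1122", "utf-8")
print(type(path) is type(u""))  # False
print(len(path))                # 3
```

After the rebinding, v names the encoded byte string, which is why taking its length and raw contents afterwards is fine.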