[Neal Norwitz]
> In import.c starting around line 1210 (I removed a bunch of code that
> doesn't matter for the problem):
>
>     if (PyUnicode_Check(v)) {
>         copy = PyUnicode_Encode(PyUnicode_AS_UNICODE(v),
>                                 PyUnicode_GET_SIZE(v),
>                                 Py_FileSystemDefaultEncoding, NULL);
>         v = copy;
>     }
>     len = PyString_GET_SIZE(v);
>     if (len + 2 + namelen + MAXSUFFIXSIZE >= buflen) {
>         Py_XDECREF(copy);
>         continue; /* Too long */
>     }
>     strcpy(buf, PyString_AS_STRING(v));
>
> ***
> So if v is originally unicode, then copy is unicode from the second
> line, right?

No. An encoded unicode string is of type str, and PyUnicode_Encode()
returns an encoded string. Like so:

    >>> u"\u1122".encode('utf-8')
    '\xe1\x84\xa2'
    >>> type(_)
    <type 'str'>

> Then we assign v to copy, so v is still unicode.

Almost ;-)

> Then later on we do PyString_GET_SIZE and PyString_AS_STRING. That doesn't
> work, does it? What am I missing?

The conceptual type of the object returned by PyUnicode_Encode().
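
The same point as a small script, offered as a sketch rather than anything taken from import.c. The variable names are illustrative, and the type lookups are only there so it should run on either Python 2 or Python 3 (where the encoded result is called bytes rather than str):

```python
# Sketch: encoding a unicode object yields a byte string,
# not another unicode object.
text = u"\u1122"
encoded = text.encode("utf-8")

unicode_type = type(u"")  # unicode on Python 2, str on Python 3
byte_type = type(b"")     # str on Python 2, bytes on Python 3

assert isinstance(text, unicode_type)
assert isinstance(encoded, byte_type)
assert not isinstance(encoded, unicode_type)

# One code point in, three UTF-8 bytes out.
assert len(text) == 1
assert len(encoded) == 3
```

This is the Python-level analogue of why the PyString_* macros are fine on the object that PyUnicode_Encode() hands back.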