Martin v. Löwis wrote:
>>> I now tried, and it turned out that bytes.__reduce__ would break
>>> (again); I fixed it and changed it in r56755.
>>>
>>> It turned out that PyUnicode_FromString was even documented to
>>> accept latin-1.
>> Yes, that seemed to me to be the most obvious interpretation.
>
> Unfortunately, this made creating and retrieving asymmetric:
> when you do PyUnicode_AsString, you'll get a UTF-8 string; when
> you do PyUnicode_FromString, you had to pass Latin-1. Making
> AsString also return Latin-1 would, of course, restrict the number of
> cases where it works.

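The asymmetry described in the quoted text can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names as_string, from_string_latin1 and from_string_utf8 are hypothetical stand-ins that mimic the encodings mentioned in the thread; they are not the C API functions themselves.

```python
# Hypothetical helpers that only mimic the encodings discussed in the
# thread; they do not call the real C API.

def as_string(text: str) -> bytes:
    """Stand-in for PyUnicode_AsString as described: yields UTF-8 bytes."""
    return text.encode("utf-8")


def from_string_latin1(data: bytes) -> str:
    """Stand-in for a PyUnicode_FromString that expects Latin-1 input."""
    return data.decode("latin-1")


def from_string_utf8(data: bytes) -> str:
    """Stand-in for a PyUnicode_FromString that expects UTF-8 input."""
    return data.decode("utf-8")


ascii_text = "spam"
non_ascii_text = "L\u00f6wis"  # contains U+00F6

# Pure ASCII survives the round trip under either interpretation.
ascii_round_trip = from_string_latin1(as_string(ascii_text))

# Non-ASCII text is garbled when UTF-8 output is read back as Latin-1 ...
garbled = from_string_latin1(as_string(non_ascii_text))

# ... but round-trips cleanly when both directions use UTF-8.
symmetric = from_string_utf8(as_string(non_ascii_text))

# Returning Latin-1 instead would fail for any code point above U+00FF.
try:
    "\u20ac".encode("latin-1")
    latin1_can_encode_euro = True
except UnicodeEncodeError:
    latin1_can_encode_euro = False
```

Since ASCII is a subset of both Latin-1 and UTF-8, callers that pass only ASCII see no difference between the two interpretations.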
True, UTF-8 seems to be the better choice. However, all spots in the C
source that call PyUnicode_FromString() only pass ASCII anyway, which
will probably be the most common case.

>>> While I was looking at it, I wondered why PyUnicode_FromStringAndSize
>>> allows a NULL first argument, creating a null-initialized Unicode
>>> object.
>> Because that's what PyString_FromStringAndSize() does.
>
> I guessed that was the historic reason; I just wondered whether the
> rationale for having it in PyString_FromStringAndSize still applies
> to Unicode.
>
>> So should NULL support be dropped from PyUnicode_FromStringAndSize()?
>
> That's my proposal, yes.

At least this would give a clear error message in case someone passes
NULL.

Servus,
   Walter
_______________________________________________
Python-3000 mailing list
http://mail.python.org/mailman/listinfo/python-3000
