[issue2799] Remove PyUnicode_AsString(), rework PyUnicode_AsStringAndSize(), add PyUnicode_AsChar()
John Ehresman j...@wingware.com added the comment: I'm trying to port an existing C extension to py3k and find myself wanting something like PyUnicode_AsString so I don't need to introduce other objects to do memory management. PyUnicode_AsString is equivalent to PyArg_Parse w/ a 's' format code, which I find hard to believe will be removed. Another bug proposes changing the name and passing in a default value, which may be a good idea. -- nosy: +jpe ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue2799 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
Changes by Daniel Diniz aja...@gmail.com: -- nosy: +ezio.melotti, haypo priority: -> normal type: -> feature request versions: +Python 3.1
Stefan Behnel [EMAIL PROTECTED] added the comment: While PyUnicode_AsStringAndSize() may be a better solution if the length is required, PyUnicode_AsString() is enough when it is not required. So I don't buy that argument. Since there are dedicated UTF-8 encoding functions, both functions are pure convenience anyway. Embedded \0 bytes can bite you, but that's completely unrelated to the issue discussed here. I wouldn't oppose renaming the function, but I don't see why it should go. -- nosy: +scoder
Alexandre Vassalotti [EMAIL PROTECTED] added the comment: I now think the proposed changes wouldn't be a bad thing, after all. I have been bitten myself by the confusing naming of the Unicode API, so there is definitely a potential for errors. The main problem with PyUnicode_AsString(), as Marc-André pointed out, is that it doesn't follow the API signature of the rest of the Unicode API:

    char *PyUnicode_AsString(PyObject *unicode);
    PyObject *PyUnicode_AsUTF8String(PyObject *unicode);
    PyObject *PyUnicode_AsASCIIString(PyObject *unicode);

On the other hand, I do like the simple API of PyUnicode_AsString(). Also, I have to admit that the apparent similarity between the PyString and the PyUnicode API helped me to port my code to Py3K when I first started working on Python core. So, pragmatism might beat purity here.
Martin v. Löwis [EMAIL PROTECTED] added the comment:

> How about PyUnicode_GetUTF8Buffer() or just PyUnicode_UTF8() ?!

-1

> Note that the function *must* check the UTF-8 buffer for embedded NUL bytes and then raise an exception if it finds one. Otherwise, the API would silently cause truncations.

PyString_AsString doesn't check for null bytes, either, and will also silently truncate. This has never been a problem, so I fail to see why it is a problem for Unicode strings. -- title: Remove PyUnicode_AsString(), rework PyUnicode_AsStringAndSize(), add PyUnicode_AsChar() -> Remove PyUnicode_AsString(), rework PyUnicode_AsStringAndSize(), add PyUnicode_AsChar()
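The silent truncation under discussion can be illustrated in plain C, without the Python headers. This is only a sketch: `has_embedded_nul` is a hypothetical helper, not a CPython function, and it assumes the caller already knows the true byte length of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the CPython API): returns 1 if any of
 * the first `size` bytes of `buf` is a NUL byte, and 0 otherwise. */
int has_embedded_nul(const char *buf, size_t size)
{
    assert(buf != NULL);
    return memchr(buf, '\0', size) != NULL;
}

/* What a consumer sees when it relies on NUL termination alone:
 * everything after the first NUL byte is ignored. */
size_t visible_length(const char *buf)
{
    assert(buf != NULL);
    return strlen(buf);
}
```

For a 7-byte buffer holding `abc`, a NUL byte, then `def`, `visible_length` reports 3 while `has_embedded_nul(buf, 7)` reports 1. That mismatch is the truncation a checking API would turn into an error.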
Martin v. Löwis [EMAIL PROTECTED] added the comment: I don't agree that PyUnicode_AsString is useless. There are many cases where you don't need the length of the string, e.g. when relying on NUL termination when passing stuff to some C library. I suggest closing this report as "works for me". As for the unrelated issue of PyUnicode_AsStringAndSize: AFAICT, PyString_AsStringAndSize doesn't support Unicode objects (and IMO shouldn't, either). Making PyUnicode_AsStringAndSize and PyString_AsStringAndSize similar is probably a good idea. -- nosy: +loewis
Changes by Haoyu Bai [EMAIL PROTECTED]: -- nosy: +bhy
Marc-Andre Lemburg [EMAIL PROTECTED] added the comment: IMO, it's better to correct API design errors early, rather than going through a deprecation process. Note that PyUnicode_AsString() is also different from its cousin PyString_AsString(). PyString_AsString() is mostly used to access the char* buffer used by the string object in order to change it, e.g. by first constructing a new PyString object and then filling it in by accessing the internal char* buffer directly. Doing the same with PyUnicode_AsString() will not work. What's worse: direct changes would go undetected, since the UTF-8 PyString object is held by the PyUnicode object internally. Even if you just use PyUnicode_AsString() for reading and get the size information from somewhere else, the API doesn't make sure that the PyUnicode object doesn't have embedded 0 code points (which PyString_AsString() does). PyUnicode_AsString() would have to use PyString_AsString() for this instead of the PyString_AS_STRING() macro.
New submission from Marc-Andre Lemburg [EMAIL PROTECTED]: The API PyUnicode_AsString() is pretty useless by itself: there's no way to access the size information of the returned string without again going to the Unicode object. I'd suggest removing the API altogether rather than only deprecating it. Furthermore, the API PyUnicode_AsStringAndSize() does not follow the API signature of PyString_AsStringAndSize(), in that the latter passes back the pointer to the string as an output parameter. That should be changed as well. Note that PyString_AsStringAndSize() already does this for both 8-bit strings and Unicode, so the special Unicode API is not really needed at all, or you may want to rename PyString_AsStringAndSize() to PyUnicode_AsStringAndSize(). Finally, since there are many cases where the string buffer contents are copied to a new buffer, it's probably worthwhile to add a new API which does the copying straight away and also deals with the overflow cases in a central place. I'd suggest PyUnicode_AsChar() (with an API like PyUnicode_AsWideChar()). (This was taken from a comment on #1950.) -- components: Unicode messages: 66463 nosy: lemburg severity: normal status: open title: Remove PyUnicode_AsString(), rework PyUnicode_AsStringAndSize(), add PyUnicode_AsChar() versions: Python 3.0
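The idea of copying straight into a caller-supplied buffer, with the overflow check in one central place, might look roughly like the plain C sketch below. The name `copy_to_char_buffer`, its signature, and its choice to fail rather than truncate are all assumptions made for illustration. The thread does not specify how PyUnicode_AsChar() would behave.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper illustrating copy-with-overflow-check; the name and
 * signature are assumptions, not the proposed CPython API.
 * Copies `src_len` bytes plus a terminating NUL into `dst`.
 * Returns the number of bytes copied (excluding the NUL), or -1 if
 * `dst_size` is too small, in which case `dst` is left untouched. */
long copy_to_char_buffer(const char *src, size_t src_len,
                         char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL);
    if (dst_size == 0 || src_len > dst_size - 1) {
        return -1;  /* the data plus its NUL would not fit */
    }
    memcpy(dst, src, src_len);
    dst[src_len] = '\0';
    return (long)src_len;
}
```

A caller with an 8-byte buffer can copy the 5 bytes of `hello` and gets 5 back. With a 4-byte buffer the same call returns -1 and writes nothing, so every call site gets the same overflow handling without repeating the check.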
Alexandre Vassalotti [EMAIL PROTECTED] added the comment: Honestly, I am not sure if removing PyUnicode_AsString() is a good idea. There are many cases where the size of the returned string is not needed. Furthermore, this would be a rather major backward-incompatible change to be included in a beta release. [copied from duplicate issue #2807] -- nosy: +alexandre.vassalotti