[issue9738] Document the encoding of functions bytes arguments of the C API

2011-05-30 Thread STINNER Victor
STINNER Victor added the comment: > Here is an interesting case for your collection: PyDict_GetItemString. It's easier to guess the encoding of such a function: Python 3 always uses UTF-8, but yes, the encoding should be documented. I documented many functions, directly in the header files, and

[issue9738] Document the encoding of functions bytes arguments of the C API

2011-01-04 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: Victor, here is an interesting case for your collection: PyDict_GetItemString. Note that it is documented as not setting an error, but in fact it may if encoding fails. This is rarely an issue because most uses of PyDict_GetItemString are with an ASCII str
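To illustrate why a non-ASCII key can fail where an ASCII one never does, here is a minimal sketch of a UTF-8 well-formedness check. The helper name is_valid_utf8 is hypothetical and this is not CPython's decoder; it only shows that a lone high byte such as 0xE9 (valid in ISO-8859-1) is rejected when the buffer is expected to be UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the Python C API): returns 1 if the
 * NUL-terminated buffer is well-formed UTF-8, 0 otherwise. */
int is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        unsigned int cp;   /* code point being assembled */
        int len;           /* number of continuation bytes expected */
        int i;

        if (*p < 0x80) {                    /* ASCII: always valid */
            p++;
            continue;
        } else if ((*p & 0xE0) == 0xC0) {   /* 2-byte sequence */
            cp = *p & 0x1F; len = 1;
        } else if ((*p & 0xF0) == 0xE0) {   /* 3-byte sequence */
            cp = *p & 0x0F; len = 2;
        } else if ((*p & 0xF8) == 0xF0) {   /* 4-byte sequence */
            cp = *p & 0x07; len = 3;
        } else {
            return 0;                       /* stray continuation or invalid lead byte */
        }
        p++;

        for (i = 0; i < len; i++) {
            /* A NUL here also fails the test, so truncated sequences are rejected. */
            if ((*p & 0xC0) != 0x80)
                return 0;
            cp = (cp << 6) | (*p & 0x3F);
            p++;
        }

        /* Reject overlong forms, surrogates and values above U+10FFFF. */
        if (len == 1 && cp < 0x80) return 0;
        if (len == 2 && cp < 0x800) return 0;
        if (len == 3 && cp < 0x10000) return 0;
        if (cp > 0x10FFFF) return 0;
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
    }
    return 1;
}
```

With this sketch, an ASCII key such as "key" and the UTF-8 bytes "caf\xC3\xA9" are accepted, while the ISO-8859-1 bytes "caf\xE9" are rejected.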

[issue9738] Document the encoding of functions bytes arguments of the C API

2010-12-26 Thread STINNER Victor
STINNER Victor added the comment: While documenting encodings, I found two issues: #10778 and #10779.

[issue9738] Document the encoding of functions bytes arguments of the C API

2010-12-26 Thread STINNER Victor
STINNER Victor added the comment: r87504 documents encodings of error functions. r87505 documents encodings of unicode functions. r87506 documents encodings of AST, compiler, parser and PyRun functions.

[issue9738] Document the encoding of functions bytes arguments of the C API

2010-12-08 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: > A (probably crazy) idea that just occurred to me: > typedef char utf8_bytes; > typedef char iso8859_1_bytes; > typedef char fsenc_bytes; I like it! Let's see how far we can get without iso8859_1_bytes, though. (It is likely to be locale_bytes anyw

[issue9738] Document the encoding of functions bytes arguments of the C API

2010-12-08 Thread Dave Malcolm
Dave Malcolm added the comment: A (probably crazy) idea that just occurred to me: typedef char utf8_bytes; typedef char iso8859_1_bytes; typedef char fsenc_bytes; then specify the encoding in the type signature of the API e.g.: - int PyRun_SimpleFile(FILE *fp, const char *filename) + int
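A minimal sketch of how such typedefs could read in a signature. Only the typedef names come from the comment above; the function utf8_code_points is a hypothetical example, not an existing API. Note that in C a typedef is only an alias, so the compiler does not stop a caller from passing a plain char * in another encoding; the benefit is documentation in the signature itself.

```c
#include <stddef.h>

/* Aliases proposed in the comment: same type as char, different intent. */
typedef char utf8_bytes;
typedef char fsenc_bytes;

/* Hypothetical function whose signature states the expected encoding.
 * Counts code points by skipping UTF-8 continuation bytes (10xxxxxx).
 * Assumes the input is well-formed UTF-8. */
size_t utf8_code_points(const utf8_bytes *s)
{
    size_t n = 0;

    for (; *s; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

A call such as utf8_code_points("caf\xC3\xA9") compiles without a cast because utf8_bytes and char are the same type, which is exactly why the typedef documents the contract but cannot enforce it.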

[issue9738] Document the encoding of functions bytes arguments of the C API

2010-11-17 Thread Alexander Belopolsky
Changes by Alexander Belopolsky: -- nosy: +belopolsky

[issue9738] Document the encoding of functions bytes arguments of the C API

2010-09-09 Thread STINNER Victor
STINNER Victor added the comment: #6543 changed the encoding of the filename argument of PyRun_SimpleFileExFlags() (and all functions based on PyRun_SimpleFileExFlags) and c_filename attribute of the compiler (private) structure in Python 3.1.3: use utf-8 in strict mode instead of filesystem

[issue9738] Document the encoding of functions bytes arguments of the C API

2010-09-03 Thread STINNER Victor
STINNER Victor added the comment: About PyErr_Format() and PyUnicode_FromFormat*() encoding: it's not exactly ISO-8859-1... there is a bug => issue #9769.

[issue9738] Document the encoding of functions bytes arguments of the C API

2010-09-03 Thread Terry J. Reedy
Terry J. Reedy added the comment: Better specifying requirements is good. A few comments: - The second argument is an error message; it is converted to a string object. + The second argument is an error message; it is decoded to a string object + with ``'utf-8'`` encoding. I would write

[issue9738] Document the encoding of functions bytes arguments of the C API

2010-09-02 Thread Dave Malcolm
Dave Malcolm added the comment: > I think either of these is correct: > - a UTF-8-encoded string > - a string encoded in UTF-8 Possibly use the word "buffer" here, rather than "string", as "string" may suggest the "str" type. Or even: "NUL-terminated buffer of UTF-8-encoded bytes", or whatnot

[issue9738] Document the encoding of functions bytes arguments of the C API

2010-09-02 Thread Éric Araujo
Éric Araujo added the comment: I think either of these is correct: - a UTF-8-encoded string - a string encoded in UTF-8 -- nosy: +eric.araujo

[issue9738] Document the encoding of functions bytes arguments of the C API

2010-09-01 Thread STINNER Victor
New submission from STINNER Victor: Many C functions have a bytes argument (char* type) but the encoding is not documented. It would not be a problem if the encoding was always the same, but it is not. Examples: - format of PyUnicode_FromFormat() should be encoded as ISO-8859-1 - filename of P
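To make the ambiguity concrete, the following sketch transcodes a NUL-terminated ISO-8859-1 buffer to UTF-8. The helper latin1_to_utf8 is hypothetical and not part of the C API; it only shows that the same character needs different bytes in the two encodings, so a caller has to know which one a char* argument expects.

```c
#include <stddef.h>

/* Hypothetical helper: transcode a NUL-terminated ISO-8859-1 buffer to UTF-8.
 * The caller must provide room for 2 * strlen(in) + 1 bytes in out.
 * Returns the number of bytes written, not counting the final NUL. */
size_t latin1_to_utf8(const char *in, char *out)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t n = 0;

    for (; *p; p++) {
        if (*p < 0x80) {
            /* ASCII bytes are identical in both encodings. */
            out[n++] = (char)*p;
        } else {
            /* U+0080..U+00FF become a two-byte UTF-8 sequence. */
            out[n++] = (char)(0xC0 | (*p >> 6));
            out[n++] = (char)(0x80 | (*p & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

For example, the ISO-8859-1 byte 0xE9 (é) becomes the two bytes 0xC3 0xA9 in UTF-8, so passing one form where the other is expected either fails to decode or yields the wrong text.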