Marc-Andre Lemburg added the comment:

On 17.04.2015 01:38, Larry Hastings wrote:
> 
> Documentation is here:
> 
>     https://docs.python.org/3/c-api/arg.html#arg-parsing
> 
> 
> The first line of documentation for each format unit follows this convention:
>     formatunit (pythontype) [arguments, to, pyarg_parsetuple]
> 
> These represent the format unit itself, followed by the Python type it 
> consumes in parentheses, followed by the C types it outputs in square 
> brackets.  Thus
>     i (int) [int]
> means the format unit is 'i', it consumes a Python 'int', and it produces a C 
> 'int'.  Similarly,
>     s (str) [const char *]
> means the format unit is 's', it consumes a Python 'str', and it produces a C 
> 'const char *'.
> 
> When you call PyArg_ParseTuple (AndKeywords), you pass in a pointer to the 
> thing you expect.  If it gives you an int, you pass in &my_int. So the type 
> of the expression you pass in for 'i' is actually "int *".  And the type you 
> pass in for 's' is actually "char **".
> 
> The format units that deal with encodings are a bit weirder.  You actually 
> pass in a const char * string first, followed by the buffer you want to write 
> data to.  Technically the types of the values you pass in for "es" are 
> "const char *, char **".  But the documentation for es says
>     es (str) [const char *encoding, char **buffer]

You need to pass in the address of a variable which will then be set up
to point to a buffer which will be written to :-)

The "e" variants (typically) allocate a buffer for you, since it's pretty
much unknown how long the encoded data will be.
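
To make the pointer levels concrete, here is a minimal sketch in plain C
that needs no Python headers. Both helpers (parse_int and encode_copy) are
hypothetical stand-ins I made up for illustration, not functions from the
C API; they only mimic the calling convention: "i" fills in a variable
through its address, while an "es"-style unit takes the encoding by value
plus the address of a char * that it points at a newly allocated buffer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a simple unit such as "i": the caller passes
   the address of an int (an "int *") and the callee fills it in. */
static int parse_int(const char *text, int *out)
{
    char *end;
    long value = strtol(text, &end, 10);

    if (end == text || *end != '\0')
        return 0;                 /* not a complete number */
    *out = (int)value;
    return 1;
}

/* Hypothetical stand-in for an "es"-style unit: the encoding name is
   passed by value ("const char *"), and the caller passes the address of
   a char * variable ("char **") so the callee can point it at a buffer
   it allocates.  The encoding is ignored here; the text is just copied. */
static int encode_copy(const char *encoding, const char *text, char **buffer)
{
    size_t size = strlen(text) + 1;
    char *copy = malloc(size);

    (void)encoding;               /* unused in this sketch */
    if (copy == NULL)
        return 0;
    memcpy(copy, text, size);
    *buffer = copy;               /* caller now owns the allocation */
    return 1;
}
```

A caller would declare "int my_int;" and "char *buf = NULL;", then pass
&my_int and &buf, so the expression types are "int *" and "char **", never
"char ***". The sketch frees with free(); with the real "es" unit the
allocated buffer is released with PyMem_Free() instead.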

> This led me to believe that I actually had to pass in a "char ***" for 
> buffer!  Which is wrong and doing so makes your programs explode-y.

Indeed :-)

> The documentation should
> 
> * explain this first-line convention precisely, and
> 
> * use the types consistently.
> 
> My suspicion is that the things in brackets have to be the precise C type, 
> e.g. "int *" for i, "char **" for s, "const char *, char **" for es.

The paragraph under "Parsing arguments" says:

"""
In the following description, the quoted form is the format unit; the
entry in (round) parentheses is the Python object type that matches the
format unit; and the entry in [square] brackets is the type of the C
variable(s) whose address should be passed.
"""

So I guess the "e" descriptions need to have the additional * removed
or the paragraph has to be updated and all other listings need
to be converted to precise types (that would be my preference).

I wonder why no one has noticed in all these years. I apparently
understood the listings to be precise C types back in the days when
I added the documentation for the "e" codes:
https://hg.python.org/cpython-fullhistory/rev/3ae06c57d09e

The descriptions for the codes do clarify what is going on, though.

----------
nosy: +lemburg

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23980>
_______________________________________