On 12/04/2012 01:08 AM, Antoine Pitrou wrote:
On Mon, 03 Dec 2012 14:29:35 -0800,
Larry Hastings <la...@hastings.org> wrote:
/*[clinic]
dbm.open -> mapping
basename=dbmopen
const char *filename;
The filename to open.
So how does it handle the fact that filename can either be a unicode
string or a fsencoding-encoded bytestring? And how does it do the right
encoding/decoding dance, possibly platform-specific?
[...]
I see, it doesn't :-)
If you compare the Clinic-generated code to the current implementation
of dbm.open (and all the other functions I've touched) you'll find the
"format units" specified to PyArg_Parse* are identical. Thus I assert
the replacement argument parsing is no worse (and no better) than what's
currently shipping in Python.
Separately, I contributed code that handles unicode vs bytes for
filenames in a reasonably cross-platform way; see "path_converter" in
Modules/posixmodule.c. (This shipped in Python 3.3.) And indeed, I
have examples of using "path_converter" with Clinic in my branch.
Along these lines, I've been contemplating proposing that Clinic
specifically understand "path" arguments, distinctly from other string
arguments, as they are both common and rarely handled correctly. My
main fear is that I probably don't understand all their complexities
either ;-)
Anyway, this is certainly something we can consider *improving* for
Python 3.4. But for now I'm trying to make Clinic an indistinguishable
drop-in replacement.
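
For illustration only, the str-versus-bytes question can be seen from the Python side. This is a minimal sketch built on os.fsencode and os.fsdecode from the standard library. It is not the "path_converter" code itself, and normalize_filename is a hypothetical helper name made up for this example.

```python
import os


def normalize_filename(filename):
    """Return (as_bytes, as_str) for a filename given as str or bytes.

    Hypothetical helper for illustration; not part of CPython or Clinic.
    """
    if not isinstance(filename, (str, bytes)):
        raise TypeError("filename must be str or bytes, not %s"
                        % type(filename).__name__)
    # os.fsencode leaves bytes untouched and encodes str using the
    # filesystem encoding; os.fsdecode does the reverse.
    return os.fsencode(filename), os.fsdecode(filename)
```

Either spelling of an ASCII filename ends up as the same pair, so the caller no longer needs to care which type it was handed. The platform-specific part (which encoding and error handler to use) is left to the os module.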
I like the idea, but it needs more polishing. I don't think the various
"duck types" accepted by Python can be expressed fully in plain C types
(e.g. you must distinguish between taking all kinds of numbers or only
an __index__-providing number).
Naturally I agree Clinic needs more polishing. But the problem you fear
is already solved. Clinic allows precisely expressing any existing
PyArg_ "format unit"** through a combination of the type of the
parameter and its "flags". The flags only become necessary for types
used by multiple format units; for example, s, z, es, et, es#, et#, y,
and y# all map to char *, so it's necessary to disambiguate by using the
"flags". The specific case you cite ("__index__-providing number") is
already unambiguous; that's n, mapped to Py_ssize_t. There aren't any
other format units that map to a Py_ssize_t, so we're done.
** Well, any format unit except w*. I don't handle it just because I
wasn't sure how best to do so.
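
To make the disambiguation argument concrete, here is a rough sketch in Python. The table lists only the format units named above, with the C types given in this message. The dictionary and the needs_flags helper are hypothetical illustrations, not Clinic's actual data structures.

```python
from collections import defaultdict

# Format units mentioned above, with the C type each maps to
# (as described in this message; illustrative, not exhaustive).
FORMAT_UNIT_TO_C_TYPE = {
    "s": "char *", "z": "char *", "es": "char *", "et": "char *",
    "es#": "char *", "et#": "char *", "y": "char *", "y#": "char *",
    "n": "Py_ssize_t",
}


def units_by_c_type(table):
    """Group format units by the C type they map to."""
    groups = defaultdict(list)
    for unit, c_type in table.items():
        groups[c_type].append(unit)
    return dict(groups)


def needs_flags(c_type, table=FORMAT_UNIT_TO_C_TYPE):
    """Return True when the C type alone cannot select one format unit."""
    return len(units_by_c_type(table).get(c_type, [])) > 1
```

With this table, needs_flags("char *") is true because eight format units share that type, while needs_flags("Py_ssize_t") is false because only n maps to it.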
//arry/