On 12/04/2012 01:08 AM, Antoine Pitrou wrote:
On Mon, 03 Dec 2012 14:29:35 -0800,
Larry Hastings <la...@hastings.org> wrote:
    /*[clinic]
    dbm.open -> mapping
    basename=dbmopen

        const char *filename;
            The filename to open.
So how does it handle the fact that filename can either be a unicode
string or a fsencoding-encoded bytestring? And how does it do the right
encoding/decoding dance, possibly platform-specific?

[...]
I see, it doesn't :-)

If you compare the Clinic-generated code to the current implementation of dbm.open (and all the other functions I've touched) you'll find the "format units" specified to PyArg_Parse* are identical. Thus I assert the replacement argument parsing is no worse (and no better) than what's currently shipping in Python.

Separately, I contributed code that handles unicode vs bytes for filenames in a reasonably cross-platform way; see "path_converter" in Modules/posixmodule.c. (This shipped in Python 3.3.) And indeed, I have examples of using "path_converter" with Clinic in my branch.
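For readers who want a feel for the str-vs-bytes dance in question, here is a minimal Python-level sketch. It is not "path_converter" itself (that is C code in Modules/posixmodule.c); it only illustrates the same idea using os.fsencode and os.fsdecode, which apply the filesystem encoding and its platform-specific error handler. The helper name normalize_filename is hypothetical, made up for this illustration.

```python
import os


def normalize_filename(filename):
    """Return (as_bytes, as_text) for a filename given as str or bytes.

    Hypothetical helper for illustration only; it is not part of Clinic
    or of posixmodule.c.
    """
    if not isinstance(filename, (str, bytes)):
        raise TypeError(
            "filename must be str or bytes, not %s" % type(filename).__name__
        )
    # os.fsencode / os.fsdecode use the filesystem encoding, so the same
    # call works whether the caller passed text or an encoded bytestring.
    return os.fsencode(filename), os.fsdecode(filename)
```

Calling normalize_filename("spam.db") and normalize_filename(b"spam.db") gives the same pair, so code further down only has to deal with one representation.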

Along these lines, I've been contemplating proposing that Clinic specifically understand "path" arguments, distinctly from other string arguments, as they are both common and rarely handled correctly. My main fear is that I probably don't understand all their complexities either ;-)

Anyway, this is certainly something we can consider *improving* for Python 3.4. But for now I'm trying to make Clinic an indistinguishable drop-in replacement.


I like the idea, but it needs more polishing. I don't think the various
"duck types" accepted by Python can be expressed fully in plain C types
(e.g. you must distinguish between taking all kinds of numbers or only
an __index__-providing number).

Naturally I agree Clinic needs more polishing. But the problem you fear is already solved. Clinic allows precisely expressing any existing PyArg_ "format unit"** through a combination of the type of the parameter and its "flags". The flags only become necessary for types used by multiple format units; for example, s, z, es, et, es#, et#, y, and y# all map to char *, so it's necessary to disambiguate by using the "flags". The specific case you cite ("__index__-providing number") is already unambiguous; that's n, mapped to Py_ssize_t. There aren't any other format units that map to a Py_ssize_t, so we're done.

** Well, any format unit except w*. I don't handle it just because I wasn't sure how best to do so.
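To make the disambiguation rule concrete, here is a rough Python sketch. It is not Clinic's actual code, and the names FORMAT_UNITS, units_by_c_type and needs_flags are invented for this illustration. The table holds only the format units named in this message, so it is deliberately incomplete. A C type needs "flags" exactly when more than one format unit maps to it.

```python
from collections import defaultdict

# (format unit, C type) pairs, limited to the ones named in this message.
# Illustrative and incomplete; not Clinic's real table.
FORMAT_UNITS = [
    ("s", "char *"), ("z", "char *"), ("es", "char *"), ("et", "char *"),
    ("es#", "char *"), ("et#", "char *"), ("y", "char *"), ("y#", "char *"),
    ("n", "Py_ssize_t"),
]


def units_by_c_type(pairs):
    """Group format units by the C type they map to."""
    grouped = defaultdict(list)
    for unit, c_type in pairs:
        grouped[c_type].append(unit)
    return dict(grouped)


def needs_flags(c_type, grouped):
    """A C type is ambiguous when several format units share it."""
    return len(grouped.get(c_type, [])) > 1


GROUPED = units_by_c_type(FORMAT_UNITS)
```

With this table, needs_flags("char *", GROUPED) is true because eight units share that type, while needs_flags("Py_ssize_t", GROUPED) is false because only n maps there.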


//arry/
