Re: [Python-Dev] The docstring hack for signature information has to go
On 05.02.14 17:04, Georg Brandl wrote:
> Mostly unrelated question while seeing the char * here: do we (or do we want to) support non-ASCII names for functions implemented in C?

I didn't try, but I think it should work. methodobject.c:meth_get__name__ uses PyUnicode_FromString, which in turn decodes from UTF-8.

Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] The docstring hack for signature information has to go
On 03.02.14 15:43, Larry Hastings wrote:
> A: We create a PyMethodDefEx structure with an extra field: const char *signature. We add a new METH_SIGNATURE (maybe just METH_SIG?) flag to the flags, indicating that this is an extended structure. When iterating over the PyMethodDefs, we know how far to advance the pointer based on this flag.
>
> B: Same as A, but we add three unused pointers (void *reserved1 etc) to PyMethodDefEx to give us some room to grow.
>
> C: Same as A, but we add two fields to PyMethodDefEx. The second new field identifies the version of the structure, telling us its size somehow. Like the lStructSize field of the OPENFILENAME structure in Win32. I suspect YAGNI.

D: Add a new type slot for method signatures. This would be a tp_signature field, along with a new slot id Py_tp_signature. The signature field itself could be

    struct PyMethodSignature {
        char *method_name;
        char *method_signature;
    };

Regards,
Martin
Re: [Python-Dev] The docstring hack for signature information has to go
On 05.02.2014 14:52, Martin v. Löwis wrote:
> On 03.02.14 15:43, Larry Hastings wrote:
>> A: We create a PyMethodDefEx structure with an extra field: const char *signature. We add a new METH_SIGNATURE (maybe just METH_SIG?) flag to the flags, indicating that this is an extended structure. When iterating over the PyMethodDefs, we know how far to advance the pointer based on this flag.
>>
>> B: Same as A, but we add three unused pointers (void *reserved1 etc) to PyMethodDefEx to give us some room to grow.
>>
>> C: Same as A, but we add two fields to PyMethodDefEx. The second new field identifies the version of the structure, telling us its size somehow. Like the lStructSize field of the OPENFILENAME structure in Win32. I suspect YAGNI.
>
> D: Add a new type slot for method signatures. This would be a tp_signature field, along with a new slot id Py_tp_signature. The signature field itself could be
>
>     struct PyMethodSignature {
>         char *method_name;
>         char *method_signature;
>     };

Mostly unrelated question while seeing the char * here: do we (or do we want to) support non-ASCII names for functions implemented in C?

Georg
Re: [Python-Dev] The docstring hack for signature information has to go
On Wed, Feb 5, 2014 at 11:04 AM, Georg Brandl g.bra...@gmx.net wrote:
> On 05.02.2014 14:52, Martin v. Löwis wrote:
>> On 03.02.14 15:43, Larry Hastings wrote:
>>> A: We create a PyMethodDefEx structure with an extra field: const char *signature. We add a new METH_SIGNATURE (maybe just METH_SIG?) flag to the flags, indicating that this is an extended structure. When iterating over the PyMethodDefs, we know how far to advance the pointer based on this flag.
>>>
>>> B: Same as A, but we add three unused pointers (void *reserved1 etc) to PyMethodDefEx to give us some room to grow.
>>>
>>> C: Same as A, but we add two fields to PyMethodDefEx. The second new field identifies the version of the structure, telling us its size somehow. Like the lStructSize field of the OPENFILENAME structure in Win32. I suspect YAGNI.
>>
>> D: Add a new type slot for method signatures. This would be a tp_signature field, along with a new slot id Py_tp_signature. The signature field itself could be
>>
>>     struct PyMethodSignature {
>>         char *method_name;
>>         char *method_signature;
>>     };
>
> Mostly unrelated question while seeing the char * here: do we (or do we want to) support non-ASCII names for functions implemented in C?

Non-ASCII extension module names are being discussed in http://bugs.python.org/issue20485
Re: [Python-Dev] The docstring hack for signature information has to go
On 02/05/2014 05:52 AM, Martin v. Löwis wrote:
> D: Add a new type slot for method signatures. This would be a tp_signature field, along with a new slot id Py_tp_signature. The signature field itself could be
>
>     struct PyMethodSignature {
>         char *method_name;
>         char *method_signature;
>     };

That should work too, though we'd also have to add a md_signature field to module objects.

It would probably be best to merge the signature into the callable object anyway. Otherwise we'd have to go look up the signature using __name__ and __self__ / __objclass__ on demand. Maybe that isn't such a big deal, but it gets a little worse: as far as I can tell, there's no attribute on a type object one can use to find the module it lives in. So in this situation:

    import _pickle
    import inspect
    inspect.signature(_pickle.Pickler)

How could inspect.signature figure out that the Pickler type object lives in the _pickle module? My best guess: parsing the __qualname__, which is pretty ugly.

Also, keeping the signature as a reasonably-human-readable preface to the docstring means that, if we supported this for third-party modules, they could be binary ABI compatible with 3.3 and still have something approximating the hand-written signature at the top of their docstring.

Cheers,

//arry/
Re: [Python-Dev] The docstring hack for signature information has to go
On 02/03/2014 08:19 PM, Guido van Rossum wrote:
> But why do you even need a flag? Reading issue 20075 where the complaint started, it really feels that the change was an overreaction to a very minimal problem.

I'll cop to that. I'm pretty anxious about trying to get it right. My worry was (and is) that this hiding-the-signature-in-the-docstring approach is a cheap hack, and it will have unexpected and undesirable side-effects that will in retrospect seem obvious. This is FUD I admit. But it seems to me if we did it the right way, with a PyMethodDefEx, we'd be able to do a lot better job predicting the side-effects.

> A few docstrings appear truncated. Big deal. We can rewrite the ones that are reported as broken (either by adjusting the docstring to not match the pattern or by adjusting it to match the pattern better, depending on the case). Tons of docstrings contain incorrect info, we just fix them when we notice the issue, we don't declare the language broken.

I don't think #20075 touches on it, but my biggest concern was third-party modules. If you maintain a Python module, you very well might compile for 3.4 only to find that the first line of your docstrings has mysteriously vanished. You'd have to be very current on changes in Python 3.4 to know what was going on. It seemed like an overly efficient way of pissing off external module maintainers.

(Why would they vanish? The mechanical separator for __doc__ vs __text_signature__ would accept them, but unless they're 100% compatible Python, ast.parse will reject them. So they'd get stripped from your docstring, but you wouldn't get a valid signature in return.)

I'd feel much better with an explicit flag--explicit is better than implicit, after all. But here's a reminder, to make it easier for you to say no. That would mean adding an explicit flag to all the objects which support a signature hidden in the docstring:

* PyTypeObject (has tp_flags, we've used 18 of 32 bits by my count)
* PyMethodDef (has ml_flags, 7 of 32 bits in use)
* PyMethodDescrObject (reuses PyMethodDef)
* PyWrapperDescrObject (has d_base->flags, 1 of 32 bits in use)
* wrapperobject (reuses PyWrapperDescrObject)

Argument Clinic would write the PyMethodDefs, so we'd get those for free. The last three all originate from a PyMethodDef, so when we copied out the docstring pointer we could also propagate the flag. But we'd have to add the flag by hand to the PyTypeObjects.

If you won't let me have a flag, can I at least have a more-clever marker? How about this:

    name-of-function(...)\n \n

Yes, the last four characters are right-parenthesis, newline, space, and newline. Benefits:

* The odds of finding *that* in the wild seem remote.
* If this got displayed as help in 3.3 the user would never notice the space.

For the record, here are things that may be in the signature that aren't legal Python syntax and therefore might be surprising:

* self parameters (and module and type) are prefixed with '$'.
* Positional-only parameters will soon be delimited by '/', just as keyword-only parameters are currently delimited by '*'. (Hasn't happened yet. Needs to happen for 3.4, in order for inspect.Signature to be accurate.)

//arry/
Re: [Python-Dev] The docstring hack for signature information has to go
On 04.02.2014 10:12, Larry Hastings wrote:
> If you won't let me have a flag, can I at least have a more-clever marker? How about this:
>
>     name-of-function(...)\n \n
>
> Yes, the last four characters are right-parenthesis, newline, space, and newline. Benefits:
>
> * The odds of finding *that* in the wild seem remote.
> * If this got displayed as help in 3.3 the user would never notice the space.

Clever, but due to the hidden space it also increases the frustration factor for people trying to find out "why is this accepted as a signature and not this". I don't think a well-chosen visible separator is worse off, such as --\n.

Georg
Re: [Python-Dev] The docstring hack for signature information has to go
On 02/03/2014 12:55 PM, Terry Reedy wrote:
> I think the solution adopted should be future-oriented (ie, clean in the future) even if the cost is slight awkwardness in 3.3.

Just a minor point: I keep saying 3.3, but I kind of mean "3.2 through 3.3". I believe the binary ABI shipped with 3.2. However, in practice I suspect there are few installations that

* are still on 3.2, and
* will ever use binary-ABI-clean third-party modules compiled against 3.4+ that contain signatures.

//arry/
Re: [Python-Dev] The docstring hack for signature information has to go
On 02/03/2014 01:10 PM, Zachary Ware wrote:
> What about just choosing a marker value that is somewhat less unsightly? "signature = (", or "parameters: (", or something (better) to that effect? It may not be beautiful in 3.3, but we can at least make it make sense.

It's a reasonable enough idea, and we could consider it if we stick with something like "sig=". However, see later in the thread where Guido says to return to the old somewhat-ambiguous form with the function name. ;-)

//arry/
Re: [Python-Dev] The docstring hack for signature information has to go
On 02/03/2014 02:06 PM, Gregory P. Smith wrote:
> Wouldn't your proposal to extend the PyMethodDef structure require ifdef's and make it impossible to include the type information in something compiled against the 3.3 headers that you want to use in 3.4 without recompiling?

It might use #ifdefs. However, my proposal was forwards-compatible. When iterating over the methoddef array passed in with a type, if the PyMethodDef flags parameter had METH_SIGNATURE set, I'd advance by sizeof(PyMethodDefEx) bytes, otherwise I'd advance by sizeof(PyMethodDef) bytes. Modules compiled against 3.3 would not have the flag set, therefore I'd advance by the right amount, therefore they should be fine.

//arry/
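The flag-dependent stride described above can be modelled outside of C. The following is only a sketch in Python using ctypes to mirror the struct layouts: PyMethodDefEx and METH_SIGNATURE are names from the proposal, not an existing CPython API, and the flag value 0x1000, the field name "signature", and the helpers walk() and make_table() are assumptions made for illustration.

```python
import ctypes

METH_SIGNATURE = 0x1000  # hypothetical flag bit; the proposal never fixed a value

# Fields shared by the classic structure and the proposed extended one.
_BASE_FIELDS = [
    ("ml_name", ctypes.c_char_p),
    ("ml_meth", ctypes.c_void_p),
    ("ml_flags", ctypes.c_int),
    ("ml_doc", ctypes.c_char_p),
]


class PyMethodDef(ctypes.Structure):
    _fields_ = _BASE_FIELDS


class PyMethodDefEx(ctypes.Structure):
    # Proposed layout "A": the classic fields plus one extra pointer.
    _fields_ = _BASE_FIELDS + [("signature", ctypes.c_char_p)]


def walk(address):
    """Walk a NULL-terminated table, choosing the stride from the flag."""
    found = []
    while True:
        entry = PyMethodDef.from_address(address)
        if entry.ml_name is None:  # sentinel entry ends the table
            break
        if entry.ml_flags & METH_SIGNATURE:
            extended = PyMethodDefEx.from_address(address)
            found.append((entry.ml_name.decode(), extended.signature.decode()))
            address += ctypes.sizeof(PyMethodDefEx)
        else:
            found.append((entry.ml_name.decode(), None))
            address += ctypes.sizeof(PyMethodDef)
    return found


def make_table(struct_type, rows):
    """Build a zero-terminated array of struct_type from dict rows."""
    table = (struct_type * (len(rows) + 1))()
    for slot, row in zip(table, rows):
        for field, value in row.items():
            setattr(slot, field, value)
    return table
```

A table built from plain PyMethodDef entries never has the flag set, so it is walked with the old stride, which is the forwards-compatibility argument made in the message. The sketch also shows why a single array cannot safely mix the two layouts unless every extended entry carries the flag.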
Re: [Python-Dev] The docstring hack for signature information has to go
On 02/03/2014 02:26 PM, Antoine Pitrou wrote:
> How do you create an array that mixes PyMethodDefs and PyMethodDefExs together?

You're right, you wouldn't be able to. For my PyMethodDefEx proposal, the entire array would have to be one way or the other.

> It sounds like METH_SIGNATURE is the wrong mechanism. Instead, you may want a tp_methods_ex as well as a Py_TPFLAGS_HAVE_METHODS_EX.

You may well be right. We'd already need a flag on the type object anyway, to indicate "tp_doc starts with a signature". So if we had such a flag, it could do double-duty to also indicate "tp_methods points to PyMethodDefEx structures".

My only concern here: __text_signature__ is supported on five internal objects: PyCFunctionObject, PyTypeObject, PyMethodDescr_Type, _PyMethodWrapper_Type, and PyWrapperDescr_Type. I'm not certain that all of those carry around their own pointer back to their original type object.

If you went off the "self" parameter, you wouldn't have one if you were an unbound method. And you might get the wrong answer if the user bound you to a different class, or if you were accessed through a subclass. (I say "might" not to mean "it could happen sometimes", but rather "I don't know what the correct answer is".)

> Note that this constrains future growth to only add pointer fields, unless you also add a couple of long fields. But at least it sounds workable.

Ah, in the back of my mind I meant to say "add some unused union {void *p; long i;} fields". Though in practice I don't think we support any platforms where sizeof(long) > sizeof(void *). Uh...

> If you write a conversion function, you may as well make it so it converts the sig= line to a plain signature line in 3.3, which avoids the issue entirely.

Yeah, that's an improvement, though it makes the conversion function a lot more complicated, and presumably uses more memory.

> (and how would that conversion function be shipped to the user anyway? Python 3.3 and the stable ABI don't have it)

As a C function in a text file, that they'd have to copy into their program. I know it's ugly.

//arry/
Re: [Python-Dev] The docstring hack for signature information has to go
On 02/04/2014 01:41 AM, Georg Brandl wrote:
> Clever, but due to the hidden space it also increases the frustration factor for people trying to find out "why is this accepted as a signature and not this". I don't think a well-chosen visible separator is worse off, such as --\n.

I could live with that. To be explicit: the signature would then be of the form

    name-of-function(...)\n--\n

The scanning function would look for "name-of-function(" at the front. If it found it, it'd scan forwards in the docstring for ")\n--\n". If it found *that*, then it would declare success.

//arry/
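The scanning rule described here is small enough to sketch. This is a Python illustration of the two-step check only, not the C code that would live in CPython; the function name split_signature and its return shape are assumptions.

```python
def split_signature(name, doc):
    """Split a docstring into (signature, remaining docstring).

    Returns (None, doc) unless doc starts with "name(" and later
    contains the ")\n--\n" end marker.
    """
    prefix = name + "("
    end_marker = ")\n--\n"
    if not doc.startswith(prefix):
        return None, doc
    end = doc.find(end_marker, len(prefix))
    if end < 0:
        return None, doc
    # Keep the parentheses, drop the function name and the "--" line.
    signature = doc[len(name):end + 1]
    rest = doc[end + len(end_marker):]
    return signature, rest
```

A 3.3-style docstring such as "stat(path) -> result" starts with the right prefix but lacks the "--" line, so it is left untouched, which is the point of choosing a marker unlikely to occur by accident.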
Re: [Python-Dev] The docstring hack for signature information has to go
On 02/03/2014 08:05 AM, Stefan Krah wrote:
> I think we may slowly get into PEP territory here. Just imagine that we settle on X, then decide at a later point to have a standard way of adding type annotations, then find that X does not work because of (unknown).
>
> I'm mentioning this because signatures get really interesting for me if they contain type information.

I simultaneously share your interest, and also suspect that maybe Python is the wrong language for that. After all, Python has always been about duck-typing.

Even if it did happen, it won't be for quite a while yet. The logical mechanism for type information in pure Python is annotations, and afaik they're not getting any large-scale real-world use for type annotating. (If I'm misinformed I'd love to hear counterexamples.)

//arry/
Re: [Python-Dev] The docstring hack for signature information has to go
On Tue, 04 Feb 2014 02:21:51 -0800 Larry Hastings la...@hastings.org wrote:
> On 02/04/2014 01:41 AM, Georg Brandl wrote:
>> Clever, but due to the hidden space it also increases the frustration factor for people trying to find out "why is this accepted as a signature and not this". I don't think a well-chosen visible separator is worse off, such as --\n.
>
> I could live with that. To be explicit: the signature would then be of the form
>
>     name-of-function(...)\n--\n
>
> The scanning function would look for "name-of-function(" at the front. If it found it, it'd scan forwards in the docstring for ")\n--\n". If it found *that*, then it would declare success.

This would have to be checked for layout regressions. If the docstring is formatted using a ReST-to-HTML converter, what will be the effect?

Regards

Antoine.
Re: [Python-Dev] The docstring hack for signature information has to go
Georg Brandl writes:
> I don't think a well-chosen visible separator is worse off, such as --\n.

Don't you mean -- \n? *ducks*

L'Ancien Mail-guique-ly y'rs,
Re: [Python-Dev] The docstring hack for signature information has to go
On 4 February 2014 02:04, Larry Hastings la...@hastings.org wrote:
> On 02/03/2014 07:08 AM, Barry Warsaw wrote:
>> On Feb 03, 2014, at 06:43 AM, Larry Hastings wrote:
>>> But that only fixes part of the problem. Our theoretical extension that wants to be binary-compatible with 3.3 and 3.4 still has a problem: how can they support signatures? They can't give PyMethodDefEx structures to 3.3, it will blow up. But if they don't use PyMethodDefEx, they can't have signatures.
>>
>> Can't an extension writer #ifdef around this? Yeah, it's ugly, but it's a pretty standard approach for making C extensions multi-version compatible.
>
> For source compatibility, yes. But I thought the point of the binary ABI was to allow compiling a single extension that worked unmodified with multiple versions of Python. If we simply don't support that, then an ifdef would be fine.

Then the solution appears straightforward to me: Python 3.4 will not support providing introspection information through the stable ABI. If you want to provide signature info for your C extension without an odd first line in your 3.3 docstring, you must produce version-specific binaries (which allows #ifdef hackery).

Then PEP 457 can address this properly for 3.5 along with the other issues it needs to cover.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] The docstring hack for signature information has to go
On 04.02.2014 13:14, Antoine Pitrou wrote:
> On Tue, 04 Feb 2014 02:21:51 -0800 Larry Hastings la...@hastings.org wrote:
>> On 02/04/2014 01:41 AM, Georg Brandl wrote:
>>> Clever, but due to the hidden space it also increases the frustration factor for people trying to find out "why is this accepted as a signature and not this". I don't think a well-chosen visible separator is worse off, such as --\n.
>>
>> I could live with that. To be explicit: the signature would then be of the form
>>
>>     name-of-function(...)\n--\n
>>
>> The scanning function would look for "name-of-function(" at the front. If it found it, it'd scan forwards in the docstring for ")\n--\n". If it found *that*, then it would declare success.
>
> This would have to be checked for layout regressions. If the docstring is formatted using a ReST-to-HTML converter, what will be the effect?

The "--" will be added after the signature in the same paragraph. However, I don't think this is a valid concern: if you process signatures as ReST you will already have to deal with lots of markup errors (e.g. due to unpaired "*" and "**").

Tools that extract the docstrings and treat them specially (such as Sphinx) will adapt anyway.

Georg
[Python-Dev] The docstring hack for signature information has to go
A quick summary of the context: currently in CPython 3.4, a builtin function can publish its signature as a specially encoded line at the top of its docstring. CPython internally detects this line inside PyCFunctionObject.__doc__ and skips past it, and there's a new getter at PyCFunctionObject.__text_signature__ that returns just this line. As an example, the signature for os.stat looks like this:

    sig=($module, path, *, dir_fd=None, follow_symlinks=True)

The convention is, if you have this signature, you shouldn't have your docstring start with a handwritten signature like 3.3 and before. help() on a callable displays the signature automatically if it can, so if you *also* had a handwritten signature, help() would show two signatures. That would look dumb.

-----

So here's the problem. Let's say you want to write an extension that will work with Python 3.3 and 3.4, using the stable ABI. If you don't add this line, then in 3.4 you won't have introspection information, drat. But if you *do* add this line, your docstring will look mildly stupid in 3.3, because it'll have this unsightly "sig=(" line at the top. And it *won't* have a nice handwritten docstring. (And if you added both a sig= signature *and* a handwritten signature, in 3.4 it would display both. That would also look dumb.)

I can't figure out any way to salvage this "first line of the docstring" approach. So I think we have to abandon it, and do this the hard way: extend the PyMethodDef structure.

I propose three different variations. I prefer B, but I'm guessing Guido would prefer the YAGNI approach, which is A:

A: We create a PyMethodDefEx structure with an extra field: const char *signature. We add a new METH_SIGNATURE (maybe just METH_SIG?) flag to the flags, indicating that this is an extended structure. When iterating over the PyMethodDefs, we know how far to advance the pointer based on this flag.

B: Same as A, but we add three unused pointers (void *reserved1 etc) to PyMethodDefEx to give us some room to grow.

C: Same as A, but we add two fields to PyMethodDefEx. The second new field identifies the version of the structure, telling us its size somehow. Like the lStructSize field of the OPENFILENAME structure in Win32. I suspect YAGNI.

-----

But that only fixes part of the problem. Our theoretical extension that wants to be binary-compatible with 3.3 and 3.4 still has a problem: how can they support signatures? They can't give PyMethodDefEx structures to 3.3, it will blow up. But if they don't use PyMethodDefEx, they can't have signatures.

Solution: we write a function (which users would have to copy into their extension) that gives a PyMethodDefEx array to 3.4+, but converts it into a PyMethodDef array for 3.3. The tricky part there: what do we do about the docstring? The convention for builtins is to have the first line(s) contain a handwritten signature. But you *don't* want that if you provide a signature, because help() will read that signature and automatically render this first line for you.

I can suggest four options here, and of these I like P best:

M: Don't do anything. Docstrings with real signature information and a handwritten signature in the docstring will show two signatures in 3.4+, docstrings without any handwritten signature won't display their signature in help in 3.3. (Best practice for modules compiled for 3.4+ is probably: skip the handwritten signature. Users would have to do without in 3.3.)

N: Leave the handwritten signature in the docstring, then when registering for 3.4+ add a second flag called METH_33_COMPAT that means "when displaying help for this function, don't automatically generate that first line."

O: Have the handwritten signature in the docstring. When registering the function for 3.3, have the PyMethodDef docstring point to it, starting at the signature. When registering the function for 3.4+, have the docstring in the PyMethodDefEx point to the first byte after the handwritten signature. Note that automatically skipping the signature with a heuristic is mildly complicated, so this may be hard to get right.

P: Have the handwritten signature in the docstring, and have separate static PyMethodDef and PyMethodDefEx arrays. The PyMethodDef docstring points to the docstring like normal. The PyMethodDefEx docstring field points to the first byte after the handwritten signature. This makes the registration function very simple: if it's 3.3 or before, use the PyMethodDef array, if it's 3.4+ use the PyMethodDefEx array.

(Argument Clinic could theoretically automate coding some or all of this.)

It's late and my brain is only working so well. I'd be interested in other approaches if people can suggest something good. Sorry about the mess,

//arry/
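For readers who want to see the mechanism from the Python side, here is a small sketch that reads the two attributes discussed in this message. __text_signature__ and inspect.signature() are real CPython features in 3.4 and later; the helper name describe() is an assumption, and the exact strings returned for a builtin such as os.stat vary between CPython versions, so no expected output is shown for it.

```python
import inspect
import os


def describe(func):
    """Return (raw __text_signature__ or None, rendered signature or None)."""
    # Builtins converted to Argument Clinic expose the raw signature text;
    # plain Python functions do not have this attribute at all.
    text_sig = getattr(func, "__text_signature__", None)
    try:
        rendered = str(inspect.signature(func))
    except (ValueError, TypeError):
        # Raised when no signature can be determined for the callable.
        rendered = None
    return text_sig, rendered


if __name__ == "__main__":
    print(describe(os.stat))
    print(describe(lambda a, b=1: None))
```

When the raw text is present but cannot be parsed, inspect.signature() raises ValueError, which corresponds to the failure case described elsewhere in the thread where the line is stripped from the docstring without yielding a usable signature.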
Re: [Python-Dev] The docstring hack for signature information has to go
On Feb 03, 2014, at 06:43 AM, Larry Hastings wrote:
> But that only fixes part of the problem. Our theoretical extension that wants to be binary-compatible with 3.3 and 3.4 still has a problem: how can they support signatures? They can't give PyMethodDefEx structures to 3.3, it will blow up. But if they don't use PyMethodDefEx, they can't have signatures.

Can't an extension writer #ifdef around this? Yeah, it's ugly, but it's a pretty standard approach for making C extensions multi-version compatible.

-Barry
Re: [Python-Dev] The docstring hack for signature information has to go
On 02/03/2014 07:08 AM, Barry Warsaw wrote:
> On Feb 03, 2014, at 06:43 AM, Larry Hastings wrote:
>> But that only fixes part of the problem. Our theoretical extension that wants to be binary-compatible with 3.3 and 3.4 still has a problem: how can they support signatures? They can't give PyMethodDefEx structures to 3.3, it will blow up. But if they don't use PyMethodDefEx, they can't have signatures.
>
> Can't an extension writer #ifdef around this? Yeah, it's ugly, but it's a pretty standard approach for making C extensions multi-version compatible.

For source compatibility, yes. But I thought the point of the binary ABI was to allow compiling a single extension that worked unmodified with multiple versions of Python. If we simply don't support that, then an ifdef would be fine.

//arry/
Re: [Python-Dev] The docstring hack for signature information has to go
Larry Hastings la...@hastings.org wrote:
> So here's the problem. Let's say you want to write an extension that will work with Python 3.3 and 3.4, using the stable ABI. If you don't add this line, then in 3.4 you won't have introspection information, drat. But if you *do* add this line, your docstring will look mildly stupid in 3.3, because it'll have this unsightly "sig=(" line at the top. And it *won't* have a nice handwritten docstring. (And if you added both a sig= signature *and* a handwritten signature, in 3.4 it would display both. That would also look dumb.)

I think we may slowly get into PEP territory here. Just imagine that we settle on X, then decide at a later point to have a standard way of adding type annotations, then find that X does not work because of (unknown).

I'm mentioning this because signatures get really interesting for me if they contain type information.

Stefan Krah
Re: [Python-Dev] The docstring hack for signature information has to go
Larry,

Can you summarize why neither of the two schemes you tried so far worked? AFAIR the original scheme was to support the 3.3-compatible syntax; there was some kind of corner-case problem with this, so you switched to the new "sig=..." syntax, but obviously this has poor compatibility with 3.3.

Can you remind us of what the corner-case was? How bad would it be if we decided to just live with it or if we added a new flag bit (only recognized by 3.4) to disambiguate corner-cases?

--
--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] The docstring hack for signature information has to go
On 2/3/2014 9:43 AM, Larry Hastings wrote:

> A quick summary of the context: currently in CPython 3.4, a builtin function can publish its signature as a specially encoded line at the top of its docstring. CPython internally detects this line inside PyCFunctionObject.__doc__ and skips past it, and there's a new getter at PyCFunctionObject.__text_signature__ that returns just this line. As an example, the signature for os.stat looks like this:
>
>     sig=($module, path, *, dir_fd=None, follow_symlinks=True)
>
> The convention is, if you have this signature, you shouldn't have your docstring start with a handwritten signature like 3.3 and before. help() on a callable displays the signature automatically if it can, so if you *also* had a handwritten signature, help() would show two signatures. That would look dumb.
>
> So here's the problem. Let's say you want to write an extension that will work with Python 3.3 and 3.4, using the stable ABI. If you don't add this line, then in 3.4 you won't have introspection information, drat. But if you *do* add this line, your docstring will look mildly stupid in 3.3, because it'll have this unsightly sig=( line at the top. And it *won't* have a nice handwritten docstring. (And if you added both a sig= signature *and* a handwritten signature, in 3.4 it would display both. That would also look dumb.)

I think the solution adopted should be future-oriented (i.e., clean in the future) even if the cost is slight awkwardness in 3.3. To me, a temporary 'unsightly' extra 'sig=' at the start of some docstrings, in one release, is better than any of the permanent contortions you propose to avoid it.

For 3.3.5 Idle, I could add a check to detect and remove 'sig=' from calltips, but I would not consider it a disaster for it to appear with earlier versions. In 3.3.5 (assuming no change is possible for 3.3.4), help (or pydoc) could make the same check and deletion. As with calltips, help is for interactive viewing by humans.

[snip]

> O: Have the handwritten signature in the docstring. When registering the function for 3.3, have the PyMethodDef docstring point to it starting at the signature. When registering the function for 3.4+, have the docstring in the PyMethodDefEx point to the first byte after the handwritten signature. Note that automatically skipping the signature with a heuristic is mildly complicated, so this may be hard to get right.

The old convention for builtins was a one-line handwritten signature followed by a blank line; for Python functions, one line describing the return value. The 'heuristic' for Idle was to grab the first line of the docstring. If that ended in mid-sentence because someone did not follow the convention, too bad. The newer convention for builtins is multiple lines followed by a blank line. So I recently changed the heuristic to all lines up to the first blank, but with a limit of 5 (needed for bytes), as protection against docstrings that start with a long paragraph.

-- Terry Jan Reedy
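The calltip heuristic Terry describes (all lines up to the first blank line, capped at 5) together with his proposed 'sig=' check can be sketched roughly as follows. This is only an illustration, not Idle's actual code: the function names calltip_lines and strip_sig_prefix are hypothetical, and reading "remove 'sig='" as dropping just the four-character prefix (so the parenthesized parameters stay visible) is an assumption.

```python
def strip_sig_prefix(doc):
    """Drop a leading 'sig=' marker, keeping the parenthesized parameters.

    Assumption: only the prefix is removed, not the whole signature line.
    """
    if doc.startswith("sig=("):
        return doc[len("sig="):]
    return doc


def calltip_lines(doc, max_lines=5):
    """Return the docstring lines up to the first blank line, capped at max_lines."""
    lines = []
    for line in strip_sig_prefix(doc.strip()).splitlines():
        line = line.strip()
        if not line:
            break  # first blank line ends the calltip
        lines.append(line)
        if len(lines) == max_lines:
            break  # protection against a long opening paragraph
    return lines


if __name__ == "__main__":
    doc = "sig=($module, path)\n\nPerform a stat system call."
    print(calltip_lines(doc))
```

The cap matters only for docstrings that never reach a blank line; a conventional docstring stops at the blank line well before five lines.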
Re: [Python-Dev] The docstring hack for signature information has to go
On Mon, Feb 3, 2014 at 8:43 AM, Larry Hastings la...@hastings.org wrote:

> A quick summary of the context: currently in CPython 3.4, a builtin function can publish its signature as a specially encoded line at the top of its docstring. CPython internally detects this line inside PyCFunctionObject.__doc__ and skips past it, and there's a new getter at PyCFunctionObject.__text_signature__ that returns just this line. As an example, the signature for os.stat looks like this:
>
>     sig=($module, path, *, dir_fd=None, follow_symlinks=True)
>
> The convention is, if you have this signature, you shouldn't have your docstring start with a handwritten signature like 3.3 and before. help() on a callable displays the signature automatically if it can, so if you *also* had a handwritten signature, help() would show two signatures. That would look dumb.
>
> So here's the problem. Let's say you want to write an extension that will work with Python 3.3 and 3.4, using the stable ABI. If you don't add this line, then in 3.4 you won't have introspection information, drat. But if you *do* add this line, your docstring will look mildly stupid in 3.3, because it'll have this unsightly sig=( line at the top. And it *won't* have a nice handwritten docstring. (And if you added both a sig= signature *and* a handwritten signature, in 3.4 it would display both. That would also look dumb.)

What about just choosing a marker value that is somewhat less unsightly? "signature = (", or "parameters: (", or something (better) to that effect? It may not be beautiful in 3.3, but we can at least make it make sense.

-- Zach
Re: [Python-Dev] The docstring hack for signature information has to go
On Mon, Feb 3, 2014 at 8:04 AM, Larry Hastings la...@hastings.org wrote:

> On 02/03/2014 07:08 AM, Barry Warsaw wrote:
>
>> On Feb 03, 2014, at 06:43 AM, Larry Hastings wrote:
>>
>>> But that only fixes part of the problem. Our theoretical extension that wants to be binary-compatible with 3.3 and 3.4 still has a problem: how can they support signatures? They can't give PyMethodDefEx structures to 3.3, it will blow up. But if they don't use PyMethodDefEx, they can't have signatures.
>>
>> Can't an extension writer #ifdef around this? Yeah, it's ugly, but it's a pretty standard approach for making C extensions multi-version compatible.
>
> For source compatibility, yes. But I thought the point of the binary ABI was to allow compiling a single extension that worked unmodified with multiple versions of Python. If we simply don't support that, then an ifdef would be fine.

Wouldn't your proposal to extend the PyMethodDef structure require #ifdefs and make it impossible to include the type information in something compiled against the 3.3 headers that you want to use in 3.4 without recompiling?

If you don't like seeing a sig= at the front of the docstring, couldn't you just move it to the end of the docstring? I don't think messiness in docstrings when running something ready for 3.4 under 3.3 is a big deal.

[side note] I consider it CRAZY for anyone to load a binary extension module compiled for one version in a later version of Python. People do it, I know, but they're insane. I wish we didn't bother trying to support that crap. I know this isn't going to change in 3.4. Just ranting. [/side note]

-gps
Re: [Python-Dev] The docstring hack for signature information has to go
On Mon, 03 Feb 2014 06:43:31 -0800 Larry Hastings la...@hastings.org wrote:

> A: We create a PyMethodDefEx structure with an extra field: const char *signature. We add a new METH_SIGNATURE (maybe just METH_SIG?) flag to the flags, indicating that this is an extended structure. When iterating over the PyMethodDefs, we know how far to advance the pointer based on this flag.

How do you create an array that mixes PyMethodDefs and PyMethodDefExs together? It sounds like METH_SIGNATURE is the wrong mechanism. Instead, you may want a tp_methods_ex as well as a Py_TPFLAGS_HAVE_METHODS_EX.

> B: Same as A, but we add three unused pointers (void *reserved1 etc) to PyMethodDefEx to give us some room to grow.

Note that this constrains future growth to only add pointer fields, unless you also add a couple of long fields. But at least it sounds workable.

> C: Same as A, but we add two fields to PyMethodDefEx. The second new field identifies the version of the structure, telling us its size somehow. Like the lStructSize field of the OPENFILENAME structure in Win32. I suspect YAGNI.

That doesn't work. The size of elements of a C array is constant, so you can't mix and match PyMethodDefExs of different versions with different sizes each.

> Solution: we write a function (which users would have to copy into their extension) that gives a PyMethodDefEx array to 3.4+, but converts it into a PyMethodDef array for 3.3. The tricky part there: what do we do about the docstring? The convention for builtins is to have the first line(s) contain a handwritten signature. But you *don't* want that if you provide a signature, because help() will read that signature and automatically render this first line for you.

Uh... If you write a conversion function, you may as well make it so it converts the sig= line to a plain signature line in 3.3, which avoids the issue entirely.

(And how would that conversion function be shipped to the user anyway? Python 3.3 and the stable ABI don't have it.)

Regards

Antoine.
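The string rewrite Antoine suggests (turning the sig= line into a plain, handwritten-style signature line for 3.3) might look something like the sketch below. In a real extension this would have to be C code shipped with the extension; Python is used here only to show the transformation. The function name plain_signature_docstring is hypothetical, and dropping a leading $-prefixed parameter from the rendered line is an assumption, not something the thread specifies.

```python
def plain_signature_docstring(name, doc):
    """Rewrite a leading 'sig=(...)' line as 'name(...)'.

    Docstrings without the marker are returned unchanged.
    """
    first, newline, rest = doc.partition("\n")
    if not (first.startswith("sig=(") and first.endswith(")")):
        return doc
    inner = first[len("sig=("):-1]
    # Assumption: a '$'-prefixed first parameter ($self, $module, $type)
    # is implicit and is omitted from the human-readable line.
    head, _, tail = inner.partition(",")
    if head.startswith("$"):
        inner = tail.strip()
    return name + "(" + inner + ")" + newline + rest


if __name__ == "__main__":
    doc = ("sig=($module, path, *, dir_fd=None, follow_symlinks=True)\n"
           "Perform a stat system call.")
    print(plain_signature_docstring("stat", doc))
```

Splitting at the first comma is safe only because the special first parameter is a bare name; the remaining parameters are copied through untouched, so default values containing commas are not affected.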
Re: [Python-Dev] The docstring hack for signature information has to go
On 02/03/2014 09:46 AM, Guido van Rossum wrote:

> Can you summarize why neither of the two schemes you tried so far worked?

Certainly.

In the first attempt, the signature looked like this:

    name-of-function(arguments)\n

The "(arguments)" part of the string was 100% compatible with Python syntax. So much so that I didn't write my own parser. Instead, I would take the whole line, strip off the \n, prepend it with "def ", append it with ": pass", and pass in the resulting string to ast.parse().

This had the advantage of looking great if the signature was not mechanically separated from the rest of the docstring: it looked like the old docstring with the handwritten signature on top.

The problem: false positives. This is also exactly the traditional format for handwritten signatures. The function in C that mechanically separated the signature from the rest of the docstring had a simple heuristic: if the docstring started with "name-of-function(", it assumed it had a valid signature and separated it from the rest of the docstring. But most of the functions in CPython passed this test, which resulted in complaints like "help(open) eats first line": http://bugs.python.org/issue20075

I opened an issue, writing a long impassioned plea to change this syntax: http://bugs.python.org/issue20326

Which we did.

In the second attempt, the signature looked like this:

    sig=(arguments)\n

In other words, the same as the first attempt, but with "sig=" instead of the name of the function. Since you never see docstrings that start with "sig=" in the wild, the false positives dropped to zero.

I also took the opportunity to modify the signature slightly. Signatures were a little inconsistent about whether they specified the "self" parameter or not, so there were some complicated heuristics in inspect.Signature about when to keep or omit the first argument. In the new format I made this more explicit: if the first argument starts with a dollar sign ("$"), that means "this is a special first argument" (self for methods, module for module-level callables, type for class methods and __new__). That removed all the guesswork from inspect.Signature; now it works great. (In case you're wondering: I still use ast.parse to parse the signature, I just strip out the "$" first.)

I want to mention: we anticipate modifying the syntax further in 3.5, adding square brackets around parameters to indicate "optional groups".

This all has caused no problems so far. But my panicky email last night was me realizing a problem we may see down the road. To recap: if a programmer writes a module using the binary ABI, in theory they can use it with different Python versions without modification. If this programmer added Python 3.4+ compatible signatures, they'd have to insert this "sig=(" line at the top of their docstring. The downside: Python 3.3 doesn't understand that this is a signature and would happily display it to the user as part of help().

> How bad would it be if we decided to just live with it or if we added a new flag bit (only recognized by 3.4) to disambiguate corner-cases?

A new flag might solve the problem cheaply. Let's call it METH_SIG, set in the flags portion of the PyMethodDef. It would mean "This docstring contains a computer-readable signature". One could achieve source compatibility with 3.3 easily by adding "#ifndef METH_SIG / #define METH_SIG 0 / #endif"; the next version of 3.3 could add that itself. We could then switch back to the original approach of "name-of-function(", so the signature would look presentable when displayed to the user. It would still have the funny dollar-sign, a la "$self" or "$module" or "$type", but perhaps users could live with that. Though perhaps this time the end delimiter should be two newlines in a row, so that we can text-wrap long signature lines to enhance their readability if/when they get shown to users.

I have two caveats:

A: for binary compatibility, would Python 3.3 be allergic to this unfamiliar flag in PyMethodDef? Or does it ignore flags it doesn't explicitly look for?

B: I had to modify four (or was it five?) different types in Python to add support for mechanically separating the __text_signature__. Although all of them originally started with a PyMethodDef structure, I'm not sure that all of them carry the "flags" parameter around with them. We might have to add a "flags" field to a couple of these. Fortunately I believe they're all part of Py_LIMITED_API.

//arry/
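The parsing trick Larry describes (strip the newline, strip the "$", wrap the line in "def ... : pass", hand it to ast.parse) can be sketched along these lines. This is a simplified illustration rather than the code in inspect: the function name parse_text_signature, the placeholder function name "f", and the shape of the returned dictionary are all assumptions made for the example.

```python
import ast


def parse_text_signature(text_sig):
    """Parse a 'sig=(...)' line by wrapping it in a dummy def statement."""
    line = text_sig.rstrip("\n")
    if not line.startswith("sig=("):
        raise ValueError("not a sig= signature line")
    params = line[len("sig="):]          # "(...)" including the parentheses
    has_special = params.startswith("($")
    if has_special:
        params = "(" + params[2:]        # strip the '$' before parsing
    # "f" is an arbitrary placeholder name; only the parameter list matters.
    func = ast.parse("def f" + params + ": pass").body[0]
    positional = [a.arg for a in func.args.args]
    special = positional.pop(0) if has_special else None
    return {
        "special": special,              # self / module / type, if marked
        "positional": positional,
        "keyword_only": [a.arg for a in func.args.kwonlyargs],
    }


if __name__ == "__main__":
    sig = "sig=($module, path, *, dir_fd=None, follow_symlinks=True)\n"
    print(parse_text_signature(sig))
```

Because the parameter list is valid Python once the "$" is gone, the parser needs no grammar of its own; a malformed signature simply surfaces as a SyntaxError from ast.parse.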
Re: [Python-Dev] The docstring hack for signature information has to go
Hmm... I liked the original scheme because it doesn't come out so badly if some tool doesn't special-case the first line of the docstring at all. (I have to fess up that I wrote such a tool for a limited case not too long ago, and I wrote it to search for a blank line if the docstring starts with methodname followed by '('.)

Adding a flag sounds harmless; all the code I could find that looks at them just checks whether specific flags it knows about are set.

But why do you even need a flag? Reading issue 20075 where the complaint started, it really feels that the change was an overreaction to a very minimal problem. A few docstrings appear truncated. Big deal. We can rewrite the ones that are reported as broken (either by adjusting the docstring to not match the pattern or by adjusting it to match the pattern better, depending on the case). Tons of docstrings contain incorrect info; we just fix them when we notice the issue, we don't declare the language broken.
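The kind of tool Guido mentions (if the docstring starts with the method name followed by '(', search for a blank line) could be sketched as below. This is a guess at the general shape, not his actual tool; the function name split_handwritten_signature and the choice to return the docstring unchanged when no blank line is found are assumptions.

```python
def split_handwritten_signature(name, doc):
    """Split 'name(...)' signature lines from the rest of a docstring.

    Returns (signature, body); signature is None when the docstring does
    not start with name + '(' or when no blank line follows it.
    """
    if not doc.startswith(name + "("):
        return None, doc
    signature, blank, body = doc.partition("\n\n")
    if not blank:
        return None, doc  # no blank line, so nothing can be separated safely
    return signature, body


if __name__ == "__main__":
    doc = "open(file, mode='r')\n\nOpen file and return a stream."
    print(split_handwritten_signature("open", doc))
```

This is the same test that produced the false positives discussed earlier in the thread: any handwritten first line of the form name(...) matches, whether or not it is a machine-readable signature.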
Re: [Python-Dev] The docstring hack for signature information has to go
Larry Hastings la...@hastings.org writes:

> In the second attempt, the signature looked like this:
>
>     sig=(arguments)\n
>
> [...]
>
> This all has caused no problems so far. But my panicky email last night was me realizing a problem we may see down the road. To recap: if a programmer writes a module using the binary ABI, in theory they can use it with different Python versions without modification. If this programmer added Python 3.4+ compatible signatures, they'd have to insert this "sig=(" line at the top of their docstring. The downside: Python 3.3 doesn't understand that this is a signature and would happily display it to the user as part of help().

I think this is not a bug, it's a feature. Since 3.3 users don't have the special signature parser either, this gives them exactly the information they need and without any duplication. The only drawback is in the cosmetic "sig=" prefix -- but that's the right amount of non-intrusive, kind nudging to get people to eventually update.

>> How bad would it be if we decided to just live with it or if we added a new flag bit (only recognized by 3.4) to disambiguate corner-cases?
>
> A new flag might solve the problem cheaply. Let's call it METH_SIG, set in the flags portion of the PyMethodDef. It would mean "This docstring contains a computer-readable signature". One could achieve source compatibility with 3.3 easily by adding "#ifndef METH_SIG / #define METH_SIG 0 / #endif"; the next version of 3.3 could add that itself. We could then switch back to the original approach of "name-of-function(", so the signature would look presentable when [...]

That much effort to fix a purely cosmetic problem showing up only in older releases? Note that it's going to be a while until machine-generated signatures have actually trickled down to end users, so it's not as if every 3.3 installation would suddenly show different docstrings for all modules.

Just my $0.02 of course.

Best,
Nikolaus

-- 
Encrypted emails preferred.
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C

»Time flies like an arrow, fruit flies like a Banana.«