Re: [Python-Dev] Proposal: explicitly disallow function/class mismatches in accelerator modules
On 10 July 2016 at 11:15, Brett Cannon wrote:
> On Sat, 9 Jul 2016 at 06:52 Nick Coghlan wrote:
>> That issue was opened due to a few things that work with the C
>> implementation that fail with the Python implementation:
>>
>> - the C version can be pickled (and hence used with multiprocessing)
>> - the C version can be subclassed
>> - the C version can be used in "isinstance" checks
>> - the C version behaves as a static method, the Python version as a
>>   normal instance method
>>
>> While I'm planning to accept the patch that converts the pure Python
>> version to a full class that matches the semantics of the C version in
>> these areas as well as in its core behaviour, that last case is one
>> where the pure Python version merely exhibits different behaviour from
>> the C version, rather than failing outright.
>>
>> Given that the issues that arose in this case weren't at all obvious
>> up front, what do folks think of the idea of updating PEP 399 to
>> explicitly prohibit class/function mismatches between accelerator
>> modules and their pure Python counterparts?
>
> I think flat-out prohibiting won't work in the Python -> C case as you can
> do things such as closures and such that I don't know if we provide the APIs
> to mimic through the C API. I'm fine saying we "strongly encourage mirroring
> the design between the pure Python and accelerated version for various
> reasons".

I think we should be more specific than that, as the main problem is that the obvious way to emulate a closure in C is with a custom callable, and there are some subtleties involved in doing that in a way that doesn't create future cross-implementation compatibility traps.

Specifically, if the Python implementation is a closure, then from an external behaviour perspective, the key behaviours to mimic in a C implementation would be:

- disable subclassing & isinstance checks against the public API (e.g. by implementing it as a factory function rather than exposing the custom type directly)
- either wrap the Python version in staticmethod, or add descriptor protocol support to the C version
- don't add a custom representation in C without also adding it to the Python version
- don't add pickling support in C without also adding it to the Python version

Similarly, if an existing C implementation uses a custom callable, then a closure may not be a sufficiently compatible alternative, even though it's clean to write and easy to read.

These issues don't tend to arise with normal functions, as the obvious replacement for a module level function written in Python is a module level function written in C, and those already tend to behave similarly in all these respects.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
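
As an illustration of the discrepancies listed in this message, here is a minimal sketch in pure Python. The names `closure_partial`, `ClassPartial`, `show` and `Demo` are hypothetical stand-ins invented for this example; they are not the functools source. The sketch only shows how a closure and a custom callable differ when stored as class attributes and when used with isinstance, and how staticmethod brings the closure in line with the callable.

```python
def closure_partial(func, *args, **kwargs):
    """Hypothetical closure-based partial: returns a plain function."""
    def newfunc(*fargs, **fkwargs):
        return func(*args, *fargs, **{**kwargs, **fkwargs})
    return newfunc


class ClassPartial:
    """Hypothetical class-based partial: a custom stateful callable."""

    def __init__(self, func, *args, **kwargs):
        self.func, self.args, self.kwargs = func, args, kwargs

    def __call__(self, *fargs, **fkwargs):
        return self.func(*self.args, *fargs, **{**self.kwargs, **fkwargs})


def show(*args):
    """Return the positional arguments so the binding is visible."""
    return args


class Demo:
    # Functions are descriptors, so the closure is bound like a method.
    via_closure = closure_partial(show, "bound")
    # Wrapping the closure in staticmethod suppresses that binding.
    via_static = staticmethod(closure_partial(show, "bound"))
    # The class defines no __get__, so its instances are never bound.
    via_class = ClassPartial(show, "bound")


demo = Demo()
closure_result = demo.via_closure()  # receives demo as an extra argument
static_result = demo.via_static()    # no extra argument
class_result = demo.via_class()      # no extra argument

# isinstance needs a type; the closure factory is only a function.
try:
    isinstance(Demo.via_closure, closure_partial)
    closure_isinstance_ok = True
except TypeError:
    closure_isinstance_ok = False

class_isinstance_ok = isinstance(Demo.via_class, ClassPartial)
```

Pickling and custom representations are left out of the sketch on purpose, since whether they work depends on where the objects are defined and on what each implementation chooses to add.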
Re: [Python-Dev] Proposal: explicitly disallow function/class mismatches in accelerator modules
Hello all, and thanks Nick for starting the discussion! Long wall of text ahead, whoops! TL;DR - everyone seems to agree, let's do it.

I think the main issue that we're hitting is that we (whatever you want "we" to mean) prefer to make Python code in the standard library as easily understandable and readable as possible (a point Raymond raised in the issue, which could be a discussion on its own too, but I won't get into that). People new to Python will sometimes look into the code to try and understand how it works, while only contributors and core devs (read: people who know C) will look at the C code, so keeping the code simple isn't as baked into the design of the C versions as it is for the Python versions.

As such, making closures might be TOOWTDI in Python, but it can quickly become annoying to reimplement in C - either you hide away some of the implementation details behind C level variables and pretend like you're a closure, or you change the Python version to something else. It's much less consequential to change a closure into a class than it is to change a class into a function.

In this particular case, we lose the descriptorness(?) of the Python version, but I'd rather think of this as fixing a bug than as removing a feature, especially when partialmethod literally lies right below in the source.
(Side-note: I noticed the source says "Purely functional, no descriptor behaviour" but functions exhibit descriptor behaviour)

I think the change is worth it (well, there'd be a problem if I didn't ;), but I'm much more concerned about ensuring that:

- someone who at some point finds a bunch of bugs^Wdiscrepancies between the Python and C versions of a feature has some concise rules on the changes they can and cannot make;
- people writing Python implementations of existing C features, or C implementations of existing Python features, know exactly what liberties they can and cannot take;
- people implementing new features in both C and Python know their limits offhand, so that someone further down the line doesn't have to fix it when they realize e.g. PyPy behaves differently.

> On Saturday, July 09, 2016 8:16 PM, Nick Coghlan wrote:
>
> That's the proposed policy change and the reason I figured it needed a
> python-dev discussion, as currently it's up to folks adding the Python
> equivalents (or the C accelerators) to decide on a case by case basis
> whether or not to care about compatibility for:
>
> - string representations
> - pickling (and hence multiprocessing support)
> - subclassing
> - isinstance checks
> - descriptor behaviour

That's quite an exhaustive list for "let the person making the patch decide what to do with that"; all the more reason to make this concise (also see my reply to Brett below).

> The main way for such discrepancies to arise is for the Python
> implementation to be a function (or closure), while the C
> implementation is a custom stateful callable.
Maybe closures are "too complicated" to be a proper design if something is to be written in both Python and C ;)

> The problem with the current "those are just technical implementation
> details" approach is that a lot of Pythonistas learn standard library
> API behaviour and capabilities through a mix of experimentation and
> introspection rather than reading the documentation, so if CPython
> uses the accelerated version by default, then most folks aren't going
> to notice the discrepancies until they (or their users) are trying to
> debug a problem like "my library works fine in CPython, but breaks
> when used with multiprocessing on PyPy" or "my doctests fail when
> running under MicroPython".

I noticed that Python's standard library takes a very "duck-typed" approach to the accelerator modules: "I have this thing which is a function, and I [expose it as a global/use it privately], but before I do so, let's see if there's something with the same name in that other module, then use it." In practice, this doesn't create much of an issue, except this thread exists.

(I'm not proposing to change how accelerator modules are handled, merely pointing out that making designs identical was never a hard requirement and depended on the developer(s))

> One example of a practical consequence of the change in policy would
> be to say that if you don't want to support subclassing, then don't
> give the *type* a public name - hide it behind a factory function [...]

It's a mixed bag though. How do you disallow subclassing but still allow isinstance() checks? Now let's try it in Python and without metaclasses, and the documented vs undocumented (and unguaranteed) API differences become much more important. But you have to be a consenting adult if you're working your way around the rules, so there's that I guess.
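
One possible reading of the factory-function suggestion quoted above, sketched in pure Python without a metaclass. Everything here is an assumption made for illustration: `_Counter` and `counter` are hypothetical names, and the `__init_subclass__` hook used to refuse subclassing requires Python 3.6 or later. It is a sketch of one option, not a recommendation from the thread.

```python
class _Counter:
    """Hypothetical private callable type (leading underscore: not public)."""

    def __init__(self, start):
        self._value = start

    def __call__(self):
        self._value += 1
        return self._value

    def __init_subclass__(cls, **kwargs):
        # Runs whenever a subclass is created; raising here blocks
        # subclassing without needing a metaclass.
        raise TypeError("_Counter does not support subclassing")


def counter(start=0):
    """Public factory: callers get instances, not a public type name."""
    return _Counter(start)


tick = counter()
first, second = tick(), tick()

# Subclassing is refused at class-creation time.
try:
    class Sub(_Counter):
        pass
    subclassing_allowed = True
except TypeError:
    subclassing_allowed = False

# isinstance still works for anyone who reaches for the private name.
still_checkable = isinstance(tick, _Counter)
```

The type remains reachable through `type(tick)` or the underscore name, so this only signals intent; it does not stop a consenting adult from working around it.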
> On Saturday, July 09, 2016 9:16 PM, Brett Cannon wrote:
> I think flat-out prohibiting won't work in the Python -> C case as you can do things such as closures and such that I don't know if we provide the APIs to mimic through the C API. I'm fine saying
Re: [Python-Dev] Proposal: explicitly disallow function/class mismatches in accelerator modules
On Sun, Jul 10, 2016 at 10:15:33AM +1000, Nick Coghlan wrote:
> On 10 July 2016 at 05:10, Steven D'Aprano wrote:
> > The other side of the issue is that requiring exact correspondence is
> > considerably more difficult and may be over-kill for some uses.
> > Hypothetically speaking, if I have an object that only supports pickling
> > by accident, and I replace it with one that doesn't support pickling,
> > shouldn't it be my decision whether that counts as a functional
> > regression (a bug) or a reliance on an undocumented and accidental
> > implementation detail?
>
> That's the proposed policy change and the reason I figured it needed a
> python-dev discussion, as currently it's up to folks adding the Python
> equivalents (or the C accelerators) to decide on a case by case basis
> whether or not to care about compatibility for:
>
> - string representations
> - pickling (and hence multiprocessing support)
> - subclassing
> - isinstance checks
> - descriptor behaviour

Right... and that's what I'm saying *ought* to be the decision of the maintainer. Do I understand that you agree with this, but you want to ensure that such decisions are made up front rather than when and if discrepancies are noticed?

> The main way for such discrepancies to arise is for the Python
> implementation to be a function (or closure), while the C
> implementation is a custom stateful callable.
>
> The problem with the current "those are just technical implementation
> details" approach is that a lot of Pythonistas learn standard library
> API behaviour and capabilities through a mix of experimentation and
> introspection rather than reading the documentation,

Indeed.

> so if CPython
> uses the accelerated version by default, then most folks aren't going
> to notice the discrepancies until they (or their users) are trying to
> debug a problem like "my library works fine in CPython, but breaks
> when used with multiprocessing on PyPy" or "my doctests fail when
> running under MicroPython".
Yes, and that's a problem, but is it a big enough problem to justify a policy change and pre-emptive effort to prevent it?

The great majority of people are never going to run their Python code on anything other than CPython, and while I always encourage people to write in the most platform-independent fashion possible, I am also realistic enough to recognise that platform independence is an ideal that many people will fail to meet. (I'm sure that I've written code that isn't as platform independent as I hope.)

My two core questions are:

(1) How much extra effort are we going to *mandate* that core devs put in to hide the differences between C and Python code, for the benefit of a small minority that will notice them?

(2) When should that effort be done? Upfront, or when and as problems are reported or noticed?

My preferred answers are (1) not much, and (2) when problems are reported. In other words, close to the status quo.

I can't speak for others, but I have a tendency towards over-analysing my code, trying to pre-emptively spot and avoid even really obscure failure modes before they occur. That's a trap: it makes it hard to finish (as much as any code is finished) and harder to meet deadlines. It's taken me a lot of effort, and much influence from TDD, to realise that it's okay to release code with a bug you didn't spot. You can always fix it in the next release.

I think the same applies here. I'm okay with something close to the status quo: if a C accelerator doesn't quite have the same undocumented interface as the pure Python one, then it's a bug in one or the other, in which case it's okay to fix it when somebody notices. But I don't think I'm okay with making it mandatory that we prevent such possible incompatibilities ahead of time.

If we do make this mandatory, how is it going to be enforced and checked? The normal way to enforce that accelerated code has the same behaviour as Python code is to see that they both pass the same tests.
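
A rough sketch of the "both pass the same tests" approach, using only unittest. The two implementations, `py_clamp` and `_AccelClamp`, are hypothetical stand-ins defined inline so the example is self-contained; in a real test suite the accelerated version would come from a C extension module instead. The shared checks live in a mixin, and each implementation gets its own TestCase subclass.

```python
import io
import unittest


def py_clamp(value, low, high):
    """Hypothetical pure Python implementation: a plain function."""
    return max(low, min(value, high))


class _AccelClamp:
    """Hypothetical stand-in for a C accelerator: a custom callable."""

    def __call__(self, value, low, high):
        return max(low, min(value, high))


accel_clamp = _AccelClamp()


class ClampTestsMixin:
    # Subclasses set impl. staticmethod() keeps a plain function from
    # being bound to the test instance when read as self.impl.
    impl = None

    def test_inside_range(self):
        self.assertEqual(self.impl(2, 0, 3), 2)

    def test_outside_range(self):
        self.assertEqual(self.impl(5, 0, 3), 3)
        self.assertEqual(self.impl(-1, 0, 3), 0)


class PyClampTests(ClampTestsMixin, unittest.TestCase):
    impl = staticmethod(py_clamp)


class AccelClampTests(ClampTestsMixin, unittest.TestCase):
    impl = staticmethod(accel_clamp)


def run(case):
    """Run one TestCase class quietly and return its TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


py_result = run(PyClampTests)
accel_result = run(AccelClampTests)

# Both suites pass, yet the two objects still differ in a trait that
# no test asserts: only the function supports the descriptor protocol.
untested_difference = (
    hasattr(py_clamp, "__get__") and not hasattr(accel_clamp, "__get__")
)
```

Note that the mixin itself has to wrap the function in staticmethod to avoid the binding difference discussed in this thread, and that both suites report success even though the implementations are observably different.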
But this can only check for features where a test has been written. If you don't think of an incompatibility ahead of time, how do you write a test for it?

I appreciate that the standard library should be held up to a higher level of professionalism than external code, but I don't think that *all* the burden should fall on the core developers. Reliance on undocumented features is always a dubious thing to do. We all do it, and when it turns out that the feature can't be counted on (because it changes from one version to another, or isn't available on some platforms), who is to blame for our application breaking? As the programmer who relied on a promise that was never made, surely I must take at least a bit of responsibility? It's not like the docs are locked up in a filing cabinet in the basement behind a door with a sign saying "Beware of the leopard".

I'm just not comfortable with mandating that core devs must do even more work to protect p
Re: [Python-Dev] Proposal: explicitly disallow function/class mismatches in accelerator modules
On Mon, Jul 11, 2016 at 1:26 PM, Steven D'Aprano wrote:
> (1) How much extra effort are we going to *mandate* that core devs put
> in to hide the differences between C and Python code, for the benefit of
> a small minority that will notice them?

The subject line is raising one specific difference: the use of a function in one version and a class in the other. I think it's not unreasonable to stipulate one specific incompatibility that mustn't be permitted.

ChrisA
Re: [Python-Dev] Proposal: explicitly disallow function/class mismatches in accelerator modules
On 07/10/2016 08:32 PM, Chris Angelico wrote:
> On Mon, Jul 11, 2016 at 1:26 PM, Steven D'Aprano wrote:
>> (1) How much extra effort are we going to *mandate* that core devs put
>> in to hide the differences between C and Python code, for the benefit of
>> a small minority that will notice them?
>
> The subject line is raising one specific difference: the use of a
> function in one version and a class in the other. I think it's not
> unreasonable to stipulate one specific incompatibility that mustn't be
> permitted.

Is that what the subject line meant? I missed that, thanks for pointing that out!

I think I can agree with having both versions being functions or both versions being classes.

--
~Ethan~
Re: [Python-Dev] Proposal: explicitly disallow function/class mismatches in accelerator modules
On Mon, Jul 11, 2016 at 4:25 PM, Ethan Furman wrote:
> On 07/10/2016 08:32 PM, Chris Angelico wrote:
>>
>> On Mon, Jul 11, 2016 at 1:26 PM, Steven D'Aprano wrote:
>>>
>>> (1) How much extra effort are we going to *mandate* that core devs put
>>> in to hide the differences between C and Python code, for the benefit of
>>> a small minority that will notice them?
>>
>> The subject line is raising one specific difference: the use of a
>> function in one version and a class in the other. I think it's not
>> unreasonable to stipulate one specific incompatibility that mustn't be
>> permitted.
>
> Is that what the subject line meant? I missed that, thanks for pointing
> that out!
>
> I think I can agree with having both versions being functions or both
> versions being classes.

What I mean is, the subject line is ONLY raising that one difference. If the C version of a class has no __dict__ but the Python version does, that's not as big a difference.

ChrisA
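
For readers unfamiliar with the __dict__ difference mentioned here, a small pure Python approximation follows. It uses `__slots__` to imitate a type whose instances have no __dict__; `PyPoint`, `SlottedPoint` and `accepts_extra_attribute` are hypothetical names made up for this sketch, and a real C type would get the same effect by different means.

```python
class PyPoint:
    """Hypothetical pure Python class: instances carry a __dict__."""

    def __init__(self, x):
        self.x = x


class SlottedPoint:
    """Hypothetical stand-in for a C type: instances have no __dict__."""

    __slots__ = ("x",)

    def __init__(self, x):
        self.x = x


def accepts_extra_attribute(obj):
    """Report whether an arbitrary new attribute can be set on obj."""
    try:
        obj.label = "extra"
    except AttributeError:
        return False
    return True


py_ok = accepts_extra_attribute(PyPoint(1))
slotted_ok = accepts_extra_attribute(SlottedPoint(1))

# Both are still classes, so subclassing and isinstance work for each.
both_are_classes = isinstance(PyPoint, type) and isinstance(SlottedPoint, type)
```

Both variants remain classes, which is why this kind of mismatch is narrower than the function-versus-class mismatch the subject line is about.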