Re: [Python-Dev] Proposal: explicitly disallow function/class mismatches in accelerator modules

2016-07-10 Thread Nick Coghlan
On 10 July 2016 at 11:15, Brett Cannon wrote:
> On Sat, 9 Jul 2016 at 06:52 Nick Coghlan wrote:
>> That issue was opened due to a few things that work with the C
>> implementation that fail with the Python implementation:
>>
>> - the C version can be pickled (and hence used with multiprocessing)
>> - the C version can be subclassed
>> - the C version can be used in "isinstance" checks
>> - the C version behaves as a static method, the Python version as a
>> normal instance method
>>
>> While I'm planning to accept the patch that converts the pure Python
>> version to a full class that matches the semantics of the C version in
>> these areas as well as in its core behaviour, that last case is one
>> where the pure Python version merely exhibits different behaviour from
>> the C version, rather than failing outright.
>>
>> Given that the issues that arose in this case weren't at all obvious
>> up front, what do folks think of the idea of updating PEP 399 to
>> explicitly prohibit class/function mismatches between accelerator
>> modules and their pure Python counterparts?
>
> I think flat-out prohibiting won't work in the Python -> C case as you can
> do things such as closures and such that I don't know if we provide the APIs
> to mimic through the C API. I'm fine saying we "strongly encourage mirroring
> the design between the pure Python and accelerated version for various
> reasons".

I think we should be more specific than that, as the main problem is
that the obvious way to emulate a closure in C is with a custom
callable, and there are some subtleties involved in doing that in a
way that doesn't create future cross-implementation compatibility
traps.

Specifically, if the Python implementation is a closure, then from an
external behaviour perspective, the key behaviours to mimic in a C
implementation would be:

- disable subclassing & isinstance checks against the public API (e.g.
by implementing it as a factory function rather than exposing the
custom type directly)
- either wrap the Python version in staticmethod, or add descriptor
protocol support to the C version
- don't add a custom representation in C without also adding it to the
Python version
- don't add pickling support in C without also adding it to the Python version
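
As a rough illustration of the descriptor and factory-function points
above, here is a minimal sketch. All names in it (make_adder, _Adder,
make_adder_accel, Holder) are hypothetical, and the "accelerator" is
written in Python purely as a stand-in for a C custom callable:

```python
def make_adder(n):
    """Pure Python flavour: a closure, so the result is a plain function."""
    def add(x):
        return x + n
    return add


class _Adder:
    """Stand-in for a C accelerator type: a custom stateful callable."""

    def __init__(self, n):
        self._n = n

    def __call__(self, x):
        return x + self._n


def make_adder_accel(n):
    """Factory function: callers never get _Adder under a public name."""
    return _Adder(n)


class Holder:
    plain = make_adder(1)                    # function: binds like a method
    wrapped = staticmethod(make_adder(1))    # staticmethod: no binding
    accel = make_adder_accel(1)              # custom callable: no binding


h = Holder()
assert h.wrapped(2) == 3
assert h.accel(2) == 3

try:
    h.plain(2)            # actually called as add(h, 2)
except TypeError:
    binds_like_method = True
else:
    binds_like_method = False
```

Stored as a class attribute, the bare closure receives the instance as
an extra first argument, while the staticmethod-wrapped closure and the
custom callable behave the same way as each other.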

Similarly, if an existing C implementation uses a custom callable,
then a closure may not be a sufficiently compatible alternative, even
though it's clean to write and easy to read.

These issues don't tend to arise with normal functions, as the obvious
replacement for a module level function written in Python is a module
level function written in C, and those already tend to behave
similarly in all these respects.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Proposal: explicitly disallow function/class mismatches in accelerator modules

2016-07-10 Thread Emanuel Barry
Hello all, and thanks Nick for starting the discussion!

Long wall of text ahead, whoops! TL;DR - everyone seems to agree, let's do
it.

I think the main issue that we're hitting is that we (whatever you want "we"
to mean) prefer to make Python code in the standard library as easily
understandable and readable as possible (a point Raymond raised in the
issue, which could be a discussion on its own too, but I won't get into
that). People new to Python will sometimes look into the code to try and
understand how it works, while only contributors and core devs (read: people
who know C) will look at the C code, so keeping the code simple isn't as
baked into the design of the C versions as it is for the Python versions.

As such, making closures might be TOOWTDI in Python, but it can quickly
become annoying to reimplement in C - either you hide away some of the
implementation details behind C level variables and pretend like you're a
closure, or you change the Python version to something else. It's much less
consequential to change a closure into a class than it is to change a class
into a function. In this particular case, we lose the descriptorness(?) of
the Python version, but I'd rather think of this as fixing a bug rather than
removing a feature, especially when partialmethod literally lies right below
in the source. (Side-note: I noticed the source says "Purely functional, no
descriptor behaviour" but functions exhibit descriptor behaviour)
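
To make that side-note concrete, here is a small sketch (the names
collect and Demo are made up for the example) showing that a plain
Python function implements __get__ and therefore binds when looked up
through an instance:

```python
def collect(*args):
    """Return the positional arguments exactly as received."""
    return args


class Demo:
    method = collect


d = Demo()

# Functions implement the descriptor protocol via __get__ ...
bound = collect.__get__(d, Demo)
assert bound() == (d,)

# ... so attribute access through an instance passes the instance first,
assert d.method("x") == (d, "x")

# while access through the class returns the plain function.
assert Demo.method("x") == ("x",)
```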

I think the change is worth it (well, there'd be a problem if I didn't ;),
but I'm much more concerned about ensuring that:

- Someone who at some point finds a bunch of bugs^Wdiscrepancies between the
Python and C versions of a feature has some concise rules on the changes
they can and cannot make;
- People writing Python implementations of existing C features, or C
implementations of existing Python features, know exactly what liberties
they can and cannot take;
- People implementing new features in both C and Python know offhand their
limits, so that someone further down the line doesn't have to fix it
when they realize e.g. PyPy behaves differently.

> On Saturday, July 09, 2016 8:16 PM, Nick Coghlan wrote:
> 
> That's the proposed policy change and the reason I figured it needed a
> python-dev discussion, as currently it's up to folks adding the Python
> equivalents (or the C accelerators) to decide on a case by case basis
> whether or not to care about compatibility for:
> 
> - string representations
> - pickling (and hence multiprocessing support)
> - subclassing
> - isinstance checks
> - descriptor behaviour

That's quite an exhaustive list for "let the person making the patch decide
what to do with that"; all the more reason to make this concise (also see
my reply to Brett below).

> The main way for such discrepancies to arise is for the Python
> implementation to be a function (or closure), while the C
> implementation is a custom stateful callable.

Maybe closures are "too complicated" to be a proper design if something is
to be written in both Python and C ;)

> The problem with the current "those are just technical implementation
> details" approach is that a lot of Pythonistas learn standard library
> API behaviour and capabilities through a mix of experimentation and
> introspection rather than reading the documentation, so if CPython
> uses the accelerated version by default, then most folks aren't going
> to notice the discrepancies until they (or their users) are trying to
> debug a problem like "my library works fine in CPython, but breaks
> when used with multiprocessing on PyPy" or "my doctests fail when
> running under MicroPython".

I noticed that Python's standard library takes a very "duck-typed" approach
to the accelerator modules: "I have this thing which is a function, and I
[expose it as a global/use it privately], but before I do so, let's see if
there's something with the same name in that other module, then use it." In
practice, this doesn't create much of an issue, except this thread exists.
(I'm not proposing to change how accelerator modules are handled, merely
pointing out that making designs identical was never a hard requirement and
depended on the developer(s))
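
For readers who haven't seen it, the pattern I'm describing looks
roughly like the sketch below. The module name _hypothetical_accel and
the function add are made up for illustration and are not real standard
library names:

```python
def add(a, b):
    """Pure Python implementation (hypothetical example function)."""
    return a + b


try:
    # If the (made-up) accelerator module is importable, its `add`
    # silently replaces the Python one. Nothing here checks that the
    # two objects are the same kind of thing (function vs. class).
    from _hypothetical_accel import add
except ImportError:
    # No accelerator available: keep the pure Python version.
    pass
```

Since the accelerator module does not exist here, the import fails and
the Python definition stays in place.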

> One example of a practical consequence of the change in policy would
> be to say that if you don't want to support subclassing, then don't
> give the *type* a public name - hide it behind a factory function [...]

It's a mixed bag though. How do you disallow subclassing but still allow
isinstance() checks? Now let's try it in Python and without metaclasses, and
the documented vs undocumented (and unguaranteed) API differences become
much more important. But you have to be a consenting adult if you're working
your way around the rules, so there's that I guess.
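
One possible answer, assuming Python 3.6 or later (where
__init_subclass__ is available; that hook is my assumption and hasn't
come up in this thread), is sketched below. The class name Callback is
hypothetical:

```python
class Callback:
    """Hypothetical public type: fine for isinstance(), not subclassable."""

    def __init_subclass__(cls, **kwargs):
        # Runs whenever someone tries to create a subclass.
        raise TypeError("type 'Callback' is not an acceptable base type")

    def __init__(self, func):
        self.func = func

    def __call__(self, *args):
        return self.func(*args)


cb = Callback(len)
assert isinstance(cb, Callback)      # isinstance() checks still work
assert cb("abc") == 3

try:
    class Sub(Callback):
        pass
except TypeError:
    subclassing_blocked = True
else:
    subclassing_blocked = False
```

This only mimics the behaviour of a C type that can't be used as a base
class; a determined consenting adult can still work around it.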

> On Saturday, July 09, 2016 9:16 PM, Brett Cannon wrote:
> I think flat-out prohibiting won't work in the Python -> C case as you can
> do things such as closures and such that I don't know if we provide the APIs
> to mimic through the C API. I'm fine saying

Re: [Python-Dev] Proposal: explicitly disallow function/class mismatches in accelerator modules

2016-07-10 Thread Steven D'Aprano
On Sun, Jul 10, 2016 at 10:15:33AM +1000, Nick Coghlan wrote:
> On 10 July 2016 at 05:10, Steven D'Aprano wrote:
> > The other side of the issue is that requiring exact correspondence is
> > considerably more difficult and may be over-kill for some uses.
> > Hypothetically speaking, if I have an object that only supports pickling
> > by accident, and I replace it with one that doesn't support pickling,
> > shouldn't it be my decision whether that counts as a functional
> > regression (a bug) or a reliance on an undocumented and accidental
> > implementation detail?
> 
> That's the proposed policy change and the reason I figured it needed a
> python-dev discussion, as currently it's up to folks adding the Python
> equivalents (or the C accelerators) to decide on a case by case basis
> whether or not to care about compatibility for:
> 
> - string representations
> - pickling (and hence multiprocessing support)
> - subclassing
> - isinstance checks
> - descriptor behaviour

Right... and that's what I'm saying *ought* to be the decision of the 
maintainer. Do I understand that you agree with this, but you want to 
ensure that such decisions are made up front rather than when and if 
discrepancies are noticed?


> The main way for such discrepancies to arise is for the Python
> implementation to be a function (or closure), while the C
> implementation is a custom stateful callable.
> 
> The problem with the current "those are just technical implementation
> details" approach is that a lot of Pythonistas learn standard library
> API behaviour and capabilities through a mix of experimentation and
> introspection rather than reading the documentation, 

Indeed.


> so if CPython
> uses the accelerated version by default, then most folks aren't going
> to notice the discrepancies until they (or their users) are trying to
> debug a problem like "my library works fine in CPython, but breaks
> when used with multiprocessing on PyPy" or "my doctests fail when
> running under MicroPython".

Yes, and that's a problem, but is it a big enough problem to justify a 
policy change and pre-emptive effort to prevent it?

The great majority of people are never going to run their Python code on 
anything other than CPython, and while I always encourage people to 
write in the most platform-independent fashion possible, I am also 
realistic enough to recognise that platform independence is an ideal 
that many people will fail to meet. (I'm sure that I've written code 
that isn't as platform independent as I hope.)

My two core questions are:

(1) How much extra effort are we going to *mandate* that core devs put 
in to hide the differences between C and Python code, for the benefit of 
a small minority that will notice them?

(2) When should that effort be done? Upfront, or when and as problems 
are reported or noticed?

My preference for answers will be, (1) not much, and (2) when problems 
are reported. In other words, close to the status quo.

I can't speak for others, but I have a tendency towards over-analysing 
my code, trying to pre-emptively spot and avoid even really obscure 
failure modes before they occur. That's a trap: it makes it hard to 
finish (as much as any code is finished) and harder to meet deadlines. 
It's taken me a lot of effort, and much influence from TDD, to realise 
that it's okay to release code with a bug you didn't spot. You can 
always fix it in the next release.

I think the same applies here. I'm okay with something close to the 
status quo: if a C accelerator doesn't quite have the same undocumented 
interface as the pure Python one, then it's a bug in one or the other,
in which case it's okay to fix it when somebody notices.

But I don't think I'm okay with making it mandatory that we prevent such
possible incompatibilities ahead of time.

If we do make this mandatory, how is it going to be enforced and 
checked? The normal way to enforce that accelerated code has the same 
behaviour as Python code is to see that they both pass the same tests. 
But this can only check for features where a test has been written. If 
you don't think of an incompatibility ahead of time, how do you write a 
test for it?
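
To illustrate, here is a minimal sketch of running one shared test
against two implementations. The names py_double, _CDouble and
SharedTests are hypothetical, and the "accelerated" version is a Python
stand-in rather than real C code:

```python
import unittest


def py_double(x):
    """Stand-in for the pure Python implementation: a plain function."""
    return x * 2


class _CDouble:
    """Stand-in for the accelerator: an instance of a callable class."""

    def __call__(self, x):
        return x * 2


c_double = _CDouble()


class SharedTests:
    """Tests written once and run against every implementation."""
    impl = None

    def test_result(self):
        self.assertEqual(self.impl(21), 42)


class PyTests(SharedTests, unittest.TestCase):
    impl = staticmethod(py_double)


class AccelTests(SharedTests, unittest.TestCase):
    impl = staticmethod(c_double)


def run_shared_tests():
    """Run both test cases and return the collected TestResult."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for case in (PyTests, AccelTests):
        suite.addTests(loader.loadTestsFromTestCase(case))
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Both test cases pass here even though one implementation is a function
and the other is a class instance, because nobody wrote a test for that
difference.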

I appreciate that the standard library should be held up to a higher 
level of professionalism than external code, but I don't think that 
*all* the burden should fall on the core developers. Reliance on 
undocumented features is always a dubious thing to do. We all do it, and 
when it turns out that the feature can't be counted on (because it 
changes from one version to another, or isn't available on some 
platforms), who is to blame for our application breaking?

As the programmer who relied on a promise that was never made, surely I 
must take at least a bit of responsibility? It's not like the docs are
locked up in a filing cabinet in the basement behind a door with a sign
saying "Beware of the leopard".

I'm just not comfortable with mandating that core devs must do even more 
work to protect p

Re: [Python-Dev] Proposal: explicitly disallow function/class mismatches in accelerator modules

2016-07-10 Thread Chris Angelico
On Mon, Jul 11, 2016 at 1:26 PM, Steven D'Aprano wrote:
> (1) How much extra effort are we going to *mandate* that core devs put
> in to hide the differences between C and Python code, for the benefit of
> a small minority that will notice them?
>

The subject line is raising one specific difference: the use of a
function in one version and a class in the other. I think it's not
unreasonable to stipulate one specific incompatibility that mustn't be
permitted.

ChrisA


Re: [Python-Dev] Proposal: explicitly disallow function/class mismatches in accelerator modules

2016-07-10 Thread Ethan Furman

On 07/10/2016 08:32 PM, Chris Angelico wrote:
> On Mon, Jul 11, 2016 at 1:26 PM, Steven D'Aprano wrote:
>> (1) How much extra effort are we going to *mandate* that core devs put
>> in to hide the differences between C and Python code, for the benefit of
>> a small minority that will notice them?
>
> The subject line is raising one specific difference: the use of a
> function in one version and a class in the other. I think it's not
> unreasonable to stipulate one specific incompatibility that mustn't be
> permitted.


Is that what the subject line meant?  I missed that, thanks for pointing 
that out!


I think I can agree with having both versions being functions or both 
versions being classes.


--
~Ethan~


Re: [Python-Dev] Proposal: explicitly disallow function/class mismatches in accelerator modules

2016-07-10 Thread Chris Angelico
On Mon, Jul 11, 2016 at 4:25 PM, Ethan Furman wrote:
> On 07/10/2016 08:32 PM, Chris Angelico wrote:
>>
>> On Mon, Jul 11, 2016 at 1:26 PM, Steven D'Aprano 
>> wrote:
>>>
>>> (1) How much extra effort are we going to *mandate* that core devs put
>>> in to hide the differences between C and Python code, for the benefit of
>>> a small minority that will notice them?
>>>
>>
>> The subject line is raising one specific difference: the use of a
>> function in one version and a class in the other. I think it's not
>> unreasonable to stipulate one specific incompatibility that mustn't be
>> permitted.
>
>
> Is that what the subject line meant?  I missed that, thanks for pointing
> that out!
>
> I think I can agree with having both versions being functions or both
> versions being classes.
>

What I mean is, the subject line is ONLY raising that one difference.
If the C version of a class has no __dict__ but the Python version
does, that's not as big a difference.
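
A quick sketch of that kind of smaller difference follows. Using
__slots__ to imitate a C type with no instance __dict__ is an
assumption made for the example, and the class names are hypothetical:

```python
class PyPoint:
    """Stand-in for a pure Python class: instances carry a __dict__."""

    def __init__(self, x):
        self.x = x


class SlottedPoint:
    """Stand-in for a C type: fixed layout, no instance __dict__."""
    __slots__ = ("x",)

    def __init__(self, x):
        self.x = x


p = PyPoint(1)
p.label = "extra"              # works: stored in p.__dict__

s = SlottedPoint(1)
try:
    s.label = "extra"          # no __dict__, so no ad-hoc attributes
except AttributeError:
    slotted_rejects_extra = True
else:
    slotted_rejects_extra = False
```

Both are still classes, so subclassing and isinstance() work the same
way for each; only ad-hoc attribute assignment differs.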

ChrisA