[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-15 Thread Brett Cannon
It seems a little odd to be dictating website updates about other VMs in this 
PEP. I'm not arguing that we shouldn't update the site, I just think requiring 
it as part of this PEP seems tangential to what the PEP is focusing on.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TKHNENOXP6H34E73XGFOL2KKXSM4Z6T2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-15 Thread Ronald Oussoren via Python-Dev


> On 15 Apr 2020, at 03:39, Victor Stinner wrote:
> 
> Hi Ronald,
> 
> On Tue, 14 Apr 2020 at 18:25, Ronald Oussoren wrote:
>> Making “PyObject” opaque will also affect the stable ABI because even types 
>> defined using the PyTypeSpec API embed a “PyObject” value in the structure 
>> defining the instance layout. It is easy enough to change this in a way that 
>> preserves source-code compatibility, but I’m  not sure it is possible to 
>> avoid breaking the stable ABI.
> 
> Oh, that's a good point. I tracked this issue at:
> https://bugs.python.org/issue39573#msg366473
> 
>> BTW. This will require growing the PyTypeSpec ABI a little, there are 
>> features you cannot implement using that API for example the buffer protocol.
> 
> I tracked this feature request at:
> https://bugs.python.org/issue40170#msg366474

Another issue with making structures opaque is that it becomes harder, at best, 
to subclass builtin types in an extension while adding data fields to the 
subclass. This is similar to the fragile base class problem that was fixed in 
Objective-C 2.0 by adding a level of indirection, and it could probably be 
fixed in a similar way in Python.
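
The layout dependency behind this concern can be observed from Python itself. 
The following is only a minimal sketch, assuming CPython: `__basicsize__` and 
the slot layout are implementation details, and the class and helper names are 
hypothetical. Each subclass stores its extra field after the base's instance 
layout, so the field's offset depends on the base's size, which is the value an 
extension hard-codes when it embeds the base struct at compile time.

```python
import struct

POINTER_SIZE = struct.calcsize("P")  # size of a C pointer on this platform


class Base:
    __slots__ = ("a",)  # one pointer-sized field in the instance layout


class Derived(Base):
    __slots__ = ("b",)  # extra data field added by the subclass


def extra_bytes(sub, base):
    """Bytes the subclass adds on top of the base's instance layout."""
    return sub.__basicsize__ - base.__basicsize__


# Assumption: on CPython the subclass field is laid out right after the base.
print(Base.__basicsize__, Derived.__basicsize__, extra_bytes(Derived, Base))
```

If the base layout grew in a later release, an offset fixed at compile time 
would be stale, whereas an offset looked up at run time (the level of 
indirection mentioned above) would still be correct.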

Ronald
—

Twitter / micro.blog: @ronaldoussoren
Blog: https://blog.ronaldoussoren.net/ 


Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HBOKY4WKIV7Z4Y5RSYISEU7D5ABI2B2I/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-14 Thread Victor Stinner
Hi Ronald,

On Tue, 14 Apr 2020 at 18:25, Ronald Oussoren wrote:
> Making “PyObject” opaque will also affect the stable ABI because even types 
> defined using the PyTypeSpec API embed a “PyObject” value in the structure 
> defining the instance layout. It is easy enough to change this in a way that 
> preserves source-code compatibility, but I’m  not sure it is possible to 
> avoid breaking the stable ABI.

Oh, that's a good point. I tracked this issue at:
https://bugs.python.org/issue39573#msg366473

> BTW. This will require growing the PyTypeSpec ABI a little, there are 
> features you cannot implement using that API for example the buffer protocol.

I tracked this feature request at:
https://bugs.python.org/issue40170#msg366474

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IG3SU3RTOSO24OHWT6PQIZJP4WMGKADA/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-14 Thread André Malo
Steve Dower wrote:
> On 14Apr2020 1557, André Malo wrote:
> 
> > Stefan Behnel wrote:
> > 
> >> André Malo wrote on 14.04.20 at 13:39:
> >> 
> >>> A good way to test that promise (or other implications like
> >>> performance) might also be to rewrite the standard library
> >>> extensions in Cython and see where it leads.
> >>
> >> Not sure I understand what you're saying here. stdlib extension modules
> >> are currently written in C, with a bit of code generation. How is that
> >> different?
> > 
> > They are C extensions like the ones everybody could write. They should use
> > the same APIs. What I'm saying is, that it would be a good test if the
> > APIs are good enough (for everybody else). If, say, Cython is
> > recommended, some attempt should be made to achieve the same results with
> > Cython. Or some other sets of APIs which are considered for "the public".
> > 
> > I don't think, the current stdlib modules restrict themselves to a
> > limited API. The distinction between "inside" and "outside" bothers me.
> 
> It should not bother you. The standard library is not a testing ground 
> for the public API - it's a layer to make those APIs available to users 
> in a reliable, compatible format. Think of it like your C runtime, which 
> uses a lot of system calls that have changed far more often than libc.

I can agree up to a certain level. There are extensions and there are 
extensions, see below.

> 
> We can change the interface between the runtime and the included modules 
> as frequently as we like, because it's private. And we do change them, 
> and the changes go unnoticed because we adapt both sides of the contract 
> at once. For example, we recently changed the calling conventions for 
> certain functions, which didn't break anyone because we updated the 
> callers as well. And we completely reimplemented stat() emulation on 
> Windows recently, which wasn't incompatible because the public part of 
> the API didn't change (except to have fewer false errors).
> 
> Modules that are part of the core runtime deliberately use private APIs 
> so that other extension modules don't have to. It's not any sort of 
> unfair advantage - it's a deliberate aspect of the software's design.

Ah, hmm, maybe I was not clear enough. I was talking about extensions like 
itertools or datetime, not core builtins like sys or the type system. I think 
there's a difference. People especially use the former as a template for how 
things are done "correctly".

I agree, it's easy enough to change everything at once, assuming a good test 
suite :-)

Cheers,
nd

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZEYZX7PRCACRZEZIHU35CLWIPHQBALDV/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-14 Thread Steve Dower

On 14Apr2020 1557, André Malo wrote:
> Stefan Behnel wrote:
>> André Malo wrote on 14.04.20 at 13:39:
>>> A good way to test that promise (or other implications like performance)
>>> might also be to rewrite the standard library extensions in Cython and
>>> see where it leads.
>>
>> Not sure I understand what you're saying here. stdlib extension modules are
>> currently written in C, with a bit of code generation. How is that
>> different?
>
> They are C extensions like the ones everybody could write. They should use the
> same APIs. What I'm saying is, that it would be a good test if the APIs are
> good enough (for everybody else). If, say, Cython is recommended, some attempt
> should be made to achieve the same results with Cython. Or some other sets of
> APIs which are considered for "the public".
>
> I don't think, the current stdlib modules restrict themselves to a limited
> API. The distinction between "inside" and "outside" bothers me.

It should not bother you. The standard library is not a testing ground 
for the public API - it's a layer to make those APIs available to users 
in a reliable, compatible format. Think of it like your C runtime, which 
uses a lot of system calls that have changed far more often than libc.


We can change the interface between the runtime and the included modules 
as frequently as we like, because it's private. And we do change them, 
and the changes go unnoticed because we adapt both sides of the contract 
at once. For example, we recently changed the calling conventions for 
certain functions, which didn't break anyone because we updated the 
callers as well. And we completely reimplemented stat() emulation on 
Windows recently, which wasn't incompatible because the public part of 
the API didn't change (except to have fewer false errors).


Modules that are part of the core runtime deliberately use private APIs 
so that other extension modules don't have to. It's not any sort of 
unfair advantage - it's a deliberate aspect of the software's design.


Cheers,
Steve
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RFYAK5NDBY4DYHKBYOQK5SUKMIT4VZZX/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-14 Thread Matěj Cepl
On 2020-04-14, 12:35 GMT, Stefan Behnel wrote:
>> A good way to test that promise (or other implications like performance)
>> might also be to rewrite the standard library extensions in Cython and see
>> where it leads.
>
> Not sure I understand what you're saying here. stdlib extension modules are
> currently written in C, with a bit of code generation. How is that different?

When you say that writing C extensions is unnecessary
because everything can be easily written in Cython, start
persuading me by rewriting all C extensions included in CPython
in Cython. If you are not willing to do that, why should
I start rewriting my 7k lines of SWIG code in Cython, just
because you hope that somebody finally, finally (please!) notices
the existence of PyPy and hopefully starts to care about it? No,
they won’t.

Matěj
-- 
https://matej.ceplovi.cz/blog/, Jabber: mc...@ceplovi.cz
GPG Finger: 3C76 A027 CA45 AD70 98B5  BC1D 7920 5802 880B C9D8
 
Never, never, never believe any war will be smooth and easy, or
that anyone who embarks on the strange voyage can measure the
tides and hurricanes he will encounter. The statesman who yields
to war fever must realise that once the signal is given, he is no
longer the master of policy but the slave of unforeseeable and
uncontrollable events.
-- Winston Churchill, 1930

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FXUBENCEONTRFTAKR2SNIJ5OOEE22FBY/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-14 Thread Matěj Cepl
On 2020-04-13, 17:39 GMT, Eric Fahlgren wrote:
> Ok, so put that in a Pros/Cons list that provides guidance as to what
> interface and tools to choose when writing a new extension module.
> Personally, I'd put Cython (and other "big" packages, numpy, requests and
> such) on par with CPython itself with respect to "likely to implode and
> become unusable."

Time for the unpleasant questions: what is the bus factor of Cython?

Best,

Matěj
-- 
https://matej.ceplovi.cz/blog/, Jabber: mc...@ceplovi.cz
GPG Finger: 3C76 A027 CA45 AD70 98B5  BC1D 7920 5802 880B C9D8
 
To love another person
Is to see the face of God.
  -- yes, incredibly cheesy verse from the screenplay of the
 movie Les Miserables (2012)

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RT4HSRZF37KYED2ZREDMF37EAYYFNR4R/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-14 Thread Ronald Oussoren via Python-Dev


> On 10 Apr 2020, at 19:20, Victor Stinner  wrote:
> 
[…]

> 
> 
> PEP xxx: Modify the C API to hide implementation details
> 
> 
> Abstract
> 
> 
> * Hide implementation details from the C API to be able to `optimize
>  CPython`_ and make PyPy more efficient.
> * The expectation is that `most C extensions don't rely directly on
>  CPython internals`_ and so will remain compatible.
> * Continue to support old unmodified C extensions by continuing to
>  provide the fully compatible "regular" CPython runtime.
> * Provide a `new optimized CPython runtime`_ using the same CPython code
>  base: faster but can only import C extensions which don't use
>  implementation details. Since both CPython runtimes share the same
>  code base, features implemented in CPython will be available in both
>  runtimes.
> * `Stable ABI`_: Only build a C extension once and use it on multiple
>  Python runtimes and different versions of the same runtime.
> * Better advertise alternative Python runtimes and better communicate on
>  the differences between the Python language and the Python
>  implementation (especially CPython).
> 
> Note: Cython and cffi should be preferred to write new C extensions.

I’m too old… I still prefer the CPython ABI over the other two, mostly because 
that’s what I know best but also to reduce dependencies. 

> This PEP is about existing C extensions which cannot be rewritten with
> Cython.

I’m not sure what this PEP proposes beyond “let’s make the stable ABI the 
default API” and providing a mechanism to get access to the current API. I 
guess the proposal also expands the scope for the stable ABI: some internals 
that are currently exposed in the stable ABI would no longer be so. 

I’m not opposed to this as long as it is still possible to use the current 
API, possibly with clean-ups and correctness fixes. As you write, the CPython 
API has some features that make writing correct code harder, in particular the 
concept of borrowed references. There are still good reasons to want to be as 
close to the metal as possible, both to get maximal performance and to 
accomplish things that aren’t possible using the stable ABI.

[…]
> 
> API and ABI incompatible changes
> 
> 
> * Make structures opaque: move them to the internal C API.
> * Remove functions from the public C API which are tied to CPython
>  internals. Maybe begin by marking these functions as private (rename
>  ``PyXXX`` to ``_PyXXX``) or move them to the internal C API.
> * Ban statically allocated types (by making ``PyTypeObject`` opaque):
>  enforce usage of ``PyType_FromSpec()``.
> 
> Examples of issues to make structures opaque:
> 
> * ``PyGC_Head``: https://bugs.python.org/issue40241
> * ``PyObject``: https://bugs.python.org/issue39573
> * ``PyTypeObject``: https://bugs.python.org/issue40170
> * ``PyThreadState``: https://bugs.python.org/issue39573
> 
> Another example are ``Py_REFCNT()`` and ``Py_TYPE()`` macros which can
> currently be used as l-values to modify an object reference count or type.
> Python 3.9 has new ``Py_SET_REFCNT()`` and ``Py_SET_TYPE()`` macros
> which should be used instead. ``Py_REFCNT()`` and ``Py_TYPE()`` macros
> should be converted to static inline functions to prevent their usage as
> l-value.
> 
> **Backward compatibility:** backward incompatible on purpose. Break the
> limited C API and the stable ABI, with the assumption that `Most C
> extensions don't rely directly on CPython internals`_ and so will remain
> compatible.

This is definitely backward incompatible in a way that affects all extensions 
defining types without using PyTypeSpec, because PyObject and PyTypeObject are 
in the list. I wonder how large a percentage of existing extensions is affected 
by this.  

Making “PyObject” opaque will also affect the stable ABI because even types 
defined using the PyTypeSpec API embed a “PyObject” value in the structure 
defining the instance layout. It is easy enough to change this in a way that 
preserves source-code compatibility, but I’m not sure it is possible to avoid 
breaking the stable ABI. 

BTW, this will require growing the PyTypeSpec ABI a little: there are features 
you cannot implement using that API, for example the buffer protocol. 

[…]
> 
> 
> CPython specific behavior
> =========================
> 
> Some C functions and some Python functions have a behavior which is
> closely tied to the current CPython implementation.
> 
> is operator
> -----------
> 
> The "x is y" operator is closely tied to how CPython allocates objects
> and to ``PyObject*``.
> 
> For example, CPython uses singletons for numbers in [-5; 256] range::
> 
>     >>> x=1; (x + 1) is 2
>     True
>     >>> x=1000; (x + 1) is 1001
>     False
> 
> Python 3.8 compiler now emits a ``SyntaxWarning`` when the right operand
> of the ``is`` and ``is not`` operators is a literal (ex: integer or
> 
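
The identity behaviour in the quoted example can be checked without literals, 
which also avoids the ``SyntaxWarning``. This is only a minimal sketch, 
assuming CPython; the helper name `compare` is hypothetical, and the identity 
result is an implementation detail that other runtimes are free to change.

```python
def compare(a, b):
    """Return (equal, identical) for two objects."""
    return a == b, a is b


# Build the integers at run time so the compiler cannot fold or share constants.
small = compare(int("2"), int("1") + 1)
large = compare(int("1001"), int("1000") + 1)

# Equality always holds; whether the objects are identical depends on the runtime.
print("small:", small)
print("large:", large)
```

Code that needs the values to match should rely on the first element of each 
pair (``==``), never on the second (``is``).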

[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-14 Thread André Malo
Stefan Behnel wrote:
> André Malo wrote on 14.04.20 at 13:39:
> 
> > I think, it does not serve well as a policy for CPython. Since we're
> > talking hypotheticals right now, if Cython vanishes tomorrow, we're
> > kind of left empty handed. Such kind of a runtime, if considered part of
> > the compatibility "promise", should be provided by the core itself, no?
> 
> 
> There was some discussion a while ago about integrating a stripped-down
> variant of Cython into CPython's stdlib. I was arguing against that because
> the selling point of Cython is really what it is, and stripping that down
> wouldn't lead to something equally helpful for users.
> 
> I think it's good to have separate projects (and, in fact, it's more than
> one) deal with this need.
> 
> In the end, it's an external tool, [...]

Thank you, that is my point exactly. It's the same "external" as everything 
else. I'm still trying to understand where to separate the different sets of 
"external".

> 
> > A good way to test that promise (or other implications like performance)
> > might also be to rewrite the standard library extensions in Cython and
> > see where it leads.
> 
> 
> Not sure I understand what you're saying here. stdlib extension modules are
> currently written in C, with a bit of code generation. How is that
> different? 

They are C extensions like the ones everybody could write. They should use the 
same APIs. What I'm saying is that it would be a good test of whether the APIs 
are good enough (for everybody else). If, say, Cython is recommended, some 
attempt should be made to achieve the same results with Cython, or with some 
other set of APIs that is considered for "the public".

I don't think the current stdlib modules restrict themselves to a limited 
API. The distinction between "inside" and "outside" bothers me.


> > I personally see myself using the python-provided runtime (types, methods,
> > GC), out of convenience (it's there, so why not use it). The vision of
> > the future outlined here can easily lead to backing off from that and
> > rebuilding all those things and really only keep touchpoints with python
> > when it comes to interfacing with python itself. It's probably even
> > desirable that way
> 
> That's actually not an uncommon thing to do. Some packages really only use
> Cython or pybind11 to wrap their otherwise native C or C++ code. It's a
> choice given specific organisational/project/developer constraints, and
> choices are good.

Agreed. Nevertheless, the choices are going to be limited by extra 
constraints.

Cheers,
nd

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/U7KHC7KV5GOOQ4ST5HI3MZKAW4CMRJ6S/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-14 Thread Stefan Behnel
Paul Moore wrote on 13.04.20 at 14:25:
> On a related but different note, what is the recommended policy
> (assuming it's not to use the C API) for embedding Python, and for
> exposing the embedding app to Python as a C extension? My standard
> example of this is the Vim interface to Python - see
> https://github.com/vim/vim/blob/master/src/if_python3.c. I originally
> wrote this back in the Python 1.5 days, so it's *very* old, and quite
> likely not how I'd write it now, even using the C API. But what's the
> recommendation for code like that in the face of these changes, and
> the suggestion that using 3rd party tools is the normal way to write C
> extensions?

Embedding is not very well documented overall. I recently looked through
the docs to collect what a user would need to know in this case, and ended
up creating at least a little link collection, because I failed to find a
good place to refer users to. The things people need to know from the
CPython docs are scattered across different places, and lack a complete
real-world-like example that "most people" could start from. (I don't think
many users will pass strings into Python to execute code there.)

https://cython.readthedocs.io/en/latest/src/tutorial/embedding.html

From Cython's PoV, the main thing that future embedders need to understand
is that it's not really different from extending – you just have to start
the Python runtime before doing anything else. I think there should be some
help for getting that done, and then it's just executing your Python code
in some module. Cython then has its ways to go back and forth from there,
e.g. by writing cdef (C) functions as entry points for your application.

Cython currently doesn't really have "direct" support for embedding. You
can let it generate a C main function for you to start your program, but
that's not what you want in the case of vim. There's a "cython_freeze"
script that generates an inittab list in addition, but it's a bit
simplistic and not integrated. We have a beginners ticket for integrating
it better:

https://github.com/cython/cython/issues/2849

What I would like to see eventually is to let users pass a list of modules
into Cython's frontend (maybe cythonize(), maybe not), and then it would
just generate a single distutils Extension from them that links everything
together and registers all modules on import, optionally with a generated
exported C function that starts up the whole thing. That seems simple
enough to do and use, and you end up with a shared library that your
application can load. PRs welcome. :)

Stefan
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Y6VRSVWYSV63AFQNAQEIJZBDZZG7QOTM/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-14 Thread Stefan Behnel
André Malo wrote on 14.04.20 at 13:39:
> I think, it does not serve well as a policy for CPython. Since we're talking
> hypotheticals right now, if Cython vanishes tomorrow, we're kind of left
> empty handed. Such kind of a runtime, if considered part of the compatibility
> "promise", should be provided by the core itself, no?

There was some discussion a while ago about integrating a stripped-down
variant of Cython into CPython's stdlib. I was arguing against that because
the selling point of Cython is really what it is, and stripping that down
wouldn't lead to something equally helpful for users.

I think it's good to have separate projects (and, in fact, it's more than
one) deal with this need.

In the end, it's an external tool, like your editor, your C compiler, your
debugger and whatever else you need for developing Python extensions. It
spits out C code and lets you do with it what you want. There's no reason
it should be part of the CPython project, core or stdlib. It's even written
in Python. If it doesn't work for you, you can fix it.


> A good way to test that promise (or other implications like performance)
> might also be to rewrite the standard library extensions in Cython and see
> where it leads.

Not sure I understand what you're saying here. stdlib extension modules are
currently written in C, with a bit of code generation. How is that different?


> I personally see myself using the python-provided runtime (types, methods,
> GC), out of convenience (it's there, so why not use it). The vision of the
> future outlined here can easily lead to backing off from that and rebuilding
> all those things and really only keep touchpoints with python when it comes
> to interfacing with python itself. It's probably even desirable that way

That's actually not an uncommon thing to do. Some packages really only use
Cython or pybind11 to wrap their otherwise native C or C++ code. It's a
choice given specific organisational/project/developer constraints, and
choices are good.

Stefan
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QZSX36TPAKLXAA3O6KLUNCPKVJ2SKASN/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-14 Thread Stefan Behnel
Steve Dower wrote on 14.04.20 at 00:27:
> On 13Apr2020 2308, André Malo wrote:
>> For one thing, if you open up APIs for Cython, they're open for everybody
>> (Cython being "just" another C extension).
>> More to the point: The ABIs have the same problem as they have now,
>> regardless how responsive the Cython developers are. Once you compiled the
>> extension, you're using the ABI and are supposedly not required to
>> recompile to stay compatible.
>>
>> So, where I'm getting at is: Either you open up to everybody or nobody. In C
>> there's not really an in-between.
> 
> On a technical level, you are correct.
> 
> On a policy level, we don't make changes that would break users of the C
> API. Because we can't track everyone who's using it, we have to assume that
> everything is used and any change will cause breakage.
> 
> To make sure it's possible to keep developing CPython, we declare parts of
> the API off limits (typically by prepending them with an underscore). If
> you use these, and you break, we're sorry but we aren't going to fix it.
> 
> This line of discussion is basically saying that we would designate a
> broader section of the API that is off limits, most likely the parts that
> are only useful for increased performance (rather than increased
> functionality). We would then specifically include the Cython
> team/volunteers in discussions about how to manage changes to these parts
> of the API to avoid breaking them, and possibly do simultaneous releases to
> account for changes so that their users have more time to rebuild.
> 
> Effectively, when we change our APIs, we would break everyone except Cython
> because we've worked with them to avoid the breakage. Anyone else using it
> has to make their own effort to follow CPython development and detect any
> breakage themselves (just like today).
> 
> So probably the part you're missing is where we would give ourselves
> permission to break more APIs in a release, while simultaneously
> encouraging people to use Cython as an isolation layer from those breaks.

To add to that, the main difference for users here is a choice:

1) I want to use whatever is in the C-API and will fix my broken code
myself whenever there's a new CPython release.

2) I write my code against the stable ABI, accept the performance
limitations, and hope that it'll "never" break and my code just keeps
working (even through future compatibility layers, if necessary).

3) I use Cython and rerun it on my code at least once for each new CPython
release series, because I want to get the best performance for each target
version.

4) I use Cython and activate its (yet to be completed) stable ABI mode, so
that I don't have to target separate (C)Python releases but can release a
single wheel, at the cost of reduced performance.

And then there are a couple of grey areas, e.g. people using Cython plus a
bit of the C-API directly, for which they are then responsible themselves
again. But it's still way easier to adapt 3% of your code every couple of
CPython releases than all of your modules for each new release. That's just
the normal price that you pay for manual optimisations.

A nice feature of Cython here is that 3) and 4) are actually not mutually
exclusive, at least as it looks so far. You should eventually be able to
generate both from your same sources (we are trying hard to keep them in
the same C file), and even mix them on PyPI, e.g. distribute a generic
stable ABI wheel for all Pythons that support it, plus accelerated wheels
for CPython 3.9 and 3.10. You may even be able to release a pure Python
wheel as well, as we currently do for Cython itself to better support PyPy.
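
How an installer might choose among such a mix of wheels can be sketched as
follows. This is a simplified illustration, not pip's actual algorithm: real
installers rank candidates using the full compatibility-tag ordering and also
check the platform tag, which is ignored here. The project name `fastmod` and
the helper names are hypothetical. The sketch prefers a wheel built for the
exact CPython version, then a stable ABI (`abi3`) wheel, then a pure Python
wheel.

```python
def wheel_tags(filename):
    """Split a wheel filename into its (python, abi, platform) tags."""
    stem = filename[: -len(".whl")]
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def pick_wheel(filenames, version):
    """Pick the preferred wheel for a CPython (major, minor) version.

    Simplification: the platform tag is ignored.
    """
    major, minor = version
    exact = f"cp{major}{minor}"
    prefix = f"cp{major}"
    stable = pure = None
    for name in filenames:
        python_tag, abi_tag, _ = wheel_tags(name)
        if python_tag == exact and abi_tag == exact:
            return name  # accelerated, version-specific build
        if abi_tag == "abi3" and python_tag.startswith(prefix):
            # The python tag of an abi3 wheel names the oldest supported version.
            if int(python_tag[len(prefix):]) <= minor:
                stable = name
        elif abi_tag == "none" and python_tag == f"py{major}":
            pure = name
    return stable or pure


wheels = [
    "fastmod-1.0-cp36-abi3-manylinux2014_x86_64.whl",
    "fastmod-1.0-cp39-cp39-manylinux2014_x86_64.whl",
    "fastmod-1.0-cp310-cp310-manylinux2014_x86_64.whl",
    "fastmod-1.0-py3-none-any.whl",
]
print(pick_wheel(wheels, (3, 9)))
print(pick_wheel(wheels, (3, 8)))
```

With this ordering, CPython 3.9 and 3.10 get their accelerated builds, while
other versions from 3.6 on fall back to the single stable ABI wheel.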

And to drive the point home, if CPython starts changing its C-API more
radically, or comes up with a new one, we can add the support for it to
Cython and then, in the best case, users will still only have to rerun it
on their code to target that new API. Compare that to case 1).


> (Cython is still just a placeholder name here, btw. There are 1-2 other
> projects that could be considered instead, though I think Cython is the
> only one that also provides a usability improvement as well as API
> stability.)

pybind11 and mypyc could probably make a similar offer to users. The
important point is just that we centralise the abstraction and adaptation work.

Stefan
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BHH3XKTCKZ73WQNHPVHYNBMJPYBELZFV/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-14 Thread André Malo
Steve Dower wrote:

> On a policy level, we don't make changes that would break users of the C 
> API. Because we can't track everyone who's using it, we have to assume 
> that everything is used and any change will cause breakage.
> 
> To make sure it's possible to keep developing CPython, we declare parts 
> of the API off limits (typically by prepending them with an underscore). 
> If you use these, and you break, we're sorry but we aren't going to fix it.
> 
> This line of discussion is basically saying that we would designate a 
> broader section of the API that is off limits, most likely the parts 
> that are only useful for increased performance (rather than increased 
> functionality). We would then specifically include the Cython 
> team/volunteers in discussions about how to manage changes to these 
> parts of the API to avoid breaking them, and possibly do simultaneous 
> releases to account for changes so that their users have more time to 
> rebuild.
> 
> Effectively, when we change our APIs, we would break everyone except 
> Cython because we've worked with them to avoid the breakage. Anyone else 
> using it has to make their own effort to follow CPython development and 
> detect any breakage themselves (just like today).
> 
> So probably the part you're missing is where we would give ourselves 
> permission to break more APIs in a release, while simultaneously 
> encouraging people to use Cython as an isolation layer from those breaks.

The encouraging part is not working for me :-) And seriously, my gut tells me
we're split 50/50 here. People usually write C for a reason, and Cython is
not C. For, let's say, half of the cases that's fine: speeding up inner loops
and all that, while not touching the C level at all. The other half wants to
solve different issues.

I think it does not serve well as a policy for CPython. Since we're talking
hypotheticals right now, if Cython vanishes tomorrow, we're kind of left
empty-handed. That kind of runtime, if considered part of the compatibility
"promise", should be provided by the core itself, no?
A good way to test that promise (or other implications like performance) might
also be to rewrite the standard library extensions in Cython and see where it
leads.

I personally see myself using the Python-provided runtime (types, methods,
GC) out of convenience (it's there, so why not use it). The vision of the
future outlined here can easily lead to backing off from that, rebuilding
all those things, and really only keeping touchpoints with Python when it comes
to interfacing with Python itself. It's probably even desirable that way. But
definitely more work (for an extension author).

As a closing word, I don't mind either way. IOW I'm not complaining. I'm just 
putting more opinion from the "outside" into the ring. Thanks for listening 
:-)

Cheers,
nd

___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/ZDPNR3PO4RVQ3EHQQLEFEKUVBF72A2MX/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread Steve Dower

On 13Apr2020 2308, André Malo wrote:

> For one thing, if you open up APIs for Cython, they're open for everybody
> (Cython being "just" another C extension).
> More to the point: The ABIs have the same problem as they have now, regardless
> of how responsive the Cython developers are. Once you've compiled the extension,
> you're using the ABI and are supposedly not required to recompile to stay
> compatible.
>
> So, what I'm getting at is: Either you open up to everybody or nobody. In C
> there's not really an in-between.


On a technical level, you are correct.

On a policy level, we don't make changes that would break users of the C 
API. Because we can't track everyone who's using it, we have to assume 
that everything is used and any change will cause breakage.


To make sure it's possible to keep developing CPython, we declare parts 
of the API off limits (typically by prepending them with an underscore). 
If you use these, and you break, we're sorry but we aren't going to fix it.


This line of discussion is basically saying that we would designate a 
broader section of the API that is off limits, most likely the parts 
that are only useful for increased performance (rather than increased 
functionality). We would then specifically include the Cython 
team/volunteers in discussions about how to manage changes to these 
parts of the API to avoid breaking them, and possibly do simultaneous 
releases to account for changes so that their users have more time to 
rebuild.


Effectively, when we change our APIs, we would break everyone except 
Cython because we've worked with them to avoid the breakage. Anyone else 
using it has to make their own effort to follow CPython development and 
detect any breakage themselves (just like today).


So probably the part you're missing is where we would give ourselves 
permission to break more APIs in a release, while simultaneously 
encouraging people to use Cython as an isolation layer from those breaks.


(Cython is still just a placeholder name here, btw. There are 1-2 other 
projects that could be considered instead, though I think Cython is the 
only one that also provides a usability improvement as well as API 
stability.)


Cheers,
Steve
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/BPFQKMXTMVVSFVFEAJRXAPVQEZE3HMFN/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread André Malo
Steve Dower wrote:
> On 11Apr2020 0025, Antoine Pitrou wrote:
> > On Fri, 10 Apr 2020 23:33:28 +0100
> > 
> > Steve Dower  wrote:
> >> On 10Apr2020 2055, Antoine Pitrou wrote:
> >>> On Fri, 10 Apr 2020 19:20:00 +0200
> >>> 
> >>> Victor Stinner  wrote:
> >>>> Note: Cython and cffi should be preferred to write new C extensions.
> >>>> This PEP is about existing C extensions which cannot be rewritten with
> >>>> Cython.
> >>> 
> >>> Using Cython does not make the C API irrelevant.  In some
> >>> applications, the C API has to be low-level enough for performance.
> >>> Whether the application is written in Cython or not.
> >> 
> >> It does to the code author.
> >> 
> >> The point here is that we want authors who insist on coding against the
> >> C API to be aware that they have fewer compatibility guarantees [...]
> > 
> > Yeah, you missed the point of my comment here.  Cython *does* call into
> > the C API, and it's quite insistent on performance optimizations too.
> > Saying "just use Cython" doesn't make the C API unimportant - it just
> > hides it from your own sight.
> 
> It centralises the change. I have no problem giving Cython access to
> things that we discourage every developer from using, provided they
> remain responsive to change and use the special access responsibly (e.g.
> by not touching reserved fields at all).

It appears to me that this whole line of argument is contradicting the purpose 
of the whole idea. What am I missing?

For one thing, if you open up APIs for Cython, they're open for everybody
(Cython being "just" another C extension).
More to the point: The ABIs have the same problem as they have now, regardless
of how responsive the Cython developers are. Once you've compiled the extension,
you're using the ABI and are supposedly not required to recompile to stay
compatible.

So, what I'm getting at is: Either you open up to everybody or nobody. In C
there's not really an in-between.

Cheers,
nd

___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/S2BRL2GJGRNK5WXCSZPNLLMN4LGA5KTN/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread Steve Dower

On 13Apr2020 2105, Chris Meyer wrote:
> How would I call a Python function from the C++ application that returns
> a Python object to C++ and then call a method on that Python object from
> C++?
>
> My specific example is that I create Python handlers for Qt windows and
> then from the Qt/C++ I call methods on those Python objects from C++
> such as “handle mouse event”.


You're in a bit of trouble here regardless, depending on how robust you 
need to be. If you've only got synchronous, single-threaded event 
handlers then you'll be okay. Anything more complex and you'll have some 
fun debugging sessions to look forward to.


I would definitely say look at PyBind11. A while ago I posted a sample 
using this to embed Python in a game engine at 
https://devblogs.microsoft.com/python/embedding-python-in-a-cpp-project-with-visual-studio/ 
(VS is not required, it just happened to be the hook to do the 
post/video ;) )


To jump straight to the code, go to 
https://github.com/zooba/ogre3d-python-embed/blob/master/src/PythonCharacter.cpp 
and search for "py::", and also 
https://github.com/zooba/ogre3d-python-embed/blob/master/src/ogre_module.h


PyBind11 is nice for avoiding the boilerplate and ref-counting, but has 
its own set of obscure error cases. It's also not as easy to debug as 
Cython or going straight to the Python C API, depending on what the 
issue is, as there's no straightforward generated code. Even stepping 
through the templated code interactively in VS doesn't help make it any 
easier to follow.
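
For comparison with the PyBind11 route, the raw C API sequence behind the question (call a function, get an object back, call a method on it) can be sketched from Python itself through ctypes.pythonapi. This is only an illustration and assumes CPython; `Handler` and `make_handler` are hypothetical names, and a real C++ embedder would call the same two C functions directly and manage the references by hand.

```python
import ctypes

# ctypes.pythonapi exposes the C API of the running CPython interpreter.
api = ctypes.pythonapi

# PyObject *PyObject_CallObject(PyObject *callable, PyObject *args)
api.PyObject_CallObject.restype = ctypes.py_object
api.PyObject_CallObject.argtypes = [ctypes.py_object, ctypes.py_object]

# PyObject *PyObject_GetAttrString(PyObject *o, const char *attr_name)
api.PyObject_GetAttrString.restype = ctypes.py_object
api.PyObject_GetAttrString.argtypes = [ctypes.py_object, ctypes.c_char_p]


class Handler:
    """Hypothetical window handler implemented in Python."""

    def handle_mouse_event(self, x, y):
        return f"mouse event at ({x}, {y})"


def make_handler():
    """Hypothetical factory that the host application calls first."""
    return Handler()


# Step 1: call a Python function that returns a Python object.
handler = api.PyObject_CallObject(make_handler, ())

# Step 2: look up a method on that object and call it with arguments.
method = api.PyObject_GetAttrString(handler, b"handle_mouse_event")
result = api.PyObject_CallObject(method, (10, 20))
print(result)
```

In C or C++ each of these calls returns a new reference that has to be released with Py_DECREF, and a NULL return has to be checked; ctypes does both here, which is exactly the boilerplate PyBind11 hides.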


Cheers,
Steve
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/GVXH2AJC7Z2F5AIBIMCXEDKXLEYVCCU4/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread Daniel Holth
Sorry that this is a bit off-topic. cffi would be a user of any new C API.

I've tried to make sure ABI3 is supported in setuptools and wheel, with
varying success. Apparently virtualenvs and Windows have problems. I'm
excited about the possibility of a better C API and possibly ABI.
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/XFHKIXSEEZMCC7HZ3N4GP6SBUBYACI6K/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread Chris Meyer
> On Apr 13, 2020, at 11:26 AM, Daniel Holth  wrote:
> 
> Was it regular cffi or cffi's embedding API, which is used a bit differently 
> than regular cffi, that "seems to only solve a fraction of the problem"? Was 
> just playing around with the embedding API and was impressed.
> 
> In Python:
> 
> @ffi.def_extern()
> def uwsgi_pyexample_init():
>     print("init called")
> 
>     return 0
> 
> In C (embedded in the same plugin):
> 
> CFFI_DLLEXPORT struct uwsgi_plugin pyexample_plugin = {
>     .init = uwsgi_pyexample_init
> };
> 
> Seems to be happily importing and exporting APIs. Interpreter starts the 
> first time a @ffi.def_extern() function is called.
> 
> https://cffi.readthedocs.io/en/latest/embedding.html
> 
> https://github.com/unbit/uwsgi/blob/f6ad0c6dfe431d91ffe365bed3105ed052bef6e4/plugins/pyexample/pyexample_plugin.py

I might need to understand cffi embedding more to really answer your question - 
and it’s entirely possible cffi can do this - but as a simple example:

How would I call a Python function from the C++ application that returns a 
Python object to C++ and then call a method on that Python object from C++?

My specific example is that I create Python handlers for Qt windows and then 
from the Qt/C++ I call methods on those Python objects from C++ such as “handle 
mouse event”.

___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/6PZ7OJSPXJVQ2BLZWGOOMWZN6EINLQAV/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread Daniel Holth
It can be done exactly like passing a void* when registering a C callback
and getting it passed back to your callback function.
https://cffi.readthedocs.io/en/latest/ref.html#ffi-new-handle-ffi-from-handle

https://bitbucket.org/dholth/kivyjoy/src/aaeab79b2891782209a1219cd65a4d9716cea669/kivyjoy/controller.py#lines-49
https://bitbucket.org/dholth/kivyjoy/src/aaeab79b2891782209a1219cd65a4d9716cea669/kivyjoy/__init__.py#lines-15
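
A minimal sketch of that void* round trip. To stay self-contained it uses ctypes from the standard library rather than cffi's ffi.new_handle() / ffi.from_handle(), and it relies on the CPython detail that id() is the object's address. `MouseHandler` and `on_mouse` are hypothetical names, and the callback is invoked directly here where a real application would hand it to a C registration function.

```python
import ctypes


class MouseHandler:
    """Hypothetical Python-side handler, standing in for a Qt window handler."""

    def __init__(self):
        self.events = []

    def handle_mouse_event(self, x, y):
        self.events.append((x, y))
        return len(self.events)


# C-style signature: int callback(void *user_data, int x, int y)
CALLBACK = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.c_void_p, ctypes.c_int, ctypes.c_int
)


@CALLBACK
def on_mouse(user_data, x, y):
    # Recover the Python object from the opaque pointer handed back by "C".
    target = ctypes.cast(user_data, ctypes.py_object).value
    return target.handle_mouse_event(x, y)


# The handler must stay alive for as long as the raw pointer is in use.
handler = MouseHandler()
user_data = ctypes.c_void_p(id(handler))  # CPython detail: id() is the address

# Stand-in for the C side firing the registered callback.
result = on_mouse(user_data, 10, 20)
print(result, handler.events)
```

The lifetime caveat is the same as with cffi handles: the pointer does not keep the object alive, so the Python side has to hold a reference until the C side is done with it.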
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/QDHQANZJAKPTTOJHXLGFSJWKLE5NA6ZU/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread Daniel Holth
Was it regular cffi or cffi's embedding API, which is used a bit
differently than regular cffi, that "seems to only solve a fraction of the
problem"? Was just playing around with the embedding API and was impressed.

In Python:

@ffi.def_extern()
def uwsgi_pyexample_init():
    print("init called")

    return 0

In C (embedded in the same plugin):

CFFI_DLLEXPORT struct uwsgi_plugin pyexample_plugin = {
    .init = uwsgi_pyexample_init
};

Seems to be happily importing and exporting APIs. Interpreter starts the
first time a @ffi.def_extern() function is called.

https://cffi.readthedocs.io/en/latest/embedding.html

https://github.com/unbit/uwsgi/blob/f6ad0c6dfe431d91ffe365bed3105ed052bef6e4/plugins/pyexample/pyexample_plugin.py
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/S5P524LXXNIXIB2MUATZNHYQE57MXCQ5/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread Eric Fahlgren
On Mon, Apr 13, 2020 at 9:00 AM Steve Dower  wrote:

> On 13Apr2020 1325, Paul Moore wrote:
> > Personally, I'd say that "recommended 3rd party tools" reads as saying
> > "if you want a 3rd party tool to build extensions, these are good (and
> > are a lot easier than using the raw C API)". That's a lot different
> > than saying "we recommend that people writing C extensions do not use
> > the raw C API, but use one of these tools instead".
>
> Yeah, that's fair. But at the same time, saying anything stronger is
> an endorsement that we might have to withdraw at some point in the
> future (if the project we recommend implodes, for example).
>

Ok, so put that in a Pros/Cons list that provides guidance as to what
interface and tools to choose when writing a new extension module.
Personally, I'd put Cython (and other "big" packages, numpy, requests and
such) on par with CPython itself with respect to "likely to implode and
become unusable."
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/VFRGMKVVE3OUJUXARNEIC67GFC6H22K7/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread Chris Meyer
> On Apr 13, 2020, at 5:25 AM, Paul Moore  wrote:
> 
> On a related but different note, what is the recommended policy
> (assuming it's not to use the C API) for embedding Python, and for
> exposing the embedding app to Python as a C extension? My standard
> example of this is the Vim interface to Python - see
> https://github.com/vim/vim/blob/master/src/if_python3.c. I originally
> wrote this back in the Python 1.5 days, so it's *very* old, and quite
> likely not how I'd write it now, even using the C API. But what's the
> recommendation for code like that in the face of these changes, and
> the suggestion that using 3rd party tools is the normal way to write C
> extensions?

I’d like to +1 this request for a standard for embedding Python while
at the same time exposing the embedding app to Python as a C extension. We do
something remarkably similar to Vim here (along with other files in the same 
directory).

https://github.com/nion-software/nionui-tool/blob/master/launcher/PythonStubs.cpp
 


I’ve looked into cffi but it seems to only solve a fraction of the problem. Our
Qt-based application embeds Python and provides callbacks from Python to our
application. It runs on macOS, Linux, and Windows and runs unchanged on
Python 3.6, 3.7, and 3.8 since it dynamically links to Python.

___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/5RMFHARU7AY5TY7TKU356UTNHCLYBDYB/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread Steve Dower

On 13Apr2020 1325, Paul Moore wrote:

> Personally, I'd say that "recommended 3rd party tools" reads as saying
> "if you want a 3rd party tool to build extensions, these are good (and
> are a lot easier than using the raw C API)". That's a lot different
> than saying "we recommend that people writing C extensions do not use
> the raw C API, but use one of these tools instead".


Yeah, that's fair. But at the same time, saying anything stronger is
an endorsement that we might have to withdraw at some point in the
future (if the project we recommend implodes, for example).



> Also, if we *are* going to push people away from the raw C API, then I
> think we should be recommending a particular tool (likely Cython) as
> what people writing their first extension (or wanting to switch from
> the raw C API for the first time) should use. Faced with the API docs,
> and a list of 3rd party options, I know that *I* am likely to say
> "yeah, leave that research for another day, I'll use what's in the
> docs in front of me for now". Also, if we are expecting to push people
> towards 3rd party tools, that seems to me to be a relatively
> significant shift in emphasis, and one we should be publicising more
> directly (via What's New, and blog postings / release announcements,
> etc.) In the absence of anything like that, I think it's quite
> reasonable for people to gravitate towards the traditional C API.


Right, except we haven't decided to do it yet. There's still a debate 
about whether the current third party tools are even sufficient (not to 
mention what "sufficient" means).



> Having said all this, I *do* think that promoting some 3rd party tool
> (as I say, I suspect this would be Cython) as the recommended means of
> writing C extensions, is a reasonable approach to take. I just object
> to it happening "quietly" via changes like this which make it harder
> to use the raw C API, justifying themselves by saying "you shouldn't
> do that anyway".


Agreed, I'd rather be up front about it.


> On a related but different note, what is the recommended policy
> (assuming it's not to use the C API) for embedding Python, and for
> exposing the embedding app to Python as a C extension? My standard
> example of this is the Vim interface to Python - see
> https://github.com/vim/vim/blob/master/src/if_python3.c. I originally
> wrote this back in the Python 1.5 days, so it's *very* old, and quite
> likely not how I'd write it now, even using the C API. But what's the
> recommendation for code like that in the face of these changes, and
> the suggestion that using 3rd party tools is the normal way to write C
> extensions?


I don't think any current 3rd party tools really help with embedding (I 
say that as a regular embedder, not as someone who skim-read their 
docs). In this case, you really do need low-level access to Python's 
thread and memory management, and the ability to interact directly with 
the rest of your application's data structures.


PyBind11 is the best I've used here - Cython insists on including all 
its boilerplate to make a complete module, which often is not what you 
want. But there's a lot of core things that need to be improved if 
embedding is going to get any better, as I've posted often enough. We 
can't rely on third-party tools here, yet.


Cheers,
Steve
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/F6F6HQPSOEEQKLW2M6OQSSVMWXZHQ6Y3/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread Fabio Zadrozny
> * Hide implementation details from the C API to be able to `optimize
>   CPython`_ and make PyPy more efficient.
> * The expectation is that `most C extensions don't rely directly on
>   CPython internals`_ and so will remain compatible.
> * Continue to support old unmodified C extensions by continuing to
>   provide the fully compatible "regular" CPython runtime.
> * Provide a `new optimized CPython runtime`_ using the same CPython code
>   base: faster but can only import C extensions which don't use
>   implementation details. Since both CPython runtimes share the same
>   code base, features implemented in CPython will be available in both
>   runtimes.
>
>
Adding my 2 cents as someone who does use the CPython API (for a debugger).

I must say I'm -1 until alternative APIs needed are available in the
optimized CPython runtime (I'd also say that this is a really big
incompatible change and would need a Python 4.0 to do)... I guess that in
order for this to work, the first step wouldn't be breaking everyone but
talking to extension authors (maybe checking for the users of the APIs
which will be deprecated) and seeing alternatives before pushing something
which will break CPython extensions which rely on such APIs.

I also don't think that CPython should have 2 runtimes... if the idea is to
leverage extensions on other Python implementations, I think going just
for a more limited API is the way to go (but instead of just breaking
extensions that use the CPython internal API, try to come up with
alternative APIs for the users of the current CPython API -- for my use
case, I know the debugger could definitely do with just a few simple
additions: it uses the internal API mostly because there aren't real
alternatives for a couple of use cases). I.e.: if numpy/pandas/ doesn't
adopt the optimized runtime because they don't have the support they need,
it won't be useful to have it in the first place (you'd just be in the same
place where other Python implementations already are).

Also, this should probably follow the usual deprecation cycle: do a major
CPython release which warns about using the APIs that'll be deprecated and
only in the next CPython release should those APIs be actually removed (and
when that's done it probably deserves to be called Python 4).

Cheers,

Fabio
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/IT6TQLDRII66K4T42NU2ZFTYOE6GYBRI/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread Rhodri James

On 13/04/2020 11:17, Steve Dower wrote:
> On 11Apr2020 1156, Rhodri James wrote:
> > On 10/04/2020 18:20, Victor Stinner wrote:
> > > Note: Cython and cffi should be preferred to write new C extensions.
> > > This PEP is about existing C extensions which cannot be rewritten with
> > > Cython.
> >
> > If this is true, the documentation on python.org needs a serious
> > rewrite.  I am in the throes of writing a C extension, and using
> > Cython or cffi never even crossed my mind.
>
> Sorry you missed the first two sections: "Recommended third party tools"
> and "Creating extensions without third party tools".
>
> https://docs.python.org/3/extending/index.html


"Creating extensions without third party tools" is what I read, because 
the preceding sections suggested to me that was what I was supposed to do.


The opening paragraph of the document more or less reads as "This is how 
you write C extensions."  It's an intro.  That's fair enough.


The next section, "Recommended third party tools", basically says "Third 
party tools exist."  It notably does not say "Use them in preference to 
what follows," so I didn't even look at them.  There is in fact a mild 
statement at the end of the first paragraph of the next section that 
should have clued me in, but I missed it because I'd already skipped to 
the table of contents.


> If you have any suggestions on how to make this recommendation more
> obvious, please open an issue and describe what would have helped.


I'll give it some thought, but fundamentally if you want people to use 
the third party tools, you need a much stronger statement to that 
effect.  I'm sure I'm not the only one whose reaction to "third party" 
is "not official then".


--
Rhodri James *-* Kynesim Ltd
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/4NDJSPJC7FKY7QGHLE3PVK4FGPMBW3EV/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread Paul Moore
On Mon, 13 Apr 2020 at 11:20, Steve Dower  wrote:
>
> On 11Apr2020 1156, Rhodri James wrote:
> > On 10/04/2020 18:20, Victor Stinner wrote:
> >> Note: Cython and cffi should be preferred to write new C extensions.
> >> This PEP is about existing C extensions which cannot be rewritten with
> >> Cython.
> >
> > If this is true, the documentation on python.org needs a serious
> > rewrite.  I am in the throes of writing a C extension, and using Cython
> > or cffi never even crossed my mind.
> >
>
> Sorry you missed the first two sections: "Recommended third party tools"
> and "Creating extensions without third party tools".
>
> https://docs.python.org/3/extending/index.html
>
> If you have any suggestions on how to make this recommendation more
> obvious, please open an issue and describe what would have helped.

Personally, I'd say that "recommended 3rd party tools" reads as saying
"if you want a 3rd party tool to build extensions, these are good (and
are a lot easier than using the raw C API)". That's a lot different
than saying "we recommend that people writing C extensions do not use
the raw C API, but use one of these tools instead".

Also, if we *are* going to push people away from the raw C API, then I
think we should be recommending a particular tool (likely Cython) as
what people writing their first extension (or wanting to switch from
the raw C API for the first time) should use. Faced with the API docs,
and a list of 3rd party options, I know that *I* am likely to say
"yeah, leave that research for another day, I'll use what's in the
docs in front of me for now". Also, if we are expecting to push people
towards 3rd party tools, that seems to me to be a relatively
significant shift in emphasis, and one we should be publicising more
directly (via What's New, and blog postings / release announcements,
etc.) In the absence of anything like that, I think it's quite
reasonable for people to gravitate towards the traditional C API.

Having said all this, I *do* think that promoting some 3rd party tool
(as I say, I suspect this would be Cython) as the recommended means of
writing C extensions, is a reasonable approach to take. I just object
to it happening "quietly" via changes like this which make it harder
to use the raw C API, justifying themselves by saying "you shouldn't
do that anyway".

On a related but different note, what is the recommended policy
(assuming it's not to use the C API) for embedding Python, and for
exposing the embedding app to Python as a C extension? My standard
example of this is the Vim interface to Python - see
https://github.com/vim/vim/blob/master/src/if_python3.c. I originally
wrote this back in the Python 1.5 days, so it's *very* old, and quite
likely not how I'd write it now, even using the C API. But what's the
recommendation for code like that in the face of these changes, and
the suggestion that using 3rd party tools is the normal way to write C
extensions?

Paul
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/TWTWVC4IHBNV2P4HCFFOQZ2WZKEONQHB/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread Steve Dower

On 13Apr2020 1122, Steve Dower wrote:

On 11Apr2020 0111, Victor Stinner wrote:

Steve: the use case is to debug very rare Python crashes (ex: once
every two months) of customers who fail to provide a reproducer. My
*expectation* is that a debug build should help to reproduce the bug
and/or provide more information when the bug happens. My motivation
for this feature is also to show that the bug is not on Python but in
third-party C extensions ;-)


I think your expectation is wrong. If a stack trace of the crash doesn't 
show that it belongs to the third party module (which most of the ones 
that are sent back on Windows indeed show), then you need more invasive 
tracing to show that the issue came from the module. Until we actually 
have opaque, non-static objects, that doesn't seem to be possible.


All you've done right now is enable new inconsistencies and potential 
issues when mixing debug and release builds. That just makes things 
harder to diagnose.


I think what you really wanted to do here was have a build option 
_other_ than the debug flag to turn on additional checks. Like you did 
with tracemalloc.


The debug flag turns on additional runtime checks in the underlying C 
compiler and runtime on Windows (and I presume elsewhere? Is this such a 
crazy idea?), such as buffer overrun detection and memory misuse. The 
only way to make a debug build properly compatible with a release build 
is to disable these checks, which leaves us completely unable to take 
advantage of them. It also significantly speeds up compile time, which 
is very useful as a developer.


But if your goal is to have a release build that includes additional 
ABI-transparent checks, then I don't see why you wouldn't just build 
with those options? It's not like CPython takes that long to build from 
a clean working directory.
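
As a small illustration of diagnostics that can be switched on without a debug build, the sketch below first checks whether the running interpreter is itself a debug build and then enables tracemalloc at run time. It assumes CPython, where sys.gettotalrefcount exists only in debug builds; the allocation is just sample data.

```python
import sys
import tracemalloc

# CPython debug builds expose sys.gettotalrefcount; release builds do not.
is_debug_build = hasattr(sys, "gettotalrefcount")

# tracemalloc is opt-in at run time and works on an ordinary release build.
tracemalloc.start()
payload = [bytes(1000) for _ in range(100)]  # allocate roughly 100 kB
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"debug build: {is_debug_build}, traced: {current} bytes, peak: {peak} bytes")
```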


Cheers,
Steve
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/AY3QEYN6JEQFEZVJ2MUT5A2SJ5I72RAS/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread Steve Dower

On 13Apr2020 1157, Antoine Pitrou wrote:
> On Mon, 13 Apr 2020 11:35:34 +0100
> Steve Dower  wrote:
> > > > and this code
> > > > that they're using doesn't have any system dependencies that differ in
> > > > debug builds (spoiler: they do).
> > >
> > > Are you talking about Windows?  On non-Windows systems, I don't think
> > > there are "system dependencies that differ in debug builds".
> >
> > Of course I'm talking about Windows. I'm about the only person here who
> > does, and I'm having to represent at least half of our overall userbase
> > (look up my PyCon 2019 talk for the charts).
>
> Ok :-)  However, Victor's point holds for non-Windows platforms, which
> is *also* half of our userbase.


True, though probably not the half sending him binary extension modules 
that nobody can rebuild ;)


Cheers,
Steve
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/CFJXEYIVPCJWCL27WN4BKEYN2RBKY3O5/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread Antoine Pitrou
On Mon, 13 Apr 2020 11:35:34 +0100
Steve Dower  wrote:
> 
> Neither Windows nor macOS support fork (macOS only recently).

Victor's argument: "fork() is not terrific with inline reference
counts".

My argument: people shouldn't generally use fork() anyway, because it
has other issues.

My statement that people should prefer "forkserver" was in that
context (if you are trying to build parallel applications using fork()
calls, think twice). Obviously on Windows you'll use the "spawn"
method, because it's the only available one ;-)  And on macOS, you'll
probably do whatever the latest recommended thing to do is ("spawn", I
suppose).
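
To make the start-method advice above concrete, here is a minimal sketch of picking a start method explicitly instead of relying on fork(). The helper name pick_start_method and the preference order are assumptions for illustration, not something from this thread; multiprocessing.get_all_start_methods() and multiprocessing.get_context() are the standard library calls.

```python
import multiprocessing


def pick_start_method(preferred=("forkserver", "spawn")):
    """Return the first preferred start method this platform supports.

    The preference order is an assumption for illustration: "forkserver"
    where it exists, otherwise "spawn" (the only method on Windows).
    """
    available = multiprocessing.get_all_start_methods()
    for method in preferred:
        if method in available:
            return method
    # Nothing preferred is available: fall back to the platform default,
    # which get_all_start_methods() lists first.
    return available[0]


def make_context():
    """Build a multiprocessing context bound to the chosen start method."""
    return multiprocessing.get_context(pick_start_method())
```

An application would then create its Pool or Process objects from make_context() rather than from the module-level functions, so the choice stays local and does not change the global default.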

> >> Separating refcounts theoretically improves cache locality, specifically
> >> the case where cache invalidation impacts multiple CPUs (and even the
> >> case where a single thread moves between CPUs).  
> > 
> > I'm a bit curious why it would improve, rather than degrade, cache
> > locality. If you take the typical example of the eval loop, an object
> > is incref'ed and decref'ed just about the same time that it gets used.  
> 
> Two CPUs can read the contents of a string from their own cache. As soon 
> as one touches the refcount, the cache line containing both the refcount 
> and the string data in the other CPU is invalidated, and now it has to 
> wait for synchronisation before reading the data.

Ah, you're right.  However, the GIL should make such events less
frequent than in a language like C++.  Compared to the overhead of
looking up reference counts in a different memory area (probably using a
non-trivial algorithm to determine the exact address), I'm not sure
which factor would dominate.

> >> and this code
> >> that they're using doesn't have any system dependencies that differ in
> >> debug builds (spoiler: they do).  
> > 
> > Are you talking about Windows?  On non-Windows systems, I don't think
> > there are "system dependencies that differ in debug builds".  
> 
> Of course I'm talking about Windows. I'm about the only person here who 
> does, and I'm having to represent at least half of our overall userbase 
> (look up my PyCon 2019 talk for the charts).

Ok :-)  However, Victor's point holds for non-Windows platforms, which
is *also* half of our userbase.

Regards

Antoine.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/OYM7W6QFVGGVP4EX32BEM37WB5O7DFIX/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread Steve Dower

On 11Apr2020 0025, Antoine Pitrou wrote:
> On Fri, 10 Apr 2020 23:33:28 +0100
> Steve Dower wrote:
>> On 10Apr2020 2055, Antoine Pitrou wrote:
>>> On Fri, 10 Apr 2020 19:20:00 +0200
>>> Victor Stinner wrote:
>>>>
>>>> Note: Cython and cffi should be preferred to write new C extensions.
>>>> This PEP is about existing C extensions which cannot be rewritten with
>>>> Cython.
>>>
>>> Using Cython does not make the C API irrelevant.  In some
>>> applications, the C API has to be low-level enough for performance.
>>> Whether the application is written in Cython or not.
>>
>> It does to the code author.
>>
>> The point here is that we want authors who insist on coding against the
>> C API to be aware that they have fewer compatibility guarantees [...]
>
> Yeah, you missed the point of my comment here.  Cython *does* call into
> the C API, and it's quite insistent on performance optimizations too.
> Saying "just use Cython" doesn't make the C API unimportant - it just
> hides it from your own sight.


It centralises the change. I have no problem giving Cython access to 
things that we discourage every developer from using, provided they 
remain responsive to change and use the special access responsibly (e.g. 
by not touching reserved fields at all).


We could do a better job of helping them here.


>>>> **Backward compatibility:** backward incompatible on purpose. Break the
>>>> limited C API and the stable ABI, with the assumption that `Most C
>>>> extensions don't rely directly on CPython internals`_ and so will remain
>>>> compatible.
>>>
>>> The problem here is not only compatibility but potential performance
>>> regressions in C extensions.
>>
>> I don't think we've ever guaranteed performance between releases.
>> Correctness, sure, but not performance.
>
> That's a rather weird argument.  Just because you don't guarantee
> performance doesn't mean it's ok to introduce performance regressions.
>
> It's especially a weird argument to make when discussing a PEP where
> most of the arguments are distant promises of improved performance.


If you've guaranteed compatibility but not performance, it means you can 
make changes that prioritise compatibility over performance.


If you promise to keep everything the same, you can never change 
anything. Arguing that everything is an implied contract between major 
version releases is the weird argument.



>>>> Fork and "Copy-on-Read" problem
>>>> ...............................
>>>>
>>>> Solve the "Copy on read" problem with fork: store reference counter
>>>> outside ``PyObject``.
>>>
>>> Nowadays it is strongly recommended to use multiprocessing with the
>>> "forkserver" start method:
>>> https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
>>>
>>> With "forkserver", the forked process is extremely lightweight and
>>> there are little savings to be made in the child.
>>
>> Unfortunately, a recommendation that only applies to a minority of
>> Python users. Oh well.
>
> Which "minority" are you talking about?  Neither of us has numbers, but
> I'm quite sure that the population of Python users calling into
> multiprocessing (or a third-party library relying on multiprocessing,
> such as Dask) is much larger than the population of Python users
> calling fork() directly and relying on copy-on-write for optimization
> purposes.
>
> But if you have a different experience to share, please do so.


Neither Windows nor macOS support fork (macOS only recently).

Break that down however you like, but by number of *developers* (as 
opposed to number of machines), and factoring in those who care about 
cross-platform compatibility, fork is not a viable thing to rely on.



>> Separating refcounts theoretically improves cache locality, specifically
>> the case where cache invalidation impacts multiple CPUs (and even the
>> case where a single thread moves between CPUs).
>
> I'm a bit curious why it would improve, rather than degrade, cache
> locality. If you take the typical example of the eval loop, an object
> is incref'ed and decref'ed just about the same time that it gets used.


Two CPUs can read the contents of a string from their own cache. As soon 
as one touches the refcount, the cache line containing both the refcount 
and the string data in the other CPU is invalidated, and now it has to 
wait for synchronisation before reading the data.


If the refcounts are in a separate cache line, this synchronization 
doesn't have to happen.
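
The cache-line argument above comes down to address arithmetic, which the sketch below works through. Everything in it is an assumption for illustration: the 64-byte line size, the addresses, and the helper names are hypothetical, and real object layouts differ between builds.

```python
CACHE_LINE_SIZE = 64  # bytes; an assumed, commonly seen line size


def cache_line_index(address, line_size=CACHE_LINE_SIZE):
    """Return the index of the cache line that holds this byte address."""
    return address // line_size


def share_cache_line(addr_a, addr_b, line_size=CACHE_LINE_SIZE):
    """True when both addresses fall in the same cache line."""
    return cache_line_index(addr_a, line_size) == cache_line_index(addr_b, line_size)


# Hypothetical layout 1: refcount at the start of the object, string data
# 48 bytes later. Writing the refcount dirties the line holding the data.
inline_refcount_addr = 0x1000
string_data_addr = 0x1000 + 48

# Hypothetical layout 2: refcount kept in a separate table elsewhere.
external_refcount_addr = 0x8000

same_line_inline = share_cache_line(inline_refcount_addr, string_data_addr)
same_line_external = share_cache_line(external_refcount_addr, string_data_addr)
```

Under these assumed numbers the inline layout shares a line and the external one does not, which is the case where a refcount write on one CPU no longer invalidates the other CPU's cached copy of the data.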



> I'll also note that the PEP proposes to remove APIs which return
> borrowed references... yet increasing the number of cases where
> accessing an object implies updating its refcount.


Yeah, I'm more okay with keeping borrowed references in some cases, but 
it does make things more complicated. Apparently some developers get it 
wrong consistently enough that we have to fix it? (ALL developers get it 
wrong during development ;) )



>> and this code
>> that they're using doesn't have any system dependencies that differ in
>> debug builds (spoiler: they do).
>
> Are you talking about Windows?  On non-Windows systems, I don't think
> there are "system dependencies that differ in debug builds".


Of course I'm 

[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread Steve Dower

On 11Apr2020 1156, Rhodri James wrote:
> On 10/04/2020 18:20, Victor Stinner wrote:
>> Note: Cython and cffi should be preferred to write new C extensions.
>> This PEP is about existing C extensions which cannot be rewritten with
>> Cython.
>
> If this is true, the documentation on python.org needs a serious
> rewrite.  I am in the throes of writing a C extension, and using Cython
> or cffi never even crossed my mind.




Sorry you missed the first two sections: "Recommended third party tools" 
and "Creating extensions without third party tools".


https://docs.python.org/3/extending/index.html

If you have any suggestions on how to make this recommendation more 
obvious, please open an issue and describe what would have helped.


Cheers,
Steve
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/7QZOSBXNNVQV6LYX3DPKI2RLSJ2K7XRY/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread Steve Dower

On 11Apr2020 0111, Victor Stinner wrote:
> Steve: the use case is to debug very rare Python crashes (ex: once
> every two months) of customers who fail to provide a reproducer. My
> *expectation* is that a debug build should help to reproduce the bug
> and/or provide more information when the bug happens. My motivation
> for this feature is also to show that the bug is not on Python but in
> third-party C extensions ;-)


I think your expectation is wrong. If a stack trace of the crash doesn't 
show that it belongs to the third party module (which most of the ones 
that are sent back on Windows indeed show), then you need more invasive 
tracing to show that the issue came from the module. Until we actually 
have opaque, non-static objects, that doesn't seem to be possible.


All you've done right now is enable new inconsistencies and potential 
issues when mixing debug and release builds. That just makes things 
harder to diagnose.


Cheers,
Steve
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/66K6TAPXEDBGSZFJLRNZNKNPCHIPUAAY/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-11 Thread Phil Thompson

On 11/04/2020 13:08, Ivan Pozdeev via Python-Dev wrote:
> On 10.04.2020 20:20, Victor Stinner wrote:
>>
>> Stable ABI
>> ----------
>>
>> The idea is to build a C extension only once: the built binary will be
>> usable on multiple Python runtimes and different versions of the same
>> runtime (stable ABI).
>>
>> The idea is not new but is an extension of the `PEP 384: Defining a
>> Stable ABI`__ implemented in CPython 3.4 with its "limited C API".
>> The limited API is not used by default and is not widely used: PyQt is
>> one of the only few known users.
>>
>> The idea here is that the default C API becomes the limited C API and so
>> all C extensions will benefit of advantages of a stable ABI.
>
> In my practice with helping maintain a C extension module, it's not a
> problem to build the module separately for every minor release.
>
> That's because there are only a few officially supported releases, and
> they aren't released frequently.
>
> Conversely, if you are using a "limited ABI", you are "limited" (pun
> intended) to what it has and can't take advantage of any new features
> until the next major Python version -- i.e. for potentially several
> years!
>
> So I don't see any "advantages of a stable ABI" atm that matter in
> practice while I do see _dis_advantages. So this area can perhaps be
> excluded from the PEP or at least given low priority.
> Unless, of course, you have some other, more real upcoming "advantages"
> in mind.


PyQt uses the stable ABI because it dramatically reduces the number of 
wheels that need to be created for a full release.


PyQt consists of 6 different PyPI packages. Wheels are provided for 4 
different platforms. Currently Python v3.5 to v3.8 are supported.


With the stable ABI that's 24 wheels for a full release. No additional 
wheels are needed when Python v3.9 is supported.


Without the stable ABI it would be 96 wheels. 24 additional wheels would 
be needed when Python v3.9 is supported.
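
The wheel counts above follow from simple multiplication, sketched below. The function name wheel_count is hypothetical; the inputs (6 packages, 4 platforms, Python v3.5 to v3.8) are the figures quoted in this message.

```python
def wheel_count(packages, platforms, python_versions, stable_abi):
    """Number of wheels needed for one full release.

    With the stable ABI one wheel per (package, platform) pair covers every
    supported Python version; without it, each version needs its own wheel.
    """
    per_version = packages * platforms
    return per_version if stable_abi else per_version * python_versions


with_stable_abi = wheel_count(6, 4, 4, stable_abi=True)
without_stable_abi = wheel_count(6, 4, 4, stable_abi=False)
# Extra wheels needed once a fifth version (v3.9) is supported.
extra_for_new_version = wheel_count(6, 4, 5, stable_abi=False) - without_stable_abi
```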


Phil
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/GJLDX6SYWLOJ7JFAE6LCJZ6WEQJAYRGG/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-11 Thread Ivan Pozdeev via Python-Dev


On 10.04.2020 20:20, Victor Stinner wrote:

Hi,

Here is a first draft of a PEP which summarizes the research work I've
been doing on the CPython C API since 2017 and the changes that others
and I have already made since Python 3.7 towards an "opaque" C API. The PEP is
also a collaboration with developers of PyPy, HPy, Rust-CPython and
many others! Thanks to everyone who helped me to write it down!

Maybe this big document should be reorganized as multiple smaller
better defined goals: as multiple PEPs. The PEP is quite long and
talks about things which are not directly related. It's a complex
topic and I chose to put everything as a single document to have a
good starting point to open the discussion. I already proposed some of
these ideas in 2017: see the Prior Art section ;-)

The PEP can be read on GitHub where it's better formatted:
https://github.com/vstinner/misc/blob/master/cpython/pep-opaque-c-api.rst

If someone wants to work on the PEP itself, the document on GitHub is
the current reference.

Victor



PEP xxx: Modify the C API to hide implementation details


Abstract
========

* Hide implementation details from the C API to be able to `optimize
   CPython`_ and make PyPy more efficient.
* The expectation is that `most C extensions don't rely directly on
   CPython internals`_ and so will remain compatible.
* Continue to support old unmodified C extensions by continuing to
   provide the fully compatible "regular" CPython runtime.
* Provide a `new optimized CPython runtime`_ using the same CPython code
   base: faster but can only import C extensions which don't use
   implementation details. Since both CPython runtimes share the same
   code base, features implemented in CPython will be available in both
   runtimes.
* `Stable ABI`_: Only build a C extension once and use it on multiple
   Python runtimes and different versions of the same runtime.
* Better advertise alternative Python runtimes and better communicate on
   the differences between the Python language and the Python
   implementation (especially CPython).

Note: Cython and cffi should be preferred to write new C extensions.
This PEP is about existing C extensions which cannot be rewritten with
Cython.


Rationale
=========

To remain competitive in terms of performance with other programming
languages like Go or Rust, Python has to become more efficient.

Make Python (at least) two times faster
---------------------------------------

The C API leaks too many implementation details which prevent optimizing
CPython. See `Optimize CPython`_.

PyPy's support for Python's C API (cpyext) is slow because it has to
emulate CPython internals like memory layout and reference counting. The
emulation causes memory overhead, memory copies, conversions, etc. See
`Inside cpyext: Why emulating CPython C API is so Hard`_
(Sept 2018) by Antonio Cuni.

While this PEP may make CPython a little bit slower in the short term,
the long-term goal is to make "Python" at least two times faster. This
goal is not hypothetical: PyPy is already 4.2x faster than CPython and is
fully compatible. C extensions are the bottleneck of PyPy. This PEP
proposes a migration plan to move towards opaque C API which would make
PyPy faster.

Separated the Python language and the CPython runtime (promote alternative runtimes)
------------------------------------------------------------------------------------


The Python language should be better separated from its runtime. It's
common to say "Python" when referring to "CPython". Even in this PEP :-)

Because the CPython runtime remains the reference implementation, many
people believe that the Python language itself has design flaws which
prevent it from being efficient. PyPy proved that this is a false
assumption: on average, PyPy runs Python code 4.2 times faster than
CPython.

One solution for separating the language from the implementation is to
promote the usage of alternative runtimes: not only provide the regular
CPython, but also PyPy, optimized CPython which is only compatible with
C extensions using the limited C API, CPython compiled in debug mode to
ease debugging issues in C extensions, RustPython, etc.

To make alternative runtimes viable, they should be competitive in terms
of features and performance. Currently, C extension modules remain the
bottleneck for PyPy.

Most C extensions don't rely directly on CPython internals
----------------------------------------------------------

While the C API is still tightly coupled to CPython internals, in
practice, most C extensions don't rely directly on CPython internals.

The expectation is that these C extensions will remain compatible with
an "opaque" C API and only a minority of C extensions will have to be
modified.

Moreover, more and more C extensions are implemented in Cython or 

[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-11 Thread Rhodri James

On 10/04/2020 18:20, Victor Stinner wrote:
> Note: Cython and cffi should be preferred to write new C extensions.
> This PEP is about existing C extensions which cannot be rewritten with
> Cython.


If this is true, the documentation on python.org needs a serious 
rewrite.  I am in the throes of writing a C extension, and using Cython 
or cffi never even crossed my mind.


--
Rhodri James *-* Kynesim Ltd
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/NEYKZMQ6ZWB5VGS22SZTVQPVOJETDKUQ/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-11 Thread Antoine Pitrou
On Sat, 11 Apr 2020 01:52:13 +0200
Victor Stinner  wrote:
> 
> By the way, CPython currently uses statically allocated types for
> builtin types like str or list. This may have to change to efficiently
> run multiple subinterpreters in parallel: each subinterpreter
> should have its own heap-allocated type with its own reference
> counter.
> 
> Using heap allocated types means that PyUnicode_Check() implementation
> has to change. It's just another good reason to better hide
> PyUnicode_Check() implementation right now ;-)

I'm not sure I understand.  If PyUnicode_Check() uses tp_flags, it
doesn't have to change, precisely.

Regards

Antoine.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/QKBWU2IQSGYKSU3SQZ5N25KJ47ZDXOIE/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-11 Thread Antoine Pitrou
On Sat, 11 Apr 2020 02:11:41 +0200
Victor Stinner  wrote:
> On Fri, 10 Apr 2020 at 22:00, Antoine Pitrou wrote:
> > > Debug runtime and remove debug checks in release mode
> > > .....................................................
> > >
> > > If the C extensions are no longer tied to CPython internals, it becomes
> > > possible to switch to a Python runtime built in debug mode to enable
> > > runtime debug checks to ease debugging C extensions.  
> >
> > That's the one convincing feature in this PEP, as far as I'm concerned.  
> 
> In fact, I already implemented this feature in Python 3.8:
> https://docs.python.org/dev/whatsnew/3.8.html#debug-build-uses-the-same-abi-as-release-build

I had missed that. Great! :-)

Regards

Antoine.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/MW3K7AHKDVWMKJCOFFUK5DR3GIDPG6WK/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-10 Thread Victor Stinner
On Fri, 10 Apr 2020 at 22:00, Antoine Pitrou wrote:
> > Debug runtime and remove debug checks in release mode
> > .....................................................
> >
> > If the C extensions are no longer tied to CPython internals, it becomes
> > possible to switch to a Python runtime built in debug mode to enable
> > runtime debug checks to ease debugging C extensions.
>
> That's the one convincing feature in this PEP, as far as I'm concerned.

In fact, I already implemented this feature in Python 3.8:
https://docs.python.org/dev/whatsnew/3.8.html#debug-build-uses-the-same-abi-as-release-build

Feature implemented on most platforms, except on Android, Cygwin and
Windows sadly.

You can now switch between a release build of Python and a debug build
of Python without having to rebuild your C extensions which were
compiled in release mode. If you want, you can use a debug build of
some C extensions. You now have many options: the debug ABI is now
compatible with the release ABI.
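
When switching between the two kinds of build, it helps to check from Python which one is actually running. The sketch below is one way to do that; the function names are hypothetical, and it relies on sys.gettotalrefcount() being present only in debug builds of CPython.

```python
import sys


def is_debug_build():
    """True when running on a CPython debug build.

    sys.gettotalrefcount() is only defined when CPython is compiled in
    debug mode, so its presence is used here as the indicator.
    """
    return hasattr(sys, "gettotalrefcount")


def describe_runtime():
    """Collect a few facts worth attaching to a crash report."""
    return {
        "implementation": sys.implementation.name,
        "version": sys.version.split()[0],
        "debug_build": is_debug_build(),
    }
```

Logging describe_runtime() at startup makes it easier to tell afterwards whether a rare crash happened under the release or the debug interpreter.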

This PEP section is mostly a call to remove debug checks in release
mode :-) In my latest attempt, I failed to explain that the debug
build is now easy enough to be used by developers in practice (I never
finished my article explaining how to use it):
https://bugs.python.org/issue37406

Steve: the use case is to debug very rare Python crashes (ex: once
every two months) of customers who fail to provide a reproducer. My
*expectation* is that a debug build should help to reproduce the bug
and/or provide more information when the bug happens. My motivation
for this feature is also to show that the bug is not on Python but in
third-party C extensions ;-)

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/72KHLMKWFUGEI5AEVCD66255JJTZHBDY/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-10 Thread Victor Stinner
On Fri, 10 Apr 2020 at 22:00, Antoine Pitrou wrote:
> How do you keep fast type checking such as PyTuple_Check() if extension
> code doesn't have access e.g. to tp_flags?
>
> I notice you did:
> """
> Add fast inlined version _PyType_HasFeature() and _PyType_IS_GC()
> for object.c and typeobject.c.
> """
>
> So you understand there is a need.

By the way, CPython currently uses statically allocated types for
builtin types like str or list. This may have to change to efficiently
run multiple subinterpreters in parallel: each subinterpreter
should have its own heap-allocated type with its own reference
counter.

Using heap allocated types means that PyUnicode_Check() implementation
has to change. It's just another good reason to better hide
PyUnicode_Check() implementation right now ;-)

Victor
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/EIEBID2VQKZN6Z3XSAS6LMVHKCIOKHAI/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-10 Thread Victor Stinner
On Fri, 10 Apr 2020 at 22:00, Antoine Pitrou wrote:
> > Examples of issues to make structures opaque:
> >
> > * ``PyGC_Head``: https://bugs.python.org/issue40241
> > * ``PyObject``: https://bugs.python.org/issue39573
> > * ``PyTypeObject``: https://bugs.python.org/issue40170
>
> How do you keep fast type checking such as PyTuple_Check() if extension
> code doesn't have access e.g. to tp_flags?

Hum. I should clarify that we have the choice of not having any impact
on performance for the regular runtime: only use opaque functions for
the "new" runtime. It's exactly what is already done with
Py_LIMITED_API. Concrete example:

static inline int
PyType_HasFeature(PyTypeObject *type, unsigned long feature) {
#ifdef Py_LIMITED_API
    return ((PyType_GetFlags(type) & feature) != 0);
#else
    return ((type->tp_flags & feature) != 0);
#endif
}

The Py_LIMITED_API goes through PyType_GetFlags() function call,
otherwise PyTypeObject.tp_flags field is accessed directly.

I recently modified this function to:

static inline int
PyType_HasFeature(PyTypeObject *type, unsigned long feature) {
    return ((PyType_GetFlags(type) & feature) != 0);
}

I consider that checking a type is not performance critical and so I
chose to have the same implementation for everyone. If someone sees
that it's a major performance overhead, we can revisit this choice and
reintroduce an #ifdef.
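
For readers who want to see the flag test itself, the same check can be mirrored from Python on CPython, where type.__flags__ exposes tp_flags. This is only an illustrative sketch: type_has_feature and Point are hypothetical names, and the two constants are copied from CPython's Include/object.h rather than taken from this thread.

```python
# Bit values of two Py_TPFLAGS_* constants from CPython's Include/object.h.
TPFLAGS_LIST_SUBCLASS = 1 << 25
TPFLAGS_TUPLE_SUBCLASS = 1 << 26


def type_has_feature(tp, feature):
    """Python-level analogue of PyType_HasFeature().

    type.__flags__ is a CPython-specific attribute exposing tp_flags; other
    implementations are not guaranteed to provide it.
    """
    return (tp.__flags__ & feature) != 0


class Point(tuple):
    """A tuple subclass; it inherits the TUPLE_SUBCLASS flag from tuple."""


is_tuple_like = type_has_feature(Point, TPFLAGS_TUPLE_SUBCLASS)
```

This is the kind of single bit test that PyTuple_Check() performs, which is why it stays cheap whether the flags are read from the struct field or through a function call.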

It's more a practical issue about the maintenance of two flavors of
Python in the same code base. Do you want to have two implementations
of each function? Or is it possible to have a single implementation
for some functions?

I suggest reducing code duplication and accepting a performance
overhead when it's small enough.


> > O(1) bytearray to bytes conversion
> > ..................................
> >
> > Convert bytearray to bytes without memory copy.
> > (...)
>
> If that's desirable (I'm not sure it is), (...)

Hum, maybe I should clarify the whole "New optimized CPython runtime"
section. The listed optimizations are not optimizations that must be
implemented. Rather, they are examples of optimizations which become
possible to implement, or at least easier to implement, once the C API
is fixed.

I'm not sure that "bytearray to bytes conversion" is a performance
bottleneck. It's just that such an optimization is easier to explain than
other more complex optimizations ;-)

The intent of this PEP is not to design a faster CPython, but to show
that reworking the C API makes it possible to implement such a faster
CPython.


> > Fork and "Copy-on-Read" problem
> > ...............................
> >
> > Solve the "Copy on read" problem with fork: store reference counter
> > outside ``PyObject``.
>
> Nowadays it is strongly recommended to use multiprocessing with the
> "forkserver" start method:
> https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods

I understood that the Instagram workload is to load heavy data only
once, and fork later. I'm not sure that forkserver fits such a
workload.


> > One solution for that would be to store reference counters outside
> > ``PyObject``. For example, in a separated hash table (pointer to
> > reference counter). Changing ``PyObject`` structures requires that C
> > extensions don't access them directly.
>
> You're planning to introduce a large overhead for each reference
> count lookup just to satisfy a rather niche use case?  CPython
> probably does millions of reference counts per second.

Sorry, again, I'm not proposing to move ob_refcnt outside PyObject for
everyone. The intent is to show that it becomes possible to do if you
have a very specific use case where it would be more efficient.
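
To picture the "separate hash table" idea discussed above, here is a toy model in Python. It is not how CPython works and not a proposed implementation: the class name ExternalRefcounts is hypothetical, a dict stands in for the hash table, and plain integers stand in for object addresses.

```python
class ExternalRefcounts:
    """Toy model: reference counts kept in a table keyed by object address.

    The object memory itself is never written on incref/decref, which is
    the property that would avoid the copy-on-read problem after fork().
    """

    def __init__(self):
        self._counts = {}  # maps address -> reference count

    def incref(self, address):
        self._counts[address] = self._counts.get(address, 0) + 1

    def decref(self, address):
        """Drop one reference; return how many remain (0 means "free it")."""
        remaining = self._counts[address] - 1
        if remaining == 0:
            del self._counts[address]
        else:
            self._counts[address] = remaining
        return remaining

    def count(self, address):
        return self._counts.get(address, 0)
```

The extra dictionary lookup on every incref and decref is exactly the overhead raised in the quoted objection, which is why the idea is presented only as an option for a very specific workload.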

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/JPDCY4BVKZ2USN5MWFWUQ2JVBQWEAYGM/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-10 Thread Antoine Pitrou
On Fri, 10 Apr 2020 23:33:28 +0100
Steve Dower  wrote:
> On 10Apr2020 2055, Antoine Pitrou wrote:
> > On Fri, 10 Apr 2020 19:20:00 +0200
> > Victor Stinner  wrote:  
> >>
> >> Note: Cython and cffi should be preferred to write new C extensions.
> >> This PEP is about existing C extensions which cannot be rewritten with
> >> Cython.  
> > 
> > Using Cython does not make the C API irrelevant.  In some
> > applications, the C API has to be low-level enough for performance.
> > Whether the application is written in Cython or not.  
> 
> It does to the code author.
> 
> The point here is that we want authors who insist on coding against the 
> C API to be aware that they have fewer compatibility guarantees [...]

Yeah, you missed the point of my comment here.  Cython *does* call into
the C API, and it's quite insistent on performance optimizations too.
Saying "just use Cython" doesn't make the C API unimportant - it just
hides it from your own sight.

> - maybe 
> even to the point of needing to rebuild for each minor version if you 
> want to insist on using macros (i.e. anything starting with "_Py").

If there's still a way for C extensions to get at now-private APIs,
then the PEP fails to convey that, IMHO.

> >> **Backward compatibility:** backward incompatible on purpose. Break the
> >> limited C API and the stable ABI, with the assumption that `Most C
> >> extensions don't rely directly on CPython internals`_ and so will remain
> >> compatible.  
> > 
> > The problem here is not only compatibility but potential performance
> > regressions in C extensions.  
> 
> I don't think we've ever guaranteed performance between releases. 
> Correctness, sure, but not performance.

That's a rather weird argument.  Just because you don't guarantee
performance doesn't mean it's ok to introduce performance regressions.

It's especially a weird argument to make when discussing a PEP where
most of the arguments are distant promises of improved performance.

> >> Fork and "Copy-on-Read" problem
> >> ...............................
> >>
> >> Solve the "Copy on read" problem with fork: store reference counter
> >> outside ``PyObject``.  
> > 
> > Nowadays it is strongly recommended to use multiprocessing with the
> > "forkserver" start method:
> > https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
> > 
> > With "forkserver", the forked process is extremely lightweight and
> > there are little savings to be made in the child.  
> 
> Unfortunately, a recommendation that only applies to a minority of 
> Python users. Oh well.

Which "minority" are you talking about?  Neither of us has numbers, but
I'm quite sure that the population of Python users calling into
multiprocessing (or a third-party library relying on multiprocessing,
such as Dask) is much larger than the population of Python users
calling fork() directly and relying on copy-on-write for optimization
purposes.

But if you have a different experience to share, please do so.

> Separating refcounts theoretically improves cache locality, specifically 
> the case where cache invalidation impacts multiple CPUs (and even the 
> case where a single thread moves between CPUs).

I'm a bit curious why it would improve, rather than degrade, cache
locality. If you take the typical example of the eval loop, an object
is incref'ed and decref'ed just about the same time that it gets used.

I'll also note that the PEP proposes to remove APIs which return
borrowed references... which increases the number of cases where
accessing an object implies updating its refcount.

Therefore I'm unconvinced that stashing refcounts in a separate memory
area would provide any CPU efficiency benefit.
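To make the point about refcount traffic concrete, here is a minimal sketch that can be run from Python. It only shows that taking a new reference to an object writes to that object's reference count; it says nothing about cache behaviour. The function name is hypothetical, and sys.getrefcount() is specific to CPython, so other implementations may report something else.

```python
import sys


def refcount_delta_on_new_reference():
    """Return how far an object's refcount moves when one reference is added."""
    obj = object()
    before = sys.getrefcount(obj)
    holder = [obj]  # the list now owns one extra reference to obj
    after = sys.getrefcount(obj)
    del holder  # drop the extra reference again
    return after - before


print(refcount_delta_on_new_reference())
```

An API that hands out a new (strong) reference where it used to hand out a borrowed one performs this kind of update on every call.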

> >> Debug runtime and remove debug checks in release mode
> >> .....................................................
> >>
> >> If the C extensions are no longer tied to CPython internals, it becomes
> >> possible to switch to a Python runtime built in debug mode to enable
> >> runtime debug checks to ease debugging C extensions.  
> > 
> > That's the one convincing feature in this PEP, as far as I'm concerned.  
> 
> Eh, this assumes that someone is fully capable of rebuilding CPython and 
> their own extension, but not one of their dependencies, [...]

You don't need to rebuild CPython if someone provides a binary debug
build (which would probably happen if such a build were compatible with
regular packages).  You also don't need to rebuild your own extension
to take advantage of the interpreter's internal correctness checks, if
the interpreter's ABI hasn't changed.

This is the whole point: being able to load an unmodified extension
(and unmodified dependencies) on a debug-checks-enabled interpreter.
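As a small aid for anyone trying this, the sketch below checks from Python whether the running interpreter looks like a debug build. It is a best-effort heuristic, not an official API: the function name is hypothetical, and the Py_DEBUG configuration variable is assumed to be recorded only on builds that expose their configure-time settings.

```python
import sys
import sysconfig


def is_debug_build():
    """Best-effort check for a CPython interpreter built in debug mode."""
    # sys.gettotalrefcount() only exists when reference debugging is
    # compiled in, which debug builds enable.
    if hasattr(sys, "gettotalrefcount"):
        return True
    # Py_DEBUG may be absent (None) on some platforms, hence bool().
    return bool(sysconfig.get_config_var("Py_DEBUG"))


print(is_debug_build())
```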

> and this code 
> that they're using doesn't have any system dependencies that differ in 
> debug builds (spoiler: they do).

Are you talking about Windows?  On non-Windows systems, I don't think
there are "system dependencies that differ in debug builds".

Regards


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-10 Thread Steve Dower

On 10Apr2020 2055, Antoine Pitrou wrote:

> On Fri, 10 Apr 2020 19:20:00 +0200
> Victor Stinner  wrote:
>> Note: Cython and cffi should be preferred to write new C extensions.
>> This PEP is about existing C extensions which cannot be rewritten with
>> Cython.
>
> Using Cython does not make the C API irrelevant.  In some
> applications, the C API has to be low-level enough for performance.
> Whether the application is written in Cython or not.


It does to the code author.

The point here is that we want authors who insist on coding against the 
C API to be aware that they have fewer compatibility guarantees - maybe 
even to the point of needing to rebuild for each minor version if you 
want to insist on using macros (i.e. anything starting with "_Py").



>> Examples of issues to make structures opaque:
>>
>> * ``PyGC_Head``: https://bugs.python.org/issue40241
>> * ``PyObject``: https://bugs.python.org/issue39573
>> * ``PyTypeObject``: https://bugs.python.org/issue40170
>
> How do you keep fast type checking such as PyTuple_Check() if extension
> code doesn't have access e.g. to tp_flags?


Measured in isolation, sure. But what task are you doing that is being 
held up by builtin type checks?


If the type check is the bottleneck, you need to work on more 
interesting algorithms ;)



> I notice you did:
> """
> Add fast inlined version _PyType_HasFeature() and _PyType_IS_GC()
> for object.c and typeobject.c.
> """
>
> So you understand there is a need.


These are private APIs.


>> **Backward compatibility:** backward incompatible on purpose. Break the
>> limited C API and the stable ABI, with the assumption that `Most C
>> extensions don't rely directly on CPython internals`_ and so will remain
>> compatible.
>
> The problem here is not only compatibility but potential performance
> regressions in C extensions.


I don't think we've ever guaranteed performance between releases. 
Correctness, sure, but not performance.



>> New optimized CPython runtime
>> =============================
>>
>> Backward incompatible changes is such a pain for the whole Python
>> community. To ease the migration (accelerate adoption of the new C
>> API), one option is to provide not only one but two CPython runtimes:
>>
>> * Regular CPython: fully backward compatible, support direct access to
>>   structures like ``PyObject``, etc.
>> * New optimized CPython: incompatible, cannot import C extensions which
>>   don't use the limited C API, has new optimizations, limited to the C
>>   API.
>
> Well, this sounds like a distribution nightmare.  Some packages will
> only be available for one runtime and not the other.  It will confuse
> non-expert users.


Agreed (except that it will also confuse expert users). Doing "Python 
4"-by-stealth like this is a terrible idea.


If it's incompatible, give it a new version number. If you don't want a 
new version number, maintain compatibility. There are no alternatives.



>> O(1) bytearray to bytes conversion
>> ..................................
>>
>> Convert bytearray to bytes without memory copy.
>>
>> Currently, bytearray is used to build a bytes string, but it's usually
>> converted into a bytes object to respect an API. This conversion
>> requires to allocate a new memory block and copy data (O(n) complexity).
>>
>> It is possible to implement O(1) conversion if it would be possible to
>> pass the ownership of the bytearray object to bytes.
>>
>> That requires modifying the ``PyBytesObject`` structure to support
>> multiple storages (support storing content into a separate memory
>> block).
>
> If that's desirable (I'm not sure it is), there is a simpler solution:
> instead of allocating a raw memory area, bytearray could allocate... a
> private bytes object that you can detach without copying it.


Yeah, I don't see the point in this one, unless you mean a purely 
internal change. Is this a major bottleneck?


Having a broader concept of "freezable" objects may be a valuable thing 
to enable in a new runtime, but retrofitting it to CPython doesn't seem 
likely to have a big impact.



>> Fork and "Copy-on-Read" problem
>> ...............................
>>
>> Solve the "Copy on read" problem with fork: store reference counter
>> outside ``PyObject``.
>
> Nowadays it is strongly recommended to use multiprocessing with the
> "forkserver" start method:
> https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
>
> With "forkserver", the forked process is extremely lightweight and
> there are little savings to be made in the child.


Unfortunately, a recommendation that only applies to a minority of 
Python users. Oh well.


Separating refcounts theoretically improves cache locality, specifically 
the case where cache invalidation impacts multiple CPUs (and even the 
case where a single thread moves between CPUs). But I don't think 
there's been a convincing real benchmark of this yet.



>> Debug runtime and remove debug checks in release mode
>> .....................................................
>>
>> If the C extensions are no longer tied to CPython internals, it becomes
>> possible to switch to a Python runtime built in debug mode 

[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-10 Thread Antoine Pitrou
On Fri, 10 Apr 2020 19:20:00 +0200
Victor Stinner  wrote:
> 
> Note: Cython and cffi should be preferred to write new C extensions.
> This PEP is about existing C extensions which cannot be rewritten with
> Cython.

Using Cython does not make the C API irrelevant.  In some
applications, the C API has to be low-level enough for performance.
Whether the application is written in Cython or not.

> **Status:** not started. The performance overhead must be measured with
> benchmarks and this PEP should be accepted.

Surely you mean "before this PEP should be accepted"?

> Examples of issues to make structures opaque:
> 
> * ``PyGC_Head``: https://bugs.python.org/issue40241
> * ``PyObject``: https://bugs.python.org/issue39573
> * ``PyTypeObject``: https://bugs.python.org/issue40170

How do you keep fast type checking such as PyTuple_Check() if extension
code doesn't have access e.g. to tp_flags?

I notice you did:
"""
Add fast inlined version _PyType_HasFeature() and _PyType_IS_GC()
for object.c and typeobject.c.
"""

So you understand there is a need.
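
For readers unfamiliar with the mechanism: PyTuple_Check() is cheap today because it tests a single bit in the type's tp_flags. CPython exposes the same flags to Python code as type.__flags__, so the check can be sketched without writing C. The constant below mirrors Py_TPFLAGS_TUPLE_SUBCLASS from CPython's Include/object.h; the helper and class names are hypothetical, and all of this is a CPython implementation detail.

```python
# Mirrors Py_TPFLAGS_TUPLE_SUBCLASS (1UL << 26) from CPython's object.h.
TPFLAGS_TUPLE_SUBCLASS = 1 << 26


def is_tuple_by_flag(obj):
    """Mimic PyTuple_Check(): test one bit of the object's type flags."""
    return bool(type(obj).__flags__ & TPFLAGS_TUPLE_SUBCLASS)


class Point(tuple):
    """A tuple subclass; CPython sets the flag on subclasses as well."""


print(is_tuple_by_flag((1, 2)))   # plain tuple
print(is_tuple_by_flag(Point()))  # tuple subclass
print(is_tuple_by_flag([1, 2]))   # list
```

If tp_flags is hidden from extension code, the macro form of this test has to become a function call, which is the cost being asked about.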

> **Backward compatibility:** backward incompatible on purpose. Break the
> limited C API and the stable ABI, with the assumption that `Most C
> extensions don't rely directly on CPython internals`_ and so will remain
> compatible.

The problem here is not only compatibility but potential performance
regressions in C extensions.

> New optimized CPython runtime
> =============================
> 
> Backward incompatible changes is such a pain for the whole Python
> community. To ease the migration (accelerate adoption of the new C
> API), one option is to provide not only one but two CPython runtimes:
> 
> * Regular CPython: fully backward compatible, support direct access to
>   structures like ``PyObject``, etc.
> * New optimized CPython: incompatible, cannot import C extensions which
>   don't use the limited C API, has new optimizations, limited to the C
>   API.

Well, this sounds like a distribution nightmare.  Some packages will
only be available for one runtime and not the other.  It will confuse
non-expert users.

> O(1) bytearray to bytes conversion
> ..................................
> 
> Convert bytearray to bytes without memory copy.
> 
> Currently, bytearray is used to build a bytes string, but it's usually
> converted into a bytes object to respect an API. This conversion
> requires to allocate a new memory block and copy data (O(n) complexity).
> 
> It is possible to implement O(1) conversion if it would be possible to
> pass the ownership of the bytearray object to bytes.
> 
> That requires modifying the ``PyBytesObject`` structure to support
> multiple storages (support storing content into a separate memory
> block).

If that's desirable (I'm not sure it is), there is a simpler solution:
instead of allocating a raw memory area, bytearray could allocate... a
private bytes object that you can detach without copying it.

But really, this is why we have BytesIO.  Which already uses that exact
strategy: allocate a private bytes object.
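
A minimal sketch of the two ways of building a bytes object that are being compared. Both functions are hypothetical helpers that return the same result; whether getvalue() avoids a copy is an internal detail of BytesIO, taken here from the description above rather than guaranteed by the documentation.

```python
import io


def join_with_bytearray(chunks):
    """Accumulate into a bytearray, then convert; bytes() copies the data."""
    buf = bytearray()
    for chunk in chunks:
        buf += chunk
    return bytes(buf)


def join_with_bytesio(chunks):
    """Accumulate into a BytesIO and ask it for the final bytes object."""
    stream = io.BytesIO()
    for chunk in chunks:
        stream.write(chunk)
    return stream.getvalue()


parts = [b"GET ", b"/index.html ", b"HTTP/1.1\r\n"]
print(join_with_bytearray(parts) == join_with_bytesio(parts))
```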

> Fork and "Copy-on-Read" problem
> ...............................
> 
> Solve the "Copy on read" problem with fork: store reference counter
> outside ``PyObject``.

Nowadays it is strongly recommended to use multiprocessing with the
"forkserver" start method:
https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods

With "forkserver", the forked process is extremely lightweight and
there are little savings to be made in the child.
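
For reference, selecting that start method looks roughly like the sketch below. The helper names are hypothetical, and the fallback to "spawn" is an assumption made for platforms that do not offer "forkserver" (Windows, for instance).

```python
import multiprocessing


def pick_context():
    """Return a "forkserver" context where the platform offers one."""
    methods = multiprocessing.get_all_start_methods()
    name = "forkserver" if "forkserver" in methods else "spawn"
    return multiprocessing.get_context(name)


def run_demo():
    """Map a picklable builtin over a small pool created from the context."""
    ctx = pick_context()
    with ctx.Pool(processes=2) as pool:
        return pool.map(abs, [-1, -2, 3])
```

run_demo() should be called from under an `if __name__ == "__main__":` guard, since worker processes started this way may import the main module again.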

> `Dismissing Python Garbage Collection at Instagram
> `_
> (Jan 2017) by Instagram Engineering.
> 
> Instagram contributed `gc.freeze()
> `_ to Python 3.7
> which works around the issue.
> 
> One solution for that would be to store reference counters outside
> ``PyObject``. For example, in a separated hash table (pointer to
> reference counter). Changing ``PyObject`` structures requires that C
> extensions don't access them directly.

You're planning to introduce a large overhead for each reference
count lookup just to satisfy a rather niche use case?  CPython
probably performs millions of reference count updates per second.
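
For context, the gc.freeze() workaround mentioned in the quoted text is used roughly as sketched below, following the pattern described in the gc module documentation (disable the collector early in the parent, freeze right before forking). The function names are hypothetical and the fork itself is left out. Note that this only keeps the collector from touching the frozen objects; reference count updates still write to them.

```python
import gc


def freeze_before_fork():
    """Prepare the parent process just before it would call os.fork()."""
    gc.disable()  # no collections in the parent while it sets things up
    gc.freeze()   # move all tracked objects to the permanent generation
    return gc.get_freeze_count()


def undo_freeze():
    """Put the collector back into its normal state."""
    gc.unfreeze()
    gc.enable()


frozen = freeze_before_fork()
print(frozen > 0)
undo_freeze()
```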

> Debug runtime and remove debug checks in release mode
> .....................................................
> 
> If the C extensions are no longer tied to CPython internals, it becomes
> possible to switch to a Python runtime built in debug mode to enable
> runtime debug checks to ease debugging C extensions.

That's the one convincing feature in this PEP, as far as I'm concerned.

Regards

Antoine.

Message archived at