Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-21 Thread Victor Stinner
On Wed, Nov 21, 2018 at 12:11, Antoine Pitrou wrote:
> You mean the same API can compile to two different things depending on
> a configuration?

Yes, my current plan is to keep #include <Python.h> but have an opt-in
define to switch to the new C API.

> I expect it to be error-prone.  For example, let's suppose I want to
> compile in a given mode, but I also use Numpy's C API.  Will the
> compile mode "leak" to Numpy as well?

For example, if we continue to use Py_LIMITED_API: I don't think that
Numpy currently uses #ifdef Py_LIMITED_API, nor plans to do that.

If we add a new define (ex: my current proof-of-concept uses
Py_NEWCAPI), we can make sure that it's not already used by Numpy :-)

>  What if a third-party header
> includes "Python.h" before I do the "#define" that's necessary?

IMHO the define should be added by distutils directly, using -D in the
compiler flags.

I wouldn't suggest:

#define Py_NEWCAPI
#include <Python.h>

But the two APIs should diverge, so your C extension should also use a
define to decide whether to use the old or the new API. So something will
break if you mess up the compilation :-)
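
A minimal sketch of how an extension might branch on such a define (assuming
the build system passes -DPy_NEWCAPI, the name used by my proof of concept;
the final name may differ):

#include <Python.h>

#ifdef Py_NEWCAPI
/* built against the new API: stick to generic, implementation-neutral calls */
static Py_ssize_t
fast_len(PyObject *obj)
{
    return PyObject_Length(obj);
}
#else
/* built against the current API: CPython-specific fast paths are allowed */
static Py_ssize_t
fast_len(PyObject *obj)
{
    if (PyList_CheckExact(obj))
        return PyList_GET_SIZE(obj);   /* direct access to the list struct */
    return PyObject_Length(obj);
}
#endif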

Victor


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-21 Thread Antoine Pitrou
On Tue, 20 Nov 2018 23:17:05 +0100
Victor Stinner  wrote:
> On Tue, Nov 20, 2018 at 23:08, Stefan Krah wrote:
> > Intuitively, it should probably not be part of a limited API, but I never
> > quite understood the purpose of this API, because I regularly need any
> > function that I can get my hands on.
> > (...)
> > Reading typed strings directly into an array with minimal overhead.  
> 
> IMHO performance and hiding implementation details are exclusive. You
> should either use the C API with impl. details for best performances,
> or use a "limited" C API for best compatibility.
> 
> Since I would like to not touch the C API with impl. details, you can
> imagine to have two compilation modes: one for best performances on
> CPython, one for best compatibility (ex: compatible with PyPy). I'm
> not sure how the "compilation mode" will be selected.

You mean the same API can compile to two different things depending on
a configuration?

I expect it to be error-prone.  For example, let's suppose I want to
compile in a given mode, but I also use Numpy's C API.  Will the
compile mode "leak" to Numpy as well?  What if a third-party header
includes "Python.h" before I do the "#define" that's necessary?

Regards

Antoine.




Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-20 Thread Glenn Linderman

On 11/20/2018 10:33 PM, Nathaniel Smith wrote:

On Tue, Nov 20, 2018 at 6:05 PM Glenn Linderman  wrote:

On 11/20/2018 2:17 PM, Victor Stinner wrote:

IMHO performance and hiding implementation details are exclusive. You
should either use the C API with impl. details for best performances,
or use a "limited" C API for best compatibility.

The "limited" C API concept would seem to be quite sufficient for extensions 
that want to extend Python functionality to include new system calls, etc. (pywin32, 
pyMIDI, pySide, etc.) whereas the numpy and decimal might want best performance.

To make things more complicated: numpy and decimal are in a category
of modules where if you want them to perform well on JIT-based VMs,
then there's no possible C API that can achieve that. To get the
benefits of a JIT on code using numpy or decimal, the JIT has to be
able to see into their internals to do inlining etc., which means they
can't be written in C at all [1], at which point the C API becomes
irrelevant.

It's not clear to me how this affects any of the discussion in
CPython, since supporting JITs might not be part of the goal of a new
C API, and I'm not sure how many packages fall between the
numpy/decimal side and the pure-ffi side.

-n

[1] Well, there's also the option of teaching your Python JIT to
handle LLVM bitcode as a source language, which is the approach that
Graal is experimenting with. It seems completely wacky to me to hope
you could write a C API emulation layer like PyPy's cpyext, and
compile that + C extension code to LLVM bitcode, translate the LLVM
bitcode to JVM bytecode, inline the whole mess into your Python JIT,
and then fold everything away to produce something reasonable. But I
could be wrong, and Oracle is throwing a lot of money at Graal so I
guess we'll find out.

Interesting, thanks for the introduction to wacky. I was quite content 
with the idea that numpy, and other modules that would choose to use the 
unlimited API, would be sacrificing portability to non-CPython 
implementations... except by providing a Python equivalent (decimal, and 
some others do that, IIRC).


Regarding JIT in general, though, it would seem that "precompiled" 
extensions like numpy would not need to be re-compiled by the JIT.


But if it does, then the JIT had better understand/support C syntax, and JVM
JITs probably don't, so that leads to the scenario you describe.


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-20 Thread Nathaniel Smith
On Tue, Nov 20, 2018 at 6:05 PM Glenn Linderman  wrote:
>
> On 11/20/2018 2:17 PM, Victor Stinner wrote:
>> IMHO performance and hiding implementation details are exclusive. You
>> should either use the C API with impl. details for best performances,
>> or use a "limited" C API for best compatibility.
>
> The "limited" C API concept would seem to be quite sufficient for extensions 
> that want to extend Python functionality to include new system calls, etc. 
> (pywin32, pyMIDI, pySide, etc.) whereas the numpy and decimal might want best 
> performance.

To make things more complicated: numpy and decimal are in a category
of modules where if you want them to perform well on JIT-based VMs,
then there's no possible C API that can achieve that. To get the
benefits of a JIT on code using numpy or decimal, the JIT has to be
able to see into their internals to do inlining etc., which means they
can't be written in C at all [1], at which point the C API becomes
irrelevant.

It's not clear to me how this affects any of the discussion in
CPython, since supporting JITs might not be part of the goal of a new
C API, and I'm not sure how many packages fall between the
numpy/decimal side and the pure-ffi side.

-n

[1] Well, there's also the option of teaching your Python JIT to
handle LLVM bitcode as a source language, which is the approach that
Graal is experimenting with. It seems completely wacky to me to hope
you could write a C API emulation layer like PyPy's cpyext, and
compile that + C extension code to LLVM bitcode, translate the LLVM
bitcode to JVM bytecode, inline the whole mess into your Python JIT,
and then fold everything away to produce something reasonable. But I
could be wrong, and Oracle is throwing a lot of money at Graal so I
guess we'll find out.

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-20 Thread Glenn Linderman

On 11/20/2018 2:17 PM, Victor Stinner wrote:

On Tue, Nov 20, 2018 at 23:08, Stefan Krah wrote:

Intuitively, it should probably not be part of a limited API, but I never
quite understood the purpose of this API, because I regularly need any
function that I can get my hands on.
(...)
Reading typed strings directly into an array with minimal overhead.

IMHO performance and hiding implementation details are exclusive. You
should either use the C API with impl. details for best performances,
or use a "limited" C API for best compatibility.


The "limited" C API concept would seem to be quite sufficient for 
extensions that want to extend Python functionality to include new 
system calls, etc. (pywin32, pyMIDI, pySide, etc.) whereas the numpy and 
decimal might want best performance.



Since I would like to not touch the C API with impl. details, you can
imagine to have two compilation modes: one for best performances on
CPython, one for best compatibility (ex: compatible with PyPy). I'm
not sure how the "compilation mode" will be selected.


The nicest interface from a compilation point of view would be to have 
two #include files: One to import the limited API, and one to import the 
performance API. Importing both should be allowed and should work.


If you import the performance API, you have to learn more, and be more 
careful.


Of course, there might be appropriate subsets of each API, having 
multiple include files, to avoid including everything, but that is a 
refinement.




Victor


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-20 Thread Victor Stinner
On Tue, Nov 20, 2018 at 23:08, Stefan Krah wrote:
> Intuitively, it should probably not be part of a limited API, but I never
> quite understood the purpose of this API, because I regularly need any
> function that I can get my hands on.
> (...)
> Reading typed strings directly into an array with minimal overhead.

IMHO performance and hiding implementation details are exclusive. You
should either use the C API with impl. details for best performances,
or use a "limited" C API for best compatibility.

Since I would like to not touch the C API with impl. details, you can
imagine to have two compilation modes: one for best performances on
CPython, one for best compatibility (ex: compatible with PyPy). I'm
not sure how the "compilation mode" will be selected.

Victor


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-20 Thread Stefan Krah


On Mon, Nov 19, 2018 at 04:08:07PM +0100, Victor Stinner wrote:
> On Mon, Nov 19, 2018 at 13:18, Stefan Krah wrote:
> > In practice people desperately *have* to use whatever is there, including
> > functions with underscores that are not even officially in the C-API.
> >
> > I have to use _PyFloat_Pack* in order to be compatible with CPython,
> 
> Oh, I never used this function. These functions are private (name
> prefixed by "_") and excluded from the limited API.
> 
> For me, the limited API should be functions available on all Python
> implementations. Does it make sense to provide PyFloat_Pack4() in
> MicroPython, Jython, IronPython and PyPy? Or is it something more
> specific to CPython? I don't know the answer. If yes, open an issue to
> propose to make this function public?

It depends on what the goal is: If PyPy wants to be able to use as many
C extensions as possible, then yes.

The function is just one example of what people have to use to be 100%
compatible with CPython (or copy these functions and maintain them ...).


Intuitively, it should probably not be part of a limited API, but I never
quite understood the purpose of this API, because I regularly need any
function that I can get my hands on.


> > I need PyUnicode_KIND()
> 
> IMHO this one should not be part of the public API. The only usage
> would be to micro-optimize, but such API is very specific to one
> Python implementation. For example, PyPy doesn't use "compact string"
> but UTF-8 internally. If you use PyUnicode_KIND(), your code becomes
> incompatible with PyPy.
> 
> What is your use case?

Reading typed strings directly into an array with minimal overhead.
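
For concreteness, a sketch of the kind of fast path this enables (PEP 393
macros; the helper is illustrative, and exactly the sort of code that only
works on CPython):

#include <Python.h>
#include <string.h>

/* Copy the first n code points of an ASCII/latin-1 str into a C buffer
 * without any per-character API calls. */
static int
copy_1byte_chars(PyObject *s, char *out, Py_ssize_t n)
{
    if (PyUnicode_READY(s) < 0)
        return -1;                       /* ensure the canonical representation */
    if (PyUnicode_KIND(s) != PyUnicode_1BYTE_KIND || PyUnicode_GET_LENGTH(s) < n)
        return -1;                       /* not a compact 1-byte string */
    memcpy(out, PyUnicode_1BYTE_DATA(s), (size_t)n);
    return 0;
}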


> I would prefer to expose the "_PyUnicodeWriter" API than PyUnicode_KIND().
> 
> > need PyUnicode_AsUTF8AndSize(),
> 
> Again, that's a micro-optimization and it's very specific to CPython:
> result cached in the "immutable" str object. I don't want to put it in
> a public API. PyUnicode_AsUTF8String() is better since it doesn't
> require an internal cache.
> 
> > I *wish* there were PyUnicode_AsAsciiAndSize().
> 
> PyUnicode_AsASCIIString() looks good to me. Sadly, it doesn't return
> the length, but usually the length is not needed.

Yes, these are all just examples.  It's also very useful to be able to
do PyLong_Type.tp_as_number->nb_multiply or grab as_integer_ratio from
the float PyMethodDef.

The latter two cases are for speed reasons but also because sometimes
you *don't* want a method from a subclass (Serhiy was very good in
finding corner cases :-).
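
A sketch of the first of those tricks: calling int's nb_multiply slot
directly, so a subclass's __mul__ is never consulted:

#include <Python.h>

/* Multiply two int objects through the type's number slots. */
static PyObject *
exact_int_multiply(PyObject *a, PyObject *b)
{
    /* returns a new reference, or NotImplemented/NULL like any binaryfunc */
    return PyLong_Type.tp_as_number->nb_multiply(a, b);
}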


Most C modules that I've seen have some internals. Psycopg2:

PyDateTime_DELTA_GET_MICROSECONDS
PyDateTime_DELTA_GET_DAYS
PyDateTime_DELTA_GET_SECONDS
PyList_GET_ITEM
Bytes_GET_SIZE
Py_BEGIN_ALLOW_THREADS
Py_END_ALLOW_THREADS

floatobject.h and longintrepr.h are also popular.


Stefan Krah





Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-20 Thread Nathaniel Smith
On Tue, Nov 20, 2018 at 1:34 AM Petr Viktorin  wrote:
>
> On 11/19/18 12:14 PM, Victor Stinner wrote:
> > To design a new C API, I see 3 options:
> >
> > (1) add more functions to the existing Py_LIMITED_API
> > (2) "fork" the current public C API: remove functions and hide as much
> > implementation details as possible
> > (3) write a new C API from scratch, based on the current C API.
> > Something like #define newcapi_Object_GetItem PyObject_GetItem"?
> > Sorry, but "#undef " doesn't work. Only very few
> > functions are defined using "#define ...".
> >
> > I dislike (1) because it's too far from what is currently used in
> > practice. Moreover, I failed to find anyone who can explain to me how the
> > C API is used in the wild, which functions are important or not, what
> > is the C API, etc.
>
> One big, complex project that now uses the limited API is PySide. They
> do some workarounds, but the limited API works. Here's a writeup of the
> troubles they have with it:
> https://github.com/pyside/pyside2-setup/blob/5.11/sources/shiboken2/libshiboken/pep384impl_doc.rst

AFAIK the only two projects that use the limited API are
PySide-generated modules and cffi-generated modules. I guess if there
is some cleanup needed to remove stuff that snuck into the limited
API, then that will be fine as long as you make sure they aren't used
by either of those two projects.

For the regular C API, I guess the PyPy folks, and especially Matti
Picus, probably know more than anyone else about what parts are
actually used in the wild, since they've spent way more time digging
into real projects. (Do you want to know about the exact conditions in
which real projects rely on being able to skip calling PyType_Ready on
a statically allocated PyTypeObject? Matti knows...)

> I hope the new C API will be improvements (and clarifications) of the
> stable ABI, rather than a completely new thing.
> My ideal would be that Python 4.0 would keep the same API (with
> questionable things emulated & deprecated), but break *ABI*. The "new C
> API" would become that new stable ABI -- and this time it'd be something
> we'd really want to support, without reservations.

We already break ABI with every feature release – at least for the
main ABI. The limited ABI supposedly doesn't, but probably does, and
as noted above it has such limited use that it's probably still
possible to fix any stuff that's leaked out accidentally.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-20 Thread Petr Viktorin

On 11/19/18 12:14 PM, Victor Stinner wrote:

To design a new C API, I see 3 options:

(1) add more functions to the existing Py_LIMITED_API
(2) "fork" the current public C API: remove functions and hide as much
implementation details as possible
(3) write a new C API from scratch, based on the current C API.
Something like #define newcapi_Object_GetItem PyObject_GetItem"?
Sorry, but "#undef " doesn't work. Only very few
functions are defined using "#define ...".

I dislike (1) because it's too far from what is currently used in
practice. Moreover, I failed to find anyone who can explain to me how the
C API is used in the wild, which functions are important or not, what
is the C API, etc.


One big, complex project that now uses the limited API is PySide. They 
do some workarounds, but the limited API works. Here's a writeup of the 
troubles they have with it: 
https://github.com/pyside/pyside2-setup/blob/5.11/sources/shiboken2/libshiboken/pep384impl_doc.rst



I propose (2). We control how much changes we do at each milestone,
and we start from the maximum compatibility with current C API. Each
change can be discussed and experimented to define what is the C API,
what we want, etc. I'm working on this approach for 1 year, that's why
many discussions popped up around specific changes :-)


I hope the new C API will be improvements (and clarifications) of the 
stable ABI, rather than a completely new thing.
My ideal would be that Python 4.0 would keep the same API (with 
questionable things emulated & deprecated), but break *ABI*. The "new C 
API" would become that new stable ABI -- and this time it'd be something 
we'd really want to support, without reservations.


One thing that did not work with the stable ABI was that it's "opt-out"; 
I think we can agree that a new one must be "opt-in" from the start.
I'd also like the "new API" to be a *strict subset* of the stable ABI: 
if a new function needs to be added, it should be added to both.



Some people recently proposed (3) on python-dev. I dislike this option
because it starts by breaking the backward compatibility. It looks
like (1), but worse. The goal and the implementation are unclear to
me.

--

Replacing PyDict_GetItem() (specialized call) with PyObject_GetItem()
(generic API) is not part of my short term plan. I wrote it in the
roadmap, but as I wrote before, each change should be discussed,
experimented, benchmarked, etc.

Victor
On Mon, Nov 19, 2018 at 12:02, M.-A. Lemburg wrote:


On 19.11.2018 11:53, Antoine Pitrou wrote:

On Mon, 19 Nov 2018 11:28:46 +0100
Victor Stinner  wrote:

Python internals rely on internals to implement further optimizations,
than modifying an "immutable" tuple, bytes or str object, because you
can do that at the C level. But I'm not sure that I would like 3rd
party extensions to rely on such things.


I'm not even talking about *modifying* tuples or str objects, I'm
talking about *accessing* their value without going through an abstract
API that does slot lookups, indirect function calls and object unboxing.

For example, people may need a fast way to access the UTF-8
representation of a unicode object.  Without making indirect function
calls, and ideally without making a copy of the data either.  How do
you do that using the generic C API?


Something else you need to consider is creating instances of
types, e.g. a tuple. In C you will have to be able to put
values into the data structure before it is passed outside
the function in order to build the tuple.

If you remove this possibility, you have to copy data all the
time, losing the advantages of having a rich C API.
  --
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Nov 19 2018)

Python Projects, Coaching and Consulting ...  http://www.egenix.com/
Python Database Interfaces ...   http://products.egenix.com/
Plone/Zope Database Interfaces ...   http://zope.egenix.com/



::: We implement business ideas - efficiently in both time and costs :::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/
   http://www.malemburg.com/


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-19 Thread Brett Cannon
On Mon., Nov. 19, 2018, 14:04 Neil Schemenauer wrote:
> On 2018-11-19, Antoine Pitrou wrote:
> > There are important use cases for the C API where it is desired to have
> > fast type-specific access to Python objects such as tuples, ints,
> > strings, etc.  This is relied upon by modules such as _json and _pickle,
> > and third-party extensions as well.
>
> Thank you for pointing this out.  The feedback from Stefan on what
> Cython would like (e.g. more access to functions that are currently
> "internal") is useful too.  Keeping our dreams tied to reality
> is important. ;-P
>
> It seems to me that we can't "have our cake and eat it too". I.e. on
> the one hand hide CPython implementation internals but on the other
> hand allow extensions that want to take advantage of those internals
> to provide the best performance.
>

No, but those are different APIs as well. E.g. no one is saying CPython has
to do away with any of its API. What I and some others have said is the
CPython API is too broad to be called "universal".


> Maybe we could have a multiple levels of API:
>
> A) maximum portability (Py_LIMITED_API)
>
> B) source portability (non-stable ABI, inlined functions)
>
> C) portability but poor performance on non-CPython VMs
>(PySequence_Fast_ITEMS, borrowed refs, etc)
>

I don't know how doable that is as e.g. borrowed refs are not pleasant
to simulate.


> D) non-portability, CPython specific (access to more internals like
>Stefan was asking for).  The extension would have to be
>re-implemented on each VM or provide a pure Python
>alternative.


> I think it would be nice if the extension module could explicitly
> choose which level of API it wants to use.
>

Yes, and I thought we were working towards nesting our header files so you
very clearly opted into your level of compatibility.

In my head there's:
- bare minimum, cross-VM, gets you FFI
- CPython API for more performance that we're willing to maintain
- Everything open for e.g. CPython with no compatibility guarantees
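
A rough sketch of what such nested, opt-in headers could look like; every
file and macro name below is made up:

/* Python.h (hypothetical layout) */
#include "py_ffi.h"            /* level 1: bare minimum, cross-VM FFI */
#ifdef PY_CPYTHON_API
#  include "py_cpython.h"      /* level 2: CPython API, kept fast, maintained */
#endif
#ifdef PY_INTERNAL_API
#  include "py_internal.h"     /* level 3: everything open, no guarantees */
#endif

/* extension module: opts in to level 2 */
#define PY_CPYTHON_API
#include <Python.h>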

Do note my first point isn't necessarily worrying about crazy performance
to start. I would assume an alternative VM would help make up for this with
a faster runtime where dropping into C is more about FFI than performance
(we all know PyPy, for instance, wished people just wrote more Python code).

Otherwise we're back to the idea of standardizing on some Cython solution
to help make performance easier without tying oneself to the C API (like
Julia's FFI solution).


> It would be interesting to do a census on what extensions are out
> there.  If they mostly fall into wanting level "C" then I think this
> API overhaul is not going to work out too well.  Level C is mostly
> what we have now.  No point in putting the effort into A and B if no
> one will use them.


It won't until someone can show benefits for switching. This is very much a
chicken-and-egg problem.




Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-19 Thread Neil Schemenauer
On 2018-11-19, Antoine Pitrou wrote:
> There are important use cases for the C API where it is desired to have
> fast type-specific access to Python objects such as tuples, ints,
> strings, etc.  This is relied upon by modules such as _json and _pickle,
> and third-party extensions as well.

Thank you for pointing this out.  The feedback from Stefan on what
Cython would like (e.g. more access to functions that are currently
"internal") is useful too.  Keeping our dreams tied to reality
is important. ;-P

It seems to me that we can't "have our cake and eat it too". I.e. on
the one hand hide CPython implementation internals but on the other
hand allow extensions that want to take advantage of those internals
to provide the best performance.

Maybe we could have a multiple levels of API:

A) maximum portability (Py_LIMITED_API)

B) source portability (non-stable ABI, inlined functions)

C) portability but poor performance on non-CPython VMs
   (PySequence_Fast_ITEMS, borrowed refs, etc)

D) non-portability, CPython specific (access to more internals like
   Stefan was asking for).  The extension would have to be
   re-implemented on each VM or provide a pure Python
   alternative.

I think it would be nice if the extension module could explicitly
choose which level of API it wants to use.

It would be interesting to do a census on what extensions are out
there.  If they mostly fall into wanting level "C" then I think this
API overhaul is not going to work out too well.  Level C is mostly
what we have now.  No point in putting the effort into A and B if no
one will use them.


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-19 Thread Neil Schemenauer
On 2018-11-19, Victor Stinner wrote:
> Moreover, I failed to find anyone who can explain to me how the C API
> is used in the wild, which functions are important or not, what is
> the C API, etc.

One idea is to download a large sample of extension modules from
PyPI and then analyze them with some automated tool (maybe
libclang).  I guess it is possible there is a large non-public set
of extensions that we would miss.

Regards,

  Neil


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-19 Thread Victor Stinner
Hi Stefan,

On Mon, Nov 19, 2018 at 13:18, Stefan Krah wrote:
> In practice people desperately *have* to use whatever is there, including
> functions with underscores that are not even officially in the C-API.
>
> I have to use _PyFloat_Pack* in order to be compatible with CPython,

Oh, I never used this function. These functions are private (name
prefixed by "_") and excluded from the limited API.

For me, the limited API should be functions available on all Python
implementations. Does it make sense to provide PyFloat_Pack4() in
MicroPython, Jython, IronPython and PyPy? Or is it something more
specific to CPython? I don't know the answer. If yes, open an issue to
propose to make this function public?

> I need PyUnicode_KIND()

IMHO this one should not be part of the public API. The only usage
would be to micro-optimize, but such API is very specific to one
Python implementation. For example, PyPy doesn't use "compact string"
but UTF-8 internally. If you use PyUnicode_KIND(), your code becomes
incompatible with PyPy.

What is your use case?

I would prefer to expose the "_PyUnicodeWriter" API than PyUnicode_KIND().

> need PyUnicode_AsUTF8AndSize(),

Again, that's a micro-optimization and it's very specific to CPython:
result cached in the "immutable" str object. I don't want to put it in
a public API. PyUnicode_AsUTF8String() is better since it doesn't
require an internal cache.

> I *wish* there were PyUnicode_AsAsciiAndSize().

PyUnicode_AsASCIIString() looks good to me. Sadly, it doesn't return
the length, but usually the length is not needed.

Victor


[Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-19 Thread Stefan Krah


Victor Stinner wrote:

> Moreover, I failed to find anyone who can explain to me how the C API is used
> in the wild, which functions are important or not, what is the C API, etc.

In practice people desperately *have* to use whatever is there, including
functions with underscores that are not even officially in the C-API.

I have to use _PyFloat_Pack* in order to be compatible with CPython, I need
PySlice_Unpack() etc., I need PyUnicode_KIND(), need PyUnicode_AsUTF8AndSize(),
I *wish* there were PyUnicode_AsAsciiAndSize().


In general, in daily use of the C-API I wish it were *larger* and not smaller.

I often want functions that return C instead of Python values or functions
that take C instead of Python values.

The ideal situation for me would be a lower layer library, say libcpython.a
that has all those functions like _PyFloat_Pack*.

It would be an enormous amount of work though, especially since the status quo
kind of works.



Stefan Krah





Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-19 Thread Victor Stinner
To design a new C API, I see 3 options:

(1) add more functions to the existing Py_LIMITED_API
(2) "fork" the current public C API: remove functions and hide as much
implementation details as possible
(3) write a new C API from scratch, based on the current C API.
Something like #define newcapi_Object_GetItem PyObject_GetItem"?
Sorry, but "#undef " doesn't work. Only very few
functions are defined using "#define ...".

I dislike (1) because it's too far from what is currently used in
practice. Moreover, I failed to find anyone who can explain to me how the
C API is used in the wild, which functions are important or not, what
is the C API, etc.

I propose (2). We control how much changes we do at each milestone,
and we start from the maximum compatibility with current C API. Each
change can be discussed and experimented to define what is the C API,
what we want, etc. I'm working on this approach for 1 year, that's why
many discussions popped up around specific changes :-)

Some people recently proposed (3) on python-dev. I dislike this option
because it starts by breaking the backward compatibility. It looks
like (1), but worse. The goal and the implementation are unclear to
me.

--

Replacing PyDict_GetItem() (specialized call) with PyObject_GetItem()
(generic API) is not part of my short term plan. I wrote it in the
roadmap, but as I wrote before, each change should be discussed,
experimented, benchmarked, etc.

Victor
On Mon, Nov 19, 2018 at 12:02, M.-A. Lemburg wrote:
>
> On 19.11.2018 11:53, Antoine Pitrou wrote:
> > On Mon, 19 Nov 2018 11:28:46 +0100
> > Victor Stinner  wrote:
> >> Python internals rely on internals to implement further optimizations,
> >> than modifying an "immutable" tuple, bytes or str object, because you
> >> can do that at the C level. But I'm not sure that I would like 3rd
> >> party extensions to rely on such things.
> >
> > I'm not even talking about *modifying* tuples or str objects, I'm
> > talking about *accessing* their value without going through an abstract
> > API that does slot lookups, indirect function calls and object unboxing.
> >
> > For example, people may need a fast way to access the UTF-8
> > representation of a unicode object.  Without making indirect function
> > calls, and ideally without making a copy of the data either.  How do
> > you do that using the generic C API?
>
> Something else you need to consider is creating instances of
> types, e.g. a tuple. In C you will have to be able to put
> values into the data structure before it is passed outside
> the function in order to build the tuple.
>
> If you remove this possibility, you have to copy data all the
> time, losing the advantages of having a rich C API.
>  --
> Marc-Andre Lemburg
> eGenix.com
>
> Professional Python Services directly from the Experts (#1, Nov 19 2018)
> >>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
> >>> Python Database Interfaces ...   http://products.egenix.com/
> >>> Plone/Zope Database Interfaces ...   http://zope.egenix.com/
> 
>
> ::: We implement business ideas - efficiently in both time and costs :::
>
>eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
> D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
>Registered at Amtsgericht Duesseldorf: HRB 46611
>http://www.egenix.com/company/contact/
>   http://www.malemburg.com/
>


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-19 Thread Antoine Pitrou
On Mon, 19 Nov 2018 11:53:42 +0100
Antoine Pitrou  wrote:
> On Mon, 19 Nov 2018 11:28:46 +0100
> Victor Stinner  wrote:
> > I would expect that the most common source of speed up of a C
> > extension is the removal of the cost of bytecode evaluation (ceval.c
> > loop).  
> 
> Well, I don't.  All previous experiments showed that simply compiling
> Python code to C code using the "generic" C API yielded a 30%
> improvement.
> 
> Conversely, the C _pickle module can be 100x faster than the pure
> Python pickle module.  It's doing it *not* by using the generic C
> API, but by special-casing access to concrete types.  You don't get
> that level of performance simply by removing the cost of bytecode
> evaluation:
> 
> # C version
> $ python3 -m timeit -s "import pickle; x = list(range(1000))" "pickle.dumps(x)"
> 10 loops, best of 3: 19 usec per loop
> 
> # Python version
> $ python3 -m timeit -s "import pickle; x = list(range(1000))" "pickle._dumps(x)"
> 100 loops, best of 3: 2.25 msec per loop

And to show that this is important for third-party C extensions as
well, PyArrow (*) has comparable performance using similar techniques:

$ python -m timeit -s "import pyarrow as pa; x = list(range(1000))"
"pa.array(x, type=pa.int64())"
1 loops, best of 5: 27.2 usec per loop

(*) https://arrow.apache.org/docs/python/

Regards

Antoine.




Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-19 Thread M.-A. Lemburg
On 19.11.2018 11:53, Antoine Pitrou wrote:
> On Mon, 19 Nov 2018 11:28:46 +0100
> Victor Stinner  wrote:
>> Python internals rely on internals to implement further optimizations,
>> than modifying an "immutable" tuple, bytes or str object, because you
>> can do that at the C level. But I'm not sure that I would like 3rd
>> party extensions to rely on such things.
> 
> I'm not even talking about *modifying* tuples or str objects, I'm
> talking about *accessing* their value without going through an abstract
> API that does slot lookups, indirect function calls and object unboxing.
> 
> For example, people may need a fast way to access the UTF-8
> representation of a unicode object.  Without making indirect function
> calls, and ideally without making a copy of the data either.  How do
> you do that using the generic C API?

Something else you need to consider is creating instances of
types, e.g. a tuple. In C you will have to be able to put
values into the data structure before it is passed outside
the function in order to build the tuple.

If you remove this possibility, you have to copy data all the
time, losing the advantages of having a rich C API.
 --
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Nov 19 2018)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> Python Database Interfaces ...   http://products.egenix.com/
>>> Plone/Zope Database Interfaces ...   http://zope.egenix.com/


::: We implement business ideas - efficiently in both time and costs :::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
   http://www.egenix.com/company/contact/
  http://www.malemburg.com/



Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-19 Thread Antoine Pitrou
On Mon, 19 Nov 2018 11:28:46 +0100
Victor Stinner  wrote:
> I would expect that the most common source of speed up of a C
> extension is the removal of the cost of bytecode evaluation (ceval.c
> loop).

Well, I don't.  All previous experiments showed that simply compiling
Python code to C code using the "generic" C API yielded a 30%
improvement.

Conversely, the C _pickle module can be 100x faster than the pure
Python pickle module.  It's doing it *not* by using the generic C
API, but by special-casing access to concrete types.  You don't get
that level of performance simply by removing the cost of bytecode
evaluation:

# C version
$ python3 -m timeit -s "import pickle; x = list(range(1000))" "pickle.dumps(x)"
10 loops, best of 3: 19 usec per loop

# Python version
$ python3 -m timeit -s "import pickle; x = list(range(1000))" "pickle._dumps(x)"
100 loops, best of 3: 2.25 msec per loop

So, the numbers are on my side.  So is the abundant experience of
experts such as the Cython developers.

> Python internals rely on internals to implement further optimizations,
> than modifying an "immutable" tuple, bytes or str object, because you
> can do that at the C level. But I'm not sure that I would like 3rd
> party extensions to rely on such things.

I'm not even talking about *modifying* tuples or str objects, I'm
talking about *accessing* their value without going through an abstract
API that does slot lookups, indirect function calls and object unboxing.

For example, people may need a fast way to access the UTF-8
representation of a unicode object.  Without making indirect function
calls, and ideally without making a copy of the data either.  How do
you do that using the generic C API?
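
For reference, the CPython-specific answer today is PyUnicode_AsUTF8AndSize(),
which caches the UTF-8 bytes inside the str object, so no copy is made after
the first call; a minimal sketch:

#include <Python.h>

/* Return a pointer to the cached UTF-8 representation of s.  The buffer is
 * owned by s and stays valid as long as s is alive. */
static const char *
utf8_view(PyObject *s, Py_ssize_t *len)
{
    return PyUnicode_AsUTF8AndSize(s, len);
}

It is exactly this kind of call that has no equivalent in a purely generic API.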

Regards

Antoine.


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-19 Thread Victor Stinner
On Mon, Nov 19, 2018 at 10:48, Antoine Pitrou wrote:
> If the C API only provides Python-level semantics, then it will
> roughly have the speed of pure Python (modulo bytecode execution).
>
> There are important use cases for the C API where it is desired to have
> fast type-specific access to Python objects such as tuples, ints,
> strings, etc.  This is relied upon by modules such as _json and _pickle,
> and third-party extensions as well.

Are you sure that using PyDict_GetItem() is really way faster than
PyObject_GetItem()? Did someone run a benchmark to have numbers?
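
For context, the two calls also differ in semantics, not just speed; a small
sketch of what callers have to handle:

#include <Python.h>

static PyObject *
lookup(PyObject *d, PyObject *key)
{
    /* specialized: borrowed reference, returns NULL without setting an
     * exception when the key is missing */
    PyObject *borrowed = PyDict_GetItem(d, key);
    (void)borrowed;

    /* generic: new reference, raises KeyError when the key is missing */
    return PyObject_GetItem(d, key);
}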

I would expect that the most common source of speed up of a C
extension is the removal of the cost of bytecode evaluation (ceval.c
loop).

Python internals rely on internals to implement further optimizations,
than modifying an "immutable" tuple, bytes or str object, because you
can do that at the C level. But I'm not sure that I would like 3rd
party extensions to rely on such things. For example, unicodeobject.c
uses the following function to check if a str object can be modified
in-place, or if a new str object must be created:

#ifdef Py_DEBUG
static int
unicode_is_singleton(PyObject *unicode)
{
    PyASCIIObject *ascii = (PyASCIIObject *)unicode;
    if (unicode == unicode_empty)
        return 1;
    if (ascii->state.kind != PyUnicode_WCHAR_KIND && ascii->length == 1)
    {
        Py_UCS4 ch = PyUnicode_READ_CHAR(unicode, 0);
        if (ch < 256 && unicode_latin1[ch] == unicode)
            return 1;
    }
    return 0;
}
#endif

static int
unicode_modifiable(PyObject *unicode)
{
    assert(_PyUnicode_CHECK(unicode));
    if (Py_REFCNT(unicode) != 1)
        return 0;
    if (_PyUnicode_HASH(unicode) != -1)
        return 0;
    if (PyUnicode_CHECK_INTERNED(unicode))
        return 0;
    if (!PyUnicode_CheckExact(unicode))
        return 0;
#ifdef Py_DEBUG
    /* singleton refcount is greater than 1 */
    assert(!unicode_is_singleton(unicode));
#endif
    return 1;
}

Victor


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-19 Thread Antoine Pitrou
On Sun, 18 Nov 2018 16:53:19 +0100
Stefan Behnel  wrote:
> 
> So, in Cython, we use macros wherever possible, and often avoid generic
> protocols in favour of type specialisations. We sometimes keep local copies
> of C-API helper functions, because inlining them allows the C compiler to
> strip down and streamline the implementation at compile time, rather than
> jumping through generic code. (Also, it's sometimes required in order to
> backport new CPython features to Py2.7+.)

Also this approach allows those ballooning compile times that are part
of Cython's charm and appeal ;-)
(sorry, couldn't resist)

Regards

Antoine.




Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-19 Thread Antoine Pitrou
On Fri, 16 Nov 2018 09:46:36 -0800
Brett Cannon  wrote:
> 
> I think part of the challenge here (and I believe it has been brought up
> elsewhere) is no one knows what kind of API is necessary for some faster VM
> other than PyPy. To me, the only C API that we could potentially start
> working toward and promoting **today** is one which is stripped to its bare
> bones and at worst mirrors Python syntax. For instance, I have seen
> PyTuple_GET_ITEM() brought up a couple of times. But that's not syntax in
> Python, so I wouldn't feel comfortable including that in a simplified API.
> You really only need attribute access and object calling to make object
> indexing work, although for simplicity I can see wanting to provide an
> indexing API.

If the C API only provides Python-level semantics, then it will
roughly have the speed of pure Python (modulo bytecode execution).

There are important use cases for the C API where it is desired to have
fast type-specific access to Python objects such as tuples, ints,
strings, etc.  This is relied upon by modules such as _json and _pickle,
and third-party extensions as well.

Regards

Antoine.




Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-18 Thread Brett Cannon
On Fri, 16 Nov 2018 at 10:11, Paul Moore  wrote:

> On Fri, 16 Nov 2018 at 17:49, Brett Cannon  wrote:
> > And Just to be clear, I totally support coming up with a totally
> stripped-down C API as I have outlined above as that shouldn't be
> controversial for any VM that wants to have a C-level API.
>
> If a stripped down API like this is intended as "use this and you get
> compatibility across multiple Python interpreters and multiple Python
> versions" (essentially a much stronger and more effective version of
> the stable ABI) then I'm solidly in favour (and such an API has clear
> trade-offs that allow people to judge whether it's the right choice
> for them).
>

Yes, that's what I'm getting at. Basically we have to approach this from
the "start with nothing and build up until we have _just_ enough and thus
we know **everyone** now and into the future can support it", or we
approach with "take what we have now and start peeling back until we
_think_ it's good enough". Personally, I think the former is more
future-proof.


>
> Having this alongside the existing API, which would still be supported
> for projects that need low-level access or backward compatibility (or
> simply don't have the resources to change), but which will remain
> CPython-specific, seems like a perfectly fine idea.
>

And it can be done as wrappers around the current C API and as an external
project to start. As Nathaniel pointed out in another thread, this is
somewhat like what Py_LIMITED_API was meant to be, but I think we all admit
we slightly messed up by making it opt-out instead of opt-in and so we
didn't explicitly control that API as well as we probably should have (I
know I have probably screwed up by accidentally including import functions
by forgetting it was opt-out).

I also don't think it was necessarily designed from a minimalist
perspective to begin with as it defines things in terms of what's _not_ in
Py_LIMITED_API instead of explicitly listing what _is_. So it may (or may
not) lead to a different set of APIs in the end when you have to explicitly
list every API to include.


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-18 Thread Stefan Behnel
Neil Schemenauer wrote on 17.11.18 at 00:10:
> I think making PyObject an opaque pointer would help.

... well, as long as type checks are still as fast as with "ob_type", and
visible to the C compiler so that it can eliminate redundant ones, I
wouldn't mind. :)


> - Borrowed references are a problem.  However, because they are so
>   commonly used and because the source code changes needed to change
>   to a non-borrowed API is non-trivial, I don't think we should try
>   to change this.  Maybe we could just discourage their use?

FWIW, the code that Cython generates has a macro guard [1] that makes it
avoid borrowed references where possible, e.g. when it detects compilation
under PyPy. That's definitely doable already, right now.
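
A sketch of that pattern; the guard name below is illustrative rather than
Cython's actual macro, and PYPY_VERSION is the macro defined by PyPy's headers:

#include <Python.h>

#if defined(PYPY_VERSION)
#  define AVOID_BORROWED_REFS 1
#else
#  define AVOID_BORROWED_REFS 0
#endif

/* Return the first item of a non-empty list, always as a new reference. */
static PyObject *
first_item(PyObject *list)
{
#if AVOID_BORROWED_REFS
    return PySequence_GetItem(list, 0);          /* new ref, portable */
#else
    PyObject *item = PyList_GET_ITEM(list, 0);   /* borrowed ref, CPython fast path */
    Py_INCREF(item);
    return item;
#endif
}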


> - It would be nice to make PyTypeObject an opaque pointer as well.
>   I think that's a lot more difficult than making PyObject opaque.
>   So, I don't think we should attempt it in the near future.  Maybe
>   we could make a half-way step and discourage accessing ob_type
>   directly.  We would provide functions (probably inline) to do what
>   you would otherwise do by using op->ob_type->.

I've sometimes been annoyed by the fact that protocol checks require two
pointer indirections in CPython (or even three in some cases), so that the
C compiler is essentially prevented from making any assumptions, and the
CPU branch prediction is also stretched a bit more than necessary. At
least, the slot check usually comes right before the call, so that the
lookups are not wasted. Inline functions are unlikely to improve that
situation, but at least they shouldn't make it worse, and they would be
more explicit.

Needless to say that Cython also has a macro guard in [1] that disables
direct slot access and makes it fall back to C-API calls, for users and
Python implementations where direct slot support is not wanted/available.


>   One reason you want to discourage access to ob_type is that
>   internally there is not necessarily one PyTypeObject structure for
>   each Python level type.  E.g. the VM might have specialized types
>   for certain sub-domains.  This is like the different flavours of
>   strings, depending on the set of characters stored in them.  Or,
>   you could have different list types.  One type of list if all
>   values are ints, for example.

An implementation like this could also be based on the buffer protocol.
It's already supported by the array.array type (which people probably also
just use when they have a need like this and don't want to resort to NumPy).


>   Basically, with CPython op->ob_type is super fast.  For other VMs,
>   it could be a lot slower.  By accessing ob_type you are saying
>   "give me all possible type information for this object pointer".
>   By using functions to get just what you need, you could be putting
>   less burden on the VM.  E.g. "is this object an instance of some
>   type" is faster to compute.

Agreed. I think that inline functions (well, or macros, because why not?)
that check for certain protocols explicitly could be helpful.
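
A sketch of such explicit checks; they just wrap existing C-API calls, the
point being that the extension never reads ob_type or the tp_as_* slots itself:

#include <Python.h>

static inline int
supports_sequence_protocol(PyObject *op)
{
    return PySequence_Check(op);     /* "does it support X?", not "give me the type" */
}

static inline int
is_exact_list(PyObject *op)
{
    return PyList_CheckExact(op);    /* type identity without exposing PyTypeObject */
}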


> - APIs that return pointers to the internals of objects are a
>   problem.  E.g. PySequence_Fast_ITEMS().  For CPython, this is
>   really fast because it is just exposing the internal details of
>   the layout that is already in the correct format.  For other VMs,
>   that API could be expensive to emulate.  E.g. you have a list to
>   store only ints.  If someone calls PySequence_Fast_ITEMS(), you
>   have to create real PyObjects for all of the list elements.

But that's intended by the caller, right? They want a flat serial
representation of the sequence, with potential conversion to a (list) array
if necessary. They might be a bit badly named, but that's exactly the
contract of the "PySequence_Fast_*()" line of functions.
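
A short illustration of that contract (error handling for non-float items is
elided):

#include <Python.h>

static double
sum_as_doubles(PyObject *seq)
{
    PyObject *fast = PySequence_Fast(seq, "expected a sequence");
    if (fast == NULL)
        return -1.0;
    Py_ssize_t n = PySequence_Fast_GET_SIZE(fast);
    PyObject **items = PySequence_Fast_ITEMS(fast);   /* borrowed item refs */
    double total = 0.0;
    for (Py_ssize_t i = 0; i < n; i++)
        total += PyFloat_AsDouble(items[i]);
    Py_DECREF(fast);
    return total;
}

On CPython this is a pointer into the existing list/tuple; on another VM it
may force creation of real PyObjects for every element, which is the point
being made here.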

In Cython, we completely avoid these functions, because they are way too
generic for optimisation purposes. Direct type checks and code
specialisation are much more effective.


> - Reducing the size of the API seems helpful.  E.g. we don't need
>   PyObject_CallObject() *and* PyObject_Call().  Also, do we really
>   need all the type specific APIs, PyList_GetItem() vs
>   PyObject_GetItem()?  In some cases maybe we can justify the bigger
>   API due to performance.  To add a new API, someone should have a
>   benchmark that shows a real speedup (not just that they imagine it
>   makes a difference).

So, in Cython, we use macros wherever possible, and often avoid generic
protocols in favour of type specialisations. We sometimes keep local copies
of C-API helper functions, because inlining them allows the C compiler to
strip down and streamline the implementation at compile time, rather than
jumping through generic code. (Also, it's sometimes required in order to
backport new CPython features to Py2.7+.)

PyPy's cpyext often just maps type specific C-API functions to the same
generic code, obviously, but in CPython, having a way to bypass protocols
and going 

Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-16 Thread Neil Schemenauer
On 2018-11-16, Nathaniel Smith wrote:
> [..] it seems like you should investigate (a) whether you can make
> Py_LIMITED_API *be* that API, instead of having two different
> ifdefs

That might be a good idea.  One problem is that we might like to
make backwards incompatible changes to Py_LIMITED_API.  Maybe it
doesn't matter if no extensions actually use Py_LIMITED_API.
Keeping API and ABI compatibility with the existing Py_LIMITED_API
could be difficult.

What would be the downside of using a new CPP define?  We could
deprecate Py_LIMITED_API and the new API could do the job.

Also, I think extensions should have the option to turn the ABI
compatibility off.  For some extensions, they will not want to
convert if there is a big performance hit (some macros turn into
non-inlined functions, call functions rather than access a
non-opaque structure).

Maybe there is a reason my toggling idea won't work.  If we can use
a CPP define to toggle between inline and non-inline functions, I
think it should work.  Maybe it will get complicated.
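
A sketch of that toggle; PY_STABLE_ABI and the helper name are made up:

#include <Python.h>

#ifdef PY_STABLE_ABI
/* out-of-line: the call goes through libpython, so the extension's binary
 * keeps working across releases */
int MySketch_IsExactList(PyObject *op);
#else
/* inlined: fast, but bakes the current implementation into the extension */
static inline int
MySketch_IsExactList(PyObject *op)
{
    return PyList_CheckExact(op);
}
#endif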

Providing ABI compatibility like Py_LIMITED_API is a different goal
than making the API more friendly to alternative Python VMs.  So,
maybe it is a mistake to try to tackle both goals at once.  However,
the goals seem closely related and so it would be a shame to do a
bunch of work and not achieve both.


Regards,

  Neil


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-16 Thread Nathaniel Smith
On Fri, Nov 16, 2018 at 3:12 PM Neil Schemenauer  wrote:
> Also, the extension module should not take a big performance hit.
> So, you can't change all APIs that were macros into non-inlined
> functions.  People are not going to accept that and rightly so.
> However, it could be that we introduce a new ifdef like
> Py_LIMITED_API that gives a stable ABI.  E.g. when that's enabled,
> most everything would turn into non-inline functions.  In exchange
> for the performance hit, your extension would become ABI compatible
> between a range of CPython releases.  That would be a nice feature.
> Basically a more useful version of Py_LIMITED_API.

It seems like a lot of the things being talked about here actually
*are* features of Py_LIMITED_API. E.g. it does a lot of work to hide
the internal layout of PyTypeObject, and of course the whole selling
point is that it's stable across multiple Python versions.

If that's the kind of ABI you're looking for, then it seems like you
should investigate (a) whether you can make Py_LIMITED_API *be* that
API, instead of having two different ifdefs, (b) why no popular
extension modules actually use Py_LIMITED_API. I'm guessing it's
partly due to limits of the API, but also things like: lack of docs
and examples, lack of py2 support, ...

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-16 Thread Neil Schemenauer
On 2018-11-16, Brett Cannon wrote:
> I think part of the challenge here (and I believe it has been
> brought up elsewhere) is no one knows what kind of API is
> necessary for some faster VM other than PyPy.

I think we have some pretty good ideas as to what are the
problematic parts of the current API.  Victor's C-API web site has
details[1].  We can ask other implementors which parts are hard to
support.

Here are my thoughts about some desired changes:

- We are *not* getting rid of refcounting for extension modules.  That
  would require a whole new API.  We might as well start from
  scratch with Python 4.  No one wants that.  However, it is likely
  different VMs use a different GC internally and only use
  refcounting for objects passed through the C-API.  Using
  refcounted handles is the usual implementation approach.  We can
  make some changes to make that easier.  I think making PyObject an
  opaque pointer would help.

- Borrowed references are a problem.  However, because they are so
  commonly used and because the source code changes needed to change
  to a non-borrowed API are non-trivial, I don't think we should try
  to change this.  Maybe we could just discourage their use?  For
  CPython, using a borrowed reference API is faster.  For other
  Python implementations, it is likely slower and maybe much slower.
  So, if you are an extension module that wants to work well with
  other VMs, you should avoid those APIs (see the sketch after this
  list).

- It would be nice to make PyTypeObject an opaque pointer as well.
  I think that's a lot more difficult than making PyObject opaque.
  So, I don't think we should attempt it in the near future.  Maybe
  we could make a half-way step and discourage accessing ob_type
  directly.  We would provide functions (probably inline) to do what
  you would otherwise do by using op->ob_type-><field>.

  One reason you want to discourage access to ob_type is that
  internally there is not necessarily one PyTypeObject structure for
  each Python level type.  E.g. the VM might have specialized types
  for certain sub-domains.  This is like the different flavours of
  strings, depending on the set of characters stored in them.  Or,
  you could have different list types.  One type of list if all
  values are ints, for example.

  Basically, with CPython op->ob_type is super fast.  For other VMs,
  it could be a lot slower.  By accessing ob_type you are saying
  "give me all possible type information for this object pointer".
  By using functions to get just what you need, you could be putting
  less burden on the VM.  E.g. "is this object an instance of some
  type" is faster to compute.

- APIs that return pointers to the internals of objects are a
  problem.  E.g. PySequence_Fast_ITEMS().  For CPython, this is
  really fast because it is just exposing the internal details of
  the layout that is already in the correct format.  For other VMs,
  that API could be expensive to emulate.  E.g. you have a list to
  store only ints.  If someone calls PySequence_Fast_ITEMS(), you
  have to create real PyObjects for all of the list elements.

- Reducing the size of the API seems helpful.  E.g. we don't need
  PyObject_CallObject() *and* PyObject_Call().  Also, do we really
  need all the type specific APIs, PyList_GetItem() vs
  PyObject_GetItem()?  In some cases maybe we can justify the bigger
  API due to performance.  To add a new API, someone should have a
  benchmark that shows a real speedup (not just that they imagine it
  makes a difference).
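
Here is the sketch referenced in the borrowed-references item above.
It only uses existing functions; the point is the contrast between a
borrowed reference and a new reference:

#include <Python.h>

/* Borrowed reference: cheap on CPython (no refcount traffic), but an
   alternative VM may have to keep a real PyObject alive behind the
   caller's back to honour the borrow. */
static PyObject *
first_item_borrowed(PyObject *list)
{
    return PyList_GetItem(list, 0);         /* borrowed; do not Py_DECREF */
}

/* New reference: a little more work on CPython, but much friendlier to
   a VM that does not store PyObject* internally. */
static PyObject *
first_item_owned(PyObject *list)
{
    return PySequence_GetItem(list, 0);     /* new reference; caller owns it */
}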

I don't think we should change CPython internals to try to use this
new API.  E.g. we know that getting ob_type is fast so just leave
the code that does that alone.  Maybe in the far distant future,
if we have successfully got extension modules to switch to using
the new API, we could consider changing CPython internals.  There
would have to be a big benefit though to justify the code churn.
E.g. if my tagged pointers experiment shows significant performance
gains (it hasn't yet).

I like Nathaniel Smith's idea of doing the new API as a separate
project, outside the cpython repo.  It is possible that in that
effort, we would like some minor changes to cpython in order to make
the new API more efficient, for example.  Those should be pretty
limited changes because we are hoping that the new API will work on
top of old Python versions, e.g. 3.6.

To avoid exposing APIs that should be hidden, re-organizing include
files is an idea.  However, that doesn't help for old versions of
Python.  So, I'm thinking that Dino's idea of just duplicating the
prototypes would be better.  We would like a minimal API and so the
number of duplicated prototypes shouldn't be too large.

Victor's recent work in changing some macros to inline functions is
not really related to the new API project, IMHO.  I don't think
there is a problem to leave an existing macro as a macro.  If we
need to introduce new APIs, e.g. to help hide PyTypeObject, those
APIs could use inline functions.  That 

Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-16 Thread Paul Moore
On Fri, 16 Nov 2018 at 17:49, Brett Cannon  wrote:
> And Just to be clear, I totally support coming up with a totally 
> stripped-down C API as I have outlined above as that shouldn't be 
> controversial for any VM that wants to have a C-level API.

If a stripped down API like this is intended as "use this and you get
compatibility across multiple Python interpreters and multiple Python
versions" (essentially a much stronger and more effective version of
the stable ABI) then I'm solidly in favour (and such an API has clear
trade-offs that allow people to judge whether it's the right choice
for them).

Having this alongside the existing API, which would still be supported
for projects that need low-level access or backward compatibility (or
simply don't have the resources to change), but which will remain
CPython-specific, seems like a perfectly fine idea.

Paul


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-16 Thread Victor Stinner
Brett:
> But otherwise I think we are making assumptions here. For me, unless we are 
> trying to trim the C API down to just what is syntactically supported in 
> Python and in such a way that it hides all C-level details I feel like we are 
> guessing at what's best for other VMs, both today and in the future, until 
> they can tell us that e.g. tuple indexing is actually not a problem 
> performance-wise.

The current API of PyTuple_GET_ITEM() allows to write:

   PyObject **items = &PyTuple_GET_ITEM(tuple, 0);

to access PyTupleObject.ob_item. Not only is it possible, it is used
commonly in the CPython code base. Last week I replaced the
&PyTuple_GET_ITEM() pattern with a new _PyTuple_ITEMS() macro which is
private.

To be able to return PyObject**, you have to convert the full tuple
into PyObject* objects which is inefficient if your VM uses something
different (again, PyPy doesn't use PyObject* at all).
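
To make the cost concrete, compare the array-exposing pattern with a
per-call loop that assumes nothing about the tuple layout.  A sketch;
the first variant only works while PyTuple_GET_ITEM stays a macro over
ob_item:

#include <Python.h>

/* Leaks the layout: the caller gets a raw pointer into
   PyTupleObject.ob_item, so the VM really has to store an array of
   PyObject* inside every tuple. */
static Py_ssize_t
count_non_none_fast(PyObject *tuple)
{
    Py_ssize_t n = PyTuple_GET_SIZE(tuple);
    PyObject **items = &PyTuple_GET_ITEM(tuple, 0);
    Py_ssize_t count = 0;
    for (Py_ssize_t i = 0; i < n; i++) {
        if (items[i] != Py_None) {
            count++;
        }
    }
    return count;
}

/* Layout-neutral: every item goes through a call, so a different VM is
   free to create the PyObject* lazily. */
static Py_ssize_t
count_non_none_portable(PyObject *tuple)
{
    Py_ssize_t n = PyTuple_Size(tuple);
    Py_ssize_t count = 0;
    for (Py_ssize_t i = 0; i < n; i++) {
        PyObject *item = PyTuple_GetItem(tuple, i);  /* borrowed */
        if (item != Py_None) {
            count++;
        }
    }
    return count;
}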

More generally, I like to use PyTuple_GET_ITEM() example, just because
it's easy to understand this macro. But it's maybe not a good example
:-)

Victor


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-16 Thread Brett Cannon
On Wed, 14 Nov 2018 at 16:09, Gregory P. Smith  wrote:

> It seems like the discussion so far is:
>>
>> Victor: "I know people when people hear 'new API' they get scared and
>> think we're going to do a Python-3-like breaking transition, but don't
>> worry, we're never going to do that."
>> Nathaniel: "But then what does the new API add?"
>> Greg: "It lets us do a Python-3-like breaking transition!"
>>
>
> That is not what I am proposing but it seems too easy for people to
> misunderstand it as such. Sorry.
>
> Between everything discussed across this thread I believe we have enough
> information to suggest that we can avoid an "everyone's afraid of a new 3"
> mistake by instead making a shim available with a proposed new API that
> works on top of existing Python VM(s) so that if we decide to drop the old
> API being public in the future, we could do so *without a breaking
> transition*.
>

I know that has always been my hope, especially if any new API is actually
going to be more restrictive instead of broader.


>
> Given that, I suggest not worrying about defining a new C API within the
> CPython project and release itself (yet).
>

+1 from me. Until we have a PEP outlining the actual proposed API I'm not
ready to have it go into 'master'. Helping show the shape of the API by
wrapping pre-existing APIs I think that's going to be the way to sell it.


>
> Without an available benefit, little will use it (and given the function
> call overhead we want to isolate some concepts, we know it will perform
> worse on today's VMs).
>
> That "top-5" module using it idea?  Maintain forks (hooray for git) of
> whatever your definition of "top-5" projects is that use the new API
> instead of the CPython API.  If you attempt this on things like NumPy, you
> may be shocked at the states (plural on purpose) of their extension module
> code.  That holds true for a lot of popular modules.
>
> Part of the point of this work is to demonstrate that non-incremental
> order of magnitude performance change can be had on a Python VM that only
> supports such an API can be done in its own fork of CPython, PyPy,
> VictorBikeshedPy, FbIsAfraidToReleaseANewGcVmPy, etc. implementation to
> help argue for figuring out a viable not-breaking-the-world transition plan
> to do such a C API change thing in CPython itself.
>

I think part of the challenge here (and I believe it has been brought up
elsewhere) is no one knows what kind of API is necessary for some faster VM
other than PyPy. To me, the only C API that we could potentially start
working toward and promoting **today** is one which is stripped to its bare
bones and at worst mirrors Python syntax. For instance, I have seen
PyTuple_GET_ITEM() brought up a couple of times. But that's not syntax in
Python, so I wouldn't feel comfortable including that in a simplified API.
You really only need attribute access and object calling to make object
indexing work, although for simplicity I can see wanting to provide an
indexing API.

But otherwise I think we are making assumptions here. For me, unless we are
trying to trim the C API down to just what is syntactically supported in
Python and in such a way that it hides all C-level details I feel like we
are guessing at what's best for other VMs, both today and in the future,
until they can tell us that e.g. tuple indexing is actually not a problem
performance-wise.

And Just to be clear, I totally support coming up with a totally
stripped-down C API as I have outlined above as that shouldn't be
controversial for any VM that wants to have a C-level API.


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-14 Thread Gregory P. Smith
>
> It seems like the discussion so far is:
>
> Victor: "I know people when people hear 'new API' they get scared and
> think we're going to do a Python-3-like breaking transition, but don't
> worry, we're never going to do that."
> Nathaniel: "But then what does the new API add?"
> Greg: "It lets us do a Python-3-like breaking transition!"
>

That is not what I am proposing but it seems too easy for people to
misunderstand it as such. Sorry.

Between everything discussed across this thread I believe we have enough
information to suggest that we can avoid an "everyone's afraid of a new 3"
mistake by instead making a shim available with a proposed new API that
works on top of existing Python VM(s) so that if we decide to drop the old
API being public in the future, we could do so *without a breaking
transition*.

Given that, I suggest not worrying about defining a new C API within the
CPython project and release itself (yet).

Without an available benefit, little will use it (and given the function
call overhead we want to isolate some concepts, we know it will perform
worse on today's VMs).

That "top-5" module using it idea?  Maintain forks (hooray for git) of
whatever your definition of "top-5" projects is that use the new API
instead of the CPython API.  If you attempt this on things like NumPy, you
may be shocked at the states (plural on purpose) of their extension module
code.  That holds true for a lot of popular modules.

Part of the point of this work is to demonstrate that non-incremental order
of magnitude performance change can be had on a Python VM that only
supports such an API can be done in its own fork of CPython, PyPy,
VictorBikeshedPy, FbIsAfraidToReleaseANewGcVmPy, etc. implementation to
help argue for figuring out a viable not-breaking-the-world transition plan
to do such a C API change thing in CPython itself.

-gps


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-14 Thread Victor Stinner
Le mer. 14 nov. 2018 à 17:28, Paul Moore  a écrit :
> OK, got it. Thanks for taking the time to clarify and respond to my
> concerns. Much appreciated.

It's my fault. I am failing to explain my plan properly. It seems like
I have to update my website to explain it better :-)

Victor


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-14 Thread Paul Moore
On Wed, 14 Nov 2018 at 16:00, Victor Stinner  wrote:
>
> In short, you don't have to modify your C extensions and they will
> continue to work as before on Python 3.8.
[...]
> I hope that "later" we will get a faster CPython using <insert new optimizations there>, only compatible with C extensions compiled
> with the new C API. My secret hope is that it should ease the
> experimentation of a (yet another) JIT compiler for CPython :-)

OK, got it. Thanks for taking the time to clarify and respond to my
concerns. Much appreciated.
Paul


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-14 Thread Victor Stinner
In short, you don't have to modify your C extensions and they will
continue to work as before on Python 3.8.

I only propose to add a new C API, don't touch the existing one in any
way. Introducing backward incompatible changes to the existing C API
is out of my plan.

/usr/bin/python3.8 will support C extensions compiled with the old C
API and C extensions compiled with the new C API.

My plan also includes being able to write C extensions compatible with
the old and new C API from a single code base, just as we have Python
code working nicely on Python 2 and Python 3 (thanks mostly to six). In
my experience, having two branches or two repositories for two flavors
of the same code is a nice recipe for inconsistent code and a painful
workflow.
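
A sketch of what that single code base could look like, assuming the
opt-in define ends up being spelled Py_NEWCAPI and assuming a
hypothetical PyTuple_GetItemRef() that returns a new reference (neither
name is settled):

#include <Python.h>

/* One helper, two spellings, selected at compile time. */
static PyObject *
get_first(PyObject *tuple)
{
#ifdef Py_NEWCAPI
    /* hypothetical new-API call returning a new reference */
    return PyTuple_GetItemRef(tuple, 0);
#else
    /* classic API: borrowed reference, upgraded to a new one */
    PyObject *item = PyTuple_GET_ITEM(tuple, 0);
    Py_XINCREF(item);
    return item;
#endif
}

The same extension source then compiles against either flavor; only the
define changes.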

Le mer. 14 nov. 2018 à 15:53, Paul Moore  a écrit :
> It occurs to me that we may be talking at cross purposes. I noticed
> https://pythoncapi.readthedocs.io/backward_compatibility.html#forward-compatibility-with-python-3-8-and-newer
> which seems to be saying that 3rd party code *will* need to change for
> 3.8.

Oh. It's badly explained in that case. This section is only about C
extensions which really want to become compatible with the new C API.

> You mention removed functions there, so I guess "stop using the
> removed functions and you'll work with 3.8+ and <=3.7" is the
> compatible approach - but it doesn't offer a way for projects that
> *need* the functionality that's been removed to move forward.

If you need a removed function, don't use the new C API.

> So to try to be clear, your proposal is that in 3.8:
>
> 1. The existing C API will remain
> 2. A new C API will be *added* that 3rd party projects can use should
> they wish to.

Yes, that's it. Add a new API, don't touch existing API.

> And in 3.9 onwards, both C APIs will remain, maybe with gradual and
> incremental changes that move users of the existing C API closer and
> closer to the new one (via deprecations, replacement APIs etc as per
> our normal compatibility rules).

Honestly, it's too early to say if we should modify the current C API
in any way.

I only plan to put advices in the *documentation*. Something like
"this function is really broken, don't use it" :-) Or "you can use xxx
instead which makes your code compatible with the new C API". But I
don't plan to modify the doc soon. It's too early at this point.

> Or is the intention that at *some*
> point there will be a compatibility break and the existing API will
> simply be removed in favour of the "new" API?

That's out of the scope of *my* plan.

Maybe someone else will show up in 10 years and say "ok, let's
deprecate the old C API". But in my experience, legacy stuff never
goes away :-) (Python 2, anyone?)

> The above is clear, but I don't see what incentive there is in that
> scenario for anyone to actually migrate to the new API...

https://pythoncapi.readthedocs.io/ tries to explain why you should
want to be compatible with the new C API.

The main advantage of the new C API is to compile your C extension
once and use it on multiple runtimes:

* use PyPy for better performances (better than with the old C API)
* use a Python Debug Runtime which contains additional runtime checks
to detect various kinds of bugs in your C extension
* distribute a single binary working on multiple Python versions
(compile on 3.8, use it on 3.9): "stable ABI" -- we are not there yet,
I didn't check what should be done in practice for that

I hope that "later" we will get a faster CPython using <insert new optimizations there>, only compatible with C extensions compiled
with the new C API. My secret hope is that it should ease the
experimentation of a (yet another) JIT compiler for CPython :-)

Victor


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-14 Thread Paul Moore
On Wed, 14 Nov 2018 at 14:39, Paul Moore  wrote:

> If it is the case that there's no need for any 3rd party code to
> change in order to continue working with 3.8+, then I apologise for
> the interruption.

This is where being able to edit posts, a la Discourse would be useful :-)

It occurs to me that we may be talking at cross purposes. I noticed
https://pythoncapi.readthedocs.io/backward_compatibility.html#forward-compatibility-with-python-3-8-and-newer
which seems to be saying that 3rd party code *will* need to change for
3.8. You mention removed functions there, so I guess "stop using the
removed functions and you'll work with 3.8+ and <=3.7" is the
compatible approach - but it doesn't offer a way for projects that
*need* the functionality that's been removed to move forward. That's
the type of hard break that I was trying to ask about, and which I
thought you said would not happen when you stated "I don't want to
force anyone to move to a new experimental API", and "No, the current
C API will remain available. No one is forced to do anything. That's
not part of my plan".

So to try to be clear, your proposal is that in 3.8:

1. The existing C API will remain
2. A new C API will be *added* that 3rd party projects can use should
they wish to.

And in 3.9 onwards, both C APIs will remain, maybe with gradual and
incremental changes that move users of the existing C API closer and
closer to the new one (via deprecations, replacement APIs etc as per
our normal compatibility rules). Or is the intention that at *some*
point there will be a compatibility break and the existing API will
simply be removed in favour of the "new" API? Fundamentally, that's
what I'm trying to get a clear picture of.

The above is clear, but I don't see what incentive there is in that
scenario for anyone to actually migrate to the new API...
Paul


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-14 Thread Paul Moore
On Wed, 14 Nov 2018 at 14:28, Victor Stinner  wrote:
> > assuming the experiment is successful, forced (as opposed to opt-in)
> > migration to the new API would be handled in a gradual,
>
> No, the current C API will remain available. No one is forced to do
> anything. That's not part of my plan.

Oh, cool. So current code will continue working indefinitely? What's
the incentive for projects to switch to the new API in that case?
Won't we just end up having to carry two APIs indefinitely? Sorry if
this is all obvious, or was explained previously - as I said I've not
been following precisely because I assumed it was all being handled on
an "if you don't care you can ignore it and nothing will change"
basis, but Raymond's comments plus your suggestion that you needed to
test existing C extensions, made me wonder.

If it is the case that there's no need for any 3rd party code to
change in order to continue working with 3.8+, then I apologise for
the interruption.
Paul


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-14 Thread Victor Stinner
Le mer. 14 nov. 2018 à 14:36, Paul Moore  a écrit :
> PS What percentage does "top 5" translate to? In terms of both
> downloads and actual numbers of extensions? With only 5, it would be
> very easy (I suspect) to get only scientific packages, and (for
> example) miss out totally on database APIs, or web helpers. You'll
> likely get a broader sense of where issues lie if you cover a wide
> range of application domains.

I don't want to force anyone to move to a new experimental API. I
don't want to propose patches to third party modules for example. I
would like to ensure that I don't break too many C extensions, or that
tools to convert C extensions to the new API work as expected :-)

Everything is experimental.

> PPS I'd like to see a summary of your backward compatibility plan.

https://pythoncapi.readthedocs.io/backward_compatibility.html

> assuming the experiment is successful, forced (as opposed to opt-in)
> migration to the new API would be handled in a gradual,

No, the current C API will remain available. No one is forced to do
anything. That's not part of my plan.

Victor


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-14 Thread Paul Moore
On Tue, 13 Nov 2018 at 21:02, Victor Stinner  wrote:

> My plan is to select something like the top five most popular C
> extensions based on PyPI download statistics. I cannot test
> everything, I have to put practical limits.

You should probably also consider embedding applications - these have
the potential to be adversely affected too. One example would be vim,
which embeds Python, and makes fairly heavy use of the API (in some
relatively nonstandard ways, for better or worse).

Paul

PS What percentage does "top 5" translate to? In terms of both
downloads and actual numbers of extensions? With only 5, it would be
very easy (I suspect) to get only scientific packages, and (for
example) miss out totally on database APIs, or web helpers. You'll
likely get a broader sense of where issues lie if you cover a wide
range of application domains.

PPS I'd like to see a summary of your backward compatibility plan.
I've not been following this thread so maybe I missed it (if so, a
pointer would be appreciated), but I'd expect as a user that
extensions and embedding applications would *not* need a major rewrite
to work with Python 3.8 - that being the implication of "opt in". I'd
also expect that to remain true for any future version of Python -
assuming the experiment is successful, forced (as opposed to opt-in)
migration to the new API would be handled in a gradual, backward
compatibility respecting manner, exactly as any other changes to the C
API are. A hard break like Python 3, even if limited to the C API,
would be bad for users (for example, preventing adoption of Python 3.X
until the scientific stack migrates to the new API and works out how
to handle supporting old-API versions of Python...)


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-14 Thread André Malo
On Dienstag, 13. November 2018 21:59:14 CET Victor Stinner wrote:
> Le mar. 13 nov. 2018 à 20:32, André Malo  a écrit :
> > As long as they are recompiled. However, they will lose a lot of
> > performance. Both these points have been mentioned somewhere, I'm
> > certain, but it cannot be stressed enough, IMHO.
> 
> Somewhere is here:
> https://pythoncapi.readthedocs.io/performance.html

> > I'm wondering, how you suggest to measure "major". I believe, every C
> > extension, which is public and running in production somewhere, is major
> > enough.
> 
> My plan is to select something like the top five most popular C
> extensions based on PyPI download statistics. I cannot test
> everything, I have to put practical limits.

You shouldn't. Chances are that you don't even know them well enough to do that. 
A scalable approach would be to talk to the projects and let them do it 
instead. No?

Cheers,
-- 
package Hacker::Perl::Another::Just;print
qq~@{[reverse split/::/ =>__PACKAGE__]}~;

#  André Malo  #  http://www.perlig.de  #




Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-14 Thread Antoine Pitrou
On Wed, 14 Nov 2018 11:48:15 +0100
Victor Stinner  wrote:
> Le mer. 14 nov. 2018 à 11:24, Antoine Pitrou  a écrit :
> > For example in PyArrow we use PySequence_Fast_GET_ITEM() (*)  
> 
> Maybe PyArrow is a kind of C extension which should have one
> implementation for the new C API (PyPy) and one implementation for the
> current C API (CPython)?

Yes, maybe.  I'm just pointing out that we're using those macros and
removing them from the C API (or replacing them with non-inline
functions) would hurt us.

> > and even
> > PyType_HasFeature() (**) (to quickly check for multiple base types with
> > a single fetch and comparison).  
> 
> I'm not sure that PyType_HasFeature() is an issue?

I don't know.  You're the one who decides :-)

cheers

Antoine.


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-14 Thread Victor Stinner
Le mer. 14 nov. 2018 à 11:24, Antoine Pitrou  a écrit :
> For example in PyArrow we use PySequence_Fast_GET_ITEM() (*)

Maybe PyArrow is a kind of C extension which should have one
implementation for the new C API (PyPy) and one implementation for the
current C API (CPython)?

Cython can be used to generate two different C code outputs from the
same source code, using a different compilation mode.

> and even
> PyType_HasFeature() (**) (to quickly check for multiple base types with
> a single fetch and comparison).

I'm not sure that PyType_HasFeature() is an issue?

Victor


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-14 Thread Antoine Pitrou
On Wed, 14 Nov 2018 11:03:49 +0100
Victor Stinner  wrote:
> 
> Oh, I should stop to promote my "CPython fork" idea.
> 
> There is already an existing VM which is way faster than CPython but
> its performances are limited by the current C API. The VM is called...
> PyPy!
> 
> The bet is that migrating to a new C API would make your C extension faster.

Faster on PyPy... but potentially slower on CPython.  That's what we
(you :-)) need to investigate and solve.  Those macros and inline
functions are actually important for many use cases.

For example in PyArrow we use PySequence_Fast_GET_ITEM() (*) and even
PyType_HasFeature() (**) (to quickly check for multiple base types with
a single fetch and comparison).

(*)
https://github.com/apache/arrow/blob/master/cpp/src/arrow/python/iterators.h#L39-L86
(**)
https://github.com/apache/arrow/blob/master/cpp/src/arrow/python/helpers.cc#L266-L299
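
Roughly, the two patterns look like this (simplified sketches, not
copied from the PyArrow sources):

#include <Python.h>

/* Tight loop: PySequence_Fast() normalizes to a list/tuple once, then
   the macros index the underlying array with no per-item call. */
static int
visit_all(PyObject *seq)
{
    PyObject *fast = PySequence_Fast(seq, "expected a sequence");
    if (fast == NULL) {
        return -1;
    }
    Py_ssize_t n = PySequence_Fast_GET_SIZE(fast);
    for (Py_ssize_t i = 0; i < n; i++) {
        PyObject *item = PySequence_Fast_GET_ITEM(fast, i);  /* borrowed */
        (void)item;  /* ... process item ... */
    }
    Py_DECREF(fast);
    return 0;
}

/* Cheap base-type check: one flag test instead of a chain of
   PyObject_IsInstance() calls. */
static int
is_int_like(PyObject *obj)
{
    return PyType_HasFeature(Py_TYPE(obj), Py_TPFLAGS_LONG_SUBCLASS);
}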

Regards

Antoine.




Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-14 Thread Victor Stinner
Le mer. 14 nov. 2018 à 03:24, Nathaniel Smith  a écrit :
> So I think what you're saying is that your goal is to get a
> new/better/shinier VM, and the plan to accomplish that is:
>
> 1. Define a new C API.
> 2. Migrate projects to the new C API.
> 3. Build a new VM that gets benefits from only supporting the new API.
>
> This sounds exactly backwards to me?
>
> If you define the new API before you build the VM, then no-one is
> going to migrate, because why should they bother? You'd be asking
> overworked third-party maintainers to do a bunch of work with no
> benefit, except that maybe someday later something good might happen.

Oh, I should stop to promote my "CPython fork" idea.

There is already an existing VM which is way faster than CPython but
its performances are limited by the current C API. The VM is called...
PyPy!

The bet is that migrating to a new C API would make your C extension faster.

Victor


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-13 Thread Nathaniel Smith
On Sun, Nov 11, 2018 at 3:19 PM, Victor Stinner  wrote:
> I'm not sure yet how far we should go towards a perfect API which
> doesn't leak everything. We have to move slowly, and make sure that we
> don't break major C extensions. We need to write tools to fully
> automate the conversion. If it's not possible, maybe the whole project
> will fail.

This is why I'm nervous about adding this directly to CPython. If
we're just talking about adding a few new API calls to replace old
ones that are awkward to use, then that's fine, that's not very risky.
But if you're talking about a large project that makes fundamental
changes in the C API (e.g., disallowing pointer dereferences, like
tagged pointers do), then yeah, there's a very large risk that that
might fail.

>> If so, then would it make more sense to develop this as an actual
>> separate abstraction layer? That would have the huge advantage that it
>> could be distributed and versioned separately from CPython, different
>> packages could use different versions of the abstraction layer, PyPy
>> isn't forced to immediately add a bunch of new APIs...
>
> I didn't investigate this option. But I expect that you will have to
> write a full new API using a different prefix than "Py_". Otherwise,
> I'm not sure how you want to handle PyTuple_GET_ITEM() as a macro on
> one side (Include/tupleobject.h) and PyTuple_GET_ITEM() on the other
> side (hypotetical_new_api.h).
>
> Would it mean to duplicate all functions to get a different prefix?
>
> If you keep the "Py_" prefix, what I would like to ensure is that some
> functions are no longer accessible. How you remove
> PySequence_Fast_GET_ITEM() for example?
>
> For me, it seems simpler to modify CPython headers than starting on
> something new. It seems simpler to choose the proper level of
> compatibility. I start from an API 100% compatible (the current C
> API), and decide what is changed and how.

It may be simpler, but it's hugely more risky. Once you add something
to CPython, you can't take it back again without a huge amount of
work. You said above that the whole project might fail. But if it's in
CPython, failure is not acceptable! The whole problem you're trying to
solve is that the C API is too big, but your proposed solution starts
by making it bigger, so if your project fails then it makes the
problem even bigger...

I don't know if making it a separate project is the best approach or
not, it was just an idea :-). But it would have the huge benefit that
you can actually experiment and try things out without committing to
supporting them forever.

And I don't know the best answer to all your questions above, that's
what experimenting is for :-). But it certainly is technically
possible to make a new API that shares a common subset with the old
API, e.g.:

/* NewPython.h */
#include <Python.h>
#define PyTuple_GET_ITEM PyTuple_Get_Item
#undef PySequence_Fast_GET_ITEM

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-13 Thread Nathaniel Smith
On Mon, Nov 12, 2018 at 10:46 PM, Gregory P. Smith  wrote:
>
> On Fri, Nov 9, 2018 at 5:50 PM Nathaniel Smith  wrote:
>>
>> On Fri, Nov 9, 2018 at 4:30 PM, Victor Stinner 
>> wrote:
>> > Ah, important points. I don't want to touch the current C API nor make
>> > it less efficient. And compatibility in both directions (current C API
>> > <=> new C API) is very important for me. There is no such plan as
>> > "Python 4" which would break the world and *force* everybody to
>> > upgrade to the new C API, or stay to Python 3 forever. No. The new C
>> > API must be an opt-in option, and current C API remains the default
>> > and not be changed.
>>
>> Doesn't this mean that you're just making the C API larger and more
>> complicated, rather than simplifying it? You cite some benefits
>> (tagged pointers, changing the layout of PyObject, making PyPy's life
>> easier), but I don't see how you can do any of those things so long as
>> the current C API remains supported.
[...]
> I'd love to get to a situation where the only valid ABI we support knows 
> nothing about internal structs at all. Today, PyObject memory layout is 
> exposed to the world and unchangable. :(
> This is a long process release wise (assume multiple stable releases go by 
> before we could declare that).

It seems like the discussion so far is:

Victor: "I know people when people hear 'new API' they get scared and
think we're going to do a Python-3-like breaking transition, but don't
worry, we're never going to do that."
Nathaniel: "But then what does the new API add?"
Greg: "It lets us do a Python-3-like breaking transition!"

To make a new API work we need to *either* have some plan for how it
will produce benefits without a big breaking transition, *or* some
plan for how to make this kind of transition viable. These are both
super super hard questions -- that's why this discussion has been
dragging on for a decade now! But you do have to pick one or the other
:-).

> Experimentation with new internal implementations can begin once we have a 
> new C API by explicitly breaking the old C API with-in such experiments (as 
> is required for most anything interesting).  All code that is written to the 
> new C API still works during this process, thus making the job of practical 
> testing of such new VM internals easier.

So I think what you're saying is that your goal is to get a
new/better/shinier VM, and the plan to accomplish that is:

1. Define a new C API.
2. Migrate projects to the new C API.
3. Build a new VM that gets benefits from only supporting the new API.

This sounds exactly backwards to me?

If you define the new API before you build the VM, then no-one is
going to migrate, because why should they bother? You'd be asking
overworked third-party maintainers to do a bunch of work with no
benefit, except that maybe someday later something good might happen.

And if you define the new API first, then when you start building the
VM you're 100% guaranteed to discover that the new API isn't *quite*
right for the optimizations you want to do, and have to change it
again to make a new-new API. And then go back to the maintainers who
you did convince to put their neck out and do work on spec, and
explain that haha whoops actually they need to update their code
*again*.

There have been lots of Python VM projects at this point. They've
faced many challenges, but I don't think any have failed because there
just wasn't enough pure-Python code around to test the VM internals.
If I were trying to build a new Python VM, that's not even in the top
10 of issues I'd be worried about...

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-13 Thread Victor Stinner
Le mar. 13 nov. 2018 à 20:32, André Malo  a écrit :
> As long as they are recompiled. However, they will lose a lot of performance.
> Both these points have been mentioned somewhere, I'm certain, but it cannot be
> stressed enough, IMHO.

Somewhere is here:
https://pythoncapi.readthedocs.io/performance.html

> I'm wondering, how you suggest to measure "major". I believe, every C
> extension, which is public and running in production somewhere, is major
> enough.

My plan is to select something like the top five most popular C
extensions based on PyPI download statistics. I cannot test
everything, I have to put practical limits.

Victor


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-13 Thread André Malo
Victor Stinner wrote:

> Replacing macros with functions has little impact on backward
> compatibility. Most C extensions should still work if macros become
> functions.

As long as they are recompiled. However, they will lose a lot of performance. 
Both these points have been mentioned somewhere, I'm certain, but it cannot be 
stressed enough, IMHO.

> 
> I'm not sure yet how far we should go towards a perfect API which
> doesn't leak everything. We have to move slowly, and make sure that we
> don't break major C extensions. We need to write tools to fully
> automate the conversion. If it's not possible, maybe the whole project
> will fail.

I'm wondering, how you suggest to measure "major". I believe, every C 
extension, which is public and running in production somewhere, is major 
enough.

Maybe "easiness to fix"? Lines of code?

Cheers,


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-13 Thread Victor Stinner
Le mar. 13 nov. 2018 à 08:13, Gregory P. Smith  a écrit :
> When things have only ever been macros (Py_INCREF, etc) the name can be 
> reused if there has never been a function of that name in an old C API.  But 
> beware of reuse for anything where the semantics change to avoid 
> misunderstandings about behavior from people familiar with the old API or 
> googling API names to look up behavior.

My plan is to only keep an existing function if it has no flaw. If it
has a flaw, it should be removed and maybe replaced with a new
function (or suggest a replacement using existing APIs). I don't want
to modify the behavior depending if it's the "old" or the "new" API.
My plan reuses the same code base, I don't want to put the whole body
of a function inside a "#ifdef NEWCAPI".


> I suspect optimizing for ease of transition from code written to the existing 
> C API to the new API by keeping names the same is the wrong thing to optimize 
> for.

Not all functions in the current C API are bad. Many functions are
just fine. For example, PyObject_GetAttr() returns a strong reference.
I don't see anything wrong with this API. Only a small portion of the
C API is "bad".
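
For example, side by side (existing functions only; the second one is
the kind of API this thread flags as problematic):

#include <Python.h>

/* Fine: PyObject_GetAttr() returns a strong reference and reports
   errors with NULL; nothing here depends on CPython internals. */
static PyObject *
get_name_attr(PyObject *obj)
{
    PyObject *key = PyUnicode_FromString("name");
    if (key == NULL) {
        return NULL;
    }
    PyObject *value = PyObject_GetAttr(obj, key);   /* new reference */
    Py_DECREF(key);
    return value;                                   /* caller must Py_DECREF */
}

/* Flawed, in the sense discussed in this thread: PyDict_GetItem()
   returns a borrowed reference and suppresses errors. */
static PyObject *
lookup(PyObject *dict, PyObject *key)
{
    return PyDict_GetItem(dict, key);   /* borrowed; NULL just means "missing" */
}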


> Using entirely new names may actually be a good thing as it makes it 
> immediately clear which way a given piece of code is written. It'd also be 
> good for PyObject* the old C API thing be a different type from PythonHandle* 
> (a new API thing who's name I just made up) such that they could not be 
> passed around and exchanged for one another without a compiler complaint.  
> Code written using both APIs should not be allowed to transit objects 
> directly between different APIs.

On Windows, the HANDLE type is just an integer, it's not a pointer. If
it's a pointer, some developer may want to dereference it, whereas it
must really be a dummy integer. Consider tagged pointers: you don't
want to dereferenced a tagged pointer. But no, I don't plan to replace
"PyObject*". Again, I want to reduce the number of changes. If the
PyObject structure is not exposed, I don't think that it's an issue to
keep "PyObject*" type.

Example:
---
#include <stdlib.h>

typedef struct _object PyObject;

PyObject* dummy(void)
{
    return (PyObject *)NULL;
}

int main()
{
    PyObject *obj = dummy();
    return obj->ob_type;
}
---

This program is valid, except for the single line which attempts to
dereference PyObject*:

x.c: In function 'main':
x.c:13:15: error: dereferencing pointer to incomplete type 'PyObject
{aka struct _object}'
 return obj->ob_type;

If I could restart from scratch, I would design the C API differently.
For example, I'm not sure that I would use "global variables" (Python
thread state) to store the current exception. I would use something
similar to Rust error handling:
https://doc.rust-lang.org/book/first-edition/error-handling.html
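
Purely to illustrate that aside, a Rust-like "result" shape in C might
look like this; every name below is hypothetical and this is not being
proposed:

#include <Python.h>

/* Hypothetical only: the error travels in the return value instead of
   in the thread state.  None of these names exist in any API. */
typedef struct {
    PyObject *value;    /* non-NULL on success */
    PyObject *error;    /* non-NULL on failure */
} PyResult;

static PyResult
checked_divide(long a, long b)
{
    PyResult r = {NULL, NULL};
    if (b == 0) {
        r.error = PyUnicode_FromString("division by zero");
    }
    else {
        r.value = PyLong_FromLong(a / b);
    }
    return r;
}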

But that's not my plan. My plan is not to write a new bright world. My
plan is to make a "small step" towards a better API to make PyPy more
efficient and to allow writing a new, more optimized CPython.

I also plan to *iterate* on the API rather than having a frozen API.
It's just that we cannot jump towards the perfect API at once. We need
small steps and make sure that we don't break too many C extensions at
each milestone. Maybe the new API should be versioned as Android NDK
for example.

Victor


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-12 Thread Gregory P. Smith
On Sun, Nov 11, 2018 at 3:19 PM Victor Stinner  wrote:

> Le sam. 10 nov. 2018 à 04:02, Nathaniel Smith  a écrit :
> > So is it fair to say that your plan is that CPython will always use
> > the current ("old") API internally, and the "new" API will be
> > essentially an abstraction layer, that's designed to let people write
> > C extensions that target the old API, while also being flexible enough
> > to target PyPy and other "new different Python runtimes"?
>
> What we call the "Python C API" is not really an API. I mean, nobody
> tried to sit down and think about a proper API to access Python from
> C. What happened is that we had an implementation of Python written in C
> and it was cool to just "expose everything". By everything, I mean
> "everything". It's hard to find a secret CPython feature not exposed
> or leaked in the C API.
>
> The problem that I'm trying to solve is to "fix the C API" to hide
> implementation details, with one constraint: don't create a new
> "Python 4". I don't want to break the backward compatibility without
> having a slow and smooth transition plan. The "old" and new C API must
> be supported in parallel, at least in the standard CPython,
> /usr/bin/python3.
>
> Writing a new API from scratch is nice, but it's harder to move all
> existing C extensions from the "old" C API to a new one.
>
> Replacing macros with functions has little impact on backward
> compatibility. Most C extensions should still work if macros become
> functions.
>
> I'm not sure yet how far we should go towards a perfect API which
> doesn't leak everything. We have to move slowly, and make sure that we
> don't break major C extensions. We need to write tools to fully
> automate the conversion. If it's not possible, maybe the whole project
> will fail.
>
> I'm looking for practical solutions based on the existing C API and
> the existing CPython code base.
>
> > If so, then would it make more sense to develop this as an actual
> > separate abstraction layer? That would have the huge advantage that it
> > could be distributed and versioned separately from CPython, different
> > packages could use different versions of the abstraction layer, PyPy
> > isn't forced to immediately add a bunch of new APIs...
>

I like where this thought is headed!

I didn't investigate this option. But I expect that you will have to
> write a full new API using a different prefix than "Py_". Otherwise,
> I'm not sure how you want to handle PyTuple_GET_ITEM() as a macro on
> one side (Include/tupleobject.h) and PyTuple_GET_ITEM() on the other
> side (hypotetical_new_api.h).
>
> Would it mean to duplicate all functions to get a different prefix?
>

For strict C, yes, namespacing has always been manual collision
avoidance/mangling.  You'd need a new unique name for anything defined as a
function that conflicts with the old C API function names.  You could hide
these from code if you are just reusing the name by using #define in the
new API header of the old name to the new unique name... but that makes
code a real challenge to read and understand at a glance.  Without
examining the header file imports the reader wouldn't know for example if
the code is calling the API that borrows a reference (old api, evil) or one
that gets its own reference (presumed new api).
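
In other words, something like this hypothetical header fragment, where
the uniquely named function is the real entry point and the old
spelling is only an alias; this is exactly why call sites get hard to
read:

/* hypothetical "newapi.h" fragment -- all names below are illustrative */
#include <Python.h>

/* New entry point with its own, unambiguous name (returns a new
   reference in this sketch). */
PyAPI_FUNC(PyObject *) PyApi_Tuple_GetItem(PyObject *tuple, Py_ssize_t i);

/* Optional alias so old-looking code keeps compiling.  The downside is
   the one described above: at the call site you can no longer tell
   which semantics you are getting. */
#define PyTuple_GetItem PyApi_Tuple_GetItem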

When things have only ever been macros (Py_INCREF, etc) the name can be
reused if there has never been a function of that name in an old C API.
But beware of reuse for anything where the semantics change to avoid
misunderstandings about behavior from people familiar with the old API or
googling API names to look up behavior.

I suspect optimizing for ease of transition from code written to the
existing C API to the new API by keeping names the same is the wrong thing
to optimize for.

Using entirely new names may actually be a good thing as it makes it
immediately clear which way a given piece of code is written. It'd also be
good for PyObject* (the old C API thing) to be a different type from
PythonHandle* (a new API thing whose name I just made up) such that they
could not be passed around and exchanged for one another without a compiler
complaint.  Code written using both APIs should not be allowed to transit
objects directly between different APIs.
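
A sketch of that separation, reusing the made-up PythonHandle name; the
only point is that the compiler now complains when the two pointer
types are mixed:

#include <Python.h>

/* Hypothetical opaque handle type for the new API. */
typedef struct _PythonHandle PythonHandle;

/* Hypothetical new-API function. */
PyAPI_FUNC(PythonHandle *) PyHandle_GetAttrString(PythonHandle *obj,
                                                  const char *name);

static void
example(PyObject *old_api_obj)
{
    /* PyHandle_GetAttrString(old_api_obj, "name");
       ^ does not compile cleanly: PyObject* is not a PythonHandle*, so
       objects cannot silently cross between the two APIs. */
    (void)old_api_obj;
}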

-gps


>
> If you keep the "Py_" prefix, what I would like to ensure is that some
> functions are no longer accessible. How you remove
> PySequence_Fast_GET_ITEM() for example?
>
> For me, it seems simpler to modify CPython headers than starting on
> something new. It seems simpler to choose the proper level of
> compatibility. I start from an API 100% compatible (the current C
> API), and decide what is changed and how.
>
> Victor

Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-12 Thread Gregory P. Smith
On Fri, Nov 9, 2018 at 5:50 PM Nathaniel Smith  wrote:

> On Fri, Nov 9, 2018 at 4:30 PM, Victor Stinner 
> wrote:
> > Ah, important points. I don't want to touch the current C API nor make
> > it less efficient. And compatibility in both directions (current C API
> > <=> new C API) is very important for me. There is no such plan as
> > "Python 4" which would break the world and *force* everybody to
> > upgrade to the new C API, or stay to Python 3 forever. No. The new C
> > API must be an opt-in option, and current C API remains the default
> > and not be changed.
>
> Doesn't this mean that you're just making the C API larger and more
> complicated, rather than simplifying it? You cite some benefits
> (tagged pointers, changing the layout of PyObject, making PyPy's life
> easier), but I don't see how you can do any of those things so long as
> the current C API remains supported.
>
> -n
>

I believe the implied missing thing from Victor's description is this:

Experimentation with new internal implementations can begin once we have a
new C API by explicitly breaking the old C API with-in such experiments (as
is required for most anything interesting).  All code that is written to
the new C API still works during this process, thus making the job of
practical testing of such new VM internals easier.

From there, you can make decisions on how heavily to push the world towards
adoption of the new C API and by when so that a runtime not supporting the
old API can be realized with a list of enticing carrot tasting benefits.
(devising any necessary pypy cpyext-like compatibility solutions for super
long lived code or things that sadly want to use the 3.7 ABI for 10 years
in the process - and similarly an extension to provide some or all of this
API/ABI on top of older existing stable Python releases!)

I'd *love* to get to a situation where the only valid ABI we support knows
nothing about internal structs at all. Today, PyObject memory layout is
exposed to the world and unchangable. :(

This is a long process release wise (assume multiple stable releases go by
before we could declare that). But *we've got to start* by defining what we
want to provide as a seriously limited but functional API and ABI even if
it doesn't perform as well as things compiled against our existing
exposed-internals C API.  For *most* extension modules, performance of this
sort is not important. For the numpys of the world life is more
complicated, we should work with them to figure out their C API needs.

If it wasn't already obvious, you've got my support on this. :)
-gps

PS If the conversation devolves to arguing about "new" being a bad name,
that's a good sign.  I suggest calling it the Vorpal API after the bunny.
Or be boring and just use a year number for the name.


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-11 Thread Victor Stinner
Le sam. 10 nov. 2018 à 04:02, Nathaniel Smith  a écrit :
> So is it fair to say that your plan is that CPython will always use
> the current ("old") API internally, and the "new" API will be
> essentially an abstraction layer, that's designed to let people write
> C extensions that target the old API, while also being flexible enough
> to target PyPy and other "new different Python runtimes"?

What we call the "Python C API" is not really an API. I mean, nobody
tried to sit down and think about a proper API to access Python from
C. What happened is that we had an implementation of Python written in C
and it was cool to just "expose everything". By everything, I mean
"everything". It's hard to find a secret CPython feature not exposed
or leaked in the C API.

The problem that I'm trying to solve is to "fix the C API" to hide
implementation details, with one constraint: don't create a new
"Python 4". I don't want to break the backward compatibility without
having a slow and smooth transition plan. The "old" and new C API must
be supported in parallel, at least in the standard CPython,
/usr/bin/python3.

Writing a new API from scratch is nice, but it's harder to move all
existing C extensions from the "old" C API to a new one.

Replacing macros with functions has little impact on backward
compatibility. Most C extensions should still work if macros become
functions.

I'm not sure yet how far we should go towards a perfect API which
doesn't leak everything. We have to move slowly, and make sure that we
don't break major C extensions. We need to write tools to fully
automate the conversion. If it's not possible, maybe the whole project
will fail.

I'm looking for practical solutions based on the existing C API and
the existing CPython code base.

> If so, then would it make more sense to develop this as an actual
> separate abstraction layer? That would have the huge advantage that it
> could be distributed and versioned separately from CPython, different
> packages could use different versions of the abstraction layer, PyPy
> isn't forced to immediately add a bunch of new APIs...

I didn't investigate this option. But I expect that you will have to
write a full new API using a different prefix than "Py_". Otherwise,
I'm not sure how you want to handle PyTuple_GET_ITEM() as a macro on
one side (Include/tupleobject.h) and PyTuple_GET_ITEM() on the other
side (hypotetical_new_api.h).

Would it mean to duplicate all functions to get a different prefix?

If you keep the "Py_" prefix, what I would like to ensure is that some
functions are no longer accessible. How you remove
PySequence_Fast_GET_ITEM() for example?

For me, it seems simpler to modify CPython headers than starting on
something new. It seems simpler to choose the proper level of
compatibility. I start from an API 100% compatible (the current C
API), and decide what is changed and how.

Victor


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-10 Thread Neil Schemenauer
On 2018-11-09, Dino Viehland wrote:
> Rather than adding yet another pre-processor directive for this I would
> suggest just adding a new header file that only has the new stable API.
> For example it could just be "py.h" or "pyapi.h".  It would have all of the
> definitions for the stable API.

I like this idea.  It will be easier to define a minimal and clean
API with this approach.  I believe it can mostly be a subset of the
current API.

I think we could combine Dino's idea with Nathaniel's suggestion of
developing it separately from CPython.  Victor's C-API project is
already attempting to provide backwards compatibility.  I.e. you can
have an extension module that uses the new API but compiles and runs
with older versions of Python (e.g. 3.6).  So, whatever is inside
this new API, it must be possible to build it on top of the existing
Python API.
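
For illustration, here is a minimal sketch of what "building it on top
of the existing Python API" could mean in practice. The Py_NEWAPI_
prefix and the version cut-off are made up for this example; only the
wrapped call is an existing public function:

    /* Hypothetical compatibility shim: when compiling against an older
       CPython, a "new API" name can be a thin static inline wrapper over
       a public call that already exists, so one extension source builds
       everywhere. */
    #include <Python.h>

    #if PY_VERSION_HEX < 0x03080000   /* arbitrary cut-off for the sketch */
    static inline PyObject *
    Py_NEWAPI_Tuple_GetItem(PyObject *tuple, Py_ssize_t index)
    {
        return PyTuple_GetItem(tuple, index);   /* existing public function */
    }
    #endif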

Regards,

  Neil


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-09 Thread Dino Viehland
>
> That's exactly why I dislike "New", it's like adding "Ex" or "2" to a
> function name :-)
>
> Well, before bikeshedding on the C define name, I would prefer to see
> if the overall idea of trying to push code for the new C API in the
> master branch is a good idea, or if it's too early and the experiment
> must continue in a fork.


Rather than adding yet another pre-processor directive for this I would
suggest just adding a new header file that only has the new stable API.
For example it could just be "py.h" or "pyapi.h".  It would have all of the
definitions for the stable API.

While that would involve some duplication from the existing headers, I
don't think it would be such a big deal - the idea is the API won't change,
methods won't be removed, and occasionally new methods will get
added in a very thoughtful manner.  Having it be separate will force
thought and conversation about it.

It would also make it very easy to look and see what exactly is in the
stable API as well.  There'd be a pretty flat list which can be consulted,
and hopefully it ends up not being super huge either.
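
As a hedged sketch of what such a header could look like (the file
name, guard, and the particular declarations below are illustrative,
not an actual CPython header):

    /* pyapi.h -- hypothetical minimal "stable API" header (sketch only).
       Only an opaque object type and plain function declarations: no
       object layouts, no macros that reach into struct fields. */
    #ifndef Py_PYAPI_H
    #define Py_PYAPI_H

    #include <stddef.h>              /* ptrdiff_t, stand-in for Py_ssize_t */

    typedef struct _object PyObject; /* opaque to extension modules */
    typedef ptrdiff_t Py_ssize_t;    /* the real header defines this itself */

    PyObject *PyTuple_GetItem(PyObject *tuple, Py_ssize_t index);
    Py_ssize_t PyTuple_Size(PyObject *tuple);
    PyObject *PyObject_GetAttrString(PyObject *obj, const char *name);

    #endif /* Py_PYAPI_H */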

BTW, thanks for continuing to push on this Victor, it seems like great
progress!

On Fri, Nov 9, 2018 at 4:57 PM Victor Stinner  wrote:

> Le sam. 10 nov. 2018 à 01:49, Michael Selik  a écrit :
> >> It uses an opt-in option (Py_NEWCAPI define -- I'm not sure about the
> >> name) to get the new API. The current C API is unchanged.
> >
> > While one can hope that this will be the only time the C API will be
> revised, it may be better to number it instead of calling it "NEW". 20
> years from now, it won't feel new anymore.
>
> That's exactly why I dislike "New", it's like adding "Ex" or "2" to a
> function name :-)
>
> Well, before bikeshedding on the C define name, I would prefer to see
> if the overall idea of trying to push code for the new C API in the
> master branch is a good idea, or if it's too early and the experiment
> must continue in a fork.
>
> Victor


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-09 Thread Nathaniel Smith
On Fri, Nov 9, 2018 at 6:03 PM, Victor Stinner  wrote:
> Le sam. 10 nov. 2018 à 02:50, Nathaniel Smith  a écrit :
>> Doesn't this mean that you're just making the C API larger and more
>> complicated, rather than simplifying it? You cite some benefits
>> (tagged pointers, changing the layout of PyObject, making PyPy's life
>> easier), but I don't see how you can do any of those things so long as
>> the current C API remains supported.
>
> Tagged pointers and changing the layout of PyObject can only be
> experimented with in a new, different Python runtime which only supports C
> extensions compiled with the new C API. Technically, it can be CPython
> compiled with a different flag, as there is already python3-dbg (debug
> mode, ./configure --with-pydebug) and python3 (release mode). Or it
> can be a CPython fork.
>
> I don't propose to experiment with tagged pointers or changing the layout of
> PyObject in CPython. It may require too many changes and it's unclear
> if it's worth it or not. I only propose to implement the least
> controversial part of the new C API in the master branch, since
> maintaining this new C API in a fork is painful.
>
> I cannot promise that it will make PyPy's life easier. PyPy developers
> already told me that they have already implemented support for the
> current C API. The promise is that if you use the new C API, PyPy
> should be more efficient, because it would have fewer things to
> emulate. To be honest, I'm not sure at this point, I don't know PyPy
> internals. I also know that PyPy developers always complain when we
> *add new functions* to the C API, and there is a non-zero risk that I
> would like to add new functions, since current ones have issues :-) I
> am working with PyPy to involve them in the new C API.

So is it fair to say that your plan is that CPython will always use
the current ("old") API internally, and the "new" API will be
essentially an abstraction layer, that's designed to let people write
C extensions that target the old API, while also being flexible enough
to target PyPy and other "new different Python runtimes"?

If so, then would it make more sense to develop this as an actual
separate abstraction layer? That would have the huge advantage that it
could be distributed and versioned separately from CPython, different
packages could use different versions of the abstraction layer, PyPy
isn't forced to immediately add a bunch of new APIs...

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-09 Thread Victor Stinner
Sometimes, code is easier to understand than a long explanation, so
here is a very simple example of a modified function for the new C API:

https://bugs.python.org/issue35206
https://github.com/python/cpython/pull/10443/files

PyTuple_GET_ITEM() becomes a function call and the function
implementation checks arguments at runtime if compiled in debug mode.

Technically, the header file still uses a macro, to implicitly cast to
PyObject*, since currently the macro accepts any type, and the new C
API should not change that.
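
A minimal sketch of the shape of that change (the declaration name
matches the existing public API, but the exact macro body below is
illustrative rather than the actual patch):

    /* The function does the real work and, in a --with-pydebug build, can
       validate its arguments instead of blindly dereferencing them. */
    PyObject *PyTuple_GetItem(PyObject *op, Py_ssize_t index);

    /* The macro only survives to keep accepting any pointer type, as the
       old macro did, by inserting the implicit cast before the call. */
    #define PyTuple_GET_ITEM(op, index) PyTuple_GetItem((PyObject *)(op), (index))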

Victor
Le sam. 10 nov. 2018 à 01:53, Victor Stinner  a écrit :
>
> To hide all implementation details, I propose to stop using macros and
> use function calls instead. For example, replace:
>
> #define PyTuple_GET_ITEM(op, i) \
>(((PyTupleObject *)(op))->ob_item[i])
>
> with:
>
> # define PyTuple_GET_ITEM(op, i) PyTuple_GetItem(op, i)
>
> With this change, C extensions using PyTuple_GET_ITEM() no longer
> dereference PyObject* nor access PyTupleObject.ob_item. For example,
> PyPy doesn't have to convert all tuple items to PyObject, but only
> create one PyObject for the requested item. Another example is that it
> becomes possible to use a "CPython debug runtime" which checks at
> runtime that the first argument is a tuple and that the index is
> valid. For a longer explanation, see the idea of different "Python
> runtimes":
>
>https://pythoncapi.readthedocs.io/runtimes.html
>
> Replacing macros with function calls is only a first step. It doesn't
> solve the problem of borrowed references for example.
>
> Obviously, such a change has a performance cost. Sadly, I didn't run
> a benchmark yet. At this point, I mostly care about correctness and
> the feasibility of the whole project. I also hope that the new C API
> will allow to implement new optimizations which cannot even be
> imagined today, because of the backward compatibility. The question is
> if the performance balance is positive or not at the all :-)
> Hopefully, there is no urgency to take any decision at this point. The
> whole project is experimental and can be cancelled anytime.
>
> Victor


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-09 Thread Victor Stinner
Le sam. 10 nov. 2018 à 02:50, Nathaniel Smith  a écrit :
> Doesn't this mean that you're just making the C API larger and more
> complicated, rather than simplifying it? You cite some benefits
> (tagged pointers, changing the layout of PyObject, making PyPy's life
> easier), but I don't see how you can do any of those things so long as
> the current C API remains supported.

Tagged pointers and changing the layout of PyObject can only be
experimented with in a new, different Python runtime which only supports C
extensions compiled with the new C API. Technically, it can be CPython
compiled with a different flag, as there is already python3-dbg (debug
mode, ./configure --with-pydebug) and python3 (release mode). Or it
can be a CPython fork.

I don't propose to experiment with tagged pointers or changing the layout of
PyObject in CPython. It may require too many changes and it's unclear
if it's worth it or not. I only propose to implement the least
controversial part of the new C API in the master branch, since
maintaining this new C API in a fork is painful.

I cannot promise that it will make PyPy's life easier. PyPy developers
already told me that they have already implemented support for the
current C API. The promise is that if you use the new C API, PyPy
should be more efficient, because it would have fewer things to
emulate. To be honest, I'm not sure at this point, I don't know PyPy
internals. I also know that PyPy developers always complain when we
*add new functions* to the C API, and there is a non-zero risk that I
would like to add new functions, since current ones have issues :-) I
am working with PyPy to involve them in the new C API.

Victor


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-09 Thread Nathaniel Smith
On Fri, Nov 9, 2018 at 4:30 PM, Victor Stinner  wrote:
> Ah, important points. I don't want to touch the current C API nor make
> it less efficient. And compatibility in both directions (current C API
> <=> new C API) is very important for me. There is no such plan as
> "Python 4" which would break the world and *force* everybody to
> upgrade to the new C API, or stay on Python 3 forever. No. The new C
> API must be an opt-in option, and the current C API remains the default
> and will not be changed.

Doesn't this mean that you're just making the C API larger and more
complicated, rather than simplifying it? You cite some benefits
(tagged pointers, changing the layout of PyObject, making PyPy's life
easier), but I don't see how you can do any of those things so long as
the current C API remains supported.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-09 Thread Victor Stinner
Le sam. 10 nov. 2018 à 01:49, Michael Selik  a écrit :
>> It uses an opt-in option (Py_NEWCAPI define -- I'm not sure about the
>> name) to get the new API. The current C API is unchanged.
>
> While one can hope that this will be the only time the C API will be revised, 
> it may be better to number it instead of calling it "NEW". 20 years from now, 
> it won't feel new anymore.

That's exactly why I dislike "New", it's like adding "Ex" or "2" to a
function name :-)

Well, before bikeshedding on the C define name, I would prefer to see
if the overall idea of trying to push code for the new C API in the
master branch is a good idea, or if it's too early and the experiment
must continue in a fork.

Victor


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-09 Thread Victor Stinner
To hide all implementation details, I propose to stop using macros and
use function calls instead. For example, replace:

#define PyTuple_GET_ITEM(op, i) \
   (((PyTupleObject *)(op))->ob_item[i])

with:

# define PyTuple_GET_ITEM(op, i) PyTuple_GetItem(op, i)

With this change, C extensions using PyTuple_GET_ITEM() no longer
dereference PyObject* nor access PyTupleObject.ob_item. For example,
PyPy doesn't have to convert all tuple items to PyObject, but only
create one PyObject for the requested item. Another example is that it
becomes possible to use a "CPython debug runtime" which checks at
runtime that the first argument is a tuple and that the index is
valid. For a longer explanation, see the idea of different "Python
runtimes":

   https://pythoncapi.readthedocs.io/runtimes.html
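
A hedged sketch of what such a debug-runtime check could look like (the
function name is made up for this example; the checks use existing
public calls):

    #include <Python.h>

    /* Sketch only: once PyTuple_GET_ITEM() is a function call, a debug
       runtime can afford real argument checks before touching the object. */
    static PyObject *
    tuple_getitem_debug_sketch(PyObject *op, Py_ssize_t index)
    {
        if (!PyTuple_Check(op)) {
            PyErr_SetString(PyExc_TypeError, "expected a tuple");
            return NULL;
        }
        if (index < 0 || index >= PyTuple_GET_SIZE(op)) {
            PyErr_SetString(PyExc_IndexError, "tuple index out of range");
            return NULL;
        }
        return PyTuple_GetItem(op, index);
    }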

Replacing macros with function calls is only a first step. It doesn't
solve the problem of borrowed references for example.

Obviously, such a change has a performance cost. Sadly, I didn't run
a benchmark yet. At this point, I mostly care about correctness and
the feasibility of the whole project. I also hope that the new C API
will allow to implement new optimizations which cannot even be
imagined today, because of the backward compatibility. The question is
if the performance balance is positive or not at the all :-)
Hopefully, there is no urgency to take any decision at this point. The
whole project is experimental and can be cancelled anytime.

Victor


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-09 Thread Michael Selik
On Fri, Nov 9, 2018 at 4:33 PM Victor Stinner  wrote:

> It uses an opt-in option (Py_NEWCAPI define -- I'm not sure about the
> name) to get the new API. The current C API is unchanged.
>

While one can hope that this will be the only time the C API will be
revised, it may be better to number it instead of calling it "NEW". 20
years from now, it won't feel new anymore.


[Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-09 Thread Victor Stinner
Hi,

The current C API of Python is both a strength and a weakness of the
Python ecosystem as a whole. It's a strength because it makes it
possible to quickly reuse a huge number of existing libraries by
writing glue for them. It made numpy possible, and that project is a
big success! It's a weakness because of its maintenance cost: it
prevents optimizations, and more generally it prevents experimenting
with changes to Python internals.

For example, CPython cannot use tagged pointers, because the existing
C API is heavily based on the ability to dereference a PyObject*
object and access members of objects directly (like PyTupleObject).
For example, Py_INCREF() modifies PyObject.ob_refcnt *directly*. It's
also not possible to use a Python compiled in debug mode with C
extensions compiled in release mode, because the ABI is different in
debug mode. As a consequence, nobody uses the debug mode, even though
it is very helpful for developing C extensions and investigating bugs.
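
To make the coupling concrete, here is a simplified sketch (the real
macro in current CPython also does Py_REF_DEBUG bookkeeping, and the
tag encoding below is purely illustrative):

    /* Roughly what the macro does today: it dereferences the pointer and
       touches a struct field, so every extension binary bakes in the
       exact PyObject layout. */
    #define Py_INCREF(op) (((PyObject *)(op))->ob_refcnt++)

    /* A runtime using tagged pointers might encode a small int in the
       pointer value itself, e.g. with the low bit set: such a "pointer"
       is not a real PyObject* and dereferencing it in the macro above
       would crash, which is why the macro has to become an opaque
       function call first. */
    #define EXAMPLE_TAG_SMALLINT 0x1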

I also consider that the C API gives too much work to PyPy (for their
"cpyext" module). A better C API (not leaking implementation) details
would make PyPy more efficient (and simplify its implementation in the
long term, when the support for the old C API can be removed). For
example, PyList_GetItem(list, 0) currently converts all items of the
list to PyObject* in PyPy, it can waste memory if only the first item
of the list is needed. PyPy has much more efficient storage than an
array of PyObject* for lists.

I wrote a website to explain all these issues with much more details:

   https://pythoncapi.readthedocs.io/

I identified "bad APIs" like using borrowed references or giving
access to PyObject** (ex: PySequence_Fast_ITEMS).
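
For readers less familiar with the borrowed-reference hazard, a minimal
sketch using standard CPython calls (the function and the usage are
illustrative only; error handling omitted):

    #include <Python.h>

    /* Why borrowed references are fragile. */
    static void
    borrowed_reference_hazard(PyObject *list)
    {
        /* PyList_GetItem() returns a *borrowed* reference: we do not own it. */
        PyObject *item = PyList_GetItem(list, 0);

        /* If the list held the only reference, overwriting the slot can
           free the old item right here... */
        PyList_SetItem(list, 0, PyLong_FromLong(42));

        /* ...and `item` may now be dangling; using it is undefined
           behaviour unless we had called Py_INCREF(item) before mutating
           the list. PySequence_Fast_ITEMS() is worse: it hands out the raw
           PyObject** array, exposing the internal storage layout directly. */
        (void)item;
    }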

I already wrote an (incomplete) implementation of a new C API which
doesn't leak implementation details:

   https://github.com/pythoncapi/pythoncapi

It uses an opt-in option (Py_NEWCAPI define -- I'm not sure about the
name) to get the new API. The current C API is unchanged.

Ah, important points. I don't want to touch the current C API nor make
it less efficient. And compatibility in both directions (current C API
<=> new C API) is very important for me. There is no such plan as
"Python 4" which would break the world and *force* everybody to
upgrade to the new C API, or stay on Python 3 forever. No. The new C
API must be an opt-in option, and the current C API remains the
default and will not be changed.

I have different ideas for the compatibility part, but I'm not sure
what the best options are yet.

My short-term goal for the new C API would be to ease experimentation
with projects like tagged pointers. Currently, I have to maintain the
implementation of a new C API separately, which is not really convenient.

--

Today I tried to abuse the Py_DEBUG define for the new C API, but it
seems to be a bad idea:

   https://github.com/python/cpython/pull/10435

A *new* define is needed to opt in to the new C API.

Victor