Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
Le mer. 21 nov. 2018 à 12:11, Antoine Pitrou a écrit :
> You mean the same API can compile to two different things depending on
> a configuration?

Yes, my current plan is to keep #include <Python.h> but have an opt-in define to switch to the new C API.

> I expect it to be error-prone. For example, let's suppose I want to
> compile in a given mode, but I also use Numpy's C API. Will the
> compile mode "leak" to Numpy as well?

For example, if we continue to use Py_LIMITED_API: I don't think that Numpy currently uses #ifdef Py_LIMITED_API, nor plans to do that. If we add a new define (ex: my current proof-of-concept uses Py_NEWCAPI), we can make sure that it's not already used by Numpy :-)

> What if a third-party header
> includes "Python.h" before I do the "#define" that's necessary?

IMHO the define should be added by distutils directly, using -D in compiler flags. I wouldn't suggest:

    #define Py_NEWCAPI
    #include <Python.h>

But the two APIs should diverge, so your C extension should also use a define to decide to use the old or the new API. So something will break if you mess up in the compilation :-)

Victor

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Tue, 20 Nov 2018 23:17:05 +0100 Victor Stinner wrote:
> Le mar. 20 nov. 2018 à 23:08, Stefan Krah a écrit :
> > Intuitively, it should probably not be part of a limited API, but I never
> > quite understood the purpose of this API, because I regularly need any
> > function that I can get my hands on.
> > (...)
> > Reading typed strings directly into an array with minimal overhead.
>
> IMHO performance and hiding implementation details are exclusive. You
> should either use the C API with impl. details for best performances,
> or use a "limited" C API for best compatibility.
>
> Since I would like to not touch the C API with impl. details, you can
> imagine to have two compilation modes: one for best performances on
> CPython, one for best compatibility (ex: compatible with PyPy). I'm
> not sure how the "compilation mode" will be selected.

You mean the same API can compile to two different things depending on a configuration? I expect it to be error-prone. For example, let's suppose I want to compile in a given mode, but I also use Numpy's C API. Will the compile mode "leak" to Numpy as well? What if a third-party header includes "Python.h" before I do the "#define" that's necessary?

Regards

Antoine.
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On 11/20/2018 10:33 PM, Nathaniel Smith wrote:
> On Tue, Nov 20, 2018 at 6:05 PM Glenn Linderman wrote:
>> On 11/20/2018 2:17 PM, Victor Stinner wrote:
>>> IMHO performance and hiding implementation details are exclusive. You
>>> should either use the C API with impl. details for best performances,
>>> or use a "limited" C API for best compatibility.
>>
>> The "limited" C API concept would seem to be quite sufficient for
>> extensions that want to extend Python functionality to include new
>> system calls, etc. (pywin32, pyMIDI, pySide, etc.) whereas numpy and
>> decimal might want best performance.
>
> To make things more complicated: numpy and decimal are in a category of
> modules where if you want them to perform well on JIT-based VMs, then
> there's no possible C API that can achieve that. To get the benefits of
> a JIT on code using numpy or decimal, the JIT has to be able to see into
> their internals to do inlining etc., which means they can't be written
> in C at all [1], at which point the C API becomes irrelevant.
>
> It's not clear to me how this affects any of the discussion in CPython,
> since supporting JITs might not be part of the goal of a new C API, and
> I'm not sure how many packages fall between the numpy/decimal side and
> the pure-ffi side.
>
> -n
>
> [1] Well, there's also the option of teaching your Python JIT to handle
> LLVM bitcode as a source language, which is the approach that Graal is
> experimenting with. It seems completely wacky to me to hope you could
> write a C API emulation layer like PyPy's cpyext, and compile that + C
> extension code to LLVM bitcode, translate the LLVM bitcode to JVM
> bytecode, inline the whole mess into your Python JIT, and then fold
> everything away to produce something reasonable. But I could be wrong,
> and Oracle is throwing a lot of money at Graal so I guess we'll find out.

Interesting, thanks for the introduction to wacky.

I was quite content with the idea that numpy, and other modules that would choose to use the unlimited API, would be sacrificing portability to non-CPython implementations... except by providing a Python equivalent (decimal, and some others do that, IIRC).

Regarding JIT in general, though, it would seem that "precompiled" extensions like numpy would not need to be recompiled by the JIT. But if they are, then the JIT had better understand/support C syntax, and JVM JITs probably don't! So that leads to the scenario you describe.
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Tue, Nov 20, 2018 at 6:05 PM Glenn Linderman wrote:
>
> On 11/20/2018 2:17 PM, Victor Stinner wrote:
>> IMHO performance and hiding implementation details are exclusive. You
>> should either use the C API with impl. details for best performances,
>> or use a "limited" C API for best compatibility.
>
> The "limited" C API concept would seem to be quite sufficient for
> extensions that want to extend Python functionality to include new
> system calls, etc. (pywin32, pyMIDI, pySide, etc.) whereas numpy and
> decimal might want best performance.

To make things more complicated: numpy and decimal are in a category of modules where if you want them to perform well on JIT-based VMs, then there's no possible C API that can achieve that. To get the benefits of a JIT on code using numpy or decimal, the JIT has to be able to see into their internals to do inlining etc., which means they can't be written in C at all [1], at which point the C API becomes irrelevant.

It's not clear to me how this affects any of the discussion in CPython, since supporting JITs might not be part of the goal of a new C API, and I'm not sure how many packages fall between the numpy/decimal side and the pure-ffi side.

-n

[1] Well, there's also the option of teaching your Python JIT to handle LLVM bitcode as a source language, which is the approach that Graal is experimenting with. It seems completely wacky to me to hope you could write a C API emulation layer like PyPy's cpyext, and compile that + C extension code to LLVM bitcode, translate the LLVM bitcode to JVM bytecode, inline the whole mess into your Python JIT, and then fold everything away to produce something reasonable. But I could be wrong, and Oracle is throwing a lot of money at Graal so I guess we'll find out.

--
Nathaniel J. Smith -- https://vorpus.org
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On 11/20/2018 2:17 PM, Victor Stinner wrote:
> Le mar. 20 nov. 2018 à 23:08, Stefan Krah a écrit :
>> Intuitively, it should probably not be part of a limited API, but I never
>> quite understood the purpose of this API, because I regularly need any
>> function that I can get my hands on.
>> (...)
>> Reading typed strings directly into an array with minimal overhead.
>
> IMHO performance and hiding implementation details are exclusive. You
> should either use the C API with impl. details for best performances,
> or use a "limited" C API for best compatibility.

The "limited" C API concept would seem to be quite sufficient for extensions that want to extend Python functionality to include new system calls, etc. (pywin32, pyMIDI, pySide, etc.) whereas numpy and decimal might want best performance.

> Since I would like to not touch the C API with impl. details, you can
> imagine to have two compilation modes: one for best performances on
> CPython, one for best compatibility (ex: compatible with PyPy). I'm
> not sure how the "compilation mode" will be selected.

The nicest interface from a compilation point of view would be to have two #include files: one to import the limited API, and one to import the performance API. Importing both should be allowed and should work. If you import the performance API, you have to learn more, and be more careful. Of course, there might be appropriate subsets of each API, having multiple include files, to avoid including everything, but that is a refinement.

> Victor
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
Le mar. 20 nov. 2018 à 23:08, Stefan Krah a écrit :
> Intuitively, it should probably not be part of a limited API, but I never
> quite understood the purpose of this API, because I regularly need any
> function that I can get my hands on.
> (...)
> Reading typed strings directly into an array with minimal overhead.

IMHO performance and hiding implementation details are exclusive. You should either use the C API with impl. details for best performances, or use a "limited" C API for best compatibility.

Since I would like to not touch the C API with impl. details, you can imagine to have two compilation modes: one for best performances on CPython, one for best compatibility (ex: compatible with PyPy). I'm not sure how the "compilation mode" will be selected.

Victor
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Mon, Nov 19, 2018 at 04:08:07PM +0100, Victor Stinner wrote:
> Le lun. 19 nov. 2018 à 13:18, Stefan Krah a écrit :
>> In practice people desperately *have* to use whatever is there, including
>> functions with underscores that are not even officially in the C-API.
>>
>> I have to use _PyFloat_Pack* in order to be compatible with CPython,
>
> Oh, I never used this function. These functions are private (name
> prefixed by "_") and excluded from the limited API.
>
> For me, the limited API should be functions available on all Python
> implementations. Does it make sense to provide PyFloat_Pack4() in
> MicroPython, Jython, IronPython and PyPy? Or is it something more
> specific to CPython? I don't know the answer. If yes, open an issue to
> propose to make this function public?

It depends on what the goal is: if PyPy wants to be able to use as many C extensions as possible, then yes. The function is just one example of what people have to use to be 100% compatible with CPython (or copy these functions and maintain them ...).

Intuitively, it should probably not be part of a limited API, but I never quite understood the purpose of this API, because I regularly need any function that I can get my hands on.

>> I need PyUnicode_KIND()
>
> IMHO this one should not be part of the public API. The only usage
> would be to micro-optimize, but such API is very specific to one
> Python implementation. For example, PyPy doesn't use "compact string"
> but UTF-8 internally. If you use PyUnicode_KIND(), your code becomes
> incompatible with PyPy.
>
> What is your use case?

Reading typed strings directly into an array with minimal overhead.

> I would prefer to expose the "_PyUnicodeWriter" API than PyUnicode_KIND().
>
>> need PyUnicode_AsUTF8AndSize(),
>
> Again, that's a micro-optimization and it's very specific to CPython:
> result cached in the "immutable" str object. I don't want to put it in
> a public API. PyUnicode_AsUTF8String() is better since it doesn't
> require an internal cache.
>
>> I *wish* there were PyUnicode_AsAsciiAndSize().
>
> PyUnicode_AsASCIIString() looks good to me. Sadly, it doesn't return
> the length, but usually the length is not needed.

Yes, these are all just examples. It's also very useful to be able to do PyLong_Type.tp_as_number->nb_multiply or grab as_integer_ratio from the float PyMethodDef. The latter two cases are for speed reasons but also because sometimes you *don't* want a method from a subclass (Serhiy was very good in finding corner cases :-).

Most C modules that I've seen have some internals. Psycopg2:

    PyDateTime_DELTA_GET_MICROSECONDS
    PyDateTime_DELTA_GET_DAYS
    PyDateTime_DELTA_GET_SECONDS
    PyList_GET_ITEM
    Bytes_GET_SIZE
    Py_BEGIN_ALLOW_THREADS
    Py_END_ALLOW_THREADS

floatobject.h and longintrepr.h are also popular.

Stefan Krah
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Tue, Nov 20, 2018 at 1:34 AM Petr Viktorin wrote:
>
> On 11/19/18 12:14 PM, Victor Stinner wrote:
>> To design a new C API, I see 3 options:
>>
>> (1) add more functions to the existing Py_LIMITED_API
>> (2) "fork" the current public C API: remove functions and hide as much
>> implementation details as possible
>> (3) write a new C API from scratch, based on the current C API.
>> Something like "#define newcapi_Object_GetItem PyObject_GetItem"?
>> Sorry, but "#undef" doesn't work. Only very few
>> functions are defined using "#define ...".
>>
>> I dislike (1) because it's too far from what is currently used in
>> practice. Moreover, I failed to find anyone who can explain to me how
>> the C API is used in the wild, which functions are important or not,
>> what is the C API, etc.
>
> One big, complex project that now uses the limited API is PySide. They
> do some workarounds, but the limited API works. Here's a writeup of the
> troubles they have with it:
> https://github.com/pyside/pyside2-setup/blob/5.11/sources/shiboken2/libshiboken/pep384impl_doc.rst

AFAIK the only two projects that use the limited API are PySide-generated modules and cffi-generated modules. I guess if there is some cleanup needed to remove stuff that snuck into the limited API, then that will be fine as long as you make sure they aren't used by either of those two projects.

For the regular C API, I guess the PyPy folks, and especially Matti Picus, probably know more than anyone else about what parts are actually used in the wild, since they've spent way more time digging into real projects. (Do you want to know about the exact conditions in which real projects rely on being able to skip calling PyType_Ready on a statically allocated PyTypeObject? Matti knows...)

> I hope the new C API will be improvements (and clarifications) of the
> stable ABI, rather than a completely new thing.
>
> My ideal would be that Python 4.0 would keep the same API (with
> questionable things emulated & deprecated), but break *ABI*. The "new C
> API" would become that new stable ABI -- and this time it'd be something
> we'd really want to support, without reservations.

We already break ABI with every feature release – at least for the main ABI. The limited ABI supposedly doesn't, but probably does, and as noted above it has such limited use that it's probably still possible to fix any stuff that's leaked out accidentally.

-n

--
Nathaniel J. Smith -- https://vorpus.org
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On 11/19/18 12:14 PM, Victor Stinner wrote:
> To design a new C API, I see 3 options:
>
> (1) add more functions to the existing Py_LIMITED_API
> (2) "fork" the current public C API: remove functions and hide as much
> implementation details as possible
> (3) write a new C API from scratch, based on the current C API.
> Something like "#define newcapi_Object_GetItem PyObject_GetItem"?
> Sorry, but "#undef" doesn't work. Only very few functions are defined
> using "#define ...".
>
> I dislike (1) because it's too far from what is currently used in
> practice. Moreover, I failed to find anyone who can explain to me how
> the C API is used in the wild, which functions are important or not,
> what is the C API, etc.

One big, complex project that now uses the limited API is PySide. They do some workarounds, but the limited API works. Here's a writeup of the troubles they have with it:
https://github.com/pyside/pyside2-setup/blob/5.11/sources/shiboken2/libshiboken/pep384impl_doc.rst

> I propose (2). We control how much changes we do at each milestone, and
> we start from the maximum compatibility with current C API. Each change
> can be discussed and experimented to define what is the C API, what we
> want, etc. I'm working on this approach for 1 year, that's why many
> discussions popped up around specific changes :-)

I hope the new C API will be improvements (and clarifications) of the stable ABI, rather than a completely new thing.

My ideal would be that Python 4.0 would keep the same API (with questionable things emulated & deprecated), but break *ABI*. The "new C API" would become that new stable ABI -- and this time it'd be something we'd really want to support, without reservations.

One thing that did not work with the stable ABI was that it's "opt-out"; I think we can agree that a new one must be "opt-in" from the start.

I'd also like the "new API" to be a *strict subset* of the stable ABI: if a new function needs to be added, it should be added to both.

> Some people recently proposed (3) on python-dev. I dislike this option
> because it starts by breaking the backward compatibility. It looks like
> (1), but worse. The goal and the implementation are unclear to me.
>
> --
>
> Replacing PyDict_GetItem() (specialized call) with PyObject_GetItem()
> (generic API) is not part of my short term plan. I wrote it in the
> roadmap, but as I wrote before, each change should be discussed,
> experimented, benchmarked, etc.
>
> Victor
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Mon., Nov. 19, 2018, 14:04 Neil Schemenauer wrote:

> On 2018-11-19, Antoine Pitrou wrote:
>> There are important use cases for the C API where it is desired to have
>> fast type-specific access to Python objects such as tuples, ints,
>> strings, etc. This is relied upon by modules such as _json and _pickle,
>> and third-party extensions as well.
>
> Thank you for pointing this out. The feedback from Stefan on what
> Cython would like (e.g. more access to functions that are currently
> "internal") is useful too. Keeping our dreams tied to reality
> is important. ;-P
>
> It seems to me that we can't "have our cake and eat it too". I.e. on
> the one hand hide CPython implementation internals but on the other
> hand allow extensions that want to take advantage of those internals
> to provide the best performance.

No, but those are different APIs as well. E.g. no one is saying CPython has to do away with any of its API. What I and some others have said is the CPython API is too broad to be called "universal".

> Maybe we could have multiple levels of API:
>
> A) maximum portability (Py_LIMITED_API)
>
> B) source portability (non-stable ABI, inlined functions)
>
> C) portability but poor performance on non-CPython VMs
>    (PySequence_Fast_ITEMS, borrowed refs, etc)

I don't know how doable that is, as e.g. borrowed refs are not pleasant to simulate.

> D) non-portability, CPython specific (access to more internals like
>    Stefan was asking for). The extension would have to be
>    re-implemented on each VM or provide a pure Python
>    alternative.
>
> I think it would be nice if the extension module could explicitly
> choose which level of API it wants to use.

Yes, and I thought we were working towards nesting our header files so you very clearly opted into your level of compatibility. In my head there's:

- bare minimum, cross-VM, gets you FFI
- CPython API for more performance that we're willing to maintain
- Everything open for e.g. CPython with no compatibility guarantees

Do note my first point isn't necessarily worrying about crazy performance to start. I would assume an alternative VM would help make up for this with a faster runtime where dropping into C is more about FFI than performance (we all know PyPy, for instance, wishes people just wrote more Python code). Otherwise we're back to the idea of standardizing on some Cython solution to help make this easier without tying oneself to the C API (like Julia's FFI solution).

> It would be interesting to do a census on what extensions are out
> there. If they mostly fall into wanting level "C" then I think this
> API overhaul is not going to work out too well. Level C is mostly
> what we have now. No point in putting the effort into A and B if no
> one will use them.

It won't until someone can show benefits for switching. This is very much a chicken-and-egg problem.
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On 2018-11-19, Antoine Pitrou wrote:
> There are important use cases for the C API where it is desired to have
> fast type-specific access to Python objects such as tuples, ints,
> strings, etc. This is relied upon by modules such as _json and _pickle,
> and third-party extensions as well.

Thank you for pointing this out. The feedback from Stefan on what Cython would like (e.g. more access to functions that are currently "internal") is useful too. Keeping our dreams tied to reality is important. ;-P

It seems to me that we can't "have our cake and eat it too". I.e. on the one hand hide CPython implementation internals but on the other hand allow extensions that want to take advantage of those internals to provide the best performance.

Maybe we could have multiple levels of API:

A) maximum portability (Py_LIMITED_API)

B) source portability (non-stable ABI, inlined functions)

C) portability but poor performance on non-CPython VMs
   (PySequence_Fast_ITEMS, borrowed refs, etc)

D) non-portability, CPython specific (access to more internals like
   Stefan was asking for). The extension would have to be
   re-implemented on each VM or provide a pure Python
   alternative.

I think it would be nice if the extension module could explicitly choose which level of API it wants to use.

It would be interesting to do a census on what extensions are out there. If they mostly fall into wanting level "C" then I think this API overhaul is not going to work out too well. Level C is mostly what we have now. No point in putting the effort into A and B if no one will use them.
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On 2018-11-19, Victor Stinner wrote:
> Moreover, I failed to find anyone who can explain to me how the C API
> is used in the wild, which functions are important or not, what is
> the C API, etc.

One idea is to download a large sample of extension modules from PyPI and then analyze them with some automated tool (maybe libclang). I guess it is possible there is a large non-public set of extensions that we would miss.

Regards,

Neil
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
Hi Stefan,

Le lun. 19 nov. 2018 à 13:18, Stefan Krah a écrit :
> In practice people desperately *have* to use whatever is there, including
> functions with underscores that are not even officially in the C-API.
>
> I have to use _PyFloat_Pack* in order to be compatible with CPython,

Oh, I never used this function. These functions are private (name prefixed by "_") and excluded from the limited API.

For me, the limited API should be functions available on all Python implementations. Does it make sense to provide PyFloat_Pack4() in MicroPython, Jython, IronPython and PyPy? Or is it something more specific to CPython? I don't know the answer. If yes, open an issue to propose to make this function public?

> I need PyUnicode_KIND()

IMHO this one should not be part of the public API. The only usage would be to micro-optimize, but such API is very specific to one Python implementation. For example, PyPy doesn't use "compact string" but UTF-8 internally. If you use PyUnicode_KIND(), your code becomes incompatible with PyPy.

What is your use case?

I would prefer to expose the "_PyUnicodeWriter" API than PyUnicode_KIND().

> need PyUnicode_AsUTF8AndSize(),

Again, that's a micro-optimization and it's very specific to CPython: result cached in the "immutable" str object. I don't want to put it in a public API. PyUnicode_AsUTF8String() is better since it doesn't require an internal cache.

> I *wish* there were PyUnicode_AsAsciiAndSize().

PyUnicode_AsASCIIString() looks good to me. Sadly, it doesn't return the length, but usually the length is not needed.

Victor
[Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
Victor Stinner wrote:
> Moreover, I failed to find anyone who can explain to me how the C API is
> used in the wild, which functions are important or not, what is the C
> API, etc.

In practice people desperately *have* to use whatever is there, including functions with underscores that are not even officially in the C-API.

I have to use _PyFloat_Pack* in order to be compatible with CPython, I need PySlice_Unpack() etc., I need PyUnicode_KIND(), need PyUnicode_AsUTF8AndSize(), I *wish* there were PyUnicode_AsAsciiAndSize().

In general, in daily use of the C-API I wish it were *larger* and not smaller. I often want functions that return C instead of Python values or functions that take C instead of Python values.

The ideal situation for me would be a lower layer library, say libcpython.a, that has all those functions like _PyFloat_Pack*. It would be an enormous amount of work though, especially since the status quo kind of works.

Stefan Krah
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
To design a new C API, I see 3 options:

(1) add more functions to the existing Py_LIMITED_API
(2) "fork" the current public C API: remove functions and hide as much implementation details as possible
(3) write a new C API from scratch, based on the current C API. Something like "#define newcapi_Object_GetItem PyObject_GetItem"? Sorry, but "#undef" doesn't work. Only very few functions are defined using "#define ...".

I dislike (1) because it's too far from what is currently used in practice. Moreover, I failed to find anyone who can explain to me how the C API is used in the wild, which functions are important or not, what is the C API, etc.

I propose (2). We control how much changes we do at each milestone, and we start from the maximum compatibility with current C API. Each change can be discussed and experimented to define what is the C API, what we want, etc. I'm working on this approach for 1 year, that's why many discussions popped up around specific changes :-)

Some people recently proposed (3) on python-dev. I dislike this option because it starts by breaking the backward compatibility. It looks like (1), but worse. The goal and the implementation are unclear to me.

--

Replacing PyDict_GetItem() (specialized call) with PyObject_GetItem() (generic API) is not part of my short term plan. I wrote it in the roadmap, but as I wrote before, each change should be discussed, experimented, benchmarked, etc.

Victor

Le lun. 19 nov. 2018 à 12:02, M.-A. Lemburg a écrit :
>
> On 19.11.2018 11:53, Antoine Pitrou wrote:
> > On Mon, 19 Nov 2018 11:28:46 +0100
> > Victor Stinner wrote:
> >> Python internals rely on internals to implement further optimizations,
> >> than modifying an "immutable" tuple, bytes or str object, because you
> >> can do that at the C level. But I'm not sure that I would like 3rd
> >> party extensions to rely on such things.
> >
> > I'm not even talking about *modifying* tuples or str objects, I'm
> > talking about *accessing* their value without going through an abstract
> > API that does slot lookups, indirect function calls and object unboxing.
> >
> > For example, people may need a fast way to access the UTF-8
> > representation of a unicode object. Without making indirect function
> > calls, and ideally without making a copy of the data either. How do
> > you do that using the generic C API?
>
> Something else you need to consider is creating instances of
> types, e.g. a tuple. In C you will have to be able to put
> values into the data structure before it is passed outside
> the function in order to build the tuple.
>
> If you remove this possibility, you have to copy data all the
> time, losing the advantages of having a rich C API.
> --
> Marc-Andre Lemburg
> eGenix.com
>
> Professional Python Services directly from the Experts (#1, Nov 19 2018)
> >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/
> >>> Python Database Interfaces ... http://products.egenix.com/
> >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/
>
> ::: We implement business ideas - efficiently in both time and costs :::
>
> eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
> D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
> Registered at Amtsgericht Duesseldorf: HRB 46611
> http://www.egenix.com/company/contact/
> http://www.malemburg.com/
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Mon, 19 Nov 2018 11:53:42 +0100 Antoine Pitrou wrote: > On Mon, 19 Nov 2018 11:28:46 +0100 > Victor Stinner wrote: > > I would expect that the most common source of speed up of a C > > extension is the removal of the cost of bytecode evaluation (ceval.c > > loop). > > Well, I don't. All previous experiments showed that simply compiling > Python code to C code using the "generic" C API yielded a 30% > improvement. > > Conversely, the C _pickle module can be 100x faster than the pure > Python pickle module. It's doing it *not* by using the generic C > API, but by special-casing access to concrete types. You don't get > that level of performance simply by removing the cost of bytecode > evaluation: > > # C version > $ python3 -m timeit -s "import pickle; x = list(range(1000))" > "pickle.dumps(x)" 10 loops, best of 3: 19 usec per loop > > # Python version > $ python3 -m timeit -s "import pickle; x = list(range(1000))" > "pickle._dumps(x)" 100 loops, best of 3: 2.25 msec per loop And to show that this is important for third-party C extensions as well, PyArrow (*) has comparable performance using similar techniques: $ python -m timeit -s "import pyarrow as pa; x = list(range(1000))" "pa.array(x, type=pa.int64())" 1 loops, best of 5: 27.2 usec per loop (*) https://arrow.apache.org/docs/python/ Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On 19.11.2018 11:53, Antoine Pitrou wrote: > On Mon, 19 Nov 2018 11:28:46 +0100 > Victor Stinner wrote: >> Python internals rely on internals to implement further optimizations, >> than modifying an "immutable" tuple, bytes or str object, because you >> can do that at the C level. But I'm not sure that I would like 3rd >> party extensions to rely on such things. > > I'm not even talking about *modifying* tuples or str objects, I'm > talking about *accessing* their value without going through an abstract > API that does slot lookups, indirect function calls and object unboxing. > > For example, people may need a fast way to access the UTF-8 > representation of a unicode object. Without making indirect function > calls, and ideally without making a copy of the data either. How do > you do that using the generic C API? Something else you need to consider is creating instances of types, e.g. a tuple. In C you will have to be able to put values into the data structure before it is passed outside the function in order to build the tuple. If you remove this possibility, you will have to copy data all the time, losing the advantages of having a rich C API. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Nov 19 2018) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Mon, 19 Nov 2018 11:28:46 +0100 Victor Stinner wrote: > I would expect that the most common source of speed up of a C > extension is the removal of the cost of bytecode evaluation (ceval.c > loop). Well, I don't. All previous experiments showed that simply compiling Python code to C code using the "generic" C API yielded a 30% improvement. Conversely, the C _pickle module can be 100x faster than the pure Python pickle module. It's doing it *not* by using the generic C API, but by special-casing access to concrete types. You don't get that level of performance simply by removing the cost of bytecode evaluation: # C version $ python3 -m timeit -s "import pickle; x = list(range(1000))" "pickle.dumps(x)" 10 loops, best of 3: 19 usec per loop # Python version $ python3 -m timeit -s "import pickle; x = list(range(1000))" "pickle._dumps(x)" 100 loops, best of 3: 2.25 msec per loop So, the numbers are on my side. So is the abundant experience of experts such as the Cython developers. > Python internals rely on internals to implement further optimizations, > than modifying an "immutable" tuple, bytes or str object, because you > can do that at the C level. But I'm not sure that I would like 3rd > party extensions to rely on such things. I'm not even talking about *modifying* tuples or str objects, I'm talking about *accessing* their value without going through an abstract API that does slot lookups, indirect function calls and object unboxing. For example, people may need a fast way to access the UTF-8 representation of a unicode object. Without making indirect function calls, and ideally without making a copy of the data either. How do you do that using the generic C API? Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
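Antoine's numbers are straightforward to reproduce, since the pure-Python implementation is still reachable as pickle._dumps (a private name, so subject to change); a minimal sketch:

```python
import pickle
import timeit

data = list(range(1000))

# C implementation: special-cases concrete types internally.
c_time = timeit.timeit(lambda: pickle.dumps(data), number=200)

# Pure-Python implementation: goes through the generic protocols
# and the bytecode evaluation loop.
py_time = timeit.timeit(lambda: pickle._dumps(data), number=200)

# Both produce a valid pickle of the same value; only the speed differs.
assert pickle.loads(pickle._dumps(data)) == data
print(f"C: {c_time:.3f}s, pure Python: {py_time:.3f}s (200 calls each)")
```

The absolute numbers vary by machine and Python version, but the gap between the two implementations is consistently large.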
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
Le lun. 19 nov. 2018 à 10:48, Antoine Pitrou a écrit : > If the C API only provides Python-level semantics, then it will > roughly have the speed of pure Python (modulo bytecode execution). > > There are important use cases for the C API where it is desired to have > fast type-specific access to Python objects such as tuples, ints, > strings, etc. This is relied upon by modules such as _json and _pickle, > and third-party extensions as well. Are you sure that using PyDict_GetItem() is really way faster than PyObject_GetItem()? Did someone run a benchmark to have numbers? I would expect that the most common source of speed up of a C extension is the removal of the cost of bytecode evaluation (ceval.c loop). Python internals rely on internals to implement further optimizations, such as modifying an "immutable" tuple, bytes or str object, because you can do that at the C level. But I'm not sure that I would like 3rd party extensions to rely on such things. For example, unicodeobject.c uses the following functions to check if a str object can be modified in-place, or if a new str object must be created:

#ifdef Py_DEBUG
static int
unicode_is_singleton(PyObject *unicode)
{
    PyASCIIObject *ascii = (PyASCIIObject *)unicode;
    if (unicode == unicode_empty)
        return 1;
    if (ascii->state.kind != PyUnicode_WCHAR_KIND && ascii->length == 1) {
        Py_UCS4 ch = PyUnicode_READ_CHAR(unicode, 0);
        if (ch < 256 && unicode_latin1[ch] == unicode)
            return 1;
    }
    return 0;
}
#endif

static int
unicode_modifiable(PyObject *unicode)
{
    assert(_PyUnicode_CHECK(unicode));
    if (Py_REFCNT(unicode) != 1)
        return 0;
    if (_PyUnicode_HASH(unicode) != -1)
        return 0;
    if (PyUnicode_CHECK_INTERNED(unicode))
        return 0;
    if (!PyUnicode_CheckExact(unicode))
        return 0;
#ifdef Py_DEBUG
    /* singleton refcount is greater than 1 */
    assert(!unicode_is_singleton(unicode));
#endif
    return 1;
}

Victor
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Sun, 18 Nov 2018 16:53:19 +0100 Stefan Behnel wrote: > > So, in Cython, we use macros wherever possible, and often avoid generic > protocols in favour of type specialisations. We sometimes keep local copies > of C-API helper functions, because inlining them allows the C compiler to > strip down and streamline the implementation at compile time, rather than > jumping through generic code. (Also, it's sometimes required in order to > backport new CPython features to Py2.7+.) Also this approach allows those ballooning compile times that are part of Cython's charm and appeal ;-) (sorry, couldn't resist) Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Fri, 16 Nov 2018 09:46:36 -0800 Brett Cannon wrote: > > I think part of the challenge here (and I believe it has been brought up > elsewhere) is no one knows what kind of API is necessary for some faster VM > other than PyPy. To me, the only C API that would could potentially start > working toward and promoting **today** is one which is stripped to its bare > bones and worst mirrors Python syntax. For instance, I have seen > PyTuple_GET_ITEM() brought up a couple of times. But that's not syntax in > Python, so I wouldn't feel comfortable including that in a simplified API. > You really only need attribute access and object calling to make object > indexing work, although for simplicity I can see wanting to provide an > indexing API. If the C API only provides Python-level semantics, then it will roughly have the speed of pure Python (modulo bytecode execution). There are important use cases for the C API where it is desired to have fast type-specific access to Python objects such as tuples, ints, strings, etc. This is relied upon by modules such as _json and _pickle, and third-party extensions as well. Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Fri, 16 Nov 2018 at 10:11, Paul Moore wrote: > On Fri, 16 Nov 2018 at 17:49, Brett Cannon wrote: > > And Just to be clear, I totally support coming up with a totally > stripped-down C API as I have outlined above as that shouldn't be > controversial for any VM that wants to have a C-level API. > > If a stripped down API like this is intended as "use this and you get > compatibility across multiple Python interpreters and multiple Python > versions" (essentially a much stronger and more effective version of > the stable ABI) then I'm solidly in favour (and such an API has clear > trade-offs that allow people to judge whether it's the right choice > for them). > Yes, that's what I'm getting at. Basically we have to approach this from the "start with nothing and build up until we have _just_ enough and thus we know **everyone** now and into the future can support it", or we approach with "take what we have now and start peeling back until we _think_ it's good enough". Personally, I think the former is more future-proof. > > Having this alongside the existing API, which would still be supported > for projects that need low-level access or backward compatibility (or > simply don't have the resources to change), but which will remain > CPython-specific, seems like a perfectly fine idea. > And it can be done as wrappers around the current C API and as an external project to start. As Nathaniel pointed out in another thread, this is somewhat like what Py_LIMITED_API was meant to be, but I think we all admit we slightly messed up by making it opt-out instead of opt-in and so we didn't explicitly control that API as well as we probably should have (I know I have probably screwed up by accidentally including import functions by forgetting it was opt-out). I also don't think it was necessarily designed from a minimalist perspective to begin with as it defines things in terms of what's _not_ in Py_LIMITED_API instead of explicitly listing what _is_. 
So it may (or may not) lead to a different set of APIs in the end when you have to explicitly list every API to include. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
Neil Schemenauer schrieb am 17.11.18 um 00:10: > I think making PyObject an opaque pointer would help. ... well, as long as type checks are still as fast as with "ob_type", and visible to the C compiler so that it can eliminate redundant ones, I wouldn't mind. :) > - Borrowed references are a problem. However, because they are so > commonly used and because the source code changes needed to change > to a non-borrowed API is non-trivial, I don't think we should try > to change this. Maybe we could just discourage their use? FWIW, the code that Cython generates has a macro guard [1] that makes it avoid borrowed references where possible, e.g. when it detects compilation under PyPy. That's definitely doable already, right now. > - It would be nice to make PyTypeObject an opaque pointer as well. > I think that's a lot more difficult than making PyObject opaque. > So, I don't think we should attempt it in the near future. Maybe > we could make a half-way step and discourage accessing ob_type > directly. We would provide functions (probably inline) to do what > you would otherwise do by using op->ob_type->. I've sometimes been annoyed by the fact that protocol checks require two pointer indirections in CPython (or even three in some cases), so that the C compiler is essentially prevented from making any assumptions, and the CPU branch prediction is also stretched a bit more than necessary. At least, the slot check usually comes right before the call, so that the lookups are not wasted. Inline functions are unlikely to improve that situation, but at least they shouldn't make it worse, and they would be more explicit. Needless to say that Cython also has a macro guard in [1] that disables direct slot access and makes it fall back to C-API calls, for users and Python implementations where direct slot support is not wanted/available. 
> One reason you want to discourage access to ob_type is that > internally there is not necessarily one PyTypeObject structure for > each Python level type. E.g. the VM might have specialized types > for certain sub-domains. This is like the different flavours of > strings, depending on the set of characters stored in them. Or, > you could have different list types. One type of list if all > values are ints, for example. An implementation like this could also be based on the buffer protocol. It's already supported by the array.array type (which people probably also just use when they have a need like this and don't want to resort to NumPy). > Basically, with CPython op->ob_type is super fast. For other VMs, > it could be a lot slower. By accessing ob_type you are saying > "give me all possible type information for this object pointer". > By using functions to get just what you need, you could be putting > less burden on the VM. E.g. "is this object an instance of some > type" is faster to compute. Agreed. I think that inline functions (well, or macros, because why not?) that check for certain protocols explicitly could be helpful. > - APIs that return pointers to the internals of objects are a > problem. E.g. PySequence_Fast_ITEMS(). For CPython, this is > really fast because it is just exposing the internal details of > the layout that is already in the correct format. For other VMs, > that API could be expensive to emulate. E.g. you have a list to > store only ints. If someone calls PySequence_Fast_ITEMS(), you > have to create real PyObjects for all of the list elements. But that's intended by the caller, right? They want a flat serial representation of the sequence, with potential conversion to a (list) array if necessary. They might be a bit badly named, but that's exactly the contract of the "PySequence_Fast_*()" line of functions. In Cython, we completely avoid these functions, because they are way too generic for optimisation purposes. 
Direct type checks and code specialisation are much more effective. > - Reducing the size of the API seems helpful. E.g. we don't need > PyObject_CallObject() *and* PyObject_Call(). Also, do we really > need all the type specific APIs, PyList_GetItem() vs > PyObject_GetItem()? In some cases maybe we can justify the bigger > API due to performance. To add a new API, someone should have a > benchmark that shows a real speedup (not just that they imagine it > makes a difference). So, in Cython, we use macros wherever possible, and often avoid generic protocols in favour of type specialisations. We sometimes keep local copies of C-API helper functions, because inlining them allows the C compiler to strip down and streamline the implementation at compile time, rather than jumping through generic code. (Also, it's sometimes required in order to backport new CPython features to Py2.7+.) PyPy's cpyext often just maps type specific C-API functions to the same generic code, obviously, but in CPython, having a way to bypass protocols and going
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On 2018-11-16, Nathaniel Smith wrote: > [..] it seems like you should investigate (a) whether you can make > Py_LIMITED_API *be* that API, instead of having two different > ifdefs That might be a good idea. One problem is that we might like to make backwards-incompatible changes to Py_LIMITED_API. Maybe it doesn't matter if no extensions actually use Py_LIMITED_API. Keeping API and ABI compatibility with the existing Py_LIMITED_API could be difficult. What would be the downside of using a new CPP define? We could deprecate Py_LIMITED_API and the new API could do the job. Also, I think extensions should have the option to turn the ABI compatibility off. Some extensions will not want to convert if there is a big performance hit (some macros turn into non-inlined functions, calling functions rather than accessing a non-opaque structure). Maybe there is a reason my toggling idea won't work. If we can use a CPP define to toggle between inline and non-inline functions, I think it should work. Maybe it will get complicated. Providing ABI compatibility like Py_LIMITED_API is a different goal than making the API more friendly to alternative Python VMs. So, maybe it is a mistake to try to tackle both goals at once. However, the goals seem closely related and so it would be a shame to do a bunch of work and not achieve both. Regards, Neil
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Fri, Nov 16, 2018 at 3:12 PM Neil Schemenauer wrote: > Also, the extension module should not take a big performance hit. > So, you can't change all APIs that were macros into non-inlined > functions. People are not going to accept that and rightly so. > However, it could be that we introduce a new ifdef like > Py_LIMITED_API that gives a stable ABI. E.g. when that's enabled, > most everything would turn into non-inline functions. In exchange > for the performance hit, your extension would become ABI compatible > between a range of CPython releases. That would be a nice feature. > Basically a more useful version of Py_LIMITED_API. It seems like a lot of the things being talked about here actually *are* features of Py_LIMITED_API. E.g. it does a lot of work to hide the internal layout of PyTypeObject, and of course the whole selling point is that it's stable across multiple Python versions. If that's the kind of ABI you're looking for, then it seems like you should investigate (a) whether you can make Py_LIMITED_API *be* that API, instead of having two different ifdefs, (b) why no popular extension modules actually use Py_LIMITED_API. I'm guessing it's partly due to limits of the API, but also things like: lack of docs and examples, lack of py2 support, ... -n -- Nathaniel J. Smith -- https://vorpus.org ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
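For reference, opting into the existing stable ABI that Nathaniel mentions is a one-line build-configuration fragment; the hex value pins the oldest CPython release the resulting binary must support (3.6 here, as an example):

```c
/* Build-configuration fragment: request the stable ABI (abi3),
 * compatible with CPython 3.6 and later.
 * The define must appear before including Python.h. */
#define Py_LIMITED_API 0x03060000
#include <Python.h>
```

With this in place, access to members like ob_type or tp_basicsize goes through functions instead of struct fields, which is exactly the trade-off (portability versus inlined speed) being debated in this thread.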
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On 2018-11-16, Brett Cannon wrote: > I think part of the challenge here (and I believe it has been > brought up elsewhere) is no one knows what kind of API is > necessary for some faster VM other than PyPy. I think we have some pretty good ideas as to what are the problematic parts of the current API. Victor's C-API web site has details[1]. We can ask other implementors which parts are hard to support. Here are my thoughts about some desired changes: - We are *not* getting rid of refcounting for extension modules. That would require a whole new API. We might as well start from scratch with Python 4. No one wants that. However, it is likely different VMs use a different GC internally and only use refcounting for objects passed through the C-API. Using refcounted handles is the usual implementation approach. We can make some changes to make that easier. I think making PyObject an opaque pointer would help. - Borrowed references are a problem. However, because they are so commonly used and because the source code changes needed to change to a non-borrowed API is non-trivial, I don't think we should try to change this. Maybe we could just discourage their use? For CPython, using a borrowed reference API is faster. For other Python implementations, it is likely slower and maybe much slower. So, if you are an extension module that wants to work well with other VMs, you should avoid those APIs. - It would be nice to make PyTypeObject an opaque pointer as well. I think that's a lot more difficult than making PyObject opaque. So, I don't think we should attempt it in the near future. Maybe we could make a half-way step and discourage accessing ob_type directly. We would provide functions (probably inline) to do what you would otherwise do by using op->ob_type->. One reason you want to discourage access to ob_type is that internally there is not necessarily one PyTypeObject structure for each Python level type. E.g. 
the VM might have specialized types for certain sub-domains. This is like the different flavours of strings, depending on the set of characters stored in them. Or, you could have different list types. One type of list if all values are ints, for example. Basically, with CPython op->ob_type is super fast. For other VMs, it could be a lot slower. By accessing ob_type you are saying "give me all possible type information for this object pointer". By using functions to get just what you need, you could be putting less burden on the VM. E.g. "is this object an instance of some type" is faster to compute. - APIs that return pointers to the internals of objects are a problem. E.g. PySequence_Fast_ITEMS(). For CPython, this is really fast because it is just exposing the internal details of the layout that is already in the correct format. For other VMs, that API could be expensive to emulate. E.g. you have a list to store only ints. If someone calls PySequence_Fast_ITEMS(), you have to create real PyObjects for all of the list elements. - Reducing the size of the API seems helpful. E.g. we don't need PyObject_CallObject() *and* PyObject_Call(). Also, do we really need all the type specific APIs, PyList_GetItem() vs PyObject_GetItem()? In some cases maybe we can justify the bigger API due to performance. To add a new API, someone should have a benchmark that shows a real speedup (not just that they imagine it makes a difference). I don't think we should change CPython internals to try to use this new API. E.g. we know that getting ob_type is fast so just leave the code that does that alone. Maybe in the far distant future, if we have successfully got extension modules to switch to using the new API, we could consider changing CPython internals. There would have to be a big benefit though to justify the code churn. E.g. if my tagged pointers experiment shows significant performance gains (it hasn't yet). 
I like Nathaniel Smith's idea of doing the new API as a separate project, outside the cpython repo. It is possible that in that effort, we would like some minor changes to cpython in order to make the new API more efficient, for example. Those should be pretty limited changes because we are hoping that the new API will work on top of old Python versions, e.g. 3.6. To avoid exposing APIs that should be hidden, re-organizing include files is an idea. However, that doesn't help for old versions of Python. So, I'm thinking that Dino's idea of just duplicating the prototypes would be better. We would like a minimal API and so the number of duplicated prototypes shouldn't be too large. Victor's recent work in changing some macros to inline functions is not really related to the new API project, IMHO. I don't think there is a problem to leave an existing macro as a macro. If we need to introduce new APIs, e.g. to help hide PyTypeObject, those APIs could use inline functions. That
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Fri, 16 Nov 2018 at 17:49, Brett Cannon wrote: > And Just to be clear, I totally support coming up with a totally > stripped-down C API as I have outlined above as that shouldn't be > controversial for any VM that wants to have a C-level API. If a stripped down API like this is intended as "use this and you get compatibility across multiple Python interpreters and multiple Python versions" (essentially a much stronger and more effective version of the stable ABI) then I'm solidly in favour (and such an API has clear trade-offs that allow people to judge whether it's the right choice for them). Having this alongside the existing API, which would still be supported for projects that need low-level access or backward compatibility (or simply don't have the resources to change), but which will remain CPython-specific, seems like a perfectly fine idea. Paul ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
Brett: > But otherwise I think we are making assumptions here. For me, unless we are > trying to trim the C API down to just what is syntactically supported in > Python and in such a way that it hides all C-level details I feel like we are > guessing at what's best for other VMs, both today and in the future, until > they can tell us that e.g. tuple indexing is actually not a problem > performance-wise. The current API of PyTuple_GET_ITEM() allows one to write: PyObject **items = &PyTuple_GET_ITEM(tuple, 0); to access PyTupleObject.ob_item. Not only is it possible, it's commonly used in the CPython code base. Last week I replaced the &PyTuple_GET_ITEM() pattern with a new _PyTuple_ITEMS() macro which is private. To be able to return PyObject**, you have to convert the full tuple into PyObject* objects, which is inefficient if your VM uses something different (again, PyPy doesn't use PyObject* at all). More generally, I like to use PyTuple_GET_ITEM() as an example, just because it's easy to understand this macro. But it's maybe not a good example :-) Victor
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Wed, 14 Nov 2018 at 16:09, Gregory P. Smith wrote: > It seems like the discussion so far is: >> >> Victor: "I know people when people hear 'new API' they get scared and >> think we're going to do a Python-3-like breaking transition, but don't >> worry, we're never going to do that." >> Nathaniel: "But then what does the new API add?" >> Greg: "It lets us do a Python-3-like breaking transition!" >> > > That is not what I am proposing but it seems too easy for people to > misunderstand it as such. Sorry. > > Between everything discussed across this thread I believe we have enough > information to suggest that we can avoid an "everyone's afraid of a new 3" > mistake by instead making a shim available with a proposed new API that > works on top of existing Python VM(s) so that if we decide to drop the old > API being public in the future, we could do so *without a breaking > transition*. > I know that has always been my hope, especially if any new API is actually going to be more restrictive instead of broader. > > Given that, I suggest not worrying about defining a new C API within the > CPython project and release itself (yet). > +1 from me. Until we have a PEP outlining the actual proposed API I'm not ready to have it go into 'master'. Helping show the shape of the API by wrapping pre-existing APIs I think that's going to be the way to sell it. > > Without an available benefit, little will use it (and given the function > call overhead we want to isolate some concepts, we know it will perform > worse on today's VMs). > > That "top-5" module using it idea? Maintain forks (hooray for git) of > whatever your definition of "top-5" projects is that use the new API > instead of the CPython API. If you attempt this on things like NumPy, you > may be shocked at the states (plural on purpose) of their extension module > code. That holds true for a lot of popular modules. 
> Part of the point of this work is to demonstrate that a non-incremental,
> order-of-magnitude performance change can be had on a Python VM that only
> supports such an API, done in its own fork of CPython, PyPy,
> VictorBikeshedPy, FbIsAfraidToReleaseANewGcVmPy, etc. implementation, to
> help argue for figuring out a viable not-breaking-the-world transition plan
> for such a C API change in CPython itself.

I think part of the challenge here (and I believe it has been brought up elsewhere) is that no one knows what kind of API is necessary for some faster VM other than PyPy. To me, the only C API that we could potentially start working toward and promoting **today** is one which is stripped to its bare bones and at most mirrors Python syntax. For instance, I have seen PyTuple_GET_ITEM() brought up a couple of times. But that's not syntax in Python, so I wouldn't feel comfortable including that in a simplified API. You really only need attribute access and object calling to make object indexing work, although for simplicity I can see wanting to provide an indexing API.

But otherwise I think we are making assumptions here. For me, unless we are trying to trim the C API down to just what is syntactically supported in Python, and in such a way that it hides all C-level details, I feel like we are guessing at what's best for other VMs, both today and in the future, until they can tell us that e.g. tuple indexing is actually not a problem performance-wise.

And just to be clear, I totally support coming up with a totally stripped-down C API as I have outlined above, as that shouldn't be controversial for any VM that wants to have a C-level API.
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
> It seems like the discussion so far is:

Victor: "I know that when people hear 'new API' they get scared and think we're going to do a Python-3-like breaking transition, but don't worry, we're never going to do that."
Nathaniel: "But then what does the new API add?"
Greg: "It lets us do a Python-3-like breaking transition!"

That is not what I am proposing, but it seems too easy for people to misunderstand it as such. Sorry.

Between everything discussed across this thread, I believe we have enough information to suggest that we can avoid an "everyone's afraid of a new 3" mistake by instead making a shim available with a proposed new API that works on top of existing Python VM(s), so that if we decide to drop the old API being public in the future, we could do so *without a breaking transition*.

Given that, I suggest not worrying about defining a new C API within the CPython project and release itself (yet).

Without an available benefit, little will use it (and given the function call overhead we want to isolate some concepts, we know it will perform worse on today's VMs).

That "top-5" module using it idea? Maintain forks (hooray for git) of whatever your definition of "top-5" projects is that use the new API instead of the CPython API. If you attempt this on things like NumPy, you may be shocked at the states (plural on purpose) of their extension module code. That holds true for a lot of popular modules.

Part of the point of this work is to demonstrate that a non-incremental, order-of-magnitude performance change can be had on a Python VM that only supports such an API, done in its own fork of CPython, PyPy, VictorBikeshedPy, FbIsAfraidToReleaseANewGcVmPy, etc. implementation, to help argue for figuring out a viable not-breaking-the-world transition plan for such a C API change in CPython itself.
-gps
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
Le mer. 14 nov. 2018 à 17:28, Paul Moore a écrit :
> OK, got it. Thanks for taking the time to clarify and respond to my
> concerns. Much appreciated.

It's my fault: I am failing to explain my plan properly. It seems like I have to update my website to explain it better :-)

Victor
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Wed, 14 Nov 2018 at 16:00, Victor Stinner wrote:
>
> In short, you don't have to modify your C extensions and they will
> continue to work as before on Python 3.8. [...]
> I hope that "later" we will get a faster CPython using new
> optimizations, only compatible with C extensions compiled
> with the new C API. My secret hope is that it should ease the
> experimentation of a (yet another) JIT compiler for CPython :-)

OK, got it. Thanks for taking the time to clarify and respond to my concerns. Much appreciated.

Paul
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
In short, you don't have to modify your C extensions and they will continue to work as before on Python 3.8. I only propose to add a new C API and not touch the existing one in any way. Introducing backward incompatible changes to the existing C API is out of my plan.

/usr/bin/python3.8 will support C extensions compiled with the old C API and C extensions compiled with the new C API.

My plan also includes being able to write C extensions compatible with the old and the new C API in a single code base, just as we have Python code working nicely on Python 2 and Python 3 (thanks to six, mostly). In my experience, having two branches or two repositories for two flavors of the same code is a recipe for inconsistent code and a painful workflow.

Le mer. 14 nov. 2018 à 15:53, Paul Moore a écrit :
> It occurs to me that we may be talking at cross purposes. I noticed
> https://pythoncapi.readthedocs.io/backward_compatibility.html#forward-compatibility-with-python-3-8-and-newer
> which seems to be saying that 3rd party code *will* need to change for
> 3.8.

Oh. It's badly explained in that case. This section is only about C extensions which really want to become compatible with the new C API.

> You mention removed functions there, so I guess "stop using the
> removed functions and you'll work with 3.8+ and <=3.7" is the
> compatible approach - but it doesn't offer a way for projects that
> *need* the functionality that's been removed to move forward.

If you need a removed function, don't use the new C API.

> So to try to be clear, your proposal is that in 3.8:
>
> 1. The existing C API will remain
> 2. A new C API will be *added* that 3rd party projects can use should
> they wish to.

Yes, that's it. Add a new API, don't touch the existing API.
> And in 3.9 onwards, both C APIs will remain, maybe with gradual and
> incremental changes that move users of the existing C API closer and
> closer to the new one (via deprecations, replacement APIs etc as per
> our normal compatibility rules).

Honestly, it's too early to say if we should modify the current C API in any way. I only plan to put advice in the *documentation*. Something like "this function is really broken, don't use it" :-) Or "you can use xxx instead, which makes your code compatible with the new C API". But I don't plan to modify the doc soon; it's too early at this point.

> Or is the intention that at *some*
> point there will be a compatibility break and the existing API will
> simply be removed in favour of the "new" API?

That's out of the scope of *my* plan. Maybe someone else will show up in 10 years and say "ok, let's deprecate the old C API". But in my experience, legacy stuff never goes away :-) (Python 2, anyone?)

> The above is clear, but I don't see what incentive there is in that
> scenario for anyone to actually migrate to the new API...

https://pythoncapi.readthedocs.io/ tries to explain why you should want to be compatible with the new C API. The main advantage of the new C API is to compile your C extension once and use it on multiple runtimes:

* use PyPy for better performance (better than with the old C API)
* use a Python Debug Runtime which contains additional runtime checks to detect various kinds of bugs in your C extension
* distribute a single binary working on multiple Python versions (compile on 3.8, use it on 3.9): "stable ABI" -- we are not there yet, I didn't check what should be done in practice for that

I hope that "later" we will get a faster CPython using new optimizations, only compatible with C extensions compiled with the new C API.
My secret hope is that it should ease the experimentation of a (yet another) JIT compiler for CPython :-)

Victor
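Victor's "single code base" goal above can be sketched in plain C (toy names, no Python headers): a build-time define — playing the role of the -DPy_NEWCAPI flag mentioned earlier in the thread — selects between a layout-dependent macro and an opaque accessor function, while the extension code itself is written once.

```c
#include <assert.h>

typedef struct { int items[4]; } Seq;   /* stand-in for an "internal" layout */

/* Opaque accessor, the style a new API would mandate. */
static int seq_get(const Seq *s, int i) { return s->items[i]; }

/* One code base, two APIs: the define would come from the build
 * system (e.g. -DTOY_NEWCAPI), as Victor suggests doing via distutils. */
#ifdef TOY_NEWCAPI
  #define SEQ_ITEM(s, i) seq_get((s), (i))   /* opaque function call  */
#else
  #define SEQ_ITEM(s, i) ((s)->items[i])     /* direct struct access  */
#endif

/* "Extension" code written once against SEQ_ITEM compiles either way. */
static int seq_sum(const Seq *s, int n) {
    int total = 0;
    for (int i = 0; i < n; i++)
        total += SEQ_ITEM(s, i);
    return total;
}
```

The behavior is identical in both modes; only whether the object layout leaks into the compiled extension changes.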
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Wed, 14 Nov 2018 at 14:39, Paul Moore wrote:
> If it is the case that there's no need for any 3rd party code to
> change in order to continue working with 3.8+, then I apologise for
> the interruption.

This is where being able to edit posts, a la Discourse, would be useful :-)

It occurs to me that we may be talking at cross purposes. I noticed https://pythoncapi.readthedocs.io/backward_compatibility.html#forward-compatibility-with-python-3-8-and-newer which seems to be saying that 3rd party code *will* need to change for 3.8.

You mention removed functions there, so I guess "stop using the removed functions and you'll work with 3.8+ and <=3.7" is the compatible approach - but it doesn't offer a way for projects that *need* the functionality that's been removed to move forward. That's the type of hard break that I was trying to ask about, and which I thought you said would not happen when you stated "I don't want to force anyone to move to a new experimental API", and "No, the current C API will remain available. No one is forced to do anything. That's not part of my plan".

So to try to be clear, your proposal is that in 3.8:

1. The existing C API will remain
2. A new C API will be *added* that 3rd party projects can use should they wish to.

And in 3.9 onwards, both C APIs will remain, maybe with gradual and incremental changes that move users of the existing C API closer and closer to the new one (via deprecations, replacement APIs etc as per our normal compatibility rules). Or is the intention that at *some* point there will be a compatibility break and the existing API will simply be removed in favour of the "new" API? Fundamentally, that's what I'm trying to get a clear picture of.

The above is clear, but I don't see what incentive there is in that scenario for anyone to actually migrate to the new API...
Paul
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Wed, 14 Nov 2018 at 14:28, Victor Stinner wrote:
> > assuming the experiment is successful, forced (as opposed to opt-in)
> > migration to the new API would be handled in a gradual,
>
> No, the current C API will remain available. No one is forced to do
> anything. That's not part of my plan.

Oh, cool. So current code will continue working indefinitely? What's the incentive for projects to switch to the new API in that case? Won't we just end up having to carry two APIs indefinitely?

Sorry if this is all obvious, or was explained previously - as I said, I've not been following, precisely because I assumed it was all being handled on an "if you don't care you can ignore it and nothing will change" basis, but Raymond's comments plus your suggestion that you needed to test existing C extensions made me wonder. If it is the case that there's no need for any 3rd party code to change in order to continue working with 3.8+, then I apologise for the interruption.

Paul
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
Le mer. 14 nov. 2018 à 14:36, Paul Moore a écrit :
> PS What percentage does "top 5" translate to? In terms of both
> downloads and actual numbers of extensions? With only 5, it would be
> very easy (I suspect) to get only scientific packages, and (for
> example) miss out totally on database APIs, or web helpers. You'll
> likely get a broader sense of where issues lie if you cover a wide
> range of application domains.

I don't want to force anyone to move to a new experimental API. I don't want to propose patches to third party modules, for example. I would like to ensure that I don't break too many C extensions, and that tools to convert C extensions to the new API work as expected :-) Everything is experimental.

> PPS I'd like to see a summary of your backward compatibility plan.

https://pythoncapi.readthedocs.io/backward_compatibility.html

> assuming the experiment is successful, forced (as opposed to opt-in)
> migration to the new API would be handled in a gradual,

No, the current C API will remain available. No one is forced to do anything. That's not part of my plan.

Victor
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Tue, 13 Nov 2018 at 21:02, Victor Stinner wrote:
> My plan is to select something like the top five most popular C
> extensions based on PyPI download statistics. I cannot test
> everything, I have to put practical limits.

You should probably also consider embedding applications - these have the potential to be adversely affected too. One example would be vim, which embeds Python and makes fairly heavy use of the API (in some relatively nonstandard ways, for better or worse).

Paul

PS What percentage does "top 5" translate to? In terms of both downloads and actual numbers of extensions? With only 5, it would be very easy (I suspect) to get only scientific packages, and (for example) miss out totally on database APIs, or web helpers. You'll likely get a broader sense of where issues lie if you cover a wide range of application domains.

PPS I'd like to see a summary of your backward compatibility plan. I've not been following this thread, so maybe I missed it (if so, a pointer would be appreciated), but I'd expect as a user that extensions and embedding applications would *not* need a major rewrite to work with Python 3.8 - that being the implication of "opt in". I'd also expect that to remain true for any future version of Python - assuming the experiment is successful, forced (as opposed to opt-in) migration to the new API would be handled in a gradual, backward-compatibility-respecting manner, exactly as any other changes to the C API are. A hard break like Python 3, even if limited to the C API, would be bad for users (for example, preventing adoption of Python 3.X until the scientific stack migrates to the new API and works out how to handle supporting old-API versions of Python...)
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Tuesday, 13 November 2018 21:59:14 CET Victor Stinner wrote:
> Le mar. 13 nov. 2018 à 20:32, André Malo a écrit :
> > As long as they are recompiled. However, they will lose a lot of
> > performance. Both these points have been mentioned somewhere, I'm
> > certain, but it cannot be stressed enough, IMHO.
>
> Somewhere is here:
> https://pythoncapi.readthedocs.io/performance.html
>
> > I'm wondering how you suggest to measure "major". I believe every C
> > extension which is public and running in production somewhere is major
> > enough.
>
> My plan is to select something like the top five most popular C
> extensions based on PyPI download statistics. I cannot test
> everything, I have to put practical limits.

You shouldn't. Chances are that you don't even know them well enough to do that. A scalable approach would be to talk to the projects and let them do it instead. No?

Cheers,
--
package Hacker::Perl::Another::Just;print
qq~@{[reverse split/::/ =>__PACKAGE__]}~;

# André Malo # http://www.perlig.de #
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Wed, 14 Nov 2018 11:48:15 +0100 Victor Stinner wrote:
> Le mer. 14 nov. 2018 à 11:24, Antoine Pitrou a écrit :
> > For example in PyArrow we use PySequence_Fast_GET_ITEM() (*)
>
> Maybe PyArrow is a kind of C extension which should have one
> implementation for the new C API (PyPy) and one implementation for the
> current C API (CPython)?

Yes, maybe. I'm just pointing out that we're using those macros, and removing them from the C API (or replacing them with non-inline functions) would hurt us.

> > and even
> > PyType_HasFeature() (**) (to quickly check for multiple base types with
> > a single fetch and comparison).
>
> I'm not sure that PyType_HasFeature() is an issue?

I don't know. You're the one who decides :-)

cheers

Antoine.
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
Le mer. 14 nov. 2018 à 11:24, Antoine Pitrou a écrit :
> For example in PyArrow we use PySequence_Fast_GET_ITEM() (*)

Maybe PyArrow is a kind of C extension which should have one implementation for the new C API (PyPy) and one implementation for the current C API (CPython)? Cython can be used to generate two different C implementations from the same source code using different compilation modes.

> and even
> PyType_HasFeature() (**) (to quickly check for multiple base types with
> a single fetch and comparison).

I'm not sure that PyType_HasFeature() is an issue?

Victor
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Wed, 14 Nov 2018 11:03:49 +0100 Victor Stinner wrote:
> Oh, I should stop promoting my "CPython fork" idea.
>
> There is already an existing VM which is way faster than CPython but
> its performance is limited by the current C API. The VM is called...
> PyPy!
>
> The bet is that migrating to a new C API would make your C extension faster.

Faster on PyPy... but potentially slower on CPython. That's what we (you :-)) need to investigate and solve.

Those macros and inline functions are actually important for many use cases. For example, in PyArrow we use PySequence_Fast_GET_ITEM() (*) and even PyType_HasFeature() (**) (to quickly check for multiple base types with a single fetch and comparison).

(*) https://github.com/apache/arrow/blob/master/cpp/src/arrow/python/iterators.h#L39-L86
(**) https://github.com/apache/arrow/blob/master/cpp/src/arrow/python/helpers.cc#L266-L299

Regards

Antoine.
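The PyType_HasFeature() trick Antoine mentions — answering "is this one of several base types?" with a single fetch and comparison — can be sketched with toy flags. The names below are made up for illustration, not the real Py_TPFLAGS_* constants.

```c
#include <assert.h>

/* Made-up flag bits playing the role of Py_TPFLAGS_*_SUBCLASS. */
enum {
    TOYFLAG_INT_SUBCLASS  = 1u << 0,
    TOYFLAG_LIST_SUBCLASS = 1u << 1,
    TOYFLAG_DICT_SUBCLASS = 1u << 2
};

typedef struct { unsigned long tp_flags; } ToyType;

/* Like PyType_HasFeature: one load, one AND, one compare. */
#define Toy_HasFeature(t, f) (((t)->tp_flags & (f)) != 0)

/* Checking against two base types still costs a single fetch,
 * instead of walking the base-class chain twice. */
static int is_int_or_list(const ToyType *t) {
    return Toy_HasFeature(t, TOYFLAG_INT_SUBCLASS | TOYFLAG_LIST_SUBCLASS);
}
```

Replacing such a macro with an out-of-line function adds a call per check, which is the per-item overhead Antoine is worried about in tight loops like PyArrow's iterators.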
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
Le mer. 14 nov. 2018 à 03:24, Nathaniel Smith a écrit :
> So I think what you're saying is that your goal is to get a
> new/better/shinier VM, and the plan to accomplish that is:
>
> 1. Define a new C API.
> 2. Migrate projects to the new C API.
> 3. Build a new VM that gets benefits from only supporting the new API.
>
> This sounds exactly backwards to me?
>
> If you define the new API before you build the VM, then no-one is
> going to migrate, because why should they bother? You'd be asking
> overworked third-party maintainers to do a bunch of work with no
> benefit, except that maybe someday later something good might happen.

Oh, I should stop promoting my "CPython fork" idea.

There is already an existing VM which is way faster than CPython but whose performance is limited by the current C API. The VM is called... PyPy!

The bet is that migrating to a new C API would make your C extension faster.

Victor
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Sun, Nov 11, 2018 at 3:19 PM, Victor Stinner wrote:
> I'm not sure yet how far we should go towards a perfect API which
> doesn't leak everything. We have to move slowly, and make sure that we
> don't break major C extensions. We need to write tools to fully
> automate the conversion. If it's not possible, maybe the whole project
> will fail.

This is why I'm nervous about adding this directly to CPython. If we're just talking about adding a few new API calls to replace old ones that are awkward to use, then that's fine, that's not very risky. But if you're talking about a large project that makes fundamental changes in the C API (e.g., disallowing pointer dereferences, like tagged pointers do), then yeah, there's a very large risk that that might fail.

>> If so, then would it make more sense to develop this as an actual
>> separate abstraction layer? That would have the huge advantage that it
>> could be distributed and versioned separately from CPython, different
>> packages could use different versions of the abstraction layer, PyPy
>> isn't forced to immediately add a bunch of new APIs...
>
> I didn't investigate this option. But I expect that you will have to
> write a full new API using a different prefix than "Py_". Otherwise,
> I'm not sure how you want to handle PyTuple_GET_ITEM() as a macro on
> one side (Include/tupleobject.h) and PyTuple_GET_ITEM() on the other
> side (hypothetical_new_api.h).
>
> Would it mean to duplicate all functions to get a different prefix?
>
> If you keep the "Py_" prefix, what I would like to ensure is that some
> functions are no longer accessible. How do you remove
> PySequence_Fast_GET_ITEM() for example?
>
> For me, it seems simpler to modify CPython headers than starting on
> something new. It seems simpler to choose the proper level of
> compatibility. I start from an API 100% compatible (the current C
> API), and decide what is changed and how.

It may be simpler, but it's hugely more risky.
Once you add something to CPython, you can't take it back again without a huge amount of work. You said above that the whole project might fail. But if it's in CPython, failure is not acceptable! The whole problem you're trying to solve is that the C API is too big, but your proposed solution starts by making it bigger, so if your project fails then it makes the problem even bigger...

I don't know if making it a separate project is the best approach or not, it was just an idea :-). But it would have the huge benefit that you can actually experiment and try things out without committing to supporting them forever. And I don't know the best answer to all your questions above, that's what experimenting is for :-). But it certainly is technically possible to make a new API that shares a common subset with the old API, e.g.:

    /* NewPython.h */
    #include <Python.h>
    #define PyTuple_GET_ITEM PyTuple_Get_Item
    #undef PySequence_Fast_GET_ITEM

-n

--
Nathaniel J. Smith -- https://vorpus.org
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Mon, Nov 12, 2018 at 10:46 PM, Gregory P. Smith wrote:
>
> On Fri, Nov 9, 2018 at 5:50 PM Nathaniel Smith wrote:
>>
>> On Fri, Nov 9, 2018 at 4:30 PM, Victor Stinner wrote:
>> > Ah, important points. I don't want to touch the current C API nor make
>> > it less efficient. And compatibility in both directions (current C API
>> > <=> new C API) is very important for me. There is no such plan as
>> > "Python 4" which would break the world and *force* everybody to
>> > upgrade to the new C API, or stay to Python 3 forever. No. The new C
>> > API must be an opt-in option, and current C API remains the default
>> > and not be changed.
>>
>> Doesn't this mean that you're just making the C API larger and more
>> complicated, rather than simplifying it? You cite some benefits
>> (tagged pointers, changing the layout of PyObject, making PyPy's life
>> easier), but I don't see how you can do any of those things so long as
>> the current C API remains supported. [...]
>
> I'd love to get to a situation where the only valid ABI we support knows
> nothing about internal structs at all. Today, PyObject memory layout is
> exposed to the world and unchangeable. :(
> This is a long process release wise (assume multiple stable releases go by
> before we could declare that).

It seems like the discussion so far is:

Victor: "I know that when people hear 'new API' they get scared and think we're going to do a Python-3-like breaking transition, but don't worry, we're never going to do that."
Nathaniel: "But then what does the new API add?"
Greg: "It lets us do a Python-3-like breaking transition!"

To make a new API work we need to *either* have some plan for how it will produce benefits without a big breaking transition, *or* some plan for how to make this kind of transition viable. These are both super super hard questions -- that's why this discussion has been dragging on for a decade now! But you do have to pick one or the other :-).
> Experimentation with new internal implementations can begin once we have a
> new C API, by explicitly breaking the old C API within such experiments (as
> is required for most anything interesting). All code that is written to the
> new C API still works during this process, thus making the job of practical
> testing of such new VM internals easier.

So I think what you're saying is that your goal is to get a new/better/shinier VM, and the plan to accomplish that is:

1. Define a new C API.
2. Migrate projects to the new C API.
3. Build a new VM that gets benefits from only supporting the new API.

This sounds exactly backwards to me?

If you define the new API before you build the VM, then no-one is going to migrate, because why should they bother? You'd be asking overworked third-party maintainers to do a bunch of work with no benefit, except that maybe someday later something good might happen. And if you define the new API first, then when you start building the VM you're 100% guaranteed to discover that the new API isn't *quite* right for the optimizations you want to do, and have to change it again to make a new-new API. And then go back to the maintainers who you did convince to put their neck out and do work on spec, and explain that haha whoops actually they need to update their code *again*.

There have been lots of Python VM projects at this point. They've faced many challenges, but I don't think any have failed because there just wasn't enough pure-Python code around to test the VM internals. If I were trying to build a new Python VM, that's not even in the top 10 of issues I'd be worried about...

-n

--
Nathaniel J. Smith -- https://vorpus.org
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
Le mar. 13 nov. 2018 à 20:32, André Malo a écrit :
> As long as they are recompiled. However, they will lose a lot of performance.
> Both these points have been mentioned somewhere, I'm certain, but it cannot be
> stressed enough, IMHO.

Somewhere is here: https://pythoncapi.readthedocs.io/performance.html

> I'm wondering how you suggest to measure "major". I believe every C
> extension which is public and running in production somewhere is major
> enough.

My plan is to select something like the top five most popular C extensions based on PyPI download statistics. I cannot test everything, I have to put practical limits.

Victor
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
Victor Stinner wrote:
> Replacing macros with functions has little impact on backward
> compatibility. Most C extensions should still work if macros become
> functions.

As long as they are recompiled. However, they will lose a lot of performance. Both these points have been mentioned somewhere, I'm certain, but it cannot be stressed enough, IMHO.

> I'm not sure yet how far we should go towards a perfect API which
> doesn't leak everything. We have to move slowly, and make sure that we
> don't break major C extensions. We need to write tools to fully
> automate the conversion. If it's not possible, maybe the whole project
> will fail.

I'm wondering how you suggest to measure "major". I believe every C extension which is public and running in production somewhere is major enough. Maybe "easiness to fix"? Lines of code?

Cheers,
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
Le mar. 13 nov. 2018 à 08:13, Gregory P. Smith a écrit :
> When things have only ever been macros (Py_INCREF, etc.) the name can be
> reused if there has never been a function of that name in an old C API. But
> beware of reuse for anything where the semantics change, to avoid
> misunderstandings about behavior from people familiar with the old API or
> googling API names to look up behavior.

My plan is to only keep an existing function if it has no flaw. If it has a flaw, it should be removed and maybe replaced with a new function (or a replacement using existing APIs should be suggested). I don't want to modify the behavior depending on whether it's the "old" or the "new" API. My plan reuses the same code base; I don't want to put the whole body of a function inside an "#ifdef NEWCAPI".

> I suspect optimizing for ease of transition from code written to the existing
> C API to the new API by keeping names the same is the wrong thing to optimize
> for.

Not all functions in the current C API are bad. Many functions are just fine. For example, PyObject_GetAttr() returns a strong reference. I don't see anything wrong with this API. Only a small portion of the C API is "bad".

> Using entirely new names may actually be a good thing as it makes it
> immediately clear which way a given piece of code is written. It'd also be
> good for PyObject* (the old C API thing) to be a different type from PythonHandle*
> (a new API thing whose name I just made up) such that they could not be
> passed around and exchanged for one another without a compiler complaint.
> Code written using both APIs should not be allowed to transit objects
> directly between different APIs.

On Windows, the HANDLE type is an opaque value: you are not supposed to dereference it. If a handle were exposed as a pointer to a concrete structure, some developer might want to dereference it, whereas it must really be treated as an opaque token. Consider tagged pointers: you don't want to dereference a tagged pointer.

But no, I don't plan to replace "PyObject*". Again, I want to reduce the number of changes.

If the PyObject structure is not exposed, I don't think that it's an issue to keep the "PyObject*" type. Example:

---
#include

typedef struct _object PyObject;

PyObject* dummy(void)
{
    return (PyObject *)NULL;
}

int main()
{
    PyObject *obj = dummy();
    return obj->ob_type;
}
---

This program is valid, except for the single line which attempts to dereference PyObject*:

x.c: In function 'main':
x.c:13:15: error: dereferencing pointer to incomplete type 'PyObject {aka struct _object}'
     return obj->ob_type;

If I could restart from scratch, I would design the C API differently. For example, I'm not sure that I would use "global variables" (the Python thread state) to store the current exception. I would use something similar to Rust's error handling:

https://doc.rust-lang.org/book/first-edition/error-handling.html

But that's not my plan. My plan is not to build a brave new world. My plan is to make a "small step" towards a better API, to make PyPy more efficient and to allow writing a new, more optimized CPython.

I also plan to *iterate* on the API rather than having a frozen API. It's just that we cannot jump to the perfect API at once. We need small steps, and we must make sure that we don't break too many C extensions at each milestone. Maybe the new API should be versioned like the Android NDK, for example.

Victor
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Sun, Nov 11, 2018 at 3:19 PM Victor Stinner wrote:
> Le sam. 10 nov. 2018 à 04:02, Nathaniel Smith a écrit :
> > So is it fair to say that your plan is that CPython will always use
> > the current ("old") API internally, and the "new" API will be
> > essentially an abstraction layer, that's designed to let people write
> > C extensions that target the old API, while also being flexible enough
> > to target PyPy and other "new different Python runtimes"?
>
> What we call the "Python C API" is not really an API. I mean, nobody
> tried to sit down and think about a proper API to access Python from
> C. What happened is that we had an implementation of Python written in C,
> and it was cool to just "expose everything". By everything, I mean
> "everything". It's hard to find a secret CPython feature not exposed
> or leaked in the C API.
>
> The problem that I'm trying to solve is to "fix the C API" to hide
> implementation details, with one constraint: don't create a new
> "Python 4". I don't want to break backward compatibility without
> having a slow and smooth transition plan. The "old" and new C APIs must
> be supported in parallel, at least in the standard CPython,
> /usr/bin/python3.
>
> Writing a new API from scratch is nice, but it's harder to move all
> existing C extensions from the "old" C API to a new one.
>
> Replacing macros with functions has little impact on backward
> compatibility. Most C extensions should still work if macros become
> functions.
>
> I'm not sure yet how far we should go towards a perfect API which
> doesn't leak everything. We have to move slowly, and make sure that we
> don't break major C extensions. We need to write tools to fully
> automate the conversion. If it's not possible, maybe the whole project
> will fail.
>
> I'm looking for a practical solution based on the existing C API and
> the existing CPython code base.
>
> > If so, then would it make more sense to develop this as an actual
> > separate abstraction layer? That would have the huge advantage that it
> > could be distributed and versioned separately from CPython, different
> > packages could use different versions of the abstraction layer, PyPy
> > isn't forced to immediately add a bunch of new APIs...

I like where this thought is headed!

> I didn't investigate this option. But I expect that you will have to
> write a full new API using a different prefix than "Py_". Otherwise,
> I'm not sure how you want to handle PyTuple_GET_ITEM() as a macro on
> one side (Include/tupleobject.h) and PyTuple_GET_ITEM() on the other
> side (hypotetical_new_api.h).
>
> Would it mean duplicating all functions to get a different prefix?

For strict C, yes: namespacing has always been manual collision avoidance/mangling. You'd need a new unique name for anything defined as a function that conflicts with the old C API function names. You could hide these from code, if you are just reusing the name, by using #define in the new API header to map the old name to the new unique name... but that makes code a real challenge to read and understand at a glance. Without examining the header imports, the reader wouldn't know, for example, whether the code is calling the API that borrows a reference (old API, evil) or the one that gets its own reference (presumed new API).

When things have only ever been macros (Py_INCREF, etc.) the name can be reused if there has never been a function of that name in an old C API. But beware of reuse for anything where the semantics change, to avoid misunderstandings about behavior from people familiar with the old API or googling API names to look up behavior.

I suspect optimizing for ease of transition from code written to the existing C API to the new API by keeping names the same is the wrong thing to optimize for. Using entirely new names may actually be a good thing, as it makes it immediately clear which way a given piece of code is written.

It'd also be good for PyObject* (the old C API thing) to be a different type from PythonHandle* (a new API thing whose name I just made up) such that they could not be passed around and exchanged for one another without a compiler complaint. Code written using both APIs should not be allowed to transit objects directly between different APIs.

-gps

> If you keep the "Py_" prefix, what I would like to ensure is that some
> functions are no longer accessible. How do you remove
> PySequence_Fast_GET_ITEM(), for example?
>
> For me, it seems simpler to modify the CPython headers than to start on
> something new. It seems simpler to choose the proper level of
> compatibility. I start from an API 100% compatible (the current C
> API), and decide what is changed and how.
>
> Victor
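Gregory's point about incompatible pointer types can be sketched in plain C. Everything below is illustrative, not a real API: the PythonHandle name is the one he says he made up, and the two helper functions are invented stand-ins.

```c
#include <stddef.h>

/* Two deliberately distinct opaque types: the compiler refuses to
 * exchange one for the other without an explicit cast. */
typedef struct _object PyObject;             /* stand-in for the old API type */
typedef struct _python_handle PythonHandle;  /* hypothetical new API type */

/* Hypothetical functions belonging to each API. */
PyObject *old_api_get(void) { return NULL; }
int new_api_is_null(const PythonHandle *h) { return h == NULL; }

int demo(void)
{
    PyObject *obj = old_api_get();
    /* new_api_is_null(obj);   <- rejected by the compiler with an
     *   "incompatible pointer type" diagnostic, which is exactly the
     *   point: objects cannot silently transit between the two APIs. */
    return obj == NULL;
}
```

If both names were typedefs of the same underlying type (or plain void*), no such diagnostic would occur; the two distinct struct tags are what make the compiler complain.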
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Fri, Nov 9, 2018 at 5:50 PM Nathaniel Smith wrote:
> On Fri, Nov 9, 2018 at 4:30 PM, Victor Stinner wrote:
> > Ah, important points. I don't want to touch the current C API nor make
> > it less efficient. And compatibility in both directions (current C API
> > <=> new C API) is very important for me. There is no such plan as
> > "Python 4" which would break the world and *force* everybody to
> > upgrade to the new C API or stay on Python 3 forever. No. The new C
> > API must be an opt-in option, and the current C API remains the default
> > and will not be changed.
>
> Doesn't this mean that you're just making the C API larger and more
> complicated, rather than simplifying it? You cite some benefits
> (tagged pointers, changing the layout of PyObject, making PyPy's life
> easier), but I don't see how you can do any of those things so long as
> the current C API remains supported.
>
> -n

I believe the implied missing thing from Victor's description is this: experimentation with new internal implementations can begin once we have a new C API, by explicitly breaking the old C API within such experiments (as is required for most anything interesting). All code that is written to the new C API still works during this process, thus making the job of practical testing of such new VM internals easier.

From there, you can make decisions on how heavily to push the world towards adoption of the new C API, and by when, so that a runtime not supporting the old API can be realized, with a list of enticing carrot-tasting benefits. (Devising any necessary PyPy cpyext-like compatibility solutions for super long-lived code, or for things that sadly want to use the 3.7 ABI for 10 years, in the process -- and similarly an extension to provide some or all of this API/ABI on top of older existing stable Python releases!)

I'd *love* to get to a situation where the only valid ABI we support knows nothing about internal structs at all. Today, the PyObject memory layout is exposed to the world and unchangeable. :(

This is a long process, release-wise (assume multiple stable releases go by before we could declare that). But *we've got to start* by defining what we want to provide as a seriously limited but functional API and ABI, even if it doesn't perform as well as things compiled against our existing exposed-internals C API. For *most* extension modules, performance of this sort is not important. For the numpys of the world, life is more complicated; we should work with them to figure out their C API needs.

If it wasn't already obvious, you've got my support on this. :)

-gps

PS If the conversation devolves to arguing about "new" being a bad name, that's a good sign. I suggest calling it the Vorpal API after the bunny. Or be boring and just use a year number for the name.
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
Le sam. 10 nov. 2018 à 04:02, Nathaniel Smith a écrit :
> So is it fair to say that your plan is that CPython will always use
> the current ("old") API internally, and the "new" API will be
> essentially an abstraction layer, that's designed to let people write
> C extensions that target the old API, while also being flexible enough
> to target PyPy and other "new different Python runtimes"?

What we call the "Python C API" is not really an API. I mean, nobody tried to sit down and think about a proper API to access Python from C. What happened is that we had an implementation of Python written in C, and it was cool to just "expose everything". By everything, I mean "everything". It's hard to find a secret CPython feature not exposed or leaked in the C API.

The problem that I'm trying to solve is to "fix the C API" to hide implementation details, with one constraint: don't create a new "Python 4". I don't want to break backward compatibility without having a slow and smooth transition plan. The "old" and new C APIs must be supported in parallel, at least in the standard CPython, /usr/bin/python3.

Writing a new API from scratch is nice, but it's harder to move all existing C extensions from the "old" C API to a new one.

Replacing macros with functions has little impact on backward compatibility. Most C extensions should still work if macros become functions.

I'm not sure yet how far we should go towards a perfect API which doesn't leak everything. We have to move slowly, and make sure that we don't break major C extensions. We need to write tools to fully automate the conversion. If it's not possible, maybe the whole project will fail.

I'm looking for a practical solution based on the existing C API and the existing CPython code base.

> If so, then would it make more sense to develop this as an actual
> separate abstraction layer? That would have the huge advantage that it
> could be distributed and versioned separately from CPython, different
> packages could use different versions of the abstraction layer, PyPy
> isn't forced to immediately add a bunch of new APIs...

I didn't investigate this option. But I expect that you will have to write a full new API using a different prefix than "Py_". Otherwise, I'm not sure how you want to handle PyTuple_GET_ITEM() as a macro on one side (Include/tupleobject.h) and PyTuple_GET_ITEM() on the other side (hypotetical_new_api.h).

Would it mean duplicating all functions to get a different prefix?

If you keep the "Py_" prefix, what I would like to ensure is that some functions are no longer accessible. How do you remove PySequence_Fast_GET_ITEM(), for example?

For me, it seems simpler to modify the CPython headers than to start on something new. It seems simpler to choose the proper level of compatibility. I start from an API 100% compatible (the current C API), and decide what is changed and how.

Victor
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On 2018-11-09, Dino Viehland wrote:
> Rather than adding yet another pre-processor directive for this I would
> suggest just adding a new header file that only has the new stable API.
> For example it could just be "py.h" or "pyapi.h". It would have all of the
> definitions for the stable API.

I like this idea. It will be easier to define a minimal and clean API with this approach. I believe it can mostly be a subset of the current API.

I think we could combine Dino's idea with Nathaniel's suggestion of developing it separately from CPython. Victor's C-API project is already attempting to provide backwards compatibility, i.e. you can have an extension module that uses the new API but compiles and runs with older versions of Python (e.g. 3.6). So, whatever is inside this new API, it must be possible to build it on top of the existing Python API.

Regards,

Neil
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
> > That's exactly why I dislike "New", it's like adding "Ex" or "2" to a
> > function name :-)
> >
> > Well, before bikeshedding on the C define name, I would prefer to see
> > if the overall idea of trying to push code for the new C API in the
> > master branch is a good idea, or if it's too early and the experiment
> > must continue in a fork.

Rather than adding yet another pre-processor directive for this, I would suggest just adding a new header file that only has the new stable API. For example, it could just be "py.h" or "pyapi.h". It would have all of the definitions for the stable API.

While that would involve some duplication from the existing headers, I don't think it would be such a big deal: the idea is that the API won't change, methods won't be removed, and occasionally new methods will get added in a very thoughtful manner. Having it be separate will force thought and conversation about it.

It would also make it very easy to look and see what exactly is in the stable API. There'd be a pretty flat list which can be consulted, and hopefully it ends up not being super huge either.

BTW, thanks for continuing to push on this Victor, it seems like great progress!

On Fri, Nov 9, 2018 at 4:57 PM Victor Stinner wrote:
> Le sam. 10 nov. 2018 à 01:49, Michael Selik a écrit :
> >> It uses an opt-in option (Py_NEWCAPI define -- I'm not sure about the
> >> name) to get the new API. The current C API is unchanged.
> >
> > While one can hope that this will be the only time the C API will be
> > revised, it may be better to number it instead of calling it "NEW". 20
> > years from now, it won't feel new anymore.
>
> That's exactly why I dislike "New", it's like adding "Ex" or "2" to a
> function name :-)
>
> Well, before bikeshedding on the C define name, I would prefer to see
> if the overall idea of trying to push code for the new C API in the
> master branch is a good idea, or if it's too early and the experiment
> must continue in a fork.
>
> Victor
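A flat, hand-curated header of the kind Dino describes might look roughly like this. The contents below are invented purely for illustration (including the guard name and the Py_ssize_t stand-in); they are not the actual stable API:

```c
/* py.h -- hypothetical hand-curated stable-API header (sketch only).
 * No chain of includes, no struct layouts: just a flat list that is
 * easy to read and easy to audit. */
#ifndef Py_STABLE_API_H
#define Py_STABLE_API_H

typedef struct _object PyObject;  /* opaque: layout never exposed */
typedef long Py_ssize_t;          /* stand-in; really defined elsewhere */

PyObject *PyObject_GetAttr(PyObject *o, PyObject *attr_name);
PyObject *PyTuple_GetItem(PyObject *tuple, Py_ssize_t index);
void Py_IncRef(PyObject *o);
void Py_DecRef(PyObject *o);
/* ... the rest of a deliberately small list ... */

#endif /* Py_STABLE_API_H */
```

The value of the flat file is exactly what Dino argues: adding a declaration here is a visible, reviewable act, rather than something that leaks in through a chain of includes.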
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Fri, Nov 9, 2018 at 6:03 PM, Victor Stinner wrote:
> Le sam. 10 nov. 2018 à 02:50, Nathaniel Smith a écrit :
>> Doesn't this mean that you're just making the C API larger and more
>> complicated, rather than simplifying it? You cite some benefits
>> (tagged pointers, changing the layout of PyObject, making PyPy's life
>> easier), but I don't see how you can do any of those things so long as
>> the current C API remains supported.
>
> Tagged pointers and changing the layout of PyObject can only be
> experimented with in a new, different Python runtime which only supports C
> extensions compiled with the new C API. Technically, it can be CPython
> compiled with a different flag, as there are already python3-dbg (debug
> mode, ./configure --with-pydebug) and python3 (release mode). Or it
> can be a CPython fork.
>
> I don't propose to experiment with tagged pointers or changing the layout of
> PyObject in CPython. It may require too many changes, and it's unclear
> whether it's worth it or not. I only propose to implement the least
> controversial part of the new C API in the master branch, since
> maintaining this new C API in a fork is painful.
>
> I cannot promise that it will make PyPy's life easier. PyPy developers
> already told me that they have already implemented support for the
> current C API. The promise is that if you use the new C API, PyPy
> should be more efficient, because it would have fewer things to
> emulate. To be honest, I'm not sure at this point; I don't know PyPy's
> internals. I also know that PyPy developers always complain when we
> *add new functions* to the C API, and there is a non-zero risk that I
> would like to add new functions, since the current ones have issues :-) I
> am working with PyPy to involve them in the new C API.

So is it fair to say that your plan is that CPython will always use the current ("old") API internally, and the "new" API will be essentially an abstraction layer, designed to let people write C extensions that target the old API while also being flexible enough to target PyPy and other "new different Python runtimes"?

If so, then would it make more sense to develop this as an actual separate abstraction layer? That would have the huge advantage that it could be distributed and versioned separately from CPython; different packages could use different versions of the abstraction layer; PyPy isn't forced to immediately add a bunch of new APIs...

-n

--
Nathaniel J. Smith -- https://vorpus.org
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
Sometimes, code is easier to understand than a long explanation, so here is a very simple example of a modified function for the new C API:

https://bugs.python.org/issue35206
https://github.com/python/cpython/pull/10443/files

PyTuple_GET_ITEM() becomes a function call, and the function implementation checks its arguments at runtime if compiled in debug mode. Technically, the header file still uses a macro, to implicitly cast to PyObject*, since currently the macro accepts any type, and the new C API should not change that.

Victor

Le sam. 10 nov. 2018 à 01:53, Victor Stinner a écrit :
>
> To hide all implementation details, I propose to stop using macros and
> use function calls instead. For example, replace:
>
> #define PyTuple_GET_ITEM(op, i) \
>     (((PyTupleObject *)(op))->ob_item[i])
>
> with:
>
> #define PyTuple_GET_ITEM(op, i) PyTuple_GetItem(op, i)
>
> With this change, C extensions using PyTuple_GET_ITEM() no longer
> dereference PyObject* nor access PyTupleObject.ob_item. For example,
> PyPy doesn't have to convert all tuple items to PyObject, but only
> create one PyObject for the requested item. Another example is that it
> becomes possible to use a "CPython debug runtime" which checks at
> runtime that the first argument is a tuple and that the index is
> valid. For a longer explanation, see the idea of different "Python
> runtimes":
>
>    https://pythoncapi.readthedocs.io/runtimes.html
>
> Replacing macros with function calls is only a first step. It doesn't
> solve the problem of borrowed references, for example.
>
> Obviously, such a change has a performance cost. Sadly, I didn't run
> a benchmark yet. At this point, I mostly care about correctness and
> the feasibility of the whole project. I also hope that the new C API
> will allow implementing new optimizations which cannot even be
> imagined today, because of the backward compatibility. The question is
> whether the performance balance is positive or not in the end :-)
>
> Hopefully, there is no urgency to take any decision at this point. The
> whole project is experimental and can be cancelled anytime.
>
> Victor
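The debug-mode checking described here can be sketched with mock types. This is a standalone illustration, not the code from the PR above; the MockTuple type and function name are invented. In a release build the accessor is a bare array access; a debug build validates its arguments first.

```c
#include <stdio.h>
#include <stdlib.h>

/* Mock stand-in so the sketch is self-contained; this is not the
 * real CPython tuple structure. */
typedef struct {
    size_t ob_size;
    void **ob_item;
} MockTuple;

/* The function behind a PyTuple_GET_ITEM-style macro.  With the
 * checks compiled in (a "debug runtime"), bad calls abort with a
 * clear message instead of reading out of bounds. */
void *mock_tuple_get_item(const MockTuple *op, size_t i)
{
#ifndef NDEBUG
    if (op == NULL) {
        fprintf(stderr, "mock_tuple_get_item: NULL tuple\n");
        abort();
    }
    if (i >= op->ob_size) {
        fprintf(stderr, "mock_tuple_get_item: index out of range\n");
        abort();
    }
#endif
    return op->ob_item[i];
}
```

Because callers go through a function call, a debug build can add these checks without the extension's binary hard-coding any struct layout, which is the property the macro-to-function proposal relies on.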
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
Le sam. 10 nov. 2018 à 02:50, Nathaniel Smith a écrit :
> Doesn't this mean that you're just making the C API larger and more
> complicated, rather than simplifying it? You cite some benefits
> (tagged pointers, changing the layout of PyObject, making PyPy's life
> easier), but I don't see how you can do any of those things so long as
> the current C API remains supported.

Tagged pointers and changing the layout of PyObject can only be experimented with in a new, different Python runtime which only supports C extensions compiled with the new C API. Technically, it can be CPython compiled with a different flag, as there are already python3-dbg (debug mode, ./configure --with-pydebug) and python3 (release mode). Or it can be a CPython fork.

I don't propose to experiment with tagged pointers or changing the layout of PyObject in CPython. It may require too many changes, and it's unclear whether it's worth it or not. I only propose to implement the least controversial part of the new C API in the master branch, since maintaining this new C API in a fork is painful.

I cannot promise that it will make PyPy's life easier. PyPy developers already told me that they have already implemented support for the current C API. The promise is that if you use the new C API, PyPy should be more efficient, because it would have fewer things to emulate. To be honest, I'm not sure at this point; I don't know PyPy's internals. I also know that PyPy developers always complain when we *add new functions* to the C API, and there is a non-zero risk that I would like to add new functions, since the current ones have issues :-) I am working with PyPy to involve them in the new C API.

Victor
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Fri, Nov 9, 2018 at 4:30 PM, Victor Stinner wrote:
> Ah, important points. I don't want to touch the current C API nor make
> it less efficient. And compatibility in both directions (current C API
> <=> new C API) is very important for me. There is no such plan as
> "Python 4" which would break the world and *force* everybody to
> upgrade to the new C API or stay on Python 3 forever. No. The new C
> API must be an opt-in option, and the current C API remains the default
> and will not be changed.

Doesn't this mean that you're just making the C API larger and more complicated, rather than simplifying it? You cite some benefits (tagged pointers, changing the layout of PyObject, making PyPy's life easier), but I don't see how you can do any of those things so long as the current C API remains supported.

-n

--
Nathaniel J. Smith -- https://vorpus.org
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
Le sam. 10 nov. 2018 à 01:49, Michael Selik a écrit :
>> It uses an opt-in option (Py_NEWCAPI define -- I'm not sure about the
>> name) to get the new API. The current C API is unchanged.
>
> While one can hope that this will be the only time the C API will be revised,
> it may be better to number it instead of calling it "NEW". 20 years from now,
> it won't feel new anymore.

That's exactly why I dislike "New"; it's like adding "Ex" or "2" to a function name :-)

Well, before bikeshedding on the C define name, I would prefer to see if the overall idea of trying to push code for the new C API into the master branch is a good idea, or if it's too early and the experiment must continue in a fork.

Victor
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
To hide all implementation details, I propose to stop using macros and use function calls instead. For example, replace:

#define PyTuple_GET_ITEM(op, i) \
    (((PyTupleObject *)(op))->ob_item[i])

with:

#define PyTuple_GET_ITEM(op, i) PyTuple_GetItem(op, i)

With this change, C extensions using PyTuple_GET_ITEM() no longer dereference PyObject* nor access PyTupleObject.ob_item. For example, PyPy doesn't have to convert all tuple items to PyObject, but only create one PyObject for the requested item. Another example is that it becomes possible to use a "CPython debug runtime" which checks at runtime that the first argument is a tuple and that the index is valid. For a longer explanation, see the idea of different "Python runtimes":

https://pythoncapi.readthedocs.io/runtimes.html

Replacing macros with function calls is only a first step. It doesn't solve the problem of borrowed references, for example.

Obviously, such a change has a performance cost. Sadly, I didn't run a benchmark yet. At this point, I mostly care about correctness and the feasibility of the whole project. I also hope that the new C API will allow implementing new optimizations which cannot even be imagined today, because of the backward compatibility. The question is whether the performance balance is positive or not in the end :-)

Hopefully, there is no urgency to take any decision at this point. The whole project is experimental and can be cancelled anytime.

Victor
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On Fri, Nov 9, 2018 at 4:33 PM Victor Stinner wrote:
> It uses an opt-in option (Py_NEWCAPI define -- I'm not sure about the
> name) to get the new API. The current C API is unchanged.

While one can hope that this will be the only time the C API will be revised, it may be better to number it instead of calling it "NEW". 20 years from now, it won't feel new anymore.
[Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
Hi,

The current C API of Python is both a strength and a weakness of the Python ecosystem as a whole.

It's a strength because it makes it possible to quickly reuse a huge number of existing libraries by writing glue code for them. It made numpy possible, and that project is a big success!

It's a weakness because of its maintenance cost: it prevents optimizations, and more generally it prevents experimenting with changes to Python internals. For example, CPython cannot use tagged pointers, because the existing C API is heavily based on the ability to dereference a PyObject* object and access members of objects (like PyTupleObject) directly. For example, Py_INCREF() modifies PyObject.ob_refcnt *directly*.

It's also not possible to use a Python compiled in debug mode with C extensions (compiled in release mode), because the ABI is different in debug mode. As a consequence, nobody uses the debug mode, whereas it is very helpful for developing C extensions and investigating bugs.

I also consider that the C API gives too much work to PyPy (for their "cpyext" module). A better C API (not leaking implementation details) would make PyPy more efficient (and simplify its implementation in the long term, when support for the old C API can be removed). For example, PyList_GetItem(list, 0) currently converts all items of the list to PyObject* in PyPy, which can waste memory if only the first item of the list is needed. PyPy has much more efficient storage than an array of PyObject* for lists.

I wrote a website to explain all these issues in much more detail:

https://pythoncapi.readthedocs.io/

I identified "bad APIs", like using borrowed references or giving access to PyObject** (ex: PySequence_Fast_ITEMS). I already wrote an (incomplete) implementation of a new C API which doesn't leak implementation details:

https://github.com/pythoncapi/pythoncapi

It uses an opt-in option (the Py_NEWCAPI define -- I'm not sure about the name) to get the new API. The current C API is unchanged.

Ah, important points. I don't want to touch the current C API nor make it less efficient. And compatibility in both directions (current C API <=> new C API) is very important for me. There is no such plan as "Python 4" which would break the world and *force* everybody to upgrade to the new C API or stay on Python 3 forever. No. The new C API must be an opt-in option, and the current C API remains the default and will not be changed.

I have different ideas for the compatibility part, but I'm not sure what the best options are yet.

My short-term goal for the new C API is to ease experimentation with projects like tagged pointers. Currently, I have to maintain the implementation of a new C API myself, which is not really convenient.

--

Today I tried to abuse the Py_DEBUG define for the new C API, but it seems to be a bad idea:

https://github.com/python/cpython/pull/10435

A *new* define is needed to opt in to the new C API.

Victor
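The Py_INCREF problem described above can be made concrete with a mock (the names below are stand-ins, not the real CPython definitions). A macro hard-codes the ob_refcnt field access into every compiled extension, so the object layout can never change; a function call leaves the runtime free to change what happens behind it, for instance to recognize hypothetical tagged pointers:

```c
#include <stdint.h>

/* Mock object whose layout would be private to the runtime. */
typedef struct {
    intptr_t ob_refcnt;
} MockObject;

/* Macro style: every caller's binary bakes in the field access, so
 * the layout cannot change without recompiling the world. */
#define MOCK_INCREF(op) ((op)->ob_refcnt++)

/* Function style: the field access lives inside the runtime, which
 * could, for instance, skip pointers carrying a hypothetical tag bit
 * instead of dereferencing them. */
void mock_incref(MockObject *op)
{
    if (((intptr_t)op & 1) == 0) {  /* not a tagged pointer */
        op->ob_refcnt++;
    }
}
```

With the macro, adding tagged-pointer support would silently corrupt memory in every already-compiled extension; with the function, only the runtime needs to change.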