[Python-Dev] Re: subinterpreters and their possible impact on large extension projects
On 17. 12. 21 4:02, Jim J. Jewett wrote:
> Petr Viktorin wrote:
>>>> In Python 3.11, Python still implements around 100 types as "static
>>>> types" which are not compatible with subinterpreters, ...
>>>> seems like changing it may break the C API *and* the stable ABI
>>> If sub-interpreters each need their own copy of even immutable built-in
>>> types, then what advantage do they have over separate processes?
>> They need copies of all *Python* objects. A non-Python library may allow
>> several Python wrappers/proxies for a single internal object,
>> effectively sharing that object between subinterpreters.
>> (Which is a problem for removing the GIL -- currently all operations
>> done by such wrappers are protected by the GIL.)
>
> OK, so what is the advantage of having multiple interpreters?
>
> The only advantage I can see is that if you're embedding what are
> essentially several distinct python processes, you can still keep them
> all inside the single process used by the embedding program. But seems
> pretty far along the "they're already compiling anyhow; so the ABI
> isn't crucial" path.

You should be able to use Python as an implementation detail of a
library. For example, an application should be able to use several such
libraries, without their Python runtimes influencing each other.

See PEP 630 for some more details:
https://www.python.org/dev/peps/pep-0630/#motivation

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/CQ36ECT4PFXQMPDIIDHCG2YFYFCAXDPZ/
Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: subinterpreters and their possible impact on large extension projects
Petr Viktorin wrote:
>>> In Python 3.11, Python still implements around 100 types as "static
>>> types" which are not compatible with subinterpreters, ...
>>> seems like changing it may break the C API *and* the stable ABI
> > If sub-interpreters each need their own copy of even immutable built-in
> > types, then what advantage do they have over separate processes?
> They need copies of all *Python* objects. A non-Python library may allow
> several Python wrappers/proxies for a single internal object,
> effectively sharing that object between subinterpreters.
> (Which is a problem for removing the GIL -- currently all operations
> done by such wrappers are protected by the GIL.)

OK, so what is the advantage of having multiple interpreters?

The only advantage I can see is that if you're embedding what are
essentially several distinct python processes, you can still keep them
all inside the single process used by the embedding program. But seems
pretty far along the "they're already compiling anyhow; so the ABI
isn't crucial" path.

-jJ

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/C2Z2RPRAIGYDODATM5BQQL6DA6LEOVVN/
[Python-Dev] Re: subinterpreters and their possible impact on large extension projects
On Thu, 16 Dec 2021 11:38:28 -0700 Eric Snow wrote:
> On Thu, Dec 16, 2021 at 4:34 AM Antoine Pitrou wrote:
> > As a data point, in PyArrow, we have a bunch of C++ code that interacts
> > with Python but doesn't belong in a particular Python module. That C++
> > code can of course have global state, including perhaps Python objects.
>
> Thanks for that example!
>
> > What might be nice would be a C API to allow creating interpreter-local
> > opaque structs, for example:
> >
> >     void* Py_GetInterpreterLocal(const char* unique_name);
> >     void* Py_SetInterpreterLocal(const char* unique_name,
> >                                  void* ptr, void (*destructor)(void*));
>
> That's interesting. I can imagine that as just a step beyond the
> module state API, with the module being implicit. Do you think this
> would be an improvement over using module state? (I'm genuinely
> curious.)

It would certainly be much easier to use (you just have to choose a
unique name, like e.g. for capsules).

Regards

Antoine.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/4DIIJJKJBPGWL3UX5WNU35QRJX7U3BBA/
[Python-Dev] Re: subinterpreters and their possible impact on large extension projects
On Thu, Dec 16, 2021 at 4:34 AM Antoine Pitrou wrote:
> As a data point, in PyArrow, we have a bunch of C++ code that interacts
> with Python but doesn't belong in a particular Python module. That C++
> code can of course have global state, including perhaps Python objects.

Thanks for that example!

> What might be nice would be a C API to allow creating interpreter-local
> opaque structs, for example:
>
>     void* Py_GetInterpreterLocal(const char* unique_name);
>     void* Py_SetInterpreterLocal(const char* unique_name,
>                                  void* ptr, void (*destructor)(void*));

That's interesting. I can imagine that as just a step beyond the
module state API, with the module being implicit. Do you think this
would be an improvement over using module state? (I'm genuinely
curious.)

-eric

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/KB7ET6XXJFTJDBHL7ABEPSGTD3M2RNAW/
[Python-Dev] Re: subinterpreters and their possible impact on large extension projects
On Thu, 16 Dec 2021 13:25:53 +0100 Petr Viktorin wrote:
> On 16. 12. 21 12:33, Antoine Pitrou wrote:
> > On Tue, 14 Dec 2021 10:38:25 -0700
> > Eric Snow wrote:
> >>
> >> So we (the core devs) would effectively be requiring those extensions
> >> to support subinterpreters, regardless of letting them opt out. This
> >> situation has been weighing heavily on my mind since Nathaniel brought
> >> this up. Here are some ideas I've had or heard of about what we could
> >> do to help:
> >>
> >> * add a page to the C-API documentation about how to support
> >>   subinterpreters
> >> * identify the extensions most likely to be impacted and offer to help
> >> * add more helpers to the C-API to make adding subinterpreter support
> >>   less painful
> >> * fall back to loading the extension in its own namespace (e.g. use
> >>   dlmopen())
> >> * fall back to copying the extension's file and loading from the copied
> >>   file
> >> * ...
> >
> > As a data point, in PyArrow, we have a bunch of C++ code that interacts
> > with Python but doesn't belong in a particular Python module. That C++
> > code can of course have global state, including perhaps Python objects.
> >
> > What might be nice would be a C API to allow creating interpreter-local
> > opaque structs, for example:
> >
> >     void* Py_GetInterpreterLocal(const char* unique_name);
> >     void* Py_SetInterpreterLocal(const char* unique_name,
> >                                  void* ptr, void (*destructor)(void*));
> >
> > Then in extension code you'd be able to write, e.g.:
>
> What's the reason these can't be tied to the module?

Because the module is simply not known. This is C++ utility code called
from several different Cython extension modules.

> (As the author of PEP 630, which argues that module state is the best
> default place for this kind of mutable "globals", I'm interested in the
> cases where it isn't so.)

That works in a world where all third-party code using the CPython C API
lives in a particular extension module. While it is certainly the most
common case, I doubt it is universal.

> How do you ensure these Python objects are destroyed by/before
> Py_Finalize()? (If you do that -- I realize it's not something people
> typically think about.)

When finalizing a given (sub)interpreter, it would visit all registered
interpreter locals and call their "destructor" callback.

Regards

Antoine.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/R4FKSJZI366ZO76OUDF4WCY6RQRYMNCG/
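The finalization behaviour described in this reply (each interpreter visits its registered locals and calls their destructor) can be sketched in plain C++ without the CPython headers. This is only an illustration of the proposed semantics: `FakeInterpreter`, `get_local`, `set_local`, `finalize`, `DemoState` and `destroyed_count` are hypothetical names invented here, and no such API exists in CPython. The sketch also assumes that setting a name twice destroys the previous value, which the proposal does not specify.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for one (sub)interpreter's table of named locals.
class FakeInterpreter {
public:
    using Destructor = void (*)(void*);

    FakeInterpreter() = default;
    FakeInterpreter(const FakeInterpreter&) = delete;
    FakeInterpreter& operator=(const FakeInterpreter&) = delete;
    ~FakeInterpreter() { finalize(); }

    // Plays the role of the proposed Py_GetInterpreterLocal:
    // returns nullptr when nothing is registered under the name.
    void* get_local(const std::string& name) const {
        auto it = locals_.find(name);
        return it == locals_.end() ? nullptr : it->second.ptr;
    }

    // Plays the role of the proposed Py_SetInterpreterLocal.
    // Assumption: an existing value under the same name is destroyed first.
    void set_local(const std::string& name, void* ptr, Destructor destructor) {
        auto it = locals_.find(name);
        if (it != locals_.end()) {
            destroy(it->second);
            locals_.erase(it);
        }
        locals_[name] = Slot{ptr, destructor};
    }

    // Visit every registered local and call its destructor callback.
    void finalize() {
        for (auto& entry : locals_) {
            destroy(entry.second);
        }
        locals_.clear();
    }

private:
    struct Slot {
        void* ptr;
        Destructor destructor;
    };

    static void destroy(const Slot& slot) {
        if (slot.ptr != nullptr && slot.destructor != nullptr) {
            slot.destructor(slot.ptr);
        }
    }

    std::map<std::string, Slot> locals_;
};

// Hypothetical per-interpreter state owned by shared utility code.
struct DemoState {
    int value = 0;
};

int destroyed_count = 0;  // counts destructor calls, for the demo only

void destroy_demo_state(void* ptr) {
    delete static_cast<DemoState*>(ptr);
    ++destroyed_count;
}

// Lazily create the state the first time a given interpreter asks for it.
DemoState* get_state(FakeInterpreter& interp) {
    void* raw = interp.get_local("demo.state");
    if (raw == nullptr) {
        raw = new DemoState();
        interp.set_local("demo.state", raw, destroy_demo_state);
    }
    return static_cast<DemoState*>(raw);
}
```

The point of the design is that the utility code only needs a unique name and the current interpreter, not a module object, which is the situation described for code shared by several extension modules.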
[Python-Dev] Re: subinterpreters and their possible impact on large extension projects
On 16. 12. 21 12:33, Antoine Pitrou wrote:
> On Tue, 14 Dec 2021 10:38:25 -0700
> Eric Snow wrote:
>>
>> So we (the core devs) would effectively be requiring those extensions
>> to support subinterpreters, regardless of letting them opt out. This
>> situation has been weighing heavily on my mind since Nathaniel brought
>> this up. Here are some ideas I've had or heard of about what we could
>> do to help:
>>
>> * add a page to the C-API documentation about how to support
>>   subinterpreters
>> * identify the extensions most likely to be impacted and offer to help
>> * add more helpers to the C-API to make adding subinterpreter support
>>   less painful
>> * fall back to loading the extension in its own namespace (e.g. use
>>   dlmopen())
>> * fall back to copying the extension's file and loading from the copied
>>   file
>> * ...
>
> As a data point, in PyArrow, we have a bunch of C++ code that interacts
> with Python but doesn't belong in a particular Python module. That C++
> code can of course have global state, including perhaps Python objects.
>
> What might be nice would be a C API to allow creating interpreter-local
> opaque structs, for example:
>
>     void* Py_GetInterpreterLocal(const char* unique_name);
>     void* Py_SetInterpreterLocal(const char* unique_name,
>                                  void* ptr, void (*destructor)(void*));
>
> Then in extension code you'd be able to write, e.g.:

What's the reason these can't be tied to the module?

(As the author of PEP 630, which argues that module state is the best
default place for this kind of mutable "globals", I'm interested in the
cases where it isn't so.)

How do you ensure these Python objects are destroyed by/before
Py_Finalize()? (If you do that -- I realize it's not something people
typically think about.)

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/QUHYVSXGQHXONLRLZNOMH5VMRS44PHPO/
[Python-Dev] Re: subinterpreters and their possible impact on large extension projects
On Tue, 14 Dec 2021 10:38:25 -0700 Eric Snow wrote:
>
> So we (the core devs) would effectively be requiring those extensions
> to support subinterpreters, regardless of letting them opt out. This
> situation has been weighing heavily on my mind since Nathaniel brought
> this up. Here are some ideas I've had or heard of about what we could
> do to help:
>
> * add a page to the C-API documentation about how to support subinterpreters
> * identify the extensions most likely to be impacted and offer to help
> * add more helpers to the C-API to make adding subinterpreter support
>   less painful
> * fall back to loading the extension in its own namespace (e.g. use
>   dlmopen())
> * fall back to copying the extension's file and loading from the copied file
> * ...

As a data point, in PyArrow, we have a bunch of C++ code that interacts
with Python but doesn't belong in a particular Python module. That C++
code can of course have global state, including perhaps Python objects.

What might be nice would be a C API to allow creating interpreter-local
opaque structs, for example:

    void* Py_GetInterpreterLocal(const char* unique_name);
    void* Py_SetInterpreterLocal(const char* unique_name,
                                 void* ptr, void (*destructor)(void*));

Then in extension code you'd be able to write, e.g.:

    typedef struct {
        PyObject* cached_decimal;
        // Element type was lost in the archive; int is assumed here.
        std::vector<int> some_other_data;
    } MyLocalState;

    // Called by the interpreter at finalization (proposed behaviour).
    static void destroy_state(void* ptr) {
        MyLocalState* state = static_cast<MyLocalState*>(ptr);
        Py_XDECREF(state->cached_decimal);
        delete state;
    }

    static MyLocalState* get_state() {
        MyLocalState* state = static_cast<MyLocalState*>(
            Py_GetInterpreterLocal("pyarrow._lib.internal"));
        if (state == NULL) {
            // "new T()" value-initializes, so cached_decimal starts as NULL.
            state = new MyLocalState();
            Py_SetInterpreterLocal("pyarrow._lib.internal", state,
                                   destroy_state);
        }
        return state;
    }

    // Returns a borrowed reference, or NULL if the import failed.
    PyObject* get_decimal_module() {
        MyLocalState* state = get_state();
        if (state->cached_decimal == NULL) {
            state->cached_decimal = PyImport_ImportModule("decimal");
        }
        return state->cached_decimal;
    }

>
> I'd appreciate your thoughts on what we can do to help. Thanks!
>
> -eric

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/TTVWSOOVMY2AKYARI5VSQ3RF4U7DHLX6/
[Python-Dev] Re: subinterpreters and their possible impact on large extension projects
On 16. 12. 21 3:41, Jim J. Jewett wrote:
>> In Python 3.11, Python still implements around 100 types as "static
>> types" which are not compatible with subinterpreters, like
>> _Type and _Type. I opened
>> https://bugs.python.org/issue40601 about these static types, but it
>> seems like changing it may break the C API *and* the stable ABI (maybe
>> a clever hack will avoid that).
>
> If sub-interpreters each need their own copy of even immutable built-in
> types, then what advantage do they have over separate processes?

They need copies of all *Python* objects. A non-Python library may allow
several Python wrappers/proxies for a single internal object,
effectively sharing that object between subinterpreters.
(Which is a problem for removing the GIL -- currently all operations
done by such wrappers are protected by the GIL.)

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/KNMHKD3EXPXIMYEQOHEQ76DK64YRNQQX/
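The wrapper/proxy arrangement mentioned in this reply can be pictured with a small plain C++ sketch. It assumes nothing about CPython: `NativeCounter` stands for an object owned by a non-Python library and `CounterWrapper` for the per-interpreter Python-side proxy; both names are hypothetical. The mutex is there because, as the message notes, such shared objects currently rely on the GIL for protection and would need their own locking without it.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical object living inside a non-Python library.
class NativeCounter {
public:
    void increment() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++count_;
    }

    int count() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return count_;
    }

private:
    mutable std::mutex mutex_;  // stands in for protection the GIL gives today
    int count_ = 0;
};

// Hypothetical proxy: one instance per interpreter, all pointing at the
// same native object, so the native object is effectively shared.
struct CounterWrapper {
    int interpreter_id;
    std::shared_ptr<NativeCounter> native;

    void increment() { native->increment(); }
};
```

Each interpreter gets its own wrapper (its own "Python object"), while the underlying library object exists once.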
[Python-Dev] Re: subinterpreters and their possible impact on large extension projects
> In Python 3.11, Python still implements around 100 types as "static
> types" which are not compatible with subinterpreters, like
> _Type and _Type. I opened
> https://bugs.python.org/issue40601 about these static types, but it
> seems like changing it may break the C API *and* the stable ABI (maybe
> a clever hack will avoid that).

If sub-interpreters each need their own copy of even immutable built-in
types, then what advantage do they have over separate processes?

-jJ

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/B7WO5B426HBTG6KZVKQXTJSBQL2S2ILQ/
[Python-Dev] Re: subinterpreters and their possible impact on large extension projects
Hi Brett,

IMO PEP 630 is a good summary and a practical guide explaining how to
port existing C extensions to the newer C APIs which are compatible
with subinterpreters, with unloading a C extension, and with loading a
C extension more than once (in the same interpreter):
https://www.python.org/dev/peps/pep-0630/

I dislike justifying these changes only with subinterpreters. IMO it's
better to justify that these changes are needed to be able to load and
unload Python cleanly when an application embeds Python. A common use
case is to support plugins in different programming languages,
including Python. Some IRC clients and text editors have this use case.
It's unpleasant when Python leaks a lot of memory when Python is
"unloaded", especially if the application is designed to load Python
once per plugin. It's even more unpleasant when... there are bugs :-(

Examples of changes needed by PEP 630:

* Add a state to the module: use a PyModuleDef.m_size value greater
  than 0 (usually a structure is used with
  ".m_size = sizeof(_abcmodule_state)")
* Convert static types to heap types and store these types in the
  module state
* Move global variables and function "static" variables into the module
  state
* IMO the most complicated part is to retrieve the module state from
  functions which don't directly get the module as an argument, but
  only an instance of a type defined in the module. In some type "slot"
  functions, you will need to use ... a *private* function... which was
  only added recently, in *Python 3.10*: _PyType_GetModuleByDef().

IMO the big problem of _PyType_GetModuleByDef() is that developers want
to support Python versions older than Python 3.10. For example, right
now, numpy supports Python 3.7 and newer. Moreover, the fact that the
function remains private in Python 3.11 is also an issue.

Another challenge is how to check if a C extension is "fully"
compatible with subinterpreters?
In Python 3.11, Python still implements around 100 types as "static
types" which are not compatible with subinterpreters, like _Type and
_Type. I opened https://bugs.python.org/issue40601 about these static
types, but it seems like changing it may break the C API *and* the
stable ABI (maybe a clever hack will avoid that).

One idea would be to add a macro excluding functions known to be unsafe
with subinterpreters from the C API. For example, exclude "PyLong_Type"
if the Py_SUBINTERPRETER_API macro is defined. These static types are
accessed directly, but also indirectly. For example,
PyLong_CheckExact() is implemented as a macro which directly accesses
this type. Should we remove this function from the C API? Or implement
it as a regular "opaque" function if Py_SUBINTERPRETER_API is defined?

I would prefer an error at build time, rather than a crash at runtime
:-(

Victor

On Wed, Dec 15, 2021 at 12:06 AM Brett Cannon wrote:
>
> On Tue, Dec 14, 2021 at 9:41 AM Eric Snow wrote:
>>
>> One of the open questions relative to subinterpreters is: how to
>> reduce the amount of work required for extension modules to support
>> them? Thanks to Petr Viktorin for a lot of work he's done in this
>> area (e.g. PEP 489)! Extensions also have the option to opt out of
>> subinterpreter support.
>>
>> However, that's only one part of the story. A while back Nathaniel
>> expressed concerns with how making subinterpreters more accessible
>> will have a negative side effect affecting projects that publish large
>> extensions, e.g. numpy. Not all extensions support subinterpreters
>> due to global state (incl. in library dependencies). The amount of
>> work to get there may be large. As subinterpreters increase in usage
>> in the community, so will demand increase for subinterpreter support
>> in those extensions. Consequently, such projects would be pressured
>> to do the extra work (which is made even more stressful by the
>> short-handed nature of most open source projects).
>>
>> So we (the core devs) would effectively be requiring those extensions
>> to support subinterpreters, regardless of letting them opt out. This
>> situation has been weighing heavily on my mind since Nathaniel brought
>> this up. Here are some ideas I've had or heard of about what we could
>> do to help:
>>
>> * add a page to the C-API documentation about how to support subinterpreters
>> * identify the extensions most likely to be impacted and offer to help
>> * add more helpers to the C-API to make adding subinterpreter support
>>   less painful
>> * fall back to loading the extension in its own namespace (e.g. use
>>   dlmopen())
>> * fall back to copying the extension's file and loading from the copied file
>> * ...
>>
>> I'd appreciate your thoughts on what we can do to help. Thanks!
>
> What are the requirements put upon an extension in order to support
> subinterpreters? you hint at global state at the C level, but nothing else is
> mentioned. Is that it?
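One of the porting steps listed in Victor's message, moving global and function "static" variables into the module state, can be illustrated in plain C++ without the CPython headers. This is a minimal sketch of the idea only: `DemoModuleState`, `next_id_shared` and `next_id` are hypothetical names, and a real extension would keep the struct in the memory reserved through `PyModuleDef.m_size` rather than passing it around by reference.

```cpp
#include <cassert>

// Before: the counter is a function-static variable, so every caller in
// the process (and hence every interpreter) shares one hidden value.
int next_id_shared() {
    static int counter = 0;
    return ++counter;
}

// After: the counter lives in a per-module state struct. Each module
// instance (one per interpreter) owns its own copy.
struct DemoModuleState {
    int counter = 0;
};

int next_id(DemoModuleState& state) {
    return ++state.counter;
}
```

With the second form, two module instances loaded in different interpreters no longer influence each other, which is the property the thread is after.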
[Python-Dev] Re: subinterpreters and their possible impact on large extension projects
Yeah, no (mutable) global state at the C level. It would also be good
to implement multi-phase init (PEP 489), but I don't expect that to
require much work itself.

-eric

On Tue, Dec 14, 2021 at 4:04 PM Brett Cannon wrote:
>
> On Tue, Dec 14, 2021 at 9:41 AM Eric Snow wrote:
>>
>> One of the open questions relative to subinterpreters is: how to
>> reduce the amount of work required for extension modules to support
>> them? Thanks to Petr Viktorin for a lot of work he's done in this
>> area (e.g. PEP 489)! Extensions also have the option to opt out of
>> subinterpreter support.
>>
>> However, that's only one part of the story. A while back Nathaniel
>> expressed concerns with how making subinterpreters more accessible
>> will have a negative side effect affecting projects that publish large
>> extensions, e.g. numpy. Not all extensions support subinterpreters
>> due to global state (incl. in library dependencies). The amount of
>> work to get there may be large. As subinterpreters increase in usage
>> in the community, so will demand increase for subinterpreter support
>> in those extensions. Consequently, such projects would be pressured
>> to do the extra work (which is made even more stressful by the
>> short-handed nature of most open source projects).
>>
>> So we (the core devs) would effectively be requiring those extensions
>> to support subinterpreters, regardless of letting them opt out. This
>> situation has been weighing heavily on my mind since Nathaniel brought
>> this up. Here are some ideas I've had or heard of about what we could
>> do to help:
>>
>> * add a page to the C-API documentation about how to support subinterpreters
>> * identify the extensions most likely to be impacted and offer to help
>> * add more helpers to the C-API to make adding subinterpreter support
>>   less painful
>> * fall back to loading the extension in its own namespace (e.g. use
>>   dlmopen())
>> * fall back to copying the extension's file and loading from the copied file
>> * ...
>>
>> I'd appreciate your thoughts on what we can do to help. Thanks!
>
> What are the requirements put upon an extension in order to support
> subinterpreters? you hint at global state at the C level, but nothing else is
> mentioned. Is that it?

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BQU3PVN6MHR2P24RAUPJSWFS547W7FPM/
[Python-Dev] Re: subinterpreters and their possible impact on large extension projects
On Tue, Dec 14, 2021 at 9:41 AM Eric Snow wrote:
> One of the open questions relative to subinterpreters is: how to
> reduce the amount of work required for extension modules to support
> them? Thanks to Petr Viktorin for a lot of work he's done in this
> area (e.g. PEP 489)! Extensions also have the option to opt out of
> subinterpreter support.
>
> However, that's only one part of the story. A while back Nathaniel
> expressed concerns with how making subinterpreters more accessible
> will have a negative side effect affecting projects that publish large
> extensions, e.g. numpy. Not all extensions support subinterpreters
> due to global state (incl. in library dependencies). The amount of
> work to get there may be large. As subinterpreters increase in usage
> in the community, so will demand increase for subinterpreter support
> in those extensions. Consequently, such projects would be pressured
> to do the extra work (which is made even more stressful by the
> short-handed nature of most open source projects).
>
> So we (the core devs) would effectively be requiring those extensions
> to support subinterpreters, regardless of letting them opt out. This
> situation has been weighing heavily on my mind since Nathaniel brought
> this up. Here are some ideas I've had or heard of about what we could
> do to help:
>
> * add a page to the C-API documentation about how to support
>   subinterpreters
> * identify the extensions most likely to be impacted and offer to help
> * add more helpers to the C-API to make adding subinterpreter support
>   less painful
> * fall back to loading the extension in its own namespace (e.g. use
>   dlmopen())
> * fall back to copying the extension's file and loading from the copied
>   file
> * ...
>
> I'd appreciate your thoughts on what we can do to help. Thanks!

What *are* the requirements put upon an extension in order to support
subinterpreters? You hint at global state at the C level, but nothing
else is mentioned. Is that it?

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/S4QV6SYRGEN7IZZX6YBLS3DQRNLRGCKH/