[Python-Dev] Re: subinterpreters and their possible impact on large extension projects

2021-12-17 Thread Petr Viktorin

On 17. 12. 21 4:02, Jim J. Jewett wrote:
> Petr Viktorin wrote:
>>>> In Python 3.11, Python still implements around 100 types as "static
>>>> types" which are not compatible with subinterpreters,
> ...
>>>> seems like changing it may break the C API *and* the stable ABI
>
>>> If sub-interpreters each need their own copy of even immutable built-in
>>> types, then what advantage do they have over separate processes?
>
>> They need copies of all *Python* objects. A non-Python library may allow
>> several Python wrappers/proxies for a single internal object,
>> effectively sharing that object between subinterpreters.
>> (Which is a problem for removing the GIL -- currently all operations
>> done by such wrappers are protected by the GIL.)
>
> OK, so what is the advantage of having multiple interpreters?
>
> The only advantage I can see is that if you're embedding what are essentially
> several distinct python processes, you can still keep them all inside the
> single process used by the embedding program.  But seems pretty far along the
> "they're already compiling anyhow; so the ABI isn't crucial" path.

You should be able to use Python as an implementation detail of a
library. For example, an application should be able to use several such
libraries, without their Python runtimes influencing each other.


See PEP 630 for some more details: 
https://www.python.org/dev/peps/pep-0630/#motivation

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CQ36ECT4PFXQMPDIIDHCG2YFYFCAXDPZ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: subinterpreters and their possible impact on large extension projects

2021-12-16 Thread Jim J. Jewett
Petr Viktorin wrote:
>>> In Python 3.11, Python still implements around 100 types as "static
>>> types" which are not compatible with subinterpreters,
...
>>> seems like changing it may break the C API *and* the stable ABI

> > If sub-interpreters each need their own copy of even immutable built-in 
> > types, then what advantage do they have over separate processes?

> They need copies of all *Python* objects. A non-Python library may allow 
> several Python wrappers/proxies for a single internal object, 
> effectively sharing that object between subinterpreters.
> (Which is a problem for removing the GIL -- currently all operations 
> done by such wrappers are protected by the GIL.)

OK, so what is the advantage of having multiple interpreters?

The only advantage I can see is that if you're embedding what are essentially 
several distinct python processes, you can still keep them all inside the 
single process used by the embedding program.  But seems pretty far along the 
"they're already compiling anyhow; so the ABI isn't crucial" path.

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/C2Z2RPRAIGYDODATM5BQQL6DA6LEOVVN/


[Python-Dev] Re: subinterpreters and their possible impact on large extension projects

2021-12-16 Thread Antoine Pitrou
On Thu, 16 Dec 2021 11:38:28 -0700
Eric Snow  wrote:

> On Thu, Dec 16, 2021 at 4:34 AM Antoine Pitrou  wrote:
> > As a data point, in PyArrow, we have a bunch of C++ code that interacts
> > with Python but doesn't belong in a particular Python module.  That C++
> > code can of course have global state, including perhaps Python objects.  
> 
> Thanks for that example!
> 
> > What might be nice would be a C API to allow creating interpreter-local
> > opaque structs, for example:
> >
> > void* Py_GetInterpreterLocal(const char* unique_name);
> > void* Py_SetInterpreterLocal(const char* unique_name,
> >  void* ptr, void (*destructor)(void*));
> 
> That's interesting.  I can imagine that as just a step beyond the
> module state API, with the module being implicit.  Do you think this
> would be an improvement over using module state?  (I'm genuinely
> curious.)

It would certainly be much easier to use (you just have to choose a
unique name, like e.g. for capsules).

Regards

Antoine.


___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4DIIJJKJBPGWL3UX5WNU35QRJX7U3BBA/


[Python-Dev] Re: subinterpreters and their possible impact on large extension projects

2021-12-16 Thread Eric Snow
On Thu, Dec 16, 2021 at 4:34 AM Antoine Pitrou  wrote:
> As a data point, in PyArrow, we have a bunch of C++ code that interacts
> with Python but doesn't belong in a particular Python module.  That C++
> code can of course have global state, including perhaps Python objects.

Thanks for that example!

> What might be nice would be a C API to allow creating interpreter-local
> opaque structs, for example:
>
> void* Py_GetInterpreterLocal(const char* unique_name);
> void* Py_SetInterpreterLocal(const char* unique_name,
>  void* ptr, void (*destructor)(void*));

That's interesting.  I can imagine that as just a step beyond the
module state API, with the module being implicit.  Do you think this
would be an improvement over using module state?  (I'm genuinely
curious.)

-eric
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KB7ET6XXJFTJDBHL7ABEPSGTD3M2RNAW/


[Python-Dev] Re: subinterpreters and their possible impact on large extension projects

2021-12-16 Thread Antoine Pitrou
On Thu, 16 Dec 2021 13:25:53 +0100
Petr Viktorin  wrote:
> On 16. 12. 21 12:33, Antoine Pitrou wrote:
> > On Tue, 14 Dec 2021 10:38:25 -0700
> > Eric Snow  wrote:  
> >>
> >> So we (the core devs) would effectively be requiring those extensions
> >> to support subinterpreters, regardless of letting them opt out.  This
> >> situation has been weighing heavily on my mind since Nathaniel brought
> >> this up.  Here are some ideas I've had or heard of about what we could
> >> do to help:
> >>
> >> * add a page to the C-API documentation about how to support 
> >> subinterpreters
> >> * identify the extensions most likely to be impacted and offer to help
> >> * add more helpers to the C-API to make adding subinterpreter support
> >> less painful
> >> * fall back to loading the extension in its own namespace (e.g. use
> >> dlmopen())
> >> * fall back to copying the extension's file and loading from the copied 
> >> file
> >> * ...  
> > 
> > As a data point, in PyArrow, we have a bunch of C++ code that interacts
> > with Python but doesn't belong in a particular Python module.  That C++
> > code can of course have global state, including perhaps Python objects.
> > 
> > What might be nice would be a C API to allow creating interpreter-local
> > opaque structs, for example:
> > 
> > void* Py_GetInterpreterLocal(const char* unique_name);
> > void* Py_SetInterpreterLocal(const char* unique_name,
> >   void* ptr, void (*destructor)(void*));
> > 
> > 
> > Then in extension code you'd be able to write, e.g.:
> 
> What's the reason these can't be tied to the module?

Because the module is simply not known.  This is C++ utility code
called from several different Cython extension modules.

> (As the author of PEP 630, which argues that module state is the best 
> default place for this kind of mutable "globals", I'm interested in the 
> cases where it isn't so.)

That works in a world where all third-party code using the CPython C
API lives in a particular extension module.  While it is certainly the
most common case, I doubt it is universal.

> How do you ensure these Python objects are destroyed by/before 
> Py_Finalize()? (If you do that -- I realize it's not something people 
> typically think about.)

When finalizing a given (sub)interpreter, it would visit all registered
interpreter locals and call their "destructor" callback.
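
To make the finalization behaviour concrete, here is a minimal sketch of how such a registry could work. It is a toy model in plain C++ and does not use the CPython API: FakeInterpreter, GetInterpreterLocal, SetInterpreterLocal, FinalizeInterpreter, Counter and DestroyCounter are all hypothetical names invented for illustration, and returning the previously registered pointer from the setter is an assumption, since the proposal does not say what the return value means.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a (sub)interpreter; not a CPython type.
struct FakeInterpreter {
    struct Local {
        void* ptr;
        void (*destructor)(void*);
    };
    // Interpreter-local slots, keyed by the caller-chosen unique name.
    std::map<std::string, Local> locals;
};

// Returns the pointer registered under `name`, or nullptr if there is none.
void* GetInterpreterLocal(FakeInterpreter& interp, const std::string& name) {
    auto it = interp.locals.find(name);
    return it == interp.locals.end() ? nullptr : it->second.ptr;
}

// Registers `ptr` under `name` and returns the previously registered pointer
// (nullptr if none). Assumption: the old value is handed back, not destroyed.
void* SetInterpreterLocal(FakeInterpreter& interp, const std::string& name,
                          void* ptr, void (*destructor)(void*)) {
    void* previous = GetInterpreterLocal(interp, name);
    interp.locals[name] = FakeInterpreter::Local{ptr, destructor};
    return previous;
}

// Finalization: visit every registered local and call its destructor.
void FinalizeInterpreter(FakeInterpreter& interp) {
    for (auto& entry : interp.locals) {
        if (entry.second.destructor != nullptr) {
            entry.second.destructor(entry.second.ptr);
        }
    }
    interp.locals.clear();
}

// Example payload and destructor callback.
struct Counter {
    int value = 0;
};

int g_destroyed = 0;  // Counts destructor calls, for demonstration only.

void DestroyCounter(void* p) {
    delete static_cast<Counter*>(p);
    ++g_destroyed;
}
```

In this model, state registered in one interpreter is invisible to another, and finalizing an interpreter releases only what was registered in it.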

Regards

Antoine.


___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/R4FKSJZI366ZO76OUDF4WCY6RQRYMNCG/


[Python-Dev] Re: subinterpreters and their possible impact on large extension projects

2021-12-16 Thread Petr Viktorin

On 16. 12. 21 12:33, Antoine Pitrou wrote:
> On Tue, 14 Dec 2021 10:38:25 -0700
> Eric Snow  wrote:
>>
>> So we (the core devs) would effectively be requiring those extensions
>> to support subinterpreters, regardless of letting them opt out.  This
>> situation has been weighing heavily on my mind since Nathaniel brought
>> this up.  Here are some ideas I've had or heard of about what we could
>> do to help:
>>
>> * add a page to the C-API documentation about how to support subinterpreters
>> * identify the extensions most likely to be impacted and offer to help
>> * add more helpers to the C-API to make adding subinterpreter support
>> less painful
>> * fall back to loading the extension in its own namespace (e.g. use dlmopen())
>> * fall back to copying the extension's file and loading from the copied file
>> * ...
>
> As a data point, in PyArrow, we have a bunch of C++ code that interacts
> with Python but doesn't belong in a particular Python module.  That C++
> code can of course have global state, including perhaps Python objects.
>
> What might be nice would be a C API to allow creating interpreter-local
> opaque structs, for example:
>
> void* Py_GetInterpreterLocal(const char* unique_name);
> void* Py_SetInterpreterLocal(const char* unique_name,
>   void* ptr, void (*destructor)(void*));
>
>
> Then in extension code you'd be able to write, e.g.:

What's the reason these can't be tied to the module?
(As the author of PEP 630, which argues that module state is the best
default place for this kind of mutable "globals", I'm interested in the
cases where it isn't so.)

How do you ensure these Python objects are destroyed by/before
Py_Finalize()? (If you do that -- I realize it's not something people
typically think about.)

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QUHYVSXGQHXONLRLZNOMH5VMRS44PHPO/


[Python-Dev] Re: subinterpreters and their possible impact on large extension projects

2021-12-16 Thread Antoine Pitrou
On Tue, 14 Dec 2021 10:38:25 -0700
Eric Snow  wrote:
> 
> So we (the core devs) would effectively be requiring those extensions
> to support subinterpreters, regardless of letting them opt out.  This
> situation has been weighing heavily on my mind since Nathaniel brought
> this up.  Here are some ideas I've had or heard of about what we could
> do to help:
> 
> * add a page to the C-API documentation about how to support subinterpreters
> * identify the extensions most likely to be impacted and offer to help
> * add more helpers to the C-API to make adding subinterpreter support
> less painful
> * fall back to loading the extension in its own namespace (e.g. use
> dlmopen())
> * fall back to copying the extension's file and loading from the copied file
> * ...

As a data point, in PyArrow, we have a bunch of C++ code that interacts
with Python but doesn't belong in a particular Python module.  That C++
code can of course have global state, including perhaps Python objects.

What might be nice would be a C API to allow creating interpreter-local
opaque structs, for example:

void* Py_GetInterpreterLocal(const char* unique_name);
void* Py_SetInterpreterLocal(const char* unique_name,
                             void* ptr, void (*destructor)(void*));


Then in extension code you'd be able to write, e.g.:


#include <Python.h>
#include <vector>

struct MyLocalState {
  PyObject* cached_decimal = nullptr;
  // The element type was lost in the archive; int is only a placeholder.
  std::vector<int> some_other_data;
};

// Destructor callback: receives the opaque pointer registered below.
static void destroy_state(void* ptr) {
  MyLocalState* state = static_cast<MyLocalState*>(ptr);
  Py_XDECREF(state->cached_decimal);
  delete state;
}

static MyLocalState* get_state() {
  MyLocalState* state = static_cast<MyLocalState*>(
      Py_GetInterpreterLocal("pyarrow._lib.internal"));
  if (state == NULL) {
    state = new MyLocalState;
    Py_SetInterpreterLocal("pyarrow._lib.internal", state,
                           destroy_state);
  }
  return state;
}


// Returns a borrowed reference, or NULL with an exception set if the
// import failed.
PyObject* get_decimal_module() {
  MyLocalState* state = get_state();
  if (state->cached_decimal == NULL) {
    state->cached_decimal = PyImport_ImportModule("decimal");
  }
  return state->cached_decimal;
}



> 
>  I'd appreciate your thoughts on what we can do to help.  Thanks!
> 
> -eric



___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TTVWSOOVMY2AKYARI5VSQ3RF4U7DHLX6/


[Python-Dev] Re: subinterpreters and their possible impact on large extension projects

2021-12-16 Thread Petr Viktorin

On 16. 12. 21 3:41, Jim J. Jewett wrote:
>> In Python 3.11, Python still implements around 100 types as "static
>> types" which are not compatible with subinterpreters, like
>> _Type and _Type. I opened
>> https://bugs.python.org/issue40601 about these static types, but it
>> seems like changing it may break the C API *and* the stable ABI (maybe
>> a clever hack will avoid that).
>
> If sub-interpreters each need their own copy of even immutable built-in
> types, then what advantage do they have over separate processes?

They need copies of all *Python* objects. A non-Python library may allow
several Python wrappers/proxies for a single internal object,
effectively sharing that object between subinterpreters.
(Which is a problem for removing the GIL -- currently all operations
done by such wrappers are protected by the GIL.)

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KNMHKD3EXPXIMYEQOHEQ76DK64YRNQQX/


[Python-Dev] Re: subinterpreters and their possible impact on large extension projects

2021-12-15 Thread Jim J. Jewett
> In Python 3.11, Python still implements around 100 types as "static
> types" which are not compatible with subinterpreters, like
> _Type and _Type. I opened
> https://bugs.python.org/issue40601 about these static types, but it
> seems like changing it may break the C API *and* the stable ABI (maybe
> a clever hack will avoid that).

If sub-interpreters each need their own copy of even immutable built-in types, 
then what advantage do they have over separate processes?

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/B7WO5B426HBTG6KZVKQXTJSBQL2S2ILQ/


[Python-Dev] Re: subinterpreters and their possible impact on large extension projects

2021-12-15 Thread Victor Stinner
Hi Brett,

IMO PEP 630 is a good summary and a practical guide explaining how
to port existing C extensions to the newer C APIs, which are compatible
with subinterpreters, with unloading a C extension, and with loading a C
extension more than once (in the same interpreter):
https://www.python.org/dev/peps/pep-0630/

I dislike justifying these changes only with subinterpreters. IMO it's
better to justify them as needed to load and unload Python cleanly
when an application embeds Python. A common use
case is to support plugins in different programming languages,
including Python. Some IRC clients and text editors have this use
case. It's unpleasant when Python leaks a lot of memory when it is
"unloaded", especially if the application is designed to load Python
once per plugin. It's even more unpleasant when... there are bugs :-(

Examples of changes needed by the PEP 630:

* Add a state to the module: use a PyModuleDef.m_size value greater
than 0 (usually a structure is used with ".m_size =
sizeof(_abcmodule_state)")
* Convert static types to heap types and store these types in the module state
* Move global variables and function-level "static" variables into the
module state
* IMO the most complicated part is retrieving the module state from
functions which don't directly get the module as an argument, but only
an instance of a type defined in the module. In some type "slot"
functions, you will need to use ... a *private* function... which was
only added recently, in *Python 3.10*: _PyType_GetModuleByDef().

IMO the big problem with _PyType_GetModuleByDef() is that developers
want to support Python versions older than Python 3.10. For example,
right now, numpy supports Python 3.7 and newer. Moreover, the fact
that the function remains private in Python 3.11 is also an issue.

Another challenge is how to check if a C extension is "fully"
compatible with subinterpreters?

In Python 3.11, Python still implements around 100 types as "static
types" which are not compatible with subinterpreters, like
_Type and _Type. I opened
https://bugs.python.org/issue40601 about these static types, but it
seems like changing it may break the C API *and* the stable ABI (maybe
a clever hack will avoid that).

One idea would be to add a macro that excludes from the C API the
functions known to be unsafe with subinterpreters. For example, exclude
"PyLong_Type" if the Py_SUBINTERPRETER_API macro is defined. These
static types are accessed directly, but also indirectly. For example,
PyLong_CheckExact() is implemented as a macro which accesses this type
directly. Should we remove this function from the C API? Or implement
it as a regular "opaque" function if Py_SUBINTERPRETER_API is defined?

I would prefer an error at build time, rather than a crash at runtime :-(
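
As a rough illustration of the macro idea, the sketch below shows how a header could swap a type-check macro for an opaque function when an opt-in macro is defined. It is a self-contained toy in plain C++, not CPython code: DEMO_SUBINTERPRETER_API, DemoType, DemoObject, DemoLong_Type, DemoLong_CheckExact and the two storage variables are hypothetical names, and in a real header the storage would live inside the library rather than next to the caller.

```cpp
#include <cassert>

// Hypothetical miniature object model; none of these names are CPython's.
struct DemoType {
    const char* name;
};

struct DemoObject {
    const DemoType* type;
};

// Statically allocated types. In a real library these would be defined in
// the library itself, not in code the extension can see.
static DemoType demo_long_storage = {"int"};
static DemoType demo_str_storage = {"str"};

// The extension opts in to the restricted API.
#define DEMO_SUBINTERPRETER_API 1

#ifndef DEMO_SUBINTERPRETER_API
// Legacy API: the static type is exposed by name, and the check is a macro
// that reads it directly.
#define DemoLong_Type demo_long_storage
#define DemoLong_CheckExact(op) ((op)->type == &DemoLong_Type)
#else
// Restricted API: DemoLong_Type is never declared, so any code that names it
// fails to compile. The check becomes an ordinary opaque function.
bool DemoLong_CheckExact(const DemoObject* op) {
    return op->type == &demo_long_storage;
}
#endif
```

With the macro defined, a stray reference to DemoLong_Type is reported by the compiler instead of misbehaving at runtime, while callers of DemoLong_CheckExact() need no source changes.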

Victor


On Wed, Dec 15, 2021 at 12:06 AM Brett Cannon  wrote:
>
>
>
> On Tue, Dec 14, 2021 at 9:41 AM Eric Snow  wrote:
>>
>> One of the open questions relative to subinterpreters is: how to
>> reduce the amount of work required for extension modules to support
>> them?  Thanks to Petr Viktorin for a lot of work he's done in this
>> area (e.g. PEP 489)!  Extensions also have the option to opt out of
>> subinterpreter support.
>>
>> However, that's only one part of the story.  A while back Nathaniel
>> expressed concerns with how making subinterpreters more accessible
>> will have a negative side effect affecting projects that publish large
>> extensions, e.g. numpy.  Not all extensions support subinterpreters
>> due to global state (incl. in library dependencies).  The amount of
>> work to get there may be large.  As subinterpreters increase in usage
>> in the community, so will demand increase for subinterpreter support
>> in those extensions.  Consequently, such projects will be pressured to do
>> the extra work (which is made even more stressful by the short-handed
>> nature of most open source projects).
>>
>> So we (the core devs) would effectively be requiring those extensions
>> to support subinterpreters, regardless of letting them opt out.  This
>> situation has been weighing heavily on my mind since Nathaniel brought
>> this up.  Here are some ideas I've had or heard of about what we could
>> do to help:
>>
>> * add a page to the C-API documentation about how to support subinterpreters
>> * identify the extensions most likely to be impacted and offer to help
>> * add more helpers to the C-API to make adding subinterpreter support
>> less painful
>> * fall back to loading the extension in its own namespace (e.g. use
>> dlmopen())
>> * fall back to copying the extension's file and loading from the copied file
>> * ...
>>
>>  I'd appreciate your thoughts on what we can do to help.  Thanks!
>
>
> What are the requirements put upon an extension in order to support 
> subinterpreters? you hint at global state at the C level, but nothing else is 
> mentioned. Is that it?

[Python-Dev] Re: subinterpreters and their possible impact on large extension projects

2021-12-14 Thread Eric Snow
Yeah, no (mutable) global state at the C level.  It would also be good
to implement multi-phase init (PEP 489), but I don't expect that to
require much work itself.
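
To illustrate why mutable C-level globals are the sticking point, here is a toy model in plain C++ that does not use the CPython API. ModuleInstance stands in for one interpreter's copy of an extension module, and all names (ModuleInstance, g_call_count, CountCallGlobal, CountCallPerModule) are hypothetical. It only sketches the difference between process-wide state and per-module state.

```cpp
#include <cassert>

// Hypothetical stand-in for one interpreter's copy of an extension module.
struct ModuleInstance {
    int call_count = 0;  // Per-module state, in the spirit of PEP 630.
};

// Process-wide mutable state: every interpreter in the process shares it.
static int g_call_count = 0;

// Uses the global, so calls made from different interpreters interfere.
int CountCallGlobal() {
    return ++g_call_count;
}

// Uses state owned by the module instance, so each interpreter is isolated.
int CountCallPerModule(ModuleInstance& module) {
    return ++module.call_count;
}
```

If two interpreters each hold their own ModuleInstance, CountCallPerModule() counts independently for each, whereas CountCallGlobal() lets one interpreter observe the other's calls.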

-eric

On Tue, Dec 14, 2021 at 4:04 PM Brett Cannon  wrote:
>
>
>
> On Tue, Dec 14, 2021 at 9:41 AM Eric Snow  wrote:
>>
>> One of the open questions relative to subinterpreters is: how to
>> reduce the amount of work required for extension modules to support
>> them?  Thanks to Petr Viktorin for a lot of work he's done in this
>> area (e.g. PEP 489)!  Extensions also have the option to opt out of
>> subinterpreter support.
>>
>> However, that's only one part of the story.  A while back Nathaniel
>> expressed concerns with how making subinterpreters more accessible
>> will have a negative side effect affecting projects that publish large
>> extensions, e.g. numpy.  Not all extensions support subinterpreters
>> due to global state (incl. in library dependencies).  The amount of
>> work to get there may be large.  As subinterpreters increase in usage
>> in the community, so will demand increase for subinterpreter support
>> in those extensions.  Consequently, such projects will be pressured to do
>> the extra work (which is made even more stressful by the short-handed
>> nature of most open source projects).
>>
>> So we (the core devs) would effectively be requiring those extensions
>> to support subinterpreters, regardless of letting them opt out.  This
>> situation has been weighing heavily on my mind since Nathaniel brought
>> this up.  Here are some ideas I've had or heard of about what we could
>> do to help:
>>
>> * add a page to the C-API documentation about how to support subinterpreters
>> * identify the extensions most likely to be impacted and offer to help
>> * add more helpers to the C-API to make adding subinterpreter support
>> less painful
>> * fall back to loading the extension in its own namespace (e.g. use
>> dlmopen())
>> * fall back to copying the extension's file and loading from the copied file
>> * ...
>>
>>  I'd appreciate your thoughts on what we can do to help.  Thanks!
>
>
> What are the requirements put upon an extension in order to support 
> subinterpreters? you hint at global state at the C level, but nothing else is 
> mentioned. Is that it?
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BQU3PVN6MHR2P24RAUPJSWFS547W7FPM/


[Python-Dev] Re: subinterpreters and their possible impact on large extension projects

2021-12-14 Thread Brett Cannon
On Tue, Dec 14, 2021 at 9:41 AM Eric Snow 
wrote:

> One of the open questions relative to subinterpreters is: how to
> reduce the amount of work required for extension modules to support
> them?  Thanks to Petr Viktorin for a lot of work he's done in this
> area (e.g. PEP 489)!  Extensions also have the option to opt out of
> subinterpreter support.
>
> However, that's only one part of the story.  A while back Nathaniel
> expressed concerns with how making subinterpreters more accessible
> will have a negative side effect affecting projects that publish large
> extensions, e.g. numpy.  Not all extensions support subinterpreters
> due to global state (incl. in library dependencies).  The amount of
> work to get there may be large.  As subinterpreters increase in usage
> in the community, so will demand increase for subinterpreter support
> in those extensions.  Consequently, such projects will be pressured to do
> the extra work (which is made even more stressful by the short-handed
> nature of most open source projects).
>
> So we (the core devs) would effectively be requiring those extensions
> to support subinterpreters, regardless of letting them opt out.  This
> situation has been weighing heavily on my mind since Nathaniel brought
> this up.  Here are some ideas I've had or heard of about what we could
> do to help:
>
> * add a page to the C-API documentation about how to support
> subinterpreters
> * identify the extensions most likely to be impacted and offer to help
> * add more helpers to the C-API to make adding subinterpreter support
> less painful
> * fall back to loading the extension in its own namespace (e.g. use
> dlmopen())
> * fall back to copying the extension's file and loading from the copied
> file
> * ...
>
>  I'd appreciate your thoughts on what we can do to help.  Thanks!
>

What *are* the requirements put upon an extension in order to support
subinterpreters? You hint at global state at the C level, but nothing else
is mentioned. Is that it?
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/S4QV6SYRGEN7IZZX6YBLS3DQRNLRGCKH/