Re: [Python-Dev] Let's change to C API!

2018-08-10 Thread Armin Rigo
Hi,

On 31 July 2018 at 13:55, Antoine Pitrou wrote:
> It's just that I disagree that removing the C API will make CPython 2x
> faster.
>
> Actually, important modern optimizations for dynamic languages (such as
> inlining, type specialization, inline caches, object unboxing) don't
> seem to depend on the C API at all.

These are optimizations typically talked about in papers about dynamic
languages in general.  In my opinion, in the specific case of CPython,
they are all secondary to the following: (1) JIT, (2) GC, (3) object
model, (4) multithreading.

Currently, the C API only allows Psyco-style JITting (much slower than
PyPy).  All three other points might not be possible at all without a
seriously modified C API.  Why?  I have no proof, but only
circumstantial evidence.  Each of (2), (3), (4) has been done in at
least one other implementation: PyPy, Jython and IronPython.  Each of
these implementations has also got its share of troubles with emulating
the CPython C API.  You can continue to think that the C API has got
nothing to do with it.  I tend to think the opposite.  The continued
absence of major performance improvements for either CPython itself or
for any alternative Python implementation that *does* support the C
API natively is probably proof enough---I think that enough time has
passed, by now, to make this argument.


A bientôt,

Armin.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] A Subtle Bug in Class Initializations

2018-08-10 Thread Steve Dower

On 10Aug2018 0354, Erik Bray wrote:
> Thanks!  I'm not sure what you mean by "on other OS's" though.  Do you
> mean other OS's that happen to use Windows-style PE/COFF binaries?
> Because other than Windows I'm not sure what we care about there.
>
> For ELF binaries, at least on Linux (and probably elsewhere), the
> runtime loader can perform more sophisticated relocations when loading
> a binary into memory, including relocating pointers in the binary's
> .data section.  This allows it to initialize data in one executable
> "A" with pointers to data in another library "B" *before* "A" is
> considered fully loaded and executable.
>
> So this problem never arises, at least on Linux.

That's exactly what I meant. I simply didn't know how/whether other
loaders handled this case :) I recognise it's nothing to do with the
binary format and everything to do with whether the loader knows what to
do or not.

> > > So I'm +1 for requiring passing NULL to PyVarObject_HEAD_INIT,
> > > requiring PyType_Ready with an explicit base type argument, and maybe
> > > (eventually) making PyVarObject_HEAD_INIT argumentless.
> >
> > Since PyVarObject_HEAD_INIT currently requires PyType_Ready() in
> > extension modules already, then don't we just need to fix the built-in
> > types?
> >
> > As far as the "eventually" case, I'd hope that eventually extension
> > modules are all using PyType_FromSpec() :)
>
> +1 :)

Is that just a +1 for PyType_FromSpec(), or are you agreeing that we
only need to fix the built-in types?


Cheers,
Steve


Re: [Python-Dev] Can we split PEP 489 (extension module init) ?

2018-08-10 Thread Petr Viktorin

On 08/10/18 12:21, Stefan Behnel wrote:
> Petr Viktorin wrote on 10.08.2018 at 11:51:
>> On 08/10/18 11:21, Stefan Behnel wrote:
>>> coming back to PEP 489 [1], the multi-phase extension module
>>> initialization. We originally designed it as an "all or nothing" feature,
>>> but as it turns out, the "all" part is so difficult to achieve that most
>>> potential users end up with "nothing". So, my question is: could we split
>>> it up so that projects can get at least the main advantages: module spec
>>> and Unicode module naming?
>>>
>>> PEP 489 is a great protocol in the sense that it allows extension modules
>>> to set themselves up in the same way that Python modules do: load, create
>>> module, execute module code. Without it, creating the module and executing
>>> its code are a single step that is outside of the control of CPython, which
>>> prevents the module from knowing its metadata and CPython from knowing
>>> up-front what the module will actually be.
>>>
>>> Now, the problem with PEP 489 is that it requires support for reloading and
>>> subinterpreters at the same time [2]. For this, extension modules must
>>> essentially be free of static global state, which comprises both the module
>>> code itself and any external native libraries that it uses. That is
>>> somewhere between difficult and impossible to achieve. PEP 573 [3] explains
>>> some of the reasons, and lists solutions for some of the issues, but cannot
>>> solve the general problem that some extension modules simply cannot get rid
>>> of their global state, and are therefore inherently incompatible with
>>> reloading and subinterpreters.
>>
>> Are there any issues that aren't explained in PEP 573?
>> I don't think Python modules should be *inherently* incompatible with
>> subinterpreters. Static global state is perhaps unavoidable in some cases,
>> but IMO it should be managed when it's exposed to Python.
>> If there are issues not in the PEPs, I'd like to collect the concrete cases
>> in some document.
>
> There's always the case where an external native library simply isn't
> re-entrant and/or requires configuration to be global. I know, there's
> static linking and there are even ways to load an external shared library
> multiple times, but that's just adding to the difficulties. Let's just
> accept that some things are not easy enough to make for a good requirement.

For that case, I think the right thing to do is for the module to raise
an exception when it's being initialized for the second time, or when
the underlying library would be initialized for the second time.

"Avoid static global state" is a good rule of thumb for supporting
subinterpreters nicely, but other strategies are possible.
If an underlying library just expects to be initialized once, and then
work from several modules, the Python wrapper should ensure that (using
global state, most likely). Other ways of handling things should be
possible, depending on the underlying library.

>>> I would like the requirement in [2] to be lifted in PEP 489, to make the
>>> main features of the PEP generally available to all extension modules.
>>>
>>> The question is then how to opt out of the subinterpreter support. The PEP
>>> explicitly does not allow backporting new init slot functions/features:
>>>
>>> "Unknown slot IDs will cause the import to fail with SystemError."
>>>
>>> But at least changing this in Py3.8 should be doable and would be really
>>> nice.
>>
>> I don't think we can just silently skip unknown slots -- that would mean
>> modules wouldn't be getting features they asked for.
>> Do you have some more sophisticated model for slots in mind, or is this
>> something to be designed?
>
> Sorry for not being clear here. I was asking for changing the assumptions
> that PEP 489 makes about modules that claim to support the multi-step
> initialisation part of the PEP. Adding a new (flag?) slot was just one idea
> for opting out of multi-initialisation support.


Would this be better than a flag + raising an error on init?
One big disadvantage of a big opt-out-of-everything button is that it 
doesn't encourage people to think about what the actual non-reentrant 
piece of code is.



Re: [Python-Dev] A Subtle Bug in Class Initializations

2018-08-10 Thread Erik Bray
On Thu, Aug 9, 2018 at 7:21 PM Steve Dower wrote:
>
> On 09Aug2018 0818, Erik Bray wrote:
> > On Mon, Aug 6, 2018 at 8:11 PM Eddie Elizondo wrote:
> >> 3) Special case the initialization of PyType_Type and PyBaseObject_Type
> >> within PyType_Ready to now make all calls to PyVarObject_HEAD_INIT use
> >> NULL. To enable this a small change within PyType_Ready is needed to
> >> initialize PyType_Type and PyBaseObject_Type:
> >
> > Coincidentally, I just wrote a long-ish blog post explaining in
> > technical details why PyVarObject_HEAD_INIT(&PyType_Type) pretty much
> > cannot work, at least for extension modules (it is not a problem in
> > the core library), on Windows (my post was focused on Cygwin but it is
> > a problem for Windows in general):
> > http://iguananaut.net/blog/programming/windows-data-import.html
> >
> > The TL;DR is that it's not possible on Windows to initialize a struct
> > with a pointer to some other data that's found in another DLL (i.e.
> > &PyType_Type), unless it happens to be a function, as a special case.
> > But &PyType_Type obviously is not, so things break.
>
> Great write-up! I think logically it should make sense that you cannot
> initialize a static value from a dynamically-linked library, but you've
> conclusively shown why that's the case. I'm not clear whether it's also
> the case on other OS's, but I don't see why it wouldn't be (unless they
> compile magic load-time resolution).

Thanks!  I'm not sure what you mean by "on other OS's" though.  Do you
mean other OS's that happen to use Windows-style PE/COFF binaries?
Because other than Windows I'm not sure what we care about there.

For ELF binaries, at least on Linux (and probably elsewhere), the
runtime loader can perform more sophisticated relocations when loading
a binary into memory, including relocating pointers in the binary's
.data section.  This allows it to initialize data in one executable
"A" with pointers to data in another library "B" *before* "A" is
considered fully loaded and executable.

So this problem never arises, at least on Linux.

> > So I'm +1 for requiring passing NULL to PyVarObject_HEAD_INIT,
> > requiring PyType_Ready with an explicit base type argument, and maybe
> > (eventually) making PyVarObject_HEAD_INIT argumentless.
>
> Since PyVarObject_HEAD_INIT currently requires PyType_Ready() in
> extension modules already, then don't we just need to fix the built-in
> types?
>
> As far as the "eventually" case, I'd hope that eventually extension
> modules are all using PyType_FromSpec() :)

+1 :)


Re: [Python-Dev] Can we split PEP 489 (extension module init) ?

2018-08-10 Thread Stefan Behnel
Petr Viktorin wrote on 10.08.2018 at 11:51:
> On 08/10/18 11:21, Stefan Behnel wrote:
>> coming back to PEP 489 [1], the multi-phase extension module
>> initialization. We originally designed it as an "all or nothing" feature,
>> but as it turns out, the "all" part is so difficult to achieve that most
>> potential users end up with "nothing". So, my question is: could we split
>> it up so that projects can get at least the main advantages: module spec
>> and Unicode module naming?
>>
>> PEP 489 is a great protocol in the sense that it allows extension modules
>> to set themselves up in the same way that Python modules do: load, create
>> module, execute module code. Without it, creating the module and executing
>> its code are a single step that is outside of the control of CPython, which
>> prevents the module from knowing its metadata and CPython from knowing
>> up-front what the module will actually be.
>>
>> Now, the problem with PEP 489 is that it requires support for reloading and
>> subinterpreters at the same time [2]. For this, extension modules must
>> essentially be free of static global state, which comprises both the module
>> code itself and any external native libraries that it uses. That is
>> somewhere between difficult and impossible to achieve. PEP 573 [3] explains
>> some of the reasons, and lists solutions for some of the issues, but cannot
>> solve the general problem that some extension modules simply cannot get rid
>> of their global state, and are therefore inherently incompatible with
>> reloading and subinterpreters.
> 
> Are there any issues that aren't explained in PEP 573?
> I don't think Python modules should be *inherently* incompatible with
> subinterpreters. Static global state is perhaps unavoidable in some cases,
> but IMO it should be managed when it's exposed to Python.
> If there are issues not in the PEPs, I'd like to collect the concrete cases
> in some document.

There's always the case where an external native library simply isn't
re-entrant and/or requires configuration to be global. I know, there's
static linking and there are even ways to load an external shared library
multiple times, but that's just adding to the difficulties. Let's just
accept that some things are not easy enough to make for a good requirement.


>> I would like the requirement in [2] to be lifted in PEP 489, to make the
>> main features of the PEP generally available to all extension modules.
>>
>> The question is then how to opt out of the subinterpreter support. The PEP
>> explicitly does not allow backporting new init slot functions/features:
>>
>> "Unknown slot IDs will cause the import to fail with SystemError."
>>
>> But at least changing this in Py3.8 should be doable and would be really
>> nice.
> 
> I don't think we can just silently skip unknown slots -- that would mean
> modules wouldn't be getting features they asked for.
> Do you have some more sophisticated model for slots in mind, or is this
> something to be designed?

Sorry for not being clear here. I was asking for changing the assumptions
that PEP 489 makes about modules that claim to support the multi-step
initialisation part of the PEP. Adding a new (flag?) slot was just one idea
for opting out of multi-initialisation support.

Stefan



Re: [Python-Dev] Can we split PEP 489 (extension module init) ?

2018-08-10 Thread Petr Viktorin

On 08/10/18 11:21, Stefan Behnel wrote:
> Hi,
>
> coming back to PEP 489 [1], the multi-phase extension module
> initialization. We originally designed it as an "all or nothing" feature,
> but as it turns out, the "all" part is so difficult to achieve that most
> potential users end up with "nothing". So, my question is: could we split
> it up so that projects can get at least the main advantages: module spec
> and Unicode module naming?
>
> PEP 489 is a great protocol in the sense that it allows extension modules
> to set themselves up in the same way that Python modules do: load, create
> module, execute module code. Without it, creating the module and executing
> its code are a single step that is outside of the control of CPython, which
> prevents the module from knowing its metadata and CPython from knowing
> up-front what the module will actually be.
>
> Now, the problem with PEP 489 is that it requires support for reloading and
> subinterpreters at the same time [2]. For this, extension modules must
> essentially be free of static global state, which comprises both the module
> code itself and any external native libraries that it uses. That is
> somewhere between difficult and impossible to achieve. PEP 573 [3] explains
> some of the reasons, and lists solutions for some of the issues, but cannot
> solve the general problem that some extension modules simply cannot get rid
> of their global state, and are therefore inherently incompatible with
> reloading and subinterpreters.

Are there any issues that aren't explained in PEP 573?
I don't think Python modules should be *inherently* incompatible with
subinterpreters. Static global state is perhaps unavoidable in some
cases, but IMO it should be managed when it's exposed to Python.
If there are issues not in the PEPs, I'd like to collect the concrete
cases in some document.

> I would like the requirement in [2] to be lifted in PEP 489, to make the
> main features of the PEP generally available to all extension modules.
>
> The question is then how to opt out of the subinterpreter support. The PEP
> explicitly does not allow backporting new init slot functions/features:
>
> "Unknown slot IDs will cause the import to fail with SystemError."
>
> But at least changing this in Py3.8 should be doable and would be really nice.

I don't think we can just silently skip unknown slots -- that would mean
modules wouldn't be getting features they asked for.
Do you have some more sophisticated model for slots in mind, or is this
something to be designed?

> What do you think?
>
> Stefan
>
> [1] https://www.python.org/dev/peps/pep-0489/
> [2] https://www.python.org/dev/peps/pep-0489/#subinterpreters-and-interpreter-reloading
> [3] https://www.python.org/dev/peps/pep-0573/




[Python-Dev] Can we split PEP 489 (extension module init) ?

2018-08-10 Thread Stefan Behnel
Hi,

coming back to PEP 489 [1], the multi-phase extension module
initialization. We originally designed it as an "all or nothing" feature,
but as it turns out, the "all" part is so difficult to achieve that most
potential users end up with "nothing". So, my question is: could we split
it up so that projects can get at least the main advantages: module spec
and Unicode module naming?

PEP 489 is a great protocol in the sense that it allows extension modules
to set themselves up in the same way that Python modules do: load, create
module, execute module code. Without it, creating the module and executing
its code are a single step that is outside of the control of CPython, which
prevents the module from knowing its metadata and CPython from knowing
up-front what the module will actually be.
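
For comparison, the load / create / execute sequence that Python modules
already follow can be observed from pure Python with importlib. This is
only a sketch of the Python-level protocol that PEP 489 mirrors, not of
the C-level slots themselves; `DemoLoader` and the module name `demo_mod`
are made up for illustration.

```python
import importlib.machinery
import importlib.util


class DemoLoader:
    """Hypothetical loader that keeps module creation and execution separate."""

    def create_module(self, spec):
        # Creation step: returning None asks the import machinery to build
        # a default module object for this spec.
        return None

    def exec_module(self, module):
        # Execution step: the module already knows its metadata here,
        # because __spec__ was attached before any module code runs.
        module.seen_name = module.__spec__.name
        module.answer = 42


# "Load": describe the module up-front, before any of its code runs.
spec = importlib.machinery.ModuleSpec("demo_mod", DemoLoader())

# "Create": the module object exists but is still unpopulated.
module = importlib.util.module_from_spec(spec)
assert not hasattr(module, "answer")

# "Execute": run the module's initialization code.
spec.loader.exec_module(module)
print(module.seen_name, module.answer)  # prints: demo_mod 42
```

With single-phase initialization there is no equivalent of the gap
between the last two steps, which is why the module cannot see its spec
while it is being set up.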

Now, the problem with PEP 489 is that it requires support for reloading and
subinterpreters at the same time [2]. For this, extension modules must
essentially be free of static global state, which comprises both the module
code itself and any external native libraries that it uses. That is
somewhere between difficult and impossible to achieve. PEP 573 [3] explains
some of the reasons, and lists solutions for some of the issues, but cannot
solve the general problem that some extension modules simply cannot get rid
of their global state, and are therefore inherently incompatible with
reloading and subinterpreters.

I would like the requirement in [2] to be lifted in PEP 489, to make the
main features of the PEP generally available to all extension modules.

The question is then how to opt out of the subinterpreter support. The PEP
explicitly does not allow backporting new init slot functions/features:

"Unknown slot IDs will cause the import to fail with SystemError."

But at least changing this in Py3.8 should be doable and would be really nice.

What do you think?

Stefan



[1] https://www.python.org/dev/peps/pep-0489/
[2] https://www.python.org/dev/peps/pep-0489/#subinterpreters-and-interpreter-reloading
[3] https://www.python.org/dev/peps/pep-0573/
