[Python-Dev] Re: Latest PEP 554 updates.

2020-05-06 Thread Eric Snow
On Wed, May 6, 2020 at 2:25 PM Jeff Allen  wrote:
> Many thanks for working on this so carefully for so long. I'm happy to see 
> the per-interpreter GIL will now be studied fully before final commitment to 
> subinterpreters in the stdlib. I would have chipped in in those terms to the 
> review, but others successfully argued for "provisional" inclusion, and I was 
> content with that.

No problem. :)

> My reason for worrying about this is that, while the C-API has been there for 
> some time, it has not had heavy use in taxing cases AFAIK, and I think there 
> is room for it to be incorrect. I am thinking more about Jython than CPython, 
> but ideally they are the same structures. When I put the structures to taxing 
> use cases on paper, they don't seem quite to work. Jython has been used in 
> environments with thread-pools, concurrency, and multiple interpreters, and 
> this aspect has had to be "fixed" several times.

That insight would be super helpful and much appreciated. :)  Is that
all on the docs you've linked?

> My use cases include sharing objects between interpreters, which I know the 
> PEP doesn't. The C-API docs acknowledge that object sharing can't be 
> prevented, but do their best to discourage it because of the hazards around 
> allocation. Trouble is, I think it can happen unawares. The fact that Java 
> takes on lifecycle management suggests it shouldn't be a fundamental problem 
> in Jython. I know from other discussion it's where many would like to end up, 
> even in CPython.

Yeah, for now we will strictly disallow sharing actual objects between
interpreters in Python code.  It would be an interesting project to
try loosening that at some point (especially with immutable types), but
we're going to start from the safer position.

We have no plans to add any similar restrictions to the C-API, whereby
you're typically much more free to shoot yourself in the foot. :)

> This is all theory: I don't have even a model implementation, so I won't 
> pontificate. However, I do have pictures, without which I find it impossible 
> to think about this subject. I couldn't find your pictures, so I share mine 
> here (WiP):
>
> https://the-very-slow-jython-project.readthedocs.io/en/latest/architecture/interpreter-structure.html#runtime-thread-and-interpreter-cpython
>
> I would be interested in how you solve the problem of finding the current 
> interpreter, discussed in the article. My preferred answer is:
>
> https://the-very-slow-jython-project.readthedocs.io/en/latest/architecture/interpreter-structure.html#critical-structures-revisited
>
> That's the API change I think is needed. It might not have a visible effect 
> on the PEP, but it's worth bearing in mind the risk of exposing a thing you 
> might shortly find you want to change.

This is great stuff, Jeff!  Thanks for sharing it.  I was able to skim
through but don't have time to dig in at the moment.  I'll reply in
detail as soon as I can.

In the meantime, the implementation of PEP 554 exposes a single part
of PyInterpreterState: the ID (an int).  The only other internal-ish
info we expose is whether or not an interpreter (by ID) is currently
running.  The only functionality we provide is: create, destroy, and
run_string().

-eric
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7RZCIKVRIKXTNFT7IRNLA3OQ5CX2AIJ6/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Issues with import_fresh_module

2020-05-06 Thread Paul Ganssle
Thanks for the suggestion.

I think I tried something similar for tests that involved an environment
variable and found that it doesn't play nicely with coverage.py /at all/.

Also, I will have to solve this problem at some point anyway because the
property tests for the module (not currently included in the PR) include
tests that have the C and pure Python version running side-by-side,
which would be hard to achieve with subinterpreters.

On 5/6/20 4:51 PM, Nathaniel Smith wrote:
> On Wed, May 6, 2020 at 7:52 AM Paul Ganssle  wrote:
>> As part of PEP 399, an idiom for testing both C and pure Python versions of 
>> a library is suggested making use of import_fresh_module.
>>
>> Unfortunately, I'm finding that this is not amazingly robust. We have this 
>> issue: https://bugs.python.org/issue40058, where the tester for datetime 
>> needs to do some funky manipulations to the state of sys.modules for reasons 
>> that are now somewhat unclear, and still sys.modules is apparently left in a 
>> bad state.
>>
>> When implementing PEP 615, I ran into similar issues and found it very 
>> difficult to get two independent instances of the same module – one with the 
>> C extension blocked and one with it intact. I ended up manually importing 
>> the C and Python extensions and grafting them onto two "fresh" imports with 
>> nothing blocked.
> When I've had to deal with similar issues in the past, I've given up
> on messing with sys.modules and just had one test spawn a subprocess
> to do the import+run the actual tests. It's a big hammer, but the nice
> thing about big hammers is that there's no subtle issues, either they
> smash the thing or they don't.
>
> But, I don't know how awkward that would be to fit into Python's
> unittest system, if you have lots of tests you need to run this way.
>
> -n
>


Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/H4TWK574BEUDVY4MGTSFJ5OKD4OVOWZZ/


[Python-Dev] Re: Issues with import_fresh_module

2020-05-06 Thread Nathaniel Smith
On Wed, May 6, 2020 at 2:34 PM Paul Ganssle  wrote:
> I think I tried something similar for tests that involved an environment 
> variable and found that it doesn't play nicely with coverage.py at all.

This is a solvable problem:
https://coverage.readthedocs.io/en/coverage-5.1/subprocess.html

But yeah, convincing your test framework to jump through the necessary
hoops might be tricky. (Last time I did this I was using pytest-cov,
which automatically takes care of all the details, so I'm not sure how
tough it is.)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Q3K6IT774HAS2IS62HN3NRV5VCBWTVLO/


[Python-Dev] Re: PoC: Subinterpreters 4x faster than sequential execution or threads on CPU-bound workload

2020-05-06 Thread Serhiy Storchaka

06.05.20 00:46, Victor Stinner wrote:

Subinterpreters and multiprocessing have basically the same speed on
this benchmark.


It does not look like subinterpreters have any advantage over 
multiprocessing.


I am wondering how much 3.9 will be slower than 3.8 in single-thread 
single-interpreter mode after getting rid of all process-wide singletons 
and caches (Py_None, Py_True, Py_NotImplemented, small integers, 
strings, tuples, _Py_IDENTIFIER, _PyArg_Parser, etc). Not to mention 
breaking binary compatibility.
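To make the concern concrete, here is a small illustration (mine, not Serhiy's) of one such process-wide cache; today every interpreter in the process shares these singletons, which is exactly what a per-interpreter GIL has to unwind:

```python
# CPython caches small ints (-5..256) process-wide: every occurrence of
# 256 is the same object, however it was produced.
a = 256
b = int("256")        # constructed at runtime, still the cached singleton
assert a is b

# Outside the cached range, a fresh object is allocated each time.
c = int("257")
d = int("257")
assert c is not d
```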

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Y3NU7O5NBTIRMYU7V4IIOVV4OGN2VT3W/


[Python-Dev] Re: Issues with import_fresh_module

2020-05-06 Thread Brett Cannon
I'm drowning in work this month, so if you need me to look at something then I 
unfortunately need a point-blank link of what you want me to look at with a 
targeted question.

As for import_fresh_module() not being robust: of course it isn't, because it's 
mucking with import stuff in a very non-standard way.  All it's doing is an 
import and clearing the module from sys.modules. The extra it provides is to 
shove None into sys.modules to trigger an ImportError, so that you can block any 
acceleration module from being imported and force use of the Python code. 
That's it.
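A hand-rolled sketch of that trick (a simplification of test.support's import_fresh_module, with heapq/_heapq standing in as the accelerated/pure pair):

```python
import importlib
import sys

def fresh_module(name, blocked=()):
    """Import `name` fresh, with the modules in `blocked` made unimportable."""
    saved = sys.modules.copy()
    try:
        for mod in blocked:
            sys.modules[mod] = None    # importing this name now raises ImportError
        sys.modules.pop(name, None)    # force a genuinely fresh import
        return importlib.import_module(name)
    finally:
        sys.modules.clear()
        sys.modules.update(saved)      # restore the real module state

import heapq as c_heapq                # normal import: accelerated by _heapq
py_heapq = fresh_module("heapq", blocked=["_heapq"])

# The blocked copy fell back to the pure-Python definitions.
assert py_heapq.heappush is not c_heapq.heappush
```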
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VNQJBFHIEZLY6C5HNV5A6TNIWI7VAMOW/


[Python-Dev] A PEP PR that I closed until someone discusses context

2020-05-06 Thread joannah nanjekye
I saw a PR on the PEP repository that looked like a joke here :
https://github.com/python/peps/pull/1396

The author can give context to re-open if it was intentional.

-- 
Best,
Joannah Nanjekye

*"You think you know when you learn, are more sure when you can write, even
more when you can teach, but certain when you can program." Alan J. Perlis*
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/M4SEVZ6TDJSPSXYA2YUB7RKAICHO3IAX/


[Python-Dev] Re: A PEP PR that I closed until someone discusses context

2020-05-06 Thread Steve Dower

On 06May2020 2204, joannah nanjekye wrote:
I saw a PR on the PEP repository that looked like a joke here : 
https://github.com/python/peps/pull/1396


The author can give context to re-open if it was intentional.


Given there isn't a real email address on the PEP, I'd assume it was 
meant as a joke.


I wouldn't put any more time into this.

Cheers,
Steve
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2D72HPN7XAKERRCI4TEASGOMWJNNAIGK/


[Python-Dev] Re: PoC: Subinterpreters 4x faster than sequential execution or threads on CPU-bound workload

2020-05-06 Thread Nathaniel Smith
On Wed, May 6, 2020 at 10:03 AM Antoine Pitrou  wrote:
>
> On Tue, 5 May 2020 18:59:34 -0700
> Nathaniel Smith  wrote:
> > On Tue, May 5, 2020 at 3:47 PM Guido van Rossum  wrote:
> > >
> > > This sounds like a significant milestone!
> > >
> > > Is there some kind of optimized communication possible yet between 
> > > subinterpreters? (Otherwise I still worry that it's no better than 
> > > subprocesses -- and it could be worse because when one subinterpreter 
> > > experiences a hard crash or runs out of memory, all others have to die 
> > > with it.)
> >
> > As far as I understand it, the subinterpreter folks have given up on
> > optimized passing of objects, and are only hoping to do optimized
> > (zero-copy) passing of raw memory buffers.
>
> Which would be useful already, especially with pickle out-of-band
> buffers.

Sure, zero cost is always better than some cost, I'm not denying that
:-). What I'm trying to understand is whether the difference is
meaningful enough to justify subinterpreters' increased complexity,
fragility, and ecosystem breakage.

If your data is in large raw memory buffers to start with (like numpy
arrays or arrow dataframes), then yeah, serialization costs are
smaller proportion of IPC costs. And out-of-band buffers are an
elegant way of letting pickle users take advantage of that speedup
while still using the familiar pickle API. Thanks for writing that PEP
:-).
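For reference, the out-of-band machinery being credited here (PEP 574, pickle protocol 5, Python 3.8+) looks like this; the payload travels via buffer_callback rather than inside the pickle stream:

```python
import pickle

payload = bytearray(b"x" * 1_000_000)  # stand-in for a large numpy/arrow buffer

buffers = []
data = pickle.dumps(pickle.PickleBuffer(payload), protocol=5,
                    buffer_callback=buffers.append)

# The stream itself holds only metadata; the megabyte went out-of-band.
assert len(data) < 100
assert len(buffers) == 1

# The receiver resupplies the buffers (potentially zero-copy, e.g. from
# shared memory) when unpickling.
obj = pickle.loads(data, buffers=buffers)
assert bytes(obj) == bytes(payload)
```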

But when you're in the regime where you're working with large raw
memory buffers, then that's also the regime where inter-process
shared-memory becomes really efficient. Hence projects like Ray/Plasma
[1], which exist today, and even work for sharing data across
languages and across multi-machine clusters. And the pickle
out-of-band buffer API is general enough to work with shared memory
too.

And even if you can't quite manage zero-copy, and have to settle for
one-copy... optimized raw data copying is just *really fast*, similar
to memory access speeds. And CPU-bound, big-data-crunching apps are by
definition going to access that memory and do stuff with it that's
much more expensive than a single memcpy. So I still have trouble
figuring out how skipping a single memcpy will make subinterpreters
significantly faster than subprocesses in any real-world scenario.

-n

[1]
https://arrow.apache.org/blog/2017/08/08/plasma-in-memory-object-store/
https://github.com/ray-project/ray

-- 
Nathaniel J. Smith -- https://vorpus.org
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PCLCXUK2OOHL2DHEHKMB3LGCIT7247WM/


[Python-Dev] Re: Latest PEP 554 updates.

2020-05-06 Thread Jeff Allen

On 05/05/2020 16:45, Eric Snow wrote:

On Mon, May 4, 2020 at 11:30 AM Eric Snow  wrote:

Further feedback is welcome, though I feel like the PR is ready (or
very close to ready) for pronouncement.  Thanks again to all.

FYI, after consulting with the steering council I've decided to change
the target release to 3.10, when we expect to have per-interpreter GIL
landed.  That will help maximize the impact of the module and avoid
any confusion.  I'm undecided on releasing a 3.9-only module on PyPI.
If I do it will only be for folks to try it out early and I probably
won't advertise it much.

-eric


Eric:

Many thanks for working on this so carefully for so long. I'm happy to 
see the per-interpreter GIL will now be studied fully before final 
commitment to subinterpreters in the stdlib. I would have chipped in in 
those terms to the review, but others successfully argued for 
"provisional" inclusion, and I was content with that.


My reason for worrying about this is that, while the C-API has been 
there for some time, it has not had heavy use in taxing cases AFAIK, and 
I think there is room for it to be incorrect. I am thinking more about 
Jython than CPython, but ideally they are the same structures. When I 
put the structures to taxing use cases on paper, they don't seem quite 
to work. Jython has been used in environments with thread-pools, 
concurrency, and multiple interpreters, and this aspect has had to be 
"fixed" several times.


My use cases include sharing objects between interpreters, which I know 
the PEP doesn't. The C-API docs acknowledge that object sharing can't be 
prevented, but do their best to discourage it because of the hazards 
around allocation. Trouble is, I think it can happen unawares. The fact 
that Java takes on lifecycle management suggests it shouldn't be a 
fundamental problem in Jython. I know from other discussion it's where 
many would like to end up, even in CPython.


This is all theory: I don't have even a model implementation, so I won't 
pontificate. However, I do have pictures, without which I find it 
impossible to think about this subject. I couldn't find your pictures, 
so I share mine here (WiP):


https://the-very-slow-jython-project.readthedocs.io/en/latest/architecture/interpreter-structure.html#runtime-thread-and-interpreter-cpython

I would be interested in how you solve the problem of finding the 
current interpreter, discussed in the article. My preferred answer is:


https://the-very-slow-jython-project.readthedocs.io/en/latest/architecture/interpreter-structure.html#critical-structures-revisited

That's the API change I think is needed. It might not have a visible 
effect on the PEP, but it's worth bearing in mind the risk of exposing a 
thing you might shortly find you want to change.


Jeff Allen


Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/E2BMM2IVKMDJGWOWQWCSDZCNPZOKEJMJ/


[Python-Dev] Re: Issues with import_fresh_module

2020-05-06 Thread Guido van Rossum
The subtle issue is of course performance if you get too cozy with this
pattern...

On Wed, May 6, 2020 at 1:59 PM Nathaniel Smith  wrote:

> On Wed, May 6, 2020 at 7:52 AM Paul Ganssle  wrote:
> >
> > As part of PEP 399, an idiom for testing both C and pure Python versions
> of a library is suggested making use of import_fresh_module.
> >
> > Unfortunately, I'm finding that this is not amazingly robust. We have
> this issue: https://bugs.python.org/issue40058, where the tester for
> datetime needs to do some funky manipulations to the state of sys.modules
> for reasons that are now somewhat unclear, and still sys.modules is
> apparently left in a bad state.
> >
> > When implementing PEP 615, I ran into similar issues and found it very
> difficult to get two independent instances of the same module – one with
> the C extension blocked and one with it intact. I ended up manually
> importing the C and Python extensions and grafting them onto two "fresh"
> imports with nothing blocked.
>
> When I've had to deal with similar issues in the past, I've given up
> on messing with sys.modules and just had one test spawn a subprocess
> to do the import+run the actual tests. It's a big hammer, but the nice
> thing about big hammers is that there's no subtle issues, either they
> smash the thing or they don't.
>
> But, I don't know how awkward that would be to fit into Python's
> unittest system, if you have lots of tests you need to run this way.
>
> -n
>
> --
> Nathaniel J. Smith -- https://vorpus.org


-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/3DDJQUVNMEGEEWNCZYEP7LKXQHY23Y46/


[Python-Dev] Re: Issues with import_fresh_module

2020-05-06 Thread Chris Jerdonek
Have you also considered changes to the modules under test that might make
it easier for both implementations to exist and be tested side-by-side (so
with fewer hacks on the testing side)?

—Chris

On Wed, May 6, 2020 at 2:33 PM Paul Ganssle  wrote:

> Thanks for the suggestion.
>
> I think I tried something similar for tests that involved an environment
> variable and found that it doesn't play nicely with coverage.py *at all*.
>
> Also, I will have to solve this problem at some point anyway because the
> property tests for the module (not currently included in the PR) include
> tests that have the C and pure Python version running side-by-side, which
> would be hard to achieve with subinterpreters.
>
> On 5/6/20 4:51 PM, Nathaniel Smith wrote:
>
> On Wed, May 6, 2020 at 7:52 AM Paul Ganssle  
>  wrote:
>
> As part of PEP 399, an idiom for testing both C and pure Python versions of a 
> library is suggested making use of import_fresh_module.
>
> Unfortunately, I'm finding that this is not amazingly robust. We have this 
> issue: https://bugs.python.org/issue40058, where the tester for datetime 
> needs to do some funky manipulations to the state of sys.modules for reasons 
> that are now somewhat unclear, and still sys.modules is apparently left in a 
> bad state.
>
> When implementing PEP 615, I ran into similar issues and found it very 
> difficult to get two independent instances of the same module – one with the 
> C extension blocked and one with it intact. I ended up manually importing the 
> C and Python extensions and grafting them onto two "fresh" imports with 
> nothing blocked.
>
> When I've had to deal with similar issues in the past, I've given up
> on messing with sys.modules and just had one test spawn a subprocess
> to do the import+run the actual tests. It's a big hammer, but the nice
> thing about big hammers is that there's no subtle issues, either they
> smash the thing or they don't.
>
> But, I don't know how awkward that would be to fit into Python's
> unittest system, if you have lots of tests you need to run this way.
>
> -n
>
>
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/SLYLON2KLYCRYRWKY773MSZASJ7LC5JP/


[Python-Dev] Re: PoC: Subinterpreters 4x faster than sequential execution or threads on CPU-bound workload

2020-05-06 Thread Nathaniel Smith
On Wed, May 6, 2020 at 5:41 AM Victor Stinner  wrote:
>
> Hi Nathaniel,
>
> Le mer. 6 mai 2020 à 04:00, Nathaniel Smith  a écrit :
> > As far as I understand it, the subinterpreter folks have given up on
> > optimized passing of objects, and are only hoping to do optimized
> > (zero-copy) passing of raw memory buffers.
>
> I think that you misunderstood the PEP 554. It's a bare minimum API,
> and the idea is to *extend* it later to have an efficient
> implementation of "shared objects".

No, I get this part :-)

> IMO it should be easy to share *data* (object "content") between
> subinterpreters, but each interpreter should have its own PyObject
> which exposes the data at the Python level. That is, the PyObject is a
> proxy to the data.

So when you say "shared object" you mean that you're sharing a raw
memory buffer, and then you're writing a Python object that stores its
data inside that memory buffer instead of inside its __dict__:

import struct  # needed for unpack_from/pack_into below

class MySharedObject:
    def __init__(self, shared_memview, shared_lock):
        self._shared_memview = shared_memview
        self._shared_lock = shared_lock

    @property
    def my_attr(self):
        with self._shared_lock:
            return struct.unpack_from(MY_ATTR_FORMAT,
                                      self._shared_memview, MY_ATTR_OFFSET)[0]

    @my_attr.setter
    def my_attr(self, new_value):
        with self._shared_lock:
            struct.pack_into(MY_ATTR_FORMAT, self._shared_memview,
                             MY_ATTR_OFFSET, new_value)

This is an interesting idea, but I think when most people say "sharing
objects between subinterpreters", they mean being able to pass some
pre-existing object between subinterpreters cheaply, while this
requires defining custom objects with custom locking. So we should
probably use different terms for them to avoid confusion :-).

This is an interesting idea, and it's true that it's not considered in
my post you're responding to. I was focusing on copying objects, not
sharing objects on an ongoing basis. You can't implement this kind of
"shared object" using a pipe/socket, because those create two
independent copies of the data.

But... if this is what you want, you can do the exact same thing with
subprocesses too. OSes provide inter-process shared memory and
inter-process locks. 'MySharedObject' above would work exactly the
same. So I think the conclusion still holds: there aren't any plans to
make IPC between subinterpreters meaningfully faster than IPC between
subprocesses.
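For instance, the stdlib already exposes the cross-process raw-memory piece directly (multiprocessing.shared_memory, Python 3.8+). A minimal sketch of create-attach-read, done within a single process here for brevity, though the attach-by-name call works identically from a separate process:

```python
from multiprocessing import shared_memory

# One side creates a named block of raw shared memory...
shm = shared_memory.SharedMemory(create=True, size=8)
try:
    shm.buf[:4] = b"spam"
    # ...and the other side attaches to the same block by name.
    peer = shared_memory.SharedMemory(name=shm.name)
    assert bytes(peer.buf[:4]) == b"spam"
    peer.close()
finally:
    shm.close()
    shm.unlink()   # free the segment once everyone is done
```

A multiprocessing.Lock (or any cross-process lock) would play the role of `shared_lock` from the 'MySharedObject' sketch above.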

> I don't think that we have to reinvent the wheel. threading,
> multiprocessing and asyncio already designed such APIs. We should
> design similar APIs and even simply reuse code.

Or, we could simply *use* the code instead of using subinterpreters
:-). (Or write new and better code, I feel like there's a lot of room
for a modern 'multiprocessing' competitor.) The question I'm trying to
figure out is what advantage subinterpreters give us over these proven
technologies, and I'm still not seeing it.

> My hope is that "synchronization" (in general, locks in specific) will
> be more efficient in the same process, than synchronization between
> multiple processes.

Hmm, I would be surprised by that – the locks in modern OSes are
highly-optimized, and designed to work across subprocesses. For
example, on Linux, futexes work across processes. Have you done any
benchmarks?

Also btw, note that if you want to use async within your
subinterpreters, then that rules out a lot of tools like regular
locks, because they can't be integrated into an event loop. If your
subinterpreters are using async, then you pretty much *have* to use
full-fledged sockets or equivalent for synchronization.

> I would be interested to have a generic implementation of "remote
> object": an empty proxy object which forwards all operations to a
> different interpreter. It will likely be inefficient, but it may be
> convenient for a start. If a method returns an object, a new proxy
> should be created. Simple scalar types like int and short strings may
> be serialized (copied).

How would this be different than
https://docs.python.org/3/library/multiprocessing.html#proxy-objects ?

How would you handle input arguments -- would those get proxied as well?

Also, does this mean the other subinterpreter has to be running an
event loop to process these incoming requests? Or is the idea that the
other subinterpreter would process these inside a traditional Python
thread, so users are exposed to all the classic shared-everything
locking issues?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/53XQ52JVILNQH7IQC7SHKFSNHWD4DNX6/


[Python-Dev] Re: Issues with import_fresh_module

2020-05-06 Thread Nathaniel Smith
On Wed, May 6, 2020 at 7:52 AM Paul Ganssle  wrote:
>
> As part of PEP 399, an idiom for testing both C and pure Python versions of a 
> library is suggested making use of import_fresh_module.
>
> Unfortunately, I'm finding that this is not amazingly robust. We have this 
> issue: https://bugs.python.org/issue40058, where the tester for datetime 
> needs to do some funky manipulations to the state of sys.modules for reasons 
> that are now somewhat unclear, and still sys.modules is apparently left in a 
> bad state.
>
> When implementing PEP 615, I ran into similar issues and found it very 
> difficult to get two independent instances of the same module – one with the 
> C extension blocked and one with it intact. I ended up manually importing the 
> C and Python extensions and grafting them onto two "fresh" imports with 
> nothing blocked.

When I've had to deal with similar issues in the past, I've given up
on messing with sys.modules and just had one test spawn a subprocess
to do the import+run the actual tests. It's a big hammer, but the nice
thing about big hammers is that there's no subtle issues, either they
smash the thing or they don't.

But, I don't know how awkward that would be to fit into Python's
unittest system, if you have lots of tests you need to run this way.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/SDSODK5ZSJUSGDFVFOAESHYLPPFANNWD/


[Python-Dev] Re: Issues with import_fresh_module

2020-05-06 Thread Paul Ganssle
No worries, I actually seem to have solved the immediately pressing
problem that was blocking PEP 615 by doing this:

@functools.lru_cache(1)
def get_modules():
    import zoneinfo as c_module
    py_module = import_fresh_module("zoneinfo", blocked=["_czoneinfo"])

    return py_module, c_module

I'll have to dig in to figure out exactly /why/ that works, and why it
/doesn't/ work in the reference implementation (which has the C
implementation living at `zoneinfo._czoneinfo` instead of at
`_czoneinfo`), and hopefully that will shed some light on the other
issues. For the moment I've got something that appears to work and a
suggestive pattern of behavior as to why it wasn't working, so that
actually seems like it will help me solve my short term goal of getting
zoneinfo merged ASAP and my long term goal of ensuring that the tests
are robust.

Thanks!
Paul

On 5/6/20 3:55 PM, Brett Cannon wrote:
> I'm drowning in work this month, so if you need me to look at something then 
> I unfortunately need a point-blank link of what you want me to look at with a 
> targeted question.
>
> As for import_fresh_module() not being robust: of course it isn't, because 
> it's mucking with import stuff in a very non-standard way.  All it's doing 
> is an import and clearing the module from sys.modules. The extra it provides 
> is to shove None into sys.modules to trigger an ImportError, so that you can 
> block any acceleration module from being imported and force use of the 
> Python code. That's it.


Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2RRWGNT2WRFG5OYLXLDKQGJDDZT456KE/


[Python-Dev] Re: A PEP PR that I closed until someone discusses context

2020-05-06 Thread Terry Reedy

On 5/6/2020 5:28 PM, Steve Dower wrote:

On 06May2020 2204, joannah nanjekye wrote:
I saw a PR on the PEP repository that looked like a joke here : 
https://github.com/python/peps/pull/1396


The author can give context to re-open if it was intentional.


Given there isn't a real email address on the PEP, I'd assume it was 
meant as a joke.


I wouldn't put any more time into this.


Except to ban the graffiti artist, if possible.
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HYS7GTIGOOSG4ETTKD3ABHCIZSIEV2LG/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-05-06 Thread David Mertz
Hi Guido, Pablo & Lysandros,

I'm excited about this improvement to Python, and was interested to hear
about it at the language summit as well.  I happen to be friends with
Alessandro Warth, whom you cited in the PEP as developing the packrat
parsing technique you use (at least in part).  I wrote to him to ask if he
knew he was being cited, and he responded in part with these comments.  The
additional link may perhaps be useful for you:

Alex: (If they had gotten in touch, I would have pointed them at my
> dissertation, which I think had a simpler description of that algorithm.
> There's also the Ohm implementation [https://github.com/harc/ohm], where
> I figured out how to simplify it further.)
>


-- 
The dead increasingly dominate and strangle both the living and the
not-yet born.  Vampiric capital and undead corporate persons abuse
the lives and control the thoughts of homo faber. Ideas, once born,
become abortifacients against new conceptions.


[Python-Dev] Re: PoC: Subinterpreters 4x faster than sequential execution or threads on CPU-bound workload

2020-05-06 Thread Cameron Simpson

On 06May2020 23:05, Serhiy Storchaka  wrote:

06.05.20 00:46, Victor Stinner пише:

Subinterpreters and multiprocessing have basically the same speed on
this benchmark.


It does not look like subinterpreters have any advantage over 
multiprocessing.


Maybe I'm missing something, but the example that comes to my mind is 
embedding a Python interpreter in an existing non-Python programme.


My pet one-day-in-the-future example is mutt, whose macro language is...  
crude.  And mutt is single threaded.


However, it is easy to envisage a monolithic multithreaded programme 
which has use for Python subinterpreters to work on the larger 
programme's in-memory data structures.


I haven't a real world example to hand, but that is the architectural 
situation where I'd consider multiprocessing to be inappropriate or 
infeasible because the target data are all in the one memory space.


Cheers,
Cameron Simpson 


[Python-Dev] Re: PoC: Subinterpreters 4x faster than sequential execution or threads on CPU-bound workload

2020-05-06 Thread Emily Bowman
 Main memory bus or cache contention? Integer execution ports full?
Throttling? VTune is useful to find out where the bottleneck is, things
like that tend to happen when you start loading every logical core.

On Tue, May 5, 2020 at 4:45 PM Joseph Jenne via Python-Dev <
python-dev@python.org> wrote:

> I'm seeing a drop in performance of both multiprocess and subinterpreter
> based runs in the 8-CPU case, where performance drops by about half
> despite having enough logical CPUs, while the other cases scale quite
> well. Is there some issue with python multiprocessing/subinterpreters on
> the same logical core?


[Python-Dev] Issues with import_fresh_module

2020-05-06 Thread Paul Ganssle
As part of PEP 399, an idiom for testing both the C and pure Python
versions of a library is suggested, making use of import_fresh_module.
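The idiom looks roughly like this (a sketch assuming a CPython where the test package and the _heapq accelerator are installed; before Python 3.10, import_fresh_module lived directly on test.support):

```python
import unittest

try:  # Python 3.10+
    from test.support.import_helper import import_fresh_module
except ImportError:  # older layout
    from test.support import import_fresh_module

# Two instances of heapq: one forced onto the pure Python code
# (accelerator blocked), one with a freshly imported C accelerator.
py_heapq = import_fresh_module('heapq', blocked=['_heapq'])
c_heapq = import_fresh_module('heapq', fresh=['_heapq'])

class HeapifyTest:
    module = None  # overridden per implementation below

    def test_heapify(self):
        h = [3, 1, 2]
        self.module.heapify(h)
        self.assertEqual(h[0], 1)

class PyHeapifyTest(HeapifyTest, unittest.TestCase):
    module = py_heapq

class CHeapifyTest(HeapifyTest, unittest.TestCase):
    module = c_heapq
```

The same test body then runs once against each implementation, which is exactly what makes leftover sys.modules state between the two imports so painful.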

Unfortunately, I'm finding that this is not amazingly robust. We have
this issue: https://bugs.python.org/issue40058, where the tester for
datetime needs to do some funky manipulations to the state of sys.modules
for reasons that are now somewhat unclear, and still sys.modules is
apparently left in a bad state.

When implementing PEP 615, I ran into similar issues and found it very
difficult to get two independent instances of the same module – one with
the C extension blocked and one with it intact. I ended up manually
importing the C and Python extensions and grafting them onto two "fresh"
imports with nothing blocked.

This seems to work most of the time in my repo, but when I import it
into CPython, I'm now seeing failures due to this issue. The immediate
symptom is that assertRaises is seeing a mismatch between the exception
raised by the module and the exception *on* the module. Here's the Travis
error (ignore the part about `tzdata`, which needs to be removed as
misleading), and here's the test.
Evidently calling module.ZoneInfo("Bad_Zone") is raising a different
module's ZoneInfoNotFoundError in some cases and I have no idea why.
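The identity mismatch Paul describes can be reproduced without zoneinfo at all: executing the same module spec twice yields two distinct module objects, hence two distinct exception classes, so an exception raised from one copy is not caught by an except clause naming the other. A small illustration, using json.decoder purely as a convenient stand-in:

```python
import importlib.util

def fresh_instance(name):
    """Build a brand-new module object from `name`'s spec,
    without touching sys.modules."""
    spec = importlib.util.find_spec(name)
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
    return mod

mod1 = fresh_instance('json.decoder')
mod2 = fresh_instance('json.decoder')

# Same source code, but two distinct exception classes:
assert mod1.JSONDecodeError is not mod2.JSONDecodeError

# So assertRaises(mod1.JSONDecodeError) misses mod2's exception:
try:
    raise mod2.JSONDecodeError('boom', 'doc', 0)
except mod1.JSONDecodeError:
    caught = True
except ValueError:  # both subclass ValueError independently
    caught = False
assert caught is False
```

This is the same failure shape as the ZoneInfoNotFoundError mismatch: two "fresh" copies of a module each carry their own exception class object.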

Is anyone more familiar with the import system willing to take a look at
these issues?

Thanks,
Paul




[Python-Dev] Re: PoC: Subinterpreters 4x faster than sequential execution or threads on CPU-bound workload

2020-05-06 Thread Victor Stinner
Hi Nathaniel,

Le mer. 6 mai 2020 à 04:00, Nathaniel Smith  a écrit :
> As far as I understand it, the subinterpreter folks have given up on
> optimized passing of objects, and are only hoping to do optimized
> (zero-copy) passing of raw memory buffers.

I think that you misunderstood PEP 554. It's a bare-minimum API,
and the idea is to *extend* it later to have an efficient
implementation of "shared objects".

--

IMO it should be easy to share *data* (object "content") between
subinterpreters, but each interpreter should have its own PyObject
which exposes the data at the Python level. That is, the PyObject acts
as a proxy to the data.

It would badly hurt performance if a PyObject is shared by two
interpreters: it would require locking or atomic variables for
PyObject members and PyGC_Head members.

It seems like right now PEP 554 doesn't support sharing data, so that
still has to be designed and implemented later.

Who owns the data? When can we release memory? Which interpreter
releases the memory? I read somewhere that data is owned by the
interpreter which allocates the memory, and its memory would be
released in the same interpreter.

How do we track data lifetime? I imagine a reference counter. When it
reaches zero, the interpreter which allocates the data can release it
"later" (it doesn't have to be done "immediately").

How to lock the whole data or a portion of data to prevent data races?
If data doesn't contain any PyObject, it may be safe to allow
concurrent writes, but readers should be prepared for inconsistencies
depending on the access pattern. If two interpreters access separated
parts of the data, we may allow lock-free access.

I don't think that we have to reinvent the wheel. threading,
multiprocessing and asyncio have already designed such APIs. We should
design similar APIs and maybe even simply reuse code.

My hope is that "synchronization" (locks in particular) will be more
efficient within the same process than synchronization between multiple
processes.
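As a rough illustration of that hope, a toy micro-benchmark (timings vary a lot by platform, and this only measures the uncontended case) can compare an in-process threading.Lock with a multiprocessing.Lock, which is backed by an OS-level semaphore so it can work across processes:

```python
import multiprocessing
import threading
import time

def bench(lock, n=100_000):
    """Time n uncontended acquire/release cycles of `lock`."""
    start = time.perf_counter()
    for _ in range(n):
        with lock:
            pass
    return time.perf_counter() - start

if __name__ == '__main__':
    # Same `with lock:` API, very different machinery underneath.
    print(f"threading.Lock:       {bench(threading.Lock()):.4f}s")
    print(f"multiprocessing.Lock: {bench(multiprocessing.Lock()):.4f}s")
```

The point is less the numbers than the API shape: both locks are used identically, which is why reusing the threading/multiprocessing API designs for subinterpreter synchronization is attractive.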

--

I would be interested to have a generic implementation of a "remote
object": an empty proxy object which forwards all operations to a
different interpreter. It will likely be inefficient, but it may be
convenient for a start. If a method returns an object, a new proxy
should be created. Simple scalar types like int and short strings may
be serialized (copied).

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.


[Python-Dev] Maintainer for Multiprocessing

2020-05-06 Thread Philipp Helo Rehs
Hello,

it looks like Davin is no longer active, and there is a pending merge request 
that has been open for more than two years.

https://github.com/python/cpython/pull/4819

How can this get merged?

Kind regards
 Philipp Rehs


[Python-Dev] Re: PoC: Subinterpreters 4x faster than sequential execution or threads on CPU-bound workload

2020-05-06 Thread Guido van Rossum
Okay, a picture is emerging. It sounds like GIL-free subinterpreters may
one day shine because IPC is faster and simpler within one process than
between multiple processes. This is not exactly what I got from PEP 554 but
it is sufficient for me to have confidence in the project.



-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*



[Python-Dev] Re: PoC: Subinterpreters 4x faster than sequential execution or threads on CPU-bound workload

2020-05-06 Thread Antoine Pitrou
On Tue, 5 May 2020 18:59:34 -0700
Nathaniel Smith  wrote:
> On Tue, May 5, 2020 at 3:47 PM Guido van Rossum  wrote:
> >
> > This sounds like a significant milestone!
> >
> > Is there some kind of optimized communication possible yet between 
> > subinterpreters? (Otherwise I still worry that it's no better than 
> > subprocesses -- and it could be worse because when one subinterpreter 
> > experiences a hard crash or runs out of memory, all others have to die with 
> > it.)  
> 
> As far as I understand it, the subinterpreter folks have given up on
> optimized passing of objects, and are only hoping to do optimized
> (zero-copy) passing of raw memory buffers.

Which would be useful already, especially with pickle out-of-band
buffers.
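Concretely, this is pickle protocol 5 (PEP 574). The sketch below is adapted from the ZeroCopyByteArray example in the pickle documentation: a bytearray subclass opts in to out-of-band transfer, so the pickle payload and the large buffer travel separately (over a hypothetical channel) and no copy of the data lands inside the pickle stream:

```python
import pickle
from pickle import PickleBuffer

class ZeroCopyByteArray(bytearray):
    """bytearray that pickles its buffer out-of-band under protocol 5
    (adapted from the example in the pickle docs)."""

    def __reduce_ex__(self, protocol):
        if protocol >= 5:
            return type(self)._reconstruct, (PickleBuffer(self),), None
        # Fall back to an in-band copy for older protocols.
        return type(self)._reconstruct, (bytearray(self),)

    @classmethod
    def _reconstruct(cls, obj):
        with memoryview(obj) as m:
            # Get a handle on the original buffer object.
            obj = m.obj
            if type(obj) is cls:
                return obj  # same process: zero-copy round trip
            return cls(obj)

b = ZeroCopyByteArray(b"abc")
buffers = []
payload = pickle.dumps(b, protocol=5, buffer_callback=buffers.append)
assert buffers  # the bytes went out-of-band, not into the payload
restored = pickle.loads(payload, buffers=buffers)
assert restored == b
```

Between subinterpreters, the `buffers` list is exactly what one would hope to hand over zero-copy, with only the small payload being serialized.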

Regards

Antoine.



[Python-Dev] Re: Maintainer for Multiprocessing

2020-05-06 Thread Antoine Pitrou
On Wed, 06 May 2020 09:42:02 -
"Philipp Helo Rehs"  wrote:
> Hello,
> 
> it looks like Davin is no longer active, and there is a pending merge request 
> that has been open for more than two years.
> 
> https://github.com/python/cpython/pull/4819

For the record, I've punted on this for a while because reviewing it
properly means taking a dive into the multiprocessing Proxy / Manager
implementation.  It's also quite low-priority for me (I've never used
that part of multiprocessing).

As for Davin, it seems he posted a reply on the PR.

Regards

Antoine.



[Python-Dev] Re: PoC: Subinterpreters 4x faster than sequential execution or threads on CPU-bound workload

2020-05-06 Thread Barry Scott


> On 5 May 2020, at 23:40, Guido van Rossum  wrote:
> 
> Is there some kind of optimized communication possible yet between 
> subinterpreters? (Otherwise I still worry that it's no better than 
> subprocesses -- and it could be worse because when one subinterpreter 
> experiences a hard crash or runs out of memory, all others have to die with 
> it.)
> 

I had already concluded that this would not be useful for the use cases I have 
at work.
The risk of running out of memory or of a hard crash is what would stop me 
from using this in production.

For my day job I work on a service that forks slave processes to handle I/O 
transactions.
There is a monitor process that manages the total memory of all slaves and 
shuts down and replaces slaves when they use too much memory. Typically there 
are 60 to 100 slaves with
a core each to play with.
The service runs 24x365.

Barry
