[issue40440] allow array.array construction from memoryview w/o copy

2020-05-01 Thread Davin Potts


Davin Potts  added the comment:

Being able to create an array.array without making a copy of a memoryview's 
contents does sound valuable.  We do not always want to modify the size of the 
array, as evidenced by array.array's existing behavior of suppressing its 
size-changing manipulations (like append) while a buffer is exported.  So I 
think it is okay not to require that a copy be made when constructing an 
array.array in this way.

Serhiy's example is a good one for demonstrating how different parts of an 
array.array can be treated as having different types as far as getting and 
setting items.  I have met a number of hardware groups, mostly at larger 
companies, that use array.array to expose raw data read directly from 
devices.  They wastefully make copies of their often large array.array 
objects, each with a distinct type code, so that they can use array.array's 
index() and count() and other methods, which are not available on a 
memoryview.
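
To make the cost concrete, a minimal sketch (the byte values and type code 
here are made up; interpretation assumes a little-endian machine):

    import array

    raw = bytearray(b"\x01\x00\x02\x00\x02\x00\x03\x00")  # e.g. raw int16 device data
    buf = memoryview(raw)

    # Today, getting array.array's richer API means copying the data:
    a = array.array('h', bytes(buf))  # two copies: bytes() and the constructor
    print(a.count(2))                 # -> 2; count()/index() exist here

    # A memoryview can reinterpret the same memory with zero copies ...
    mv = buf.cast('h')
    print(mv[1])                      # -> 2
    # ... but it has no count()/index(), which is what drives the copies above.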

Within the core of Python (that is, including the standard library but 
excluding 3rd party packages), we have a healthy number of examples of objects 
that expose a buffer via the Buffer Protocol, but they lack the symmetry of 
going the other way to enable creation from an existing buffer.  My sense is 
it would be a welcome thing for something like array.array, which is designed 
to work with low-level data types, to support creation from an existing buffer 
without the need for a copy.  This is the explicit purpose of the Buffer 
Protocol, after all, but array.array currently supports only export, not 
creation, which makes it feel inconsistent.

--
nosy: +davin

___
Python tracker <https://bugs.python.org/issue40440>



[issue39584] multiprocessing.shared_memory: MacOS crashes by running attached Python code

2020-02-12 Thread Davin Potts


Davin Potts  added the comment:

My sense is that it would be nice if we could catch this before ftruncate does 
something nasty.

Where else is ftruncate used in CPython that this could similarly trigger a 
problem?  How is it handled there (or not)?
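
As a rough illustration of the kind of pre-check that might help (purely a 
sketch; the predicate and its placement are assumptions, not an actual fix):

    import os

    def checked_ftruncate(fd, size):
        # Reject obviously bad sizes before handing them to the OS; the
        # exact bounds that trigger the MacOS crash would need pinning down.
        if size < 0:
            raise ValueError("size must be non-negative")
        os.ftruncate(fd, size)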

--

___
Python tracker <https://bugs.python.org/issue39584>



[issue33082] multiprocessing docs bury very important 'callback=' args

2019-09-13 Thread Davin Potts


Davin Potts  added the comment:

I appreciate the functionality offered by the callbacks and have found good 
uses for them, as Chad clearly does/has.  That said, the thought of expanding 
the documentation on the callbacks had not come up for me.

Reading through the proposed changes to the prose explanations, the choice of 
words has changed, but not significantly, and virtually no new concepts are 
explained.

I agree with Julien that the docs should stay as they are.

Chad:  Thank you for advocating for things you think more people need to know 
about even if we do not update the docs this time.

--
resolution:  -> rejected
stage: patch review -> resolved
status: open -> closed

___
Python tracker <https://bugs.python.org/issue33082>



[issue35727] sys.exit() in a multiprocessing.Process does not align with Python behavior

2019-09-13 Thread Davin Potts


Davin Potts  added the comment:

I believe the rationale behind multiprocessing.Process reporting an exit code 
of 1 when sys.exit() is invoked inside its process is to indicate a 
non-standard exit out of its execution.  There may yet be other side effects 
that could be triggered by having sys.exit(0) translate into an exit code of 0 
from the Process's process -- and we might not notice them with the current 
tests.
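
For reference, a small script demonstrating the behavior under discussion (the 
printed value is 1 on the affected versions, whereas a plain script exiting 
via sys.exit() has status 0):

    import multiprocessing
    import sys

    def worker():
        sys.exit()  # at the top level of a script, this would mean status 0

    if __name__ == '__main__':
        p = multiprocessing.Process(target=worker)
        p.start()
        p.join()
        print(p.exitcode)  # prints 1, not 0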

Was there a particular use case that motivates this suggested change?

--

___
Python tracker <https://bugs.python.org/issue35727>



[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2019-09-13 Thread Davin Potts


Change by Davin Potts :


--
pull_requests: +15722
pull_request: https://github.com/python/cpython/pull/16103

___
Python tracker <https://bugs.python.org/issue22393>



[issue37652] Multiprocessing shared_memory ValueError on race with ShareableList

2019-09-11 Thread Davin Potts


Davin Potts  added the comment:

Apologies, one of the quotes in my previous response should have been 
attributed to @mental.

I think @pierreglaser phrased it very nicely:
> shared_memory is a low level python module. Precautions should be made when 
> handling concurrently the shared_memory objects using synchronization 
> primitives for example. I'm not sure this should be done internally in the 
> SharedMemory class -- especially, we don't want to slow down concurrent READ 
> access.


Per the further suggestion:
> +1 For a documentation addition.

I can take a crack at adding something more along the lines of this discussion, 
but I would very much welcome suggestions (@bjs, @mental, @pierreglaser)...

--

___
Python tracker <https://bugs.python.org/issue37652>



[issue37652] Multiprocessing shared_memory ValueError on race with ShareableList

2019-09-11 Thread Davin Potts


Davin Potts  added the comment:

Short responses to questions/comments from @bjs, followed by hopefully helpful 
further comments:

> Are you supposed to ever use a raw SharedMemory buffer directly?

Yes.


> What atomicity guarantees are there for ShareableList operations and 
> read/write to the SharedMemory buffer?

None.


> I've had a fix spinning for about a day now, it introduced a 
> `multiprocessing.Lock` and it was simply wrapped around any struct packing 
> and unpacking calls.

That sounds like a nice general-purpose fix for situations where it is 
impossible to plan ahead to know when one or more processes will need to modify 
the ShareableList/SharedMemory.buf.  When it is possible to design code so 
that, by virtue of its execution flow, no two processes/threads will attempt 
to modify and access the same location in memory at the same time, locks 
become unnecessary.  Locks are great tools, but they generally result in 
slower-executing code.
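
A minimal sketch of that lock-wrapping approach at the application level (the 
names here are illustrative, not part of any proposed stdlib change):

    from multiprocessing import Lock, shared_memory

    lock = Lock()
    sl = shared_memory.ShareableList([0] * 8)

    def locked_increment(index):
        # Serialize the read-modify-write so concurrent writers cannot race.
        with lock:
            sl[index] += 1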


> What are the use cases for SharedMemory and ShareableList?

Speed.  If we don't care about speed, we can use distributed shared memory 
through the SyncManager -- this keeps one copy of a dict/list/whatever in the 
process memory space of a single process and all other processes may modify or 
access it through two-sided communication with that "owner" process.  If we do 
care about speed, we use SharedMemory and ShareableList and other things 
created on top of SharedMemory -- this effectively gives us fast, communal 
memory access where we avoid the cost of communication except for when we truly 
need to synchronize (where multiprocessing.Lock can help).

Reduced memory footprint.  If I have a "very large" amount of data consuming a 
significant percentage of the available memory on my system, I can make it 
available to multiple processes without duplication.  This provides processes 
with fast access to that data, as fast as if each were accessing data in its 
own process memory space.  It is one thing to imagine using this in 
parallel-executing code, but this can be just as useful in the Python 
interactive shell.  One such scenario:  after starting a time-consuming, 
non-parallel calculation in one Python shell, it is possible to open a new 
Python shell in another window and attach to the data through shared memory to 
continue work while the calculation runs in the first window.
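
That interactive-shell scenario looks roughly like this (the block name shown 
is illustrative; each run generates its own):

    # First Python shell:
    from multiprocessing import shared_memory
    shm = shared_memory.SharedMemory(create=True, size=2**20)
    print(shm.name)   # e.g. 'psm_32a91c3f' -- note it down

    # Second Python shell, opened in another window:
    from multiprocessing import shared_memory
    shm = shared_memory.SharedMemory(name='psm_32a91c3f')  # attach, no copy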

--

___
Python tracker <https://bugs.python.org/issue37652>



[issue37637] multiprocessing numpy.ndarray not transmitted properly

2019-09-11 Thread Davin Potts


Davin Potts  added the comment:

Marking as closed after providing an example of how to send NumPy arrays as 
bytes with the send_bytes() function.
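
For reference, sending an ndarray as raw bytes looks roughly like this (a 
sketch; the dtype and shape must be agreed upon out-of-band):

    import numpy as np
    from multiprocessing import Pipe

    parent_conn, child_conn = Pipe()
    arr = np.arange(6, dtype=np.float64).reshape(2, 3)

    parent_conn.send_bytes(arr.tobytes())  # raw bytes, no pickling of the array
    data = child_conn.recv_bytes()
    arr2 = np.frombuffer(data, dtype=np.float64).reshape(2, 3)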

--
resolution:  -> not a bug
stage:  -> resolved
status:  -> closed

___
Python tracker <https://bugs.python.org/issue37637>



[issue38119] resource tracker destroys shared memory segments when other processes should still have valid access

2019-09-11 Thread Davin Potts


Change by Davin Potts :


--
keywords: +patch
pull_requests: +15618
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/15989

___
Python tracker <https://bugs.python.org/issue38119>



[issue37754] Consistency of Unix's shared_memory implementation with windows

2019-09-11 Thread Davin Potts


Davin Potts  added the comment:

I have created issue38119 to track a fix to the inappropriate use of resource 
tracker with shared memory segments, but this does not replace or supersede 
what is discussed here.

--

___
Python tracker <https://bugs.python.org/issue37754>



[issue38119] resource tracker destroys shared memory segments when other processes should still have valid access

2019-09-11 Thread Davin Potts


New submission from Davin Potts :

The resource tracker currently destroys (via _posixshmem.shm_unlink) shared 
memory segments on posix systems when any independently created Python process 
with a handle on a shared memory segment exits (gracefully or otherwise).  This 
breaks the expected cross-platform behavior that a shared memory segment 
persists at least as long as any running process has a handle on that segment.

As described with an example scenario in issue37754:
Let's say a three processes P1, P2 and P3 are trying to communicate using 
shared memory.
 --> P1 creates the shared memory block, and waits for P2 and P3 to access it.
 --> P2 starts and attaches this shared memory segment, writes some data to it 
and exits.
 --> Now in case of Unix, shm_unlink is called as soon as P2 exits. (This is by 
action of the resource tracker.)
 --> Now, P3 starts and tries to attach the shared memory segment.
 --> P3 will not be able to attach the shared memory segment in Unix, because 
shm_unlink has been called on that segment.
 --> Whereas, P3 will be able to attach to the shared memory segment in Windows.

Another key scenario we expect to work but does not currently:
1. A multiprocessing.managers.SharedMemoryManager is instantiated and started 
in process A.
2. A shared memory segment is created using that manager in process A.
3. A serialized representation of that shared memory segment is deserialized in 
process B.
4. Process B does work with the shared memory segment that is also still 
visible to process A.
5. Process B exits cleanly.
6. Process A reads data from the shared memory segment after process B is gone. 
 (This currently fails.)
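
A bare-bones sketch of that second scenario (process boundaries compressed 
into comments; names are illustrative):

    # Process A:
    from multiprocessing.managers import SharedMemoryManager

    smm = SharedMemoryManager()
    smm.start()
    shm = smm.SharedMemory(size=128)   # step 2; shm.name is sent to process B

    # Process B -- attaches by name, works, then exits cleanly:
    from multiprocessing import shared_memory
    shm_b = shared_memory.SharedMemory(name=shm.name)
    shm_b.buf[0] = 42
    shm_b.close()   # on posix, B's exit also triggers shm_unlink via the tracker

    # Process A, after B is gone -- step 6, which currently fails on posix:
    print(shm.buf[0])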

The SharedMemoryManager provides a flexible means for ensuring cleanup of 
shared memory segments.  The current resource tracker attempts to treat shared 
memory segments as equivalent to semaphore references, which is too narrow of 
an interpretation.  As such, the current resource tracker should not be 
attempting to enforce cleanup of shared memory segments because it breaks 
expected behavior and significantly limits functionality.

--
assignee: davin
components: Library (Lib)
messages: 351960
nosy: davin, pablogsal, pitrou, vinay0410, vstinner
priority: normal
severity: normal
status: open
title: resource tracker destroys shared memory segments when other processes 
should still have valid access
type: behavior
versions: Python 3.8, Python 3.9

___
Python tracker <https://bugs.python.org/issue38119>



[issue35267] reproducible deadlock with multiprocessing.Pool

2019-09-11 Thread Davin Potts


Davin Potts  added the comment:

I second what @vstinner already said in the comments for PR11143, that this 
should not merely be documented.

--
nosy: +davin

___
Python tracker <https://bugs.python.org/issue35267>



[issue38084] multiprocessing cannot recover from crashed worker

2019-09-10 Thread Davin Potts


Davin Potts  added the comment:

Agreed with @ppperry that this is a duplicate of issue22393.

The proposed patch in issue22393 is, for the moment, out of sync with more 
recent changes.  That patch's approach would result in the loss of all partial 
results from a Pool.map, but it may be faster to update and review.

--

___
Python tracker <https://bugs.python.org/issue38084>



[issue38084] multiprocessing cannot recover from crashed worker

2019-09-10 Thread Davin Potts


Davin Potts  added the comment:

Thanks to Pablo's good work with implementing the use of multiprocessing's 
Process.sentinel, the logic for handling PoolWorkers that die has been 
centralized into Pool._maintain_pool().  If _maintain_pool() can also identify 
which job died with the dead PoolWorker, then it should be possible to put a 
corresponding message on the outqueue to indicate an exception occurred but 
pool can otherwise continue its work.


The question of whether Pool.map() should expose a timeout parameter deserves 
a separate discussion and should not be considered a path forward on this 
issue, as it would require users to always specify -- and somehow know 
beforehand -- how long it should take for results to be returned from workers.  
Exposing the timeout control may have other practical benefits elsewhere, but 
not here.

--

___
Python tracker <https://bugs.python.org/issue38084>



[issue38084] multiprocessing cannot recover from crashed worker

2019-09-10 Thread Davin Potts


Davin Potts  added the comment:

Sharing for the sake of documenting a few things going on in this particular 
example:
* When a PoolWorker process exits in this way (os._exit(anything)), the 
PoolWorker never gets the chance to send a signal of failure (normally sent via 
the outqueue) to the MainProcess.
* In the current logic of the MainProcess, Pool._maintain_pool() detects the 
termination of that PoolWorker process and starts a new PoolWorker process to 
replace it, maintaining the desired size of Pool.
* The infinite hang observed in this example comes from the original p.map() 
call performing an unlimited-timeout wait for a result to appear on the 
outqueue.  This wait is performed in MapResult.get(), which does expose a 
timeout parameter, though that parameter cannot be controlled through 
Pool.map().  It is not at all a correct, general solution, but exposing 
control of this timeout and setting it to 1.0 seconds permits Steve's repro 
code snippet to run to completion (no infinite hang; it raises a 
multiprocessing.context.TimeoutError).
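
A minimal repro along the lines described (a sketch reconstructing the 
behavior, not Steve's exact snippet):

    import multiprocessing
    import os

    def crash(_):
        os._exit(1)   # dies without putting a failure message on the outqueue

    if __name__ == '__main__':
        with multiprocessing.Pool(2) as pool:
            pool.map(crash, range(4))   # hangs: MapResult.get() waits forever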

--

___
Python tracker <https://bugs.python.org/issue38084>



[issue38018] Increase Code Coverage for multiprocessing.shared_memory

2019-09-09 Thread Davin Potts


Davin Potts  added the comment:

Initial review of the test failure suggests a likely flaw in the mechanism used 
by the resource tracker.

I will continue investigating more tomorrow.

--

___
Python tracker <https://bugs.python.org/issue38018>



[issue38018] Increase Code Coverage for multiprocessing.shared_memory

2019-09-09 Thread Davin Potts


Davin Potts  added the comment:


New changeset d14e39c8d9a9b525c7dcd83b2a260e2707fa85c1 by Davin Potts (Vinay 
Sharma) in branch 'master':
bpo-38018: Increase code coverage for multiprocessing.shared_memory (GH-15662)
https://github.com/python/cpython/commit/d14e39c8d9a9b525c7dcd83b2a260e2707fa85c1


--

___
Python tracker <https://bugs.python.org/issue38018>



[issue37185] use os.memfd_create in multiprocessing.shared_memory?

2019-09-09 Thread Davin Potts


Davin Potts  added the comment:

Unless I am missing something, memfd_create still appears to be specific to 
the Linux kernel, so we would need to replicate its behavior on all of the 
other unix systems.

To your point, but quoting from the docs, "separate invocations of memfd_create 
with the same name will not return descriptors for the same region of memory".  
If it is possible to use the anonymous shared memory created via memfd_create 
in another process (which is arguably the primary motivation / use case for 
multiprocessing.shared_memory), we would need to replicate the unique way of 
referencing a shared memory segment when trying to attach to it from other 
processes.
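
For illustration, the name-is-not-a-key behavior (Linux-only; os.memfd_create 
was added in Python 3.8):

    import os

    fd1 = os.memfd_create("demo")   # 'demo' is only a debugging label
    fd2 = os.memfd_create("demo")   # a *different* anonymous region, same label
    os.ftruncate(fd1, 4096)
    # To share fd1 with another process, the descriptor itself must be
    # passed along (e.g. over a Unix socket), unlike shm_open's global names.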

To permit resource management of a shared memory segment (in the sense of 
ensuring the shared memory segment is always unlinked at the end), the 
multiprocessing.managers.SharedMemoryManager exists.  Because destroying a 
shared memory segment at exit is not always desirable, the SharedMemoryManager 
provides additional control over when it is appropriate to unlink a shared 
memory segment.

--

___
Python tracker <https://bugs.python.org/issue37185>



[issue37754] Consistency of Unix's shared_memory implementation with windows

2019-09-09 Thread Davin Potts


Davin Potts  added the comment:

A shared semaphore approach for the resource tracker sounds appealing as a way 
to make the behavior on Windows and posix systems more consistent.  However 
this might get implemented, we should not artificially prevent users from 
having some option to persist beyond the last Python process's exit.

I like the point that @eryksun makes that we could instead consider using 
NtMakePermanentObject on Windows to permit more posix-like behavior instead, 
but I do not think we want to head down a path of using undocumented NT APIs.

In the current code, the resource tracker inappropriately triggers 
_posixshmem.shm_unlink; we need to fix this in the immediate short term (before 
3.8 is released) as it breaks the expected behavior @vinay0410 describes.

--

___
Python tracker <https://bugs.python.org/issue37754>



[issue37754] alter size of segment using multiprocessing.shared_memory

2019-08-17 Thread Davin Potts


Davin Potts  added the comment:

Attempts to alter the size of a shared memory segment are met with a variety of 
different, nuanced behaviors on systems we want to support.  I agree that it 
would be valuable to be able to effectively realloc a shared memory segment, 
which thankfully the user can do with the current implementation, although 
they become responsible for adjusting to platform-specific behaviors.  The design 
of the API in multiprocessing.shared_memory strives to be as feature-rich as 
possible while providing consistent behavior across platforms that can be 
reasonably supported; it also leaves the door open (so to speak) for users to 
exploit additional platform-specific capabilities of shared memory segments.
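
For example, a user-level resize today might look like this (a sketch; it 
reaches into the private _fd attribute of the posix implementation and 
inherits all of the platform quirks mentioned above):

    import os
    from multiprocessing import shared_memory

    shm = shared_memory.SharedMemory(create=True, size=1024)
    # Grow the underlying object where the platform allows it; MacOS, for
    # example, restricts resizing, which is why this is left to the user.
    os.ftruncate(shm._fd, 4096)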

Knowing beforehand whether to create a segment or attach to an existing one is 
an important feature for a variety of use cases.  I believe this is discussed 
at some length in issue35813.  If what is discussed there does not help (it did 
get kind of long sometimes), please say so and we can talk through it more.

--

___
Python tracker <https://bugs.python.org/issue37754>



[issue33725] Python crashes on macOS after fork with no exec

2019-05-05 Thread Davin Potts


Davin Potts  added the comment:

Victor raises an important question:  should the *default* start behavior be 
made consistent across platforms?  Assuming we change it on MacOS, the default 
start behavior on Windows and MacOS will be spawn but the default start 
behavior on Linux and FreeBSD (among others) will be fork.

Reasons to consider such a breaking change:
* This inconsistency in default start behavior on different platforms (Windows 
versus not) has historically been a significant source of confusion for many, 
many users.
* These days, the majority of users are not already familiar with the rule 
"fork-before-creating-threads" and so are surprised and confused when they fork 
a process that already has spun up multiple threads and bad things happen.
* We are changing the default on one platform (MacOS), which should prompt us 
to consider how our defaults are set elsewhere.

Reasons to reject such a breaking change:
* Though changing the default does not break everyone's code everywhere, it 
will require changes to any code that depends upon the default start method AND 
depends upon data/functions/stuff from the parent to also be present in the 
forked child process.

--
nosy: +pablogsal

___
Python tracker <https://bugs.python.org/issue33725>



[issue33725] Python crashes on macOS after fork with no exec

2019-05-05 Thread Davin Potts


Davin Potts  added the comment:

I believe we must change the default behavior on MacOS to use spawn instead of 
fork.  Encouraging people to use fork by default on MacOS is encouraging them 
to create something that effectively will not work.  Keeping fork as the 
default behavior when we have already turned off all of the tests of fork 
behavior on MacOS also makes no sense.  Existing Python code that depends upon 
the default behavior (fork) on MacOS has already been broken -- if we make this 
change, we are arguably not breaking anyone's working code.

Users can and will still be able to specify the start mechanism on MacOS, 
including fork.  This empowers users to continue to handle even the most 
esoteric use cases without loss of functionality from multiprocessing.  Though 
admittedly, without an ability to test the behavior of fork, this will need to 
be marked as deprecated.
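
Opting back in would remain a one-liner (sketch):

    import multiprocessing

    if __name__ == '__main__':
        multiprocessing.set_start_method('fork')  # explicit opt-in on MacOS
        # ... create Process/Pool objects as usual ...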

I will supply a patch making this change and updating the docs shortly after 
PyCon.

--

___
Python tracker <https://bugs.python.org/issue33725>



[issue36364] errors in multiprocessing.shared_memory examples

2019-04-01 Thread Davin Potts


Davin Potts  added the comment:

Very much agreed, they're moving over to the main docs.

--

___
Python tracker <https://bugs.python.org/issue36364>



[issue33725] Python crashes on macOS after fork with no exec

2019-03-09 Thread Davin Potts


Davin Potts  added the comment:

As best as I can see, there is no magic bullet to help mitigate this.

At a minimum, I am convinced we need to update the documentation to describe 
this behavior on MacOS and recommend alternatives.

I continue to give serious thought to the idea of changing the default start 
method on MacOS from fork to spawn.  This would be a breaking change though one 
could argue MacOS has already undergone a breaking change.  Is such a change 
warranted?

The alternative (which does not seem all that appealing) is that we start 
encouraging everyone to first consider the start method before attempting to 
use multiprocessing even for their first time.  Providing sensible defaults is 
to be preferred, but changing the default to reflect a non-trivial change in 
the underlying platform is still not to be taken lightly.

--

___
Python tracker <https://bugs.python.org/issue33725>



[issue36102] TestSharedMemory fails on AMD64 FreeBSD CURRENT Shared 3.x

2019-02-25 Thread Davin Potts


Davin Potts  added the comment:

Closing.

Thank you Giampaolo for jumping in so quickly to review!

Thank you Victor for catching this on the buildbot.  Though what is this talk 
of "_if_ the color changes"? ;)

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker <https://bugs.python.org/issue36102>



[issue36102] TestSharedMemory fails on AMD64 FreeBSD CURRENT Shared 3.x

2019-02-25 Thread Davin Potts


Davin Potts  added the comment:


New changeset aadef2b41600cb6a4f845cdc4cea001c916d8745 by Davin Potts in branch 
'master':
bpo-36102: Prepend slash to all POSIX shared memory block names (#12036)
https://github.com/python/cpython/commit/aadef2b41600cb6a4f845cdc4cea001c916d8745


--

___
Python tracker <https://bugs.python.org/issue36102>



[issue36102] TestSharedMemory fails on AMD64 FreeBSD CURRENT Shared 3.x

2019-02-25 Thread Davin Potts


Change by Davin Potts :


--
stage:  -> patch review

___
Python tracker <https://bugs.python.org/issue36102>



[issue36102] TestSharedMemory fails on AMD64 FreeBSD CURRENT Shared 3.x

2019-02-25 Thread Davin Potts


Davin Potts  added the comment:

I have locally tested GH-12036 on all 5 of the aforementioned OSes and all are 
made happy by the patch.

Victor:  If we want to go ahead and apply this patch right away to hopefully 
make the FreeBSD buildbot go green, the nature of this change is sufficiently 
small that it will still be easy to change during the alpha period.

--
stage: patch review -> 

___
Python tracker <https://bugs.python.org/issue36102>



[issue36102] TestSharedMemory fails on AMD64 FreeBSD CURRENT Shared 3.x

2019-02-25 Thread Davin Potts


Change by Davin Potts :


--
keywords: +patch
pull_requests: +12065
stage:  -> patch review

___
Python tracker <https://bugs.python.org/issue36102>



[issue36102] TestSharedMemory fails on AMD64 FreeBSD CURRENT Shared 3.x

2019-02-25 Thread Davin Potts


Davin Potts  added the comment:

In local testing, I found the following systems to impose the leading slash as 
a requirement for simply creating a shared memory block:
* NetBSD 8.0
* FreeBSD 12.x
* TrueOS 18.12 (the OS formerly known as PC-BSD)

I found the following systems to have no required leading slash and all tests 
currently pass without modification:
* OpenBSD 6.4
* DragonflyBSD 5.4

--

___
Python tracker <https://bugs.python.org/issue36102>



[issue36102] TestSharedMemory fails on AMD64 FreeBSD CURRENT Shared 3.x

2019-02-25 Thread Davin Potts


Davin Potts  added the comment:

Though apparently undocumented on FreeBSD, their implementation of shm_open 
differs from others in the following way:  all names for shared memory blocks 
*must* begin with a slash.  This requirement does not exist on OpenBSD.

According to its man page on shm_open, FreeBSD does at least communicate the 
following non-standard, additional requirement:
    Two processes opening the same path
    are guaranteed to access the same shared memory object if and only if
    path begins with a slash (`/') character.


Given that this requirement is not universal and because a leading slash 
controls other behaviors on platforms like Windows, it would be confusing to 
make a leading slash a universal requirement.

Likewise, requiring users on FreeBSD to be aware of this nuance would be 
contrary to the goals of the SharedMemory class.


I will prepare a patch to prepend a leading slash onto the requested shared 
memory block name and detect the need for it.
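
The normalization amounts to something like this (a sketch of the idea only; 
the real change is in GH-12036):

    def _normalized_block_name(name):
        # FreeBSD (and, per testing, NetBSD/TrueOS) require the leading
        # slash; prepend it when absent so user-facing names stay portable.
        return name if name.startswith('/') else '/' + name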

I have verified that this does solve the problem on FreeBSD and that all tests 
then pass.  I will test NetBSD and DragonflyBSD to see if they also impose 
FreeBSD's undocumented requirement.

--

___
Python tracker <https://bugs.python.org/issue36102>



[issue36099] Clarify the difference between mu and xbar in the statistics documentation

2019-02-23 Thread Davin Potts


Davin Potts  added the comment:

Without necessarily defining what each means, perhaps it is sufficient to 
change this clause in the docs:

    it should be the mean of data

For pvariance() it could read as:

    it should be the *population* mean of data

And for variance() it could read as:

    it should be the *sample* mean of data
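
Concretely, the distinction matters like this (a small sketch):

    from statistics import mean, pvariance, variance

    data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]

    mu = mean(data)                # here the full data *is* the population
    print(pvariance(data, mu))     # mu: the *population* mean

    sample = data[:4]
    xbar = mean(sample)            # mean computed from the sample itself
    print(variance(sample, xbar))  # xbar: the *sample* mean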

--
nosy: +davin

___
Python tracker <https://bugs.python.org/issue36099>



[issue36018] Add a Normal Distribution class to the statistics module

2019-02-23 Thread Davin Potts


Davin Potts  added the comment:

Steven: Your point about population versus sample makes sense and your point 
that altering their names would be a breaking change is especially important.  
I think that pretty well puts an end to my suggestion of alternative names and 
says the current pattern should be kept with NormalDist.

I particularly like the idea of using the TI Nspire and Casio Classpad to guide 
or help confirm what symbols might be recognizable to secondary students or 1st 
year university students.


Raymond: As an idea for examples demonstrating the code, what about an example 
where a plot of pdf is created, possibly for comparison with cdf?  This would 
require something like matplotlib but would help to visually communicate the 
concepts of pdf, perhaps with different sigma values?
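
Roughly the kind of example being suggested (a sketch, assuming matplotlib is 
available):

    from statistics import NormalDist
    import matplotlib.pyplot as plt

    xs = [x / 50 for x in range(-300, 301)]
    for sigma in (0.5, 1.0, 2.0):
        dist = NormalDist(mu=0.0, sigma=sigma)
        plt.plot(xs, [dist.pdf(x) for x in xs], label=f'sigma={sigma}')
    plt.legend()
    plt.title('NormalDist pdf for several sigma values')
    plt.show()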

--

___
Python tracker <https://bugs.python.org/issue36018>



[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-23 Thread Davin Potts


Davin Potts  added the comment:

@Giampaolo:  The docstring in the shared_memory module currently marks the API 
as experimental.  (You read my mind...)

I will start a new PR where we can work on the 
better-integration-into-the-larger-multiprocessing-docs and add comments there.

--

___
Python tracker <https://bugs.python.org/issue35813>



[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-23 Thread Davin Potts


Davin Potts  added the comment:


New changeset e895de3e7f3cc2f7213b87621cfe9812ea4343f0 by Davin Potts in branch 
'master':
bpo-35813: Tests and docs for shared_memory (#11816)
https://github.com/python/cpython/commit/e895de3e7f3cc2f7213b87621cfe9812ea4343f0


--

___
Python tracker <https://bugs.python.org/issue35813>



[issue36018] Add a Normal Distribution class to the statistics module

2019-02-23 Thread Davin Potts


Davin Potts  added the comment:

There is an inconsistency worth paying attention to in the choice of names of 
the input parameters.

Currently in the statistics module, pvariance() and pstdev() each accept a 
parameter named "mu", while variance() and stdev() each accept a parameter 
named "xbar".  The docs describe both "mu" and "xbar" as "it should be the 
mean of data".  I suggest it 
is worth rationalizing the names used within the statistics module for 
consistency before reusing "mu" or "xbar" or anything else in NormalDist.

Using the names of mathematical symbols that are commonly used to represent a 
concept is potentially confusing because those symbols are not always 
*universally* used.  For example, students are often introduced to new concepts 
in introductory mathematics texts where concepts such as "mean" appear in 
formulas and equations not as "mu" but as "xbar" or simply "m" or other simple 
(and hopefully "friendly") names/symbols.  As a mathematician, if I am told a 
variable is named "mu", I still feel the need to ask what it represents.  
Sure, I can try guessing based upon context but I will usually have more than 
one guess that I could make.

Rather than continue down a path of using various 
mathematical-symbols-written-out-in-English-spelling, one alternative would be 
to use less ambiguous, more informative variable names such as "mean".  It 
might be worth considering a change to the parameter names of "mu" and "sigma" 
in NormalDist to names like "mean" and "stddev", respectively.  Or perhaps 
"mean" and "standard_deviation".  Or perhaps "mean" and "variance" would be 
easier still (recognizing that variance can be readily computed from standard 
deviation in this particular context).  In terms of consistency with other 
packages that users are likely to also use, scipy.stats functions/objects 
commonly refer to these concepts as "mean" and "var".

I like the idea of making NormalDist readily approachable for students as well 
as those more familiar with these concepts.  The offerings in scipy.stats are 
excellent but they are not always the most approachable things for new students 
of statistics.

--

___
Python tracker <https://bugs.python.org/issue36018>



[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-23 Thread Davin Potts


Davin Potts  added the comment:

> FWIW I bumped into this lib: http://semanchuk.com/philip/sysv_ipc/

The author of that lib, Philip Semanchuk, is one of the people participating in 
this effort -- he has posted above in msg334934 here on b.p.o. and has helped 
review the PR in GH-11816.

He is also the author of the posix_ipc package which was the original basis for 
our POSIX Shared Memory implementation here.

The decision to base our Unix platform support upon POSIX and not SystemV 
libraries came after considerable research and there are important differences 
between the two.  To oversimplify:  POSIX Shared Memory support has now been 
available for some time on Linux, *BSD, MacOS, and others and is something of a 
successor to the SystemV.


> That assumes a single app/process which spawns a child (the "worker").

Not true.  A manager started by one process can be connected to by another 
process that is not a child.  This is covered in the docs here:  
https://docs.python.org/3/library/multiprocessing.html#using-a-remote-manager  
That second process can then request that shared memory blocks it creates be 
remotely tracked and managed by that remote process's manager.  While I would 
not expect 
this to be a common use case, this is a feature of BaseManager that we inherit 
into SharedMemoryManager.
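
A sketch of that remote arrangement (the address and authkey are illustrative):

    from multiprocessing.managers import SharedMemoryManager

    # Process 1 -- starts the manager that will track segments:
    smm = SharedMemoryManager(address=('127.0.0.1', 50000), authkey=b'secret')
    smm.start()

    # Process 2 -- not a child of process 1; connects to the same manager:
    remote = SharedMemoryManager(address=('127.0.0.1', 50000), authkey=b'secret')
    remote.connect()
    sl = remote.ShareableList(range(8))   # tracked/cleaned up by process 1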

The SyncManager.Lock can be used as part of this as well.  Thus, two unrelated 
apps/processes *can* coordinate their management of shared memory blocks 
through the SharedMemoryManager.


> That would translate into a new Semaphore(name=None, create=False)
> class which (possibly?) would also provide better performances
> compared to SyncManager.Semaphore

Right!  You might have noticed that Philip has such a semaphore construct in 
his posix_ipc lib.

I opted to not attempt to add this feature as part of this effort to both (1) 
keep focused on the core needs to work with shared memory, and (2) to take more 
time in the future to work out how to get cross-platform support for the 
semaphore right (as you point out, there are complications to work through).


> Extra 1: apparently there are also POSIX msgget(), msgrcv() and
> msgsnd() syscalls which could be used to implement a System-V message
> Queue similar to SyncManager.Queue later on.

Right!  This is also something Philip has in his posix_ipc lib.  This should be 
part of the roadmap for what we do next with SharedMemory.  This one may be 
complicated by the fact that not all platforms that implement POSIX Shared 
Memory chose to also implement these functions in the same way.  We will need 
time to work out what we can or can not reasonably do here.


> Extra 2: given the 2 distinct use-cases I wonder if the low-level
> component (shared_memory.py) really belongs to multiprocessing module

Given what I wrote above about how multiprocessing.managers does enable these 
use cases and the existing "distributed shared memory" support in 
multiprocessing, I think it logically belongs in multiprocessing.  I suggest 
that "shm_open" and "shm_unlink" are our low-level tools, which appropriately 
are in _posixshmem, but SharedMemory and the rest are high-level tools; 
SharedMemoryManager will not be able to cover all life-cycle management use 
cases thus SharedMemory will be needed by many and in contrast, "shm_open" and 
"shm_unlink" will be needed only by those wishing to do something wacky.  
(Note: I am not trying to make "wacky" sound like a bad thing because wacky can 
be very cool sometimes.)


Philip's ears should now be burning, I mentioned him so many times in this 
post.  Ah!  He beat me to it while I was writing this.  Awesome!

We would not be where we are with SharedMemory without his efforts over many 
years with his posix_ipc lib.

--

___
Python tracker <https://bugs.python.org/issue35813>



[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-22 Thread Davin Potts


Davin Potts  added the comment:

> Code looks much better now. I'm still not convinced 
> "SharedMemory(name=None, create=False, size=0)" is the best API.
> How are you supposed to "create or attach" atomically?

We are consciously choosing to not support an atomic "create or attach".  This 
significantly simplifies the API and avoids the valid concerns raised around 
user confusion relating to that behavior (including the use of different 
specified 'size' values in a race) but does not preclude our potentially 
introducing this as a feature in the future.

This simpler API still supports a "try: create; except: attach" which is not 
atomic but effectively covers the primary use cases for "create or attach".  
Combined with a SyncManager.Lock, users can already achieve an atomic "create 
or attach" using this simpler API.


> Also, could you address my comment about size?
> https://bugs.python.org/issue35813#msg335731
>> Let me rephrase: are we forced to specify a value (aka call
>> ftruncate()) on create ? If we are as I think, could size have a
>> reasonable default value instead of 0? Basically I'm wondering if we
>> can relieve the common user from thinking about what size to use,
>> mostly because it's sort of a low level detail. Could it perhaps
>> default to mmap.PAGESIZE?

Apologies for not responding to your question already, Giampaolo.

For the same reasons that (in C) malloc does not provide a default size, I do 
not think we should attempt to provide a default here.  Not all platforms 
allocate shared memory blocks in chunks of mmap.PAGESIZE, thus on some 
platforms we would unnecessarily over-allocate no matter what default size we 
might choose.  I do not think we should expect users to know what mmap.PAGESIZE 
is on their system.  I think it is important that if a user requests a new 
allocation of memory, that they first consider how much memory will be needed.  
When attaching to an existing shared memory block, its size is already defined.

I think this even fits with CPython's over-allocation strategies behind things 
like list, where an empty list triggers no malloc at all.  We will not allocate 
memory until the user tells us how much to allocate.


> Also, there is no way to delete/unwrap memory without using an
> existing SharedMemory instance, which is something we may not have
> on startup. Perhaps we should have a "shared_memory.unlink(name)"
> function similar to os.unlink() which simply calls C shm_unlink().

It is not really possible to offer this on non-POSIX platforms so I think we 
should not attempt to offer a public "shared_memory.unlink(name)".  It is 
possible to invoke "shm_unlink" with the name of a shared memory block (for 
those who really need it) on platforms with POSIX Shared Memory support via:
shared_memory._posixshmem.shm_unlink('name')

--

___
Python tracker <https://bugs.python.org/issue35813>



[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-20 Thread Davin Potts


Davin Potts  added the comment:

The simpler API is now implemented in GH-11816 as discussed previously.  
Notably:
> * We go with this simpler API:  SharedMemory(name=None, create=False, size=0)
> * 'size' is ignored when create=False
> * create=True acts like O_CREX and create=False only attaches to existing 
> shared memory blocks

As part of this change, the PosixSharedMemory and WindowsNamedSharedMemory 
classes are no more; they have been consolidated into the SharedMemory class 
with a single, simpler, consistent-across-platforms API.

On the SharedMemory class, 'size' is now stored by the __init__ and does not 
use fstat() as part of its property.

Also, SharedMemoryManager (and its close friends) has been relocated to the 
multiprocessing.managers submodule, matching the organization @Giampaolo 
outlined previously:
multiprocessing.managers.SharedMemoryManager
multiprocessing.managers._SharedMemoryTracker
multiprocessing.managers.SharedMemoryServer  (not documented)
multiprocessing.shared_memory.SharedMemory
multiprocessing.shared_memory.SharedList 
multiprocessing.shared_memory.WindowsNamedSharedMemory  (REMOVED)
multiprocessing.shared_memory.PosixSharedMemory  (REMOVED)

I believe this addresses all of the significant discussion topics in a way that 
brings together all of the excellent points being made.  Apologies if I have 
missed something -- I did not think so but I will go back through all of the 
discussions tomorrow to double-check.

--

___
Python tracker <https://bugs.python.org/issue35813>



[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-17 Thread Davin Potts


Davin Potts  added the comment:

> I think we need the "create with exclusive behavior" option, even
> though we don't know how to implement it on Windows right now.

A fix to avoid the potential race condition on Windows is now part of GH-11816.


> To support 1 & 2, we could just have 'create'.  When true, it would
> act like O_CREX.  When false, you would get an error if the name
> doesn't already exist.

I am good with this and now it can be supported.


> a 3rd case where you have "co-equal" processes and any one of them
> could create and the others would attach.

There are some practical use cases motivating this.  Rather than debate the 
merits of those use cases, given the concern raised, perhaps we should forego 
supporting this 3rd case for now.


> Regarding 'size', I think it is a bit weird how it currently works.
> Maybe 'size' should only be valid if you are creating a new shared
> memory object.

This would avoid potential confusion in the details of how attempts to resize 
do/don't work on different platforms.  I would prefer to not need to explain 
that on MacOS, requesting a smaller size is disallowed.  This defers such 
issues until considering a "resize()" method as you suggest.

I like this.


> Should 'size' be a property that always does fstat() to find the
> size of the underlying file?

The potential exists for non-Python code to attach to these same shared memory 
blocks and alter their size via ftruncate() (only on certain Unix platforms).

We could choose to not support such "external" changes and let size be a fixed 
value from the time of instantiation.  But I would like to believe we can be 
more effective and safely use fstat() behind our reporting of 'size'.
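
Sketched, such a property might look like this (class and attribute names are 
assumed for illustration):

    import os

    class _SharedMemorySketch:
        def __init__(self, fd):
            self._fd = fd   # open descriptor to the shared memory object

        @property
        def size(self):
            # Re-stat on every access so external ftruncate() calls are seen.
            return os.fstat(self._fd).st_size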


> It seems unclear to me how you should avoid cluttering /var/run/shm
> with shared memory objects that people forget to cleanup.

This is the primary purpose of the SharedMemoryManager.  Admittedly, we will 
not convince everyone to use it when they should, just like we are not able to 
convince everyone to use NamedTemporaryFile for their temp files.


To update the proposed change to the API:

* We go with this simpler API:  SharedMemory(name=None, create=False, size=0)
* 'size' is ignored when create=False
* create=True acts like O_CREX and create=False only attaches to existing 
shared memory blocks

Remaining question:  do PosixSharedMemory and WindowsNamedSharedMemory mirror 
this simplified API or do we expose the added functionality each offers, 
permitting informed users to use things like 'mode' when they know it is 
enforced on a particular platform?

--

___
Python tracker <https://bugs.python.org/issue35813>



[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-16 Thread Davin Potts


Davin Potts  added the comment:

@giampaolo:

> Also, what happens if you alter the size of an existing object with a smaller 
> value? Is the memory region overwritten?

Attaching to an existing shared memory block with a size=N which is smaller 
than its allocated size (say it was created with size=M and N < M) does not 
alter the underlying memory region; nothing is overwritten.

> Can't you just avoid calling ftruncate() if size is not passed (None)?

It looks like it does skip calling ftruncate() if size is 0.  From posixshmem.c:

    if (size) {
        DPRINTF("calling ftruncate, fd = %d, size = %ld\n", self->fd, size);
        if (-1 == ftruncate(self->fd, (off_t)size)) {


>> I think this misses the ...
> It appears this is already covered:

Sorry for any confusion; I was interpreting your proposed parameter name, 
attach_if_exists, in the following way:
* attach_if_exists=True:   If exists, attach to it otherwise create one
* attach_if_exists=False:  Create a new one but do not attach to an existing 
with the same name
I did not see a way to indicate a desire to *only* attach without creation.  I 
need a way to test to see if a shared memory block already exists or not 
without risk of creating one.  At least this is how I was interpreting "attach 
if exists".


> Don't you also want to "create if it doesn't exist, else attach" as a single, 
> atomic operation?

Yes, I do!  This was part of my description for the parameter named "create" in 
msg335660:
    When set to True, a new shared memory block will be created unless
    one already exists with the supplied unique name, in which case that block
    will be attached to and used.


> I'm not sure if there are or should be sync primitives to "wait for another 
> memory to join me" etc.

In the case of shared memory, I do not think so.  I think such signaling 
between processes, when needed, can be accomplished by our existing signaling 
mechanisms (like, via the Proxy Objects for Event or Semaphore).

--

___
Python tracker <https://bugs.python.org/issue35813>



[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-16 Thread Davin Potts


Davin Potts  added the comment:

@giampaolo:

> 1) As for SharedMemoryManager, I believe it should live in 
> multiprocessing.managers, not shared_memory.py.

I am happy to see it live in multiprocessing.managers so long as we can provide 
a clean way of handling what happens on a platform where we can not support 
shared memory blocks.  (We have implementations for PosixSharedMemory and 
NamedSharedMemory which together cover Windows, Linux, MacOS, the *BSDs, and 
possibly others but that does not cover everything.)

@Neil has already raised this question of what do we want the behavior to be on 
these unsupported platforms on import?  If everything dependent upon shared 
memory blocks remains inside shared_memory.py, then we could raise a 
ModuleNotFoundError or ImportError or similar when attempting to `import 
shared_memory`.  If we move SharedMemoryManager to live in 
multiprocessing.managers, we need to decide how to handle (and communicate to 
the user appropriately) its potential absence.  So far, I am unable to find a 
good example of another module where they have chosen to split up such code 
rather than keeping it all bottled up inside a single module, but perhaps I 
have missed something?


> 2) Same for SharedMemoryServer (which is a subclass of 
> multiprocessing.managers.Server).

Same thing as above.  If we decide how to handle the unsupported platforms on 
import, we can re-organize appropriately.


> 3) ShareableList name is kinda inconsistent with other classes (they all have 
> a "Shared" prefix). I'd call it SharedList instead.

Oooh, interesting.  I am happy to see a name change here.  To share how I came 
up with its current name:  I had thought to deliberately break the naming 
pattern here to make it stand out.  The others, SharedMemory, 
SharedMemoryManager, and SharedMemoryServer, are all focused on the shared 
memory block itself which is something of a more primitive concept (like 
accessing SharedMemory.buf as a memoryview) compared to working with something 
like a list (a less primitive, more widely familiar concept).  Likewise, I 
thought a dict backed by shared memory might be called a ShareableDict and 
other things like a NumPy array backed by shared memory might be called a 
ShareableNDArray or similar.  I was hoping to find a different pattern for the 
names of these objects-backed-by-shared-memory-blocks, but I am uncertain I 
found the best name.


> 4) I have some reservations about SharedMemory's "flags" and "mode" args.

It sounds like you are agreeing with what I advocated in msg335660 (up above).  
Great!

--

___
Python tracker <https://bugs.python.org/issue35813>



[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-16 Thread Davin Potts


Davin Potts  added the comment:

@giampaolo:

> Maybe something like this instead?
>  SharedMemory(name=None, attach_if_exists=False, size=0)

I think this misses the use case when wanting to ensure we only attach to an 
existing shared memory block and if it does not exist, we should raise an 
exception because we can not continue.  (If the shared memory block should 
already be there but it is not, this means something bad happened earlier and 
we might not know how to recover here.)

I believe the two dominant use cases to address are:
1) I want to create a shared memory block (either with or without a 
pre-conceived name).
2) I want to attach to an existing shared memory block by its unique name.

--

___
Python tracker <https://bugs.python.org/issue35813>



[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-16 Thread Davin Potts


Davin Potts  added the comment:

@giampaolo:

> 1) it seems SharedMemory.close() does not destroy the memory region
> (I'm able to re-attach to it via name). If I'm not mistaken only
> the manager can do that.

Correct, close() does not and should not destroy the memory region because 
other processes may still be using it.  Only a call to unlink() triggers the 
destruction of the memory region and so unlink() should only be called once 
across all the processes with access to that shared memory block.

The unlink() method is available on the SharedMemory class.  No manager is 
required.  This is also captured in the docs.
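
The intended division of labor, as a sketch:

    from multiprocessing import shared_memory

    shm = shared_memory.SharedMemory(create=True, size=64)
    # ... other processes attach via SharedMemory(name=shm.name) ...

    shm.close()    # each process calls close() when *it* is done with the block
    shm.unlink()   # exactly one process calls unlink() to destroy the block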


> 2) I suggest to turn SharedMemory.buf in a read-onl property

Good idea!  I will make this change today, updating GH-11816.


> 3) it seems "size" kwarg cannot be zero (current default)

From the docs:

    When attaching to an existing shared memory block, set to 0 (which is the 
    default).

This permits attaching to an existing shared memory block by name without 
needing to also already know its size.


> 4) I wonder if we should have multiprocessing.active_memory_children() or 
> something

I also think this would be helpful but...

> I'm not sure if active_memory_children() can return meaningful results with a 
> brand new process (I suppose it can't).

You are right.  As an aside, I think it interesting that in the implementation 
of "System V Shared Memory", its specification called for something like a 
system-wide registry where all still-allocated shared memory blocks were 
listed.  Despite the substantial influence System V Shared Memory had on the 
more modern implementations of "POSIX Shared Memory" and Windows' "Named Shared 
Memory", neither chose to make it part of their specification.

By encouraging the use of SharedMemoryManager to track and ensure cleanup, we 
are providing a reliable and cross-platform supportable best practice.  If 
something more low-level is needed by a user, they can choose to manage cleanup 
themselves.  This seems to parallel how we might encourage, "when opening a 
file, always use a with statement", yet users can still choose to call open() 
and later close() when they wish.

--

___
Python tracker <https://bugs.python.org/issue35813>



[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-15 Thread Davin Potts


Davin Potts  added the comment:

Regarding the API of the SharedMemory class, its flags, mode, and read_only 
parameters are not universally enforced or simply not implemented on all 
platforms that offer POSIX Shared Memory or Windows Named Shared Memory.  A 
simplified API for the SharedMemory class that behaves consistently across all 
platforms would avoid confusion for users.  For users who have specific need of 
flags/mode/read_only controls on a platform that they know does indeed respect 
that control, they should still have a mechanism to leverage those controls.

I propose a simpler, consistent-across-platforms API like:
SharedMemory(name=None, create=False, size=0)

*name* and *size* retain their purpose though the former now defaults to 
None.
*create* is set to False to indicate no new shared memory block is to be
created because we only wish to attach to an already existing shared memory
block.  When set to True, a new shared memory block will be created unless
one already exists with the supplied unique name, in which case that block
will be attached to and used.

Example of attaching to an already existing shared memory block:
SharedMemory(name='uniquename')
Example of creating a new shared memory block where any new name will do:
SharedMemory(create=True, size=128)
Example of creating/attaching a shared memory block with a specific name:
SharedMemory(name='specialsnowflake', create=True, size=4096)


Even with its simplified API, SharedMemory will continue to be powered by 
PosixSharedMemory on systems where "POSIX Shared Memory" is implemented and 
powered by NamedSharedMemory on Windows systems.  The API for PosixSharedMemory 
will remain essentially unchanged from its current form:
PosixSharedMemory(name=None, flags=None, mode=0o600, size=0, 
read_only=False)
The API for NamedSharedMemory will be updated to no longer attempt to mirror 
its POSIX counterpart:
NamedSharedMemory(name=None, create=False, size=0, read_only=False)

To be clear:  the inconsistencies motivating this proposed API change do *not* 
arise only from differences between Windows and POSIX-supporting systems.  
For example, among systems implementing POSIX shared memory, the mode flag 
(which promises control over whether user/group/others can read/write to a 
shared memory block) is often but not always ignored; it differs from one OS to 
the next.


Alternatives/variations to this proposed API change:

* Leave the current APIs alone where all 3 classes have identical APIs.  
Feedback in discussions and from those experimenting with the code suggests 
this is creating confusion.

* Change all 3 classes to have the matching APIs again.  This unnecessarily 
thwarts the ability of users to exploit functionality that they know to be 
there on specific target platforms that they care about.

* Do not expose flags/mode/read_only as part of the input parameters to 
PosixSharedMemory/NamedSharedMemory but do expose them as class attributes 
instead.  This arguably makes things unnecessarily complicated.  This is not a 
simple topic but its complexity can be treated in a more straightforward way.

* Use a parameter name other than 'create' (e.g. 'attach_only') in the newly 
proposed API.

* Make all input parameters keyword-only for greater flexibility in the API in 
the future.

--




[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-15 Thread Davin Potts


Davin Potts  added the comment:

These questions (originally asked in comments on GH-11816) seemed more 
appropriate to discuss here:
Why should the user want to use `SharedMemory` directly?
Why not just go through the manager?  Also, perhaps a
naive question: don't you _always_ need a `start()`ed
manager in order for the processes to communicate?
Doesn't `SharedMemoryServer` have to be involved?

I think it helps to discuss the last question first.  A SharedMemoryManager is 
*not* needed for two processes to share information across a shared memory 
block, nor is a SharedMemoryServer required.  The docs have examples 
demonstrating this but here is another meant to showcase exactly this:

Start up a Python shell and do the following:
>>> from multiprocessing import shared_memory
>>> shm = shared_memory.SharedMemory(name=None, size=10)
>>> shm.buf[:5] = b'Feb15'
>>> shm.name  # Note this name and use it in the next steps
'psm_26792_26631'

Start up a second Python shell in a new window and do the following:
>>> from multiprocessing import shared_memory
>>> also_shm = shared_memory.SharedMemory(name='psm_26792_26631')  # use that same name
>>> bytes(also_shm.buf[:5])
b'Feb15'

If also_shm.buf is further modified in the second shell, those
changes will be visible on shm.buf in the first shell.  The same
is true of the reverse.

The key point is that there is no sending of messages between the processes at 
all.  In stark contrast, SyncManager offers and supports objects held in 
"distributed shared memory" where messages must be sent from one process to 
another to access or manipulate data; those objects held in "distributed shared 
memory" *must* have a SyncManager+Server to enable their use.  That is not 
needed at all for SharedMemory because access to and manipulation of the data 
is performed directly without the cost-delay of messaging.

This raises a new question, "so what is the SharedMemoryManager used for then?"  
The docs answer:
To assist with the life-cycle management of shared memory
especially across distinct processes, a BaseManager subclass,
SharedMemoryManager, is also provided.
Because shared memory blocks are not "owned" by a single process, they are not 
destroyed/freed when a process exits.  A SharedMemoryManager is used to ensure 
the free-ing of a shared memory block when it is no longer needed.
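
As a rough sketch of that life-cycle management (assuming SharedMemoryManager
is importable from multiprocessing.managers, as in the final layout):

    from multiprocessing.managers import SharedMemoryManager

    with SharedMemoryManager() as smm:
        shm = smm.SharedMemory(size=128)
        # ... hand shm.name to other processes and do work ...
    # on exit, the manager frees every block it created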

New SharedMemory instances may be created via a SharedMemoryManager (in which 
case their birth-to-death life-cycle is being managed) or they may be created 
directly as seen in the above example.  Returning to the first question, "Why 
should the user want to use `SharedMemory` directly?", there are more use cases 
than these two:
1. In process 1, a shared memory block is created by calling 
SharedMemoryManager.SharedMemory().  In process 2, we need to attach to that 
existing shared memory block and can do so by referring to its name.  This is 
accomplished as in the above example by simply calling 
SharedMemory(name='uniquename').  We do not want to attach to it via a second 
SharedMemoryManager because only one manager should oversee the life-cycle of a 
single shared memory block.
2. Sometimes direct management of the life-cycle of a shared memory block is 
desirable.  For example, on systems supporting POSIX shared memory, it is a 
feature that shared memory blocks outlive processes.  Some services choose to 
speed a service restart by preserving state data in shared memory, saving the 
newly restarted service from rebuilding it.  The SharedMemoryManager provides 
one life-cycle strategy but can not cover all scenarios so the option to 
directly manage it is important.

--




[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-12 Thread Davin Potts


Davin Potts  added the comment:

@Antoine:  SharedMemoryManager does not subclass SyncManager but it did 
previously.  This is the source of the confusion.  SharedMemoryManager 
subclasses BaseManager which does not provide Value, Array, list, dict, etc.

Agreed that the manager facility does not appear to see that much use in 
existing code.

When working with shared memory, I expect SharedMemoryManager to be much more 
popular.

--




[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-11 Thread Davin Potts


Davin Potts  added the comment:

@giampaolo.rodola: It definitely helps.


Conceptually, SyncManager provides "distributed shared memory" where lists, 
dicts, etc. are held in memory by one process but may be accessed remotely from 
another via a Proxy Object.  Mutating a dict from one process requires sending 
a message to some other process to request the change be made.

In contrast, SharedMemoryManager provides non-distributed shared memory where a 
special region of memory is held by the OS kernel (not a process) and made 
directly addressable to many processes simultaneously.  Modifying any data in 
this special region of memory requires zero process-to-process communication; 
any of the processes may modify the data directly.

In a speed contest, the SharedMemoryManager wins in every use case -- and it is 
not a close race.  There are other advantages and disadvantages to each, but 
speed is the key differentiator.


Thinking ahead to the future of SharedMemoryManager, there is the potential for 
a POSIX shared memory based semaphore.  The performance of this semaphore 
across processes should drastically outperform SyncManager's semaphore.  It 
might be something we will want to support in the future.  SharedMemoryManager 
needs a synchronization mechanism now (in support of common use cases) to 
coordinate across processes, which is why I initially thought 
SharedMemoryManager should expose the Lock, Semaphore, Event, Barrier, etc. 
powered by distributed shared memory.  I am no longer sure this is the right 
choice for three reasons:
(1) it unnecessarily complicates and confuses the separation of what is powered 
by fast SystemV-style shared memory and what is powered by slow distributed 
shared memory,
(2) it would be a very simple example in the docs to show how to add our 
existing Lock or Semaphore to SharedMemoryManager via register(),
(3) if we one day implement POSIX shared memory semaphores (and equivalent 
where POSIX is not supported), we will have the burden of an existing 
lock/semaphore creation methods and apis with behavioral differences.


I propose that it would be clearer but no less usable if we drop these 
registered object types (created via calls to register()) from 
SharedMemoryManager.  It is one line of code for a user to add "Lock" to 
SharedMemoryManager, which I think we can demonstrate well with a simple 
example.

--




[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-11 Thread Davin Potts


Davin Potts  added the comment:

@terry.reedy and @ronaldoussoren: I have asked Van again to provide comments 
here clarifying the topics of (1) copyright notices and (2) requiring the 
BSD-licensed-work's author to sign a contributor agreement.

Specifically regarding the appearance of __copyright__, I added my agreement to 
your comments on GH-11816 on this.

--




[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-10 Thread Davin Potts


Davin Potts  added the comment:

@giampaolo.rodola: Your patch from 3 days ago in issue35917 included additional 
tests around the SharedMemoryManager which are now causing test failures in my 
new PR.  This is my fault because I altered SharedMemoryManager to no longer 
support functionality from SyncManager that I thought could be confusing to 
include.  I am just now discovering this and am not immediately sure if simply 
removing the SharedMemoryManager-relevant lines from your patch is the right 
solution but I wanted to mention this thought right away.

Thank you for discovering that SyncManager was being overlooked in the tests 
and the nice patch in issue35917.

--




[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-10 Thread Davin Potts


Davin Potts  added the comment:

Docs and tests are now available in a new PR.  I have stayed focused on getting 
these docs and tests to everyone without delay but that means I have not yet 
had an opportunity to respond to the helpful comments, thoughtful questions, 
and threads that have popped up in the last few days.  I will follow up with 
all comments as quickly as possible starting in the morning.

There are two topics in particular that I hope will trigger a wider discussion: 
 the api around the SharedMemory class and the inclusion-worthiness of the 
shareable_wrap function.

Regarding the api of SharedMemory, the docs explain that not all of the current 
input parameters are supportable/enforceable across platforms.  I believe we 
want an api that is relevant across all platforms but at the same time we do 
not want to unnecessarily suppress/hide functionality that would be useful on 
some platforms -- there needs to be a balance between these motivations but 
where do we strike that balance?

Regarding the inclusion-worthiness of the shareable_wrap function, I 
deliberately did not include it in the docs but its docstring in the code 
explains its purpose.  If included, it would drastically simplify working with 
NumPy arrays; please see the code example in the docs demonstrating the use of 
NumPy arrays without the aid of the shareable_wrap function.  I have received 
feedback from others using this function also worth discussing.
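
For reference, the docs example works roughly along these lines without the
aid of shareable_wrap (a sketch only; assumes NumPy is installed):

    import numpy as np
    from multiprocessing import shared_memory

    shm = shared_memory.SharedMemory(create=True, size=16 * 8)
    arr = np.ndarray((16,), dtype=np.int64, buffer=shm.buf)  # zero-copy view
    arr[:] = 42   # writes land directly in the shared memory block
    # (close() and unlink() when done, as discussed elsewhere in this thread)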


Thank you to everyone who has already looked at the code and shared helpful 
thoughts -- please have a look at the tests and docs.

--




[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-10 Thread Davin Potts


Change by Davin Potts :


--
pull_requests: +11834, 11835, 11836



[issue35903] Build of posixshmem.c should probe for required OS functions

2019-02-05 Thread Davin Potts


Davin Potts  added the comment:

Agreed that the logic for building that code needs exactly this sort of change. 
 Thanks for the patch!

It looks like your patch does not happily detect the dependencies on MacOS for 
some reason, but all appears well on Windows & Linux.  I will have a closer 
look in the morning on a MacOS system.

--




[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-04 Thread Davin Potts


Davin Potts  added the comment:

@lukasz.langa: Missing tests and documentation will be in by alpha2.

--




[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-03 Thread Davin Potts


Davin Potts  added the comment:

This work is the result of ~1.5 years of development effort, much of it 
accomplished at the last two core dev sprints.  The code behind it has been 
stable since September 2018 and tested as an independently installable package 
by multiple people.

I was encouraged by Lukasz, Yury, and others to check in this code early, not 
waiting for tests and docs, in order to both solicit more feedback and provide 
for broader testing.  I understand that doing such a thing is not at all a 
novelty.  Thankfully it is doing exactly that -- I hope the feedback remains 
constructive and supportive.

There are some tests to be found in a branch (enh-tests-shmem) of 
github.com/applio/cpython which I think should become more comprehensive before 
inclusion.  Temporarily deferring and not including them as part of the first 
alpha should reduce the complexity of that release.

Regarding the BSD license on the C code being adopted, my conversations with 
Brett and subsequently Van have not raised concerns, far from it -- there is a 
process which is being followed to the letter.  If there are other reasons to 
object to the thoughtful adoption of code licensed like this one, that deserves 
a decoupled and larger discussion first.

--
nosy: +brett.cannon




[issue35813] shared memory construct to avoid need for serialization between processes

2019-02-01 Thread Davin Potts


Davin Potts  added the comment:


New changeset e5ef45b8f519a9be9965590e1a0a587ff584c180 by Davin Potts in branch 
'master':
bpo-35813: Added shared_memory submodule of multiprocessing. (#11664)
https://github.com/python/cpython/commit/e5ef45b8f519a9be9965590e1a0a587ff584c180


--




[issue35813] shared memory construct to avoid need for serialization between processes

2019-01-23 Thread Davin Potts


Change by Davin Potts :


--
keywords: +patch
pull_requests: +11470, 11471, 11472
stage:  -> patch review



[issue35813] shared memory construct to avoid need for serialization between processes

2019-01-23 Thread Davin Potts


New submission from Davin Potts :

A facility for using shared memory would permit direct, zero-copy access to 
data across distinct processes (especially when created via multiprocessing) 
without the need for serialization, thus eliminating the primary performance 
bottleneck in the most common use cases for multiprocessing.

Currently, multiprocessing communicates data from one process to another by 
first serializing it (by default via pickle) on the sender's end then 
de-serializing it on the receiver's end.  Because distinct processes possess 
their own process memory space, no data in memory is common across processes 
and thus any information to be shared must be communicated over a 
socket/pipe/other mechanism.  Serialization via tools like pickle is convenient 
especially when supporting processes on physically distinct hardware with 
potentially different architectures (which multiprocessing does also support).  
Such serialization is wasteful and potentially unnecessary when multiple 
multiprocessing.Process instances are running on the same machine.  The cost of 
this serialization is believed to be a non-trivial drag on performance when 
using multiprocessing on multi-core and/or SMP machines.

While not a new concept (System V Shared Memory has been around for quite some 
time), the proliferation of support for shared memory segments on modern 
operating systems (Windows, Linux, *BSDs, and more) provides a means for 
exposing a consistent interface and api to a shared memory construct usable 
across platforms despite technical differences in the underlying implementation 
details of POSIX shared memory versus Native Shared Memory (Windows).

For further reading/reference:  Tools such as the posix_ipc module have 
provided fairly mature apis around POSIX shared memory and seen use in other 
projects.  The "shared-array", "shared_ndarray", and "sharedmem-numpy" packages 
all have interesting implementations for exposing NumPy arrays via shared 
memory segments.  PostgreSQL has a consistent internal API for offering shared 
memory across Windows/Unix platforms based on System V, enabling use on 
NetBSD/OpenBSD before those platforms supported POSIX shared memory.

At least initially, objects which support the buffer protocol can be most 
readily shared across processes via shared memory.  From a design standpoint, 
the use of a Manager instance is likely recommended to enforce access rules in 
different processes via proxy objects as well as cleanup of shared memory 
segments once an object is no longer referenced.  The documentation around 
multiprocessing's existing sharedctypes submodule (which uses a single memory 
segment through the heap submodule with its own memory management 
implementation to "malloc" space for allowed ctypes and then "free" that space 
when no longer used, recycling it for use again from the shared memory segment) 
will need to be updated to avoid confusion over concepts.

Ultimately, the primary motivation is to provide a path for better parallel 
execution performance by eliminating the need to transmit data between distinct 
processes on a single system (not for use in distributed memory architectures). 
 Secondary use cases have been suggested including a means for sharing data 
across concurrent Python interactive shells, potential use with 
subinterpreters, and other traditional uses for shared memory since the first 
introduction of System V Shared Memory onwards.

--
assignee: davin
components: Library (Lib)
messages: 334278
nosy: davin, eric.snow, lukasz.langa, ned.deily, rhettinger, yselivanov
priority: normal
severity: normal
status: open
title: shared memory construct to avoid need for serialization between processes
type: enhancement
versions: Python 3.8




[issue33725] Python crashes on macOS after fork with no exec

2018-12-09 Thread Davin Potts


Davin Potts  added the comment:

@ned.deily: Apologies, I misread what you wrote -- I would like to see the 
random segfaults that you were seeing on Mojave if you can still point me to a 
few.

--




[issue33725] Python crashes on macOS after fork with no exec

2018-12-09 Thread Davin Potts


Davin Potts  added the comment:

Do we really need to disable the running of test_multiprocessing_fork entirely 
on MacOS?

My understanding so far is that not *all* of the system libraries on the mac 
are spinning up threads and so we should expect that there are situations where 
fork alone may be permissible, but of course we don't yet know what those are.  
Pragmatically speaking, I have not yet seen a report of 
test_multiprocessing_fork tests triggering this problem but I would like to 
see/hear that when it is observed (that's my pitch for leaving the tests 
enabled).

--




[issue35219] macOS 10.14 Mojave crashes in multiprocessing

2018-11-14 Thread Davin Potts


Davin Potts  added the comment:

Resolution is marked dupe but status is still open.  Are we closing this one or 
is there a more specific remedy for this situation (as opposed to what 
issue33725 discusses) that would be helpful to document?

--
nosy: +davin




[issue33725] Python crashes on macOS after fork with no exec

2018-11-14 Thread Davin Potts


Davin Potts  added the comment:

Given the original post mentioned 2.7.15, I wonder if it is feasible to fork 
near the beginning of execution, then maintain and pass around a 
multiprocessing.Pool to be used when needed instead of dynamically forking?  
Working with legacy code is almost always more interesting than you want it to 
be.

--




[issue33725] Python crashes on macOS after fork with no exec

2018-11-14 Thread Davin Potts


Davin Potts  added the comment:

Barry's effort as well as comments in other links seem to all suggest that 
OBJC_DISABLE_INITIALIZE_FORK_SAFETY is not comprehensive in its ability to make 
other threads "safe" before forking.

"Objective-C classes defined by the OS frameworks remain fork-unsafe" (from 
@kapilt's first link) suggests we furthermore remain at risk using certain 
MacOS system libraries prior to any call to fork.

"To guarantee that forking is safe, the application must not be running any 
threads at the point of fork" (from @kapilt's second link) is an old truth that 
we continue to fight with even when we know very well that it's the truth.

For newly developed code, we have the alternative to employ spawn instead of 
fork to avoid these problems in Python, C, Ruby, etc.  For existing legacy code 
that employed fork and now surprises us by failing-fast on MacOS 10.13 and 
10.14, it seems we are forced to face a technical debt incurred back when the 
choice was first made to spin up threads and afterwards to use fork.
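
For new code, choosing spawn is a one-liner (inside the usual __main__ guard):

    import multiprocessing as mp

    if __name__ == '__main__':
        mp.set_start_method('spawn')   # sidestep fork-after-threads entirely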

If we didn't already have an "obvious" (zen of Python) way to avoid such 
problems with spawn versus fork, I would feel this was something to solve in 
Python.  As to helping the poor unfortunate souls who must fight the good fight 
with legacy code, I am not sure what to do to help though I would like to be 
able to help.

--




[issue35242] multiprocessing.Queue in an inconsistent state and a traceback silently suppressed if put an unpickable object and process's target function is finished

2018-11-14 Thread Davin Potts


Change by Davin Potts :


--
nosy: +davin




[issue33196] multiprocessing: serialization must ensure that contexts are compatible (the same)

2018-11-14 Thread Davin Potts


Change by Davin Potts :


--
nosy: +davin




[issue31308] forkserver process isn't re-launched if it died

2017-09-10 Thread Davin Potts

Davin Potts added the comment:

I have two concerns with this:
1) The implicit restart of the forkserver process seems in conflict with the 
zen of making things explicit.
2) This would seem to make forkserver's behavior inconsistent with the behavior 
of things like the Manager which similarly creates its own process for managing 
resources but does not automatically restart that process if it should die or 
become unreachable.  In the case of the Manager, I don't think we'd want it to 
automagically restart anything in these situations so it's not a simple matter 
of enhancing the Manager to adopt similar behavior.

I do appreciate the use cases that would be addressed by having a convenient 
way to detect that a forkserver has died and then restart it.  If the 
forkserver dies, I doubt we really want it to try to restart a potentially 
infinite number of times.

Maybe a better path would be if we had a way to explicitly request that the 
Process trigger a restart of the forkserver, if necessary, but this 
setting/request defaults to False?

--




[issue20854] multiprocessing.managers.Server: problem with returning proxy of registered object

2017-09-07 Thread Davin Potts

Davin Potts added the comment:

It appears that the multiple workarounds proposed by the OP (@allista) address 
the original request and that there is no bug or unintended behavior arising 
from multiprocessing itself.  Combined with the lack of activity in this 
discussion, I'm inclined to believe that the workarounds have satisfied the OP 
and this issue should be closed.

--
nosy: +davin
status: open -> pending
type:  -> behavior




[issue30339] test_multiprocessing_main_handling: "RuntimeError: Timed out waiting for results" on x86 Windows7 3.x

2017-05-22 Thread Davin Potts

Davin Potts added the comment:

Patch on issue30317 also addresses this issue in a more flexible way.

--
dependencies: +test_timeout() of 
test_multiprocessing_spawn.WithManagerTestBarrier fails randomly on x86 
Windows7 3.x buildbot
nosy: +davin




[issue30317] test_timeout() of test_multiprocessing_spawn.WithManagerTestBarrier fails randomly on x86 Windows7 3.x buildbot

2017-05-22 Thread Davin Potts

Davin Potts added the comment:

To better accommodate very slow buildbots, a parameter is added in PR-1722 to 
scale up the timeout durations where they are necessary on a per-machine basis.

Relevant tests have a timeout set to some default number of seconds times a 
multiplier value.

The multiplier value can be controlled by the environment variable 
'CONF_TIMEOUT_MULTIPLIER' which defaults to a multiplier of 1.0 if not set.  On 
buildbots, this environment variable can be set by defining a parameter by that 
name in the buildbot configuration file for a machine.  Otherwise, this 
environment variable can be set in the usual way before running tests on 
non-buildbot machines.
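
Inside a test, the scaled timeout is computed along these lines (a sketch of
the approach in PR-1722, not its exact code):

    import os

    multiplier = float(os.environ.get('CONF_TIMEOUT_MULTIPLIER', 1.0))
    TIMEOUT = 60.0 * multiplier   # default seconds times the per-machine factor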

--
nosy: +davin, zach.ware
stage:  -> patch review




[issue30317] test_timeout() of test_multiprocessing_spawn.WithManagerTestBarrier fails randomly on x86 Windows7 3.x buildbot

2017-05-22 Thread Davin Potts

Changes by Davin Potts <pyt...@discontinuity.net>:


--
pull_requests: +1810




[issue28053] parameterize what serialization is used in multiprocessing

2017-05-18 Thread Davin Potts

Davin Potts added the comment:

Docs need updating still.

--
versions: +Python 3.7




[issue26434] multiprocessing cannot spawn grandchild from a Windows service

2017-05-18 Thread Davin Potts

Davin Potts added the comment:

Patch committed in 2.7 branch.

Thanks for your help, Marc.

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue26434] multiprocessing cannot spawn grandchild from a Windows service

2017-05-18 Thread Davin Potts

Davin Potts added the comment:


New changeset c47c315812b1fa9acb16510a7aa3b37d113def48 by Davin Potts (Marc 
Schlaich) in branch '2.7':
bpo-26434: Fix multiprocessing grandchilds in a Windows service (GH-1167)
https://github.com/python/cpython/commit/c47c315812b1fa9acb16510a7aa3b37d113def48


--




[issue30379] multiprocessing Array create for ctypes.c_char, TypeError unless 1 char string arg used

2017-05-16 Thread Davin Potts

Davin Potts added the comment:

Perhaps I should've used ctypes.c_uint8 in that example/question instead.

--




[issue30379] multiprocessing Array create for ctypes.c_char, TypeError unless 1 char string arg used

2017-05-16 Thread Davin Potts

Davin Potts added the comment:

Maybe I missed your point but why would you not want to do this instead?

>>> mp.Array(ctypes.c_int8, arr)

--
nosy: +davin




[issue30018] multiprocessing.Pool garbles call stack for __new__

2017-04-07 Thread Davin Potts

Davin Potts added the comment:

> I am unfortunately not at liberty to share the code I'm working on.

I very much understand and am very thankful you took the time to create a 
simple example that you could share.  Honestly, that's the reason I felt 
inspired to stop what I was doing to look at this now rather than later.


> I suppose I should just work around it by checking right away if the input to 
> my constructor has already been constructed!

There are probably a number of different ways to address it but your suggestion 
of adding a check to see if this is the first time that object has been 
constructed sounds like it might be an easy win.

--




[issue30018] multiprocessing.Pool garbles call stack for __new__

2017-04-07 Thread Davin Potts

Davin Potts added the comment:

Expanding my above example to show how multiprocessing relates:
>>> import multiprocessing
>>> import os
>>> class Floof(object):
... def __new__(cls):
... print("New via pid=%d" % os.getpid())
... return object.__new__(cls)
... 
>>> os.getpid()   # parent pid
46560
>>> pool = multiprocessing.Pool(1)
>>> getter = pool.apply_async(Floof, (), {})  # output seen from child AND parent
New via pid=46583
New via pid=46560

>>> getter.get()  # everything seems to be working as intended
<__main__.Floof object at 0x10866f250>



--




[issue30018] multiprocessing.Pool garbles call stack for __new__

2017-04-07 Thread Davin Potts

Davin Potts added the comment:

It looks like the first 'Called Foo.__new__' is being reported by the child 
(pool of 1) process and the second 'Called Foo.__new__' is being reported by 
the parent process.  In multiprocessing, because objects are by default 
serialized using pickle, this may be caused by the unpickling of the Foo object 
by the parent process which is something you would not experience when using 
ThreadPool because it does not have the same need for serialization.

Example showing invocation of __new__ as part of unpickling:
>>> class Foo(object):
... def __new__(cls):
... print("New")
... return object.__new__(cls)
... 
>>> import pickle
>>> f = Foo()
New
>>> pf = pickle.dumps(f, protocol=2)
>>> pickle.loads(pf)  # unpickling triggers __new__
New
<__main__.Foo object at 0x1084a06d0>



Having discovered this phenomenon, is this causing a problem for you somewhere 
in code?  (Your example code on github was helpful, thank you, but it merely 
demonstrated the behavior and didn't show where this was causing you pain.)

--
nosy: +davin




[issue29828] Allow registering after-fork initializers in multiprocessing

2017-03-16 Thread Davin Potts

Davin Potts added the comment:

Having a read through issue16500 and issue6721, I worry that this could again 
become bogged down with similar concerns.

With the specific example of NumPy, I am not sure I would want its random 
number generator to be reseeded with each forked process.  There are many 
situations where I very much need to preserve the original seed and/or current 
PRNG state.

I do not yet see a clear, motivating use case even after reading those two 
older issues.  I worry that if it were added it would (almost?) never get used 
either because the need is rare or because developers will more often think of 
how this can be solved in their own target functions when they first start up.  
The suggestion of a top-level function and Context method make good sense to me 
as a place to offer such a thing but is there a clearer use case?
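
For the NumPy case specifically, developers who do want fresh per-process seeds
can already arrange it in their own code, e.g. via a pool initializer (a sketch
only; it assumes reseeding is actually desired):

    import multiprocessing as mp
    import numpy as np

    def reseed():
        np.random.seed()   # each worker draws a fresh seed from OS entropy

    if __name__ == '__main__':
        pool = mp.Pool(4, initializer=reseed)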

--




[issue29795] Clarify how to share multiprocessing primitives

2017-03-12 Thread Davin Potts

Changes by Davin Potts <pyt...@discontinuity.net>:


--
resolution:  -> works for me
stage: needs patch -> resolved
status: open -> closed




[issue17560] problem using multiprocessing with really big objects?

2017-03-12 Thread Davin Potts

Davin Potts added the comment:

@artxyz: The current release of 2.7 is 2.7.13 -- if you are still using 2.7.5 
you might consider updating to the latest release.

As pointed out in the text of the issue, the multiprocessing pickler has been 
made pluggable in 3.3 and it's been made more conveniently so in 3.6.  The 
issue reported here arises from the constraints of working with large objects 
and pickle, hence the enhanced ability to take control of the multiprocessing 
pickler in 3.x applies.

I'll assign this issue to myself as a reminder to create a blog post around 
this example and potentially include it as a motivating need for controlling 
the multiprocessing pickler in the documentation.
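
As a taste of what taking control looks like in 3.x (a sketch; BigArray and
its reducer are hypothetical names):

    from multiprocessing.reduction import ForkingPickler

    class BigArray:
        def __init__(self, path):
            self.path = path   # the large payload stays on disk

    def reduce_bigarray(obj):
        return (BigArray, (obj.path,))   # pickle only the path, not the data

    ForkingPickler.register(BigArray, reduce_bigarray)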

--
assignee:  -> davin
nosy: +davin




[issue29795] Clarify how to share multiprocessing primitives

2017-03-11 Thread Davin Potts

Davin Potts added the comment:

On Windows, because that OS does not support fork, multiprocessing uses spawn 
to create new processes by default.  Note that in Python 3, multiprocessing 
provides the user with a choice of how to create new processes (i.e. fork, 
spawn, forkserver).

When fork is used, the 'q = Queue()' in this example would be executed once by 
the parent process before the fork takes place, the resulting child process 
continues execution from the same point as the parent when it triggered the 
fork, and thus both parent and child processes would see the same 
multiprocessing.Queue.  When spawn is used, a new process is spawned and the 
whole of this example script would be executed again from scratch by the child 
process, resulting in the child (spawned) process creating a new Queue object 
of its own with no sense of connection to the parent.
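
The portable pattern is to create the Queue under the __main__ guard and pass
it to the child explicitly (a minimal sketch):

    from multiprocessing import Process, Queue

    def worker(q):
        q.put('hello')

    if __name__ == '__main__':      # not re-executed by a spawned child
        q = Queue()
        p = Process(target=worker, args=(q,))
        p.start()
        print(q.get())
        p.join()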


Would you be up for proposing replacement text to improve the documentation?  
Getting the documentation just right so that everyone understands it is worth 
spending time on.

--
nosy: +davin
stage:  -> needs patch
type: behavior -> enhancement




[issue29701] Add close method to queue.Queue

2017-03-04 Thread Davin Potts

Davin Potts added the comment:

The example of AMQP is perhaps a stronger argument for why 
multiprocessing.Queue.close should (or does) exist, not as much a reason for 
queue.Queue.

The strongest point, I think, is the argument that existing patterns are 
lacking.

In the multiprocessing module, the pattern of placing None into a queue.Queue 
to communicate between threads is also used but with a slightly different use 
case:  a queue may have multiple None's added to it so that the queue's 
contents may be fully consumed and at the end the consumers understand to not 
look for more work when they each get a None.  It might be restated as "do your 
work, then close".  If close were introduced to queue.Queue as proposed, it 
would not eliminate the need for this pattern.
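
That pattern, in miniature (one sentinel per consumer):

    import queue
    import threading

    def consumer(q):
        while True:
            item = q.get()
            if item is None:          # sentinel: no more work is coming
                break
            print('processing', item) # stand-in for real work

    q = queue.Queue()
    threads = [threading.Thread(target=consumer, args=(q,)) for _ in range(3)]
    for t in threads:
        t.start()
    for item in range(7):
        q.put(item)
    for _ in threads:                 # one None per consumer thread
        q.put(None)
    for t in threads:
        t.join()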

Thankfully inside multiprocessing the number of threads is known (for example, 
a thread to manage each process created by multiprocessing), making possible 
code such as:  `inqueue.queue.extend([None] * size)`.  In the more general 
case, the point that `size` is not always known is a valid one.  In this same 
vein, other parts of multiprocessing could potentially make use of 
queue.Queue.close but at least in multiprocessing's specific case I'm not sure 
I see a compelling simplification to warrant the change.  Though 
multiprocessing doesn't provide one, I think it would be helpful to see 
concrete use cases where there would be a clear benefit.

--




[issue29454] Shutting down consumer on a remote queue

2017-03-04 Thread Davin Potts

Davin Potts added the comment:

My understanding of other message queueing systems is that many are motivated 
by speed to the point that they will permit messages to be "lost" due to 
specific scenarios that would be overly costly to defend against.  Other 
message queueing systems adopt a philosophy that no message should ever be lost 
but as a compromise to speed do not promise that a message will be immediately 
recovered when caught in one of these problematic scenarios, only that it will 
eventually be recovered and processed fully.

It appears that the philosophy adopted or really the solution requirements lead 
to different best practices.

--




[issue29454] Shutting down consumer on a remote queue

2017-03-04 Thread Davin Potts

Changes by Davin Potts <pyt...@discontinuity.net>:


--
stage:  -> needs patch
type: behavior -> enhancement




[issue29454] Shutting down consumer on a remote queue

2017-03-04 Thread Davin Potts

Davin Potts added the comment:

My understanding is that example uses a queue.Queue() to demonstrate how to 
create a custom, remote service from scratch.  The implementation in this 
simple example lacks the sophistication of multiprocessing.Queue() for handling 
situations such as the one raised by the OP.  The example was not attempting to 
demonstrate a comprehensive replacement for multiprocessing.Queue(), rather it 
was attempting to demonstrate the mechanism for creating and consuming a 
callable service hosted by a remote manager.  The documentation currently does 
not introduce this example well nor describe the above motivation.

As to why this simplistic implementation of a distributed queue appears to lose 
an item when the client is killed, it works in the following way:
1.  Let's say a server is started to hold a queue.Queue() which is populated 
with 1 item.
2.  A client requests an item from the server.
3.  The server receives the request and performs a blocking q.get() (where q is 
the queue.Queue() object held by the server).
4.  When the q.get() releases and returns an item, q has had one item removed 
leaving a queue size of 0 in our scenario, and then that item is sent from the 
server to the client.
5.  A client requests another item from the server.
6.  The server receives the request and performs a blocking q.get() on the 
queue.  Because there's nothing left to grab from the queue, the server blocks 
and waits for something to magically appear in the queue.  We'll have a 
"producer" put something into the queue in a moment but for the time being the 
server is stuck waiting on the q.get() and likewise the client is waiting on a 
response from the server.
7.  That client is killed in an unexpected, horrible death because someone 
accidentally hits it with a Cntrl-C.
8.  A "producer" comes along and puts a new item into the server's queue.
9.  The server's blocking q.get() call releases, q has had one item removed 
leaving a queue size of 0 again, and then that item is sent from the server to 
the client only the client is dead and the transmission fails.
10. A "producer" comes along and puts another new item into the server's queue.
11. The someone who accidentally, horribly killed the client now frantically 
restarts the client; the client requests an item from the server and the server 
responds with a new item.  However, this is the item introduced in step 10 and 
not the item from step 8.  Hence the item from step 8 appears lost.

Note that in our simplistic example from the docs, there is no functionality to 
repopulate the queue object when communication of the item fails to complete.  
In general, a multiprocessing.manager has no idea what a manager will contain 
and has no insight on what to do when a connection to a client is severed.


Augmenting the example in the docs to cover situations like this would 
significantly complicate the example but there are many others to consider on 
the way to building a comprehensive solution -- instead a person should choose 
multiprocessing.Queue() unless they have something particular in mind.

I think the example should be better introduced (the intro is terse) to explain 
its purpose and warn that it does not offer a comprehensive replacement for 
multiprocessing.Queue().  It does not need to go into all of the above 
explanation.
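
For reference, the server side of that docs example is essentially this (a
sketch of the documented pattern; the address and authkey are arbitrary):

    import queue
    from multiprocessing.managers import BaseManager

    q = queue.Queue()

    class QueueManager(BaseManager):
        pass

    QueueManager.register('get_queue', callable=lambda: q)
    m = QueueManager(address=('', 50000), authkey=b'abracadabra')
    s = m.get_server()
    s.serve_forever()   # the blocking q.get() of step 3 runs inside this server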

--




[issue29575] doc 17.2.1: basic Pool example is too basic

2017-02-20 Thread Davin Potts

Davin Potts added the comment:

When passing judgement on what is "too basic", the initial example should be so 
basic as to be immediately digestible by as many people as possible.

Some background:
All too many examples mislead newcomers into believing that the number of 
processes should (a) match the number of processor cores, or (b) match the 
number of inputs to be processed.  This example currently attempts to dispel 
both notions.  In practice, and this depends upon what specific code is to be 
performed in parallel, it is not uncommon to find that slightly over-scheduling 
the number of processes versus the number of available cores can achieve 
superior throughput and performance.  In other cases, slightly under-scheduling 
may provide a win.  To help subtly encourage the newcomer, this example uses 5 
processes as opposed to something which might be mistaken for a common number 
of cores available on current multi-core processors.  Likewise, the number of 
distinct inputs to be processed deliberately does not match the number of 
processes nor a multiple of the number of processes.  This hopefully encourages 
the newcomer to not feel obligated to only accept inputs of a particular size 
or multiple.  Granted, optimizing for performance motivates tuning such 
things but this is the first example / first glance at what functionality is 
available.

Considering the suggested change:
* range(20) will likely produce more output than can be comfortably 
accommodated and easily read in the available browser window where most will 
see this
* the addition of execution time measurement is an interesting choice here 
given how computationally trivial the f(x) function is, which is perhaps what 
motivated the introduction of a time.sleep(1) inside that function; a 
ThreadPool would be more appropriate for a sleepy function such as this

Ultimately these changes complicate the example while potentially undermining 
its value.  An interesting improvement to this example might be to introduce a 
computationally taxing function which more clearly demonstrates the benefit of 
using a process Pool but still achieving the ideal of being immediately 
digestible and understood by the largest reading audience.  Some of the 
topics/variations in the proposed change might be better introduced and 
addressed later in the documentation rather than unnecessarily complicating the 
first example.
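
One possible shape for such an improved example (a sketch only; the constants
are arbitrary):

    from multiprocessing import Pool

    def f(x):
        return sum(i * i for i in range(x))   # deliberately CPU-bound

    if __name__ == '__main__':
        with Pool(5) as p:    # 5 processes, deliberately not tied to core count
            print(p.map(f, range(100000, 100011)))  # 11 inputs, not a multiple of 5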

--
resolution:  -> works for me
stage:  -> resolved
status: open -> closed
type:  -> enhancement




[issue19675] Pool dies with excessive workers, but does not cleanup

2017-02-13 Thread Davin Potts

Davin Potts added the comment:

For triggering the exception, supplying a Process target that deliberately 
fails sounds right.

As for tests for the various start methods (fork/forkserver/spawn), if you are 
looking at the 3.x branches you'll find this was been consolidated so that one 
test could conceivably be written to handle multiple variants (see 
Lib/test/_test_multiprocessing.py).

--




[issue19675] Pool dies with excessive workers, but does not cleanup

2017-02-13 Thread Davin Potts

Davin Potts added the comment:

@Winterflower: Thank you for encouraging @dsoprea to create the new PR and 
working to convert the previous patch.

@dsoprea: Thank you for taking the time to create the PR especially after this 
has been sitting unloved for so long.

Though the new workflow using PR's is still in a bit of a state of flux, my 
understanding is that we will want to have one PR per feature branch (i.e. one 
for each of 2.7, 3.6, 3.7) that we want to target.

Now that we seem to have spawned two parallel discussion tracks (one here and 
one in the PR https://github.com/python/cpython/pull/57), I'm not sure how best 
to resolve that but for the time being I'll offer code-related comments here as 
they're much more likely to be preserved (and thus discoverable) for posterity: 
 we do need some sort of tests around this to complete the patch -- something 
that would exercise both the non-exception and exception paths (and thus would 
detect that intended call to util.debug()).

--




[issue19675] Pool dies with excessive workers, but does not cleanup

2017-02-13 Thread Davin Potts

Changes by Davin Potts <pyt...@discontinuity.net>:


--
versions: +Python 2.7, Python 3.7




[issue9914] trace/profile conflict with the use of sys.modules[__name__]

2017-01-30 Thread Davin Potts

Davin Potts added the comment:

Though this issue is specifically concerned with runpy APIs and their impact 
especially in running unittest test scripts, it's worth commenting here for 
people who need a workaround in the short term:  code such as that shared in 
http://stackoverflow.com/q/41892297/1878788 can be made to run happily by 
creating a second script which imports the first and simply runs the test(s) 
from there.

In the specific case of the 'forkiter.py' from 
http://stackoverflow.com/q/41892297/1878788, one would create a 
'run_my_tests.py' with the contents:

from forkiter import main

if __name__ == "__main__":
exit(main())





Now this invocation of cProfile runs happily because pickle is able to see the 
module where all the needed classes/functions were defined:
python3.6 -m cProfile -o forkiter.prof ./run_my_tests.py

--
nosy: +davin




[issue29284] Include thread_name_prefix in the concurrent.futures.ThreadPoolExecutor example 17.4.2.1

2017-01-22 Thread Davin Potts

Changes by Davin Potts <pyt...@discontinuity.net>:


--
nosy: +davin




[issue29345] More lost updates with multiprocessing.Value and .Array

2017-01-22 Thread Davin Potts

Davin Potts added the comment:

I'm having difficulty watching your video attachment.  Would it be possible to 
instead describe, preferably with example code that others can similarly try to 
reproduce the behavior, what you're experiencing?

Please keep in mind what the documentation repeatedly advises about the need 
for capturing your process-creating multiprocessing calls inside a "if __name__ 
== '__main__'" clause, especially on Windows platforms.

--
nosy: +davin




[issue20804] Sentinels identity lost when pickled (unittest.mock)

2017-01-10 Thread Davin Potts

Davin Potts added the comment:

Serhiy: The above discussion seemed to converge on the perspective that object 
identity should not survive pickling and that the point of a sentinel is object 
identity.  While your proposed patch may mechanically work, I believe it is in 
conflict with the outcome of the thoughtful discussion above.

--
nosy: +davin




[issue29229] incompatible: unittest.mock.sentinel and multiprocessing.Pool.map()

2017-01-10 Thread Davin Potts

Davin Potts added the comment:

I think this should be regarded as a duplicate of issue20804 though discussion 
in issue14577 is also related/relevant.

--
superseder:  -> Sentinels identity lost when pickled (unittest.mock)




[issue29229] incompatible: unittest.mock.sentinel and multiprocessing.Pool.map()

2017-01-10 Thread Davin Potts

Davin Potts added the comment:

This arises from the behavior of pickle (which is used by default in 
multiprocessing to serialize objects sent to / received from other processes in 
data exchanges), as seen with Python 3.6:

>>> import pickle
>>> x = pickle.dumps(mock.sentinel.foo)
>>> x
b'\x80\x03cunittest.mock\n_SentinelObject\nq\x00)\x81q\x01}q\x02X\x04\x00\x00\x00nameq\x03X\x03\x00\x00\x00fooq\x04sb.'
>>> pickle.loads(x)
sentinel.foo
>>> pickle.loads(x) == mock.sentinel.foo
False

--
nosy: +davin



