On Sat, Apr 12, 2014 at 12:07 AM, Sturla Molden <sturla.mol...@gmail.com> wrote:
> On 12/04/14 00:39, Nathaniel Smith wrote:
>
>> The spawn mode is fine and all, but (a) the presence of something in
>> 3.4 helps only a minority of users, (b) "spawn" is not a full
>> replacement for fork;
>
> It basically does the same as on Windows. If you want portability to
> Windows, you must abide by these restrictions anyway.

Yes, but "sorry Unix guys, we've decided to take away this nice
feature from you because it doesn't work on Windows" is a really
terrible argument. If it can't be made to work, then fine, but fork
safety is just not *that* much to ask.

>> with large read-mostly data sets it can be a
>> *huge* win to load them into the parent process and then let them be
>> COW-inherited by forked children.
>
> The thing is that Python reference counting breaks COW fork. This has been
> discussed several times on the Python-dev list. What happens is that as
> soon as the child process updates a refcount, the OS copies the page.
> And because of how Python behaves, this copying of COW-marked pages
> quickly gets excessive. Effectively the performance of os.fork in Python
> will be close to a non-COW fork. A suggested solution is to move the
> refcounts out of the PyObject struct, and perhaps keep them in a
> dedicated heap. But doing so will be unfriendly to cache.

Yes, it's limited, but again this is not a reason to break it in the
cases where it *does* work. The case where I ran into this was loading
a big language model using SRILM:
  http://www.speech.sri.com/projects/srilm/
  https://github.com/njsmith/pysrilm
This produces a single Python object that references an opaque,
tens-of-gigabytes mess of C++ objects. For this case explicit shared
mem is useless, but fork worked brilliantly.
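
To make the pattern concrete, here is a minimal sketch of "load once in
the parent, let forked children read it". It is Unix-only, and
load_model() is a hypothetical stand-in for the expensive load; it is
not the SRILM or pysrilm API. The children only read the inherited
object and send a small result back over a pipe. With a pure-Python
list like this one, refcount updates will still dirty some pages; the
win is largest when the bulk of the data sits behind an opaque C/C++
object that refcounting never touches.

```python
import os


def load_model():
    # Hypothetical stand-in for an expensive, read-mostly load.
    return list(range(1_000_000))


def query_in_children(model, queries):
    """Fork one child per query; each child reads the inherited model."""
    children = []
    for q in queries:
        r, w = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: the model is already mapped via copy-on-write,
            # so nothing is reloaded or pickled here.
            os.close(r)
            try:
                os.write(w, str(model[q]).encode())
            finally:
                os._exit(0)  # skip the parent's cleanup handlers
        # Parent: keep only the read end of this child's pipe.
        os.close(w)
        children.append((pid, r))

    results = []
    for pid, r in children:
        with os.fdopen(r) as f:
            results.append(int(f.read()))
        os.waitpid(pid, 0)  # reap the child
    return results


MODEL = load_model()
RESULTS = query_in_children(MODEL, [0, 10, 999_999])
print(RESULTS)
```

Under the "spawn" start method each child would instead have to load
or receive its own copy of the model, which is exactly the cost the
fork approach avoids.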

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
