Re: [Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals

Sean Harrington Fri, 12 Oct 2018 05:37:17 -0700

Hi Nathaniel - if this solution can be made performant, then I would
be more than satisfied.

I think this would require removing "func" from the "task tuple", and
storing the "func" "once per worker" somewhere globally (maybe a class
attribute set post-fork?).

This also has the beneficial outcome of increasing general performance of
Pool.map and friends. I've seen MANY folks across the interwebs doing
things like passing instance methods to map, resulting in "big" tasks, and
slower-than-sequential parallelized code. Parallelizing "instance methods"
by passing them to map, w/o needing to wrangle with staticmethods and
globals, would be a GREAT feature! It'd just be as easy as:

    Pool.map(self.func, ls)
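
To illustrate why bound methods lead to "big" tasks: pickling a bound
method also pickles the instance it is bound to, so any large attribute
goes along with it. A minimal sketch, where Model, big_cache and plain_func
are made-up names for illustration only:

```python
import pickle


class Model:
    """Hypothetical class carrying a large attribute."""

    def __init__(self, n):
        self.big_cache = list(range(n))

    def func(self, x):
        return x + len(self.big_cache)


def plain_func(x):
    return x + 1


m = Model(100_000)

# A bound method is pickled together with its instance (and big_cache).
bound_size = len(pickle.dumps(m.func))

# A module-level function is pickled by reference (module and name only).
plain_size = len(pickle.dumps(plain_func))

print("bound method:", bound_size, "bytes")
print("plain function:", plain_size, "bytes")
```

The bound-method payload grows with the size of big_cache, while the plain
function stays a few dozen bytes regardless.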

What do you think about this idea? This is something I'd be able to take
on, assuming I get a few core dev blessings...

On Thu, Oct 4, 2018 at 6:15 AM Nathaniel Smith <n...@pobox.com> wrote:

> On Wed, Oct 3, 2018 at 6:30 PM, Sean Harrington <seanhar...@gmail.com>
> wrote:
> > with Pool(func_kwargs={"big_cache": big_cache}) as pool:
> >     pool.map(func, ls)
>
> I feel like it would be nicer to spell this:
>
> with Pool() as pool:
>     pool.map(functools.partial(func, big_cache=big_cache), ls)
>
> And this might also solve your problem, if pool.map is clever enough
> to only send the function object once to each worker?
>
> -n
>
> --
> Nathaniel J. Smith -- https://vorpus.org
>

