On 11/29/19 8:51 PM, Gregory Szorc wrote:


On Nov 29, 2019, at 11:46, Augie Fackler <r...@durin42.com> wrote:




On Fri, Nov 29, 2019, 06:45 Pierre-Yves David <pierre-yves.da...@ens-lyon.org> wrote:



    On 11/12/19 4:35 AM, Gregory Szorc wrote:
    > On Mon, Nov 11, 2019 at 6:32 AM Augie Fackler <r...@durin42.com> wrote:
    >
    >     (+indygreg)
    >
    >      > On Nov 11, 2019, at 03:04, Pierre-Yves David
    >     <pierre-yves.da...@ens-lyon.org> wrote:
    >      >
    >      > Hi everyone,
    >      >
    >      > I am looking into introducing parallelism into `hg
    >     debugupgraderepo`. I already have a very useful prototype that
    >     precomputes copies information in parallel when converting to
    >     side-data storage. That prototype uses multiprocessing because
    >     it is part of the stdlib and works quite well for this use case.
    >      >
    >      > However, I know we have refrained from using multiprocessing
    >     in the past. I know the import and bootstrap cost was too heavy
    >     for things like `hg update`. However, I am not sure if there
    >     are other reasons to rule out the multiprocessing module in the
    >     `hg debugupgraderepo` case.
    >
    >     I have basically only ever heard bad things about
    >     multiprocessing, especially on Windows, which is the platform
    >     where you'd expect it to be the most useful (since there's no
    >     fork()). I think Greg has more details in his head.
    >
    >     That said, I guess feel free to experiment, in the knowledge
    >     that it probably isn't significantly better than our extant
    >     worker system?
    >
    >
    > multiprocessing is a pit of despair on Python 2.7. It is a bit
    > better on Python 3. But I still don't trust it. I think you are
    > better off using `concurrent.futures.ProcessPoolExecutor`.

    That looks great, but this is not available in Python 2.7.


There's a backport of the 3.x concurrent.futures package available on PyPI, and AIUI it fixes some important bugs in the package that never landed in the 2.x-era releases.

We have it vendored :)

Only used on Python 2 via a pycompat shim, IIRC.
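For reference, a minimal sketch of the kind of thing Greg suggests with `concurrent.futures.ProcessPoolExecutor`; the per-revision function here is a hypothetical stand-in, not actual Mercurial code (and it assumes picklable, top-level functions, as the executor requires):

```python
from concurrent.futures import ProcessPoolExecutor, as_completed


def compute_copies_for_rev(rev):
    # Hypothetical stand-in for per-revision copy-tracing work.
    return rev, rev * 2


def process_revs(revs, max_workers=4):
    """Fan per-revision work out to worker processes, collecting results."""
    results = {}
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(compute_copies_for_rev, r) for r in revs]
        for fut in as_completed(futures):
            rev, data = fut.result()
            results[rev] = data
    return results
```

Note that `submit()` eagerly creates one future per input, so all pending inputs and results are held in memory at once.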

I looked into ProcessPoolExecutor further, and it does not have anything handy to deal with my "do not load the whole repository data in memory at once" constraint. So I think the handcrafted multiprocessing patches I emailed on Friday are still relevant. I am sending a V2 now with Yuya's feedback applied.
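To illustrate the constraint: `multiprocessing.Pool.imap` consumes its input iterator lazily, which is one way to keep only a bounded window of revision data in flight. This is a hedged sketch, not the actual patch; `sidedata_for_rev` is a hypothetical placeholder:

```python
import multiprocessing


def sidedata_for_rev(rev):
    # Hypothetical stand-in for computing side-data for one revision.
    return rev, rev * rev


def upgrade_stream(revs, workers=2):
    """Process revisions through a worker pool without materializing
    the whole input up front."""
    out = []
    with multiprocessing.Pool(processes=workers) as pool:
        # imap pulls from the input iterator incrementally and yields
        # results in order, so the pool never needs the full list of
        # pending inputs in memory at once (unlike Pool.map).
        for rev, data in pool.imap(sidedata_for_rev, revs, chunksize=4):
            out.append((rev, data))
    return out
```

`chunksize` trades per-task dispatch overhead against how much input is buffered per worker, which matters when each item carries a lot of data.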

Cheers,

--
Pierre-Yves David
_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
