> On Nov 29, 2019, at 11:46, Augie Fackler <r...@durin42.com> wrote:
> 
> 
> 
> 
>> On Fri, Nov 29, 2019, 06:45 Pierre-Yves David 
>> <pierre-yves.da...@ens-lyon.org> wrote:
>> 
>> 
>> On 11/12/19 4:35 AM, Gregory Szorc wrote:
>> > On Mon, Nov 11, 2019 at 6:32 AM Augie Fackler <r...@durin42.com 
>> > <mailto:r...@durin42.com>> wrote:
>> > 
>> >     (+indygreg)
>> > 
>> >      > On Nov 11, 2019, at 03:04, Pierre-Yves David
>> >     <pierre-yves.da...@ens-lyon.org
>> >     <mailto:pierre-yves.da...@ens-lyon.org>> wrote:
>> >      >
>> >      > Hi everyone,
>> >      >
>> >      > I am looking into introducing parallelism into `hg
>> >     debugupgraderepo`. I already have a very useful prototype that
>> >     precomputes copies information in parallel when converting to
>> >     side-data storage. That prototype uses multiprocessing because it
>> >     is part of the stdlib and works quite well for this use case.
>> >      >
>> >      > However, I know we have refrained from using multiprocessing in
>> >     the past. I know the import and bootstrap cost was too heavy for
>> >     things like `hg update`. However, I am not sure whether there are
>> >     other reasons to rule out the multiprocessing module in the
>> >     `hg debugupgraderepo` case.
>> > 
>> >     I have basically only ever heard bad things about multiprocessing,
>> >     especially on Windows which is the platform where you'd expect it to
>> >     be the most useful (since there's no fork()). I think Greg has more
>> >     details in his head.
>> > 
>> >     That said, I guess feel free to experiment, in the knowledge that it
>> >     probably isn't significantly better than our extant worker system?
>> > 
>> > 
>> > multiprocessing is a pit of despair on Python 2.7. It is a bit better on 
>> > Python 3. But I still don't trust it. I think you are better off using 
>> > `concurrent.futures.ProcessPoolExecutor`.
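[For reference, a minimal sketch of what dispatching per-revision work through `ProcessPoolExecutor` could look like. `compute_copies` and `precompute` are hypothetical stand-ins for the real copy-tracing work discussed above, not actual Mercurial code:]

```python
# Sketch only: dispatch per-revision work across worker processes with
# concurrent.futures.ProcessPoolExecutor. compute_copies() is a
# hypothetical placeholder for the real copy computation that
# `hg debugupgraderepo` would offload.
from concurrent.futures import ProcessPoolExecutor


def compute_copies(rev):
    # Placeholder for the real per-revision copy computation.
    return (rev, rev * 2)


def precompute(revs, max_workers=None):
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        # pool.map() preserves input order, which keeps results
        # straightforward to merge back into the main process.
        return dict(pool.map(compute_copies, revs))


if __name__ == "__main__":
    print(precompute(range(5)))
```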
>> 
>> That looks great, but this is not available in Python 2.7
> 
> 
> There's a backport of the 3.x concurrent.futures available on PyPI, and
> AIUI it fixes some important bugs in the package that never landed in 2.x.

We have it vendored :)

Only used on Python 2 via pycompat shim IIRC.

> 
>> 
>> > But I'm not even sure I trust ProcessPoolExecutor on Windows, especially 
>> > when `sys.executable` is `hg.exe` instead of `python.exe`: I think both 
>> > multiprocessing and concurrent.futures make assumptions about how to 
>> > invoke the "run a worker" code on a new process that are invalidated when 
>> > the main process isn't `python.exe`.
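[The knob that exists on the multiprocessing side is `multiprocessing.set_executable()`, which lets the parent override which binary launches workers. Whether that is sufficient when `sys.executable` is `hg.exe` is exactly the open question here; treat this as a sketch of the workaround surface, not a verified fix:]

```python
# Sketch only: multiprocessing lets the parent process override the
# binary used to spawn workers. This matters when sys.executable is
# not a real Python interpreter (e.g. hg.exe on Windows).
import sys
import multiprocessing
from multiprocessing import spawn


def use_real_python(python_path):
    # Point multiprocessing at a real interpreter instead of whatever
    # sys.executable currently is; return the value it will now use.
    multiprocessing.set_executable(python_path)
    return spawn.get_executable()


if __name__ == "__main__":
    # No-op here, since we already run under a real python.
    print(use_real_python(sys.executable))
```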
>> 
>> That's unfortunate :-/ Any way to reliably test this and get it fixed 
>> upstream?
>> 
>> > So I think we may have to roll our own "start a worker" code. The 
>> > solution that's been bouncing around in my head is to add a `hg 
>> > debugworker` command (or similar) that dispatches work read from a 
>> > pipe/file descriptor/temp file to a named <module>.<function> callable. 
>> > We then implement a custom executor conforming to the interface that 
>> > concurrent.futures wants and we use that for work dispatch. One of the 
>> > hardest parts here is implementing a fair work scheduler. There are all 
>> > kinds of gnarly problems involving buffering, permissions, cross 
>> > platform differences, etc. Even Rust doesn't have a good cross-platform 
>> > library for this type of message passing, last time I asked (a few 
>> > months ago I was advised to use something like 0mq, which made me 
>> > sad). Maybe there is a reasonable Python library we can vendor. But I 
>> > suspect we'll find limitations in any implementation, as this is a 
>> > subtly hard problem.
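[The "custom executor" half of this idea is the easy part, and is worth seeing concretely. Below is a minimal executor conforming to the `concurrent.futures` interface; the actual dispatch to an `hg debugworker`-style subprocess is elided, and `submit()` runs the callable inline purely to show the Future contract a custom executor must honour:]

```python
# Sketch only: the minimal shape of a custom executor that plugs into
# the concurrent.futures interface. Real work dispatch (e.g. to an
# `hg debugworker` subprocess over a pipe) is deliberately elided;
# submit() runs inline to demonstrate the Future protocol.
from concurrent.futures import Executor, Future


class InlineExecutor(Executor):
    def submit(self, fn, *args, **kwargs):
        fut = Future()
        # Honour cancellation before starting the work.
        if not fut.set_running_or_notify_cancel():
            return fut
        try:
            fut.set_result(fn(*args, **kwargs))
        except BaseException as exc:
            fut.set_exception(exc)
        return fut


if __name__ == "__main__":
    # Executor.map() comes for free once submit() is implemented.
    with InlineExecutor() as pool:
        print(list(pool.map(pow, [2, 3], [8, 2])))  # prints [256, 9]
```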
>> 
>> Yeah, the problem is hard enough that I would rather have an external 
>> library deal with it.
>> 
>> -- 
>> Pierre-Yves David
_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
