On Wed, May 2, 2018, 4:53 AM Gregory Szorc <gregory.sz...@gmail.com> wrote:
> On 7/19/2017 12:15 PM, Larry Hastings wrote:
> >
> >
> > On 07/19/2017 05:59 AM, Victor Stinner wrote:
> >> Mercurial startup time is already 45.8x slower than Git whereas tested
> >> Mercurial runs on Python 2.7.12. Now try to sell Python 3 to Mercurial
> >> developers, with a startup time 2x - 3x slower...
> >
> > When Matt Mackall spoke at the Python Language Summit some years back, I
> > recall that he specifically complained about Python startup time. He
> > said Python 3 "didn't solve any problems for [them]"--they'd already
> > solved their Unicode hygiene problems--and that Python's slow startup
> > time was already a big problem for them. Python 3 being /even slower/
> > to start was absolutely one of the reasons why they didn't want to
> > upgrade.
> >
> > You might think "what's a few milliseconds matter". But if you run
> > hundreds of commands in a shell script it adds up. git's speed is one
> > of the few bright spots in its UX, and hg's comparative slowness here is
> > a palpable disadvantage.
> >
> >
> >> So please continue efforts to make Python startup even faster to beat
> >> all other programming languages, and finally convince Mercurial to
> >> upgrade ;-)
> >
> > I believe Mercurial is, finally, slowly porting to Python 3.
> >
> > https://www.mercurial-scm.org/wiki/Python3
> >
> > Nevertheless, I can't really be annoyed or upset at them moving slowly
> > to adopt Python 3, as Matt's objections were entirely legitimate.
>
> I just now found this thread when searching the archive for
> threads about startup time. And I was searching for threads about
> startup time because Mercurial's startup time has been getting slower
> over the past few months and this is causing substantial pain.
>
> As I posted back in 2014 [1], CPython's startup overhead was >10% of the
> total CPU time in Mercurial's test suite. And when you factor in the
> time to import modules that get Mercurial to a point where it can run
> commands, it was more like 30%!
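
For anyone who wants rough numbers for their own machine, here is a minimal sketch of one way to compare bare interpreter startup with startup plus imports, and to scale the per-process cost to many invocations. The helper names (`startup_seconds`, `total_overhead_seconds`) are made up for illustration, `json` is only a stand-in for an application's own modules, and this is not how Mercurial's test suite measures anything. The timings include the cost of spawning the child process, so they somewhat overstate what the interpreter alone spends.

```python
import subprocess
import sys
import time


def startup_seconds(code="pass", runs=20):
    """Return the best wall-clock time to start the interpreter and run `code`."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # Launch a fresh interpreter each time; this is the cost being measured.
        subprocess.run([sys.executable, "-c", code], check=True)
        best = min(best, time.perf_counter() - start)
    return best


def total_overhead_seconds(per_invocation_seconds, invocations):
    """Scale a per-process cost to a number of process invocations."""
    return per_invocation_seconds * invocations


if __name__ == "__main__":
    bare = startup_seconds("pass")
    # `json` is a hypothetical stand-in for the imports a real CLI would do.
    with_imports = startup_seconds("import json")
    print(f"bare startup:      {bare * 1000:.1f} ms")
    print(f"startup + imports: {with_imports * 1000:.1f} ms")
    print(f"x1000 invocations: {total_overhead_seconds(bare, 1000):.1f} s")
```

On Python 3.7 and later, running `python -X importtime -c "import json"` prints a per-module breakdown of import time, which is usually a better tool for finding which imports are expensive.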
>
> Mercurial's full test suite currently runs `hg` ~25,000 times. Using
> Victor's startup time numbers of 6.4ms for 2.7 and 14.5ms for
> 3.7/master, Python startup overhead contributes ~160s on 2.7 and ~360s
> on 3.7/master. Even if you divide this by the number of available CPU
> cores, we're talking dozens of seconds of wall time just waiting for
> CPython to get to a place where Mercurial's first bytecode can execute.
>
> And the problem is worse when you factor in the time it takes to import
> Mercurial's own modules.
>
> As a concrete example, I recently landed a Mercurial patch [2] that
> stubs out zope.interface to prevent the import of 9 modules on every
> `hg` invocation. This "only" saved ~6.94ms for a typical `hg`
> invocation. But this decreased the CPU time required to run the test
> suite on my i7-6700K from ~4450s to ~3980s (~89.5% of original) - a
> reduction of almost 8 minutes of CPU time (and over 1 minute of wall time)!
>
> By the time CPython gets Mercurial to a point where we can run useful
> code, we've already blown most of or past the time budget where humans
> perceive an action/command as instantaneous. If you ignore startup
> overhead, Mercurial's performance compares quite well to Git's for many
> operations. But the reality is that CPython startup overhead makes it
> look like Mercurial is non-instantaneous before Mercurial even has the
> opportunity to execute meaningful code!
>
> Mercurial provides a `chg` program that essentially spins up a daemon
> `hg` process running a "command server" so the `chg` program [written in
> C - no startup overhead] can dispatch commands to an already-running
> Python/`hg` process and avoid paying the startup overhead cost. When you
> run Mercurial's test suite using `chg`, it completes *minutes* faster.
> `chg` exists mainly as a workaround for slow startup overhead.
>
> Changing gears, my day job is maintaining Firefox's build system. We use
> Python heavily in the build system.
> And again, Python startup overhead
> is problematic. I don't have numbers offhand, but we invoke likely a few
> hundred Python processes as part of building Firefox. It should be
> several thousand. But, we've had to "hack" parts of the build system to
> "batch" certain build actions in single process invocations in order to
> avoid Python startup overhead. This undermines the ability of some build
> tools to formulate a reasonable understanding of the DAG and it causes a
> bit of pain for build system developers and makes it difficult to
> achieve "no-op" and fast incremental builds because we're always
> invoking certain Python processes because we've had to move DAG
> awareness out of the build backend and into Python. At some point, we'll
> likely replace Python code with Rust so the build system is more "pure"
> and easier to maintain and reason about.
>
> I've seen posts in this thread and elsewhere in the CPython development
> universe that challenge whether milliseconds in startup time matter.
> Speaking as a Mercurial and Firefox build system developer,
> *milliseconds absolutely matter*. Going further, *fractions of
> milliseconds matter*. For Mercurial's test suite with its ~25,000 Python
> process invocations, 1ms translates to ~25s of CPU time. With 2.7,
> Mercurial can dispatch commands in ~50ms. When you load common
> extensions, it isn't uncommon to see process startup overhead of
> 100-150ms! A millisecond here. A millisecond there. Before you know it,
> we're talking *minutes* of CPU (and potentially wall) time in order to
> run Mercurial's test suite (or build Firefox, or ...).
>
> From my perspective, Python process startup and module import overhead
> is a severe problem for Python. I don't say this lightly, but in my mind
> the problem causes me to question the viability of Python for popular
> use cases, such as CLI applications. When choosing a programming
> language, I want one that will scale as a project grows.
> Vanilla process overhead has Python starting off significantly slower
> than compiled code (or even Perl) and adding module import overhead into
> the mix makes Python slower and slower as projects grow. As someone who
> has to deal with this slowness on a daily basis, I can tell you that it
> is extremely frustrating and it does matter. I hope that the importance
> of the problem will be acknowledged (milliseconds *do* matter) and that
> creative minds will band together to address it. Since I am
> disproportionately impacted by this issue, if there's anything I can do
> to help, let me know.
>

Is your Python interpreter statically linked? The Python 3 interpreters from the Anaconda distribution (use Miniconda!) are statically linked on Linux and macOS, and that roughly halved our startup times.

> Gregory
>
> [1] https://mail.python.org/pipermail/python-dev/2014-May/134528.html
> [2] https://www.mercurial-scm.org/repo/hg/rev/856f381ad74b
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/mingw.android%40gmail.com
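
If it is unclear how a given interpreter was built, the following is a rough sketch of one way to check from within Python. It relies on the `Py_ENABLE_SHARED` build variable exposed through `sysconfig`, which reflects whether CPython was configured with `--enable-shared`. The function name `libpython_linkage` is hypothetical, and "static" here only means libpython is linked into the executable, not that the binary is fully static. On platforms where the variable is not exposed (for example Windows), the sketch reports "unknown".

```python
import sys
import sysconfig


def libpython_linkage():
    """Describe how this interpreter was built with respect to libpython."""
    shared = sysconfig.get_config_var("Py_ENABLE_SHARED")
    if shared is None:
        # The build variable is not available on this platform.
        return "unknown"
    return "shared" if shared else "static"


if __name__ == "__main__":
    print(sys.executable, "->", libpython_linkage())
```

Comparing startup timings between a "shared" and a "static" build on the same machine would be the way to see whether the halving mentioned above carries over to a particular setup.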