Re: [Python-Dev] A fast startup patch (was: Python startup time)

2018-05-07 Thread Ryan Gonzalez

On May 7, 2018 9:15:32 PM Steve Dower <steve.do...@python.org> wrote:

“the data shows that a focused change to address file system inefficiencies 
has the potential to broadly and transparently deliver benefit to users 
without affecting existing code or workflows.”


This is consistent with a Node.js experiment I heard about where they 
compiled an entire application in a single (HUGE!) .js file. Reading a 
single large file from disk is quicker than many small files on every 
significant file system I’m aware of. Is there benefit to supporting import 
of .tar files as we currently do .zip? Or perhaps having a special 
fast-path for uncompressed .zip files?


I kind of built something like this, though I haven't yet put in the 
effort to make it particularly usable:


https://github.com/kirbyfan64/bluesnow

(Bonus points to anyone who gets the character reference in the name, 
though I seriously doubt anyone will.)


The main thing I noticed was that loading compiled .pyc files is far faster 
than compiling Python source, even if you eliminate the disk access. Kind of 
obvious in retrospect, but still something to note.
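A rough way to see that gap is to time compile() of source against marshal.loads() of the already-compiled code object. The synthetic module below is purely illustrative; absolute numbers will vary by machine:

```python
import marshal
import timeit

# A synthetic "module": 100 small function definitions.
source = "\n".join(f"def f{i}(x):\n    return x + {i}" for i in range(100))
code = compile(source, "<demo>", "exec")
blob = marshal.dumps(code)  # what a .pyc body essentially contains

# Compiling source vs. unmarshaling the pre-compiled code object.
t_compile = timeit.timeit(lambda: compile(source, "<demo>", "exec"), number=20)
t_unmarshal = timeit.timeit(lambda: marshal.loads(blob), number=20)
print(f"compile: {t_compile:.4f}s  unmarshal: {t_unmarshal:.4f}s")
```

On any recent CPython, the unmarshal path should come out well ahead, which is exactly why .pyc caching pays off even with the file warm in the OS cache.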


However, there are more obstacles to this in the Python world than in the JS 
world. C extensions are far more prevalent here, distribution is a bit 
weirder (sorry, even with Pipfiles), and JavaScript already has an entire 
ecosystem built around bundling files together, from the web world.




Top-posted from my Windows phone

From: Carl Shapiro
Sent: Monday, May 7, 2018 14:36
To: Nathaniel Smith
Cc: Nick Coghlan; Python Dev
Subject: Re: [Python-Dev] A fast startup patch (was: Python startup time)

On Fri, May 4, 2018 at 6:58 PM, Nathaniel Smith <n...@pobox.com> wrote:
What are the obstacles to including "preloaded" objects in regular .pyc 
files, so that everyone can take advantage of this without rebuilding the 
interpreter?


The system we have developed can create a shared object file for each 
compiled Python file.  However, such a representation is not directly 
usable.  First, certain shared constants, such as interned strings, must be 
kept globally unique across object code files.  Second, some marshaled 
objects, such as the hashed collections, must be initialized with 
randomization state that is not available until after the hosting runtime 
has been initialized.


We are able to work around the first issue by generating a heap image with 
the transitive closure of all modules that will be loaded, which allows us 
to easily maintain uniqueness guarantees.  We are able to work around the 
second issue with some unobservable changes to the affected data structures.

 
Based on our numbers, it appears there should be some hesitancy--at this 
time--about changing the format of compiled Python files for the sake of 
load-time performance.  In contrast, the data shows that a focused change 
to address file system inefficiencies has the potential to broadly and 
transparently deliver benefit to users without affecting existing code or 
workflows. 





--
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com





--
Ryan (ライアン)
Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else
https://refi64.com/




Re: [Python-Dev] A fast startup patch (was: Python startup time)

2018-05-07 Thread Steve Dower
“the data shows that a focused change to address file system inefficiencies has 
the potential to broadly and transparently deliver benefit to users without 
affecting existing code or workflows.”

This is consistent with a Node.js experiment I heard about where they compiled 
an entire application in a single (HUGE!) .js file. Reading a single large file 
from disk is quicker than many small files on every significant file system I’m 
aware of. Is there benefit to supporting import of .tar files as we currently 
do .zip? Or perhaps having a special fast-path for uncompressed .zip files?
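Import from a .zip already works today via the standard zipimport hooks, so the uncompressed fast path would build on existing machinery. A minimal sketch (the module name and contents are made up):

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build an *uncompressed* zip (ZIP_STORED) holding one module.
tmp = tempfile.mkdtemp()
zpath = os.path.join(tmp, "app.zip")
with zipfile.ZipFile(zpath, "w", compression=zipfile.ZIP_STORED) as z:
    z.writestr("hello.py", "GREETING = 'hi'\n")

# Putting the archive on sys.path is all zipimport needs.
sys.path.insert(0, zpath)
hello = importlib.import_module("hello")
print(hello.GREETING)  # hi
```

With ZIP_STORED entries the loader can read module bytes with a single seek into one file, which is the single-large-file win described above.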

Top-posted from my Windows phone

From: Carl Shapiro
Sent: Monday, May 7, 2018 14:36
To: Nathaniel Smith
Cc: Nick Coghlan; Python Dev
Subject: Re: [Python-Dev] A fast startup patch (was: Python startup time)

On Fri, May 4, 2018 at 6:58 PM, Nathaniel Smith <n...@pobox.com> wrote:
What are the obstacles to including "preloaded" objects in regular .pyc files, 
so that everyone can take advantage of this without rebuilding the interpreter?

The system we have developed can create a shared object file for each compiled 
Python file.  However, such a representation is not directly usable.  First, 
certain shared constants, such as interned strings, must be kept globally 
unique across object code files.  Second, some marshaled objects, such as the 
hashed collections, must be initialized with randomization state that is not 
available until after the hosting runtime has been initialized.

We are able to work around the first issue by generating a heap image with the 
transitive closure of all modules that will be loaded, which allows us to easily 
maintain uniqueness guarantees.  We are able to work around the second issue 
with some unobservable changes to the affected data structures.
 
Based on our numbers, it appears there should be some hesitancy--at this 
time--about changing the format of compiled Python files for the sake of load-time 
performance.  In contrast, the data shows that a focused change to address file 
system inefficiencies has the potential to broadly and transparently deliver 
benefit to users without affecting existing code or workflows. 



Re: [Python-Dev] A fast startup patch (was: Python startup time)

2018-05-07 Thread Carl Shapiro
On Fri, May 4, 2018 at 6:58 PM, Nathaniel Smith  wrote:

> What are the obstacles to including "preloaded" objects in regular .pyc
> files, so that everyone can take advantage of this without rebuilding the
> interpreter?
>

The system we have developed can create a shared object file for each
compiled Python file.  However, such a representation is not directly
usable.  First, certain shared constants, such as interned strings, must be
kept globally unique across object code files.  Second, some marshaled
objects, such as the hashed collections, must be initialized with
randomization state that is not available until after the hosting runtime
has been initialized.

We are able to work around the first issue by generating a heap image with
the transitive closure of all modules that will be loaded, which allows us
to easily maintain uniqueness guarantees.  We are able to work around the
second issue with some unobservable changes to the affected data structures.
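The hash-randomization obstacle is visible from outside the runtime: str hashes (and therefore set/dict layouts) depend on PYTHONHASHSEED, which is only fixed at interpreter startup. A small demonstration of the effect (not the actual mechanism used in the patch):

```python
import os
import subprocess
import sys

def hash_with_seed(seed: str) -> str:
    """Hash the same string in a fresh interpreter with a given hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    out = subprocess.run(
        [sys.executable, "-c", "print(hash('startup'))"],
        capture_output=True, text=True, env=env,
    )
    return out.stdout.strip()

# Different seeds give different str hashes, so a set/dict image baked
# into the binary at build time would not match the runtime's hashing.
print(hash_with_seed("1"), hash_with_seed("2"))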

Based on our numbers, it appears there should be some hesitancy--at this
time--about changing the format of compiled Python files for the sake of
load-time performance.  In contrast, the data shows that a focused change
to address file system inefficiencies has the potential to broadly and
transparently deliver benefit to users without affecting existing code or
workflows.


Re: [Python-Dev] A fast startup patch (was: Python startup time)

2018-05-05 Thread Nick Coghlan
On 6 May 2018 at 05:34, Brett Cannon  wrote:

>
>
> On Sat, 5 May 2018 at 10:41 Eric Fahlgren  wrote:
>
>> On Sat, May 5, 2018 at 10:30 AM, Toshio Kuratomi 
>> wrote:
>>
>>> On Fri, May 4, 2018, 7:00 PM Nathaniel Smith  wrote:
>>>
 What are the obstacles to including "preloaded" objects in regular .pyc
 files, so that everyone can take advantage of this without rebuilding the
 interpreter?

>>>
>>> Would this make .pyc files arch specific?
>>>
>>
>> Or have parallel "pyh" (Python "heap") files, that are architecture
>> specific...
>>
>
> .pyc files have tags to specify details about them (e.g. were they
> compiled with -OO), so this isn't an "all or nothing" option, nor does it
> require a different file extension. There just needs to be an appropriate
> finder that knows how to recognize a .pyc file with the appropriate tag
> that can be used, and then a loader that knows how to read that .pyc.
>

Right, this is the kind of change I had in mind (perhaps in combination
with Diana Clarke's suggestion from several months back to make pyc tagging
more feature-flag centric, rather than the current focus on a numeric
optimisation level).

We also wouldn't ever generate this hypothetical format implicitly -
similar to the new deterministic pyc's in 3.7, they'd be something you had
to explicitly request via a compileall invocation. In the Linux distro use
case then, the relevant distro packaging helper scripts and macros could
generate traditional cross-platform pyc files for no-arch packages, but
automatically switch to the load-time optimised arch-specific format if the
package was already arch-specific.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] A fast startup patch (was: Python startup time)

2018-05-05 Thread Brett Cannon
On Sat, 5 May 2018 at 10:41 Eric Fahlgren  wrote:

> On Sat, May 5, 2018 at 10:30 AM, Toshio Kuratomi 
> wrote:
>
>> On Fri, May 4, 2018, 7:00 PM Nathaniel Smith  wrote:
>>
>>> What are the obstacles to including "preloaded" objects in regular .pyc
>>> files, so that everyone can take advantage of this without rebuilding the
>>> interpreter?
>>>
>>
>> Would this make .pyc files arch specific?
>>
>
> Or have parallel "pyh" (Python "heap") files, that are architecture
> specific...
>

.pyc files have tags to specify details about them (e.g. were they compiled
with -OO), so this isn't an "all or nothing" option, nor does it require a
different file extension. There just needs to be an appropriate finder that
knows how to recognize a .pyc file with the appropriate tag that can be
used, and then a loader that knows how to read that .pyc.


> (But that would cost more stat calls.)
>

Nope, we actually cache directory contents, so checking for a file's
existence is essentially free (this is why importlib.invalidate_caches()
exists: specifically to work around cases where the timestamp is too coarse
to reflect a directory content mutation).

-Brett
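That directory-contents cache is observable from regular Python code. A sketch of the failure mode invalidate_caches() exists for (the temporary module name is made up):

```python
import importlib
import os
import sys
import tempfile

pkg_dir = tempfile.mkdtemp()
sys.path.insert(0, pkg_dir)

# Prime the path finder's cached listing of pkg_dir with a failed lookup...
try:
    import freshmod  # noqa: F401
except ImportError:
    pass

# ...then create the module. If the directory mtime is too coarse, the
# stale cached listing could hide the new file from the finder.
with open(os.path.join(pkg_dir, "freshmod.py"), "w") as f:
    f.write("VALUE = 42\n")

importlib.invalidate_caches()  # force the finders to re-scan the directory
import freshmod
print(freshmod.VALUE)  # 42
```

So the extra candidate filenames Brett mentions cost dictionary lookups against the cached listing, not extra stat() calls.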


Re: [Python-Dev] A fast startup patch (was: Python startup time)

2018-05-05 Thread Terry Reedy

On 5/5/2018 2:33 PM, Toshio Kuratomi wrote:



On Sat, May 5, 2018, 10:40 AM Eric Fahlgren wrote:


On Sat, May 5, 2018 at 10:30 AM, Toshio Kuratomi wrote:

On Fri, May 4, 2018, 7:00 PM Nathaniel Smith wrote:

What are the obstacles to including "preloaded" objects in
regular .pyc files, so that everyone can take advantage of
this without rebuilding the interpreter?


Would this make .pyc files arch specific?


Or have parallel "pyh" (Python "heap") files, that are architecture
specific... (But that would cost more stat calls.)


I ask because arch-specific byte code files are a big change in 
consumers' expectations.  It's not necessarily a bad change but it should 
be communicated to downstreams so they can decide how to adjust to it.


Linux distros which ship byte code files will need to build them for 
each arch, for instance.  People who ship just the byte code as an 
obfuscation of the source code will need to decide whether to ship 
packages for each arch they care about or change how they distribute.


It is an advertised feature that CPython *can* produce cross-platform 
version-specific .pyc files.  I believe this should continue, at least 
for a few releases.  They are currently named modname.cpython-xy.pyc, 
with optional '.opt-1', '.opt-2', and '.opt-4' tags inserted before the 
'.pyc' suffix, in __pycache__.  These name formats should continue to mean 
what they do now.
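The current tag scheme is visible via importlib.util.cache_from_source(); presumably an architecture tag would extend the same naming convention rather than replace it:

```python
import importlib.util

# Cross-platform, version-tagged cache name: __pycache__/mod.cpython-XY.pyc
print(importlib.util.cache_from_source("mod.py"))

# An optimization level becomes an extra tag before the suffix: ...opt-2.pyc
print(importlib.util.cache_from_source("mod.py", optimization=2))
```

(The exact version number in the output depends on the interpreter running the snippet.)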


I believe *can* should not mean *always*.  Architecture-specific files 
will need an additional architecture tag anyway, such as win32 or 
win64.  Or would bitness and endianness be sufficient across 
platforms?   If we make architecture-specific the default, we could add 
startup, compile, and compileall options for the cross-platform 
format.  Or maybe add a recompile function that imports cross-platform 
.pycs and outputs local-architecture .pycs.


--
Terry Jan Reedy




Re: [Python-Dev] A fast startup patch (was: Python startup time)

2018-05-05 Thread Miro Hrončok

On 5.5.2018 21:00, Nathaniel Smith wrote:
I think in the vast majority of cases currently .pyc files are built on 
the same architecture where they're used?


On Fedora (and by extension also on RHEL and CentOS) this is not true. 
When a package is noarch (no extension module shipped, only pure 
Python) it is built and byte-compiled on an arbitrary architecture. 
Byte-compilation happens at build time.


If bytecode gets arch-specific, we'd need to make all our Python 
packages arch-specific or switch to install-time byte-compilation.



--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok


Re: [Python-Dev] A fast startup patch (was: Python startup time)

2018-05-05 Thread Glenn Linderman

On 5/5/2018 10:30 AM, Toshio Kuratomi wrote:
On Fri, May 4, 2018, 7:00 PM Nathaniel Smith wrote:


What are the obstacles to including "preloaded" objects in regular
.pyc files, so that everyone can take advantage of this without
rebuilding the interpreter?


Would this make .pyc files arch specific?


Lots of room in the __pycache__ folder.

As compilation of the .py module proceeds, could it be determined if 
there is anything that needs to be architecture specific, and emit an 
architecture-specific one or an architecture-independent one as 
appropriate?  Data structures are mostly bitness-dependent, no?


But if an architecture-specific .pyc is required, could/should it be 
structured and named according to the OS conventions also:  .dll .so  .etc ?


Even if it doesn't contain executable code, the bytecode could be 
contained in appropriate data sections, and there has been talk about 
doing relocation of pointers in such pre-compiled data structures, and 
the linker _already_ can do that sort of thing...


Re: [Python-Dev] A fast startup patch (was: Python startup time)

2018-05-05 Thread Nathaniel Smith
On Sat, May 5, 2018, 11:34 Toshio Kuratomi  wrote:

>
>
> On Sat, May 5, 2018, 10:40 AM Eric Fahlgren 
> wrote:
>
>> On Sat, May 5, 2018 at 10:30 AM, Toshio Kuratomi 
>> wrote:
>>
>>> On Fri, May 4, 2018, 7:00 PM Nathaniel Smith  wrote:
>>>
 What are the obstacles to including "preloaded" objects in regular .pyc
 files, so that everyone can take advantage of this without rebuilding the
 interpreter?

>>>
>>> Would this make .pyc files arch specific?
>>>
>>
>> Or have parallel "pyh" (Python "heap") files, that are architecture
>> specific... (But that would cost more stat calls.)
>>
>
> I ask because arch-specific byte code files are a big change in consumers'
> expectations.  It's not necessarily a bad change but it should be
> communicated to downstreams so they can decide how to adjust to it.
>
> Linux distros which ship byte code files will need to build them for each
> arch, for instance.  People who ship just the byte code as an obfuscation
> of the source code will need to decide whether to ship packages for each
> arch they care about or change how they distribute.
>

That's a good point.

One way to minimize the disruption would be to include both the old and new
info in the .pyc files, so at load time if the new version is incompatible
then you can fall back on the old way, even if it's a bit slower.

I think in the vast majority of cases currently .pyc files are built on the
same architecture where they're used? Pip and Debian/Ubuntu and the
interpreter's automatic compilation-on-import all build .pyc files on the
computer where they'll be run.

It might also be worth double-checking how much the memory layout of these
objects even varies. Obviously it'll be different for 32- and 64-bit
systems, but beyond that, most ISAs and OSes and compilers use pretty
similar struct layout rules AFAIK... we're not talking about actual machine
code.

-n

>


Re: [Python-Dev] A fast startup patch (was: Python startup time)

2018-05-05 Thread Toshio Kuratomi
On Sat, May 5, 2018, 10:40 AM Eric Fahlgren  wrote:

> On Sat, May 5, 2018 at 10:30 AM, Toshio Kuratomi 
> wrote:
>
>> On Fri, May 4, 2018, 7:00 PM Nathaniel Smith  wrote:
>>
>>> What are the obstacles to including "preloaded" objects in regular .pyc
>>> files, so that everyone can take advantage of this without rebuilding the
>>> interpreter?
>>>
>>
>> Would this make .pyc files arch specific?
>>
>
> Or have parallel "pyh" (Python "heap") files, that are architecture
> specific... (But that would cost more stat calls.)
>

I ask because arch-specific byte code files are a big change in consumers'
expectations.  It's not necessarily a bad change but it should be
communicated to downstreams so they can decide how to adjust to it.

Linux distros which ship byte code files will need to build them for each
arch, for instance.  People who ship just the byte code as an obfuscation
of the source code will need to decide whether to ship packages for each
arch they care about or change how they distribute.

-Toshio

>


Re: [Python-Dev] A fast startup patch (was: Python startup time)

2018-05-05 Thread Eric Fahlgren
On Sat, May 5, 2018 at 10:30 AM, Toshio Kuratomi  wrote:

> On Fri, May 4, 2018, 7:00 PM Nathaniel Smith  wrote:
>
>> What are the obstacles to including "preloaded" objects in regular .pyc
>> files, so that everyone can take advantage of this without rebuilding the
>> interpreter?
>>
>
> Would this make .pyc files arch specific?
>

Or have parallel "pyh" (Python "heap") files, that are architecture
specific... (But that would cost more stat calls.)


Re: [Python-Dev] A fast startup patch (was: Python startup time)

2018-05-05 Thread Toshio Kuratomi
On Fri, May 4, 2018, 7:00 PM Nathaniel Smith  wrote:

> What are the obstacles to including "preloaded" objects in regular .pyc
> files, so that everyone can take advantage of this without rebuilding the
> interpreter?
>

Would this make .pyc files arch specific?

-Toshio


Re: [Python-Dev] A fast startup patch (was: Python startup time)

2018-05-05 Thread Nick Coghlan
On 5 May 2018 at 11:58, Nathaniel Smith  wrote:

> What are the obstacles to including "preloaded" objects in regular .pyc
> files, so that everyone can take advantage of this without rebuilding the
> interpreter?
>
> Off the top of my head:
>
> We'd be making the in-memory layout of those objects part of the .pyc
> format, so we couldn't change that within a minor release. I suspect this
> wouldn't be a big change though, since we already commit to ABI
> compatibility for C extensions within a minor release? In principle there
> are some cases where this would be different (e.g. adding new fields at the
> end of an object is generally ABI compatible), but this might not be an
> issue for the types of objects we're talking about.
>

I'd frame this one a bit differently: what if we had a platform-specific
variant of the pyc format that was essentially a frozen module packaged as
an extension module? We probably couldn't quite do that for arbitrary
Python modules *today* (due to the remaining capability differences between
regular modules and extension modules), but multi-phase initialisation gets
things *much* closer to parity, and running embedded bytecode instead of
accessing the C API directly should avoid the limitations that exist for
classes defined in C.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] A fast startup patch (was: Python startup time)

2018-05-04 Thread Nathaniel Smith
What are the obstacles to including "preloaded" objects in regular .pyc
files, so that everyone can take advantage of this without rebuilding the
interpreter?

Off the top of my head:

We'd be making the in-memory layout of those objects part of the .pyc
format, so we couldn't change that within a minor release. I suspect this
wouldn't be a big change though, since we already commit to ABI
compatibility for C extensions within a minor release? In principle there
are some cases where this would be different (e.g. adding new fields at the
end of an object is generally ABI compatible), but this might not be an
issue for the types of objects we're talking about.

There's some memory management concern, since these are, y'know, heap
objects, and we wouldn't be heap allocating them. The main constraint would
be that you couldn't free them one at a time, but would have to free the
whole block at once. But I think it at least wouldn't be too hard to track
whether any of the objects in the block are still alive, and free the whole
block if there aren't any. E.g., we could have an object flag that means
"when this object is freed, don't call free(); instead find the containing
block and decrement its live-object count." You probably need this flag even
in the current version, right? (And the flag could also be an escape hatch
if we did need to change object size: check for the flag before accessing
the new fields.) Or maybe you could get clever tracking object liveness on
a page-by-page basis; not sure it's worth it though. Unloading
module-level objects is pretty rare.

I'm assuming these objects can have pointers to each other, and to well
known constants like None, so you need some kind of relocation engine to
fix those up. Right now I guess you're probably using the one built into
the dynamic loader? In theory it shouldn't be too hard to write our own –
basically just a list of offsets in the block where we need to add the base
address or write the address of a well known constant, I think?
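A toy version of that relocation idea, with the "pointers" modeled as 8-byte slots in a bytearray and entirely made-up addresses, might look like:

```python
import struct

PTR = struct.Struct("<Q")            # 8-byte little-endian "pointer" slots
WELL_KNOWN = {0: 0xA000_0000}        # fake table: id 0 -> address of None

def relocate(block, base, rel_offsets, const_fixups):
    """Patch a memory image in place after it is 'loaded' at `base`."""
    # Add the load address to every block-internal pointer...
    for off in rel_offsets:
        (val,) = PTR.unpack_from(block, off)
        PTR.pack_into(block, off, val + base)
    # ...and write in the addresses of well-known singletons.
    for off, const_id in const_fixups:
        PTR.pack_into(block, off, WELL_KNOWN[const_id])

# A 16-byte "block": slot at offset 0 points inside the block (at offset 8,
# pre-relocation); the slot at offset 8 will hold a well-known constant.
block = bytearray(16)
PTR.pack_into(block, 0, 8)
relocate(block, base=0x5000, rel_offsets=[0], const_fixups=[(8, 0)])
print(hex(PTR.unpack_from(block, 0)[0]))  # 0x5008
```

The real engine would of course patch actual object headers in the data segment, but the bookkeeping is just these two lists of offsets.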

Anything else I'm missing?

On Fri, May 4, 2018, 16:06 Carl Shapiro  wrote:

> On Fri, May 4, 2018 at 5:14 AM, Nick Coghlan  wrote:
>
>> This definitely seems interesting, but is it something you'd be seeing us
>> being able to take advantage of for conventional Python installations, or
>> is it more something you'd expect to be useful for purpose-built
>> interpreter instances? (e.g. if Mercurial were running their own Python,
>> they could precache the heap objects for their commonly imported modules in
>> their custom interpreter binary, regardless of whether those were standard
>> library modules or not).
>>
>
> Yes, this would be a win for a conventional Python installation as well.
> Specifically, users and their scripts would enjoy a reduction in
> cold-startup time.
>
> In the numbers I showed yesterday, the version of the interpreter with our
> patch applied included unmarshaled data for the modules that always appear
> on the sys.modules list after an ordinary interpreter cold-start.  I
> believe it is worthwhile to include that set of modules in the standard
> CPython interpreter build.  Expanding that set to include the commonly
> imported modules might be an additional win, especially for short-running
> scripts.
>


Re: [Python-Dev] A fast startup patch (was: Python startup time)

2018-05-04 Thread Carl Shapiro
On Fri, May 4, 2018 at 5:14 AM, Nick Coghlan  wrote:

> This definitely seems interesting, but is it something you'd be seeing us
> being able to take advantage of for conventional Python installations, or
> is it more something you'd expect to be useful for purpose-built
> interpreter instances? (e.g. if Mercurial were running their own Python,
> they could precache the heap objects for their commonly imported modules in
> their custom interpreter binary, regardless of whether those were standard
> library modules or not).
>

Yes, this would be a win for a conventional Python installation as well.
Specifically, users and their scripts would enjoy a reduction in
cold-startup time.

In the numbers I showed yesterday, the version of the interpreter with our
patch applied included unmarshaled data for the modules that always appear
on the sys.modules list after an ordinary interpreter cold-start.  I
believe it is worthwhile to include that set of modules in the standard
CPython interpreter build.  Expanding that set to include the commonly
imported modules might be an additional win, especially for short-running
scripts.


Re: [Python-Dev] A fast startup patch (was: Python startup time)

2018-05-04 Thread Nick Coghlan
On 4 May 2018 at 06:13, Carl Shapiro  wrote:

> Hello,
>
> Yesterday Neil Schemenauer mentioned some work that a colleague of mine
> (CCed) and I have done to improve CPython start-up time.  Given the recent
> discussion, it seems timely to discuss what we are doing and whether it is
> of interest to other people hacking on the CPython runtime.
>
> There are many ways to reduce the start-up time overhead.  For this
> experiment, we are specifically targeting the cost of unmarshaling heap
> objects from compiled Python bytecode.  Our measurements show this specific
> cost to represent 10% to 25% of the start-up time among the applications we
> have examined.
>
> Our approach to eliminating this overhead is to store unmarshaled objects
> into the data segment of the python executable.  We do this by processing
> the compiled python bytecode for a module, creating native object code with
> the unmarshaled objects in their in-memory representation, and linking this
> into the python executable.
>
> When a module is imported, we simply return a pointer to the top-level
> code object in the data segment directly without invoking the unmarshaling
> code or touching the file system.  What we are doing is conceptually
> similar to the existing capability to freeze a module, but we avoid
> non-trivial unmarshaling costs.
>

This definitely seems interesting, but is it something you'd be seeing us
being able to take advantage of for conventional Python installations, or
is it more something you'd expect to be useful for purpose-built
interpreter instances? (e.g. if Mercurial were running their own Python,
they could precache the heap objects for their commonly imported modules in
their custom interpreter binary, regardless of whether those were standard
library modules or not).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] A fast startup patch (was: Python startup time)

2018-05-03 Thread Gregory P. Smith
+1 to the concept!

On Thu, May 3, 2018 at 1:13 PM, Carl Shapiro  wrote:

> Hello,
>
> Yesterday Neil Schemenauer mentioned some work that a colleague of mine
> (CCed) and I have done to improve CPython start-up time.
> [...]


[Python-Dev] A fast startup patch (was: Python startup time)

2018-05-03 Thread Carl Shapiro
Hello,

Yesterday Neil Schemenauer mentioned some work that a colleague of mine
(CCed) and I have done to improve CPython start-up time.  Given the recent
discussion, it seems timely to discuss what we are doing and whether it is
of interest to other people hacking on the CPython runtime.

There are many ways to reduce the start-up time overhead.  For this
experiment, we are specifically targeting the cost of unmarshaling heap
objects from compiled Python bytecode.  Our measurements show this specific
cost to represent 10% to 25% of the start-up time among the applications we
have examined.
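
As a rough illustration of the cost being targeted, one can time marshal.loads
on a compiled module outside the import system.  This is a machine-dependent
sketch, and difflib is used here only because it appears in the benchmarks
below:

```python
import importlib.util
import marshal
import timeit

# Compile difflib's source to a code object, then serialize it the way a
# .pyc file stores it.
spec = importlib.util.find_spec("difflib")
with open(spec.origin) as f:
    code = compile(f.read(), spec.origin, "exec")
blob = marshal.dumps(code)

# Time only the unmarshaling step -- the overhead this patch eliminates.
per_load = timeit.timeit(lambda: marshal.loads(blob), number=100) / 100
print("marshal.loads of difflib: %.1f us" % (per_load * 1e6))
```

Numbers will vary by machine, but the point is that this deserialization
happens on every cold import of every module.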

Our approach to eliminating this overhead is to store unmarshaled objects
into the data segment of the python executable.  We do this by processing
the compiled python bytecode for a module, creating native object code with
the unmarshaled objects in their in-memory representation, and linking this
into the python executable.

When a module is imported, we simply return a pointer to the top-level code
object in the data segment directly without invoking the unmarshaling code
or touching the file system.  What we are doing is conceptually similar to
the existing capability to freeze a module, but we avoid non-trivial
unmarshaling costs.
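
For comparison, here is a small pure-Python sketch (not the actual C
importer) of what the existing freeze mechanism stores and the unmarshal
step it still pays at import time:

```python
import marshal

# A frozen module embeds the *marshaled* bytes of its code object in the
# binary; the approach described above embeds the already-unmarshaled
# objects instead.
source = "GREETING = 'hello'\n"
code = compile(source, "<frozen demo>", "exec")
frozen_bytes = marshal.dumps(code)  # what the data segment holds today

# Importing a frozen module still runs this deserialization step:
restored = marshal.loads(frozen_bytes)
namespace = {}
exec(restored, namespace)
print(namespace["GREETING"])  # hello
```

Skipping that marshal.loads call (and the subsequent object allocation it
implies) is exactly the saving being measured below.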

The patch is still under development, and there is a little more work to do.
With that in mind, the numbers look good, but please take them with a grain
of salt.

Baseline

$ bench "./python.exe -c ''"
benchmarking ./python.exe -c ''
time 31.46 ms   (31.24 ms .. 31.78 ms)
 1.000 R²   (0.999 R² .. 1.000 R²)
mean 32.08 ms   (31.82 ms .. 32.63 ms)
std dev  778.1 μs   (365.6 μs .. 1.389 ms)

$ bench "./python.exe -c 'import difflib'"
benchmarking ./python.exe -c 'import difflib'
time 32.82 ms   (32.64 ms .. 33.02 ms)
 1.000 R²   (1.000 R² .. 1.000 R²)
mean 33.17 ms   (33.01 ms .. 33.44 ms)
std dev  430.7 μs   (233.8 μs .. 675.4 μs)


With our patch

$ bench "./python.exe -c ''"
benchmarking ./python.exe -c ''
time 24.86 ms   (24.62 ms .. 25.08 ms)
 0.999 R²   (0.999 R² .. 1.000 R²)
mean 25.58 ms   (25.36 ms .. 25.94 ms)
std dev  592.8 μs   (376.2 μs .. 907.8 μs)

$ bench "./python.exe -c 'import difflib'"
benchmarking ./python.exe -c 'import difflib'
time 25.30 ms   (25.00 ms .. 25.55 ms)
 0.999 R²   (0.998 R² .. 1.000 R²)
mean 26.78 ms   (26.30 ms .. 27.64 ms)
std dev  1.413 ms   (747.5 μs .. 2.250 ms)
variance introduced by outliers: 20% (moderately inflated)


Here are some numbers with the patch but with the stat calls preserved to
isolate just the marshaling effects

Baseline

$ bench "./python.exe -c 'import difflib'"
benchmarking ./python.exe -c 'import difflib'
time 34.67 ms   (33.17 ms .. 36.52 ms)
 0.995 R²   (0.990 R² .. 1.000 R²)
mean 35.36 ms   (34.81 ms .. 36.25 ms)
std dev  1.450 ms   (1.045 ms .. 2.133 ms)
variance introduced by outliers: 12% (moderately inflated)


With our patch (and calls to stat)

$ bench "./python.exe -c 'import difflib'"
benchmarking ./python.exe -c 'import difflib'
time 30.24 ms   (29.02 ms .. 32.66 ms)
 0.988 R²   (0.968 R² .. 1.000 R²)
mean 31.86 ms   (31.13 ms .. 32.75 ms)
std dev  1.789 ms   (1.329 ms .. 2.437 ms)
variance introduced by outliers: 17% (moderately inflated)


(This work was done in CPython 3.6 and we are exploring back-porting to 2.7
so we can run the hg startup benchmarks in the performance test suite.)

This is effectively a drop-in replacement for the frozen module capability
and has (so far) required only minimal changes to the runtime.  To us, it
seems like a very nice win without compromising on compatibility or complexity.
I am happy to discuss more of the technical details until we have a public
patch available.

I hope this provides some optimism around the possibility of improving the
start-up time of CPython.  What do you all think?

Kindly,

Carl