Re: [Python-Dev] More optimisation ideas

2016-01-30 Thread Serhiy Storchaka

On 29.01.16 19:05, Steve Dower wrote:

> This is probably the code snippet that bothered me the most:
>
>  ### Encoding table
>  encoding_table=codecs.charmap_build(decoding_table)
>
> It shows up in many of the encodings modules, and while it is not a bad
> function in itself, we are obviously generating a known data structure
> on every startup. Storing these in static data is a tradeoff between
> disk space and startup performance, and one I think is likely to be
> worthwhile.


$ ./python -m timeit -s "import codecs; from encodings.cp437 import decoding_table" -- "codecs.charmap_build(decoding_table)"

10 loops, best of 3: 4.36 usec per loop

Getting rid of charmap_build() would save you at most 4.4 microseconds
per encoding, or about 0.0005 seconds if you have imported *all* standard
encodings!


And how did you expect to store encoding_table in a more efficient way?
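
The measurement above can also be repeated from inside Python. The
following is only a minimal sketch: the loop count and the figure of 100
codecs are assumptions chosen for illustration, not numbers taken from the
stdlib, and the lambda adds a small call overhead of its own.

```python
import codecs
import timeit
from encodings.cp437 import decoding_table

# Time the same call that the charmap-based encodings modules make on import.
LOOPS = 20_000  # assumed loop count, picked only to get a stable average
total_sec = timeit.timeit(
    lambda: codecs.charmap_build(decoding_table), number=LOOPS
)
per_call_usec = total_sec / LOOPS * 1e6

# Upper bound if every codec paid this cost once at import time.
# ASSUMED_CODECS is a hypothetical round number, not a real count.
ASSUMED_CODECS = 100
worst_case_sec = per_call_usec * ASSUMED_CODECS / 1e6

print(f"charmap_build: {per_call_usec:.2f} usec per call")
print(f"{ASSUMED_CODECS} codecs: {worst_case_sec:.6f} sec in total")
```

At 4.4 usec per call and about a hundred codecs, the total comes to roughly
0.0005 seconds, which matches the estimate given above.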

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More optimisation ideas

2016-01-30 Thread Oscar Benjamin
On 30 January 2016 at 03:48, Steve Dower wrote:
>
> It doesn't currently end up on disk. Some tables are partially or completely
> stored on disk as Python source code (some are partially generated from
> simple rules), but others are generated by inverting those. That process
> takes time that could be avoided by storing the generated tables, and
> storing all of it in a format that doesn't require parsing, compiling and
> executing (such as a native array).
>
> Potentially it could be a win all around if we stopped including the
> (larger) source files, but that doesn't seem like a good idea for
> maintaining portability to other implementations. The main thought is making
> the compiler binary bigger to avoid generating encoding tables at startup.

When I last tried to profile startup on Windows (I haven't used
Windows for some time now) it seemed that the time was totally
dominated by file system access. Essentially the limiting factor was
the inordinate number of stat calls and small file accesses. That
said, this was probably Python 2.x, which may not import those
particular modules, and the result may depend on virus scanner
software and so on.

Things may have changed since then, but I concluded that substantive
gains could only come from improving FS access. Perhaps something like
zipping up the standard library would yield a big improvement.

--
Oscar


Re: [Python-Dev] More optimisation ideas

2016-01-30 Thread Steve Dower

On 30Jan2016 0645, Serhiy Storchaka wrote:

> $ ./python -m timeit -s "import codecs; from encodings.cp437 import
> decoding_table" -- "codecs.charmap_build(decoding_table)"
> 10 loops, best of 3: 4.36 usec per loop
>
> Getting rid of charmap_build() would save you at most 4.4 microseconds
> per encoding. 0.0005 seconds if you have imported *all* standard encodings!

Just as happy to be proven wrong. Perhaps I misinterpreted my original 
profiling and then, embarrassingly, ran with the result for a long time 
without retesting.



> And how did you expect to store encoding_table in a more efficient way?


There's nothing inefficient about its storage, but as it does not change
it would be trivial to store it statically. Then "building" the map is
simply obtaining a pointer into an already loaded memory page. That is
much faster than building it on load, but both are clearly insignificant
compared to other factors.


Cheers,
Steve



Re: [Python-Dev] More optimisation ideas

2016-01-30 Thread Serhiy Storchaka

On 30.01.16 18:31, Steve Dower wrote:

> On 30Jan2016 0645, Serhiy Storchaka wrote:
>> $ ./python -m timeit -s "import codecs; from encodings.cp437 import
>> decoding_table" -- "codecs.charmap_build(decoding_table)"
>> 10 loops, best of 3: 4.36 usec per loop
>>
>> Getting rid of charmap_build() would save you at most 4.4 microseconds
>> per encoding. 0.0005 seconds if you have imported *all* standard
>> encodings!
>
> Just as happy to be proven wrong. Perhaps I misinterpreted my original
> profiling and then, embarrassingly, ran with the result for a long time
> without retesting.


AFAIK most of the time is spent in system calls like stat or open.
Archiving the stdlib into a ZIP file and using zipimport can decrease
Python startup time (perhaps there is an open issue about this).
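
As a rough illustration of the zipimport mechanism (not of the stdlib
itself), the sketch below writes one tiny module into a ZIP archive, puts
the archive on sys.path, and imports from it. The helper name, the module
name and the archive name are all hypothetical, and nothing here measures
startup time.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_zip(module_name, source):
    """Write `source` into a temporary ZIP archive and import it from there."""
    tmpdir = tempfile.mkdtemp()
    archive = os.path.join(tmpdir, "bundle.zip")  # hypothetical archive name
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr(module_name + ".py", source)

    # A ZIP archive on sys.path is handled by the zipimport machinery.
    sys.path.insert(0, archive)
    try:
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(archive)


mod = import_from_zip("hypothetical_zipped_mod", "VALUE = 42\n")
print(mod.VALUE, type(mod.__spec__.loader).__name__)
```

With a whole library in one archive, the importer reads a single file's
directory instead of probing many paths, which is where any saving in stat
and open calls would have to come from.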





Re: [Python-Dev] More optimisation ideas

2016-01-30 Thread Brett Cannon
On Sat, 30 Jan 2016 at 10:21 Serhiy Storchaka wrote:

> On 30.01.16 18:31, Steve Dower wrote:
> > On 30Jan2016 0645, Serhiy Storchaka wrote:
> >> $ ./python -m timeit -s "import codecs; from encodings.cp437 import
> >> decoding_table" -- "codecs.charmap_build(decoding_table)"
> >> 10 loops, best of 3: 4.36 usec per loop
> >>
> >> Getting rid of charmap_build() would save you at most 4.4 microseconds
> >> per encoding. 0.0005 seconds if you have imported *all* standard
> >> encodings!
> >
> > Just as happy to be proven wrong. Perhaps I misinterpreted my original
> > profiling and then, embarrassingly, ran with the result for a long time
> > without retesting.
>
> AFAIK most of the time is spent in system calls like stat or open.
> Archiving the stdlib into a ZIP file and using zipimport can decrease
> Python startup time (perhaps there is an open issue about this).
>

Check the archives, but I did try freezing the entire stdlib and it
didn't really make a difference in startup, so I don't know if this still
holds true.

At this point I think all of our knowledge of what takes the most time
during startup is outdated and someone should try to really profile
the whole thing to see where the hotspots are (e.g., is it stat calls from
imports, is it actually some specific function, is it just so many little
things adding up to a big thing, etc.).
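
One possible starting point for such profiling is sketched below. It
assumes an interpreter that supports the -X importtime option (Python 3.7
and later, so newer than this thread), and the helper name import_times is
hypothetical. It covers import cost only, not stat calls or other startup
work.

```python
import subprocess
import sys


def import_times(code="pass"):
    """Run a fresh interpreter and collect per-module import timings."""
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = []
    prefix = "import time:"
    # The timings are written to stderr, one line per imported module.
    for line in proc.stderr.splitlines():
        if not line.startswith(prefix):
            continue
        fields = line[len(prefix):].split("|")
        if len(fields) != 3:
            continue
        try:
            self_usec = int(fields[0])
            cumulative_usec = int(fields[1])
        except ValueError:
            continue  # the header line has no numbers
        rows.append((self_usec, cumulative_usec, fields[2].strip()))
    return rows


rows = import_times()
# Show the five modules with the largest self time.
for self_usec, cumulative_usec, name in sorted(rows, reverse=True)[:5]:
    print(f"{self_usec:>8} usec  {name}")
```

Sorting by self time rather than cumulative time helps separate one
expensive module from many small ones adding up.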


Re: [Python-Dev] More optimisation ideas

2016-01-30 Thread Steve Dower
Brett tried freezing the entire stdlib at one point (as we do for parts of 
importlib) and reported no significant improvement. Since that rules out code 
compilation as well as the OS calls, it'd seem the priority is to execute less 
code on startup.

Details of that work were posted to python-dev about twelve months ago, IIRC. 
Maybe a little longer.

Top-posted from my Windows Phone

-----Original Message-----
From: "Serhiy Storchaka"
Sent: 1/30/2016 10:22
To: "python-dev@python.org"
Subject: Re: [Python-Dev] More optimisation ideas

On 30.01.16 18:31, Steve Dower wrote:
> On 30Jan2016 0645, Serhiy Storchaka wrote:
>> $ ./python -m timeit -s "import codecs; from encodings.cp437 import
>> decoding_table" -- "codecs.charmap_build(decoding_table)"
>> 10 loops, best of 3: 4.36 usec per loop
>>
>> Getting rid of charmap_build() would save you at most 4.4 microseconds
>> per encoding. 0.0005 seconds if you have imported *all* standard
>> encodings!
>
> Just as happy to be proven wrong. Perhaps I misinterpreted my original
> profiling and then, embarrassingly, ran with the result for a long time
> without retesting.

AFAIK most of the time is spent in system calls like stat or open.
Archiving the stdlib into a ZIP file and using zipimport can decrease
Python startup time (perhaps there is an open issue about this).




Re: [Python-Dev] More optimisation ideas

2016-01-30 Thread Brett Cannon
On Sat, Jan 30, 2016, 12:30 Sven R. Kunze wrote:

> On 30.01.2016 19:20, Serhiy Storchaka wrote:
> > AFAIK most of the time is spent in system calls like stat or open.
> > Archiving the stdlib into a ZIP file and using zipimport can
> > decrease Python startup time (perhaps there is an open issue about this).
>
> Oh, please don't. One thing I love about Python is the ease of access.
>

It wouldn't be a requirement, just an option.


> I personally think that startup time is not really a big issue; even
> when it comes to microbenchmarks.
>

You might not, but just about every command-line app does.

-brett


> Best,
> Sven


Re: [Python-Dev] More optimisation ideas

2016-01-30 Thread Sven R. Kunze

On 30.01.2016 19:20, Serhiy Storchaka wrote:
> AFAIK most of the time is spent in system calls like stat or open.
> Archiving the stdlib into a ZIP file and using zipimport can
> decrease Python startup time (perhaps there is an open issue about this).


Oh, please don't. One thing I love about Python is the ease of access.

I personally think that startup time is not really a big issue; even 
when it comes to microbenchmarks.


Best,
Sven


Re: [Python-Dev] More optimisation ideas

2016-01-30 Thread Sven R. Kunze

On 30.01.2016 21:32, Brett Cannon wrote:
> On Sat, Jan 30, 2016, 12:30 Sven R. Kunze wrote:
>
>> On 30.01.2016 19:20, Serhiy Storchaka wrote:
>>> AFAIK most of the time is spent in system calls like stat or open.
>>> Archiving the stdlib into a ZIP file and using zipimport can
>>> decrease Python startup time (perhaps there is an open issue
>>> about this).
>>
>> Oh, please don't. One thing I love about Python is the ease of access.
>
> It wouldn't be a requirement, just an option.



That's good. :)