Re: [Python-Dev] More optimisation ideas

2016-02-06 Thread Stephen Hansen
On Fri, Feb 5, 2016, at 10:33 AM, Emile van Sebille wrote:
> On 2/5/2016 9:37 AM, Alexander Walters wrote:
> >
> > On 2/5/2016 12:27, Emile van Sebille wrote:
> >> On 2/1/2016 9:20 AM, Ethan Furman wrote:
> >>> On 02/01/2016 08:40 AM, R. David Murray wrote:
> >>>> On the other hand, if the distros go the way Nick has (I think) been
> >>>> advocating, and have a separate 'system python for system scripts' that
> >>>> is independent of the one installed for user use, having the system-only
> >>>> python be frozen and sourceless would actually make sense on a couple of
> >>>> levels.
> >>>
> >>> Agreed.
> >>
> >> Except for that nasty licensing issue requiring source code.
> >>
> >> Emile
> > Licensing requires, in the GPL at least, that the *modified* sources be
> > made *available*, not that they be shipped with the product. Looking at
> > the Python license, and what tools already do, there is zero need to
> > ship the source to stay compliant.
> 
> Hmm, the annotated Open Source Definition explicitly states "The program 
> must include source code" -- how did I misinterpret that?

Couple things.

First, the OSD is not authoritative. Python's license establishes the
rules of its distribution: that Python's license is considered
compatible with the OSD doesn't actually mean your reading of anything
on the OSD page has any binding meaning.

Second, OSD's Rule 2 means that those who are distributing Python -- the
PSF, originally -- must provide source code if they're distributing it
under Python's license, but it doesn't actually mean it must be packaged
with it in every download. In fact, it's not today. The standard library
source is included in normal downloads, but the C source of Python
isn't. You can download it readily, though, so that's fine. It's fully
compliant with the OSD.

But! If Debian (pulling them out of a hat randomly) is distributing
Python, they aren't the PSF, and notably are not bound by the OSD rules,
only by Python's license terms. The PSF satisfied their requirements to
the licensing terms when releasing Python, but now Debian has Python,
and they are distributing it-- that's an entirely separate act, and you
must look at them as a separate actor in terms of the license. They
don't have to distribute it under the same license. They must be ABLE to
(as OSD's Rule 3 says), but they don't HAVE to. Some random person can
take Python, rename it Snakey, and release it under almost any license
they want and give no one the source code at all. 

Python has from the beginning allowed this: it's actually in quite a few
closed source / proprietary products without ever advertising it and
providing no source, entirely legally and ethically -- Python's gone out
of its way to support this sort of use-case. 

As it happens, Debian usually distributes something very close to the
official release (sometimes they backport patches and such), and always
does so under the same license as Python (AFAICT), but they don't *have*
to. 

GPL is copyleft and requires its derivative works to be GPL'd (or at
least, no more restrictive than GPL)-- so in GPL, to distribute it you
MUST distribute it under GPL-compatible terms. Python's license is
permissive and allows anyone to do basically anything, INCLUDING produce
closed source releases if someone wanted to, or just release
modifications or modules that are available under different licenses.

The OSD encompasses both ends of the spectrum: the GPL's mandate of
source access and the OSD's mandate of the receiver to be able to
distribute in the same terms they received (notably, NOT the same terms
it was originally released under).

-- 
Stephen Hansen
  m e @ i x o k a i  . i o
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More optimisation ideas

2016-02-05 Thread Alexander Walters



On 2/5/2016 12:27, Emile van Sebille wrote:

On 2/1/2016 9:20 AM, Ethan Furman wrote:

On 02/01/2016 08:40 AM, R. David Murray wrote:



On the other hand, if the distros go the way Nick has (I think) been
advocating, and have a separate 'system python for system scripts' that
is independent of the one installed for user use, having the system-only
python be frozen and sourceless would actually make sense on a couple of
levels.


Agreed.


Except for that nasty licensing issue requiring source code.

Emile
Licensing requires, in the GPL at least, that the *modified* sources be 
made *available*, not that they be shipped with the product. Looking at 
the Python license, and what tools already do, there is zero need to 
ship the source to stay compliant.




Re: [Python-Dev] More optimisation ideas

2016-02-05 Thread Brett Cannon
On Fri, 5 Feb 2016 at 10:34 Emile van Sebille  wrote:

> On 2/5/2016 9:37 AM, Alexander Walters wrote:
> >
> >
> > On 2/5/2016 12:27, Emile van Sebille wrote:
> >> On 2/1/2016 9:20 AM, Ethan Furman wrote:
> >>> On 02/01/2016 08:40 AM, R. David Murray wrote:
> >>>> On the other hand, if the distros go the way Nick has (I think) been
> >>>> advocating, and have a separate 'system python for system scripts' that
> >>>> is independent of the one installed for user use, having the system-only
> >>>> python be frozen and sourceless would actually make sense on a couple of
> >>>> levels.
> >>>
> >>> Agreed.
> >>
> >> Except for that nasty licensing issue requiring source code.
> >>
> >> Emile
> > Licensing requires, in the GPL at least, that the *modified* sources be
> > made *available*, not that they be shipped with the product. Looking at
> > the Python license, and what tools already do, there is zero need to
> > ship the source to stay compliant.
>
> Hmm, the annotated Open Source Definition explicitly states "The program
> must include source code" -- how did I misinterpret that?
>

Because you left off the part following: "... and must allow distribution
in source code as well as compiled form". This is entirely a discussion of
distribution in a compiled form.


Re: [Python-Dev] More optimisation ideas

2016-02-05 Thread Emile van Sebille

On 2/1/2016 9:20 AM, Ethan Furman wrote:

On 02/01/2016 08:40 AM, R. David Murray wrote:



On the other hand, if the distros go the way Nick has (I think) been
advocating, and have a separate 'system python for system scripts' that
is independent of the one installed for user use, having the system-only
python be frozen and sourceless would actually make sense on a couple of
levels.


Agreed.


Except for that nasty licensing issue requiring source code.

Emile





Re: [Python-Dev] More optimisation ideas

2016-02-05 Thread Emile van Sebille

On 2/5/2016 9:37 AM, Alexander Walters wrote:



On 2/5/2016 12:27, Emile van Sebille wrote:

On 2/1/2016 9:20 AM, Ethan Furman wrote:

On 02/01/2016 08:40 AM, R. David Murray wrote:



On the other hand, if the distros go the way Nick has (I think) been
advocating, and have a separate 'system python for system scripts' that
is independent of the one installed for user use, having the system-only
python be frozen and sourceless would actually make sense on a couple of
levels.


Agreed.


Except for that nasty licensing issue requiring source code.

Emile

Licensing requires, in the GPL at least, that the *modified* sources be
made *available*, not that they be shipped with the product. Looking at
the Python license, and what tools already do, there is zero need to
ship the source to stay compliant.


Hmm, the annotated Open Source Definition explicitly states "The program 
must include source code" -- how did I misinterpret that?


Emile

http://opensource.org/osd-annotated








Re: [Python-Dev] More optimisation ideas

2016-02-05 Thread Andrew Barnert via Python-Dev
On Friday, February 5, 2016 11:57 AM, Emile van Sebille  wrote:



> Aah, 'must' is less restrictive in this context than I expected. When 
> you combine the two halves the first part might be more accurately 
> phrased as 'The program must make source code available' rather than 
> 'must include' which I understood to mean 'ship with'.

First, step back and think of this in common sense terms: If being open source 
required any Python installation to have the .py source to the .pyc or .zip 
files in the stdlib, surely it would also require any Python installation to 
have the .c source to the interpreter too. But lots of people have Python 
without having the .c source.

Also, the GPL isn't typical of all open source licenses, it's only typical of 
_copyleft_ licenses. Permissive licenses, like Python's, are very different. 
Copyleft licenses are designed to make sure that all derived works are also 
copylefted; permissive licenses are designed to permit derived works as widely 
as possible. As the Python license specifically says, "All Python licenses, 
unlike the GPL, let you distribute a modified version without making your 
changes open source."

Meanwhile, the fact that someone has decided that the Python license qualifies 
under the Open Source Definition doesn't mean the OSD is the right way to 
understand it. Read the license itself, or one of the summaries at 
opensource.org or fsf.org. (And if you still can't figure something out, and 
it's important to your work, you almost certainly need to ask a lawyer.) So, if 
you think the first sentence of section 2 of the OSD contradicts the 
explanation in the rest of the paragraph--well, even if you're right, that 
doesn't affect Python's license at all.

Finally, if you want to see what it takes to actually make all the terms 
unambiguous both to ordinary human beings and to legal codes, see the GPL FAQ 
sections on their definitions of "propagate" and "convey". It may take you lots 
of careful reading to understand it, but when you finally do, it's definitely 
unambiguous.


Re: [Python-Dev] More optimisation ideas

2016-02-05 Thread Emile van Sebille

On 2/5/2016 10:38 AM, Brett Cannon wrote:



On Fri, 5 Feb 2016 at 10:34 Emile van Sebille wrote:



 >> Except for that nasty licensing issue requiring source code.
 >>
 >> Emile
 > Licensing requires, in the GPL at least, that the *modified* sources be
 > made *available*, not that they be shipped with the product. Looking at
 > the Python license, and what tools already do, there is zero need to
 > ship the source to stay compliant.

Hmm, the annotated Open Source Definition explicitly states "The program
must include source code" -- how did I misinterpret that?


Because you left off the part following: "... and must allow
distribution in source code as well as compiled form". This is entirely
a discussion of distribution in a compiled form.



Aah, 'must' is less restrictive in this context than I expected. When 
you combine the two halves the first part might be more accurately 
phrased as 'The program must make source code available' rather than 
'must include' which I understood to mean 'ship with'.


Emile




Re: [Python-Dev] More optimisation ideas

2016-02-05 Thread Nick Coghlan
On 5 February 2016 at 15:05, Steven D'Aprano  wrote:
> (I'm not even sure if this suggestion makes sense, since I'm not really
> sure what "freezing" the stdlib entails. Is it documented anywhere?)

It's not particularly well documented - most of the docs you'll find
are about freeze utilities that don't explain how they work, or the
FrozenImporter, which doesn't explain how to *create* a frozen module
and link it into your Python executable.

Your approach of thinking of a frozen module as a generated .pyc file
that has been converted to a builtin module is a pretty good working
model, though. (It isn't *entirely* accurate, but the discrepancies
are sufficiently arcane that they aren't going to matter in any case
that doesn't involve specifically poking around at the import related
attributes).
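
As a rough illustration of that working model (an addition for readers, not part of the original message), the sketch below asks the import system whether a module is frozen. It assumes CPython, where the import bootstrap module `_frozen_importlib` is itself frozen; the helper name `is_frozen` is made up for this example.

```python
import importlib.machinery


def is_frozen(module_name):
    """Return True if CPython's FrozenImporter can supply this module."""
    spec = importlib.machinery.FrozenImporter.find_spec(module_name)
    return spec is not None


# The import bootstrap code is compiled into the interpreter binary,
# so FrozenImporter finds it instead of a .py or .pyc file on disk.
print(is_frozen("_frozen_importlib"))
# A name that does not exist anywhere cannot be frozen.
print(is_frozen("no_such_module_for_this_demo"))
```

Which other modules report as frozen depends on the interpreter version and on how it was built, so treat any result beyond `_frozen_importlib` as environment-specific.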

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] More optimisation ideas

2016-02-04 Thread Terry Reedy

On 2/4/2016 12:18 PM, Sven R. Kunze wrote:

On 04.02.2016 14:09, Nick Coghlan wrote:

On 2 February 2016 at 06:39, Andrew Barnert via Python-Dev
  wrote:

On Feb 1, 2016, at 09:59,mike.romb...@comcast.net  wrote:

  If the stdlib were to use implicit namespace packages
(https://www.python.org/dev/peps/pep-0420/  ) and the various
loaders/importers as well, then python could do what I've done with an
embedded python application for years.  Freeze the stdlib (or put it
in a zipfile or whatever is fast).  Then arrange PYTHONPATH to first
look on the filesystem and then look in the frozen/zipped storage.

This is a great solution for experienced developers, but I think it would be 
pretty bad for novices or transplants from other languages (maybe even 
including Python 2).

There are already multiple duplicate questions every month on StackOverflow from people 
asking "how do I find the source to stdlib module X". The canonical answer 
starts off by explaining how to import the module and use its __file__, which everyone is 
able to handle. If we have to instead explain how to work out the .py name from the 
qualified module name, how to work out the stdlib path from sys.path, and then how to 
find the source from those two things, with the caveat that it may not be installed at 
all on some platforms, and how to make sure what they're asking about really is a stdlib 
module, and how to make sure they aren't shadowing it with a module elsewhere on 
sys.path, that's a lot more complicated. Especially when you consider that some people on 
Windows and Mac are writing Python scripts without ever learning how to use the terminal 
or find their Python packages via Explorer/Finder.

For folks that *do* know how to use the terminal:

$ python3 -m inspect --details inspect
Target: inspect
Origin: /usr/lib64/python3.4/inspect.py
Cached: /usr/lib64/python3.4/__pycache__/inspect.cpython-34.pyc
Loader: <_frozen_importlib.SourceFileLoader object at 0x7f0d8d23d9b0>

(And if they just want to *read* the source code, then leaving out
"--details" prints the full module source, and would work even if the
standard library were in a zip archive)


This is completely inadequate as a replacement for loading source into 
an editor, even if just for reading.


First, on Windows, the console defaults to 300 lines.  Print more and 
only the last 300 lines remain.  The max is buffer size is .  But 
setting the buffer to that is obnoxious because the buffer is then 
padded with blank lines to make  lines.  The little rectangle that 
one grabs in the scrollbar is then scaled down to almost nothing, 
becoming hard to grab.


Second is navigation.  No Find, Find-next, or Find-all.  Because of 
padding, moving to the unpadded 'bottom of file' is difficult.


Third, for a repository version, I would have to type, without error, 
instead of 'python3', some version of, for instance, some suffix of 
'F:/python/dev/35/PcBuild//python_d.exe'.  "" 
depends, I believe, on the build options.



I want to see and debug also core Python in PyCharm and this is not
acceptable.

If you want to make it opt-in, fine. But opt-out is a no-go. I have a
side-by-side comparison as we use Java and Python in production. It's
the *ease of access* that makes Python great compared to Java.

@Andrew
Even for experienced developers it just sucks and there are more
important things to do.


I agree that removing stdlib python source files by default is a poor 
idea. The disk space saved is trivial.  So, for me, would be nearly all 
of the time saving.


Over recent versions, more and more source files have been linked to in 
the docs.  Guido recently approved of linking the rest.  Removing source 
contradicts this trend.


Easily loading modules, including stdlib modules, into an IDLE Editor 
Window is a documented feature that goes back to the original commit in 
Aug 2000.  We do not usually break stdlib features without 
acknowledgement, some discussion, and a positive decision to do so.


Someone has already mentioned the degradation of tracebacks.

So why not just leave the source files alone in /Lib?  As far as I can 
see, they would not hurt anything.  At least on Windows, zip files are 
treated as directories and python35.zip comes before /Lib on sys.path.


The Windows installer currently has an option, selected by default I 
believe, to run compileall.  Add to compileall an option to compile all 
to python35.zip rather than __pycache__, and use that in the 
installer.  Even if the zip is included in the installer, 
compileall-zip + source files would let adventurous people patch their 
stdlib files.
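
The compileall option proposed above does not exist, but the standard library already has something close: `zipfile.PyZipFile.writepy` compiles a source file and stores the resulting `.pyc` in an archive. The sketch below is only an illustration of that idea, not the installer change itself; the function names and the tiny `hello_mod` module are invented for the demo.

```python
import os
import tempfile
import zipfile


def build_bytecode_zip(source_dir, archive_path):
    """Compile every top-level .py file in source_dir into a zip of .pyc files."""
    with zipfile.PyZipFile(archive_path, "w") as archive:
        for name in sorted(os.listdir(source_dir)):
            if name.endswith(".py"):
                # writepy compiles the file and adds the bytecode, not the source.
                archive.writepy(os.path.join(source_dir, name))
        return archive.namelist()


def demo():
    """Build an archive from a throwaway directory holding one module."""
    with tempfile.TemporaryDirectory() as tmp:
        source_dir = os.path.join(tmp, "src")
        os.mkdir(source_dir)
        with open(os.path.join(source_dir, "hello_mod.py"), "w") as handle:
            handle.write("GREETING = 'hello'\n")
        return build_bytecode_zip(source_dir, os.path.join(tmp, "bundle.zip"))


print(demo())
```

The source files stay where they were, so they remain available for reading and patching while the archive carries only bytecode.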


Editing a stdlib file, to see if a confirmed bug disappeared (it did), 
was how I made my first code contribution. If I had had to download and 
setup svn and maybe visual c to try a one line change, I would not have 
done it.


--
Terry Jan Reedy



Re: [Python-Dev] More optimisation ideas

2016-02-04 Thread Nick Coghlan
On 2 February 2016 at 02:40, R. David Murray  wrote:
> On the other hand, if the distros go the way Nick has (I think) been
> advocating, and have a separate 'system python for system scripts' that
> is independent of the one installed for user use, having the system-only
> python be frozen and sourceless would actually make sense on a couple of
> levels.

While omitting Python source files does let us reduce base image sizes
(quite significantly), the current perspective in Fedora and Project
Atomic is that going bytecode-only (whether frozen or not) breaks too
many things to be worthwhile. As one simple example, it means
tracebacks no longer include source code lines, dramatically
increasing the difficulty of debugging failures.

As such, we're more likely to pursue minimisation efforts by splitting
the standard library up into "stuff essential distro components use"
and "the rest of the standard library that upstream defines" than by
figuring out how to avoid shipping source files (I believe Debian
already makes this distinction with the python-minimal vs python
split).

Zipping up the standard library doesn't break tracebacks though, so
it's potentially worth exploring that option further.
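
To make that concrete, here is a small self-contained check (added for illustration, with made-up module and archive names) that a module imported from a zip archive containing its `.py` source still shows the offending source line in a traceback. It relies on `zipimport` loaders exposing `get_source`, which `linecache` consults when formatting tracebacks.

```python
import importlib
import os
import sys
import tempfile
import traceback
import zipfile

MODULE_SOURCE = "def fail():\n    raise ValueError('boom')\n"


def traceback_from_zipped_module():
    """Import a module from a zip archive, trigger an error, return the traceback text."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "demo_lib.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("zipped_demo.py", MODULE_SOURCE)
        sys.path.insert(0, archive)
        try:
            module = importlib.import_module("zipped_demo")
            try:
                module.fail()
            except ValueError:
                # Format while the archive still exists so the source can be read.
                return traceback.format_exc()
        finally:
            sys.path.remove(archive)
            sys.path_importer_cache.pop(archive, None)
            sys.modules.pop("zipped_demo", None)


print(traceback_from_zipped_module())
```

An archive holding only `.pyc` files would not behave this way, which is the bytecode-only case described earlier in this message.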

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] More optimisation ideas

2016-02-04 Thread Nick Coghlan
On 2 February 2016 at 06:39, Andrew Barnert via Python-Dev
 wrote:
> On Feb 1, 2016, at 09:59, mike.romb...@comcast.net wrote:
>>
>>  If the stdlib were to use implicit namespace packages
>> ( https://www.python.org/dev/peps/pep-0420/ ) and the various
>> loaders/importers as well, then python could do what I've done with an
>> embedded python application for years.  Freeze the stdlib (or put it
>> in a zipfile or whatever is fast).  Then arrange PYTHONPATH to first
>> look on the filesystem and then look in the frozen/zipped storage.
>
> This is a great solution for experienced developers, but I think it would be 
> pretty bad for novices or transplants from other languages (maybe even 
> including Python 2).
>
> There are already multiple duplicate questions every month on StackOverflow 
> from people asking "how do I find the source to stdlib module X". The 
> canonical answer starts off by explaining how to import the module and use 
> its __file__, which everyone is able to handle. If we have to instead explain 
> how to work out the .py name from the qualified module name, how to work out 
> the stdlib path from sys.path, and then how to find the source from those two 
> things, with the caveat that it may not be installed at all on some 
> platforms, and how to make sure what they're asking about really is a stdlib 
> module, and how to make sure they aren't shadowing it with a module elsewhere 
> on sys.path, that's a lot more complicated. Especially when you consider that 
> some people on Windows and Mac are writing Python scripts without ever 
> learning how to use the terminal or find their Python packages via 
> Explorer/Finder.
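
For reference, the `__file__` approach mentioned above can be sketched as follows. This is an illustrative addition; the helper name `locate_source` is hypothetical, and the check against `sysconfig.get_paths()["stdlib"]` is just one rough way to spot a module that shadows the stdlib copy.

```python
import importlib
import os
import sysconfig


def locate_source(module_name):
    """Return (path, in_stdlib) for a module; path is None if it has no file."""
    module = importlib.import_module(module_name)
    path = getattr(module, "__file__", None)
    if path is None:
        # Built-in modules such as sys are compiled into the interpreter.
        return None, False
    stdlib_dir = os.path.normcase(os.path.realpath(sysconfig.get_paths()["stdlib"]))
    real_path = os.path.normcase(os.path.realpath(path))
    return path, real_path.startswith(stdlib_dir + os.sep)


print(locate_source("json"))
print(locate_source("sys"))
```

If the second element comes back False for a module you expected to be part of the standard library, something earlier on sys.path is probably shadowing it.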

For folks that *do* know how to use the terminal:

$ python3 -m inspect --details inspect
Target: inspect
Origin: /usr/lib64/python3.4/inspect.py
Cached: /usr/lib64/python3.4/__pycache__/inspect.cpython-34.pyc
Loader: <_frozen_importlib.SourceFileLoader object at 0x7f0d8d23d9b0>

(And if they just want to *read* the source code, then leaving out
"--details" prints the full module source, and would work even if the
standard library were in a zip archive)
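
The same details are available programmatically. The sketch below (an added illustration; `module_details` is a made-up helper name) uses `importlib.util.find_spec` to report a module's origin, cache location, and loader without importing the module itself.

```python
import importlib.util


def module_details(name):
    """Collect roughly the fields that `python3 -m inspect --details` prints."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError("no module named %r" % name)
    return {
        "target": name,
        "origin": spec.origin,   # source path, or a marker such as 'built-in'
        "cached": spec.cached,   # bytecode path, or None when there is none
        "loader": repr(spec.loader),
    }


for key, value in module_details("inspect").items():
    print("%s: %s" % (key.capitalize(), value))
```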

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] More optimisation ideas

2016-02-04 Thread Steven D'Aprano
On Thu, Feb 04, 2016 at 07:58:30PM -0500, Terry Reedy wrote:

> >>For folks that *do* know how to use the terminal:
> >>
> >>$ python3 -m inspect --details inspect
> >>Target: inspect
> >>Origin: /usr/lib64/python3.4/inspect.py
> >>Cached: /usr/lib64/python3.4/__pycache__/inspect.cpython-34.pyc
> >>Loader: <_frozen_importlib.SourceFileLoader object at 0x7f0d8d23d9b0>
> >>
> >>(And if they just want to *read* the source code, then leaving out
> >>"--details" prints the full module source, and would work even if the
> >>standard library were in a zip archive)
> 
> This is completely inadequate as a replacement for loading source into 
> an editor, even if just for reading.
[...]

I agree with Terry. The inspect trick Nick describes above is a great 
feature to have, but it's not a substitute for opening the source in an 
editor, not even on OSes where the command line tools are more powerful 
than Windows' default tools.

[...]
> I agree that removing stdlib python source files by default is a poor 
> idea. The disk space saved is trivial.  So, for me, would be nearly all 
> of the time saving.

I too would be very reluctant to remove the source files from Python by 
default, but I have an alternative. I don't know if this is a ridiculous 
idea or not, but now that the .pyc bytecode files are kept in a separate 
__pycache__ directory, could we freeze that directory and leave the 
source files available for reading?

(I'm not even sure if this suggestion makes sense, since I'm not really 
sure what "freezing" the stdlib entails. Is it documented anywhere?)


-- 
Steve


Re: [Python-Dev] More optimisation ideas

2016-02-04 Thread Sven R. Kunze

On 04.02.2016 14:09, Nick Coghlan wrote:

On 2 February 2016 at 06:39, Andrew Barnert via Python-Dev
 wrote:

On Feb 1, 2016, at 09:59, mike.romb...@comcast.net wrote:

  If the stdlib were to use implicit namespace packages
( https://www.python.org/dev/peps/pep-0420/ ) and the various
loaders/importers as well, then python could do what I've done with an
embedded python application for years.  Freeze the stdlib (or put it
in a zipfile or whatever is fast).  Then arrange PYTHONPATH to first
look on the filesystem and then look in the frozen/zipped storage.

This is a great solution for experienced developers, but I think it would be 
pretty bad for novices or transplants from other languages (maybe even 
including Python 2).

There are already multiple duplicate questions every month on StackOverflow from people 
asking "how do I find the source to stdlib module X". The canonical answer 
starts off by explaining how to import the module and use its __file__, which everyone is 
able to handle. If we have to instead explain how to work out the .py name from the 
qualified module name, how to work out the stdlib path from sys.path, and then how to 
find the source from those two things, with the caveat that it may not be installed at 
all on some platforms, and how to make sure what they're asking about really is a stdlib 
module, and how to make sure they aren't shadowing it with a module elsewhere on 
sys.path, that's a lot more complicated. Especially when you consider that some people on 
Windows and Mac are writing Python scripts without ever learning how to use the terminal 
or find their Python packages via Explorer/Finder.

For folks that *do* know how to use the terminal:

$ python3 -m inspect --details inspect
Target: inspect
Origin: /usr/lib64/python3.4/inspect.py
Cached: /usr/lib64/python3.4/__pycache__/inspect.cpython-34.pyc
Loader: <_frozen_importlib.SourceFileLoader object at 0x7f0d8d23d9b0>

(And if they just want to *read* the source code, then leaving out
"--details" prints the full module source, and would work even if the
standard library were in a zip archive)


I want to see and debug also core Python in PyCharm and this is not 
acceptable.


If you want to make it opt-in, fine. But opt-out is a no-go. I have a 
side-by-side comparison as we use Java and Python in production. It's 
the *ease of access* that makes Python great compared to Java.


@Andrew
Even for experienced developers it just sucks and there are more 
important things to do.



Best,
Sven



Re: [Python-Dev] More optimisation ideas

2016-02-01 Thread Ethan Furman

On 02/01/2016 08:40 AM, R. David Murray wrote:

On Mon, 01 Feb 2016 14:12:27 +1100, Steven D'Aprano wrote:



I find that being able to easily open stdlib .py files in a text editor
to read the source is extremely valuable. I've learned much more from
reading the source than from (e.g.) StackOverflow. Likewise, it's often
handy to do a grep over the stdlib. When you talk about freezing the
stdlib, what exactly does that mean?

- will the source files still be there?


Well, Brett said it would be optional, though perhaps the above
paragraph is asking about doing it in our Windows build.  But the linux
distros might also make use of the option if it exists, so the question is
very meaningful.  However, you'd have to ask the distro if the source
would be shipped in the linux case, and I'd guess not in most cases.

I don't know about anyone else, but on my own development systems it is
not that unusual for me to *edit* the stdlib files (to add debug prints)
while debugging my own programs.  Freeze would definitely interfere with
that.  I could, of course, install a separate source build on my dev
system, but I thought it worth mentioning as a factor.


Yup, so do I.



On the other hand, if the distros go the way Nick has (I think) been
advocating, and have a separate 'system python for system scripts' that
is independent of the one installed for user use, having the system-only
python be frozen and sourceless would actually make sense on a couple of
levels.


Agreed.

--
~Ethan~


Re: [Python-Dev] More optimisation ideas

2016-02-01 Thread R. David Murray
On Mon, 01 Feb 2016 14:12:27 +1100, Steven D'Aprano  wrote:
> On Sun, Jan 31, 2016 at 08:23:00PM +, Brett Cannon wrote:
> > So freezing the stdlib helps on UNIX and not on OS X (if my old testing is
> > still accurate). I guess the next question is what it does on Windows and
> > if we would want to ever consider freezing the stdlib as part of the build
> > process (and if we would want to change the order of importers on
> > sys.meta_path so frozen modules came after file-based ones).
> 
> I find that being able to easily open stdlib .py files in a text editor 
> to read the source is extremely valuable. I've learned much more from 
> reading the source than from (e.g.) StackOverflow. Likewise, it's often 
> handy to do a grep over the stdlib. When you talk about freezing the 
> stdlib, what exactly does that mean?
> 
> - will the source files still be there?

Well, Brett said it would be optional, though perhaps the above
paragraph is asking about doing it in our Windows build.  But the linux
distros might also make use of the option if it exists, so the question is
very meaningful.  However, you'd have to ask the distro if the source
would be shipped in the linux case, and I'd guess not in most cases.

I don't know about anyone else, but on my own development systems it is
not that unusual for me to *edit* the stdlib files (to add debug prints)
while debugging my own programs.  Freeze would definitely interfere with
that.  I could, of course, install a separate source build on my dev
system, but I thought it worth mentioning as a factor.

On the other hand, if the distros go the way Nick has (I think) been
advocating, and have a separate 'system python for system scripts' that
is independent of the one installed for user use, having the system-only
python be frozen and sourceless would actually make sense on a couple of
levels.

--David


Re: [Python-Dev] More optimisation ideas

2016-02-01 Thread Sven R. Kunze
Thanks, Brett. I wasn't aware of lazy imports either. I think that one is 
even better at reducing startup time than freezing the stdlib.


On 31.01.2016 18:57, Brett Cannon wrote:
I have opened http://bugs.python.org/issue26252 to track writing the 
example (and before ppl go playing with the lazy loader, be aware of 
http://bugs.python.org/issue26186).


On Sun, 31 Jan 2016 at 09:26 Brett Cannon wrote:


There are no example docs for it yet, but enough people have asked
this week about how to set up a custom importer that I will write
up a generic example case which will make sense for a lazy loader
(need to file the issue before I forget).


On Sun, 31 Jan 2016, 09:11 Donald Stufft wrote:



On Jan 31, 2016, at 12:02 PM, Brett Cannon wrote:

A lazy importer was added in Python 3.5


Is there any docs on how to actually use the LazyLoader in
3.5? I can’t seem to find any but I don’t really know the
import system that well.
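
For readers with the same question, here is a minimal sketch of one way to wire up `importlib.util.LazyLoader`. It is modeled on the lazy-import recipe in the importlib documentation rather than on anything posted in this thread; `lazy_import` is just a helper name, and `colorsys` is an arbitrary small stdlib module picked for the demo.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose code runs only on first attribute access."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError("no module named %r" % name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader this only prepares the module; execution is deferred.
    loader.exec_module(module)
    return module


colorsys = lazy_import("colorsys")
# The attribute access below is what actually triggers loading the module.
print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))
```

Note that errors in the module surface at the first attribute access instead of at import time, which can make failures harder to trace.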

-
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F
6E3C BCE9 3372 DCFA





Re: [Python-Dev] More optimisation ideas

2016-02-01 Thread Andrew Barnert via Python-Dev
On Feb 1, 2016, at 09:59, mike.romb...@comcast.net wrote:
> 
>  If the stdlib were to use implicit namespace packages
> ( https://www.python.org/dev/peps/pep-0420/ ) and the various
> loaders/importers as well, then python could do what I've done with an
> embedded python application for years.  Freeze the stdlib (or put it
> in a zipfile or whatever is fast).  Then arrange PYTHONPATH to first
> look on the filesystem and then look in the frozen/zipped storage.

This is a great solution for experienced developers, but I think it would be 
pretty bad for novices or transplants from other languages (maybe even 
including Python 2).

There are already multiple duplicate questions every month on StackOverflow 
from people asking "how do I find the source to stdlib module X". The canonical 
answer starts off by explaining how to import the module and use its __file__, 
which everyone is able to handle. If we have to instead explain how to work out 
the .py name from the qualified module name, how to work out the stdlib path 
from sys.path, and then how to find the source from those two things, with the 
caveat that it may not be installed at all on some platforms, and how to make 
sure what they're asking about really is a stdlib module, and how to make sure 
they aren't shadowing it with a module elsewhere on sys.path, that's a lot more 
complicated. Especially when you consider that some people on Windows and Mac 
are writing Python scripts without ever learning how to use the terminal or 
find their Python packages via Explorer/Finder. 
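
For illustration, the canonical answer mentioned above (import the module and
look at its __file__) can be sketched roughly as follows. The helper name
source_location and the comparison against sysconfig's "stdlib" path are
assumptions added here, not something proposed in the thread; the comparison
is only a crude way to notice a shadowing module.

```python
import importlib
import os
import sysconfig


def source_location(name):
    """Return (path, in_stdlib) for an importable module.

    Hypothetical helper for illustration only. `path` is None when the
    module has no __file__ (for example, built-in modules such as sys).
    """
    module = importlib.import_module(name)
    path = getattr(module, "__file__", None)
    if path is None:
        return None, False
    # Crude shadowing check: does the file live under the stdlib directory?
    stdlib_dir = os.path.realpath(sysconfig.get_paths()["stdlib"])
    in_stdlib = os.path.realpath(path).startswith(stdlib_dir + os.sep)
    return path, in_stdlib


if __name__ == "__main__":
    print(source_location("json"))
    print(source_location("sys"))
```

With a frozen or zipped stdlib the first step would stop giving a usable
filesystem path, which is the complication described above.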

And meanwhile, other people would be asking why their app runs slower on one 
machine than another, because they didn't expect that installing python-dev on 
top of python would slow down startup.

Finally, on Linux and Mac, the stdlib will usually be somewhere that's not 
user-writable--and we shouldn't expect users to have to mess with stuff in 
/usr/lib or /System/Library even if they do have sudo access. Of course we 
could put a "stdlib shadow" location on the sys.path and configure it for 
/usr/local/lib and /Library and/or for somewhere in -, but that just makes the 
lookup procedure even more complicated--not to mention that we've just added 
three stat calls to remove one open, at which point the optimization has 
probably become a pessimization.


Re: [Python-Dev] More optimisation ideas

2016-02-01 Thread mike . romberg
> Barry Warsaw writes:

>> On Feb 01, 2016, at 11:40 AM, R. David Murray wrote:

>> I don't know about anyone else, but on my own development
>> systems it is not that unusual for me to *edit* the stdlib
>> files (to add debug prints) while debugging my own programs.
>> Freeze would definitely interfere with that.  I could, of
>> course, install a separate source build on my dev system, but I
>> thought it worth mentioning as a factor.

   [snip]

 > But even with system scripts, I do need to step through them
 > occasionally.  If it were a matter of changing a shebang or
 > invoking the script with a different Python
 > (e.g. /usr/bin/python3s vs. /usr/bin/python3) to get the full
 > unpacked source, that would be fine.

  If the stdlib were to use implicit namespace packages
( https://www.python.org/dev/peps/pep-0420/ ) and the various
loaders/importers as well, then python could do what I've done with an
embedded python application for years.  Freeze the stdlib (or put it
in a zipfile or whatever is fast).  Then arrange PYTHONPATH to first
look on the filesystem and then look in the frozen/zipped storage.

  Normally the filesystem part is empty, so modules are loaded from
the frozen/zip area.  But if you want to override one of the frozen
modules, simply copy one or more .py files onto the file system.  I've
been doing this only with top-level modules.  But implicit
namespace packages seem to open the door for this with packages.
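
A minimal sketch of the arrangement described above, using a zip archive
rather than frozen modules and sys.path rather than the PYTHONPATH
environment variable. The module name demo_frozen_mod, the archive name and
the temporary directory layout are all made up for this demonstration.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Hypothetical module name, used only for this demonstration.
MODULE = "demo_frozen_mod"

workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "stdlib.zip")
override_dir = os.path.join(workdir, "override")
os.mkdir(override_dir)

# The "fast" storage: a zip archive holding the packaged module.
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr(MODULE + ".py", "ORIGIN = 'zip'\n")

# Search the (initially empty) filesystem directory first, the archive second.
sys.path[:0] = [override_dir, archive]

# Nothing is in the override directory yet, so the archived copy is loaded.
origin_before = importlib.import_module(MODULE).ORIGIN

# Copy a .py file onto the filesystem to shadow the archived module.
with open(os.path.join(override_dir, MODULE + ".py"), "w") as f:
    f.write("ORIGIN = 'filesystem'\n")

del sys.modules[MODULE]
importlib.invalidate_caches()  # let the path finders notice the new file
origin_after = importlib.import_module(MODULE).ORIGIN

print(origin_before, origin_after)
```

The empty override directory still has to be consulted on every import,
which is the cost raised in the replies to this message.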

Mike


Re: [Python-Dev] More optimisation ideas

2016-02-01 Thread Barry Warsaw
On Feb 01, 2016, at 11:40 AM, R. David Murray wrote:

>Well, Brett said it would be optional, though perhaps the above
>paragraph is asking about doing it in our Windows build.  But the linux
>distros might also make use of the option if it exists, so the question is
>very meaningful.  However, you'd have to ask the distro if the source
>would be shipped in the linux case, and I'd guess not in most cases.

It's very likely the .py files would still be shipped, but perhaps in a -dev
package that isn't normally installed.

>I don't know about anyone else, but on my own development systems it is
>not that unusual for me to *edit* the stdlib files (to add debug prints)
>while debugging my own programs.  Freeze would definitely interfere with
>that.  I could, of course, install a separate source build on my dev
>system, but I thought it worth mentioning as a factor.

I do this too, though usually in a VM or chroot and not in my live system.  A
very common situation for me though is pdb stepping through my own code and
landing in -or passing through- stdlib.

>On the other hand, if the distros go the way Nick has (I think) been
>advocating, and have a separate 'system python for system scripts' that
>is independent of the one installed for user use, having the system-only
>python be frozen and sourceless would actually make sense on a couple of
>levels.

Yep, we've talked about it in Debian-land too, but never quite gotten around
to doing anything.  Certainly I'd like to see some consistency among Linux
distros there (i.e. discussed on linux-sig@).

But even with system scripts, I do need to step through them occasionally.  If
it were a matter of changing a shebang or invoking the script with a different
Python (e.g. /usr/bin/python3s vs. /usr/bin/python3) to get the full unpacked
source, that would be fine.

Cheers,
-Barry


Re: [Python-Dev] More optimisation ideas

2016-02-01 Thread Nikolaus Rath

On Feb 01 2016, mike.romb...@comcast.net wrote:
> > Barry Warsaw writes:
>
> >> On Feb 01, 2016, at 11:40 AM, R. David Murray wrote:
>
> >> I don't know about anyone else, but on my own development
> >> systems it is not that unusual for me to *edit* the
> >> stdlib files (to add debug prints) while debugging my own
> >> programs.  Freeze would definitely interfere with that.
> >> I could, of course, install a separate source build on my
> >> dev system, but I thought it worth mentioning as a
> >> factor.
>
>    [snip]
>
> > But even with system scripts, I do need to step through
> > them occasionally.  If it were a matter of changing a
> > shebang or invoking the script with a different Python
> > (e.g. /usr/bin/python3s vs. /usr/bin/python3) to get the
> > full unpacked source, that would be fine.
>
> If the stdlib were to use implicit namespace packages
> ( https://www.python.org/dev/peps/pep-0420/ ) and the various
> loaders/importers as well, then python could do what I've done
> with an embedded python application for years.  Freeze the
> stdlib (or put it in a zipfile or whatever is fast).  Then
> arrange PYTHONPATH to first look on the filesystem and then look
> in the frozen/zipped storage.


Presumably that would eliminate the performance advantages of the 
frozen/zipped storage because now Python would still have to issue 
all the stat calls to first check for the existence of a .py file.



Best,
-Nikolaus

(No Cc on replies please, I'm reading the list)
--
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

»Time flies like an arrow, fruit flies like a Banana.«


Re: [Python-Dev] More optimisation ideas

2016-02-01 Thread Andrew Barnert via Python-Dev
On Feb 1, 2016, at 19:44, Terry Reedy  wrote:
> 
>> On 2/1/2016 3:39 PM, Andrew Barnert via Python-Dev wrote:
>> 
>> There are already multiple duplicate questions every month on
>> StackOverflow from people asking "how do I find the source to stdlib
>> module X". The canonical answer starts off by explaining how to
>> import the module and use its __file__, which everyone is able to
>> handle.
> 
> Perhaps even easier: start IDLE, hit Alt-M, type in module name as one would 
> import it, click OK.  If Python source is available, IDLE will open it in an 
> editor window, with the path on the title bar.
> 
>> If we have to instead explain how to work out the .py name
>> from the qualified module name, how to work out the stdlib path from
>> sys.path, and then how to find the source from those two things, with
>> the caveat that it may not be installed at all on some platforms, and
>> how to make sure what they're asking about really is a stdlib module,
>> and how to make sure they aren't shadowing it with a module elsewhere
>> on sys.path, that's a lot more complicated.
> 
> The window has the path on the title bar, so one can tell what was loaded.

The point of this thread is the suggestion that the stdlib modules be frozen or 
stored in a zipfile, unless a user modifies things in some way to make the 
source accessible. So, if a user hasn't done that (which no novice will know 
how to do), there won't be a path to show in the title bar, so IDLE won't be 
any more help than the command line.

(I suppose IDLE could grow a new feature to look up "associated source files" 
for a zipped stdlib or something, but that seems like a pretty big new feature.)

> IDLE currently uses imp.find_module (this could be updated), with a backup of 
> __import__(...).__file__, so it will load non-stdlib files that can be 
> imported.
> 
> > Finally, on Linux and Mac, the stdlib will usually be somewhere
> > that's not user-writable
> 
> On Windows, this depends on the install location.  Perhaps there should be an 
> option for edit-save or view only to avoid accidental changes.

The problem is that, if the standard way for users to see stdlib sources is to 
copy them from somewhere else (like $install/src/Lib) into a stdlib directory 
(like $install/Lib), then that stdlib directory has to be writable--and on Mac 
and Linux, it's not.



Re: [Python-Dev] More optimisation ideas

2016-02-01 Thread Terry Reedy

On 2/1/2016 3:39 PM, Andrew Barnert via Python-Dev wrote:

> There are already multiple duplicate questions every month on
> StackOverflow from people asking "how do I find the source to stdlib
> module X". The canonical answer starts off by explaining how to
> import the module and use its __file__, which everyone is able to
> handle.

Perhaps even easier: start IDLE, hit Alt-M, type in module name as one
would import it, click OK.  If Python source is available, IDLE will
open it in an editor window, with the path on the title bar.

> If we have to instead explain how to work out the .py name
> from the qualified module name, how to work out the stdlib path from
> sys.path, and then how to find the source from those two things, with
> the caveat that it may not be installed at all on some platforms, and
> how to make sure what they're asking about really is a stdlib module,
> and how to make sure they aren't shadowing it with a module elsewhere
> on sys.path, that's a lot more complicated.

The window has the path on the title bar, so one can tell what was loaded.

IDLE currently uses imp.find_module (this could be updated), with a 
backup of __import__(...).__file__, so it will load non-stdlib files 
that can be imported.
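
One way the imp.find_module lookup could be updated is with
importlib.util.find_spec, which locates a module without executing it
(parent packages of a dotted name are still imported). This is only a
sketch under that assumption, not a description of what IDLE actually does;
the helper name locate_source is hypothetical.

```python
import importlib.util


def locate_source(name):
    """Return the path of the .py file for module `name`, or None.

    Hypothetical helper: returns None for modules that cannot be found,
    have no file location (built-in or frozen), or are not plain source.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.has_location:
        return None
    if not spec.origin.endswith(".py"):
        return None
    return spec.origin


if __name__ == "__main__":
    for name in ("json", "sys", "no_such_module_xyz"):
        print(name, "->", locate_source(name))
```

A frozen or built-in module yields None here, so an "open module" dialog
built on this would have nothing to show for it.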


> Finally, on Linux and Mac, the stdlib will usually be somewhere
> that's not user-writable

On Windows, this depends on the install location.  Perhaps there should 
be an option for edit-save or view only to avoid accidental changes.


--
Terry Jan Reedy



Re: [Python-Dev] More optimisation ideas

2016-02-01 Thread Brett Cannon
On Mon, 1 Feb 2016 at 08:48 R. David Murray  wrote:

> On Mon, 01 Feb 2016 14:12:27 +1100, Steven D'Aprano wrote:
> > On Sun, Jan 31, 2016 at 08:23:00PM +, Brett Cannon wrote:
> > > So freezing the stdlib helps on UNIX and not on OS X (if my old
> > > testing is still accurate). I guess the next question is what it does
> > > on Windows and if we would want to ever consider freezing the stdlib
> > > as part of the build process (and if we would want to change the order
> > > of importers on sys.meta_path so frozen modules came after file-based
> > > ones).
> >
> > I find that being able to easily open stdlib .py files in a text editor
> > to read the source is extremely valuable. I've learned much more from
> > reading the source than from (e.g.) StackOverflow. Likewise, it's often
> > handy to do a grep over the stdlib. When you talk about freezing the
> > stdlib, what exactly does that mean?
> >
> > - will the source files still be there?
>
> Well, Brett said it would be optional, though perhaps the above
> paragraph is asking about doing it in our Windows build.


Nope, it would probably need to be across all OSs to have consistent
semantics.


>   But the linux
> distros might also make use of the option if it exists, so the question is
> very meaningful.  However, you'd have to ask the distro if the source
> would be shipped in the linux case, and I'd guess not in most cases.
>
> I don't know about anyone else, but on my own development systems it is
> not that unusual for me to *edit* the stdlib files (to add debug prints)
> while debugging my own programs.  Freeze would definitely interfere with
> that.  I could, of course, install a separate source build on my dev
> system, but I thought it worth mentioning as a factor.
>

This is what would need to be discussed in terms of how to handle this. For
instance, we already do stuff in (I believe) site.py when we detect the
build is in a checkout, so we could in that instance make sure the stdlib
file directory takes precedence over any frozen code (hence why I wondered
if the frozen importer on sys.meta_path should come after the sys.path
importer). If we did that then we could make installing the stdlib files
optional but still take precedence.
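
To make the ordering question concrete, the sketch below inspects
sys.meta_path and builds a reordered copy in which the frozen importer comes
after the path-based finder. It assumes a stock CPython where both
FrozenImporter and PathFinder are present on sys.meta_path, and it
deliberately leaves the live list untouched.

```python
import sys
from importlib.machinery import FrozenImporter, PathFinder

# Names of the finders in the order the import system consults them.
order = [getattr(f, "__name__", type(f).__name__) for f in sys.meta_path]
print(order)

# By default, frozen modules are found before anything on sys.path.
frozen_first = (
    sys.meta_path.index(FrozenImporter) < sys.meta_path.index(PathFinder)
)
print("frozen importer consulted first:", frozen_first)

# The reordering discussed above, applied to a copy for illustration only:
# files on sys.path would take precedence over frozen code.
reordered = [f for f in sys.meta_path if f is not FrozenImporter]
reordered.insert(reordered.index(PathFinder) + 1, FrozenImporter)
print([getattr(f, "__name__", type(f).__name__) for f in reordered])
```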

It's all workable, it's just a question of if we want to. This is why I
think we should get concrete benchmark numbers on Windows, Linux, and OS X
to see if this is even worth considering as something we provide in our own
binaries.


>
> On the other hand, if the distros go the way Nick has (I think) been
> advocating, and have a separate 'system python for system scripts' that
> is independent of the one installed for user use, having the system-only
> python be frozen and sourceless would actually make sense on a couple of
> levels.
>

It at least wouldn't hurt anything.


Re: [Python-Dev] More optimisation ideas

2016-01-31 Thread Brett Cannon
So freezing the stdlib helps on UNIX and not on OS X (if my old testing is
still accurate). I guess the next question is what it does on Windows and
if we would want to ever consider freezing the stdlib as part of the build
process (and if we would want to change the order of importers on
sys.meta_path so frozen modules came after file-based ones).

On Sun, 31 Jan 2016, 10:43 M.-A. Lemburg <m...@egenix.com> wrote:

> On 30.01.2016 20:15, Steve Dower wrote:
> > Brett tried freezing the entire stdlib at one point (as we do for parts
> > of importlib) and reported no significant improvement. Since that rules
> > out code compilation as well as the OS calls, it'd seem the priority is
> > to execute less code on startup.
> >
> > Details of that work were posted to python-dev about twelve months ago,
> > IIRC. Maybe a little longer.
>
> Freezing the entire stdlib does improve the startup time,
> simply because it removes stat calls, which dominate the startup
> time at least on Unix.
>
> It also allows sharing the stdlib byte code in memory, since it gets
> stored in static C structs which the OS will happily mmap into
> multiple processes for you without any additional effort.
>
> Our eGenix PyRun does exactly that. Even though the original
> motivation is a different one, the gained improvement in
> startup time is a nice side effect:
>
> http://www.egenix.com/products/python/PyRun/
>
> Aside: The encodings don't really make much difference here. The
> dictionaries aren't all that big, so generating them on the fly doesn't
> really create much overhead. The trade off in terms of
> maintainability/speed definitely leans toward maintainability. For the
> larger encoding tables we already have C implementations with
> appropriate data structures to make lookup speed vs. storage needs
> efficient.
>
> > Top-posted from my Windows Phone
> >
> > -Original Message-
> > From: "Serhiy Storchaka" <storch...@gmail.com>
> > Sent: ‎1/‎30/‎2016 10:22
> > To: "python-dev@python.org" <python-dev@python.org>
> > Subject: Re: [Python-Dev] More optimisation ideas
> >
> > On 30.01.16 18:31, Steve Dower wrote:
> >> On 30Jan2016 0645, Serhiy Storchaka wrote:
> >>> $ ./python -m timeit -s "import codecs; from encodings.cp437 import
> >>> decoding_table" -- "codecs.charmap_build(decoding_table)"
> >>> 10 loops, best of 3: 4.36 usec per loop
> >>>
> >>> Getting rid of charmap_build() would save you at most 4.4
> >>> microseconds per encoding. 0.0005 seconds if you have imported *all*
> >>> standard encodings!
> >>
> >> Just as happy to be proven wrong. Perhaps I misinterpreted my original
> >> profiling and then, embarrassingly, ran with the result for a long time
> >> without retesting.
> >
> > AFAIK most of the time is spent in system calls like stat or open.
> > Archiving the stdlib into a ZIP file and using zipimport can decrease
> > Python startup time (perhaps there is an open issue about this).
> >
> >


Re: [Python-Dev] More optimisation ideas

2016-01-31 Thread Terry Reedy

On 1/31/2016 12:09 PM, Antoine Pitrou wrote:

> The following documentation leaves me absolutely clueless:
>
> """This class only works with loaders that define exec_module() as control
> over what module type is used for the module is required.

No wonder.  I cannot parse it as an English sentence. It needs rewriting.

> For those same
> reasons, the loader’s create_module() method will be ignored (i.e., the
> loader’s method should only return None). Finally, modules which substitute
> the object placed into sys.modules will not work as there is no way to
> properly replace the module references throughout the interpreter safely;
> ValueError is raised if such a substitution is detected."""
>
> (reference:
> https://docs.python.org/3/library/importlib.html#importlib.util.LazyLoader)


--
Terry Jan Reedy




Re: [Python-Dev] More optimisation ideas

2016-01-31 Thread Brett Cannon
I have opened http://bugs.python.org/issue26252 to track writing the
example (and before ppl go playing with the lazy loader, be aware of
http://bugs.python.org/issue26186).

On Sun, 31 Jan 2016 at 09:26 Brett Cannon  wrote:

> There are no example docs for it yet, but enough people have asked this
> week about how to set up a custom importer that I will write up a generic
> example case which will make sense for a lazy loader (need to file the
> issue before I forget).
>
> On Sun, 31 Jan 2016, 09:11 Donald Stufft  wrote:
>
>>
>> On Jan 31, 2016, at 12:02 PM, Brett Cannon  wrote:
>>
>> A lazy importer was added in Python 3.5
>>
>>
>> Is there any docs on how to actually use the LazyLoader in 3.5? I can’t
>> seem to find any but I don’t really know the import system that well.
>>
>> -
>> Donald Stufft
>> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372
>> DCFA
>>
>>


Re: [Python-Dev] More optimisation ideas

2016-01-31 Thread Donald Stufft

> On Jan 31, 2016, at 12:02 PM, Brett Cannon  wrote:
> 
> A lazy importer was added in Python 3.5


Is there any docs on how to actually use the LazyLoader in 3.5? I can’t seem to 
find any but I don’t really know the import system that well.

-
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA





Re: [Python-Dev] More optimisation ideas

2016-01-31 Thread Antoine Pitrou
Brett Cannon writes:

> 
> 
> A lazy importer was added in Python 3.5 and it was not possible
> without the module spec refactoring.

Wow... Thank you, I didn't know about that.

Now for the next question: how am I supposed to use it?

The following documentation leaves me absolutely clueless:

"""This class only works with loaders that define exec_module() as control
over what module type is used for the module is required. For those same
reasons, the loader’s create_module() method will be ignored (i.e., the
loader’s method should only return None). Finally, modules which substitute
the object placed into sys.modules will not work as there is no way to
properly replace the module references throughout the interpreter safely;
ValueError is raised if such a substitution is detected."""

(reference:
https://docs.python.org/3/library/importlib.html#importlib.util.LazyLoader)

I want to import lazily the modules from package "foobar.*", but not
other modules as other libraries may depend on import side effects.
How do I do that? The quoted snippet doesn't really help.
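
For what it is worth, one possible way to use LazyLoader for explicitly
named modules looks roughly like the sketch below. It follows the recipe
that later appeared in the importlib documentation; the function name
lazy_import is an illustrative choice, colorsys merely stands in for a
"foobar.*" module, and the issue26186 caveat mentioned elsewhere in this
thread applies to 3.5. Only the modules passed to the function become lazy,
so other libraries that rely on import side effects are unaffected.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose code runs on first attribute access.

    Sketch based on the importlib documentation recipe; the function name
    is an assumption made for this example.
    """
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # arranges for deferred execution
    return module


# colorsys stands in for a module from the hypothetical "foobar" package.
colorsys = lazy_import("colorsys")

# The module body is executed here, on the first attribute access.
print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))
```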

Regards

Antoine.


Re: [Python-Dev] More optimisation ideas

2016-01-31 Thread Brett Cannon
There are no example docs for it yet, but enough people have asked this
week about how to set up a custom importer that I will write up a generic
example case which will make sense for a lazy loader (need to file the
issue before I forget).

On Sun, 31 Jan 2016, 09:11 Donald Stufft  wrote:

>
> On Jan 31, 2016, at 12:02 PM, Brett Cannon  wrote:
>
> A lazy importer was added in Python 3.5
>
>
> Is there any docs on how to actually use the LazyLoader in 3.5? I can’t
> seem to find any but I don’t really know the import system that well.
>
> -
> Donald Stufft
> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372
> DCFA
>
>


Re: [Python-Dev] More optimisation ideas

2016-01-31 Thread M.-A. Lemburg
On 30.01.2016 20:15, Steve Dower wrote:
> Brett tried freezing the entire stdlib at one point (as we do for parts of 
> importlib) and reported no significant improvement. Since that rules out code 
> compilation as well as the OS calls, it'd seem the priority is to execute 
> less code on startup.
> 
> Details of that work were posted to python-dev about twelve months ago, IIRC. 
> Maybe a little longer.

Freezing the entire stdlib does improve the startup time,
simply because it removes stat calls, which dominate the startup
time at least on Unix.

It also allows sharing the stdlib byte code in memory, since it gets
stored in static C structs which the OS will happily mmap into
multiple processes for you without any additional effort.
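
As a small illustration of what "frozen" means in a stock CPython, the
sketch below asks the frozen importer about _frozen_importlib (the part of
importlib that is already frozen into the interpreter) and lists which
loaded modules report a frozen origin. The exact list depends on the Python
version and build, so no particular output is assumed.

```python
import sys
from importlib.machinery import FrozenImporter

# _frozen_importlib is compiled into the interpreter binary, so the
# frozen importer can find it without touching the filesystem.
spec = FrozenImporter.find_spec("_frozen_importlib")
print(spec.origin)

# Modules currently loaded whose spec says they came from frozen code.
frozen_loaded = sorted(
    name
    for name, module in list(sys.modules.items())
    if getattr(getattr(module, "__spec__", None), "origin", None) == "frozen"
)
print(frozen_loaded)
```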

Our eGenix PyRun does exactly that. Even though the original
motivation is a different one, the gained improvement in
startup time is a nice side effect:

http://www.egenix.com/products/python/PyRun/

Aside: The encodings don't really make much difference here. The
dictionaries aren't all that big, so generating them on the fly doesn't
really create much overhead. The trade off in terms of maintainability/speed
definitely leans toward maintainability. For the larger encoding
tables we already have C implementations with appropriate data
structures to make lookup speed vs. storage needs efficient.

-- 
Marc-Andre Lemburg
eGenix.com



> Top-posted from my Windows Phone
> 
> -Original Message-
> From: "Serhiy Storchaka" <storch...@gmail.com>
> Sent: ‎1/‎30/‎2016 10:22
> To: "python-dev@python.org" <python-dev@python.org>
> Subject: Re: [Python-Dev] More optimisation ideas
> 
> On 30.01.16 18:31, Steve Dower wrote:
>> On 30Jan2016 0645, Serhiy Storchaka wrote:
>>> $ ./python -m timeit -s "import codecs; from encodings.cp437 import
>>> decoding_table" -- "codecs.charmap_build(decoding_table)"
>>> 10 loops, best of 3: 4.36 usec per loop
>>>
>>> Getting rid of charmap_build() would save you at most 4.4 microseconds
>>> per encoding. 0.0005 seconds if you have imported *all* standard
>>> encodings!
>>
>> Just as happy to be proven wrong. Perhaps I misinterpreted my original
>> profiling and then, embarrassingly, ran with the result for a long time
>> without retesting.
> 
> AFAIK most of the time is spent in system calls like stat or open. 
> Archiving the stdlib into a ZIP file and using zipimport can decrease 
> Python startup time (perhaps there is an open issue about this).
> 
> 


Re: [Python-Dev] More optimisation ideas

2016-01-31 Thread Antoine Pitrou

Hi,

If you want to make startup time faster for a broad range of applications,
please consider adding a lazy import facility in the stdlib.
I recently tried to write a lazy import mechanism using import hooks
(to make it portable from 2.6 to 3.5), and it seems nearly impossible to do
so (or, at least, for an average Python programmer like me).

This would be much more useful (for actual users, not for architecture
astronauts) than refactoring the importlib APIs in each feature version...

Thanks in advance

Antoine.




Re: [Python-Dev] More optimisation ideas

2016-01-31 Thread Brett Cannon
A lazy importer was added in Python 3.5 and it was not possible without the
module spec refactoring.

On Sun, 31 Jan 2016, 08:57 Antoine Pitrou  wrote:

>
> Hi,
>
> If you want to make startup time faster for a broad range of applications,
> please consider adding a lazy import facility in the stdlib.
> I recently tried to write a lazy import mechanism using import hooks
> (to make it portable from 2.6 to 3.5), and it seems nearly impossible to do
> so (or, at least, for an average Python programmer like me).
>
> This would be much more useful (for actual users, not for architecture
> astronauts) than refactoring the importlib APIs in each feature version...
>
> Thanks in advance
>
> Antoine.
>
>


Re: [Python-Dev] More optimisation ideas

2016-01-31 Thread Brett Cannon
On Sun, 31 Jan 2016, 15:36 Terry Reedy  wrote:

> On 1/31/2016 12:09 PM, Antoine Pitrou wrote:
>
> > The following documentation leaves me absolutely clueless:
> >
> > """This class only works with loaders that define exec_module() as
> > control over what module type is used for the module is required.
>
> No wonder.  I cannot parse it as an English sentence. It needs rewriting.
>

Feel free to open an issue to clarify the wording.

-Brett


> > For those same
> > reasons, the loader’s create_module() method will be ignored (i.e., the
> > loader’s method should only return None). Finally, modules which
> > substitute the object placed into sys.modules will not work as there is
> > no way to properly replace the module references throughout the
> > interpreter safely; ValueError is raised if such a substitution is
> > detected."""
> >
> > (reference:
> > https://docs.python.org/3/library/importlib.html#importlib.util.LazyLoader)
>
> --
> Terry Jan Reedy


Re: [Python-Dev] More optimisation ideas

2016-01-31 Thread Steven D'Aprano
On Sun, Jan 31, 2016 at 08:23:00PM +, Brett Cannon wrote:
> So freezing the stdlib helps on UNIX and not on OS X (if my old testing is
> still accurate). I guess the next question is what it does on Windows and
> if we would want to ever consider freezing the stdlib as part of the build
> process (and if we would want to change the order of importers on
> sys.meta_path so frozen modules came after file-based ones).

I find that being able to easily open stdlib .py files in a text editor 
to read the source is extremely valuable. I've learned much more from 
reading the source than from (e.g.) StackOverflow. Likewise, it's often 
handy to do a grep over the stdlib. When you talk about freezing the 
stdlib, what exactly does that mean?

- will the source files still be there?

- how will this affect people writing patches for bugs?



-- 
Steve


Re: [Python-Dev] More optimisation ideas

2016-01-30 Thread Brett Cannon
On Sat, Jan 30, 2016, 12:30 Sven R. Kunze  wrote:

> On 30.01.2016 19:20, Serhiy Storchaka wrote:
> > AFAIK the most time is spent in system calls like stat or open.
> > Archiving the stdlib into the ZIP file and using zipimport can
> > decrease Python startup time (perhaps there is an open issue about this).
>
> Oh, please don't. One thing I love about Python is the ease of access.
>

It wouldn't be a requirement, just an option


> I personally think that startup time is not really a big issue; even
> when it comes to microbenchmarks.
>

You might not, but just about every command-line app does.

-brett


> Best,
> Sven


Re: [Python-Dev] More optimisation ideas

2016-01-30 Thread Steve Dower
Brett tried freezing the entire stdlib at one point (as we do for parts of 
importlib) and reported no significant improvement. Since that rules out code 
compilation as well as the OS calls, it'd seem the priority is to execute less 
code on startup.

Details of that work were posted to python-dev about twelve months ago, IIRC. 
Maybe a little longer.

Top-posted from my Windows Phone

-Original Message-
From: "Serhiy Storchaka" <storch...@gmail.com>
Sent: ‎1/‎30/‎2016 10:22
To: "python-dev@python.org" <python-dev@python.org>
Subject: Re: [Python-Dev] More optimisation ideas

On 30.01.16 18:31, Steve Dower wrote:
> On 30Jan2016 0645, Serhiy Storchaka wrote:
>> $ ./python -m timeit -s "import codecs; from encodings.cp437 import
>> decoding_table" -- "codecs.charmap_build(decoding_table)"
>> 10 loops, best of 3: 4.36 usec per loop
>>
>> Getting rid of charmap_build() would save you at most 4.4 microseconds
>> per encoding. 0.0005 seconds if you have imported *all* standard
>> encodings!
>
> Just as happy to be proven wrong. Perhaps I misinterpreted my original
> profiling and then, embarrassingly, ran with the result for a long time
> without retesting.

AFAIK the most time is spent in system calls like stat or open. 
Archiving the stdlib into the ZIP file and using zipimport can decrease 
Python startup time (perhaps there is an open issue about this).




Re: [Python-Dev] More optimisation ideas

2016-01-30 Thread Steve Dower

On 30Jan2016 0645, Serhiy Storchaka wrote:
> $ ./python -m timeit -s "import codecs; from encodings.cp437 import
> decoding_table" -- "codecs.charmap_build(decoding_table)"
> 10 loops, best of 3: 4.36 usec per loop
>
> Getting rid of charmap_build() would save you at most 4.4 microseconds
> per encoding. 0.0005 seconds if you have imported *all* standard encodings!

Just as happy to be proven wrong. Perhaps I misinterpreted my original
profiling and then, embarrassingly, ran with the result for a long time
without retesting.

> And how did you expect to store encoding_table in a more efficient way?


There's nothing inefficient about its storage, but as it does not change 
it would be trivial to store it statically. Then "building" the map is 
simply obtaining a pointer into an already loaded memory page. Much 
faster than building it on load, but both are clearly insignificant 
compared to other factors.


Cheers,
Steve



Re: [Python-Dev] More optimisation ideas

2016-01-30 Thread Serhiy Storchaka

On 30.01.16 18:31, Steve Dower wrote:
> On 30Jan2016 0645, Serhiy Storchaka wrote:
>> $ ./python -m timeit -s "import codecs; from encodings.cp437 import
>> decoding_table" -- "codecs.charmap_build(decoding_table)"
>> 10 loops, best of 3: 4.36 usec per loop
>>
>> Getting rid of charmap_build() would save you at most 4.4 microseconds
>> per encoding. 0.0005 seconds if you have imported *all* standard
>> encodings!
>
> Just as happy to be proven wrong. Perhaps I misinterpreted my original
> profiling and then, embarrassingly, ran with the result for a long time
> without retesting.


AFAIK the most time is spent in system calls like stat or open. 
Archiving the stdlib into the ZIP file and using zipimport can decrease 
Python startup time (perhaps there is an open issue about this).
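
As a rough illustration of the zipimport mechanism (not a measurement of
any startup gain), the sketch below writes a one-line module into a ZIP
archive and imports it from there. The names zipped_demo_module,
bundle.zip and import_from_zip are made up for the example:

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_zip(module_name, source):
    """Write `source` into a ZIP archive and import it via zipimport."""
    with tempfile.TemporaryDirectory() as tmpdir:
        archive = os.path.join(tmpdir, "bundle.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr(module_name + ".py", source)
        # A ZIP file on sys.path is handled by the zipimport path hook.
        sys.path.insert(0, archive)
        try:
            importlib.invalidate_caches()
            return importlib.import_module(module_name)
        finally:
            sys.path.remove(archive)


demo = import_from_zip("zipped_demo_module", "ANSWER = 42\n")
print(demo.ANSWER, type(demo.__loader__).__name__)
```

Whether this actually reduces stat/open traffic enough to matter would
still need to be profiled on each platform.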





Re: [Python-Dev] More optimisation ideas

2016-01-30 Thread Brett Cannon
On Sat, 30 Jan 2016 at 10:21 Serhiy Storchaka  wrote:

> On 30.01.16 18:31, Steve Dower wrote:
> > On 30Jan2016 0645, Serhiy Storchaka wrote:
> >> $ ./python -m timeit -s "import codecs; from encodings.cp437 import
> >> decoding_table" -- "codecs.charmap_build(decoding_table)"
> >> 10 loops, best of 3: 4.36 usec per loop
> >>
> >> Getting rid of charmap_build() would save you at most 4.4 microseconds
> >> per encoding. 0.0005 seconds if you have imported *all* standard
> >> encodings!
> >
> > Just as happy to be proven wrong. Perhaps I misinterpreted my original
> > profiling and then, embarrassingly, ran with the result for a long time
> > without retesting.
>
> AFAIK the most time is spent in system calls like stat or open.
> Archiving the stdlib into the ZIP file and using zipimport can decrease
> Python startup time (perhaps there is an open issue about this).
>

Check the archives, but I did try freezing the entire stdlib and it
didn't really make a difference in startup, so I don't know if this still
holds true.

At this point I think all of our knowledge of what takes the most
time during startup is outdated and someone should try to really profile
the whole thing to see where the hotspots are (e.g., is it stat calls from
imports, is it actually some specific function, is it just so many little
things adding up to a big thing, etc.).


Re: [Python-Dev] More optimisation ideas

2016-01-30 Thread Oscar Benjamin
On 30 January 2016 at 03:48, Steve Dower  wrote:
>
> It doesn't currently end up on disk. Some tables are partially or completely
> stored on disk as Python source code (some are partially generated from
> simple rules), but others are generated by inverting those. That process
> takes time that could be avoided by storing the generated tables, and
> storing all of it in a format that doesn't require parsing, compiling and
> executing (such as a native array).
>
> Potentially it could be a win all around if we stopped including the
> (larger) source files, but that doesn't seem like a good idea for
> maintaining portability to other implementations. The main thought is making
> the compiler binary bigger to avoid generating encoding tables at startup.

When I last tried to profile startup on Windows (I haven't used
Windows for some time now) it seemed that the time was totally
dominated by file system access. Essentially the limiting factor was
the inordinate number of stat calls and small file accesses. That said,
this was probably Python 2.x, which may not import those particular
modules, and the result may also depend on virus scanner software etc.

Things may have changed now but I concluded that substantive gains
could only come from improving FS access. Perhaps something like
zipping up the standard library would see a big improvement.

--
Oscar


Re: [Python-Dev] More optimisation ideas

2016-01-30 Thread Serhiy Storchaka

On 29.01.16 19:05, Steve Dower wrote:
> This is probably the code snippet that bothered me the most:
>
>  ### Encoding table
>  encoding_table=codecs.charmap_build(decoding_table)
>
> It shows up in many of the encodings modules, and while it is not a bad
> function in itself, we are obviously generating a known data structure
> on every startup. Storing these in static data is a tradeoff between
> disk space and startup performance, and one I think is likely to be
> worthwhile.

$ ./python -m timeit -s "import codecs; from encodings.cp437 import
decoding_table" -- "codecs.charmap_build(decoding_table)"
10 loops, best of 3: 4.36 usec per loop

Getting rid of charmap_build() would save you at most 4.4 microseconds
per encoding. 0.0005 seconds if you have imported *all* standard encodings!

And how did you expect to store encoding_table in a more efficient way?
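
The command-line measurement above can also be reproduced from a script.
This is only a sketch using the stdlib timeit module; the helper name
seconds_per_charmap_build and the loop counts are arbitrary choices, and
the absolute numbers will differ from machine to machine:

```python
import codecs
import timeit
from encodings.cp437 import decoding_table


def seconds_per_charmap_build(table, number=10000, repeat=3):
    """Best-of-`repeat` time for a single codecs.charmap_build() call."""
    timer = timeit.Timer(lambda: codecs.charmap_build(table))
    return min(timer.repeat(repeat=repeat, number=number)) / number


per_call = seconds_per_charmap_build(decoding_table)
print("charmap_build: %.2f usec per call" % (per_call * 1e6))

# Scaling the 4.4 usec figure quoted above: 0.0005 s / 4.4 usec is a
# little over 110 calls, i.e. on the order of one call per charmap codec.
print("calls in 0.0005 s at 4.4 usec each: %d" % (0.0005 / 4.4e-6))
```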



Re: [Python-Dev] More optimisation ideas

2016-01-30 Thread Sven R. Kunze

On 30.01.2016 19:20, Serhiy Storchaka wrote:
> AFAIK the most time is spent in system calls like stat or open.
> Archiving the stdlib into the ZIP file and using zipimport can
> decrease Python startup time (perhaps there is an open issue about this).


Oh, please don't. One thing I love about Python is the ease of access.

I personally think that startup time is not really a big issue; even 
when it comes to microbenchmarks.


Best,
Sven


Re: [Python-Dev] More optimisation ideas

2016-01-30 Thread Sven R. Kunze

On 30.01.2016 21:32, Brett Cannon wrote:
> On Sat, Jan 30, 2016, 12:30 Sven R. Kunze wrote:
>
>> On 30.01.2016 19:20, Serhiy Storchaka wrote:
>>> AFAIK the most time is spent in system calls like stat or open.
>>> Archiving the stdlib into the ZIP file and using zipimport can
>>> decrease Python startup time (perhaps there is an open issue
>>> about this).
>>
>> Oh, please don't. One thing I love about Python is the ease of access.
>
> It wouldn't be a requirement, just an option



That's good. :)


Re: [Python-Dev] More optimisation ideas

2016-01-29 Thread francismb
Hi,

> 
> Storing these in static data is a tradeoff between
> disk space and startup performance, and one I think is likely to be
> worthwhile.

Is it really an important trade-off? As far as I understand from your
email, those modules are always being loaded and the final data created.
Won't the space be used either way (in memory or on disk)?

Thanks in advance!
francis




Re: [Python-Dev] More optimisation ideas

2016-01-29 Thread Steve Dower
It doesn't currently end up on disk. Some tables are partially or completely 
stored on disk as Python source code (some are partially generated from simple 
rules), but others are generated by inverting those. That process takes time 
that could be avoided by storing the generated tables, and storing all of it in 
a format that doesn't require parsing, compiling and executing (such as a 
native array).
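
To make "generated by inverting those" concrete, here is a sketch of
what inverting a decoding table amounts to in pure Python. It is not the
code CPython runs (that is codecs.charmap_build, implemented in C); the
function name invert_decoding_table is hypothetical, and it assumes the
'\ufffe' placeholder that cp1252 uses for undefined byte values:

```python
from encodings.cp1252 import decoding_table


def invert_decoding_table(table, undefined="\ufffe"):
    """Map each decoded character back to the byte value that produces it."""
    return {char: byte for byte, char in enumerate(table) if char != undefined}


encoding_map = invert_decoding_table(decoding_table)
# The euro sign lives at byte 0x80 in cp1252.
print(hex(encoding_map["\u20ac"]))
```

Storing the result statically would mean shipping this mapping
precomputed instead of rebuilding it on every import.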

Potentially it could be a win all around if we stopped including the (larger) 
source files, but that doesn't seem like a good idea for maintaining 
portability to other implementations. The main thought is making the compiler 
binary bigger to avoid generating encoding tables at startup.

Top-posted from my Windows Phone

-Original Message-
From: "francismb" <franci...@email.de>
Sent: ‎1/‎29/‎2016 13:56
To: "python-dev@python.org" <python-dev@python.org>
Subject: Re: [Python-Dev] More optimisation ideas

Hi,

> 
> Storing these in static data is a tradeoff between
> disk space and startup performance, and one I think is likely to be
> worthwhile.

Is it really an important trade-off? As far as I understand from your
email, those modules are always being loaded and the final data created.
Won't the space be used either way (in memory or on disk)?

Thanks in advance!
francis




[Python-Dev] More optimisation ideas

2016-01-29 Thread Steve Dower
Since we're all talking about making Python faster, I thought I'd drop 
some previous ideas I've had here in case (1) someone wants to actually 
do them, and (2) they really are new ideas that haven't failed in the 
past. Mostly I was thinking about startup time.


Here is the list of modules imported on a clean startup on my Windows,
US-English machine (from -v and cleaned up a bit):


import _frozen_importlib
import _imp
import sys
import '_warnings'
import '_thread'
import '_weakref'
import '_frozen_importlib_external'
import '_io'
import 'marshal'
import 'nt'
import '_thread'
import '_weakref'
import 'winreg'
import 'zipimport'
import '_codecs'
import 'codecs'
import 'encodings.aliases'
import 'encodings'
import 'encodings.mbcs'
import '_signal'
import 'encodings.utf_8'
import 'encodings.latin_1'
import '_weakrefset'
import 'abc'
import 'io'
import 'encodings.cp437'
import 'errno'
import '_stat'
import 'stat'
import 'genericpath'
import 'ntpath'
import '_collections_abc'
import 'os'
import '_sitebuiltins'
import 'sysconfig'
import '_locale'
import '_bootlocale'
import 'encodings.cp1252'
import 'site'
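
A comparable list can be collected on any machine without parsing -v
output. The sketch below starts a fresh interpreter and prints what is
already in sys.modules; the helper name startup_modules is made up, and
the result will differ by platform, version and site configuration:

```python
import subprocess
import sys


def startup_modules(extra_args=()):
    """Return the sorted names in sys.modules of a freshly started interpreter."""
    cmd = [sys.executable] + list(extra_args) + [
        "-c", "import sys; print('\\n'.join(sorted(sys.modules)))",
    ]
    output = subprocess.check_output(cmd, universal_newlines=True)
    return output.split()


modules = startup_modules()
print(len(modules), "modules loaded at startup")
print([name for name in modules if name.startswith("encodings")])
```

Passing extra_args=["-S"] skips the site import, which gives a feel for
how much of the list comes from site itself.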

Obviously the easiest first thing is to remove or delay unnecessary 
imports. But a while ago I used a native profiler to trace through this 
and the most impactful modules were the encodings:


import 'encodings.mbcs'
import 'encodings.utf_8'
import 'encodings.latin_1'
import 'encodings.cp437'
import 'encodings.cp1252'

While I don't doubt that we need all of these for *some* reason,
aliases, cp437 and cp1252 are relatively expensive modules to import,
mostly due to having large static dictionaries or data structures
generated on startup.
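
One crude way to compare the import cost of individual modules, short of
a native profiler, is to time a forced re-import in-process. This is
only a sketch: time_fresh_import is a hypothetical helper, the timing
includes reading the cached bytecode, and a single run is noisy:

```python
import importlib
import sys
import time


def time_fresh_import(name):
    """Time one in-process import of `name`, forcing its body to run again."""
    previous = sys.modules.pop(name, None)
    start = time.perf_counter()
    try:
        importlib.import_module(name)
        return time.perf_counter() - start
    finally:
        if previous is not None:
            # Restore the original module object so existing references agree.
            sys.modules[name] = previous


for name in ("encodings.aliases", "encodings.cp437", "encodings.cp1252"):
    print("%-20s %.3f ms" % (name, time_fresh_import(name) * 1000))
```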


Given this is static and mostly read-only information[1], I see no 
reason why we couldn't either generate completely static versions of 
them, or better yet compile the resulting data structures into the core 
binary.


([1]: If being able to write to some of the encoding data is used by 
some people, I vote for breaking that for 3.6 and making it read-only.)


This is probably the code snippet that bothered me the most:

### Encoding table
encoding_table=codecs.charmap_build(decoding_table)

It shows up in many of the encodings modules, and while it is not a bad 
function in itself, we are obviously generating a known data structure 
on every startup. Storing these in static data is a tradeoff between 
disk space and startup performance, and one I think is likely to be
worthwhile.
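
For anyone unfamiliar with the snippet, this sketch shows what the
resulting object is used for: charmap_build() turns the 256-character
decoding table into an encoding table that charmap_encode() consults.
The sample string is arbitrary, and cp1252 is used only because it is
one of the modules on the list above:

```python
import codecs
from encodings.cp1252 import decoding_table

# Same call as in the encodings modules: build the encoder-side lookup
# structure from the decoder-side table.
encoding_table = codecs.charmap_build(decoding_table)

encoded, consumed = codecs.charmap_encode("caf\xe9", "strict", encoding_table)
print(encoded, consumed)

# Decoding uses the original table directly, so nothing is built for it.
decoded, _ = codecs.charmap_decode(encoded, "strict", decoding_table)
print(decoded)
```

Because decoding_table never changes, the encoding table is the same on
every run, which is what makes precomputing it possible.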


Anyway, just an idea if someone wants to try it and see what 
improvements we can get. I'd love to do it myself, but when it actually 
comes to finding time I keep coming up short.


Cheers,
Steve


P.S. If you just want to discuss optimisation techniques or benchmarking 
in general, without specific application to CPython 3.6, there's a whole 
internet out there. Please don't make me the cause of a pointless 
centithread. :)
