Eric Snow <ericsnowcurren...@gmail.com> added the comment:

On Fri, Aug 27, 2021 at 11:14 PM Larry Hastings <rep...@bugs.python.org> wrote:
> [snip] On the other hand: if we made a viable tool that could consume some
> arbitrary set of .py files and produce a C file, and said C file could then
> be compiled into a shared library, end users could enjoy this speedup over
> the subset of the standard library their program used, and perhaps even
> their own source tree(s).

Yeah, that would be interesting to investigate.

On Sat, Aug 28, 2021 at 5:17 AM Marc-Andre Lemburg
<rep...@bugs.python.org> wrote:
> Eric's approach, as I understand it, is pretty much what PyRun does.
> [further details]

It's reassuring to hear that the approach is known to be viable. :)

> In fact, you save quite a bit of disk space compared to a full Python
> installation and additionally benefit from the memory mapping the OS does
> for sharing access to the marshal'ed byte code between processes.

That's a good point.

> That said, some things don't work with such an approach, e.g. a few packages
> include additional data files which they expect to find on disk. Since those
> are not available anymore, they fail.
>
> For PyRun I have patched some of those packages to include the data in form
> of Python modules instead, so that it gets frozen as well, e.g. the Python
> grammar files.

For stdlib modules it wouldn't be a big problem to set __file__ on
frozen modules.  Would that be enough to solve the problem?

On Sat, Aug 28, 2021 at 5:41 PM Gregory Szorc <rep...@bugs.python.org> wrote:
> When I investigated freezing the standard library for PyOxidizer, I ran into
> a rash of problems. The frozen importer doesn't behave like PathFinder. It
> doesn't (didn't?) set some common module-level attributes

This is mostly fixable for stdlib modules.  Which attributes would
need to be added?  Are there other missing behaviors?
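
One rough way to compare a frozen module with a source module is to list which import-related attributes each one has.  This is a minimal sketch, not code from this issue: the helper name `module_attrs` and the attribute list are assumptions chosen for illustration, and the results will vary between Python versions.

```python
import importlib

# Attributes that path-based imports normally set.  This list is an
# illustrative assumption, not an exhaustive or official one.
ATTRS = ("__file__", "__cached__", "__path__", "__loader__", "__spec__")


def module_attrs(name):
    """Import `name` and return {attribute: value, or None if unset}."""
    module = importlib.import_module(name)
    return {attr: getattr(module, attr, None) for attr in ATTRS}


if __name__ == "__main__":
    # "_frozen_importlib" is frozen in CPython; "json" is normally
    # loaded from source.
    for name in ("_frozen_importlib", "json"):
        attrs = module_attrs(name)
        missing = [attr for attr, value in attrs.items() if value is None]
        print(name, "unset:", missing)
```

Note that a non-package source module also lacks `__path__`, so an unset attribute is not by itself a sign of a frozen-importer gap.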

> Also, when I last looked at the CPython source, the frozen importer performed
> a linear scan of its indexed C array performing strcmp() on each entry until
> it found what it was looking for. So adding hundreds of modules could result
> in sufficient overhead and justify using a more efficient lookup algorithm.
> (PyOxidizer uses Rust's HashMap to index modules by name.)

Yeah, we noticed this too.  I wasn't sure it was something to worry
about at first because we're not freezing the entire stdlib.  We're
freezing on the order of 10 modules, plus all of the encoding modules
(80+).  I figured we could look at an alternative to that linear
search afterward if it made sense.
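
To illustrate the trade-off, here is a minimal Python sketch of the two lookup strategies.  It is only a model: the real table is a C array inside CPython, and `FROZEN_TABLE`, `find_linear`, and `build_index` are hypothetical names with placeholder payloads.

```python
# Hypothetical stand-in for the C array of frozen modules: (name, payload).
FROZEN_TABLE = [
    ("_frozen_importlib", b"payload-0"),
    ("_frozen_importlib_external", b"payload-1"),
    ("zipimport", b"payload-2"),
]


def find_linear(table, name):
    """Compare names entry by entry, like a strcmp() loop: O(n) per lookup."""
    for entry_name, payload in table:
        if entry_name == name:
            return payload
    return None


def build_index(table):
    """Build a name -> payload mapping once; lookups are then O(1) on average."""
    return dict(table)


index = build_index(FROZEN_TABLE)
```

With roughly a hundred entries the linear scan is unlikely to matter much; the index becomes more attractive as the table grows toward the whole stdlib.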

> * Make sure you run unit tests against the frozen modules. If you don't do
> this, subtle differences in how the different importers behave will lead to
> problems.

We'll do what we already do with importlib: run the tests against both
the frozen and the source modules.  Thanks for the reminder to do this
though!

On Sat, Aug 28, 2021 at 5:53 PM Gregory Szorc <rep...@bugs.python.org> wrote:
> Oh, PyOxidizer also ran into more general issues with the frozen importer in
> that it broke various importlib APIs. e.g. because the frozen importer only
> supports bytecode, you can't use .__loader__.get_source() to obtain the
> source of a module. This makes tracebacks more opaque and breaks legitimate
> API consumers relying on these importlib interfaces.

Good point.  Supporting more of the FileLoader API on the frozen
loader is something to look into, at least for stdlib modules.
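
The get_source() gap can be seen directly on CPython.  The sketch below assumes a CPython build where `_frozen_importlib` is frozen; `has_source` is a hypothetical helper, not an importlib API.

```python
import importlib.util
from importlib.machinery import FrozenImporter


def has_source(name):
    """Return True if the module's loader can supply source text."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return False
    if spec is None or spec.loader is None:
        return False
    get_source = getattr(spec.loader, "get_source", None)
    if get_source is None:
        return False
    try:
        return get_source(name) is not None
    except (ImportError, OSError):
        return False


# The frozen importer only has bytecode, so it reports no source.
frozen_source = FrozenImporter.get_source("_frozen_importlib")

if __name__ == "__main__":
    print("frozen source:", frozen_source)
    print("json has source:", has_source("json"))
```

Tools such as traceback and inspect go through this kind of interface, which is why a loader that returns None leaves them without source lines to show.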

> The fundamental limitations with the frozen importer are why I implemented my
> own meta path importer (implemented in pure Rust), which is more fully
> featured, like the PathFinder importer that most people rely on today. That
> importer is available on PyPI (https://pypi.org/project/oxidized-importer/)
> and has its own API to facilitate PyOxidizer-like functionality
> (https://pyoxidizer.readthedocs.io/en/stable/oxidized_importer.html) if anyone
> wants to experiment with it.

Awesome!  I'll take a look.

On Sat, Aug 28, 2021 at 6:14 PM Guido van Rossum <rep...@bugs.python.org> wrote:
> I agree that we should shore up the frozen importer -- probably in a separate
> PR though.
> (@Eric: do you think this is worth its own bpo issue?)

Yeah.

-eric

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue45020>
_______________________________________