Eric Snow <ericsnowcurren...@gmail.com> added the comment:

On Fri, Aug 27, 2021 at 11:14 PM Larry Hastings <rep...@bugs.python.org> wrote:
> [snip]
> On the other hand: if we made a viable tool that could consume some arbitrary
> set of .py files and produce a C file, and said C file could then be compiled
> into a shared library, end users could enjoy this speedup over the subset of
> the standard library their program used, and perhaps even their own source
> tree(s).

Yeah, that would be interesting to investigate.

On Sat, Aug 28, 2021 at 5:17 AM Marc-Andre Lemburg <rep...@bugs.python.org> wrote:
> Eric's approach, as I understand it, is pretty much what PyRun does.
> [further details]

It's reassuring to hear that the approach is known to be viable. :)

> In fact, you save quite a bit of disk space compared to a full Python
> installation and additionally benefit from the memory mapping the OS does
> for sharing access to the marshal'ed byte code between processes.

That's a good point.

> That said, some things don't work with such an approach, e.g. a few packages
> include additional data files which they expect to find on disk. Since those
> are not available anymore, they fail.
>
> For PyRun I have patched some of those packages to include the data in form
> of Python modules instead, so that it gets frozen as well, e.g. the Python
> grammar files.

For stdlib modules it wouldn't be a big problem to set __file__ on frozen
modules. Would that be enough to solve the problem?

On Sat, Aug 28, 2021 at 5:41 PM Gregory Szorc <rep...@bugs.python.org> wrote:
> When I investigated freezing the standard library for PyOxidizer, I ran into
> a rash of problems. The frozen importer doesn't behave like PathFinder. It
> doesn't (didn't?) set some common module-level attributes

This is mostly fixable for stdlib modules. Which attributes would need to be
added? Are there other missing behaviors?

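One rough way to explore the attribute question is to list, for every frozen
module currently imported, which of the usual module-level attributes are
absent. This is only a sketch: the helper names (missing_attrs, is_frozen) and
the COMMON_ATTRS tuple are hypothetical and chosen for illustration, and the
check on the spec's origin assumes that CPython labels frozen modules with the
origin string "frozen".

```python
import sys
import types

# Attributes that a source-loaded module normally carries. This list is an
# assumption made for illustration; the thread does not enumerate them.
COMMON_ATTRS = ("__name__", "__spec__", "__loader__", "__file__", "__cached__")


def missing_attrs(module, names=COMMON_ATTRS):
    """Return the attribute names from `names` that `module` does not define."""
    return [name for name in names if not hasattr(module, name)]


def is_frozen(module):
    """Best-effort check: assumes frozen modules have spec.origin == "frozen"."""
    spec = getattr(module, "__spec__", None)
    return getattr(spec, "origin", None) == "frozen"


# Report on every frozen module that is currently imported.
for name, module in sorted(sys.modules.items()):
    if isinstance(module, types.ModuleType) and is_frozen(module):
        print(f"{name}: missing {missing_attrs(module)}")
```

The output depends on the Python version and build, so none is shown here.
Running the same report against a source-loaded module gives a baseline for
comparison.
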
> Also, when I last looked at the CPython source, the frozen importer performed
> a linear scan of its indexed C array performing strcmp() on each entry until
> it found what it was looking for. So adding hundreds of modules could result
> in sufficient overhead and justify using a more efficient lookup algorithm.
> (PyOxidizer uses Rust's HashMap to index modules by name.)

Yeah, we noticed this too. I wasn't sure it was something to worry about at
first because we're not freezing the entire stdlib. We're freezing on the
order of 10, plus all the (80+) encoding modules. I figured we could look at
an alternative to that linear search afterward if it made sense.

> * Make sure you run unit tests against the frozen modules. If you don't do
>   this, subtle differences in how the different importers behave will lead to
>   problems.

We'll do what we already do with importlib: run the tests against both the
frozen and the source modules. Thanks for the reminder to do this though!

On Sat, Aug 28, 2021 at 5:53 PM Gregory Szorc <rep...@bugs.python.org> wrote:
> Oh, PyOxidizer also ran into more general issues with the frozen importer in
> that it broke various importlib APIs. e.g. because the frozen importer only
> supports bytecode, you can't use .__loader__.get_source() to obtain the
> source of a module. This makes tracebacks more opaque and breaks legitimate
> API consumers relying on these importlib interfaces.

Good point. Supporting more of the FileLoader API on the frozen loader is
something to look into, at least for stdlib modules.

> The fundamental limitations with the frozen importer are why I implemented my
> own meta path importer (implemented in pure Rust), which is more fully
> featured, like the PathFinder importer that most people rely on today. That
> importer is available on PyPI (https://pypi.org/project/oxidized-importer/)
> and has its own API to facilitate PyOxidizer-like functionality
> (https://pyoxidizer.readthedocs.io/en/stable/oxidized_importer.html) if
> anyone wants to experiment with it.

Awesome! I'll take a look.

On Sat, Aug 28, 2021 at 6:14 PM Guido van Rossum <rep...@bugs.python.org> wrote:
> I agree that we should shore up the frozen importer -- probably in a separate
> PR though.
> (@Eric: do you think this is worth its own bpo issue?)

Yeah.

-eric

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue45020>
_______________________________________