On Fri, May 7, 2021 at 8:14 PM Neil Schemenauer <nas-pyt...@arctrix.com> wrote:
>
> On 2021-05-07, Pablo Galindo Salgado wrote:
> > Technically the main concern may be the size of the unmarshalled
> > pyc files in memory, more than the storage size of disk.
>
> It would be cool if we could mmap the pyc files and have the VM run
> code without an unmarshal step. One idea is something similar to
> the Facebook "not another freeze" PR but with a twist. Their
> approach was to dump out code objects so they could be loaded as if
> they were statically defined structures.
>
> Instead, could we dump out the pyc data in a format similar to Cap'n
> Proto? That way no unmarshal is needed. The VM would have to be
> extensively changed to run code in that format. That's the hard
> part.
>
> The benefit would be faster startup times. The unmarshal step is
> costly. It would mostly solve the concern about these larger
> linenum/colnum tables. We would only load that data into memory if
> the table is accessed.

A simpler version would be to pack just the docstrings/lnotab/column
numbers into a separate part of the .pyc, and store a reference to the
file + offset to load them lazily on demand. No need for mmap. Could
also store them in memory, but with some cheap compression applied,
and decompress on access. None of these get accessed often.

-n

--
Nathaniel J. Smith -- https://vorpus.org

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/Q2DBRE5YKLTSPVCMUCXPEDXKFCA4UUGQ/
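
To make the two options in the reply concrete, here is a rough Python sketch. It is not the real .pyc layout and not anything CPython does today: the file format, `LazyBlob`, `CompressedBlob`, `write_demo_file` and `demo` are all made-up names for illustration. It only shows the shape of the idea: keep a (file, offset, length) reference and read on first access, or keep the bytes in memory compressed with zlib and decompress on access.

```python
import os
import tempfile
import zlib


class LazyBlob:
    """Hypothetical holder for rarely used data, stored as (path, offset, length)."""

    def __init__(self, path, offset, length):
        self.path = path
        self.offset = offset
        self.length = length
        self._cache = None  # filled on first access

    def get(self):
        # Read from disk only the first time the data is asked for.
        if self._cache is None:
            with open(self.path, "rb") as f:
                f.seek(self.offset)
                self._cache = f.read(self.length)
        return self._cache


class CompressedBlob:
    """Alternative: keep the bytes in memory, compressed, and decompress on access."""

    def __init__(self, data):
        self._packed = zlib.compress(data, 1)  # level 1 = cheapest compression

    def get(self):
        return zlib.decompress(self._packed)


def write_demo_file(path, code_bytes, side_tables):
    """Write code_bytes, then each side table; return {name: (offset, length)}.

    This is an invented layout for the demo, not the actual .pyc format.
    """
    index = {}
    with open(path, "wb") as f:
        f.write(code_bytes)
        for name, blob in side_tables.items():
            index[name] = (f.tell(), len(blob))
            f.write(blob)
    return index


def demo():
    tables = {
        "docstring": b"Return the answer.",
        "lnotab": bytes([2, 1, 4, 1, 6, 2]),
        "columns": bytes(range(32)) * 8,
    }
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.bin")
        index = write_demo_file(path, b"\x00" * 64, tables)
        lazy = {name: LazyBlob(path, off, n) for name, (off, n) in index.items()}
        # Nothing has been read back yet; each get() triggers one seek + read.
        loaded = {name: blob.get() for name, blob in lazy.items()}
    packed = CompressedBlob(tables["columns"])
    return loaded == tables, packed.get() == tables["columns"]
```

Calling `demo()` round-trips the three side tables through both variants. The trade-off is the one described in the reply: the offset version keeps nothing in memory until access but needs the file to still be there and unchanged, while the compressed version has no file dependency but still costs some memory per code object.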