Dino Viehland <[email protected]> added the comment:
The 20MB of savings is actually the amount of byte code that exists in the IG
code base. I was only measuring the web site code, not the other various
Python code in the process (e.g. no std lib code, no 3rd party libraries,
etc.). The IG code base is pretty monolithic, and starting up the site
requires about half of the code to get imported, so I think 20MB per
process is a pretty realistic number.
I've also created a C extension and the object implementing the buffer protocol
looks like:
typedef struct {
    PyObject_HEAD
    const char *data;
    size_t size;
    Py_ssize_t hash;
    CIceBreaker *breaker;
    size_t exports;
    PyObject *code_obj; /* borrowed reference, the code object keeps us alive */
} CIceBreakerCode;
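For context, the per-object cost of that struct on a common 64-bit build can be tallied in a few lines (a sketch; the field widths are assumptions about a typical LP64 CPython build, not measurements):

```python
# Assumed field widths on a 64-bit CPython build (LP64):
POINTER = SIZE_T = PY_SSIZE_T = 8
PyObject_HEAD = 16                      # ob_refcnt + ob_type

CICEBREAKER_CODE_SIZE = (
    PyObject_HEAD
    + POINTER                           # const char *data
    + SIZE_T                            # size_t size
    + PY_SSIZE_T                        # Py_ssize_t hash
    + POINTER                           # CIceBreaker *breaker
    + SIZE_T                            # size_t exports
    + POINTER                           # PyObject *code_obj (borrowed)
)
print(CICEBREAKER_CODE_SIZE)            # → 64 bytes per function
```

So each function pays roughly 64 bytes of object overhead under these assumptions, which is what makes the break-even point below so low.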
All of the modules currently get compiled into a single memory-mapped
file, and then these objects get created, implementing the buffer protocol
for each function. The per-object overhead is small enough that it breaks
even once a function's byte code reaches 16 opcodes, so it is significantly
lighter weight than using a memoryview object.
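The mapping-plus-views scheme can be sketched in pure Python with the stdlib mmap module (the file contents, offsets, and three-function layout are invented for illustration; the real per-function objects are the C struct above, not memoryviews):

```python
import mmap, os, sys, tempfile

# All byte code concatenated into one memory-mapped file, with a
# zero-copy view per function.  Contents and offsets are made up.
FUNC_LEN = 32                            # 16 opcodes * 2 bytes of wordcode
blob = bytes(range(FUNC_LEN)) * 3        # stand-in for three functions

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(blob)
    path = f.name

with open(path, "rb") as f, \
     mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
    whole = memoryview(m)
    # One view per function: the role CIceBreakerCode plays in C, except
    # each memoryview costs noticeably more than the 64-byte struct.
    funcs = [whole[i * FUNC_LEN:(i + 1) * FUNC_LEN] for i in range(3)]
    mv_size = sys.getsizeof(funcs[0])
    assert bytes(funcs[1]) == bytes(range(FUNC_LEN))  # reads hit the mapping
    for v in funcs:
        v.release()
    whole.release()

os.unlink(path)
```

Since a memoryview object is well over 64 bytes on a 64-bit build, a dedicated struct that only carries a pointer and a length wins once the shared byte code is more than a handful of opcodes.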
It's certainly true that the byte code isn't the #1 source of memory here (the
code objects themselves are pretty big), but it ends up representing 25% of the
serialized data. I would expect that once you add in ref counts and typing
information it's not quite as good, but reducing the overhead of code by 20% is
still a pretty nice win.
I can't make any promises about open sourcing the import system, but I can
certainly look into that as well.
----------
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue36839>
_______________________________________