> Any opinions?

I would use a different marshal implementation. Instead of defining
a stream format for marshal, make marshal dump its object graph
in its actual in-memory layout. On load, copying can
be avoided; just a few pointers need to be updated. The resulting
marshal files would be platform-specific (wrt. endianness and pointer
width).

On marshaling, you copy all objects into a contiguous block
of memory (8-aligned), and dump that. On unmarshaling, you just
map that block. If the target supports true memory mapping with
page boundaries, you might be able to pack multiple .pyc files
into a single page. This reformatting could also be done offline.
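
For illustration only (none of these names exist in CPython), the
dumped block could start with a small header along these lines, so
the loader can check the platform tags and learn which address the
embedded pointers were laid out for:

#include <stdint.h>

/* hypothetical header of the dumped block; the object bodies follow,
   copied verbatim from memory and 8-aligned */
typedef struct {
    uint32_t magic;           /* format version plus byte-order tag */
    uint32_t pointer_width;   /* 4 or 8; lets the loader reject foreign files */
    uint64_t preferred_base;  /* address the embedded pointers assume */
    uint64_t block_size;      /* total size of the object block */
} block_header;

On loading, you mmap (or read) the file, compare the actual address
with preferred_base, and only run the fix-ups discussed below when
they differ.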

A few things need to be considered:
- compatibility. The original marshal code would probably
  need to be preserved for the "marshal" module.
- relative pointers. Code objects, tuples, etc. contain
  pointers. Assuming the marshaled object cannot be loaded
  back at the same address, you need to adjust those
  pointers. A common trick is to put the desired load
  address into the memory block, then try to map the block
  at that address. If the address is already taken, load
  it at a different address and walk through all objects,
  adjusting pointers (see the sketch after this list).
- type references. On loading, you will need to patch all
  ob_type fields. Store the marshal code in the ob_type
  field when marshalling, then switch on it when
  unmarshalling; this is also covered by the sketch below.
- references to interned strings. On loading, you can
  either intern them all, or use a "fast interning"
  scheme that assigns each interned string a fixed
  number from a table.
- reference counting. Make sure all these objects start
  out with a reference count of 1, so they will never
  become garbage.
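
A rough sketch of the fix-up pass for the first two points (purely
hypothetical names, and it assumes the dumper also wrote two offset
tables recording where the embedded pointers and the ob_type slots
live inside the block):

#include <stddef.h>
#include <stdint.h>

/* stand-in for the real lookup from marshal code to type object */
static void *type_for_code(uintptr_t code)
{
    static void *table[256];          /* filled in at interpreter startup */
    return table[code & 0xff];
}

static void fix_up(char *base, char *preferred_base,
                   const size_t *ptr_offsets, size_t n_ptrs,
                   const size_t *type_offsets, size_t n_types)
{
    ptrdiff_t delta = base - preferred_base;

    /* relative pointers: only needed if the block could not be
       mapped at the address it was dumped for */
    if (delta != 0) {
        for (size_t i = 0; i < n_ptrs; i++) {
            char **slot = (char **)(base + ptr_offsets[i]);
            *slot += delta;
        }
    }

    /* type references: each ob_type slot holds a marshal code;
       replace it with the address of the real type object */
    for (size_t i = 0; i < n_types; i++) {
        uintptr_t *slot = (uintptr_t *)(base + type_offsets[i]);
        *slot = (uintptr_t)type_for_code(*slot);
    }
}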

If you use a container file for multiple .pyc files,
you can have additional savings by sharing strings
across modules; this should help in particular for
references to builtin symbols, and for common method
names. Fixed interning might then become unnecessary, as
the single shared string object in the container will
either become the interned string itself, or point to
it after being interned once.
With such a container system, unmarshalling should be
lazy; e.g. for each object, the value of ob_type can
be used to determine whether the object has already
been unmarshalled.
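
For instance (again only a hypothetical sketch, not existing code):
if an unmarshalled object carries a real type pointer in its ob_type
slot while an untouched one still carries a small marshal code, the
test is a single comparison:

#include <stdint.h>

#define MAX_MARSHAL_CODE 255   /* assumption: marshal codes are small integers */

/* hypothetical layout: the word used as ob_type either holds a small
   marshal code (not yet unmarshalled) or a real type pointer; this
   relies on no type object living in the first page of memory */
typedef struct {
    uintptr_t ob_type_or_code;
} lazy_object;

static int already_unmarshalled(const lazy_object *obj)
{
    return obj->ob_type_or_code > MAX_MARSHAL_CODE;
}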

Of course, you still have the actual interpretation of
the top-level module code - if it's this part rather than
the unmarshalling that actually costs the time, this
efficient marshalling scheme won't help. It would be
interesting to find out which modules have a particularly
high startup cost - perhaps they can be rewritten.

Regards,
Martin