At 01:02 PM 7/29/2009 +1200, Greg Ewing wrote:
>P.J. Eby wrote:
>>So the optimum performance tradeoff depends on how many imports you
>>have *and* how many eggs you have on sys.path. If you have lots of
>>eggs and few imports, unzipped ones will probably be faster. If
>>you have lots of eggs and *lots* of imports, zipped ones will
>>probably be faster.
>
>I'm wondering whether something could be gained by
>caching the results of sys.path lookups somehow
>between interpreter invocations.
>
>Most of the time the contents of the directories
>on one's PYTHONPATH don't change, so doing all this
>statting and directory reading every time an
>interpreter starts up seems rather suboptimal.

The catch is that then you need some way to know whether your cache
information is wrong/out-of-date. I suppose, though, that you could
do something like make a file that contains stat times, such that
modifying the contained directory would automatically invalidate the
cache info.
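
A rough sketch of that stat-time idea, in case it helps make it concrete.
The names snapshot() and is_fresh() are made up for illustration and are
not part of any existing module. The sketch assumes that a directory's
mtime changes whenever an entry is added, removed or renamed, which holds
on common filesystems but can be defeated by coarse timestamp
granularity. It says nothing about edits to the contents of existing
files, which do not touch the directory's mtime.

```python
import os

def snapshot(directory):
    """Record the directory's mtime together with its current listing.

    The stat() happens before the listdir(), so a change that sneaks in
    between the two calls makes the snapshot look stale later instead of
    silently wrong.
    """
    st = os.stat(directory)
    return {"mtime_ns": st.st_mtime_ns, "names": sorted(os.listdir(directory))}

def is_fresh(directory, cached):
    """Return True if the directory's mtime still matches the snapshot."""
    try:
        return os.stat(directory).st_mtime_ns == cached["mtime_ns"]
    except OSError:
        # Directory vanished or is unreadable: treat the cache as invalid.
        return False
```

The snapshot is a plain dict, so it could be written out with json between
interpreter runs. Checking it still costs one stat() per directory at
startup, but no directory reads when nothing has changed.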
However, you'd probably gain more by making the core import logic
simply use the dircache module (or a C equivalent thereof) in place
of stat() calls. This would drop the per-import stat() count for
each directory to 1 (in place of several for .py, .pyc, .pyd/.so,
/__init__.py, etc.), at the cost of an initial listdir() call the
first time a directory is used. This would give normal imports most
of the speedup benefit that e.g. putting the stdlib in a zipfile does.
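
For illustration only, here is a minimal sketch of what a dircache-style
lookup might look like. It is not the interpreter's actual import logic.
SUFFIXES is a shortened, assumed list, and listing() and find_module() are
hypothetical names. Each lookup does one stat() on the directory and
re-reads it only if the mtime has changed, then tests candidate filenames
against the in-memory listing. Finding a package still costs one extra
check for its __init__.py.

```python
import os

# Illustrative suffixes only; the real interpreter checks more than these.
SUFFIXES = (".py", ".pyc", ".so", ".pyd")

_cache = {}  # directory -> (mtime_ns, frozenset of entry names)

def listing(directory):
    """Return the directory's entry names, re-reading only when mtime changes."""
    try:
        mtime = os.stat(directory).st_mtime_ns
        cached = _cache.get(directory)
        if cached is None or cached[0] != mtime:
            cached = (mtime, frozenset(os.listdir(directory)))
            _cache[directory] = cached
        return cached[1]
    except OSError:
        # Missing path entry, or not a directory at all (e.g. a zipped egg).
        _cache.pop(directory, None)
        return frozenset()

def find_module(name, path):
    """Return the first file on `path` that could satisfy `import name`."""
    for directory in path:
        names = listing(directory)
        if name in names:
            # Possible package: one extra check for its __init__.py.
            init = os.path.join(directory, name, "__init__.py")
            if os.path.isfile(init):
                return init
        for suffix in SUFFIXES:
            if name + suffix in names:
                return os.path.join(directory, name + suffix)
    return None
```

Note that the membership tests are case-sensitive, so this behaves
differently from plain stat() calls on case-insensitive filesystems. A
real implementation would have to decide how to handle that.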
_______________________________________________
Distutils-SIG maillist - Distutils-SIG@python.org
http://mail.python.org/mailman/listinfo/distutils-sig