At 01:02 PM 7/29/2009 +1200, Greg Ewing wrote:
P.J. Eby wrote:

So the optimum performance tradeoff depends on how many imports you have *and* how many eggs you have on sys.path. If you have lots of eggs and few imports, unzipped ones will probably be faster. If you have lots of eggs and *lots* of imports, zipped ones will probably be faster.

I'm wondering whether something could be gained by
caching the results of sys.path lookups somehow
between interpreter invocations.

Most of the time the contents of the directories
on one's PYTHONPATH don't change, so doing all this
statting and directory reading every time an
interpreter starts up seems rather suboptimal.

The catch is that you then need some way to know whether your cache information is wrong or out-of-date. I suppose, though, that you could write a file that records the stat times of each cached directory, so that modifying a directory would automatically invalidate its cache info.
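
As a rough illustration of that stat-time idea (not anything that exists in the import machinery), here is a minimal sketch. The function names snapshot() and is_fresh() are made up for the example, and it assumes a directory's mtime changes whenever an entry is added, removed, or renamed, which holds on common filesystems but only at the filesystem's timestamp resolution.

```python
import os

def snapshot(paths):
    """Record the modification time of each directory (None if missing)."""
    stamps = {}
    for path in paths:
        try:
            stamps[path] = os.stat(path).st_mtime_ns
        except OSError:
            # A missing or unreadable entry is remembered as None, so that
            # its later appearance also invalidates the cache.
            stamps[path] = None
    return stamps

def is_fresh(stamps):
    """Return True if no recorded directory has changed since snapshot()."""
    return snapshot(list(stamps)) == stamps
```

A cache file would store the dict returned by snapshot(sys.path) next to the cached lookup results, and the results would be trusted on the next startup only if is_fresh() still returns True. Note that this still costs one stat() per sys.path entry at startup.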

However, you'd probably gain more by making the core import logic simply use the dircache module (or a C equivalent thereof) in place of stat() calls. This would drop the per-import stat() count for each directory to 1 (in place of several for .py, .pyc, .pyd/.so, /__init__.py, etc.), at the cost of an initial listdir() call the first time a directory is used. This would give normal imports most of the speedup benefit that e.g. putting the stdlib in a zipfile does.

_______________________________________________
Distutils-SIG maillist  -  [email protected]
http://mail.python.org/mailman/listinfo/distutils-sig
