Martin v. Löwis wrote:

> A stat call will not only look at the directory entry, but also
> look at the inode. This will require another disk access, as the
> inode is at a different location of the disk.

That should count in favour of the directory-reading
approach, since e.g. to find out which, if any, of
x.py/x.pyc/x.pyo exists, you only need to look for
the names.

> It depends on the file system you are using. An NTFS directory
> lookup is a B-Tree search; ...

Yes, I know that some file systems are smarter;
MacOS HFS is another one that uses b-trees.

However, it still seems to me that looking up a
path in a file system is a much heavier operation
than looking up a Python dict, even if everything
is in memory. You have to parse the path and look
up each component separately in a different
directory tree or whatever.
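A rough micro-benchmark (illustrative only, numbers will vary by OS and file system) makes the gap concrete: even when the path is fully cached in memory, a stat still goes through the kernel, while a dict hit stays in the interpreter.

```python
# Compare the cost of a path lookup via os.stat with a plain dict
# lookup on the same key. 'path' is just any file known to exist.
import os
import timeit

path = os.__file__          # a real file on disk
cache = {path: path}        # pre-built in-memory mapping

stat_time = timeit.timeit(lambda: os.stat(path), number=10000)
dict_time = timeit.timeit(lambda: cache[path], number=10000)

print("os.stat: %.4fs  dict: %.4fs" % (stat_time, dict_time))
```

On typical systems the dict lookup comes out one to two orders of magnitude faster, since the stat has to cross the syscall boundary and walk each path component.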

The way I envisage it, you would read all the
directories and build a single dictionary mapping
fully-qualified module names to pathnames. Any
given import then takes at most one dict lookup
and one access of a known-to-exist file.

> For
> a large directory, the cost of reading in the entire directory
> might be higher than the savings gained from not having to
> search it.

Possibly. I guess we'd need some timings to assess
the meaning of "large".

> Also, if we do our own directory caching, the question
> is when to invalidate the cache.

I think I'd be happy with having to do that explicitly.
I expect the vast majority of Python programs don't
need to track changes to the set of importable modules
during execution. The exceptions would be things like
IDEs, and they could do a cache flush before reloading
a module, etc.
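Explicit invalidation could be as simple as a flush function that the rare cache-sensitive program (an IDE, a reloader) calls by hand; everyone else never pays for revalidation. A sketch, with names of my own invention:

```python
# Explicitly-invalidated cache: the mapping is rebuilt only when the
# program asks for it, e.g. just before reloading a module.
_module_cache = {}

def flush_module_cache():
    """Drop the cached name->path mapping; the next lookup rescans."""
    _module_cache.clear()

def find_module_path(name, scanner):
    """Return the cached path for `name`, calling `scanner()` to
    rebuild the whole mapping on a cache miss."""
    if not _module_cache:
        _module_cache.update(scanner())
    return _module_cache.get(name)
```

Here `scanner` stands in for whatever directory-reading step populates the cache; the point is only that rescanning happens on demand, never implicitly.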

--
Greg

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev