[issue23916] module importing performance regression
David Roundy added the comment:

Here is a little script to demonstrate the regression (which, yes, is still bothering me).

--
type:  -> performance
versions: +Python 3.5
Added file: http://bugs.python.org/file47016/test.py

___
Python tracker <http://bugs.python.org/issue23916>
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
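The attached test.py is not reproduced in the archive; the following is a hypothetical sketch of such a benchmark, assuming the approach described in the thread (a trivial module next to many unrelated files, with the import timed from a fresh interpreter). The names `mymod`, `main.py`, and the file count are illustrative, not taken from the attachment.

```python
import os
import subprocess
import sys
import tempfile
import time

tmp = tempfile.mkdtemp()

# A trivial module to import.
with open(os.path.join(tmp, "mymod.py"), "w") as f:
    f.write("x = 1\n")

# Many unrelated files in the same directory; the report used ~8 million,
# which would make this dramatic -- 10,000 keeps the demo quick.
for i in range(10_000):
    open(os.path.join(tmp, "junk%07d.txt" % i), "w").close()

# Run a fresh interpreter so the import (and any directory scan it
# triggers) is timed from cold.
main = os.path.join(tmp, "main.py")
with open(main, "w") as f:
    f.write("import mymod\nprint(mymod.x)\n")

start = time.perf_counter()
result = subprocess.run([sys.executable, main],
                        capture_output=True, text=True, check=True)
elapsed = time.perf_counter() - start
print("output: %s, elapsed: %.3fs" % (result.stdout.strip(), elapsed))
```

Scaling the file count up is what makes the 3.4-era behavior visible: the elapsed time grows with the number of sibling files, not with the work the script does.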
[issue23916] module importing performance regression
David Roundy added the comment:

My tests involved 8 million files on an ext4 file system. I expect that accounts for the difference. It's true that it's an excessive number of files, and maybe the best option is to ignore the problem.

On Sat, Apr 11, 2015 at 2:52 PM Antoine Pitrou wrote:
>
> Antoine Pitrou added the comment:
>
> As for your question:
>
> > If the script directory is normally the last one in the search path
> > couldn't you skip the listing of that directory without losing your
> > optimization?
>
> Given the way the code is architected, that would complicate things
> significantly. Also it would introduce a rather unexpected discrepancy.
>
> --
>
> ___
> Python tracker
> <http://bugs.python.org/issue23916>
> ___

--

___
Python tracker <http://bugs.python.org/issue23916>
___
[issue23916] module importing performance regression
David Roundy added the comment:

I had suspected that might be the case. At this point it's mostly just a test case where I generated a lot of files to demonstrate the issue. In my test case, hello world with one module import takes a minute and 40 seconds. I could make it take longer, of course, by creating more files. I do think scaling should be a consideration when introducing optimizations, even if getdents is usually pretty fast. If the script directory is normally the last one in the search path, couldn't you skip the listing of that directory without losing your optimization?

On Sat, Apr 11, 2015, 1:37 PM Antoine Pitrou wrote:
>
> Antoine Pitrou added the comment:
>
> This change is actually an optimization. The directory is only read once
> and its contents are then cached, which allows for much quicker imports
> when multiple modules are in the directory (common case of a Python
> package).
>
> Can you tell us more about your setup?
> - how many files are in the directory
> - what filesystem is used
> - whether the filesystem is local or remote (e.g. network-attached)
> - your OS and OS version
>
> Also, how long is "very slowly"?
>
> --
> nosy: +pitrou
>
> ___
> Python tracker
> <http://bugs.python.org/issue23916>
> ___

--

___
Python tracker <http://bugs.python.org/issue23916>
___
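The caching Antoine describes can be observed directly: a module created after a directory has already been scanned is not found until the cached listing is refreshed. A minimal sketch (the module name `later_mod` is illustrative):

```python
import importlib
import os
import sys
import tempfile

d = tempfile.mkdtemp()
sys.path.insert(0, d)

# A first failed import forces a scan of the (still empty) directory,
# populating the per-directory cache.
try:
    import later_mod
except ImportError:
    pass

# Create the module only after that first scan.
with open(os.path.join(d, "later_mod.py"), "w") as f:
    f.write("y = 42\n")

# Without this, the stale cached listing can hide the new file (the
# finder only rescans when the directory mtime is seen to change).
importlib.invalidate_caches()
import later_mod
print(later_mod.y)
```

This is the trade-off at the heart of the issue: one listing amortized across all imports from a package directory is a win in the common case, but the single listing is what becomes expensive when the directory holds millions of entries.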
[issue23916] module importing performance regression
New submission from David Roundy:

I have observed a performance regression in module importing. In Python 3.4.2, importing a module from the current directory (where the script is located) causes the entire directory to be read. When there are many files in this directory, this can cause the script to run very slowly. In Python 2.7.9, this behavior is not present. It would be preferable (in my opinion) to revert the change that causes Python to read the entire script directory.

--
messages: 240491
nosy: daveroundy
priority: normal
severity: normal
status: open
title: module importing performance regression
versions: Python 3.4

___
Python tracker <http://bugs.python.org/issue23916>
___
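The per-directory listing described in the report is held by the finder objects that the import machinery caches per path entry; they can be inspected via `sys.path_importer_cache`. A small sketch, assuming CPython's standard path-based import machinery (the module name `demo_mod` is illustrative):

```python
import os
import sys
import tempfile

d = tempfile.mkdtemp()
with open(os.path.join(d, "demo_mod.py"), "w") as f:
    f.write("z = 3\n")

sys.path.insert(0, d)
import demo_mod

# The path entry now maps to a finder object that holds the cached
# directory contents for subsequent imports.
finder = sys.path_importer_cache.get(d)
print(type(finder).__name__)
```

In CPython 3.x this finder is an `importlib` `FileFinder`, whose first lookup in a directory lists it in full, which is the behavior the report traces the slowdown to.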