[issue23916] module importing performance regression

2017-07-16 Thread Brett Cannon

Brett Cannon added the comment:

I agree with Antoine that this shouldn't change. Having said that, it
wouldn't be hard to write your own finder using importlib that doesn't get
the directory contents and instead checks for the file directly (and you
could even set it just for your troublesome directory to get the
performance benefit from the default finder).
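
A minimal sketch of such a finder follows, as one possible reading of the suggestion above. The names `DirectFileFinder` and `install_direct_finder` are hypothetical, and the sketch assumes plain `.py` modules only (no packages, no extension modules). It checks for the file directly rather than listing the directory, and its path hook declines every other directory so the default finder still handles those.

```python
import importlib.util
import os
import sys


class DirectFileFinder:
    """Hypothetical path entry finder: checks for '<name>.py' directly
    instead of reading the directory contents."""

    def __init__(self, path):
        self.path = path

    def find_spec(self, fullname, target=None):
        # Only the last dotted component names a file in this directory.
        tail = fullname.rpartition(".")[2]
        candidate = os.path.join(self.path, tail + ".py")
        if os.path.isfile(candidate):
            return importlib.util.spec_from_file_location(fullname, candidate)
        return None

    def invalidate_caches(self):
        # Nothing is cached, so there is nothing to invalidate.
        pass


def install_direct_finder(directory):
    """Use DirectFileFinder for `directory` only (hypothetical helper)."""
    directory = os.path.abspath(directory)

    def hook(path):
        # Raising ImportError tells the import system to try the next hook,
        # so every other path entry keeps the default finder.
        if os.path.abspath(path or os.getcwd()) != directory:
            raise ImportError("not handled by DirectFileFinder")
        return DirectFileFinder(directory)

    sys.path_hooks.insert(0, hook)
    # Drop finders that were already created for existing path entries.
    sys.path_importer_cache.clear()
```

Calling `install_direct_finder("/path/to/huge/dir")` early in the script, before importing anything from that directory, would be the intended usage; the path shown is a placeholder.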

On Sun, Jul 16, 2017, 05:25 Antoine Pitrou wrote:

>
> Antoine Pitrou added the comment:
>
> Thanks for the reproducer.  I haven't changed my mind on the resolution,
> as it is an extremely unlikely use case (a directory with 1e8 files is
> painful to manage with standard command-line tools).  I suggest you change
> your approach; for example, you could use a directory hashing scheme to
> spread the files into smaller subdirectories.

--

___
Python tracker
http://bugs.python.org/issue23916
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23916] module importing performance regression

2017-07-16 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Thanks for the reproducer.  I haven't changed my mind on the resolution, as it
is an extremely unlikely use case (a directory with 1e8 files is painful to
manage with standard command-line tools).  I suggest you change your approach;
for example, you could use a directory hashing scheme to spread the files into
smaller subdirectories.
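
One way to read "directory hashing scheme" is sketched below. The function name `hashed_path`, the choice of SHA-1, and the two-level, two-character layout are all assumptions for illustration; the comment above does not prescribe any of them.

```python
import hashlib
import os


def hashed_path(root, filename, levels=2, width=2):
    """Map a filename to root/ab/cd/filename, where 'abcd...' is the
    start of the hex digest of the filename (hypothetical layout)."""
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    # Take consecutive slices of the digest as nested directory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *parts, filename)
```

With the defaults this spreads files over 256 * 256 subdirectories, so 1e8 files would average roughly 1500 per directory. How those subdirectories are then made importable is left open here.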




[issue23916] module importing performance regression

2017-07-16 Thread David Roundy

David Roundy added the comment:

Here is a little script to demonstrate the regression (which yes, is still 
bothering me).
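
The attached test.py is not reproduced in this archive. The following is only a guess at what a comparable reproducer could look like, not the attached script: it fills a temporary directory with empty files, adds one small module and a script that imports it, and times a child interpreter running the script. All file names and the function name `time_import` are made up for this sketch.

```python
import os
import subprocess
import sys
import tempfile
import time


def time_import(num_files):
    """Time a child interpreter importing one module from a script
    directory that also holds num_files empty filler files."""
    with tempfile.TemporaryDirectory() as d:
        for i in range(num_files):
            open(os.path.join(d, "filler_%d.dat" % i), "w").close()
        with open(os.path.join(d, "hello.py"), "w") as f:
            f.write("GREETING = 'hello world'\n")
        with open(os.path.join(d, "main.py"), "w") as f:
            f.write("import hello\nprint(hello.GREETING)\n")
        start = time.perf_counter()
        out = subprocess.run([sys.executable, "main.py"], cwd=d,
                             stdout=subprocess.PIPE, check=True)
        elapsed = time.perf_counter() - start
    # The elapsed time includes interpreter startup, not just the import.
    return elapsed, out.stdout.decode().strip()
```

Comparing `time_import(0)` with `time_import(n)` for increasing `n` would show how the cost grows with directory size on a given filesystem.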

--
type:  -> performance
versions: +Python 3.5
Added file: http://bugs.python.org/file47016/test.py




[issue23916] module importing performance regression

2015-04-11 Thread David Roundy

David Roundy added the comment:

I had suspected that might be the case. At this point mostly it's just a
test case where I generated a lot of files to demonstrate the issue.  In my
test case hello world with one module import takes a minute and 40 seconds.
I could make it take longer, of course, by creating more files.

I do think scaling should be a consideration when introducing
optimizations, even if getdents is usually pretty fast.  If the script
directory is normally the last one in the search path, couldn't you skip the
listing of that directory without losing your optimization?

On Sat, Apr 11, 2015, 1:37 PM Antoine Pitrou rep...@bugs.python.org wrote:

> Antoine Pitrou added the comment:
>
> This change is actually an optimization. The directory is only read once
> and its contents are then cached, which allows for much quicker imports
> when multiple modules are in the directory (common case of a Python
> package).
>
> Can you tell us more about your setup?
> - how many files are in the directory
> - what filesystem is used
> - whether the filesystem is local or remote (e.g. network-attached)
> - your OS and OS version
>
> Also, how long is "very slowly"?






[issue23916] module importing performance regression

2015-04-11 Thread Antoine Pitrou

Antoine Pitrou added the comment:

I was asking questions because I wanted to have more precise data. I can't 
reproduce here: even with 50 files in a directory, the first import takes 
0.2s, not one minute.




[issue23916] module importing performance regression

2015-04-11 Thread Antoine Pitrou

Antoine Pitrou added the comment:

As for your question:

> If the script directory is normally the last one in the search path
> couldn't you skip the listing of that directory without losing your
> optimization?

Given the way the code is architected, that would complicate things 
significantly. Also it would introduce a rather unexpected discrepancy.




[issue23916] module importing performance regression

2015-04-11 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
nosy: +brett.cannon, eric.snow, ncoghlan




[issue23916] module importing performance regression

2015-04-11 Thread David Roundy

New submission from David Roundy:

I have observed a performance regression in module importing.  In Python 3.4.2,
importing a module from the current directory (where the script is located)
causes the entire directory to be read.  When there are many files in this
directory, this can cause the script to run very slowly.

In Python 2.7.9, this behavior is not present.

It would be preferable (in my opinion) to revert the change that causes Python
to read the entire user directory.

--
messages: 240491
nosy: daveroundy
priority: normal
severity: normal
status: open
title: module importing performance regression
versions: Python 3.4




[issue23916] module importing performance regression

2015-04-11 Thread Antoine Pitrou

Antoine Pitrou added the comment:

This change is actually an optimization. The directory is only read once and 
its contents are then cached, which allows for much quicker imports when 
multiple modules are in the directory (common case of a Python package).
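
The trade-off described above can be illustrated with a simplified sketch. This is not the importlib implementation; `CachedDirLookup` and `direct_lookup` are hypothetical names. The cached variant pays for one full directory listing and then answers every lookup from memory, while the direct variant costs one filesystem check per lookup regardless of directory size.

```python
import os


class CachedDirLookup:
    """List the directory once, then answer lookups from the cached set."""

    def __init__(self, path):
        self.path = path
        self.listings = 0   # how many times the directory was read
        self._names = None

    def has(self, filename):
        if self._names is None:
            # Cost grows with the number of entries in the directory.
            self._names = set(os.listdir(self.path))
            self.listings += 1
        return filename in self._names


def direct_lookup(path, filename):
    """Check for a single file without reading the directory."""
    return os.path.isfile(os.path.join(path, filename))
```

For a package with many modules the cached form wins after the first import. For a single import from a directory with millions of entries, the one listing dominates. This sketch never refreshes its cache, which a real finder would need to do when the directory changes.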

Can you tell us more about your setup?
- how many files are in the directory
- what filesystem is used
- whether the filesystem is local or remote (e.g. network-attached)
- your OS and OS version

Also, how long is "very slowly"?

--
nosy: +pitrou




[issue23916] module importing performance regression

2015-04-11 Thread David Roundy

David Roundy added the comment:

My tests involved 8 million files on an ext4 file system.  I expect that
accounts for the difference.  It's true that it's an excessive number of
files, and maybe the best option is to ignore the problem.

On Sat, Apr 11, 2015 at 2:52 PM Antoine Pitrou rep...@bugs.python.org wrote:

> Antoine Pitrou added the comment:
>
> As for your question:
>
> > If the script directory is normally the last one in the search path
> > couldn't you skip the listing of that directory without losing your
> > optimization?
>
> Given the way the code is architected, that would complicate things
> significantly. Also it would introduce a rather unexpected discrepancy.




[issue23916] module importing performance regression

2015-04-11 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Indeed, that doesn't sound like something we want to support. I'm closing then.

--
resolution:  -> wont fix
stage:  -> resolved
status: open -> closed
