New submission from Elijah Rippeth <[email protected]>:
I have a directory with hundreds of thousands of text files. I wanted to
explore one file, so I wrote the following code expecting it to happen
basically instantaneously because of how generators work:
```python
from pathlib import Path
base_dir = Path("/path/to/lotta/files/")
files = base_dir.glob("*.txt")  # returns immediately
first_file = next(files)  # does not return immediately
```
To my surprise, the `next` call took a long time to finish. I expected it to
be O(1), since `glob` returns a generator.
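To illustrate why the first `next` can still be O(n), here is a minimal,
self-contained sketch. The names `counting_source`, `eager_gen` and `lazy_gen`
are hypothetical and have nothing to do with pathlib itself; they only show
that a generator which builds a list internally does all of its work before
the first item is yielded.

```python
from typing import Iterable, Iterator, List


def counting_source(n: int, log: List[int]) -> Iterator[int]:
    """Yield 0..n-1 and record each item in `log` as it is produced."""
    for i in range(n):
        log.append(i)
        yield i


def eager_gen(source: Iterable[int]) -> Iterator[int]:
    """Still a generator, but it consumes the whole source up front."""
    items = list(source)  # all the work happens here, on the first next()
    for item in items:
        yield item


def lazy_gen(source: Iterable[int]) -> Iterator[int]:
    """Pull one item from the source per next() call."""
    for item in source:
        yield item


eager_log: List[int] = []
lazy_log: List[int] = []

first_eager = next(eager_gen(counting_source(1000, eager_log)))
first_lazy = next(lazy_gen(counting_source(1000, lazy_log)))

# Both return the same first item, but the eager version has already
# produced every item from the source, while the lazy one produced one.
print(first_eager, len(eager_log))
print(first_lazy, len(lazy_log))
```

Both functions return generator objects immediately, so the cost only shows
up on the first `next` call, which matches what I observed with `glob`.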
A colleague pointed me to the following code:
https://github.com/python/cpython/blob/adcd2205565f91c6719f4141ab4e1da6d7086126/Lib/pathlib.py#L431
I assume the call to `list` there is meant to "freeze" a potentially changing
directory, since `scandir` relies on `os.stat`, but it incurs a huge penalty
and makes the generator return type a bit disingenuous. In any case, I think
this is bug-worthy in some sense.
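For comparison, a lazy lookup can be sketched directly on top of `os.scandir`
and `fnmatch`. This is only a possible workaround under stated assumptions,
not how pathlib implements `glob`: `iter_matching` is a hypothetical helper,
it handles a single non-recursive pattern matched against entry names, and it
makes no attempt to guard against the directory changing during iteration.

```python
import fnmatch
import os
from pathlib import Path
from typing import Iterator, Union


def iter_matching(base_dir: Union[str, Path], pattern: str) -> Iterator[Path]:
    """Hypothetical helper: lazily yield entries of base_dir matching pattern.

    Entries are read from os.scandir one at a time, so the first match is
    available without listing the whole directory first.
    """
    # The directory handle stays open until the generator is exhausted
    # or closed, because the `with` block spans the whole iteration.
    with os.scandir(base_dir) as entries:
        for entry in entries:
            if fnmatch.fnmatch(entry.name, pattern):
                yield Path(entry.path)
```

With this, `next(iter_matching(base_dir, "*.txt"), None)` stops at the first
matching entry. Since the iteration order of `os.scandir` is arbitrary, which
file comes back first is not guaranteed.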
----------
components: IO
messages: 393190
nosy: Elijah Rippeth
priority: normal
severity: normal
status: open
title: pathlib.Path.glob's generator is not a real generator
type: performance
versions: Python 3.6
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue44069>
_______________________________________