New submission from Anthony Sottile <[email protected]>:
I did some profiling of running this script (a few files, including SVGs, are
attached):
```python
import io
import tokenize

# picked as the second longest file in cpython
with open('Lib/test/test_socket.py', 'rb') as f:
    bio = io.BytesIO(f.read())


def main():
    for _ in range(10):
        bio.seek(0)
        for _ in tokenize.tokenize(bio.readline):
            pass


if __name__ == '__main__':
    exit(main())
```
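For context on the attached `out.pstats`: a stats file like that is typically
produced with the stdlib profiler. A minimal sketch, assuming it replaces the
`exit(main())` call in the script above (the exact invocation is my
assumption, not from the report):

```python
import cProfile
import pstats

# run the benchmark under the profiler and dump raw stats to disk
cProfile.run('main()', 'out.pstats')

# inspect the hottest functions by cumulative time
stats = pstats.Stats('out.pstats')
stats.sort_stats('cumulative').print_stats(10)
```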
The first profile is from before the optimization; the second is from after.
The optimization takes the execution from ~6300 ms to ~4500 ms on my machine
(a ~29% reduction in run time, or equivalently a ~40% speedup, depending on
how you calculate it).
(I'll attach the pstats and SVGs as they're generated; it seems I can only
attach one file at a time.)
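For readers following along: the title points at repeated `re.compile(...)`
calls in `tokenize`'s inner loop. A minimal sketch of the caching technique
such a fix would use (illustrative only; the `_compile` helper and the sample
input below are mine, not the actual patch):

```python
import functools
import re


@functools.lru_cache(maxsize=None)
def _compile(expr):
    # compile each distinct pattern once; later calls hit the cache
    return re.compile(expr, re.UNICODE)


lines = ['spam = 1', 'eggs = 2'] * 1000  # stand-in for tokenizer input

# before: re.compile(pattern) inside the loop pays lookup cost every iteration
# after: the cached helper pays the compile cost only on the first call
for line in lines:
    _compile(r'\w+ = \d+').match(line)
```

Note that `re` keeps its own internal cache of compiled patterns, but the
per-call lookup and argument handling still show up in profiles, which is why
caching at the call site helps.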
----------
components: Library (Lib)
files: out.pstats
messages: 385572
nosy: Anthony Sottile
priority: normal
severity: normal
status: open
title: tokenize spends a lot of time in `re.compile(...)`
type: performance
versions: Python 3.10, Python 3.9
Added file: https://bugs.python.org/file49759/out.pstats
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue43014>
_______________________________________