Dennis Sweeney <[email protected]> added the comment:
It seems to me that the code sprawl mostly comes from handling the four
combinations of keyed/unkeyed and forward/reverse separately; as far as I can
tell, collapsing them back together requires a branch in the innermost loop
unless each case is unrolled. I think recursive_merge.py is about as concise
as possible while still handling all four cases efficiently (a sketch of the
underlying idea is below).
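For reference, here is a minimal sketch of that idea (my paraphrase, not the
actual code from recursive_merge.py): pair the iterables up recursively and do
a plain two-way merge at each node, so the keyed/unkeyed and forward/reverse
variants only ever need to be unrolled inside the innermost two-way merge.

    def _merge2(a, b):
        # Plain two-way merge of two sorted iterators (no key, ascending);
        # the full version would unroll this generator into the four cases.
        try:
            x = next(a)
        except StopIteration:
            yield from b
            return
        try:
            y = next(b)
        except StopIteration:
            yield x
            yield from a
            return
        while True:
            if y < x:
                yield y
                try:
                    y = next(b)
                except StopIteration:
                    yield x
                    yield from a
                    return
            else:
                # Prefer the first iterator on ties, keeping the merge stable.
                yield x
                try:
                    x = next(a)
                except StopIteration:
                    yield y
                    yield from b
                    return

    def merge(*iterables):
        # Pair the iterables up recursively, so n inputs nest only
        # about log2(n) generators deep.
        n = len(iterables)
        if n == 0:
            return iter(())
        if n == 1:
            return iter(iterables[0])
        mid = n // 2
        return _merge2(merge(*iterables[:mid]), merge(*iterables[mid:]))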
Is there any issue with recursion in this application? Since the merge tree is
binary, its depth grows only logarithmically in the number of inputs: even
2**64-1 iterables would mean only 64 nested next() calls, which should be well
within the stack limit on most machines, no?
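As a quick sanity check on that arithmetic (assuming a balanced binary merge
tree):

    n = 2**64 - 1
    depth = (n - 1).bit_length()   # ceil(log2(n)) for n > 1
    print(depth)                   # -> 64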
I did the following benchmark (one command, wrapped here for readability):

    py -3.9 -m pyperf timeit -s "from [FILENAME] import merge; from collections import deque" "deque(merge(open('english.txt'), open('dutch.txt'), open('french.txt'), open('german.txt'), open('italian.txt')), maxlen=0)"
    new_merge.py:        Mean +- std dev: 773 ms +- 16 ms
    tournament_heap.py:  Mean +- std dev: 665 ms +- 42 ms
    losers.py:           Mean +- std dev: 470 ms +- 31 ms
    Existing heapq:      Mean +- std dev: 313 ms +- 2 ms
    recursive_merge.py:  Mean +- std dev: 266 ms +- 3 ms
I can look at some more diverse benchmarks soon (types, iterable length
imbalance, the lengths of an iterator's win-streaks, etc.); a sketch of one
such input generator is below.
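For example, one hypothetical way to build the length-imbalance case (the
names and sizes here are my own, not from any file in this thread):

    import random

    def imbalanced_inputs(seed=0):
        # A couple of long sorted runs plus many short ones, so one
        # input tends to win long streaks during the merge.
        rng = random.Random(seed)
        long_runs = [sorted(rng.random() for _ in range(100_000))
                     for _ in range(2)]
        short_runs = [sorted(rng.random() for _ in range(100))
                      for _ in range(50)]
        return long_runs + short_runs

    # deque(merge(*imbalanced_inputs()), maxlen=0)  # drain, for timing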
----------
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue38938>
_______________________________________