Rémi Lapeyre <remi.lape...@henki.fr> added the comment: Hi everybody, I opened PR 11781 to add a key argument to functions in the bisect module. I agree with @dmtr's points that this addition is not a bad design.
As far as I can tell, the key function is at called at most once per item as this example where an assertion would break shows: import bisect from collections import defaultdict class Test: def __init__(self, value): self.value = value cache = defaultdict(int) def key(e): cache[e] += 1 assert cache[e] <= 1 return e.value l = [Test(i) for i in range(10000)] bisect.bisect(l, Test(25), key=key) ➜ cpython git:(add-key-argument-to-bisect) ./python.exe Python 3.8.0a1+ (heads/add-key-argument-to-bisect:b7aaa1adad, Feb 7 2019, 17:33:24) [Clang 10.0.0 (clang-1000.10.44.4)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import bisect >>> from collections import defaultdict >>> >>> >>> class Test: ... def __init__(self, value): ... self.value = value ... >>> >>> cache = defaultdict(int) >>> >>> def key(e): ... cache[e] += 1 ... assert cache[e] <= 1 ... return e.value ... >>> >>> l = [Test(i) for i in range(10000)] >>> >>> bisect.bisect(l, Test(25), key=key) 26 This argument can be used where the objects are immutable and I have not been able to see changes in bisect speed using Victor Stinner perf module: (cpython-venv) ➜ cpython git:(add-key-argument-to-bisect) ✗ make distclean && ./configure --with-pydebug --with-openssl=$(brew --prefix openssl) && make -sj && python -m perf timeit --rigorous -s "from bisect import bisect" "bisect(range(1_000_000_000_000_000), 25)" -o $(git rev-parse --short HEAD).json (cpython-venv) ➜ cpython git:(add-key-argument-to-bisect) ✗ git checkout cd90f6a369 (cpython-venv) ➜ cpython git:(cd90f6a369) ✗ make distclean && ./configure --with-pydebug --with-openssl=$(brew --prefix openssl) && make -sj && python -m perf timeit --rigorous -s "from bisect import bisect" "bisect(range(1_000_000_000_000_000), 25)" -o $(git rev-parse --short HEAD).json (cpython-venv) ➜ cpython git:(cd90f6a369) ✗ python -m perf compare_to cd90f6a369.json b7aaa1adad.json Mean +- std dev: [cd90f6a369] 36.2 us +- 1.0 us -> [b7aaa1adad] 35.7 us +- 0.5 us: 1.01x faster (-1%) (cd90f6a369 was somtime faster than b7aaa1adad, sometime slower but they always were less than one std dev from one another) As I wrote in the discussion of the PR, I suspect the branch predictor to predict reliably the branching in the hot path (though I don't know much about that and would love some input). For the record, here is the performance when a key function is given: (cpython-venv) ➜ cpython git:(add-key-argument-to-bisect) ✗ python -m perf timeit --rigorous -s "from bisect import bisect" "bisect(range(1_000_000_000_000_000), 25, key=lambda e: e)" ......................................... Mean +- std dev: 59.3 us +- 1.0 us It seems to me that adding the key parameter is the best solution possible. ---------- _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue4356> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com