[issue2672] speed of set.update(
John Arbash Meinel [EMAIL PROTECTED] added the comment:

Alexander Belopolsky wrote:

This has nothing to do with set.update; the difference is due to the time to set up the generator:

$ python -m timeit -s 'x = set(range(1)); y = []' 'x.update(y)'
100 loops, best of 3: 0.38 usec per loop
$ python -m timeit -s 'x = set(range(1)); y = (i for i in [])' 'x.update(y)'
100 loops, best of 3: 0.335 usec per loop

That is true, though if I just force a generator overhead:

% python -m timeit -s 'x = set(range(1)); y = []' 'x.update(y)'
100 loops, best of 3: 0.204 usec per loop
% python -m timeit -s 'x = set(range(1)); y = (i for i in [])' 'x.update(y)'
1000 loops, best of 3: 0.173 usec per loop
% python -m timeit -s 'x = set(range(1)); l = []' 'x.update(i for i in l)'
100 loops, best of 3: 0.662 usec per loop
python -m timeit -s 'x = set(range(1)); l = []; y = (i for i in l)' '(i for i in l); x.update(y)'
100 loops, best of 3: 1.87 usec per loop

So if you compare consuming a generator multiple times to creating it each time, it is 0.662 usec - 0.173 usec = 0.489 usec to create a generator. So why does:

(i for i in l); x.update(y)

take an additional 1.208 usec?

(I'm certainly willing to believe that set.update() is generator/list agnostic, but something weird is still happening.)

John

--
title: speed of set.update([]) -> speed of set.update(

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue2672
__
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue2672] speed of set.update(
Alexander Belopolsky [EMAIL PROTECTED] added the comment:

On Thu, Apr 24, 2008 at 2:23 PM, John Arbash Meinel [EMAIL PROTECTED] wrote:

..
So if you compare consuming a generator multiple times to creating it each time, it is 0.662 usec - 0.173 usec = 0.489 usec to create a generator. So why does:

(i for i in l); x.update(y)

take an additional 1.208 usec. (I'm certainly willing to believe that set.update() is generator/list agnostic, but something weird is still happening.)

I've seen a similar strangeness in timings:

$ python -m timeit '(i for i in [])'
10 loops, best of 3: 4.16 usec per loop

but

$ python -m timeit -s 'x = set()' 'x.update(i for i in [])'
100 loops, best of 3: 1.31 usec per loop

On the other hand,

$ python -m timeit -s 'x = []' 'x.extend(i for i in [])'
10 loops, best of 3: 4.54 usec per loop

How can x.update(i for i in []) take *less* time than simply creating a genexp?

Note that there are no apparent bytecode tricks here:

  1           0 LOAD_CONST               0 (<code object <genexpr> at 0xf7e88920, file "<stdin>", line 1>)
              3 MAKE_FUNCTION            0
              6 BUILD_LIST               0
              9 GET_ITER
             10 CALL_FUNCTION            1
             13 RETURN_VALUE

dis(lambda:x.update(i for i in []))
  1           0 LOAD_GLOBAL              0 (x)
              3 LOAD_ATTR                1 (update)
              6 LOAD_CONST               0 (<code object <genexpr> at 0xf7e88920, file "<stdin>", line 1>)
              9 MAKE_FUNCTION            0
             12 BUILD_LIST               0
             15 GET_ITER
             16 CALL_FUNCTION            1
             19 CALL_FUNCTION            1
             22 RETURN_VALUE
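The three measurements above can also be taken from a single script, which keeps them on the same interpreter and machine. This is only a sketch using the standard timeit module: the labels, the helper name best_usec, and the loop counts are assumptions chosen for illustration, and the numbers it prints will differ from those quoted in the thread.

```python
import timeit

# Statement pairs (setup, stmt) mirroring the three command lines above.
STATEMENTS = {
    "bare genexp": ("pass", "(i for i in [])"),
    "set.update(genexp)": ("x = set()", "x.update(i for i in [])"),
    "list.extend(genexp)": ("x = []", "x.extend(i for i in [])"),
}


def best_usec(setup, stmt, number=20000, repeat=3):
    """Return the best per-loop time in microseconds (hypothetical helper)."""
    timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(timings) / number * 1e6


results = {
    label: best_usec(setup, stmt)
    for label, (setup, stmt) in STATEMENTS.items()
}

for label, usec in results.items():
    print("%-22s %.3f usec per loop" % (label, usec))
```

Taking the minimum of several repeats follows what the timeit command line reports as "best of 3".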
[issue2672] speed of set.update(
Raymond Hettinger [EMAIL PROTECTED] added the comment:

John, when y=[], the update method has to create a new list iterator on each invocation. But when y is a genexp, it is self-iterable (in other words, iter(y) will return self, not a new object).

Also, when doing timings, it can be helpful to factor out the attribute lookup:

python -m timeit -s 'x=set(range(1)); y=[]; xu=x.update' 'xu(y)'
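The difference Raymond describes can be checked directly. The following is a minimal sketch; the variable names are illustrative and not taken from the thread. It also shows a consequence that matters for the benchmarks with y defined in the setup string: a generator built once is exhausted after its first use, so later update calls consume nothing.

```python
empty_list = []
gen = (i for i in empty_list)

# A generator is its own iterator: iter() hands back the same object.
gen_is_self_iterable = iter(gen) is gen

# A list builds a fresh list iterator on every iter() call.
first = iter(empty_list)
second = iter(empty_list)
list_iterators_are_distinct = first is not second

# A generator created once is exhausted after its first full pass.
x = set()
numbers = (i for i in [1, 2])
x.update(numbers)            # consumes the generator
size_after_first = len(x)
x.update(numbers)            # already exhausted, adds nothing
size_after_second = len(x)

print(gen_is_self_iterable, list_iterators_are_distinct)
print(size_after_first, size_after_second)
```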
[issue2672] speed of set.update(
John Arbash Meinel [EMAIL PROTECTED] added the comment:

Raymond Hettinger wrote:

John, when y=[], the update method has to create a new list iterator on each invocation. But when y is a genexp, it is self-iterable (in other words, iter(y) will return self, not a new object).

Also, when doing timings, it can be helpful to factor out the attribute lookup:

python -m timeit -s 'x=set(range(1)); y=[]; xu=x.update' 'xu(y)'

Sure, I wasn't surprised at the set.update(y) versus set.update([]).

What I was surprised at is the time for:

(i for i in [])

being about 4x longer than

set.update(i for i in [])

Anyway, the original issue is probably closed. Whether we want to track into the generator stuff or not is probably a different issue.

John
[issue2672] speed of set.update([])
Alexander Belopolsky [EMAIL PROTECTED] added the comment:

This has nothing to do with set.update; the difference is due to the time to set up the generator:

$ python -m timeit -s 'x = set(range(1)); y = []' 'x.update(y)'
100 loops, best of 3: 0.38 usec per loop
$ python -m timeit -s 'x = set(range(1)); y = (i for i in [])' 'x.update(y)'
100 loops, best of 3: 0.335 usec per loop

--
nosy: +belopolsky
[issue2672] speed of set.update([])
Raymond Hettinger [EMAIL PROTECTED] added the comment:

I concur. The source code for set_update() in Objects/setobject.c shows that both versions are accessed through the iterator protocol, so what you're seeing are the basic performance differences (start-up and per-item costs) for genexps vs list iterators.

--
resolution: -> invalid
status: open -> closed
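As a rough illustration of "accessed through the iterator protocol", here is a pure-Python analogue. It is not the C code in Objects/setobject.c, which may treat some input types differently; update_sketch is a hypothetical name used only to show that a list and a genexp are consumed the same way once iter() has been called.

```python
def update_sketch(target, iterable):
    """Add every item of `iterable` to the set `target` via iter()/next().

    Hypothetical stand-in for illustration, not CPython's implementation.
    """
    it = iter(iterable)          # new list iterator, or the genexp itself
    while True:
        try:
            item = next(it)      # per-item cost depends on the iterator type
        except StopIteration:
            break
        target.add(item)


s = {1}
update_sketch(s, [2, 3])                  # list: iter() builds a list iterator
update_sketch(s, (i for i in [4]))        # genexp: iter() returns the genexp
update_sketch(s, [])                      # empty input: only start-up cost
print(sorted(s))
```

With an empty input, as in the benchmarks, only the start-up cost of creating the iterable and its iterator remains to be measured.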
[issue2672] speed of set.update([])
New submission from John Arbash Meinel [EMAIL PROTECTED]:

I was performance profiling some of my own code, and I ran into something unexpected. Specifically, set.update(empty_generator_expression) was significantly slower than set.update(empty_list_expression).

I double-checked my findings with timeit.

With Python 2.4.3:

$ python -m timeit -s 'x = set(range(1))' 'x.update([])'
100 loops, best of 3: 0.296 usec per loop
$ python -m timeit -s 'x = set(range(1))' 'x.update(y for y in [])'
100 loops, best of 3: 0.837 usec per loop
$ python -m timeit -s 'x = set(range(1))' 'x.update([y for y in []])'
100 loops, best of 3: 0.462 usec per loop

With 2.5.1 (on a different machine):

$ python -m timeit -s 'x = set(range(1))' 'x.update([])'
100 loops, best of 3: 0.265 usec per loop
$ python -m timeit -s 'x = set(range(1))' 'x.update(y for y in [])'
100 loops, best of 3: 0.717 usec per loop
$ python -m timeit -s 'x = set(range(1))' 'x.update([y for y in []])'
100 loops, best of 3: 0.39 usec per loop

So generally, it is about 2x faster to create the empty list expression and pass it in than to use an empty generator.

--
components: Interpreter Core
messages: 65694
nosy: jameinel
severity: normal
status: open
title: speed of set.update([])
versions: Python 2.4, Python 2.5
[issue2672] speed of set.update([])
Changes by John Arbash Meinel [EMAIL PROTECTED]:

--
type: -> performance
[issue2672] speed of set.update([])
Changes by Raymond Hettinger [EMAIL PROTECTED]:

--
assignee: -> rhettinger
nosy: +rhettinger