New submission from Terry J. Reedy <tjre...@udel.edu>:

https://metarabbit.wordpress.com/2018/02/05/pythons-weak-performance-matters/, a blog post on CPython speed, claims "deleting a set of 1 billion strings takes >12 hours". (No other details provided.)
I don't have the 100+ gigabytes of RAM needed to confirm this, but with an installed 64-bit 3.7.0b1 on Win10 with 12 gigabytes, I confirmed that there is a pronounced super-linear growth in string set deletion (unlike with an integer set). At least half of RAM was available.

      Seconds to create and delete sets
millions     integers         strings
of items  create  delete   create  delete
    1       .08     .02      .36     .08
    2       .15     .03      .75     .17
    4       .30     .06     1.55     .36
    8       .61     .12     3.18     .76
   16      1.22     .24     6.48    1.80   < slightly more than double
   32      2.4      .50    13.6     5.56   < more than triple
   64      4.9     1.04    28      19      < nearly quadruple
  128     10.9     2.25    <too large>
  100                      56      80      < quadruple with 1.5 x size

For 100 million strings, I got about the same 56 and 80 seconds when timing with a clock, without the timeit gc suppression. I interrupted the 128M string run after several minutes. Even if there is swapping to disk during creation, I would not expect it during deletion.

The timeit code:

import timeit

# Integer sets: time creation, then time 'del s' with creation moved into setup.
for i in (1,2,4,8,16,32,64,128):
    print(i, 'int')
    print(timeit.Timer(f's = {{n for n in range({i}*1000000)}}')
          .timeit(number=1))
    print(timeit.Timer('del s', f's = {{n for n in range({i}*1000000)}}')
          .timeit(number=1))

# String sets: same measurements, capped at 100 million items.
for i in (1,2,4,8,16,32,64,100):
    print(i, 'str')
    print(timeit.Timer(f's = {{str(n) for n in range({i}*1000000)}}')
          .timeit(number=1))
    print(timeit.Timer('del s', f's = {{str(n) for n in range({i}*1000000)}}')
          .timeit(number=1))

Raymond, I believe you monitor the set implementation, and I know Victor is interested in timing and performance.
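The growth annotations in the table can be checked by dividing each delete time by the time for half as many items; linear deletion would give a ratio near 2. The following is only a minimal sketch of that arithmetic. The timings are copied from the table above, and the names INT_DELETE, STR_DELETE and doubling_ratios are made up for this illustration. The 100M row is left out because it is not a doubling of any other row.

```python
# Delete timings in seconds, copied from the table above.
# Keys are set sizes in millions of items.
INT_DELETE = {1: .02, 2: .03, 4: .06, 8: .12, 16: .24, 32: .50, 64: 1.04, 128: 2.25}
STR_DELETE = {1: .08, 2: .17, 4: .36, 8: .76, 16: 1.80, 32: 5.56, 64: 19.0}


def doubling_ratios(timings):
    """Map each size to time(size) / time(size // 2), where both were measured."""
    return {
        n: timings[n] / timings[n // 2]
        for n in sorted(timings)
        if n // 2 in timings
    }


for label, timings in (("int", INT_DELETE), ("str", STR_DELETE)):
    for n, ratio in doubling_ratios(timings).items():
        # A ratio near 2 means delete time scales linearly with set size.
        print(f"{label} {n:>3}M: x{ratio:.2f}")
```

With these numbers the integer ratios stay close to 2 throughout, while the string ratios rise above 3 at 32M and 64M items.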
----------
messages: 312188
nosy: rhettinger, terry.reedy, vstinner
priority: normal
severity: normal
stage: needs patch
status: open
title: Deletion of large sets of strings is extra slow
type: performance
versions: Python 3.7, Python 3.8

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32846>
_______________________________________
_______________________________________________
Python-bugs-list mailing list