Re: Cache memory and its effect on list searching

Chris Angelico Fri, 16 Dec 2016 18:30:25 -0800

On Sat, Dec 17, 2016 at 1:20 PM,  <sohcahto...@gmail.com> wrote:
> I thought this was curious behavior.  I created a list of random-looking 
> strings, then made a sorted copy.  I then found that using "in" to see if a 
> string exists in the sorted list took over 9 times as long!
>


My numbers replicate yours (though my computer's faster). But my
conclusion is different:

Python 3.7.0a0 (default:ecd218c41cd4, Dec 16 2016, 03:08:47)
[GCC 6.2.0 20161027] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import hashlib
>>> import timeit
>>> hashes =  [hashlib.md5(bytes(str(i), "utf-8")).hexdigest() for i in 
>>> range(10000000)]
>>> sortedhashes = sorted(hashes)
>>> timeit.timeit('"x" in hashes', globals={'hashes': hashes}, number=10)
0.8167107938788831
>>> timeit.timeit('"x" in hashes', globals={'hashes': sortedhashes}, number=10)
5.029693723190576
>>> timeit.timeit('"x" in hashes', globals={'hashes': hashes}, number=10)
0.855183657258749
>>> timeit.timeit('"x" in hashes', globals={'hashes': sortedhashes}, number=10)
5.0585526106879115
>>> sethashes = set(hashes)
>>> timeit.timeit('"x" in hashes', globals={'hashes': sethashes}, number=10)
6.13601878285408e-06

You want containment checks? Use a set :)

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Cache memory and its effect on list searching

Reply via email to