On Sat, Dec 17, 2016 at 1:20 PM, <sohcahto...@gmail.com> wrote: > I thought this was curious behavior. I created a list of random-looking > strings, then made a sorted copy. I then found that using "in" to see if a > string exists in the sorted list took over 9 times as long! >
My numbers replicate yours (though my computer's faster). But my conclusion is different: Python 3.7.0a0 (default:ecd218c41cd4, Dec 16 2016, 03:08:47) [GCC 6.2.0 20161027] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import hashlib >>> import timeit >>> hashes = [hashlib.md5(bytes(str(i), "utf-8")).hexdigest() for i in >>> range(10000000)] >>> sortedhashes = sorted(hashes) >>> timeit.timeit('"x" in hashes', globals={'hashes': hashes}, number=10) 0.8167107938788831 >>> timeit.timeit('"x" in hashes', globals={'hashes': sortedhashes}, number=10) 5.029693723190576 >>> timeit.timeit('"x" in hashes', globals={'hashes': hashes}, number=10) 0.855183657258749 >>> timeit.timeit('"x" in hashes', globals={'hashes': sortedhashes}, number=10) 5.0585526106879115 >>> sethashes = set(hashes) >>> timeit.timeit('"x" in hashes', globals={'hashes': sethashes}, number=10) 6.13601878285408e-06 You want containment checks? Use a set :) ChrisA -- https://mail.python.org/mailman/listinfo/python-list