"jim stockford" <[EMAIL PROTECTED]> wrote > Why is a dict lookup constant time. I.e. if there's a > loop that walks a (shorter) list and compares each > element with each element of a dict, what's going > on to make this faster than an outer loop walking > a list and an inner loop walking a second list?
A dict is essentially random access, the interpreter can find the item in a single jump(*) whereas with a list it must cycle through the entire list. So an 'in' check with a dict is basically a single indexing operation. An 'in' check with a list is a pass through, on average, half the list. Against that has to be factored the overhead of converting the big list to a dict of course. Where one list is much shorter than the other a loop check of the lists may work out faster but in most cases the dict check should win out. Finally, the dict solution breaks down where there are multiple entries of the same value since a dict needs unique keys. If you had two lists with a high number of duplicate entries then it could give a false positive depending on your definition of how the comparison should work. But as in all performance related issues, don't optimise until you have a problem and use measurement to check results. (*) Actually hashing algorithms often wind up with a few entries against a hash node so it is a jump plus an iteration over a few elements. But it is only a few, on a thousand element dict you might find some nodes with two or three entries but not more. -- Alan Gauld Author of the Learn to Program web site http://www.freenetpages.co.uk/hp/alan.gauld _______________________________________________ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor