"jim stockford" <[EMAIL PROTECTED]> wrote 

> Why is a dict lookup constant time. I.e. if there's a
> loop that walks a (shorter) list and compares each
> element with each element of a dict, what's going
> on to make this faster than an outer loop walking
> a list and an inner loop walking a second list?

A dict is essentially random access, the interpreter can 
find the item in a single jump(*) whereas with a list it 
must cycle through the entire list. So an 'in' check
with a dict is basically a single indexing operation. 
An 'in' check with a list is a pass through, on average, 
half the list.

Against that has to be factored the overhead of 
converting the big list to a dict of course. Where 
one list is much shorter than the other a loop check 
of the lists may work out faster but in most cases 
the dict check should win out.

Finally, the dict solution breaks down where there 
are multiple entries of the same value since a dict 
needs unique keys. If you had two lists with a high 
number of duplicate entries then it could give a 
false positive depending on your definition of how 
the comparison should work.

But as in all performance related issues, don't
optimise until you have a problem and use measurement 
to check results. 

(*) Actually hashing algorithms often wind up with a 
few entries against a hash node so it is a jump plus an 
iteration over a few elements. But it is only a few, on a 
thousand element dict you might find some nodes 
with two or three entries but not more.

-- 
Alan Gauld
Author of the Learn to Program web site
http://www.freenetpages.co.uk/hp/alan.gauld

_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor

Reply via email to