On Mar 21, 4:17 am, Bryan Olson <[EMAIL PROTECTED]> wrote: > Arnaud Delobelle wrote: > > Bryan Olson wrote: > [...] > >> Arnaud Delobelle offered a good Wikipedia link, and for more background > >> look up "amortized analysis. > > > Hrvoje Niksic provided the link :). > > Oops, careless thread-following. Hrvoje Niksic it was. > > > I still think two unrelated > > things are being compared in this thread when people say that method A > > (using dictionaries / sets) is O(n) and method B (sorting the list) > > O(nlogn). > > > Isn't it the case that: > > > | Worst case | Average case > > ---------|------------|------------- > > Method A | O(n^2) | O(n) > > Method B | O(nlogn) | O(nlogn) > > > Which method is the best would then be determined by the distribution > > of the hash values of your data, and whether you want to have a > > guarantee the method will have a less-than-quadratic behaviour. > > If we exclude the case where an adversary is choosing the keys, the > chance of a seriously degenerate case in the hashing method is so > remote that we do should not worry about it. Those who insist on > ranking by worst-case behavior have probably abandoned Python long > ago, as it uses those hashed 'dict' things everywhere. Of course > they've also abandoned OS's with demand-paged virtual memory and > such. > > -- > --Bryan
A collision sequence is not so rare. >>> [ hash( 2**i ) for i in range( 0, 256, 32 ) ] [1, 1, 1, 1, 1, 1, 1, 1] -- http://mail.python.org/mailman/listinfo/python-list