On Mar 22, 1:11 am, [EMAIL PROTECTED] wrote: > On Mar 21, 4:17 am, Bryan Olson <[EMAIL PROTECTED]> wrote: > > > > > Arnaud Delobelle wrote: > > > Bryan Olson wrote: > > [...] > > >> Arnaud Delobelle offered a good Wikipedia link, and for more background > > >> look up "amortized analysis. > > > > Hrvoje Niksic provided the link :). > > > Oops, careless thread-following. Hrvoje Niksic it was. > > > > I still think two unrelated > > > things are being compared in this thread when people say that method A > > > (using dictionaries / sets) is O(n) and method B (sorting the list) > > > O(nlogn). > > > > Isn't it the case that: > > > > | Worst case | Average case > > > ---------|------------|------------- > > > Method A | O(n^2) | O(n) > > > Method B | O(nlogn) | O(nlogn) > > > > Which method is the best would then be determined by the distribution > > > of the hash values of your data, and whether you want to have a > > > guarantee the method will have a less-than-quadratic behaviour. > > > If we exclude the case where an adversary is choosing the keys, the > > chance of a seriously degenerate case in the hashing method is so > > remote that we do should not worry about it. Those who insist on > > ranking by worst-case behavior have probably abandoned Python long > > ago, as it uses those hashed 'dict' things everywhere. Of course > > they've also abandoned OS's with demand-paged virtual memory and > > such. > > > -- > > --Bryan > > A collision sequence is not so rare. > > >>> [ hash( 2**i ) for i in range( 0, 256, 32 ) ] > > [1, 1, 1, 1, 1, 1, 1, 1]
Bryan did qualify his remarks: "If we exclude the case where an adversary is choosing the keys, ..." -- http://mail.python.org/mailman/listinfo/python-list
