Re: finding items that occur more than once in a list

2008-03-23 Thread Bryan Olson
Arnaud Delobelle wrote: Bryan Olson wrote: We mean that the party supplying the keys deliberately chose them to make the hash table inefficient. In this thread the goal is efficiency; a party working toward an opposing goal is an adversary. There are situations where this can happen I

Re: finding items that occur more than once in a list

2008-03-22 Thread castironpi
On Mar 21, 3:39 pm, John Machin [EMAIL PROTECTED] wrote: On Mar 22, 1:11 am, [EMAIL PROTECTED] wrote: A collision sequence is not so rare. [ hash( 2**i ) for i in range( 0, 256, 32 ) ] [1, 1, 1, 1, 1, 1, 1, 1] Bryan did qualify his remarks: If we exclude the case where an adversary is

Re: finding items that occur more than once in a list

2008-03-22 Thread Bryan Olson
[EMAIL PROTECTED] wrote: John Machin wrote: On Mar 22, 1:11 am, [EMAIL PROTECTED] wrote: A collision sequence is not so rare. [ hash( 2**i ) for i in range( 0, 256, 32 ) ] [1, 1, 1, 1, 1, 1, 1, 1] Bryan did qualify his remarks: If we exclude the case where an adversary is choosing the

Re: finding items that occur more than once in a list

2008-03-22 Thread Arnaud Delobelle
On Mar 22, 8:31 am, Bryan Olson [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] wrote: John Machin wrote: On Mar 22, 1:11 am, [EMAIL PROTECTED] wrote: A collision sequence is not so rare. [ hash( 2**i ) for i in range( 0, 256, 32 ) ] [1, 1, 1, 1, 1, 1, 1, 1] Bryan did qualify his remarks:

Re: finding items that occur more than once in a list

2008-03-22 Thread John Machin
On Mar 22, 9:58 pm, Arnaud Delobelle [EMAIL PROTECTED] wrote: Lastly, if one deals with a totally ordered set of object but they are not hashable (there isn't a good hash function), then Ninereeds' idea of sorting first is still useful. ... and if not totally ordered, then ... I'll just stand

Re: finding items that occur more than once in a list

2008-03-22 Thread sturlamolden
On 22 Mar, 09:31, Bryan Olson [EMAIL PROTECTED] wrote: Even a hash function that behaves as a random oracle has worst-case quadratic-time in the algorithm here In which case inserts are not amortized to O(1). -- http://mail.python.org/mailman/listinfo/python-list

Re: finding items that occur more than once in a list

2008-03-22 Thread castironpi
On Mar 22, 3:31 am, Bryan Olson [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] wrote: John Machin wrote: On Mar 22, 1:11 am, [EMAIL PROTECTED] wrote: A collision sequence is not so rare. [ hash( 2**i ) for i in range( 0, 256, 32 ) ] [1, 1, 1, 1, 1, 1, 1, 1] Bryan did qualify his remarks:

Re: finding items that occur more than once in a list

2008-03-21 Thread castironpi
On Mar 20, 2:07 am, John Machin [EMAIL PROTECTED] wrote: On Mar 20, 12:50 pm, Gabriel Genellina [EMAIL PROTECTED] wrote: En Wed, 19 Mar 2008 20:16:36 -0300, John Machin [EMAIL PROTECTED]   escribió: On Mar 20, 9:14 am, sturlamolden [EMAIL PROTECTED] wrote: Is a Python set implemented

Re: finding items that occur more than once in a list

2008-03-21 Thread Bryan Olson
Arnaud Delobelle wrote: Bryan Olson wrote: [...] Arnaud Delobelle offered a good Wikipedia link, and for more background look up amortized analysis. Hrvoje Niksic provided the link :). Oops, careless thread-following. Hrvoje Niksic it was. I still think two unrelated things are being

Re: finding items that occur more than once in a list

2008-03-21 Thread castironpi
On Mar 21, 4:17 am, Bryan Olson [EMAIL PROTECTED] wrote: Arnaud Delobelle wrote: Bryan Olson wrote: [...] Arnaud Delobelle offered a good Wikipedia link, and for more background look up amortized analysis. Hrvoje Niksic provided the link :). Oops, careless thread-following. Hrvoje

Re: finding items that occur more than once in a list

2008-03-21 Thread John Machin
On Mar 22, 1:11 am, [EMAIL PROTECTED] wrote: On Mar 21, 4:17 am, Bryan Olson [EMAIL PROTECTED] wrote: Arnaud Delobelle wrote: Bryan Olson wrote: [...] Arnaud Delobelle offered a good Wikipedia link, and for more background look up amortized analysis. Hrvoje Niksic provided the

Re: finding items that occur more than once in a list

2008-03-20 Thread John Machin
On Mar 20, 12:50 pm, Gabriel Genellina [EMAIL PROTECTED] wrote: En Wed, 19 Mar 2008 20:16:36 -0300, John Machin [EMAIL PROTECTED]   escribió: On Mar 20, 9:14 am, sturlamolden [EMAIL PROTECTED] wrote: Is a Python set implemented using a hash table? What don't you understand about the

Re: finding items that occur more than once in a list

2008-03-20 Thread Arnaud Delobelle
On Mar 19, 3:08 am, Bryan Olson [EMAIL PROTECTED] wrote: Arnaud Delobelle wrote: Ninereeds wrote: Hrvoje Niksic wrote: This doesn't apply to Python, which implements dict storage as an open-addressed table and automatically (and exponentially) grows the table when the number of entries

Re: finding items that occur more than once in a list

2008-03-19 Thread John Machin
On Mar 19, 10:08 am, sturlamolden [EMAIL PROTECTED] wrote: On 18 Mar, 23:45, Arnaud Delobelle [EMAIL PROTECTED] wrote: def nonunique(lst): slst = sorted(lst) dups = [s[0] for s in filter(lambda t : t[0] == t[1], zip(slst[:-1],slst[1:]))] return [dups[0]] + [s[1]

Re: finding items that occur more than once in a list

2008-03-19 Thread sturlamolden
On 19 Mar, 22:48, John Machin [EMAIL PROTECTED] wrote: I'd use Raymond Hettinger's solution. It is as much O(N) as Paul's, and is IMHO more readable than Paul's. Is a Python set implemented using a hash table? -- http://mail.python.org/mailman/listinfo/python-list

Re: finding items that occur more than once in a list

2008-03-19 Thread Justin Bozonier
On Mar 19, 2:48 pm, John Machin [EMAIL PROTECTED] wrote: On Mar 19, 10:08 am, sturlamolden [EMAIL PROTECTED] wrote: On 18 Mar, 23:45, Arnaud Delobelle [EMAIL PROTECTED] wrote: def nonunique(lst): slst = sorted(lst) dups = [s[0] for s in filter(lambda t : t[0]

Re: finding items that occur more than once in a list

2008-03-19 Thread John Machin
On Mar 20, 9:57 am, Justin Bozonier [EMAIL PROTECTED] wrote: On Mar 19, 2:48 pm, John Machin [EMAIL PROTECTED] wrote: On Mar 19, 10:08 am, sturlamolden [EMAIL PROTECTED] wrote: On 18 Mar, 23:45, Arnaud Delobelle [EMAIL PROTECTED] wrote: def nonunique(lst): slst =

Re: finding items that occur more than once in a list

2008-03-19 Thread John Machin
On Mar 20, 9:14 am, sturlamolden [EMAIL PROTECTED] wrote: On 19 Mar, 22:48, John Machin [EMAIL PROTECTED] wrote: I'd use Raymond Hettinger's solution. It is as much O(N) as Paul's, and is IMHO more readable than Paul's. Is a Python set implemented using a hash table? What don't you

Re: finding items that occur more than once in a list

2008-03-19 Thread sturlamolden
On 20 Mar, 00:16, John Machin [EMAIL PROTECTED] wrote: What don't you understand about the comments in the first two screenfuls of Objects/setobject.c? I had not looked at it, but now I have. Is seems Hettinger is the author :) Ok, so sets are implemented as hash tables. Then I agree, use

Re: finding items that occur more than once in a list

2008-03-19 Thread Gabriel Genellina
En Wed, 19 Mar 2008 20:16:36 -0300, John Machin [EMAIL PROTECTED] escribió: On Mar 20, 9:14 am, sturlamolden [EMAIL PROTECTED] wrote: Is a Python set implemented using a hash table? What don't you understand about the comments in the first two screenfuls of Objects/setobject.c? That

finding items that occur more than once in a list

2008-03-18 Thread Simon Forman
Is there a more efficient way to do this? def f(L): '''Return a set of the items that occur more than once in L.''' L = list(L) for item in set(L): L.remove(item) return set(L) | f([0, 0, 1, 1, 2, 2, 3]) set([0, 1, 2]) --

Re: finding items that occur more than once in a list

2008-03-18 Thread Chris
On Mar 18, 11:57 am, Simon Forman [EMAIL PROTECTED] wrote: Is there a more efficient way to do this? def f(L):     '''Return a set of the items that occur more than once in L.'''     L = list(L)     for item in set(L):         L.remove(item)     return set(L) | f([0, 0, 1, 1, 2, 2, 3])

Re: finding items that occur more than once in a list

2008-03-18 Thread Paul Rubin
Simon Forman [EMAIL PROTECTED] writes: Is there a more efficient way to do this? http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/502263 -- http://mail.python.org/mailman/listinfo/python-list

Re: finding items that occur more than once in a list

2008-03-18 Thread Bryan Olson
Simon Forman wrote: Is there a more efficient way to do this? def f(L): '''Return a set of the items that occur more than once in L.''' L = list(L) for item in set(L): L.remove(item) return set(L) That's neat, but quadratic time because list.remove() requires a

Re: finding items that occur more than once in a list

2008-03-18 Thread Ninereeds
Just to throw in one more alternative, if you sort your list, you only need to test adjacent items for equality rather than needing a search for each unique item found. You should get O(n log n) rather than O(n^2), since the performance bottleneck is now the sorting rather than the searching for

Re: finding items that occur more than once in a list

2008-03-18 Thread Hrvoje Niksic
Ninereeds [EMAIL PROTECTED] writes: The dictionary version Chris suggests (and the essentially equivalent set-based approach) is doing essentially the same thing in a way, but using hashing rather than ordering to organise the list and spot duplicates. This is *not* O(n) due to the rate of

Re: finding items that occur more than once in a list

2008-03-18 Thread Ninereeds
Hrvoje Niksic wrote: This doesn't apply to Python, which implements dict storage as an open-addressed table and automatically (and exponentially) grows the table when the number of entries approaches 2/3 of the table size. Assuming a good hash function, filling the dict should yield

Re: finding items that occur more than once in a list

2008-03-18 Thread Raymond Hettinger
On Mar 18, 2:57 am, Simon Forman [EMAIL PROTECTED] wrote: Is there a more efficient way to do this? def f(L): '''Return a set of the items that occur more than once in L.''' L = list(L) for item in set(L): L.remove(item) return set(L) | f([0, 0, 1, 1, 2, 2, 3])

Re: finding items that occur more than once in a list

2008-03-18 Thread Hrvoje Niksic
Ninereeds [EMAIL PROTECTED] writes: As for the growth pattern, each time you grow the table you have to redistribute all the items previously inserted to new locations. Resizes would get rarer as more items are added due to the exponential growth, but every table resize would take longer too

Re: finding items that occur more than once in a list

2008-03-18 Thread sturlamolden
On 18 Mar, 10:57, Simon Forman [EMAIL PROTECTED] wrote: def f(L): '''Return a set of the items that occur more than once in L.''' L = list(L) for item in set(L): L.remove(item) return set(L) def nonunique(lst): slst = sorted(lst) return list(set([s[0] for s

Re: finding items that occur more than once in a list

2008-03-18 Thread sturlamolden
On 18 Mar, 22:22, sturlamolden [EMAIL PROTECTED] wrote: def nonunique(lst): slst = sorted(lst) return list(set([s[0] for s in filter(lambda t : not(t[0]-t[1]), zip(slst[:-1],slst[1:]))])) Or perhaps better: def nonunique(lst): slst = sorted(lst) return list(set([s[0] for

Re: finding items that occur more than once in a list

2008-03-18 Thread Paul Rubin
sturlamolden [EMAIL PROTECTED] writes: def nonunique(lst): slst = sorted(lst) return list(set([s[0] for s in filter(lambda t : t[0] != t[1], zip(slst[:-1],slst[1:]))])) The items are all comparable and you're willing to take them out of order? from collections import defaultdict

Re: finding items that occur more than once in a list

2008-03-18 Thread sturlamolden
On 18 Mar, 22:25, sturlamolden [EMAIL PROTECTED] wrote: def nonunique(lst): slst = sorted(lst) return list(set([s[0] for s in filter(lambda t : t[0] != t[1], zip(slst[:-1],slst[1:]))])) Obviously that should be 'lambda t : t[0] == t[1]'. Instead of using the set function, there

Re: finding items that occur more than once in a list

2008-03-18 Thread Michael Mabin
How about using list comprehension? l1 = [apples,apples,bananas,oranges,oranges,peaches] s1 = set([x for x in l1 if l1.count(x) 1]) On Tue, Mar 18, 2008 at 4:56 PM, sturlamolden [EMAIL PROTECTED] wrote: On 18 Mar, 22:25, sturlamolden [EMAIL PROTECTED] wrote: def nonunique(lst): slst

Re: finding items that occur more than once in a list

2008-03-18 Thread Simon Forman
I love you guys, thanks! ;-) ~Simon -- http://mail.python.org/mailman/listinfo/python-list

Re: finding items that occur more than once in a list

2008-03-18 Thread Arnaud Delobelle
On Mar 18, 9:56 pm, sturlamolden [EMAIL PROTECTED] wrote: On 18 Mar, 22:25, sturlamolden [EMAIL PROTECTED] wrote: def nonunique(lst):    slst = sorted(lst)    return list(set([s[0] for s in        filter(lambda t : t[0] != t[1], zip(slst[:-1],slst[1:]))])) Obviously that should be

Re: finding items that occur more than once in a list

2008-03-18 Thread sturlamolden
On 18 Mar, 23:45, Arnaud Delobelle [EMAIL PROTECTED] wrote: def nonunique(lst): slst = sorted(lst) dups = [s[0] for s in filter(lambda t : t[0] == t[1], zip(slst[:-1],slst[1:]))] return [dups[0]] + [s[1] for s in filter(lambda t : t[0] != t[1],

Re: finding items that occur more than once in a list

2008-03-18 Thread Arnaud Delobelle
On Mar 18, 3:59 pm, Ninereeds [EMAIL PROTECTED] wrote: Hrvoje Niksic wrote: This doesn't apply to Python, which implements dict storage as an open-addressed table and automatically (and exponentially) grows the table when the number of entries approaches 2/3 of the table size. Assuming a

Re: finding items that occur more than once in a list

2008-03-18 Thread castironpi
On Mar 18, 6:38 pm, Arnaud Delobelle [EMAIL PROTECTED] wrote: On Mar 18, 3:59 pm, Ninereeds [EMAIL PROTECTED] wrote: Hrvoje Niksic wrote: This doesn't apply to Python, which implements dict storage as an open-addressed table and automatically (and exponentially) grows the table when

Re: finding items that occur more than once in a list

2008-03-18 Thread Bryan Olson
Arnaud Delobelle wrote: Ninereeds wrote: Hrvoje Niksic wrote: This doesn't apply to Python, which implements dict storage as an open-addressed table and automatically (and exponentially) grows the table when the number of entries approaches 2/3 of the table size. Assuming a good hash

Re: finding items that occur more than once in a list

2008-03-18 Thread castironpi
On Mar 18, 10:08 pm, Bryan Olson [EMAIL PROTECTED] wrote: Arnaud Delobelle wrote: Ninereeds wrote: Hrvoje Niksic wrote: This doesn't apply to Python, which implements dict storage as an open-addressed table and automatically (and exponentially) grows the table when the number of entries