On 2/6/2010 3:25 PM, Wolodja Wentland wrote:
On Sat, Feb 06, 2010 at 14:42 -0500, Terry Reedy wrote:
On 2/6/2010 2:09 PM, Wolodja Wentland wrote:
I think you can use itertools.groupby(L, lambda el: el[1]) to group
elements in your *sorted* list L by the value el[1] (i.e. the
identifier) and then iterate through these groups until you find one
that contains the desired number of instances.
This will generally not return the same result. It depends on
whether OP wants *any* item appearing at least 5 times or whether
the order is significant and the OP literally wants the first.
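To make the difference concrete, here is a minimal sketch. The data and the helper names (first_by_sorted_groupby, first_by_linear_scan) are made up for illustration, and k is set to 2 rather than 5 to keep the list short; neither function is code that was posted in this thread.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical (position, identifier) pairs, shaped like the OP's data.
items = [(1, 'b'), (2, 'a'), (3, 'b'), (4, 'a'), (5, 'a')]
k = 2  # illustrative threshold; the thread talks about 5


def first_by_sorted_groupby(items, k):
    """Sort by identifier, group, and return the first identifier with >= k items."""
    ordered = sorted(items, key=itemgetter(1))
    for identifier, group in groupby(ordered, key=itemgetter(1)):
        if len(list(group)) >= k:
            return identifier
    return None


def first_by_linear_scan(items, k):
    """Return the identifier that first reaches k occurrences in the original order."""
    counts = {}
    for _, identifier in items:
        counts[identifier] = counts.get(identifier, 0) + 1
        if counts[identifier] == k:
            return identifier
    return None


print(first_by_sorted_groupby(items, k))  # prints a
print(first_by_linear_scan(items, k))     # prints b
```

Both identifiers occur at least twice, so either answer satisfies "any item appearing at least k times", but only the scan reports the one that gets there first in the original order.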
Order is preserved by itertools.groupby - Have a look:
Sorting, however, does not preserve the original order.
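>>> from itertools import groupby
>>> instances = [(1, 'b'), (2, 'b'), (3, 'a'), (4, 'c'), (5, 'c'), (6, 'c'),
...              (7, 'b'), (8, 'b')]
>>> # Group consecutive elements that share the same identifier el[1].
>>> grouped_by_identifier = groupby(instances, lambda el: el[1])
>>> # Materialize each group so that its length can be checked.
>>> grouped_by_identifier = ((identifier, list(group))
...                          for identifier, group in grouped_by_identifier)
>>> # Keep only the groups that contain exactly 2 elements.
>>> k_instances = (group for identifier, group in grouped_by_identifier
...                if len(group) == 2)
>>> for group in k_instances:
...     print(group)
...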
[(1, 'b'), (2, 'b')]
[(7, 'b'), (8, 'b')]
So the first element yielded by the k_instances generator will be the
first group of elements from the original list whose identifier appears
exactly k times in a row (k = 2 in the session above).
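The same idea can be wrapped in a small function. This is only a sketch: the name first_run is hypothetical, and it relies on the fact that groupby, applied to an unsorted list, groups *consecutive* runs only, so it does not count total occurrences.

```python
from itertools import groupby
from operator import itemgetter


def first_run(instances, k):
    """Return the first run of exactly k consecutive items that share an
    identifier (el[1]), or None if there is no such run."""
    for identifier, group in groupby(instances, key=itemgetter(1)):
        run = list(group)
        if len(run) == k:
            return run
    return None


instances = [(1, 'b'), (2, 'b'), (3, 'a'), (4, 'c'), (5, 'c'), (6, 'c'),
             (7, 'b'), (8, 'b')]

print(first_run(instances, 2))  # [(1, 'b'), (2, 'b')]
print(first_run(instances, 3))  # [(4, 'c'), (5, 'c'), (6, 'c')]
print(first_run(instances, 4))  # None
```

Note the last call: 'b' occurs four times in total, but never four times in a row, so no run of length 4 is found.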
Sorting the entire list may also take a *lot* longer.
Than what?
Than linearly scanning for the first item that appears 5 times, as in my
corrected version of the original code.
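The corrected code referred to here is not reproduced in this message. As a rough sketch of what such a linear scan could look like (the function name first_k_times and the dict-of-lists bookkeeping are assumptions, not the code from the thread):

```python
def first_k_times(pairs, k):
    """Return the items of the first identifier to reach k occurrences,
    or None. Reading from `pairs` stops as soon as that happens."""
    seen = {}  # identifier -> items seen so far, in original order
    for item in pairs:
        bucket = seen.setdefault(item[1], [])
        bucket.append(item)
        if len(bucket) == k:
            return bucket
    return None


instances = [(1, 'b'), (2, 'b'), (3, 'a'), (4, 'c'), (5, 'c'), (6, 'c'),
             (7, 'b'), (8, 'b')]

# Pass an iterator so that the elements never examined remain visible.
stream = iter(instances)
result = first_k_times(stream, 3)
remaining = list(stream)

print(result)     # [(4, 'c'), (5, 'c'), (6, 'c')]
print(remaining)  # [(7, 'b'), (8, 'b')]
```

The scan makes a single pass and returns as soon as one identifier reaches k, whereas sorting has to process the whole list (O(n log n) comparisons in general) before any grouping can begin.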
Terry Jan Reedy