Anyone?

Gustavo said:
> Hello, everyone.
> 
> I've checked the new collections.Counter class and I think I've found a bug:
> > >>> from collections import Counter
> > >>> c1 = Counter([1, 2, 1, 3, 2])
> > >>> c2 = Counter([1, 1, 2, 2, 3])
> > >>> c3 = Counter([1, 1, 2, 3])
> > >>> c1 == c2 and c3 not in (c1, c2)
> > True
> > >>> # Perfect, so far. But... There's always a "but":
> > ...
> > >>> len(c1)
> > 3
> 
> The length of a Counter is the number of unique elements. But the length
> must be the cardinality, and the cardinality of a multiset is the total
> number of elements (including duplicates) [1] [2]. The source code
> mentions that the recipe on ActiveState [3] was one of the references, but
> that recipe has this right.
> 
> Also, why is it indexed? Indexing a multiset calls to mind the
> position of its elements, but sets have no such thing. I think this
> is inconsistent with the built-in set. I would have exposed the
> multiplicity function as a method instead of through indexing:
>     c1.get_multiplicity(element)
>     # instead of
>     c1[element]
> 
> Is this the intended behavior? If so, I'd like to propose a proper multiset
> implementation for the standard library (preferably called "Multiset";
> should I create a PEP?). If not, I can write a patch to fix it, although
> I'm afraid it'd be a backwards incompatible change.
> 
> Cheers,
> 
> [1] http://en.wikipedia.org/wiki/Multiset#Overview
> [2] http://preview.tinyurl.com/smalltalk-bag
> [3] http://code.activestate.com/recipes/259174/
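
To make the two points above concrete, here is a minimal sketch using the same c1 as in the quoted session. It contrasts what len() reports with the multiset cardinality, and wraps indexing in a function to show what the method-style lookup would look like. Note that get_multiplicity is a hypothetical helper written for illustration only; it is not part of collections.Counter.

```python
from collections import Counter

c1 = Counter([1, 2, 1, 3, 2])

# len() counts distinct elements only.
distinct = len(c1)

# Multiset cardinality: total number of elements, duplicates included.
cardinality = sum(c1.values())


def get_multiplicity(counter, element):
    """Hypothetical helper (not a Counter method): how many times
    `element` occurs in `counter`. Counter indexing returns 0 for
    elements that are absent, without inserting them."""
    return counter[element]


print(distinct, cardinality)      # 3 5
print(get_multiplicity(c1, 1))    # 2
print(get_multiplicity(c1, 42))   # 0
```

So the cardinality is already obtainable as sum(c1.values()); the open question in the message is only whether len() should be the one to report it.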
-- 
Gustavo Narea <xri://=Gustavo>.
| Tech blog: =Gustavo/(+blog)/tech  ~  About me: =Gustavo/about |
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev