Anyone?
Gustavo said:
> Hello, everyone.
>
> I've checked the new collections.Counter class and I think I've found a bug:
>
> >>> from collections import Counter
> >>> c1 = Counter([1, 2, 1, 3, 2])
> >>> c2 = Counter([1, 1, 2, 2, 3])
> >>> c3 = Counter([1, 1, 2, 3])
> >>> c1 == c2 and c3 not in (c1, c2)
> True
> >>> # Perfect, so far. But... There's always a "but":
> ...
> >>> len(c1)
> 3
>
> The length of a Counter is the amount of unique elements. But the length
> must be the cardinality, and the cardinality of a multiset is the total
> number of elements (including duplicates) [1] [2]. The source code
> mentions that the recipe on ActiveState [3] was one of the references, but
> that recipe has this right.
>
> Also, why is it indexed? The indexes of a multiset call to mind the
> position of its elements, but there's no such thing in sets. I think this
> is inconsistent with the built-in set. I would have implemented the
> multiplicity function as a method instead of the indexes:
>     c1.get_multiplicity(element)
>     # instead of
>     c1[element]
>
> Is this the intended behavior? If so, I'd like to propose a proper multiset
> implementation for the standard library (preferably called "Multiset";
> should I create a PEP?). If not, I can write a patch to fix it, although
> I'm afraid it'd be a backwards incompatible change.
>
> Cheers,
>
> [1] http://en.wikipedia.org/wiki/Multiset#Overview
> [2] http://preview.tinyurl.com/smalltalk-bag
> [3] http://code.activestate.com/recipes/259174/

-- 
Gustavo Narea <xri://=Gustavo>.
| Tech blog: =Gustavo/(+blog)/tech  ~  About me: =Gustavo/about |
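
To make the distinction in the quoted message concrete, here is a minimal sketch that relies only on documented collections.Counter behavior: len() reports the number of distinct elements, while the multiset cardinality can be obtained by summing the counts. The helper names cardinality() and multiplicity() are hypothetical, chosen for illustration; they are not part of the Counter API.

```python
from collections import Counter


def cardinality(counter):
    """Total number of elements, duplicates included (hypothetical helper)."""
    return sum(counter.values())


def multiplicity(counter, element):
    """How many times element occurs; 0 if it is absent (hypothetical helper)."""
    # Indexing a Counter with a missing key returns 0 and does not insert the key.
    return counter[element]


c1 = Counter([1, 2, 1, 3, 2])

distinct = len(c1)        # number of unique elements
total = cardinality(c1)   # number of elements, duplicates included

print(distinct, total, multiplicity(c1, 1), multiplicity(c1, 99))
```

With the c1 from the quoted session this prints 3 5 2 0, so the cardinality discussed above is available without changing what len() returns.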