Thanks for the reply. I will add the note about the ordering of ties in
most_common to the documentation, and I will work on a use case and a
recipe for the n highest frequencies problem.
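As a starting point, here is a rough sketch of what such a recipe could look like. The name n_highest_frequencies is only a placeholder, not an existing Counter method, and the sketch relies on most_common() returning all pairs sorted by count, so the order among elements with equal counts is whatever most_common() happens to give.

```python
from collections import Counter
from itertools import groupby


def n_highest_frequencies(counter, n):
    """Return every (element, count) pair whose count is among the n
    highest distinct counts, from most to least frequent.

    Hypothetical helper for illustration; not part of collections.Counter.
    """
    if n <= 0:
        return []
    result = []
    # most_common() with no argument yields all pairs sorted by count
    # (descending), so equal counts are adjacent and groupby can bucket them.
    groups = groupby(counter.most_common(), key=lambda pair: pair[1])
    for rank, (_count, pairs) in enumerate(groups):
        if rank == n:
            break
        result.extend(pairs)
    return result


c = Counter([1, 1, 1, 2, 2, 3, 3, 4, 4, 5])
# Contains (1, 3), (2, 2), (3, 2) and (4, 2); the order among the three
# elements with count 2 should not be relied on.
print(n_highest_frequencies(c, 2))
```

The cost is dominated by the full sort inside most_common(), so this is O(k log k) in the number of distinct keys k, which seems acceptable for a documentation recipe.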
As for

> I'm not sure if this would be better as an attribute kept directly or as a
> property that called `sum(self.values())` when accessed. I believe that
> having `mycounter.total` would provide the right normalization in a clean
> API, and also expose easy access to other questions one would naturally ask
> (e.g. "How many observations were made?")

I am not quite sure what you mean, especially the observations part. For
the attribute part, do you mean we would just keep a hidden instance
attribute like num_values that is incremented or decremented whenever
something is added or removed, so that querying the total count is O(1)
instead of O(n) (where n is the number of keys)? Then there could be a
method like 'normalize' that printed all elements with their frequencies
divided by the total count?

On Wed, Mar 15, 2017 at 12:52 AM, David Mertz <me...@gnosis.cx> wrote:
> On Tue, Mar 14, 2017 at 2:38 AM, Marco Cognetta <cognetta.ma...@gmail.com>
> wrote:
>>
>> 1) Addition of a Counter.least_common method:
>> This was addressed in https://bugs.python.org/issue16994, but it was
>> never resolved and is still open (since Jan. 2013). This is a small
>> change, but I think that it is useful to include in the stdlib.
>
>
> -1 on adding this. I read the issue, and do not find a convincing use case
> that is common enough to merit a new method. As some people noted in the
> issue, the "least common" is really the infinitely many keys not in the
> collection at all.
>
> But I can imagine an occasional need to, e.g. "find outliers." However,
> that is not hard to spell as `mycounter.most_common()[-1*N:]`. Or if your
> program does this often, write a utility function `find_outliers(...)`
>
>> 2) Undefined behavior when using Counter.most_common:
>> 'c', 'c']), when calling c.most_common(3), there are more than 3 "most
>> common" elements in c and c.most_common(3) will not always return the
>> same list, since there is no defined total order on the elements in c.
>>
>> Should this be mentioned in the documentation?
>
>
> +1. I'd definitely support adding this point to the documentation.
>
>>
>> Additionally, perhaps there is room for a method that produces all of
>> the elements with the n highest frequencies in order of their
>> frequencies. For example, in the case of c = Counter([1, 1, 1, 2, 2,
>> 3, 3, 4, 4, 5]) c.aforementioned_method(2) would return [(1, 3), (2,
>> 2), (3, 2), (4, 2)] since the two highest frequencies are 3 and 2.
>
>
> -0 on this. I can see wanting this, but I'm not sure often enough to add to
> the promise of the class. The utility function to do this would be somewhat
> less trivial to write than `find_outliers(..)` but not enormously hard. I
> think I'd be +0 on adding a recipe to the documentation for a utility
> function.
>
>>
>> 3) Addition of a collections.Frequency or collections.Proportion class
>> derived from collections.Counter:
>>
>> This is sort of discussed in https://bugs.python.org/issue25478.
>> The idea behind this would be a dictionary that, instead of returning
>> the integer frequency of an element, would return its proportional
>> representation in the iterable.
>
>
> One could write a subclass easily enough. The essential feature in my mind
> would be to keep an attribute Counter.total around to perform the
> normalization. I'm +1 on adding that to collections.Counter itself.
>
> I'm not sure if this would be better as an attribute kept directly or as a
> property that called `sum(self.values())` when accessed. I believe that
> having `mycounter.total` would provide the right normalization in a clean
> API, and also expose easy access to other questions one would naturally ask
> (e.g. "How many observations were made?")
>
>
>
> --
> Keeping medicines from the bloodstreams of the sick; food
> from the bellies of the hungry; books from the hands of the
> uneducated; technology from the underdeveloped; and putting
> advocates of freedom in prisons.
> Intellectual property is to the 21st century what the slave trade was
> to the 16th.
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/