Re: [Python-ideas] Additions to collections.Counter and a Counter derived class

2017-03-15 Thread Steven D'Aprano
On Wed, Mar 15, 2017 at 11:06:20AM -0700, David Mertz wrote:
> On Wed, Mar 15, 2017 at 10:39 AM, Steven D'Aprano wrote:
> > > But I can imagine an occasional need to, e.g. "find outliers." However,
> > > that is not hard to spell as `mycounter.most_common()[-1*N:]`. Or

Re: [Python-ideas] Additions to collections.Counter and a Counter derived class

2017-03-15 Thread Brendan Barnwell
On 2017-03-15 11:06, David Mertz wrote:
> > Just because a data point is uncommon doesn't mean it is an outlier.
>
> That's kinda *by definition* what an outlier is in categorical data!

Not really. Or rather, it depends what you mean by "uncommon". But this thread is about adding

Re: [Python-ideas] Additions to collections.Counter and a Counter derived class

2017-03-15 Thread David Mertz
On Wed, Mar 15, 2017 at 10:39 AM, Steven D'Aprano wrote:
> > But I can imagine an occasional need to, e.g. "find outliers." However,
> > that is not hard to spell as `mycounter.most_common()[-1*N:]`. Or if your
> > program does this often, write a utility function

Re: [Python-ideas] Additions to collections.Counter and a Counter derived class

2017-03-15 Thread Steven D'Aprano
On Tue, Mar 14, 2017 at 08:52:52AM -0700, David Mertz wrote:
> But I can imagine an occasional need to, e.g. "find outliers." However,
> that is not hard to spell as `mycounter.most_common()[-1*N:]`. Or if your
> program does this often, write a utility function `find_outliers(...)`

That's not
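
For readers following along, here is a minimal sketch of the two spellings under discussion: the slice on `most_common()` quoted above, and a small utility function. The helper name `least_common` and the sample data are assumptions made for illustration; nothing in this thread says such a method exists on `collections.Counter`. Note that the slice spelling misbehaves for N = 0, because `[-0:]` is the same as `[0:]` and returns the whole list.

```python
from collections import Counter
import heapq


def least_common(counter, n):
    """Return the n lowest-count (element, count) pairs, rarest first.

    Hypothetical helper for illustration; not part of collections.Counter.
    """
    if n <= 0:
        # Guard the edge case where the slice spelling returns everything.
        return []
    # nsmallest avoids sorting the whole counter when n is small.
    return heapq.nsmallest(n, counter.items(), key=lambda pair: pair[1])


mycounter = Counter("aaaabbbccd")      # a=4, b=3, c=2, d=1 (made-up data)
tail = mycounter.most_common()[-2:]    # slice spelling from the message
rarest = least_common(mycounter, 2)    # helper spelling, rarest first
```

Whether low-count entries are "outliers" in any statistical sense is a separate question; the sketch only selects the entries with the smallest counts.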

Re: [Python-ideas] Additions to collections.Counter and a Counter derived class

2017-03-15 Thread David Mertz
I added a couple comments at https://bugs.python.org/issue25478 about what I mean. Raymond replied as well. So it feels like we should use that thread there. In a scientific context I often think of a Counter as a way to count observations of a categorical variable. "I saw 3 As, then 7 Bs,
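
A small sketch of that way of using a Counter, assuming the observations arrive in batches. The first two updates mirror the counts mentioned in the message; the third batch and the relative-frequency step are made up for illustration.

```python
from collections import Counter

observations = Counter()
observations.update({"A": 3})   # "I saw 3 As"
observations.update({"B": 7})   # "then 7 Bs"
observations.update("AAB")      # hypothetical later sightings, one per character

# Relative frequency of each category (assumes at least one observation).
n = sum(observations.values())
relative = {category: count / n for category, count in observations.items()}
```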

Re: [Python-ideas] Additions to collections.Counter and a Counter derived class

2017-03-15 Thread Marco Cognetta
Thanks for the reply. I will add it to the documentation and will work on a use case and recipe for the n highest frequency problem. As for

>I'm not sure if this would be better as an attribute kept directly or as a
>property that called `sum(self.values())` when accessed.

I believe that
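
To make the property option concrete, here is a minimal sketch of a subclass whose total is recomputed from `sum(self.values())` on each access. The class name `TotalCounter` and the property name `total_count` are hypothetical, chosen to avoid clashing with the `Counter.total()` method that the standard library gained later, in Python 3.10. This is not the patch discussed in the thread.

```python
from collections import Counter


class TotalCounter(Counter):
    """Hypothetical Counter subclass; names are illustrative only."""

    @property
    def total_count(self):
        # Recomputed on every access: costs O(number of distinct keys),
        # but can never go stale when counts are changed directly.
        return sum(self.values())


tc = TotalCounter("abracadabra")   # 11 characters counted
tc["z"] += 4                       # direct mutation is still reflected
current_total = tc.total_count
```

The trade-off is the usual one: a stored attribute would make reads O(1) but would have to be kept in sync by every mutating method, while the property is simple and always consistent at the cost of a pass over the values.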