> On Apr 15, 2018, at 2:05 PM, Peter Norvig <pe...@norvig.com> wrote:
> For most types that implement __add__, `x + x` is equal to `2 * x`. 
> ... 
> That is true for all numbers, list, tuple, str, timedelta, etc. -- but not 
> for collections.Counter. I can add two Counters, but I can't multiply one by 
> a scalar. That seems like an oversight. 

If you view the Counter as a sparse associative array of numeric values, it 
does seem like an oversight.  If you view the Counter as a Multiset or Bag, it 
doesn't make sense at all ;-)

From an implementation point of view, Counter is just a kind of dict that has 
a __missing__() method that returns zero.  That makes it trivially easy to 
subclass Counter to add new functionality or just use dictionary 
comprehensions for bulk updates.
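
As a rough illustration of that point (a sketch only; the variable names are made up for the example), the zero default and a dict-comprehension bulk update look something like this:

```python
from collections import Counter

c = Counter(a=3, b=1)

# Looking up a key that was never counted returns zero
# and does not store the key.
missing = c['z']
stored = 'z' in c

# Bulk update with a dict comprehension: scale every count by 2.
doubled = Counter({k: v * 2 for k, v in c.items()})
```

For non-negative counts like these, the result matches what `c + c` gives.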

> It would be worthwhile to implement multiplication because, among other 
> reasons, Counters are a nice representation for discrete probability 
> distributions, for which multiplication is an even more fundamental operation 
> than addition.

There is an open issue on this topic.  See:  https://bugs.python.org/issue25478

One stumbling point is that a number of commenters are fiercely opposed to 
non-integer uses of Counter. Also, some of the use cases (such as those found 
in Allen Downey's "Think Stats" and "Think Bayes" books) need division and 
rescaling to a total (i.e. normalizing the total to 1.0) for a probability mass 
function.
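
To make the rescaling use case concrete, here is a minimal sketch. The normalize() helper is hypothetical (it is not a Counter method), and the example data is invented:

```python
from collections import Counter

def normalize(counts):
    """Return a new Counter whose values sum to 1.0 (hypothetical helper)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot normalize a Counter whose total is zero")
    return Counter({k: v / total for k, v in counts.items()})

# Invented data: outcome -> number of observations.
rolls = Counter({1: 2, 2: 1, 3: 1})
pmf = normalize(rolls)
```

Note that the result holds floats, which is exactly the non-integer use that some commenters object to.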

If the idea were to go forward, it still isn't clear whether the correct API 
should be low level (__mul__ and __div__ and a "total" property) or higher 
level (such as a normalize() or rescale() method that produces a new Counter 
instance).  The low level approach has the advantage that it is simple to 
understand and that it feels like a logical extension of the __add__ and 
__sub__ methods.  The downside is that it doesn't really add any new capabilities 
(being just a short-cut for a simple dict comprehension or a call to c.values()).  
And it starts to feature-creep the Counter class further away from its core 
mission of counting, venturing into the realm of generic sparse arrays with 
numeric values.  There is also a learnability/intelligibility issue in that __add__ 
and __sub__ correspond to "elementwise" operations while __mul__ and __div__ 
would be "scalar broadcast" operations.
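
For illustration only, the low level approach could be sketched as a subclass. NumericCounter is a hypothetical name, none of these operators exist on collections.Counter, and Python 3 spells division __truediv__ rather than __div__:

```python
from collections import Counter

class NumericCounter(Counter):
    """Hypothetical subclass adding scalar multiplication and division."""

    def __mul__(self, scalar):
        # Scalar broadcast: one number applied to every count.
        return type(self)({k: v * scalar for k, v in self.items()})

    __rmul__ = __mul__

    def __truediv__(self, scalar):
        return type(self)({k: v / scalar for k, v in self.items()})

a = NumericCounter(x=1, y=2)
b = NumericCounter(x=10)

elementwise = a + b               # existing behavior: counts paired key by key
broadcast = a * 3                 # proposed behavior: scalar applied to all counts
pmf = a / sum(a.values())         # division plus a total gives a rescaled Counter
```

The two kinds of operation sit side by side here, which is the learnability concern: `a + b` combines two Counters key by key, while `a * 3` takes a Counter and a plain number.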

Peter, I'm really glad you chimed in.  My advocacy lacked sufficient weight to 
move this idea forward.


Python-ideas mailing list
Code of Conduct: http://python.org/psf/codeofconduct/
