On 15/08/13 14:08, Steven D'Aprano wrote:
On 15/08/13 21:42, Mark Dickinson wrote:
The PEP and code look generally good to me.

I think the API for median and its variants deserves some wider discussion:
the reference implementation has a callable 'median', and variant callables
'median.low', 'median.high', 'median.grouped'.  The pattern of attaching
the variant callables as attributes on the main callable is unusual, and
isn't something I've seen elsewhere in the standard library.  I'd like to
see some explanation in the PEP for why it's done this way.  (There was
already some discussion of this on the issue, but that was more centered
around the implementation than the API.)

I'd propose two alternatives for this:  either have separate functions
'median', 'median_low', 'median_high', etc., or have a single function
'median' with a "method" argument that takes a string specifying
computation using a particular method.  I don't see a really good reason to
deviate from standard patterns here, and fear that users would find the
current API surprising.

Alexander Belopolsky has convinced me (off-list) that my current implementation 
is better changed to a more conservative one of a callable singleton instance 
with methods implementing the alternative
computations. I'll have something like:


def _singleton(cls):
     return cls()


@_singleton
class median:
     def __call__(self, data):
         ...
     def low(self, data):
         ...
     ...

Horrible.


In my earlier stats module, I had a single median function that took a argument to choose 
between alternatives. I called it "scheme":

median(data, scheme="low")

What is wrong with this?
It's a perfect API; simple and self-explanatory.
median is a function in the mathematical sense and it should be a function in 
Python.


R uses parameter called "type" to choose between alternate calculations, not 
for median as we are discussing, but for quantiles:

quantile(x, probs ... type = 7, ...).

SAS also uses a similar system, but with different numeric codes. I rejected both "type" 
and "method" as the parameter name since it would cause confusion with the usual meanings 
of those words. I
eventually decided against this system for two reasons:

There are other words to choose from ;) "scheme" seems OK to me.


- Each scheme ended up needing to be a separate function, for ease of both 
implementation and testing. So I had four private median functions, which I put 
inside a class to act as namespace and avoid
polluting the main namespace. Then I needed a "master function" to select which 
of the methods should be called, with all the additional testing and documentation that 
entailed.

- The API doesn't really feel very Pythonic to me. For example, we write:

mystring.rjust(width)
dict.items()
These are methods on objects, the result of these calls depends on the value of 
'self' argument, not merely its class. No so with a median singleton.

We also have len(seq) and copy.copy(obj)
No classes required.


rather than mystring.justify(width, "right") or dict.iterate("items"). So I 
think individual methods is a better API, and one which is more familiar to most Python users. The 
only innovation (if
that's what it is) is to have median a callable object.


As far as having four separate functions, median, median_low, etc., it just 
doesn't feel right to me. It puts four slight variations of the same function 
into the main namespace, instead of keeping
them together in a namespace. Names like median_low merely simulates a 
namespace with pseudo-methods separated with underscores instead of dots, only 
without the advantages of a real namespace.

(I treat variance and std dev differently, and make the sample and population 
forms separate top-level functions rather than methods, simply because they are 
so well-known from scientific calculators
that it is unthinkable to me to do differently. Whenever I use numpy, I am 
surprised all over again that it has only a single variance function.)



_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to