On 2/1/2014 8:06 PM, Steven D'Aprano wrote:
Hi all,

Over on the Python-ideas list, there's a thread about the new statistics
module, and as the author of that module, I'm looking for a bit of
guidance regarding backwards compatibility. Specifically two issues:


(1) With numeric code, what happens if the module becomes more[1]
accurate in the future? Does that count as breaking backwards
compatibility?

E.g. Currently I use a particular algorithm for calculating variance.
Suppose that for a particular data set, this algorithm is accurate to
(say) seven decimal places:

# Python 3.4
variance(some_data) == 1.23456700001

Later, I find a better algorithm, which improves the accuracy of the
result:

# Python 3.5 or 3.6
variance(some_data) == 1.23456789001


Would this count as breaking backwards compatibility? If so, how should
I handle this? I don't claim that the current implementation of the
statistics module is optimal, as far as precision and accuracy is
concerned. It may improve in the future.
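
To make the point concrete, here is a small sketch of my own (not the
module's actual code, and the function names are made up) showing how two
algebraically equivalent variance formulas can disagree in floating point,
which is exactly why swapping the algorithm can shift the printed result:

# Two textbook formulas for the population variance. They are equal in
# exact arithmetic, but the first loses precision to catastrophic
# cancellation when the values are large relative to their spread.

def variance_naive(data):
    # E[x^2] - (E[x])^2
    n = len(data)
    mean = sum(data) / n
    return sum(x * x for x in data) / n - mean * mean

def variance_two_pass(data):
    # mean of squared deviations from the mean
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / n

data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
print(variance_naive(data))     # ruined by rounding error
print(variance_two_pass(data))  # 22.5, the exact answer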

Or would that count as a bug-fix? "Variance function was inaccurate, now
less wrong", perhaps.

That is my inclination.

I suppose the math module has the same issue, except that it just wraps
the C libraries, which are mature and stable and unlikely to change.

Because C libraries differ, math results already differ across platforms within the same Python version, so they can certainly change (hopefully improve) in future versions. I think the better analogy is cmath, which I believe is more than just a wrapper.

The random module has a similar issue:

http://docs.python.org/3/library/random.html#notes-on-reproducibility


(2) Mappings[2] are iterable. That means that functions which expect
sequences or iterators may also operate on mappings by accident.

I think 'accident' is the key. (Working with sets is not an accident.) Anyone who really wants the mean of keys should be explicit:
   mean(d.keys())

For example, sum({1: 100, 2: 200}) returns 3. If one wanted to reserve the
opportunity to handle mappings specifically in the future, without being
locked in by backwards-compatibility, how should one handle it?

a) document that behaviour with mappings is unsupported and may
    change in the future;

I think the doc should in any case specify the proper domain. In this case, I think it should exclude mappings: 'non-empty non-mapping iterable of numbers', or 'an iterable of numbers that is neither empty nor a mapping'. That makes the behaviour with mappings at best undefined and subject to change. There should also be a caveat about mixing types, especially Decimals, if there is not one already. Perhaps rewrite the above as 'an iterable, neither empty nor a mapping, of numbers that are mutually summable'.
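
For what it's worth, this is the kind of thing the mixed-type caveat is about. The snippet below uses plain sum() purely as an illustration; I have not checked exactly how the statistics module itself coerces mixed types:

from decimal import Decimal
from fractions import Fraction

sum([1, 2.5, Fraction(1, 2)])          # 4.0 -- int, float and Fraction coerce
sum([Decimal("0.1"), Decimal("0.2")])  # Decimal('0.3')
sum([Decimal("0.1"), 0.2])             # raises TypeError: Decimal + float is not allowed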

b) raise a warning when passed a mapping, but still iterate over it;

c) raise an exception and refuse to iterate over the mapping;

This, if possible. An empty iterable will raise at '/ 0'. Most anything that is not an iterable of numbers will eventually raise at '/ n'. Testing both that an exception is raised and that it is the one we want is why unittest has assertRaises.
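
If (c) is chosen, a guard at the top of each function would do it. A minimal sketch of my own, assuming nothing about the module's internals:

from collections.abc import Mapping

def mean(data):
    if isinstance(data, Mapping):
        raise TypeError("data must not be a mapping; "
                        "pass data.keys() or data.values() explicitly")
    data = list(data)
    if not data:
        raise ValueError("mean requires at least one data point")
    return sum(data) / len(data)

mean([1, 2, 3])                  # 2.0
mean({1: 100, 2: 200}.values())  # 150.0
mean({1: 100, 2: 200})           # raises TypeError

A test like assertRaises(TypeError, mean, {1: 100}) would then pin the behaviour down.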

Question (2) is of course a specific example of a more general
question: to what degree is the library author responsible for keeping
backwards compatibility under circumstances which are not part of the
intended API, but just work by accident?

[1] Or, for the sake of the argument, less accurate.

[2] And sets.

--
Terry Jan Reedy
