On 2/1/2014 8:06 PM, Steven D'Aprano wrote:
Hi all,
Over on the Python-ideas list, there's a thread about the new statistics
module, and as the author of that module, I'm looking for a bit of
guidance regarding backwards compatibility. Specifically two issues:
(1) With numeric code, what happens if the module becomes more[1]
accurate in the future? Does that count as breaking backwards
compatibility?
E.g. Currently I use a particular algorithm for calculating variance.
Suppose that for a particular data set, this algorithm is accurate to
(say) seven decimal places:
# Python 3.4
variance(some_data) == 1.23456700001
Later, I find a better algorithm, which improves the accuracy of the
result:
# Python 3.5 or 3.6
variance(some_data) == 1.23456789001
Would this count as breaking backwards compatibility? If so, how should
I handle this? I don't claim that the current implementation of the
statistics module is optimal, as far as precision and accuracy are
concerned. It may improve in the future.
Or would that count as a bug-fix? "Variance function was inaccurate, now
less wrong", perhaps.
That is my inclination.
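As a rough illustration of why an accuracy improvement need not break
well-written callers, here is a minimal sketch of a check that compares
against a reference value within a tolerance instead of with ==. The
helper exact_variance and the sample data are made up for this example
and are not part of the statistics module.

```python
import math
import statistics
from fractions import Fraction

data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]


def exact_variance(values):
    """Hypothetical reference: sample variance in exact rational arithmetic."""
    xs = [Fraction(v) for v in values]
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


reference = float(exact_variance(data))
result = statistics.variance(data)

# A tolerance-based check keeps passing if a later release returns a
# slightly different (more accurate) float; an == check might not.
assert math.isclose(result, reference, rel_tol=1e-9)
```

Code that asserts exact float equality is relying on something the
documentation never promised, which supports treating improved accuracy
as a bug fix.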
I suppose the math module has the same issue, except that it just wraps
the C libraries, which are mature and stable and unlikely to change.
Because C libraries differ, math results differ even in the same
version, so they can certainly change (hopefully improve) in future
versions. I think the better analogy is cmath, which I believe is more
than just a wrapper.
The random module has a similar issue:
http://docs.python.org/3/library/random.html#notes-on-reproducibility
(2) Mappings[2] are iterable. That means that functions which expect
sequences or iterators may also operate on mappings by accident.
I think 'accident' is the key. (Working with sets is not an accident.)
Anyone who really wants the mean of keys should be explicit:
mean(d.keys())
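A small sketch of the difference, using a made-up dict d. Iterating a
mapping yields its keys, so passing the dict itself silently averages
the keys; spelling out .keys() or .values() states the intent.

```python
from statistics import mean

d = {1: 100, 2: 200}

# Iterating a dict yields its keys, so the values are ignored here.
assert sum(d) == 3

# Being explicit removes the ambiguity about what is being averaged.
assert mean(d.keys()) == 1.5
assert mean(d.values()) == 150
```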
For example, sum({1: 100, 2: 200}) returns 3. If one wanted to reserve the
opportunity to handle mappings specifically in the future, without being
locked in by backwards-compatibility, how should one handle it?
a) document that behaviour with mappings is unsupported and may
change in the future;
I think the doc should in any case specify the proper domain. In this
case, I think it should exclude mappings: 'non-empty non-mapping
iterable of numbers', or 'an iterable of numbers that is neither empty
nor a mapping'. That makes the behavior at best undefined and subject to
change. There should also be a caveat about mixing types, especially
Decimals, if there is not one already. Perhaps rewrite the above as 'an iterable
that is neither empty nor a mapping of numbers that are mutually summable'.
b) raise a warning when passed a mapping, but still iterate over it;
c) raise an exception and refuse to iterate over the mapping;
This, if possible. An empty iterable will raise at '/ 0'. Most anything
that is not an iterable of numbers will eventually raise at '/ n'.
Testing both that an exception is raised and that it is one we want is
why unittest has assertRaises.
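A minimal sketch of option (c) and of testing it with assertRaises.
The wrapper strict_mean is hypothetical and only illustrates the idea;
it is not how the statistics module behaves. I am also assuming that
the released statistics.mean signals empty input with StatisticsError
instead of letting a ZeroDivisionError escape.

```python
import unittest
from collections.abc import Mapping
from statistics import StatisticsError, mean


def strict_mean(data):
    """Hypothetical wrapper: refuse mappings, otherwise defer to mean()."""
    if isinstance(data, Mapping):
        raise TypeError(
            "mean of a mapping is ambiguous; pass d.keys() or d.values()"
        )
    return mean(data)


class StrictMeanTests(unittest.TestCase):
    def test_mapping_is_rejected(self):
        # Checks both that an exception is raised and that it is the
        # one we want.
        with self.assertRaises(TypeError):
            strict_mean({1: 100, 2: 200})

    def test_empty_input_is_rejected(self):
        with self.assertRaises(StatisticsError):
            strict_mean([])

    def test_explicit_keys_are_accepted(self):
        self.assertEqual(strict_mean({1: 100, 2: 200}.keys()), 1.5)
```

Save it in a file and run it with python -m unittest. Sets pass
through unchanged, since a set is not a Mapping.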
Question (2) is of course a specific example of a more general
question, to what degree is the library author responsible for keeping
backwards compatibility under circumstances which are not part of the
intended API, but just work by accident?
[1] Or, for the sake of the argument, less accurate.
[2] And sets.
--
Terry Jan Reedy