I know I have nothing to decide here, since I'm no contributor, just a silent watcher on this list.
However, I just wanted to point out that I fully agree with Chris Barker's position. Couldn't have stated it better. Performance should be an interpreter implementation issue, not a language issue.
> 2) add a special case for strings that is fast and efficient -- may be as simple as calling "".join() under the hood -- no more code than the exception check.
I would give it a +1, if my opinion counts for anything.
Cheers
Stefan
Sent: Tuesday, 12 August 2014 at 21:11
From: "Chris Barker" <chris.bar...@noaa.gov>
To: (no recipient)
Cc: "Python Dev" <python-dev@python.org>
Subject: Re: [Python-Dev] sum(...) limitation
Von: "Chris Barker" <chris.bar...@noaa.gov>
An: Kein Empfänger
Cc: "Python Dev" <python-dev@python.org>
Betreff: Re: [Python-Dev] sum(...) limitation
On Mon, Aug 11, 2014 at 11:07 PM, Stephen J. Turnbull <step...@xemacs.org> wrote:
I'm referring to removing the unnecessary information that there's a better way to do it, and simply raising an error (as in Python 3.2, say) which is all a RealProgrammer[tm] should ever need!
I can't imagine anyone is suggesting that -- disallow it, but don't tell anyone why?
The only thing that is remotely on the table here is:
1) remove the special case for strings -- buyer beware -- but consistent and less "ugly"
2) add a special case for strings that is fast and efficient -- may be as simple as calling "".join() under the hood -- no more code than the exception check.
And I doubt anyone really is pushing for anything but (2)
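For concreteness, here is a rough pure-Python sketch of what (2) could look like -- my own illustration, not CPython's actual implementation, and the function name is made up:

    def sum_with_str_fast_path(iterable, start=0):
        # Special case: a str start value takes the "".join() fast path,
        # one pass and one allocation (O(N)) instead of building a new
        # string on every step (O(N^2)).
        if isinstance(start, str):
            return start + "".join(iterable)
        # Everything else: the generic __add__ loop, same as today.
        total = start
        for item in iterable:
            total = total + item
        return total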
Stephen Turnbull wrote:
IMO we'd also want a homogeneous_iterable ABC
Actually, I've thought for years that that would open the door to a lot of optimizations -- but that's a much broader question than sum(). I even brought it up probably over ten years ago -- but no one was the least bit interested -- nor are they now -- I know this was a rhetorical suggestion to make the point about what not to do....
Because obviously we'd want the
attractive nuisance of "if you have __add__, there's a default
definition of __sum__"
Now I'm confused -- isn't that exactly what we have now?
It's possible that Python could provide some kind of feature that
would allow an optimized sum function for every type that has __add__,
but I think this will take a lot of thinking.
Does it need to be every type? As it is, the common ones work fine already except for strings -- so if we add an optimized string sum(), then we're done.
*Somebody* will do it
(I don't think anybody is +1 on restricting sum() to a subset of types
with __add__).
Uhm, that's exactly what we have now -- you can use sum() with anything that has an __add__, except strings. And by that logic, if we thought there were other inefficient use cases, we'd restrict those too.
But users can always define their own classes that have a __sum__ and are really inefficient -- so unless sum() becomes just for a certain subset of built-in types (and does anyone want that?), we are back to the current situation:
sum() can be used for any type that has an __add__ defined.
But naive users are likely to try it with strings, and that's bad, so we want to prevent that, and have a special case check for strings.
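For reference, this is roughly how it behaves today (a quick interactive session of my own; the exact TypeError wording may differ between versions):

    >>> sum([1, 2, 3])
    6
    >>> sum([[1], [2], [3]], [])
    [1, 2, 3]
    >>> sum(["a", "b", "c"], "")
    Traceback (most recent call last):
      ...
    TypeError: sum() can't sum strings [use ''.join(seq) instead]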
What I fail to see is why it's better to raise an exception and point users to a better way, than to simply provide an optimization so that it's a moot issue.
The only justification offered here is that it will teach people that summing strings (and some other objects?) is O(N^2) and a bad idea. But:
a) Python's primary purpose is practical, not pedagogical (not that it isn't great for that)
b) I doubt any naive users learn anything other than "I can't use sum() for strings, I should use "".join()". Will they make the leap to "I shouldn't use string concatenation in a loop, either"? Oh, wait, you can use string concatenation in a loop -- that's been optimized. So will they learn: "some types of objects have poor performance with repeated concatenation and shouldn't be used with sum(). So if I write such a class, and want to sum them up, I'll need to write an optimized version of that code"?
I submit that no naive user is going to get any closer to a proper understanding of algorithmic Order behavior from this small hint. Which leaves no reason to prefer an Exception to an optimization.
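If anyone wants the difference spelled out, here is a back-of-the-envelope model of my own -- it just counts character copies, it's not a benchmark:

    def copies_repeated_concat(lengths):
        # Each concatenation builds a brand-new string, copying everything
        # accumulated so far plus the new piece.
        total_len = 0
        copied = 0
        for n in lengths:
            copied += total_len + n
            total_len += n
        return copied

    def copies_join(lengths):
        # "".join() allocates once and copies each piece exactly once.
        return sum(lengths)

    print(copies_repeated_concat([10] * 1000))   # 5,005,000 copies -- O(N^2)
    print(copies_join([10] * 1000))              # 10,000 copies -- O(N)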
One other point: perhaps this will lead a naive user into thinking -- "sum() raises an exception if I try to use it inefficiently, so it must be OK to use for anything that doesn't raise an exception" -- that would be a bad lesson to mis-learn....
-Chris
PS:
Armin Rigo wrote:
It also improves a
lot the precision of sum(list_of_floats) (though not reaching the same
precision levels of math.fsum()).
While we are at it, having the default sum() for floats be fsum() would be nice -- I'd rather the default was better accuracy and lower performance. Folks that really care about performance could call math.fastsum(), or really, use numpy...
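The precision point is easy to demonstrate (my own quick check):

    >>> import math
    >>> sum([0.1] * 10)
    0.9999999999999999
    >>> math.fsum([0.1] * 10)
    1.0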
This does turn sum() into a function that does type-based dispatch, but isn't Python full of those already? Do something special for the types you know about, call the generic dunder method for the rest.
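Something like this, purely as a sketch of the dispatch idea (the name and the particular fast paths are mine, for illustration only):

    import math

    def dispatching_sum(iterable, start=0):
        items = list(iterable)
        if isinstance(start, str):
            # Known type: strings -> delegate to "".join()
            return start + "".join(items)
        if any(isinstance(x, float) for x in items):
            # Known type: floats -> accurate summation via math.fsum()
            return math.fsum([start, *items])
        # Everything else: the generic __add__ loop
        total = start
        for item in items:
            total = total + item
        return total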
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
chris.bar...@noaa.gov