subject:"Re\: \[Python\-ideas\] Start argument for itertools.accumulate\(\) \[Was\: Proposal\: A Reduce\-Map Comprehension and a \"last\" builtin\]"

[Peter O'Connor ]
> Ok, so it seems everyone's happy with adding an initial_value argument.

Heh - that's not clear to me ;-)


> Now, I claim that while it should be an option, the initial value should NOT
> be returned by default.  (i.e. the returned generator should by default
> yield N elements, not N+1).

-1 on that.

- It goes against prior art.  Haskell's scanl does return the initial
value, and nobody on the planet has devoted more quality thought to
how streams "should work" than those folks.  The Python itertoolz
package's accumulate already supports an optional `initial=` argument,
and also returns it when specified.  It requires truly compelling
arguments to go against prior art.

- It's "obvious" that the first value should be returned if specified.
The best evidence:  in this thread's first message, it was "so
obvious" to Raymond that the implementation he suggested did exactly
that.  I doubt it even occurred to him to question whether it should.
It didn't occur to me either, but my mind is arguably "polluted" by
significant prior exposure to functional languages.

- In all but one "real life" example so far (including the
slice-summer class I stumbled into today), the code _wanted_ the
initial value to be returned.  The sole exception was one of the three
instances in Will Ness's wheel sieve code, where he discarded the
unwanted (in that specific case) initial value via a plain

next(wheel)

Which is telling:  it's easy to discard a value you don't want, but to
inject a value you _do_ want but don't get requires something like
reintroducing the

chain([value_i_want], the_iterable_that_didn't_give_the_value_i_want)

trick the new optional argument is trying to get _away_ from.  Talk
about ironic ;-)


I would like to see a simple thing added to itertools to make dropping
unwanted values easier, though:

"""
drop(iterable, n=None)
Return an iterator whose next() method returns all but the first `n`
values from the iterable.  If specified, `n` must be an integer >= 0.
By default (`n`=None), the iterator is run to exhaustion.
"""

Then, e.g.,

- drop(it, 0) would effectively be a synonym for iter(it).

- drop((it, 1) would skip over the first value from the iterable.

- drop(it) would give "the one obvious way" to consume an iterator
completely (for some reason that's a semi-FAQ,, and is usually
answered by suggesting the excruciatingly obscure trick of feeding the
iterable to a 0-size collections.deque constructor)..

Of course Haskell has had `drop` all along, although not the "run to
exhaustion" part.


> Example: suppose we're doing the toll booth thing, and we want to yield a
> cumulative sum of tolls so far.  Suppose someone already made a
> reasonable-looking generator yielding the cumulative sum of tolls for today:
>
> def iter_cumsum_tolls_from_day(day, toll_amount_so_far):
> return accumulate(get_tolls_from_day(day, initial=toll_amount_so_far))
>
> And now we want to make a way to get all tolls from the month.  One might
> reasonably expect this to work:
>
> def iter_cumsum_tolls_from_month(month, toll_amount_so_far):
> for day in month:
> for cumsum_tolls in iter_cumsum_tolls_from_day(day,
> toll_amount_so_far = toll_amount_so_far):
> yield cumsum_tolls
> toll_amount_so_far = cumsum_tolls
>
> But this would actually DUPLICATE the last toll of every day - it appears
> both as the last element of the day's generator and as the first element of
> the next day's generator.

I didn't really follow the details there, but the suggestion would be
the same regardless: drop the duplicates you don't want.

Note that making up an example in your head isn't nearly as persuasive
as "real life" code.  Code can be _contrived_ to "prove" anything.


> This is why I think that there should be an additional
> "include_initial_in_return=False" argument.  I do agree that it should be an
> option to include the initial value (your "find tolls over time-span"
> example shows why), but that if you want that you should have to show that
> you thought about that by specifying "include_initial_in_return=True"

It's generally poor human design to have a second optional argument
modify the behavior of yet another optional argument.  If the presence
of the latter can have two distinct modes of operation, then people
_will_ forget which one the default mode is, making code harder to
write and harder to read.

Since "return the value" is supported by all known prior art, and by
the bulk of "real life" Python codes known so far, "return the value"
should be the default.  But far better to make it the only mode rather
than _just_ the default mode.  Then there's nothing to be forgotten
:-)
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

* correction to brackets from first example:

def iter_cumsum_tolls_from_day(day, toll_amount_so_far):
return accumulate(get_tolls_from_day(day), initial=toll_amount_so_far)


On Mon, Apr 9, 2018 at 11:55 PM, Peter O'Connor 
wrote:

> Ok, so it seems everyone's happy with adding an initial_value argument.
>
> Now, I claim that while it should be an option, the initial value should
> NOT be returned by default.  (i.e. the returned generator should by default
> yield N elements, not N+1).
>
> Example: suppose we're doing the toll booth thing, and we want to yield a
> cumulative sum of tolls so far.  Suppose someone already made a
> reasonable-looking generator yielding the cumulative sum of tolls for today:
>
> def iter_cumsum_tolls_from_day(day, toll_amount_so_far):
> return accumulate(get_tolls_from_day(day, initial=toll_amount_so_far))
>
> And now we want to make a way to get all tolls from the month.  One might
> reasonably expect this to work:
>
> def iter_cumsum_tolls_from_month(month, toll_amount_so_far):
> for day in month:
> for cumsum_tolls in iter_cumsum_tolls_from_day(day,
> toll_amount_so_far = toll_amount_so_far):
> yield cumsum_tolls
> toll_amount_so_far = cumsum_tolls
>
> But this would actually DUPLICATE the last toll of every day - it appears
> both as the last element of the day's generator and as the first element of
> the next day's generator.
>
> This is why I think that there should be an additional "
> include_initial_in_return=False" argument.  I do agree that it should be
> an option to include the initial value (your "find tolls over time-span"
> example shows why), but that if you want that you should have to show that
> you thought about that by specifying "include_initial_in_return=True"
>
>
>
>
>
> On Mon, Apr 9, 2018 at 10:30 PM, Tim Peters  wrote:
>
>> [Tim]
>> >> while we have N numbers, there are N+1 slice indices.  So
>> >> accumulate(xs) doesn't quite work.  It needs to also have a 0 inserted
>> >> as the first prefix sum (the empty prefix sum(xs[:0]).
>> >>
>> >> Which is exactly what a this_is_the_initial_value=0 argument would do
>> >> for us.
>>
>> [Greg Ewing ]
>> > In this case, yes. But that still doesn't mean it makes
>> > sense to require the initial value to be passed *in* as
>> > part of the input sequence.
>> >
>> > Maybe the best idea is for the initial value to be a
>> > separate argument, but be returned as the first item in
>> > the list.
>>
>> I'm not sure you've read all the messages in this thread, but that's
>> exactly what's being proposed.  That. e.g., a new optional argument:
>>
>> accumulate(xs, func, initial=S)
>>
>> act like the current
>>
>>  accumulate(chain([S], xs), func)
>>
>> Note that in neither case is the original `xs` modified in any way,
>> and in both cases the first value generated is S.
>>
>> Note too that the proposal is exactly the way Haskell's `scanl` works
>> (although `scanl` always requires specifying an initial value - while
>> the related `scanl1` doesn't allow specifying one).
>>
>> And that's all been so since the thread's first message, in which
>> Raymond gave a proposed implementation:
>>
>> _sentinel = object()
>>
>> def accumulate(iterable, func=operator.add, start=_sentinel):
>> it = iter(iterable)
>> if start is _sentinel:
>> try:
>> total = next(it)
>> except StopIteration:
>> return
>> else:
>> total = start
>> yield total
>> for element in it:
>> total = func(total, element)
>> yield total
>>
>> > I can think of another example where this would make
>> > sense. Suppose you have an initial bank balance and a
>> > list of transactions, and you want to produce a statement
>> > with a list of running balances.
>> >
>> > The initial balance and the list of transactions are
>> > coming from different places, so the most natural way
>> > to call it would be
>> >
>> >result = accumulate(transactions, initial = initial_balance)
>> >
>> > If the initial value is returned as item 0, then the
>> > result has the following properties:
>> >
>> >result[0] is the balance brought forward
>> >result[-1] is the current balance
>> >
>> > and this remains true in the corner case where there are
>> > no transactions.
>>
>> Indeed, something quite similar often applies when parallelizing
>> search loops of the form:
>>
>>  for candidate in accumulate(chain([starting_value], cycle(deltas))):
>>
>> For a sequence that eventually becomes periodic in the sequence of
>> deltas it cycles through, multiple processes can run independent
>> searches starting at carefully chosen different starting values "far"
>> apart.  In effect, they're each a "balance brought forward" pretending
>> that previous chunks have

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

Ok, so it seems everyone's happy with adding an initial_value argument.

Now, I claim that while it should be an option, the initial value should
NOT be returned by default.  (i.e. the returned generator should by default
yield N elements, not N+1).

Example: suppose we're doing the toll booth thing, and we want to yield a
cumulative sum of tolls so far.  Suppose someone already made a
reasonable-looking generator yielding the cumulative sum of tolls for today:

def iter_cumsum_tolls_from_day(day, toll_amount_so_far):
return accumulate(get_tolls_from_day(day, initial=toll_amount_so_far))

And now we want to make a way to get all tolls from the month.  One might
reasonably expect this to work:

def iter_cumsum_tolls_from_month(month, toll_amount_so_far):
for day in month:
for cumsum_tolls in iter_cumsum_tolls_from_day(day,
toll_amount_so_far = toll_amount_so_far):
yield cumsum_tolls
toll_amount_so_far = cumsum_tolls

But this would actually DUPLICATE the last toll of every day - it appears
both as the last element of the day's generator and as the first element of
the next day's generator.

This is why I think that there should be an additional "
include_initial_in_return=False" argument.  I do agree that it should be an
option to include the initial value (your "find tolls over time-span"
example shows why), but that if you want that you should have to show that
you thought about that by specifying "include_initial_in_return=True"





On Mon, Apr 9, 2018 at 10:30 PM, Tim Peters  wrote:

> [Tim]
> >> while we have N numbers, there are N+1 slice indices.  So
> >> accumulate(xs) doesn't quite work.  It needs to also have a 0 inserted
> >> as the first prefix sum (the empty prefix sum(xs[:0]).
> >>
> >> Which is exactly what a this_is_the_initial_value=0 argument would do
> >> for us.
>
> [Greg Ewing ]
> > In this case, yes. But that still doesn't mean it makes
> > sense to require the initial value to be passed *in* as
> > part of the input sequence.
> >
> > Maybe the best idea is for the initial value to be a
> > separate argument, but be returned as the first item in
> > the list.
>
> I'm not sure you've read all the messages in this thread, but that's
> exactly what's being proposed.  That. e.g., a new optional argument:
>
> accumulate(xs, func, initial=S)
>
> act like the current
>
>  accumulate(chain([S], xs), func)
>
> Note that in neither case is the original `xs` modified in any way,
> and in both cases the first value generated is S.
>
> Note too that the proposal is exactly the way Haskell's `scanl` works
> (although `scanl` always requires specifying an initial value - while
> the related `scanl1` doesn't allow specifying one).
>
> And that's all been so since the thread's first message, in which
> Raymond gave a proposed implementation:
>
> _sentinel = object()
>
> def accumulate(iterable, func=operator.add, start=_sentinel):
> it = iter(iterable)
> if start is _sentinel:
> try:
> total = next(it)
> except StopIteration:
> return
> else:
> total = start
> yield total
> for element in it:
> total = func(total, element)
> yield total
>
> > I can think of another example where this would make
> > sense. Suppose you have an initial bank balance and a
> > list of transactions, and you want to produce a statement
> > with a list of running balances.
> >
> > The initial balance and the list of transactions are
> > coming from different places, so the most natural way
> > to call it would be
> >
> >result = accumulate(transactions, initial = initial_balance)
> >
> > If the initial value is returned as item 0, then the
> > result has the following properties:
> >
> >result[0] is the balance brought forward
> >result[-1] is the current balance
> >
> > and this remains true in the corner case where there are
> > no transactions.
>
> Indeed, something quite similar often applies when parallelizing
> search loops of the form:
>
>  for candidate in accumulate(chain([starting_value], cycle(deltas))):
>
> For a sequence that eventually becomes periodic in the sequence of
> deltas it cycles through, multiple processes can run independent
> searches starting at carefully chosen different starting values "far"
> apart.  In effect, they're each a "balance brought forward" pretending
> that previous chunks have already been done.
>
> Funny:  it's been weeks now since I wrote an accumulate() that
> _didn't_ want to specify a starting value - LOL ;-)
> ___
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

[Tim]
>> while we have N numbers, there are N+1 slice indices.  So
>> accumulate(xs) doesn't quite work.  It needs to also have a 0 inserted
>> as the first prefix sum (the empty prefix sum(xs[:0]).
>>
>> Which is exactly what a this_is_the_initial_value=0 argument would do
>> for us.

[Greg Ewing ]
> In this case, yes. But that still doesn't mean it makes
> sense to require the initial value to be passed *in* as
> part of the input sequence.
>
> Maybe the best idea is for the initial value to be a
> separate argument, but be returned as the first item in
> the list.

I'm not sure you've read all the messages in this thread, but that's
exactly what's being proposed.  That. e.g., a new optional argument:

accumulate(xs, func, initial=S)

act like the current

 accumulate(chain([S], xs), func)

Note that in neither case is the original `xs` modified in any way,
and in both cases the first value generated is S.

Note too that the proposal is exactly the way Haskell's `scanl` works
(although `scanl` always requires specifying an initial value - while
the related `scanl1` doesn't allow specifying one).

And that's all been so since the thread's first message, in which
Raymond gave a proposed implementation:

_sentinel = object()

def accumulate(iterable, func=operator.add, start=_sentinel):
it = iter(iterable)
if start is _sentinel:
try:
total = next(it)
except StopIteration:
return
else:
total = start
yield total
for element in it:
total = func(total, element)
yield total

> I can think of another example where this would make
> sense. Suppose you have an initial bank balance and a
> list of transactions, and you want to produce a statement
> with a list of running balances.
>
> The initial balance and the list of transactions are
> coming from different places, so the most natural way
> to call it would be
>
>result = accumulate(transactions, initial = initial_balance)
>
> If the initial value is returned as item 0, then the
> result has the following properties:
>
>result[0] is the balance brought forward
>result[-1] is the current balance
>
> and this remains true in the corner case where there are
> no transactions.

Indeed, something quite similar often applies when parallelizing
search loops of the form:

 for candidate in accumulate(chain([starting_value], cycle(deltas))):

For a sequence that eventually becomes periodic in the sequence of
deltas it cycles through, multiple processes can run independent
searches starting at carefully chosen different starting values "far"
apart.  In effect, they're each a "balance brought forward" pretending
that previous chunks have already been done.

Funny:  it's been weeks now since I wrote an accumulate() that
_didn't_ want to specify a starting value - LOL ;-)
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

2018-04-09 Thread Greg Ewing


Tim Peters wrote:

while we have N numbers, there are N+1 slice indices.  So
accumulate(xs) doesn't quite work.  It needs to also have a 0 inserted
as the first prefix sum (the empty prefix sum(xs[:0]).

Which is exactly what a this_is_the_initial_value=0 argument would do
for us.


In this case, yes. But that still doesn't mean it makes
sense to require the initial value to be passed *in* as
part of the input sequence.

Maybe the best idea is for the initial value to be a
separate argument, but be returned as the first item in
the list.

I can think of another example where this would make
sense. Suppose you have an initial bank balance and a
list of transactions, and you want to produce a statement
with a list of running balances.

The initial balance and the list of transactions are
coming from different places, so the most natural way
to call it would be

   result = accumulate(transactions, initial = initial_balance)

If the initial value is returned as item 0, then the
result has the following properties:

   result[0] is the balance brought forward
   result[-1] is the current balance

and this remains true in the corner case where there are
no transactions.

--
Greg
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

2018-04-09 Thread Greg Ewing

Peter O'Connor wrote:
The behaviour where the first element of the return is the same as the 
first element of the input can be weird and confusing.  E.g. compare:

 >> list(itertools.accumulate([2, 3, 4], lambda accum, val: accum-val))
[2, -1, -5]
 >> list(itertools.accumulate([2, 3, 4], lambda accum, val: val-accum))
[2, 1, 3]

This is another symptom of the fact that the first item in
the list is taken to be the initial value. There's no way
to interpret these results in terms of an assumed initial
value, because neither of those functions has a left
identity.

--
Greg
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

Woo hoo!  Another coincidence.  I just happened to be playing with
this problem today:

You have a large list - xs - of N numbers.  It's necessary to compute slice sums

sum(xs[i:j])

for a great many slices, 0 <= i <= j <= N.

For concreteness, say xs is a time series representing a toll booth
receipt's by hour across years.  "Management" may ask for all sorts of
sums - by 24-hour period, by week, by month, by year, by season, ...

A little thought showed that sum(xs[i:j]) = sum(xs[:j]) - sum(xs[:i]),
so if we precomputed just the prefix sums, the sum across an arbitrary
slice could be computed thereafter in constant time.  Hard to beat
that ;-)

But computing the prefix sums is exactly what accumulate() does!  With
one twist:  while we have N numbers, there are N+1 slice indices.  So
accumulate(xs) doesn't quite work.  It needs to also have a 0 inserted
as the first prefix sum (the empty prefix sum(xs[:0]).

Which is exactly what a this_is_the_initial_value=0 argument would do
for us.  As is, using the chain trick:

class SliceSummer:
def __init__(self, xs):
from itertools import accumulate, chain
self.N = N = len(xs)
if not N:
raise ValueError("need a non-empty sequence")
self.prefixsum = list(accumulate(chain([0], xs)))
assert len(self.prefixsum) == N+1

def slicesum(self, i, j):
N = self.N
if not 0 <= i <= j <= N:
raise ValueError(f"need 0 <= {i} <= {j} <= {N}")
return self.prefixsum[j] - self.prefixsum[i]

def test(N):
from random import randrange
xs = [randrange(-10, 11) for _ in range(N)]
ntried = 0
ss = SliceSummer(xs)
NP1 = N + 1
for i in range(NP1):
for j in range(i, NP1):
ntried += 1
assert ss.slicesum(i, j) == sum(xs[i:j])
assert ntried == N * NP1 // 2 + NP1, ntried
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

2018-04-09 Thread Neil Girdhar



On Friday, April 6, 2018 at 9:03:05 PM UTC-4, Raymond Hettinger wrote:
>
> > On Friday, April 6, 2018 at 8:14:30 AM UTC-7, Guido van Rossum wrote: 
> > On Fri, Apr 6, 2018 at 7:47 AM, Peter O'Connor  
> wrote: 
> >>   So some more humble proposals would be: 
> >> 
> >> 1) An initializer to itertools.accumulate 
> >> functools.reduce already has an initializer, I can't see any 
> controversy to adding an initializer to itertools.accumulate 
> > 
> > See if that's accepted in the bug tracker. 
>
> It did come-up once but was closed for a number reasons including lack of 
> use cases.  However, Peter's signal processing example does sound 
> interesting, so we could re-open the discussion. 
>
> For those who want to think through the pluses and minuses, I've put 
> together a Q as food for thought (see below).  Everybody's design 
> instincts are different -- I'm curious what you all think think about the 
> proposal. 
>
>
> Raymond 
>
> - 
>
> Q. Can it be done? 
> A. Yes, it wouldn't be hard. 
>
> _sentinel = object() 
>
> def accumulate(iterable, func=operator.add, start=_sentinel): 
> it = iter(iterable) 
> if start is _sentinel: 
> try: 
> total = next(it) 
> except StopIteration: 
> return 
> else: 
> total = start 
> yield total 
> for element in it: 
> total = func(total, element) 
> yield total 
>
> Q. Do other languages do it? 
> A. Numpy, no. R, no. APL, no. Mathematica, no. Haskell, yes. 
>

Isn't numpy a yes?  
https://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.accumulate.html

They definitely support it for add and multiply.  It's defined, but doesn't 
seem to work on custum ufuncs (the result of frompyfunc).
  

>
> * 
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.accumulate.html
>  
> * https://stat.ethz.ch/R-manual/R-devel/library/base/html/cumsum.html 
> * http://microapl.com/apl/apl_concepts_chapter5.html 
>   \+ 1 2 3 4 5 
>   1 3 6 10 15 
> * https://reference.wolfram.com/language/ref/Accumulate.html 
> * https://www.haskell.org/hoogle/?hoogle=mapAccumL 
>
>
> Q. How much work for a person to do it currently? 
> A. Almost zero effort to write a simple helper function: 
>
>myaccum = lambda it, func, start: accumulate(chain([start], it), func) 
>
>
> Q. How common is the need? 
> A. Rare. 
>
>
> Q. Which would be better, a simple for-loop or a customized itertool? 
> A. The itertool is shorter but more opaque (especially with respect 
>to the argument order for the function call): 
>
> result = [start] 
> for x in iterable: 
>  y = func(result[-1], x) 
>  result.append(y) 
>
> versus: 
>
> result = list(accumulate(iterable, func, start=start)) 
>
>
> Q. How readable is the proposed code? 
> A. Look at the following code and ask yourself what it does: 
>
> accumulate(range(4, 6), operator.mul, start=6) 
>
>Now test your understanding: 
>
> How many values are emitted? 
> What is the first value emitted? 
> Are the two sixes related? 
> What is this code trying to accomplish? 
>
>
> Q. Are there potential surprises or oddities? 
> A. Is it readily apparent which of assertions will succeed? 
>
> a1 = sum(range(10)) 
> a2 = sum(range(10), 0) 
> assert a1 == a2 
>
> a3 = functools.reduce(operator.add, range(10)) 
> a4 = functools.reduce(operator.add, range(10), 0) 
> assert a3 == a4 
>
> a4 = list(accumulate(range(10), operator.add)) 
> a5 = list(accumulate(range(10), operator.add, start=0)) 
> assert a5 == a6 
>
>
> Q. What did the Python 3.0 Whatsnew document have to say about reduce()? 
> A. "Removed reduce(). Use functools.reduce() if you really need it; 
> however, 99 percent of the time an explicit for loop is more readable." 
>
>
> Q. What would this look like in real code? 
> A. We have almost no real-world examples, but here is one from a 
> StackExchange post: 
>
> def wsieve():   # wheel-sieve, by Will Ness.
> ideone.com/mqO25A->0hIE89 
> wh11 = [ 2,4,2,4,6,2,6,4,2,4,6,6, 2,6,4,2,6,4,6,8,4,2,4,2, 
>  4,8,6,4,6,2,4,6,2,6,6,4, 2,4,6,2,6,4,2,4,2,10,2,10] 
> cs = accumulate(cycle(wh11), start=11) 
> yield( next( cs))   #   cf. ideone.com/WFv4f 
> ps = wsieve()   # 
> codereview.stackexchange.com/q/92365/9064 
> p = next(ps)# 11 
> psq = p*p   # 121 
> D = dict( zip( accumulate(wh11, start=0), count(0)))   # start 
> from 
> sieve = {} 
> for c in cs: 
> if c in sieve: 
> wheel =

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

[Tim]
>> Then why was [accumulate] generalized to allow any 2-argument function?

[Raymond]
> Prior to 3.2, accumulate() was in the recipes section as pure Python
> code.  It had no particular restriction to numeric types.
>
> I received a number of requests for accumulate() to be promoted
> to a real itertool (fast, tested, documented C code with a stable API).
> I agreed and accumulate() was added to itertools in 3.2.  It worked
> with anything supporting __add__, including str, bytes, lists, and
> tuples.

So that's the restriction Nick had in mind:  a duck-typing kind, in
that it would blow up on types that don't participate in
PyNumber_Add():

> More specifically, accumulate_next() called PyNumber_Add() without
< any particular type restriction.
>
> Subsequently, I got requests to generalize accumulate() to support any
> arity-2 function (with operator.mul offered as the motivating example).

Sucker ;-)

>  Given that there were user requests and there were ample precedents
> in other languages, I acquiesced despite having some reservations (if used
> with a lambda, the function call overhead might make accumulate() slower
> than a plain Python for-loop without the function call). So, that generalized
> API extension went into 3.3 and has remained unchanged ever since.
>
> Afterwards, I was greeted with the sound of crickets.  Either it was nearly
> perfect or no one cared or both ;-)

Or nobody cared _enough_ to endure a 100-message thread arguing about
an objectively minor change ;-)

If you can borrow Guido's time machine, I'd suggest going back to the
original implementation, except name it `cumsum()` instead, and leave
`accumulate()` to the 3rd-party itertools packages (at least one of
which (itertoolz) has supported an optional "initial" argument all
along).

> It remains one of the least used itertools.

I don't see how that can be known.  There are at least tens of
thousands of Python programmers nobody on this list has ever heard
about - or from - writing code that's invisible to search engines.

I _believe_ it - I just don't see how it can be known.


> ...
> Honestly, I couldn't immediately tell what this code was doing:
>
> list(accumulate([8, 4, "k"], lambda x, y: x + [y], first_result=[]))

Of course you couldn't:  you think of accumulate() as being about
running sums, and _maybe_ some oddball out there using it for running
products.  But that's a statement about your background, seeing code
you've never seen before, not about the function.  Nobody knows
immediately, at first sight, what

 list(accumulate([8, 4, 6], lambda x, y: x + y, first_result=0))

does either.  It's learned.  If your background were in, e.g., Haskell
instead, then in the latter case you'd picture a list [a, b, c, ...]
and figure it out from thinking about what the prefixes of

0 + a + b + c + 

compute.  In exactly the same way, in the former case you'd think
about what the prefixes of

[] + [a] + [b] + [c] + ...

compute.  They're equally obvious _after_ undertaking that easy
exercise, but clear as mud before doing so.


> This may be a case where a person would be better-off without accumulate() at 
> all.

De gustibus non est disputandum.


>> In short, for _general_ use `accumulate()` needs `initial` for exactly
>> the same reasons `reduce()` needed it.

> The reduce() function had been much derided, so I've had it mentally filed
> in the anti-pattern category.  But yes, there may be wisdom there.

The current accumulate() isn't just akin to reduce(), it _is_
reduce(), except a drunken reduce() so nauseated it vomits its
internal state out after each new element it eats ;-)


>> BTW, the type signatures on the scanl (requires an initial value) and
>> scanl1 (does not support an initial value) implementations I pasted
>> from Haskell's Standard Prelude give a deeper reason:  without an
>> initial value, a list of values of type A can only produce another
>> list of values of type A via scanl1.  The dyadic function passed must
>> map As to As.  But with an initial value supplied of type B, scanl can
>> transform a list of values of type A to a list of values of type B.
>> While that may not have been obvious in the list prefix example I
>> gave, that was at work:  a list of As was transformed into a list _of_
>> lists of As.  That's impossible for scanl1 to do, but easy for scanl.

> Thanks for pointing that out.  I hadn't considered that someone might
> want to transform one type into another using accumulate().  That is
> pretty far from my mental model of what accumulate() was intended for.

It's nevertheless what the current function supports - nothing being
suggested changes that one whit.  It's "worse" in Python because while
only `scanl` in Haskell can "change types", the current `scanl1`-like
Python `accumulate` can change types too.  Perhaps the easiest way to
see that is by noting that

map(f, xs)

is generally equivalent to

accumulate(xs, lambda x, y: f(y))

right now.

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

Also Tim Peter's one-line example of:

print(list(itertools.accumulate([1, 2, 3], lambda x, y: str(x) + str(y

I think makes it clear that itertools.accumulate is not the right vehicle
for this change - we should make a new itertools function with a required
"initial" argument.

On Mon, Apr 9, 2018 at 1:44 PM, Peter O'Connor 
wrote:

> It seems clear that the name "accumulate" has been kind of antiquated
> since the "func" argument was added and "sum" became just a default.
>
> And people seem to disagree about whether the result should have a length
> N or length N+1 (where N is the number of elements in the input iterable).
>
> The behaviour where the first element of the return is the same as the
> first element of the input can be weird and confusing.  E.g. compare:
>
> >> list(itertools.accumulate([2, 3, 4], lambda accum, val: accum-val))
> [2, -1, -5]
> >> list(itertools.accumulate([2, 3, 4], lambda accum, val: val-accum))
> [2, 1, 3]
>
> One might expect that since the second function returned the negative of
> the first function, and both are linear, that the results of the second
> would be the negative of the first, but that is not the case.
>
> Maybe we can instead let "accumulate" fall into deprecation, and instead
> add a new more general itertools "reducemap" method:
>
> def reducemap(iterable: Iterable[Any], func: Callable[(Any, Any), Any],
> initial: Any, include_initial_in_return=False): -> Generator[Any]
>
> Benefits:
> - The name is more descriptive of the operation (a reduce operation where
> we keep values at each step, like a map)
> - The existence of include_initial_in_return=False makes it somewhat
> clear that the initial value will by default NOT be provided in the
> returning generator
> - The mandatory initial argument forces you to think about initial
> conditions.
>
> Disadvantages:
> - The most common use case (summation, product), has a "natural" first
> element (0, and 1, respectively) when you'd now be required to write out.
> (but we could just leave accumulate for sum).
>
> I still prefer a built-in language comprehension syntax for this like: (y
> := f(y, x) for x in x_vals from y=0), but for a huge discussion on that see
> the other thread.
>
> --- More Examples (using "accumulate" as the name for now)  ---
>
> # Kalman filters
> def kalman_filter_update(state, measurement):
> ...
> return state
>
> online_trajectory_estimate = accumulate(measurement_generator, func=
> kalman_filter_update, initial = initial_state)
>
> ---
>
> # Bayesian stats
> def update_model(prior, evidence):
>...
>return posterior
>
> model_history  = accumulate(evidence_generator, func=update_model,
> initial = prior_distribution)
>
> ---
>
> # Recurrent Neural networks:
> def recurrent_network_layer_step(last_hidden, current_input):
> new_hidden = 
> return new_hidden
>
> hidden_state_generator = accumulate(input_sequence, func=
> recurrent_network_layer_step, initial = initial_hidden_state)
>
>
>
>
> On Mon, Apr 9, 2018 at 7:14 AM, Nick Coghlan  wrote:
>
>> On 9 April 2018 at 14:38, Raymond Hettinger 
>> wrote:
>> >> On Apr 8, 2018, at 6:43 PM, Tim Peters  wrote:
>> >> In short, for _general_ use `accumulate()` needs `initial` for exactly
>> >> the same reasons `reduce()` needed it.
>> >
>> > The reduce() function had been much derided, so I've had it mentally
>> filed in the anti-pattern category.  But yes, there may be wisdom there.
>>
>> Weirdly (or perhaps not so weirdly, given my tendency to model
>> computational concepts procedurally), I find the operation of reduce()
>> easier to understand when it's framed as "last(accumulate(iterable,
>> binop, initial=value)))".
>>
>> Cheers,
>> Nick.
>>
>> --
>> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
>> ___
>> Python-ideas mailing list
>> Python-ideas@python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
>
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

It seems clear that the name "accumulate" has been kind of antiquated since
the "func" argument was added and "sum" became just a default.

And people seem to disagree about whether the result should have a length N
or length N+1 (where N is the number of elements in the input iterable).

The behaviour where the first element of the return is the same as the
first element of the input can be weird and confusing.  E.g. compare:

>> list(itertools.accumulate([2, 3, 4], lambda accum, val: accum-val))
[2, -1, -5]
>> list(itertools.accumulate([2, 3, 4], lambda accum, val: val-accum))
[2, 1, 3]

One might expect that since the second function returned the negative of
the first function, and both are linear, that the results of the second
would be the negative of the first, but that is not the case.

Maybe we can instead let "accumulate" fall into deprecation, and instead
add a new more general itertools "reducemap" method:

def reducemap(iterable: Iterable[Any], func: Callable[(Any, Any), Any],
initial: Any, include_initial_in_return=False): -> Generator[Any]

Benefits:
- The name is more descriptive of the operation (a reduce operation where
we keep values at each step, like a map)
- The existence of include_initial_in_return=False makes it somewhat clear
that the initial value will by default NOT be provided in the returning
generator
- The mandatory initial argument forces you to think about initial
conditions.

Disadvantages:
- The most common use case (summation, product), has a "natural" first
element (0, and 1, respectively) when you'd now be required to write out.
(but we could just leave accumulate for sum).

I still prefer a built-in language comprehension syntax for this like: (y
:= f(y, x) for x in x_vals from y=0), but for a huge discussion on that see
the other thread.

--- More Examples (using "accumulate" as the name for now)  ---

# Kalman filters
def kalman_filter_update(state, measurement):
...
return state

online_trajectory_estimate = accumulate(measurement_generator, func=
kalman_filter_update, initial = initial_state)

---

# Bayesian stats
def update_model(prior, evidence):
   ...
   return posterior

model_history  = accumulate(evidence_generator, func=update_model, initial
= prior_distribution)

---

# Recurrent Neural networks:
def recurrent_network_layer_step(last_hidden, current_input):
new_hidden = 
return new_hidden

hidden_state_generator = accumulate(input_sequence, func=
recurrent_network_layer_step, initial = initial_hidden_state)

On Mon, Apr 9, 2018 at 7:14 AM, Nick Coghlan  wrote:

> On 9 April 2018 at 14:38, Raymond Hettinger 
> wrote:
> >> On Apr 8, 2018, at 6:43 PM, Tim Peters  wrote:
> >> In short, for _general_ use `accumulate()` needs `initial` for exactly
> >> the same reasons `reduce()` needed it.
> >
> > The reduce() function had been much derided, so I've had it mentally
> filed in the anti-pattern category.  But yes, there may be wisdom there.
>
> Weirdly (or perhaps not so weirdly, given my tendency to model
> computational concepts procedurally), I find the operation of reduce()
> easier to understand when it's framed as "last(accumulate(iterable,
> binop, initial=value)))".
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
> ___
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

2018-04-09 Thread Nick Coghlan

On 9 April 2018 at 14:38, Raymond Hettinger  wrote:
>> On Apr 8, 2018, at 6:43 PM, Tim Peters  wrote:
>> In short, for _general_ use `accumulate()` needs `initial` for exactly
>> the same reasons `reduce()` needed it.
>
> The reduce() function had been much derided, so I've had it mentally filed in 
> the anti-pattern category.  But yes, there may be wisdom there.

Weirdly (or perhaps not so weirdly, given my tendency to model
computational concepts procedurally), I find the operation of reduce()
easier to understand when it's framed as "last(accumulate(iterable,
binop, initial=value)))".

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

2018-04-09 Thread Greg Ewing


Raymond Hettinger wrote:


I don't want to overstate the case, but I do think a function signature that
offers a "first_value" option is an invitation to treat the first value as
being distinct from the rest of the data stream.


I conjecture that the initial value is *always* special,
and the only cases where it seems not to be are where
you're relying on some implicit initial value such as
zero.

--
Greg

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

2018-04-08 Thread Raymond Hettinger

> On Apr 8, 2018, at 6:43 PM, Tim Peters  wrote:
> 
>> My other common case for accumulate() is building cumulative
>> probability distributions from probability mass functions (see the
>> code for random.choice() for example, or typical code for a K-S test).
> 
> So, a question:  why wasn't itertools.accumulate() written to accept
> iterables of _only_ numeric types?  Akin to `sum()`.  I gather from
> one of Nick's messages that it was so restricted in 3.2.  Then why was
> it generalized to allow any 2-argument function?

Prior to 3.2, accumulate() was in the recipes section as pure Python code.  It 
had no particular restriction to numeric types.

I received a number of requests for accumulate() to be promoted to a real 
itertool (fast, tested, documented C code with a stable API).  I agreed and 
accumulate() was added to itertools in 3.2.  It worked with anything supporting 
__add__, including str, bytes, lists, and tuples.  More specifically, 
accumulate_next() called PyNumber_Add() without any particular type restriction.

Subsequently, I got requests to generalize accumulate() to support any arity-2 
function (with operator.mul offered as the motivating example).  Given that 
there were user requests and there were ample precedents in other languages, I 
acquiesced despite having some reservations (if used with a lambda, the 
function call overhead might make accumulate() slower than a plain Python 
for-loop without the function call). So, that generalized API extension went 
into 3.3 and has remained unchanged ever since.

Afterwards, I was greeted with the sound of crickets.  Either it was nearly 
perfect or no one cared or both ;-)  

It remains one of the least used itertools.

> Given that it was, `sum()` is no longer particularly relevant:  the
> closest thing by far is now `functools.reduce()`, which does support
> an optional `initial` argument.  Which it really should, because it's
> impossible for the implementation to guess a suitable starting value
> for an arbitrary user-supplied dyadic function.
> 
> My example using accumulate() to generate list prefixes got snipped,
> but same thing there:  it's impossible for that snippet to work unless
> an empty list is supplied as the starting value.  And it's impossible
> for the accumulate() implementation to guess that.

Honestly, I couldn't immediately tell what this code was doing:

list(accumulate([8, 4, "k"], lambda x, y: x + [y], first_result=[]))

This may be a case where a person would be better-off without accumulate() at 
all.

> In short, for _general_ use `accumulate()` needs `initial` for exactly
> the same reasons `reduce()` needed it.

The reduce() function had been much derided, so I've had it mentally filed in 
the anti-pattern category.  But yes, there may be wisdom there.

> BTW, the type signatures on the scanl (requires an initial value) and
> scanl1 (does not support an initial value) implementations I pasted
> from Haskell's Standard Prelude give a deeper reason:  without an
> initial value, a list of values of type A can only produce another
> list of values of type A via scanl1.  The dyadic function passed must
> map As to As.  But with an initial value supplied of type B, scanl can
> transform a list of values of type A to a list of values of type B.
> While that may not have been obvious in the list prefix example I
> gave, that was at work:  a list of As was transformed into a list _of_
> lists of As.  That's impossible for scanl1 to do, but easy for scanl.

Thanks for pointing that out.  I hadn't considered that someone might want to 
transform one type into another using accumulate().  That is pretty far from my 
mental model of what accumulate() was intended for.  Also, I'm still not sure 
whether we would want code like that buried in an accumulate() call rather than 
as a regular for-loop where I can see the logic and trace through it with pdb.

As for scanl, I'm not sure what this code means without seeing some python 
equivalent.

scanl:: (a -> b -> a) -> a -> [b] -> [a]
scanl f q xs =  q : (case xs of
   []   -> []
   x:xs -> scanl f (f q x) xs)

scanl1   :: (a -> a -> a) -> [a] -> [a]
scanl1 f (x:xs)  =  scanl f x xs
scanl1 _ []  =  []

> Or, in short, someone coming from a typed functional language
> background sees all sorts of things that rarely (if ever) come up in
> number-crunching languages.  Their sensibilities should count too -
> although not much ;-)  They should get _some_ extra consideration in
> this context, though, because `itertools` is one of the first things
> they dig into when they give Python a try.

I concur.

>> and it would have been distracting to even had the option.
> 
> Distracting for how long?  One second or two? ;-)

Possibly forever.  In my experience, if a person initially frames a problem 
wrong (or perhaps in a hard to solve way), it can

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

2018-04-08 Thread Greg Ewing

Raymond Hettinger wrote:

For neither of those use case categories did I ever want an initial value and
it would have been distracting to even had the option. For example, when
doing a discounted cash flow analysis, I was taught to model the various
flows as a single sequence of up and down arrows rather than thinking of the
initial balance as a distinct concept¹

There's always an initial value, even if it's implicit.

The way accumulate() works can be thought of in two ways:

(1) The initial value is implicitly the identity of whatever
operation the function performs.

(2) The first item in the list is the initial value, and
the rest are items to be accumulated.

Both of these are somewhat dodgy, IMO. The first one works
only if the assumed identity is what you actually want, *and*
there is always at least one item to accumulate.

If those conditions don't hold, you need to insert the
initial value as the first item. But this is almost
certainly going to require extra code. The initial value
and the items are conceptually different things, and are
unlikely to start out in the same list together.

What's more, the first thing the implementation of
accumulate() does is extract the first item and treat it
differently from the rest. So your code goes out of its
way to insert the initial value, and then accumulate()
goes out of its way to pull it out again. Something
smells wrong about that.

As an example, suppose you have a list [1, 2, 3] and
you want to construct [], [1], [1, 2], [1, 2, 3].
To do that with accumulate() you need to write something
like:

accumulate([[], 1, 2, 3], lambda x, y: x + [y])

The fact that the first element of the list doesn't even
have the same *type* as the rest should be a strong hint
that forcing them to occupy the same list is an
unnatural thing to do.

--
Greg

Because of this background, I was surprised to have the question ever come up
at all (other than the symmetry argument that sum() has "start" so
accumulate() must as well).

When writing itertools.accumulate(), I started by looking to see what other
languages had done. Since accumulate() is primarily a numerical tool, I
expected that the experience of numeric-centric languages would have
something to teach us. My reasoning was that if the need hadn't arisen for
APL, R, Numpy, Matlab², or Mathematica, perhaps it really was just noise.

My views may be dated though. Looking at the wheel sieve and collatz glide
record finder, I see something new, a desire to work with lazy, potentially
infinite accumulations (something that iterators do well but almost never
arises in the world of fixed-length sequences or cumulative probability
distributions).

So I had been warming up to the idea, but got concerned that Nick could have
had such a profoundly different idea about what the code should do. That
cooled my interest a bit, especially when thinking about two key questions,
"Will it create more problems than it solves?" and "Will anyone actually use
it?".

Raymond

¹
http://www.chegg.com/homework-help/questions-and-answers/solve-present-worth-cash-flow-shown-using-three-interest-factors-10-interest-compounded-an-q878034

² https://www.mathworks.com/help/matlab/ref/accumarray.html
___ Python-ideas mailing list
Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas

Code of Conduct: http://python.org/psf/codeofconduct/

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

> >>> list(accumulate([1, 2, 3]))
> [11, 13, 16]

Wow!  I would have sworn that said

[1, 3, 6]

when I sent it.  Damn Gmail ;-)

> >>> list(accumulate([1, 2, 3], initial=10))
>  [10, 11, 13, 16]
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

[Raymond]
> The Bayesian world view isn't much different except they would
> prefer "prior" instead of "initial" or "start" ;-)
>
> my_changing_beliefs = accumulate(stream_of_new_evidence, bayes_rule, 
> prior=what_i_used_to_think)
>
> Though the two analogies are cute, I'm not sure they tell us much.  In ...

They're not intended to.  They're just intended to shake people loose
from picturing nothing subtler than adding a list of 3 integers ;-)


> My own experience with actually using accumulations in algorithmic
> code falls neatly into two groups.  Many years ago, I used APL
> extensively in accounting work and my recollection is that a part
> of the convenience of "\+" was that the sequence length didn't change
> (so that the various data arrays could interoperate with one another).

Sure.


> My other common case for accumulate() is building cumulative
> probability distributions from probability mass functions (see the
> code for random.choice() for example, or typical code for a K-S test).

So, a question:  why wasn't itertools.accumulate() written to accept
iterables of _only_ numeric types?  Akin to `sum()`.  I gather from
one of Nick's messages that it was so restricted in 3.2.  Then why was
it generalized to allow any 2-argument function?

Given that it was, `sum()` is no longer particularly relevant:  the
closest thing by far is now `functools.reduce()`, which does support
an optional `initial` argument.  Which it really should, because it's
impossible for the implementation to guess a suitable starting value
for an arbitrary user-supplied dyadic function.

My example using accumulate() to generate list prefixes got snipped,
but same thing there:  it's impossible for that snippet to work unless
an empty list is supplied as the starting value.  And it's impossible
for the accumulate() implementation to guess that.

In short, for _general_ use `accumulate()` needs `initial` for exactly
the same reasons `reduce()` needed it.

BTW, the type signatures on the scanl (requires an initial value) and
scanl1 (does not support an initial value) implementations I pasted
from Haskell's Standard Prelude give a deeper reason:  without an
initial value, a list of values of type A can only produce another
list of values of type A via scanl1.  The dyadic function passed must
map As to As.  But with an initial value supplied of type B, scanl can
transform a list of values of type A to a list of values of type B.
While that may not have been obvious in the list prefix example I
gave, that was at work:  a list of As was transformed into a list _of_
lists of As.  That's impossible for scanl1 to do, but easy for scanl.

Or, in short, someone coming from a typed functional language
background sees all sorts of things that rarely (if ever) come up in
number-crunching languages.  Their sensibilities should count too -
although not much ;-)  They should get _some_ extra consideration in
this context, though, because `itertools` is one of the first things
they dig into when they give Python a try.


> For neither of those use case categories did I ever want an initial value

As above, in all your related experiences "0" was a suitable base
value, so you had no reason to care.

> and it would have been distracting to even had the option.

Distracting for how long?  One second or two? ;-)


> ...
> Because of this background, I was surprised to have the question ever
> come up at all (other than the symmetry argument that sum() has "start"
> so accumulate() must as well).

As above, the real parallel here is to reduce().  `sum()` became an
historical red herring when `accumulate()` was generalized.

With a different background, you may just as well have been surprised
if the question _hadn't_ come up.  For example, this is a standard
example in the Haskell world for how to define an infinite Fibonacci
sequence with the initial two values f0 and f1:

fibs = f0 : scanl (+) f1 fibs

The part from `scanl` onward would be spelled in Python as

accumulate(fibs, initial=f1)

although it requires some trickery to get the recursive reference to
work (details on request, but I'm sure you already know how to do
that).


> When writing itertools.accumulate(), I started by looking to see what
> other languages had done.  Since accumulate() is primarily a
> numerical tool, I expected that the experience of numeric-centric
> languages would have something to teach us.  My reasoning was
> that if the need hadn't arisen for APL, R, Numpy, Matlab², or Mathematica,
> perhaps it really was just noise.

In the itertools context, I also would have looked hard at Haskell experience.

BTW, whoever wrote the current `accumulate()` docs also found a use
for `initial`, but hacked around it:

"""
First-order recurrence relations can be modeled by supplying the
initial value in the iterable and using only the accumulated total in
func argument:
"""

followed by:

>>> logistic_map = lambda x, _:  r * x * (1 - x)
>>> r = 3.8
>>> x0 = 0.4
>>>

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

2018-04-08 Thread Raymond Hettinger

> On Apr 8, 2018, at 12:22 PM, Tim Peters  wrote:
> 
> [Guido]
>> Well if you can get Raymond to agree on that too I suppose you can go ahead.
>> Personally I'm -0 but I don't really write this kind of algorithmic code
>> enough to know what's useful.
> 
> Actually, you do - but you don't _think_ of problems in these terms.
> Neither do I.  For those who do:  consider any program that has state
> and responds to inputs.  When you get a new input, the new state is a
> function of the existing state and the input.

The Bayesian world view isn't much different except they would prefer "prior" 
instead of "initial" or "start" ;-)

my_changing_beliefs = accumulate(stream_of_new_evidence, bayes_rule, 
prior=what_i_used_to_think)

Though the two analogies are cute, I'm not sure they tell us much.  In running 
programs or bayesian analysis, we care more about the result rather than the 
accumulation of intermediate results.

My own experience with actually using accumulations in algorithmic code falls 
neatly into two groups.  Many years ago, I used APL extensively in accounting 
work and my recollection is that a part of the convenience of "\+" was that the 
sequence length didn't change (so that the various data arrays could 
interoperate with one another).  

My other common case for accumulate() is building cumulative probability 
distributions from probability mass functions (see the code for random.choice() 
for example, or typical code for a K-S test).

For neither of those use case categories did I ever want an initial value and 
it would have been distracting to even had the option. For example, when doing 
a discounted cash flow analysis, I was taught to model the various flows as a 
single sequence of up and down arrows rather than thinking of the initial 
balance as a distinct concept¹

Because of this background, I was surprised to have the question ever come up 
at all (other than the symmetry argument that sum() has "start" so accumulate() 
must as well).

When writing itertools.accumulate(), I started by looking to see what other 
languages had done.  Since accumulate() is primarily a numerical tool, I 
expected that the experience of numeric-centric languages would have something 
to teach us.  My reasoning was that if the need hadn't arisen for APL, R, 
Numpy, Matlab², or Mathematica, perhaps it really was just noise.

My views may be dated though.  Looking at the wheel sieve and collatz glide 
record finder, I see something new, a desire to work with lazy, potentially 
infinite accumulations (something that iterators do well but almost never 
arises in the world of fixed-length sequences or cumulative probability 
distributions).

So I had been warming up to the idea, but got concerned that Nick could have 
had such a profoundly different idea about what the code should do.  That 
cooled my interest a bit, especially when thinking about two key questions, 
"Will it create more problems than it solves?" and "Will anyone actually use 
it?".

Raymond

¹ 
http://www.chegg.com/homework-help/questions-and-answers/solve-present-worth-cash-flow-shown-using-three-interest-factors-10-interest-compounded-an-q878034

² https://www.mathworks.com/help/matlab/ref/accumarray.html
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

[Guido]
> Well if you can get Raymond to agree on that too I suppose you can go ahead.
> Personally I'm -0 but I don't really write this kind of algorithmic code
> enough to know what's useful.

Actually, you do - but you don't _think_ of problems in these terms.
Neither do I.  For those who do:  consider any program that has state
and responds to inputs.  When you get a new input, the new state is a
function of the existing state and the input.  That's `accumulate`!
It generates a sequence of new states as the sequence of incrementally
updated states derived from a sequence of inputs according to the
function passed to accumulate.

In that view of the world, specifying a starting value is merely
specifying the program's initial state - and from that view of the
world, not allowing to specify a starting value is bizarre.

While Will Ness's wheel sieve program, and my Collatz
glide-record-finder, don't _derive_ from that view of the world, it
turns out that specifying (and returning) an initial value is also
useful for them, and despite that they're just doing integer running
sums.

A simple example from the broader view of the world is generating all
the prefixes of a list:

>>> from itertools import *
>>> list(accumulate(chain([[]], [8, 4, "k"]), lambda x, y: x + [y]))
[[], [8], [8, 4], [8, 4, 'k']]

That's obviously easier to follow if written, e.g.,

list(accumulate([8, 4, "k"], lambda x, y: x + [y], first_result=[]))


> I do think that the new parameter name ["first_result"] is ugly. But maybe 
> that's
> the point.

I noted later that the `accumulate` in the Python itertoolz package
names its optional argument "initial".  That's not ugly, and works
just as well for me.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

2018-04-08 Thread Kirill Balunov

2018-04-08 8:19 GMT+03:00 Nick Coghlan :

> A name like "first_result" would also make it clearer to readers that
> passing that parameter has an impact on the length of the output
> series (since you're injecting an extra result), and also that the
> production of the first result skips calling func completely (as can
> be seen in Tim's str coercion example).
>
> So where I'd be -1 on:
>
> >>> list(accumulate(1, 2, 3))
> [1, 3, 6]
> >>> list(accumulate(1, 2, 3, start=0))
> [0, 1, 3, 6]
> >>> list(accumulate(1, 2, 3, start=1))
> [1, 2, 4, 7]
>
> I'd be +1 on:
>
> >>> list(accumulate(1, 2, 3))
> [1, 3, 6]
> >>> list(accumulate(1, 2, 3, first_result=0))
> [0, 1, 3, 6]
> >>> list(accumulate(1, 2, 3, first_result=1))
> [1, 2, 4, 7]
>
>
It is a fair point! But the usual way to understand how to use an
additional argument, is to try it or to look examples in the documentation.

Concerning the topic of relationship between `sum` and `accumulate` I have
another point of view. If `start` means something to the `sum`, there are
no grounds for believing that it should mean the same for
`accumulate`, these functions are not really comparable and fundamentally
different. The closest `sum`s friend is `functools.reduce` which uses
`initial` instead of `start`. ( the documentation uses `initializer` and
the docstring uses `initial`, as for me I prefer the `initial`) and so
there is already a discrepancy.

Having said this I think that it is no matter how it will be named `start`
or `initial` or `first_result` or `first`, which is more suitable. I would
prefer `initial` to be the same as in `itertoolz` package. Regarding the
second point, should it yield one more element if provided - I think
everyone here agrees that yes.

With kind regards,
-gdg
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

Another bit of prior art:  the Python itertoolz package also supplies
`accumulate()`, with an optional `initial` argument.  I stumbled into
that when reading a Stackoverflow "how can I do Haskell's scanl in
Python?" question.

https://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.accumulate


On Fri, Apr 6, 2018 at 8:02 PM, Raymond Hettinger
 wrote:
>> On Friday, April 6, 2018 at 8:14:30 AM UTC-7, Guido van Rossum wrote:
>> On Fri, Apr 6, 2018 at 7:47 AM, Peter O'Connor  wrote:
>>>   So some more humble proposals would be:
>>>
>>> 1) An initializer to itertools.accumulate
>>> functools.reduce already has an initializer, I can't see any controversy to 
>>> adding an initializer to itertools.accumulate
>>
>> See if that's accepted in the bug tracker.
>
> It did come-up once but was closed for a number reasons including lack of use 
> cases.  However, Peter's signal processing example does sound interesting, so 
> we could re-open the discussion.
>
> For those who want to think through the pluses and minuses, I've put together 
> a Q as food for thought (see below).  Everybody's design instincts are 
> different -- I'm curious what you all think think about the proposal.
>
>
> Raymond
>
> -
>
> Q. Can it be done?
> A. Yes, it wouldn't be hard.
>
> _sentinel = object()
>
> def accumulate(iterable, func=operator.add, start=_sentinel):
> it = iter(iterable)
> if start is _sentinel:
> try:
> total = next(it)
> except StopIteration:
> return
> else:
> total = start
> yield total
> for element in it:
> total = func(total, element)
> yield total
>
> Q. Do other languages do it?
> A. Numpy, no. R, no. APL, no. Mathematica, no. Haskell, yes.
>
> * 
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.accumulate.html
> * https://stat.ethz.ch/R-manual/R-devel/library/base/html/cumsum.html
> * http://microapl.com/apl/apl_concepts_chapter5.html
>   \+ 1 2 3 4 5
>   1 3 6 10 15
> * https://reference.wolfram.com/language/ref/Accumulate.html
> * https://www.haskell.org/hoogle/?hoogle=mapAccumL
>
>
> Q. How much work for a person to do it currently?
> A. Almost zero effort to write a simple helper function:
>
>myaccum = lambda it, func, start: accumulate(chain([start], it), func)
>
>
> Q. How common is the need?
> A. Rare.
>
>
> Q. Which would be better, a simple for-loop or a customized itertool?
> A. The itertool is shorter but more opaque (especially with respect
>to the argument order for the function call):
>
> result = [start]
> for x in iterable:
>  y = func(result[-1], x)
>  result.append(y)
>
> versus:
>
> result = list(accumulate(iterable, func, start=start))
>
>
> Q. How readable is the proposed code?
> A. Look at the following code and ask yourself what it does:
>
> accumulate(range(4, 6), operator.mul, start=6)
>
>Now test your understanding:
>
> How many values are emitted?
> What is the first value emitted?
> Are the two sixes related?
> What is this code trying to accomplish?
>
>
> Q. Are there potential surprises or oddities?
> A. Is it readily apparent which of assertions will succeed?
>
> a1 = sum(range(10))
> a2 = sum(range(10), 0)
> assert a1 == a2
>
> a3 = functools.reduce(operator.add, range(10))
> a4 = functools.reduce(operator.add, range(10), 0)
> assert a3 == a4
>
> a4 = list(accumulate(range(10), operator.add))
> a5 = list(accumulate(range(10), operator.add, start=0))
> assert a5 == a6
>
>
> Q. What did the Python 3.0 Whatsnew document have to say about reduce()?
> A. "Removed reduce(). Use functools.reduce() if you really need it; however, 
> 99 percent of the time an explicit for loop is more readable."
>
>
> Q. What would this look like in real code?
> A. We have almost no real-world examples, but here is one from a 
> StackExchange post:
>
> def wsieve():   # wheel-sieve, by Will Ness.
> ideone.com/mqO25A->0hIE89
> wh11 = [ 2,4,2,4,6,2,6,4,2,4,6,6, 2,6,4,2,6,4,6,8,4,2,4,2,
>  4,8,6,4,6,2,4,6,2,6,6,4, 2,4,6,2,6,4,2,4,2,10,2,10]
> cs = accumulate(cycle(wh11), start=11)
> yield( next( cs))   #   cf. ideone.com/WFv4f
> ps = wsieve()   # 
> codereview.stackexchange.com/q/92365/9064
> p = next(ps)# 11
> psq = p*p   # 121
> D = dict( zip( accumulate(wh11, start=0), count(0)))   # start 
> from
> sieve = {}
> for c in cs:
> if c in sieve:
> wheel = sieve.pop(c)
>

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

2018-04-08 Thread Guido van Rossum

Well if you can get Raymond to agree on that too I suppose you can go
ahead. Personally I'm -0 but I don't really write this kind of algorithmic
code enough to know what's useful. I do think that the new parameter name
is ugly. But maybe that's the point.

On Sat, Apr 7, 2018 at 10:26 PM, Tim Peters  wrote:

> Top-posting just to say I agree with Nick's bottom line (changing the
> name to `first_result=`).  I remain just +0.5, although that is up a
> notch from yesterday's +0.4 ;-)
>
> --- nothing new below ---
>
> On Sun, Apr 8, 2018 at 12:19 AM, Nick Coghlan  wrote:
> > On 8 April 2018 at 14:31, Guido van Rossum  wrote:
> >> Given that two respected members of the community so strongly disagree
> >> whether accumulate([], start=0) should behave like accumulate([]) or
> like
> >> accumulate([0]), maybe in the end it's better not to add a start
> argument.
> >> (The disagreement suggests that we can't trust users' intuition here.)
> >
> > The potential ambiguity I see is created mainly by calling the
> > proposed parameter "start", while having it do more than just adjust
> > the individual partial sum calculations by adding an extra partial
> > result to the output series.
> >
> > If it's called something else (e.g. "first_result"), then the
> > potential "sum(iterable, start=start)" misinterpretation goes away,
> > and it can have Tim's desired effect of defining the first output
> > value (effectively prepending it to the input iterable, without the
> > boilerplate and overhead of actually doing so).
> >
> > A name like "first_result" would also make it clearer to readers that
> > passing that parameter has an impact on the length of the output
> > series (since you're injecting an extra result), and also that the
> > production of the first result skips calling func completely (as can
> > be seen in Tim's str coercion example).
> >
> > So where I'd be -1 on:
> >
> > >>> list(accumulate(1, 2, 3))
> > [1, 3, 6]
> > >>> list(accumulate(1, 2, 3, start=0))
> > [0, 1, 3, 6]
> > >>> list(accumulate(1, 2, 3, start=1))
> > [1, 2, 4, 7]
> >
> > I'd be +1 on:
> >
> > >>> list(accumulate(1, 2, 3))
> > [1, 3, 6]
> > >>> list(accumulate(1, 2, 3, first_result=0))
> > [0, 1, 3, 6]
> > >>> list(accumulate(1, 2, 3, first_result=1))
> > [1, 2, 4, 7]
> >
> > Cheers,
> > Nick.
> >
> > --
> > Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
>



-- 
--Guido van Rossum (python.org/~guido)
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

FYI:

[Raymond]
> ...
> Q. Do other languages do it?
> A. Numpy, no. R, no. APL, no. Mathematica, no. Haskell, yes.
>
>...
> * https://www.haskell.org/hoogle/?hoogle=mapAccumL

Haskell has millions of functions ;-)  `mapAccumL` is a God-awful
mixture of Python's map(), reduce(), and accumulate() :-(  The
examples here should convince you it's nearly incomprehensible:

http://zvon.org/other/haskell/Outputlist/mapAccumL_f.html

A more-than-less direct equivalent to Python's `accumulate` is
Haskell's `scanl1`:

http://zvon.org/comp/r/ref-Haskell.html#Functions~Prelude.scanl1

That doesn't allow specifying an initial value.  But, Haskell being
Haskell, the closely related `scanl` function requires specifying an
initial value, which is also the first element of the list it returns:

http://zvon.org/comp/r/ref-Haskell.html#Functions~Prelude.scanl

Of the two, `scanl` is more basic - indeed, the Standard Prelude
defines scanl1 by peeling off the first element of the list and
passing it as the initial value for scanl to use

scanl:: (a -> b -> a) -> a -> [b] -> [a]
scanl f q xs =  q : (case xs of
[]   -> []
x:xs -> scanl f (f q x) xs)


scanl1   :: (a -> a -> a) -> [a] -> [a]
scanl1 f (x:xs)  =  scanl f x xs
scanl1 _ []  =  []

There are also analogous `scanr` and `scanr1` functions for doing
"right-to-left" accumulation.

So, bottom line:  as usual, when looking to Haskell for prior art,,
"all of the above - for a start" applies ;-)
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

Just a nit here:

[Tim]
>> ...
>> Arguing that it "has to do" something exactly the way `sum()` happens
>> to be implemented just doesn't follow - not even if they happen to
>> give the same name to an optional argument.  If the function were
>> named `accumulate_sum()`, and restricted to numeric types, maybe - but
>> it's not.

[Nick]
> When first added in 3.2, it did have that restriction, and the default
> behaviour without a function argument still parallels repeated use of
> the sum() builtin.

They're not quite the same today.  For example, if you have an object
`x` of a custom numeric class,

sum([x])

invokes x.__radd__ once (it really does do `x.__radd__(0)`), but

list(itertools.accumulate([x]))

just returns [x] without invoking x.__add__ or x.__radd__ at all.


> Extending the parallel to *also* include a parameter called "start"
> would create the expectation for me that that parameter would adjust
> the partial result calculations, without adding an extra result.
>
> Other parameter names (like the "first_result" I suggested in my reply
> to Guido) wouldn't have the same implications for me, so this
> objection is specific to the use of "start" as the parameter name, not
> to the entire idea.

Yup!  That's fine by me too - and preferable even ;-)
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

Top-posting just to say I agree with Nick's bottom line (changing the
name to `first_result=`).  I remain just +0.5, although that is up a
notch from yesterday's +0.4 ;-)

--- nothing new below ---

On Sun, Apr 8, 2018 at 12:19 AM, Nick Coghlan  wrote:
> On 8 April 2018 at 14:31, Guido van Rossum  wrote:
>> Given that two respected members of the community so strongly disagree
>> whether accumulate([], start=0) should behave like accumulate([]) or like
>> accumulate([0]), maybe in the end it's better not to add a start argument.
>> (The disagreement suggests that we can't trust users' intuition here.)
>
> The potential ambiguity I see is created mainly by calling the
> proposed parameter "start", while having it do more than just adjust
> the individual partial sum calculations by adding an extra partial
> result to the output series.
>
> If it's called something else (e.g. "first_result"), then the
> potential "sum(iterable, start=start)" misinterpretation goes away,
> and it can have Tim's desired effect of defining the first output
> value (effectively prepending it to the input iterable, without the
> boilerplate and overhead of actually doing so).
>
> A name like "first_result" would also make it clearer to readers that
> passing that parameter has an impact on the length of the output
> series (since you're injecting an extra result), and also that the
> production of the first result skips calling func completely (as can
> be seen in Tim's str coercion example).
>
> So where I'd be -1 on:
>
> >>> list(accumulate(1, 2, 3))
> [1, 3, 6]
> >>> list(accumulate(1, 2, 3, start=0))
> [0, 1, 3, 6]
> >>> list(accumulate(1, 2, 3, start=1))
> [1, 2, 4, 7]
>
> I'd be +1 on:
>
> >>> list(accumulate(1, 2, 3))
> [1, 3, 6]
> >>> list(accumulate(1, 2, 3, first_result=0))
> [0, 1, 3, 6]
> >>> list(accumulate(1, 2, 3, first_result=1))
> [1, 2, 4, 7]
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

On 8 April 2018 at 15:00, Tim Peters  wrote:
> `accumulate()` accepts any two-argument function.
>
 itertools.accumulate([1, 2, 3], lambda x, y: str(x) + str(y))
> 
 list(_)
> [1, '12', '123']
>
> Arguing that it "has to do" something exactly the way `sum()` happens
> to be implemented just doesn't follow - not even if they happen to
> give the same name to an optional argument.  If the function were
> named `accumulate_sum()`, and restricted to numeric types, maybe - but
> it's not.

When first added in 3.2, it did have that restriction, and the default
behaviour without a function argument still parallels repeated use of
the sum() builtin.

Extending the parallel to *also* include a parameter called "start"
would create the expectation for me that that parameter would adjust
the partial result calculations, without adding an extra result.

Other parameter names (like the "first_result" I suggested in my reply
to Guido) wouldn't have the same implications for me, so this
objection is specific to the use of "start" as the parameter name, not
to the entire idea.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

On 8 April 2018 at 14:31, Guido van Rossum  wrote:
> Given that two respected members of the community so strongly disagree
> whether accumulate([], start=0) should behave like accumulate([]) or like
> accumulate([0]), maybe in the end it's better not to add a start argument.
> (The disagreement suggests that we can't trust users' intuition here.)

The potential ambiguity I see is created mainly by calling the
proposed parameter "start", while having it do more than just adjust
the individual partial sum calculations by adding an extra partial
result to the output series.

If it's called something else (e.g. "first_result"), then the
potential "sum(iterable, start=start)" misinterpretation goes away,
and it can have Tim's desired effect of defining the first output
value (effectively prepending it to the input iterable, without the
boilerplate and overhead of actually doing so).

A name like "first_result" would also make it clearer to readers that
passing that parameter has an impact on the length of the output
series (since you're injecting an extra result), and also that the
production of the first result skips calling func completely (as can
be seen in Tim's str coercion example).

So where I'd be -1 on:

>>> list(accumulate(1, 2, 3))
[1, 3, 6]
>>> list(accumulate(1, 2, 3, start=0))
[0, 1, 3, 6]
>>> list(accumulate(1, 2, 3, start=1))
[1, 2, 4, 7]

I'd be +1 on:

>>> list(accumulate(1, 2, 3))
[1, 3, 6]
>>> list(accumulate(1, 2, 3, first_result=0))
[0, 1, 3, 6]
>>> list(accumulate(1, 2, 3, first_result=1))
[1, 2, 4, 7]

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

Nick, sorry, but your arguments still make little sense to me.  I
think you're pushing an analogy between `sum()` details and
`accumulate()` way too far,  changing a simple idea into a
needlessly complicated one.

`accumulate()` can do anything at all it wants to do with a `start`
argument (if it grows one), and a "default" of start=0 makes no sense:
 unlike `sum()`, `accumulate()` is not

specifically for use with numeric values and may
reject non-numeric types [from the `sum()` docs]

`accumulate()` accepts any two-argument function.

>>> itertools.accumulate([1, 2, 3], lambda x, y: str(x) + str(y))

>>> list(_)
[1, '12', '123']

Arguing that it "has to do" something exactly the way `sum()` happens
to be implemented just doesn't follow - not even if they happen to
give the same name to an optional argument.  If the function were
named `accumulate_sum()`, and restricted to numeric types, maybe - but
it's not.


[Nick Coghlan ]
> ...
> That concern mostly goes away if the new parameter is deliberately
> called something *other than* "start" (e.g. "prepend=value", or
> "first=value"), but it could also be addressed by offering a dedicated
> "yield_start" toggle, such that the revised semantics were:
>
> def accumulate(iterable, func=operator.add, start=0, 
> yield_start=False):
> it = iter(iterable)
> total = start
> if yield_start:
> yield total
> for element in it:
> total = func(total, element)
> yield total
>
> That approach would have the advantage of making the default value of
> "start" much easier to document (since it would just be zero, the same
> as it is for sum()), and only the length of the input iterable and
> "yield_start" would affect how many partial sums were produced.

As above, start=0 is senseless for `accumulate` (despite that it makes
sense for `sum`).  Raymond gave the obvious implementation in his
original message.

If you reworked your implementation to accommodate that NO sensible
default for `start` exists except for the one Raymond used (a unique
object private to the implementation, so he knows for sure whether or
not `start` was passed), you'd end up with his implementation ;-)

`yield_start` looks like a nuisance in any case.  As already
explained, most uses want the `start` value if it's given, and in
cases where it isn't it's trivial to discard by doing `next()` once on
the result.  Of course it could be added - but why bother?
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

2018-04-07 Thread Guido van Rossum

Given that two respected members of the community so strongly disagree
whether accumulate([], start=0) should behave like accumulate([]) or like
accumulate([0]), maybe in the end it's better not to add a start argument.
(The disagreement suggests that we can't trust users' intuition here.)

On Sat, Apr 7, 2018 at 9:14 PM, Nick Coghlan  wrote:

> On 8 April 2018 at 13:17, Tim Peters  wrote:
> > [Nick Coghlan ]
> >> So I now think that having "start" as a parameter to one but not the
> >> other, counts as a genuine API discrepancy.
> >
> > Genuine but minor ;-)
>
> Agreed :)
>
> >> Providing start to accumulate would then mean the same thing as
> >> providing it to sum(): it would change the basis point for the first
> >> addition operation, but it wouldn't change the *number* of cumulative
> >> sums produced.
> >
> > That makes no sense to me.  `sum()` with a `start` argument always
> > returns a single result, even if the iterable is empty.
> >
>  sum([], 42)
> > 42
>
> Right, but if itertools.accumulate() had the semantics of starting
> with a sum() over an empty iterable, then it would always start with
> an initial zero.
>
> It doesn't - it starts with "0+first_item", so the length of the
> output iterator matches the number of items in the input iterable:
>
> >>> list(accumulate([]))
> []
> >>> list(accumulate([1, 2, 3, 4]))
> [1, 3, 6, 10]
>
> That matches the output you'd get from a naive O(n^2) implementation
> of cumulative sums:
>
> data = list(iterable)
> for stop in range(1, len(iterable)):
> yield sum(data[:stop])
>
> So if the new parameter were to be called start, then I'd expect the
> semantics to be equivalent to:
>
> data = list(iterable)
> for stop in range(1, len(iterable)):
> yield sum(data[:stop], start=start)
>
> rather than the version Raymond posted at the top of the thread (where
> setting start explicitly also implicitly increases the number of items
> produced).
>
> That concern mostly goes away if the new parameter is deliberately
> called something *other than* "start" (e.g. "prepend=value", or
> "first=value"), but it could also be addressed by offering a dedicated
> "yield_start" toggle, such that the revised semantics were:
>
> def accumulate(iterable, func=operator.add, start=0,
> yield_start=False):
> it = iter(iterable)
> total = start
> if yield_start:
> yield total
> for element in it:
> total = func(total, element)
> yield total
>
> That approach would have the advantage of making the default value of
> "start" much easier to document (since it would just be zero, the same
> as it is for sum()), and only the length of the input iterable and
> "yield_start" would affect how many partial sums were produced.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
> ___
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>



-- 
--Guido van Rossum (python.org/~guido)
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

On 8 April 2018 at 13:17, Tim Peters  wrote:
> [Nick Coghlan ]
>> So I now think that having "start" as a parameter to one but not the
>> other, counts as a genuine API discrepancy.
>
> Genuine but minor ;-)

Agreed :)

>> Providing start to accumulate would then mean the same thing as
>> providing it to sum(): it would change the basis point for the first
>> addition operation, but it wouldn't change the *number* of cumulative
>> sums produced.
>
> That makes no sense to me.  `sum()` with a `start` argument always
> returns a single result, even if the iterable is empty.
>
 sum([], 42)
> 42

Right, but if itertools.accumulate() had the semantics of starting
with a sum() over an empty iterable, then it would always start with
an initial zero.

It doesn't - it starts with "0+first_item", so the length of the
output iterator matches the number of items in the input iterable:

>>> list(accumulate([]))
[]
>>> list(accumulate([1, 2, 3, 4]))
[1, 3, 6, 10]

That matches the output you'd get from a naive O(n^2) implementation
of cumulative sums:

data = list(iterable)
for stop in range(1, len(iterable)):
yield sum(data[:stop])

So if the new parameter were to be called start, then I'd expect the
semantics to be equivalent to:

data = list(iterable)
for stop in range(1, len(iterable)):
yield sum(data[:stop], start=start)

rather than the version Raymond posted at the top of the thread (where
setting start explicitly also implicitly increases the number of items
produced).

That concern mostly goes away if the new parameter is deliberately
called something *other than* "start" (e.g. "prepend=value", or
"first=value"), but it could also be addressed by offering a dedicated
"yield_start" toggle, such that the revised semantics were:

def accumulate(iterable, func=operator.add, start=0, yield_start=False):
it = iter(iterable)
total = start
if yield_start:
yield total
for element in it:
total = func(total, element)
yield total

That approach would have the advantage of making the default value of
"start" much easier to document (since it would just be zero, the same
as it is for sum()), and only the length of the input iterable and
"yield_start" would affect how many partial sums were produced.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

[Nick Coghlan ]
> I didn't have a strong opinion either way until Tim mentioned sum()
> and then I went and checked the docs for both that and for accumulate.
>
> First sentence of the sum() docs:
>
> Sums *start* and the items of an *iterable* from left to right and
> returns the total.
>
> First sentence of the accumulate docs:
>
> Make an iterator that returns accumulated sums, ...
>
> So I now think that having "start" as a parameter to one but not the
> other, counts as a genuine API discrepancy.

Genuine but minor ;-)


> Providing start to accumulate would then mean the same thing as
> providing it to sum(): it would change the basis point for the first
> addition operation, but it wouldn't change the *number* of cumulative
> sums produced.

That makes no sense to me.  `sum()` with a `start` argument always
returns a single result, even if the iterable is empty.

>>> sum([], 42)
42

As the example shows, it's possible that `sum()` does no additions
whatsoever.  It would be exceedingly bizarre if the same stuff passed
to `accumulate()` returned an empty iterator instead:

>>> list(accumulate([], start=42))
[]

It should return [42].

It seems obvious to me that a sane implementation would maintain the invariant:

sum(xs, s) == list(accumulate(xs, start=s))[-1]

and there's nothing inherently special about `xs` being empty.

It seems also obviously desirable that

accumulate(xs, start=s)

generate the same results as

accumulate(chain([s], xs))

That's obviously desirable because it's _so_ obvious that Raymond
implicitly assumed that's how it would work in his first message ;-)

Or think of it this way:  if you're adding N numbers, there are N-1
additions, and N partial sums.  Whether it's `sum(xs)` or
`accumulate(xs)`, if len(xs)==K then specifying `start` too changes
the number of addends from K to K+1.


> By contrast, using the prepend() approach with accumulate() not only
> changes the starting value, it also changes the number of cumulative
> sums produced.

As it should :-)

Note that that in the "real life" example code I gave, it was
essential that `accumulate()` with `start` yield the starting value
first.  There were three uses in Will Ness's wheel sieve code, two of
which wanted the starting value on its own, and the last of which
didn't.  In that last case, it was just a matter of doing

next(wheel)

on its own to discard the (in that specific case) unwanted starting
value.  If you have to paste the starting value _in_ instead (when it
is wanted), then we're reintroducing a need for the "chain a singleton
list with the iterator" hack introducing `start=` is trying to
eliminate.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

On 8 April 2018 at 08:09, Tim Peters  wrote:
[Raymond wrote]:
>> The docs probably need another recipe to show this pattern:
>>
>> def prepend(value, iterator):
>> "prepend(1, [2, 3, 4]) -> 1 2 3 4"
>> return chain([value], iterator)
>
> +1.  Whether `accumulate()` should grow a `start=` argument still
> seems a distinct (albeit related) issue to me, though.

I didn't have a strong opinion either way until Tim mentioned sum()
and then I went and checked the docs for both that and for accumulate.

First sentence of the sum() docs:

Sums *start* and the items of an *iterable* from left to right and
returns the total.

First sentence of the accumulate docs:

Make an iterator that returns accumulated sums, ...

So I now think that having "start" as a parameter to one but not the
other, counts as a genuine API discrepancy.

Providing start to accumulate would then mean the same thing as
providing it to sum(): it would change the basis point for the first
addition operation, but it wouldn't change the *number* of cumulative
sums produced.

By contrast, using the prepend() approach with accumulate() not only
changes the starting value, it also changes the number of cumulative
sums produced.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]