[issue31815] Make itertools iterators interruptible

2018-06-23 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

I don't think a new public API should be introduced.  This is at best an 
implementation detail.  

Also, I really don't want to garbage-up the inner-loop code for the itertools.  
I've spent a good deal of time micro-optimizing this code and don't want to 
throw it away for something that is of nearly zero value and imo not a real 
issue that affects real users.

Marking this as closed for now.  We can discuss it more at the sprints (Python 
3.8 is still a long way away).

--
resolution:  -> rejected
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue31815] Make itertools iterators interruptible

2018-06-23 Thread Nick Coghlan


Nick Coghlan  added the comment:

The purpose would be two-fold:

1. The presence of the `check_signals()` wrapper provides a way to more 
explicitly document that the other itertools iterators *don't* implicitly check 
for signals, so if you want to combine them with consumers that also don't 
check for signals, then you're going to need to wrap the iterator.

2. As a helper for integration code that's dealing with consumers that don't 
check for signals, but want to make those loops interruptible. Doing that in 
Python (as in my example) is inefficient, since you end up running Python 
bytecode on every iteration, and also don't have as much control over exactly 
when the signals get checked.

Given a solution to issue 33939, I'd drop the priority on this issue to low, 
but I don't think it would make it redundant.

--




[issue31815] Make itertools iterators interruptible

2018-06-23 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

> What if itertools were to offer an opt-in ...

This doesn't make sense to me.  As far as I can tell, the only time this issue 
has ever arisen in the past 15 or 16 years is when someone was trying to create 
an unbreakable infinite loop on-purpose.  In a way, it is no more interesting 
than intentionally triggering a segfault with ctypes or bytecode hacks.  
Likewise, it isn't even unique to itertools -- it shows up in any potentially 
long running C-code such as numpy/scipy calls.

I would like to close this issue and instead go down the path of issue 33939 
which would allow consumers to detect when an input expects to be infinite.  
The consumers can then decide whether they want to make periodic Ctrl-C checks.

--
versions: +Python 3.8 -Python 3.7




[issue31815] Make itertools iterators interruptible

2018-06-23 Thread Nick Coghlan


Nick Coghlan  added the comment:

As a potential stepping stone towards possible future changes in the default 
behaviour here, what if itertools were to offer an opt-in "check_signals(itr, 
*, iterations=100_000)" helper function that was essentially a more efficient 
version of::

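    from itertools import islice

    def check_signals(itr, *, iterations=100_000):
        itr = iter(itr)  # so each islice() resumes where the last one stopped
        while True:
            count = 0  # stays 0 if the slice turns out to be empty
            next_slice = islice(itr, iterations)
            for count, item in enumerate(next_slice, 1):
                yield item
            if count < iterations:
                return  # PEP 479: "raise StopIteration" would become RuntimeError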

This would:

1. Provide a straightforward way for folks to explicitly opt-in to periodic 
signal checks
2. Provide a way to check for potential compatibility issues with other 
libraries and components to better assess the risks of switching the default 
behaviour

--




[issue31815] Make itertools iterators interruptible

2018-06-22 Thread Nick Coghlan


Nick Coghlan  added the comment:

Note: I've filed the "raise TypeError in __length_hint__" suggestion for 
infinite iterators separately in https://bugs.python.org/issue33939

--




[issue31815] Make itertools iterators interruptible

2017-10-22 Thread Koos Zevenhoven

Koos Zevenhoven  added the comment:

For the interactive user who uses an interactive environment such as the REPL 
or a Jupyter notebook, the situation is a little different from "CPython as 
programming language runtime".

The docs say a KeyboardInterrupt is "Raised when the user hits the interrupt 
key (normally Control-C or Delete). During execution, a check for interrupts is 
made regularly.". I suppose there's some ambiguity in what "regularly" means 
there ;). 

But regardless of whether anyone bothers to read that part of the docs, Ctrl-C 
or an interrupt button not working can feel like a correctness issue for 
someone that's using an interactive Python environment *as an application* in 
daily work. Python gives you the impression that you can always interrupt 
anything if it turns out to take too much time. And I remember that being one 
of the points that made me move away from MATLAB, which at that time had 
problems with interrupting computations.

--




[issue31815] Make itertools iterators interruptible

2017-10-19 Thread Tim Peters

Tim Peters  added the comment:

Segfaults are different:  they usually expose an error in CPython's 
implementation.  We don't prioritize them because the user may have to restart 
their program (who cares? <0.5 wink>), but because they demonstrate the 
language implementation is accessing memory wildly.  That in turn can result in 
anything, from arbitrarily wrong program results, through file corruption, to 
massive security holes.  It's far more a "correctness" than a "usability" 
concern.

If a user provokes a segfault by (ab)using low-level facilities (say, ctypes), 
we don't care - that's on them.  But most segfaults have pointed to legitimate 
corner-case errors in CPython itself.

There's no correctness issue in whether iterators are always interruptible - it 
doesn't merit the same concern.

--
nosy: +tim.peters




[issue31815] Make itertools iterators interruptible

2017-10-19 Thread Nick Coghlan

Nick Coghlan  added the comment:

I'd personally be happy enough if the infinite iterators implemented 
__length_hint__() as always raising TypeError so the machine-breaking cases of 
incremental consumption of ever-increasing amounts of memory were blocked - I 
was suggesting on python-ideas that enabling pervasive signal checking would be 
too intrusive for anyone to be willing to implement it.
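
A minimal sketch of that idea, under stated assumptions: `Forever` and `safe_list` are hypothetical names used only for illustration, not itertools or builtin APIs. The consumer calls `__length_hint__` directly because `operator.length_hint()` treats a TypeError from the method as "no hint" and returns its default (per PEP 424), so the refusal has to be implemented by the consumer itself.

```python
class Forever:
    """Hypothetical infinite iterator (illustration only, not an itertools type)."""

    def __iter__(self):
        return self

    def __next__(self):
        return 0

    def __length_hint__(self):
        # The suggestion: report "no finite length" by raising TypeError.
        raise TypeError("infinite iterator has no length")


def safe_list(iterable):
    """Hypothetical consumer that refuses inputs whose length hint raises TypeError."""
    hint = getattr(type(iterable), "__length_hint__", None)
    if hint is not None:
        try:
            hint(iterable)
        except TypeError:
            raise ValueError(
                "input reports no finite length; refusing to consume it"
            ) from None
    # No hint method, or a usable hint: consume as usual.
    return list(iterable)
```

With this, `safe_list(iter([1, 2, 3]))` behaves like `list()`, while `safe_list(Forever())` fails immediately instead of consuming memory until the machine stalls.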

However, Serhiy's patch showed me that it isn't particularly intrusive at all, 
and the risk of surprising consumers is low, since __next__() methods can 
already raise arbitrary exceptions.

--




[issue31815] Make itertools iterators interruptible

2017-10-18 Thread Raymond Hettinger

Raymond Hettinger  added the comment:

I respectfully disagree that this just happens to people accidentally -- Every 
single day, I work with either Python professionals or Python students and 
never see this situation occur, nor have I had a single report of it from one 
of my clients, ever.  In my experience, someone has to be trying to produce 
exactly this effect.  

They have to go out of their way to import a high-performance module, select 
one of the tools specifically documented to be infinite, specifically reach for 
one of the very few tools like repeat() or count() that don't make any pure python 
callbacks, and then separately reach for a high-performance consumer that makes 
no pure python callbacks.  People don't just write ``sum(itertools.count())`` to 
do something useful, they do it just to see if they can produce exactly this 
effect.

We have a number of areas where we're comfortable saying "just don't do that" 
(e.g. the repr of a large number or of a large container, repeated 
exponentiation, bytecode hacks, ill-formed ctypes, etc).

I would like to draw a line in the sand for itertools to not go down this path 
unless we actually see this happening in the wild to people not trying to do it 
on purpose.  It is much more likely that a user accidentally types ">>> 
'x' * 10" and gets the same effect.

On a side note, I have a fear (possibly rational, possibly not) that 
introducing signal handling into formerly atomic operations will open up new 
classes of bugs and usability problems (e.g. Issue #14976 showed that when GC 
gained the ability to trigger calls to __del__, it created queue reentrancy 
deadlock problems that could not be solved with pure python code).

One last thought -- the various core devs seem to be charging in opposite 
directions.  On the one hand, there seems to be no limit to the coding 
atrocities being considered to save under a millisecond of startup time and for 
various other questionable micro-optimizations.  And on the other hand, there 
seems to be a great deal of willingness to inject almost-never-needed error 
checks or signal handling into otherwise tight, high-volume code paths.   One 
group likes to refactor code to make it clean and easy to maintain and stick 
with its business purpose, while another group is comfortable garbaging-up code 
in order to achieve some other benefit that may not be in line with the module 
designer's intent.

--




[issue31815] Make itertools iterators interruptible

2017-10-18 Thread Nick Coghlan

Nick Coghlan  added the comment:

To put this another way: I see an uninterruptible infinite loop as a data loss 
bug on par with a segfault, since there's no graceful way to terminate the 
process and allow cleanup code to run.

For segfaults, we're willing to tolerate them, but we expect the reproducers to 
involve arcane coding contortions, not simple expressions like 
"sum(itertools.count())".

Now, the producer side check that Serhiy posted here only addresses part of the 
problem - there's also the question of making the consumption loops more robust 
by having them check for signals, and adding a ThreadExit equivalent to allow 
the interpreter to request shutdown of non-daemon threads other than the main 
thread.
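
As a rough illustration of the consumer side, here is a pure-Python sketch (not the C-level change under discussion). `chunked_sum` and its `chunk_size` parameter are hypothetical names. It relies on the documented behaviour that Python-level signal handlers run between bytecode instructions in the main thread, so splitting the work into bounded bursts gives a pending Ctrl-C a chance to be delivered.

```python
from itertools import islice


def chunked_sum(iterable, chunk_size=100_000):
    """Sum an iterable in bounded bursts (hypothetical helper)."""
    it = iter(iterable)
    total = 0
    while True:
        # For C-implemented inputs such as itertools.count(), list() and
        # sum() run without executing Python bytecode, but here only for
        # at most chunk_size items at a time.
        block = list(islice(it, chunk_size))
        if not block:
            return total
        # Control is back in the bytecode loop between bursts, which is
        # where a pending KeyboardInterrupt can be raised.
        total += sum(block)
```

The trade-off is the one raised elsewhere in this thread: the loop runs Python bytecode once per chunk and holds up to `chunk_size` items in memory, which a C implementation could avoid.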

But as long as we think it's a-OK for us to hang a user's session, causing them 
to lose all their unsaved/uncached data, then we're going to resist the extra 
code complexity required to account for these usability concerns. (And I 
realise they're not new concerns - they're just longstanding problems that 
folks have gotten used to tolerating and excusing)

--




[issue31815] Make itertools iterators interruptible

2017-10-18 Thread Nick Coghlan

Nick Coghlan  added the comment:

Defensive coding and the complications it brings is a fact of life when 
providing a widely used platform.

Sure, we're free to say "We don't care about minor user experience irritations 
like Ctrl-C not always being reliable, users should just suck it up and cope".

I think "It's your own fault for typing that, just restart your session from 
scratch" is setting the bar too low for ourselves.

--




[issue31815] Make itertools iterators interruptible

2017-10-18 Thread Koos Zevenhoven

Koos Zevenhoven  added the comment:

To repeat one of my points in the linked threads, I'm not convinced that 
infinite iterators are the most common case for the problem of long 
uninterruptible loops. A general mechanism that can be easily used in many 
places with minimal maintenance burden would be nice. It could be used even in 
third-party extension modules.

--
nosy: +koos.zevenhoven




[issue31815] Make itertools iterators interruptible

2017-10-18 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:

I concur with Raymond. I cited the same arguments in the discussion on 
Python-ideas. But the other solution that was suggested in this discussion will 
add more complexity and can't solve all cases.

--




[issue31815] Make itertools iterators interruptible

2017-10-18 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:

With optimized repeat():

$ ./python -m perf timeit --compare-to=../cpython-release/python -s 'from 
itertools import repeat' 'list(repeat(None, 100))'
/home/serhiy/py/cpython-release/python: . 3.77 ms +- 0.06 ms
/home/serhiy/py/cpython-iter/python: . 3.77 ms +- 0.05 ms

Mean +- std dev: [/home/serhiy/py/cpython-release/python] 3.77 ms +- 0.06 ms -> 
[/home/serhiy/py/cpython-iter/python] 3.77 ms +- 0.05 ms: 1.00x faster (-0%)
Not significant!

--




[issue31815] Make itertools iterators interruptible

2017-10-18 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:

Microbenchmark results:

$ ./python -m perf timeit --compare-to=../cpython-release/python -s 'from 
itertools import repeat' 'list(repeat(None, 100))'
/home/serhiy/py/cpython-release/python: . 3.79 ms +- 0.09 ms
/home/serhiy/py/cpython-iter/python: . 4.14 ms +- 0.07 ms

Mean +- std dev: [/home/serhiy/py/cpython-release/python] 3.79 ms +- 0.09 ms -> 
[/home/serhiy/py/cpython-iter/python] 4.14 ms +- 0.07 ms: 1.09x slower (+9%)


$ ./python -m perf timeit --compare-to=../cpython-release/python -s 'from 
itertools import cycle, islice' 'list(islice(cycle(range(1000)), 100))'
/home/serhiy/py/cpython-release/python: . 6.88 ms +- 0.30 ms
/home/serhiy/py/cpython-iter/python: . 6.87 ms +- 0.26 ms

Mean +- std dev: [/home/serhiy/py/cpython-release/python] 6.88 ms +- 0.30 ms -> 
[/home/serhiy/py/cpython-iter/python] 6.87 ms +- 0.26 ms: 1.00x faster (-0%)
Not significant!


$ ./python -m perf timeit --compare-to=../cpython-release/python -s 'from 
itertools import count, islice' 'list(islice(count(), 100))'
/home/serhiy/py/cpython-release/python: . 26.1 ms +- 0.6 ms
/home/serhiy/py/cpython-iter/python: . 26.3 ms +- 0.6 ms

Mean +- std dev: [/home/serhiy/py/cpython-release/python] 26.1 ms +- 0.6 ms -> 
[/home/serhiy/py/cpython-iter/python] 26.3 ms +- 0.6 ms: 1.01x slower (+1%)
Not significant!


$ ./python -m perf timeit --compare-to=../cpython-release/python -s 'from 
itertools import product' 'list(product(range(100), repeat=3))'
/home/serhiy/py/cpython-release/python: . 80.2 ms +- 3.2 ms
/home/serhiy/py/cpython-iter/python: . 80.2 ms +- 1.7 ms

Mean +- std dev: [/home/serhiy/py/cpython-release/python] 80.2 ms +- 3.2 ms -> 
[/home/serhiy/py/cpython-iter/python] 80.2 ms +- 1.7 ms: 1.00x faster (-0%)
Not significant!


$ ./python -m perf timeit --compare-to=../cpython-release/python -s 'from 
itertools import combinations' 'list(combinations(range(23), 10))'
/home/serhiy/py/cpython-release/python: . 177 ms +- 14 ms
/home/serhiy/py/cpython-iter/python: . 169 ms +- 4 ms

Mean +- std dev: [/home/serhiy/py/cpython-release/python] 177 ms +- 14 ms -> 
[/home/serhiy/py/cpython-iter/python] 169 ms +- 4 ms: 1.05x faster (-4%)


The only significant slowdown is for repeat(). But it should be possible to 
optimize this one by reusing an existing counter.
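
For readers without the perf module installed, a rough stand-in using only the standard library's timeit is sketched below. It is an assumption-laden approximation, not the method used for the numbers above: timeit has no --compare-to option, so the script would have to be run once under each interpreter build and the results compared by hand. The statement and setup strings are copied from the first benchmark; the run counts are arbitrary.

```python
import timeit

# Statement and setup copied from the first "perf timeit" run above.
timer = timeit.Timer(
    stmt="list(repeat(None, 100))",
    setup="from itertools import repeat",
)

# repeat() returns one total time (in seconds) per run of `number` calls;
# the minimum is usually the least noisy figure.
number = 10_000
runs = timer.repeat(repeat=5, number=number)
best_per_call = min(runs) / number
print(f"best of 5: {best_per_call * 1e6:.3f} us per call")
```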

--




[issue31815] Make itertools iterators interruptible

2017-10-18 Thread Raymond Hettinger

Raymond Hettinger  added the comment:

When I have time, I would like to re-launch a python-dev discussion on this.  
It is my feeling that this solves an invented problem.  In my experience, it 
only ever happens to people who are intentionally trying to create this effect.

Adding this kind of "junk" throughout the code base adds complexity and more 
internal operations, but won't help *any* existing, deployed code.  We're 
making everyone pay for a problem that almost no one has.

Also, if we do care about interruptibility, it is unclear whether the 
responsibility should lie with the consumer or the producer of the iterator.

--




[issue31815] Make itertools iterators interruptible

2017-10-18 Thread Raymond Hettinger

Change by Raymond Hettinger :


--
assignee:  -> rhettinger




[issue31815] Make itertools iterators interruptible

2017-10-18 Thread Serhiy Storchaka

Change by Serhiy Storchaka :


--
title: Make itertools iterators interrable -> Make itertools iterators 
interruptible
