On Thu, Oct 19, 2017 at 3:42 AM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 19 October 2017 at 08:34, Greg Ewing <greg.ew...@canterbury.ac.nz>
> wrote:
>
>> Nick Coghlan wrote:
>>
>>> since breaking up the current single level loops as nested loops would
>>> be a pre-requisite for allowing these APIs to check for signals while
>>> they're running while keeping the per-iteration overhead low
>>
>> Is there really much overhead? Isn't it just checking a flag?
>
> It's checking an atomically updated flag, so it forces CPU cache
> synchronisation, which means you don't want to be doing it on every
> iteration of a low level loop.

Even just that it's a C function call makes me not want to recommend doing
it in a lot of tight loops. Who knows what the function does anyway, let
alone what it might or might not do in the future.

> However, reviewing Serhiy's PR reminded me that PyErr_CheckSignals()
> already encapsulates the "Should this thread even be checking for signals
> in the first place?" logic, which means the code change to make the
> itertools iterators inherently interruptible with Ctrl-C is much smaller
> than I thought it would be.

And if it didn't encapsulate that, you would probably have written a
wrapper that does. Good thing it's the wrapper that's exposed in the API.

> That approach is also clearly safe from an exception handling point of
> view, since all consumer loops already need to cope with the fact that
> itr.__next__() may raise arbitrary exceptions (including KeyboardInterrupt).
>
> So that change alone already offers a notable improvement, and combining it
> with a __length_hint__() implementation that keeps container constructors
> from even starting to iterate would go even further towards making the
> infinite iterators more user friendly.
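To make the __length_hint__() idea concrete, here is a minimal Python sketch. It assumes, which is not spelled out in the quoted text, that an infinite iterator would raise an exception other than TypeError from __length_hint__(), since TypeError is treated as "no hint available". InfiniteCounter and try_list are hypothetical names used only for illustration; the exception type is likewise just a placeholder.

```python
class InfiniteCounter:
    """Hypothetical stand-in for an infinite iterator like itertools.count()."""

    def __init__(self, start=0):
        self._value = start

    def __iter__(self):
        return self

    def __next__(self):
        value = self._value
        self._value += 1
        return value

    def __length_hint__(self):
        # Assumption: refuse to give a hint by raising a non-TypeError
        # exception, so that list() gives up before calling __next__().
        raise OverflowError("cannot materialise an infinite iterator")


def try_list(iterable):
    """Return the type of exception list() raises, or None if it succeeds."""
    try:
        list(iterable)
    except Exception as exc:
        return type(exc)
    return None
```

With this sketch, try_list(InfiniteCounter()) should come back immediately with the exception type instead of eating memory, and the counter should still be at its starting value afterwards.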
>
> Similar signal checking changes to the consumer loops would also be
> possible, but I don't think that's an either/or decision: changing the
> iterators means they'll be interruptible for any consumer, while changing
> the consumers would make them interruptible for any iterator, and having
> checks in both the producer & the consumer merely means that you'll be
> checking for signals twice every 65k iterations, rather than once.

Indeed it's not strictly an either/or decision, but more about where we
might spend time executing C code. But I'm leaning a bit towards doing it
on the consumer side, because there it's more obvious that the code might
take some time to run. If the consumer ends up iterating over pure-Python
objects, there are no concerns about the overhead. But if it *does* call a
C-implemented __next__, then that's the case where we actually need the
whole thing. Adding the check in both places would double the (small)
overhead. And nested (wrapped) iterators are also a thing.

––Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +
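For illustration, the consumer-side variant can be modelled in Python roughly as below. This is only a sketch of the loop structure: a real implementation would live in C and call PyErr_CheckSignals(), whereas here `check` is a hypothetical callback standing in for it, and `consume`, `interrupt` and CHECK_MASK are made-up names. The interval matches the "65k iterations" figure quoted above.

```python
import itertools

# Check once every 65536 items, so the per-item cost is one increment
# and one mask test rather than a function call.
CHECK_MASK = 0xFFFF


def consume(iterable, check):
    """Drain `iterable`, calling `check()` once per 65536 items.

    `check` is a hypothetical stand-in for PyErr_CheckSignals(); any
    exception it raises propagates out of the loop, just as an exception
    from __next__() would.
    """
    count = 0
    for _ in iterable:
        count += 1
        if (count & CHECK_MASK) == 0:
            check()
    return count


def interrupt():
    """Simulate a pending Ctrl-C being noticed by the check."""
    raise KeyboardInterrupt
```

Calling consume(itertools.count(), interrupt) stops after the first 65536 items even though the iterator itself never checks anything, which is the case where the consumer-side check matters.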
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/