I'm a little short on time, but I think it would be nice to iterate on this design a few more times until it is general and robust and clear enough to be added to the asyncio module as a standard helper abstraction for such cases.
I'm still curious -- why are you rejecting using a Queue in the
implementation? Your stated reason "if we wanted a Queue we'd have to add
a limit" doesn't make sense to me.

On Tue, Apr 22, 2014 at 2:34 AM, chrysn <[email protected]> wrote:
> hello guido,
>
> thank you for your feedback (and for making asyncio real in general).
>
> On Sun, Apr 20, 2014 at 08:11:45PM -0700, Guido van Rossum wrote:
> > Here the pattern is somewhat simpler: there's no yield in the for-loop
> > header; also, the implementation doesn't need to hold back a value to
> > raise StopIteration because the number of results is known ahead of
> > time (it's len(fs) :-).
>
> not knowing the length of the results (as i think is typical for
> iterators, as opposed to lists) is a key part of why i can't use
> as_completed here.
>
> > >     it = listall()
> > >     while (yield from it.more()):
> > >         value = it.value
> > >         # Use value
>
> this has the most appeal to me; it loses on the semantic side
> (logically, it's a for-loop after all), but wins on practicality and on
> not having to hold back a value.
>
> > - The notation listall().lines feels a little odd -- perhaps you can
> > have a separate method for this, e.g. listall_async()?
>
> be aware that .listall() is not really blocking (that'd be footnote
> [1]), but is just a future that gets done when all the items are
> available, so callers that gain no benefit from waiting for single
> events can just do this:
>
>     >>> for value in (yield from listall()):
>     ...     # use value
>
> anyway, with the .more()-style approach, the .lines is not needed any
> more.
>
> (neither are some other quirks; the details of what used to be
> important are in footnote [2].)
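[Editorial illustration, not part of the thread:] the bounded-Queue alternative being debated above can be made concrete. The sketch below uses modern async/await syntax rather than the 2014 yield-from style; the sentinel `_DONE` and the `produce`/`consume` names are illustrative. A bounded `asyncio.Queue` gives the consumer the same loop shape, and the `maxsize` limit is exactly what provides flow control for the feeder.

```python
import asyncio

# illustrative sentinel marking the end of the stream
_DONE = object()

async def produce(queue):
    for item in ("a", "b", "c"):
        # with a bounded queue, put() waits once the queue is full,
        # giving the producer flow control for free
        await queue.put(item)
    await queue.put(_DONE)

async def consume(queue):
    results = []
    while True:
        item = await queue.get()
        if item is _DONE:
            break
        results.append(item)
    return results

async def main():
    queue = asyncio.Queue(maxsize=2)  # the "limit" from the quoted discussion
    _, results = await asyncio.gather(produce(queue), consume(queue))
    return results

results = asyncio.run(main())
print(results)
```

The trade-off the thread is circling is visible here: the sentinel convention replaces the explicit `can_peek()`/`EndOfIterator` protocol, and the bound forces feeders to be coroutines.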
> > Let me try to write this up as an example using an asyncio
> > StreamReader and its readline() method (just to have a specific
> > source of data that doesn't know in advance how many items it will
> > produce):
>
> i've combined your Its with my MultilineFuture, doing away with the
> Queue stuff (if we wanted to use a Queue, we should go the full way and
> limit the queue length; then, the senders would have to `yield from`
> when feeding data, and would have to implement flow control; that would
> only affect the feeding interface, though, not the consumer interface).
> the whole thing pretty much behaves like an iterator.
>
> it is safer in the respect that not obeying the protocol (always one
> .more(), then one .value) raises exceptions, and it has a more
> stream-like terminology:
>
>     class IteratorFuture:
>         def __init__(self):
>             # must never be empty, but needs the possibility to grow
>             # unless we require tasks that push to it to be able to
>             # wait, in which case we're back at an asyncio.Queue. the
>             # rightmost element must always be unfulfilled or raising
>             # EndOfIterator, and may be preceded by any number of
>             # fulfilled futures.
>             self._pending = collections.deque([asyncio.Future()])
>
>         class EndOfIterator(Exception):
>             """Like StopIteration, but indicating the end of an
>             IteratorFuture"""
>             # we can't use StopIteration for the same reason we can't
>             # just use `yield` -- it's already used in asyncio
>
>         # consumer interface
>
>         @coroutine
>         def can_peek(self):
>             try:
>                 yield from self._pending[0]
>                 return True
>             except self.EndOfIterator:
>                 return False
>
>         def consume(self):
>             result = self._pending[0].result()
>             # we might not even get in here if an EndOfIterator flies
>             # out of this. that's not to be expected when properly used
>             # (the consumer should have done can_peek before), but it
>             # won't harm the internal structure either way.
>             self._pending.popleft()
>             if not self._pending:
>                 self._pending.append(asyncio.Future())
>             return result
>
>         # feeding interface
>
>         def set_item(self, data):
>             self._pending[-1].set_result(data)
>             self._pending.append(asyncio.Future())
>
>         def set_completed(self):
>             self._pending[-1].set_exception(self.EndOfIterator())
>
>         # implementing the Future interface -- note that it's neither
>         # a Future by inheritance, nor does it offer the complete
>         # Future interface; but it can be yielded from.
>
>         def __iter__(self):
>             result = []
>             while (yield from self.can_peek()):
>                 result.append(self.consume())
>             return result
>
>         # compatibility to the `Its` class
>         more = can_peek
>         value = property(consume)
>
> an object like `it = client.listall()` can be used like this:
>
>     while (yield from it.can_peek()):
>         value = it.consume()
>         eat(value)
>
> and like this:
>
>     for value in (yield from it):
>         eat(value)
>
> > Of course, it would be nice if we had a variant of the for-loop
> > notation instead, so that we could just write this:
> >
> >     for value yield in listall():
> >         # Use value
>
> if we can work out a pattern that is as general as the iterator
> interface, it might be worth considering having this as syntactic sugar
> for the equivalent `while .can_peek(): value = .consume()`, just like
> `for` is syntactic sugar for `i = iter(...)` followed by `while True:
> try: value = i.__next__() / except StopIteration: break`.
>
> but independent of future syntax suggestions, there needs to be an
> agreeable syntax-independent interface first.
>
>
> best regards
> chrysn
>
> -----
>
> [1] footnote:
>
> > I'm not entirely sure what to propose instead. One idea is something
> > I did in Google App Engine NDB -- for every API there is a
> > synchronous method and a corresponding async method; the names are
> > always foo() and foo_async(), where foo() returns a synchronous
> > result while foo_async() returns a Future.
> that's a different beast, but we're having this in mpd too; there,
> i've implemented a completely async AsyncMPDClient class, and a
> subclass that wraps every future-returning method in
> run_until_complete.
>
> thus, legacy and all-blocking applications can use the old way, and
> asyncio users don't have to suffix every function.
>
> -----
>
> [2] footnote:
>
> > - Why do you still need a "yield from" in the for-loop header if you
> > have one in the body already?
>
> this is needed in case the source does not yield a single result at
> all; in that case, the loop body must not be entered even once.
>
> > - Having to hold back one value just so you can raise StopIteration
> > makes me sad.
>
> i'm not fond of it at all either; it just seems to be the only way to
> use a for loop (which covers the semantics of what i'm doing) together
> with yield-from mechanisms.
>
> --
> I shouldn't have written all those tank programs.
>         -- Kevin Flynn

-- 
--Guido van Rossum (python.org/~guido)
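[Editorial illustration, not part of the thread:] the quoted IteratorFuture protocol can be exercised end to end. The sketch below ports it to modern async/await syntax (`@asyncio.coroutine` and bare yield-from coroutines were removed in Python 3.11); the `producer` and `main` names are illustrative stand-ins for a feeder task and a consumer, not anything from the thread.

```python
import asyncio
import collections

class IteratorFuture:
    """Sketch of the quoted class, ported to async/await."""

    class EndOfIterator(Exception):
        """Signals exhaustion; StopIteration is already taken by asyncio."""

    def __init__(self):
        # the rightmost future is always unfulfilled (or carries
        # EndOfIterator); earlier futures hold already-produced values
        self._pending = collections.deque([asyncio.Future()])

    async def can_peek(self):
        try:
            await self._pending[0]
            return True
        except self.EndOfIterator:
            return False

    def consume(self):
        result = self._pending[0].result()
        self._pending.popleft()
        if not self._pending:
            self._pending.append(asyncio.Future())
        return result

    def set_item(self, data):
        self._pending[-1].set_result(data)
        self._pending.append(asyncio.Future())

    def set_completed(self):
        self._pending[-1].set_exception(self.EndOfIterator())

async def producer(it):
    # hypothetical feeder standing in for a parser pushing protocol lines
    for line in ("one", "two", "three"):
        await asyncio.sleep(0)  # pretend to wait for I/O
        it.set_item(line)
    it.set_completed()

async def main():
    it = IteratorFuture()
    feeder = asyncio.create_task(producer(it))
    collected = []
    # the can_peek()/consume() consumer loop from the thread
    while await it.can_peek():
        collected.append(it.consume())
    await feeder  # producer has finished by now; this just reaps the task
    return collected

collected = asyncio.run(main())
print(collected)
```

Note how the feeder never awaits when pushing, which is precisely the unbounded-growth property the quoted comment in `__init__` warns about.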

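[Editorial illustration, not part of the thread:] the wrapping-subclass pattern chrysn describes for mpd (an all-async client, plus a subclass that wraps each future-returning method in run_until_complete) could be sketched as follows. `AsyncClient` and `BlockingClient` are hypothetical stand-ins for AsyncMPDClient and its blocking subclass, written in modern async/await syntax.

```python
import asyncio

class AsyncClient:
    """Hypothetical all-async client (stand-in for AsyncMPDClient)."""

    async def listall(self):
        await asyncio.sleep(0)  # stands in for a network round-trip
        return ["song1.mp3", "song2.mp3"]

class BlockingClient(AsyncClient):
    """Sketch of the wrapping subclass: each coroutine method is
    driven to completion on a private event loop, so legacy callers
    get plain synchronous results without any _async name suffix."""

    def __init__(self):
        self._loop = asyncio.new_event_loop()

    def listall(self):
        return self._loop.run_until_complete(super().listall())

client = BlockingClient()
print(client.listall())
```

This is the inverse of the NDB foo()/foo_async() convention from footnote [1]: instead of suffixing the async variants, the async API is primary and the blocking spellings are generated around it.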