I'm a little short on time, but I think it would be nice to iterate on this
design a few more times until it is general and robust and clear enough to
be added to the asyncio module as a standard helper abstraction for such
cases.

I'm still curious -- why are you rejecting using a Queue in the
implementation? Your stated reason "if we wanted a Queue we'd have to add a
limit" doesn't make sense to me.


On Tue, Apr 22, 2014 at 2:34 AM, chrysn <[email protected]> wrote:

> hello guido,
>
> thank you for your feedback (and for making asyncio real in general).
>
> On Sun, Apr 20, 2014 at 08:11:45PM -0700, Guido van Rossum wrote:
> > Here the pattern is somewhat simpler: there's no yield in the for-loop
> > header; also, the implementation doesn't need to hold back a value to
> > raise StopIteration because the number of results is known ahead of
> > time (it's len(fs) :-).
>
> not knowing the length of the results (which is, i think, the typical
> case for iterators as opposed to lists) is the key reason why i can't
> use as_completed here.
>
>
> > it = listall()
> > while (yield from it.more()):
> >     value = it.value
> >     # Use value
>
> this has the most appeal to me; it loses on the semantic side
> (logically, it's a for-loop after all), but wins on practicality and not
> having to hold back a value.
>
>
> > - The notation listall().lines feels a little odd -- perhaps you can
> > have a separate method for this, e.g. listall_async()?
>
> be aware that .listall() is not really blocking (that'd be footnote
> [1]), but is just a future that gets done when all the items are
> available, so callers that gain no benefit from waiting for single
> events can just do this:
>
>     >>> for value in (yield from listall()):
>     ...     # use value
>
> anyway, with the .more()-style approach, the .lines is not needed any
> more.
>
> (neither are some other quirks; the details of what used to be important
> are in footnote [2]).
>
>
> > Let me try to write this up as an example using an asyncio StreamReader
> > and its readline() method (just to have a specific source of data that
> > doesn't know in advance how many items it will produce):
>
> i've combined your Its with my MultilineFuture, doing away with the
> Queue stuff (if we want to use a Queue, we should go the full way and
> limit the queue length; then, the senders would have to `yield from`
> when feeding data, and could (in fact, would have to) implement flow
> control; that'd only affect the feeding interface and not the consumer
> interface,
> though). the whole thing pretty much behaves like an iterator.
>
> it is safer in the respect that not obeying the protocol (always one
> .more(), then one .value) raises exceptions, and it has more
> stream-like terminology:
>
>     import asyncio
>     import collections
>     from asyncio import coroutine
>
>     class IteratorFuture:
>         def __init__(self):
>             # must never be empty, but needs the possibility to grow
>             # unless we require tasks that push to it to be able to
>             # wait, in which case we're back at an asyncio.Queue. the
>             # rightmost element must always be unfulfilled or raising
>             # EndOfIterator, and may be preceded by any number of
>             # fulfilled futures.
>             self._pending = collections.deque([asyncio.Future()])
>
>         class EndOfIterator(Exception):
>             """Like StopIteration, but indicating the end of an
>             IteratorFuture"""
>             # we can't use StopIteration for the same reason we can't
>             # just use `yield` -- it's already used in asyncio
>
>         # consumer interface
>
>         @coroutine
>         def can_peek(self):
>             try:
>                 peeked = yield from self._pending[0]
>                 return True
>             except self.EndOfIterator:
>                 return False
>
>         def consume(self):
>             result = self._pending[0].result()
>             # we might not even get in here if an EndOfIterator flies
>             # out of this. that's not to be expected when properly used
>             # (the consumer should have done can_peek before), but it
>             # won't harm the internal structure either way.
>             self._pending.popleft()
>             if not self._pending:
>                 self._pending.append(asyncio.Future())
>             return result
>
>         # feeding interface
>
>         def set_item(self, data):
>             self._pending[-1].set_result(data)
>             self._pending.append(asyncio.Future())
>
>         def set_completed(self):
>             self._pending[-1].set_exception(self.EndOfIterator())
>
>         # implementing the Future interface -- note that it's neither a
>         # Future by inheritance, nor does it offer the complete Future
>         # interface; but it can be yielded from.
>
>         def __iter__(self):
>             result = []
>             while (yield from self.can_peek()):
>                 result.append(self.consume())
>             return result
>
>         # compatibility with the `Its` class
>         more = can_peek
>         value = property(consume)
>
> an object like `it = client.listall()` can be used like this:
>
>     while (yield from it.can_peek()):
>         value = it.consume()
>         eat(value)
>
> and like this:
>
>     for value in (yield from it):
>         eat(value)
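>
> for completeness, the feeding side might be driven roughly like this
> (the _fill task and the `self._reader` data source are made up here,
> just to show the shape):
>
>     def listall(self):
>         it = IteratorFuture()
>         asyncio.async(self._fill(it))  # schedule the feeding task
>         return it
>
>     @coroutine
>     def _fill(self, it):
>         while True:
>             line = yield from self._reader.readline()
>             if not line:
>                 break
>             it.set_item(line)
>         it.set_completed()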
>
> > Of course, it would be nice if we had a variant of the for-loop notation
> > instead, so that we could just write this:
> >
> > for value yield in listall():
> >     # Use value
>
> if we can work out a pattern that is as general as the iterator
> interface, it might be worth considering having this as syntactic sugar
> for the equivalent `while .can_peek(): value = .consume()` loop, just
> like `for` is syntactic sugar for `i = iter(obj); while True: try:
> value = next(i); except StopIteration: break`.
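>
> spelled out, the hypothetical `for value yield in listall():` from
> above would then be nothing more than sugar for:
>
>     it = listall()
>     while (yield from it.can_peek()):
>         value = it.consume()
>         # loop body ("# Use value") goes here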
>
> but independent of any future syntax suggestions, there needs to be an
> agreed-upon, syntax-independent interface first.
>
>
> best regards
> chrysn
>
> -----
>
> [1] footnote:
>
> > I'm not entirely sure what to propose instead. One idea is something I
> > did in Google App Engine NDB -- for every API there is a synchronous
> > method and a corresponding async method; the names are always foo() and
> > foo_async(), where foo() returns a synchronous result while foo_async()
> > returns a Future.
>
> that's a different beast, but we have this in mpd too; there, i've
> implemented a completely async AsyncMPDClient class, and a subclass
> that wraps every future-returning method in a run_until_complete().
>
> thus, legacy and all-blocking applications can use the old way, and
> asyncio users don't have to suffix every function.
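>
> (the wrapping subclass looks roughly like this; the names and the
> event loop handling below are only illustrative, not the real code:)
>
>     class MPDClient(AsyncMPDClient):
>         # wrap each future-returning method of the async class in
>         # run_until_complete() so legacy callers keep blocking calls
>         def _blocking(async_method):
>             def wrapper(self, *args, **kwargs):
>                 loop = asyncio.get_event_loop()
>                 return loop.run_until_complete(
>                     async_method(self, *args, **kwargs))
>             return wrapper
>
>         status = _blocking(AsyncMPDClient.status)  # made-up command
>         # ... and so on for every wrapped command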
>
> -----
>
> [2] footnote:
>
> > - Why do you still need a "yield from" in the for-loop header if you have
> > one in the body already?
>
> this is needed in case the iterator does not yield a single result at
> all; in that case, the loop body must not be entered even once.
>
> > - Having to hold back one value just so you can raise StopIteration makes
> > me sad.
>
> i'm not fond of it at all either; it just seems to be the only way to
> use a for loop (which covers the semantics of what i'm doing) together
> with the yield-from mechanisms.
>
>
> --
> I shouldn't have written all those tank programs.
>   -- Kevin Flynn
>



-- 
--Guido van Rossum (python.org/~guido)
