westonpace commented on a change in pull request #9995:
URL: https://github.com/apache/arrow/pull/9995#discussion_r612505516



##########
File path: cpp/src/arrow/util/async_generator.h
##########
@@ -1332,4 +1332,49 @@ Result<Iterator<T>> MakeReadaheadIterator(Iterator<T> it, int readahead_queue_si
   return MakeGeneratorIterator(std::move(owned_bg_generator));
 }
 
+/// \brief Make a generator that returns a single pre-generated future

Review comment:
       FWIW: There are only two spots today where we pull in an async-reentrant 
fashion (i.e. call the generator again before the previously returned future 
has finished).  The first is in `ReadaheadGenerator::operator()()`, which 
contains this loop:
   
   ```
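         // Pull max_readahead_ futures up front, without waiting for any
         // earlier future to finish before asking for the next one.
         for (int i = 0; i < max_readahead_; i++) {
           auto next = source_generator_();
           next.AddCallback(mark_finished_if_done_);
           readahead_queue_.push(std::move(next));
         }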
   ```
   
   If `source_generator_` is fully synchronous (i.e. it always returns finished 
futures), this loop does not add any parallelism (an unfortunate fact I hope to 
remedy someday).
   
   However, if `source_generator_` returns an unfinished future, then this loop 
fans out the tasks.  For example, if `source_generator_` is reading from a 
file, this will cause up to `max_readahead_` concurrent file reads.
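   
   To make the difference concrete, here is a minimal standalone sketch. It is 
not the Arrow implementation: it uses `std::future` instead of `arrow::Future`, 
has no callbacks, and every name in it (`MakeSyncSource`, `MakeAsyncSource`, 
`FillAndCountUnfinished`, `RunReadaheadDemo`) is hypothetical. It only mimics 
the loop above and counts how many queued futures are still unfinished right 
after the loop.
   
   ```cpp
   #include <cassert>
   #include <chrono>
   #include <deque>
   #include <functional>
   #include <future>
   #include <memory>

   // Stand-in for an async generator: each call returns a future.
   using IntGenerator = std::function<std::future<int>()>;

   // Hypothetical synchronous source: the work is done inside the call, so
   // the returned future is already finished.
   IntGenerator MakeSyncSource() {
     auto counter = std::make_shared<int>(0);
     return [counter]() {
       std::promise<int> promise;
       promise.set_value((*counter)++);
       return promise.get_future();
     };
   }

   // Hypothetical asynchronous source: each call starts a task on its own
   // thread. The task blocks on `gate`, standing in for a slow file read.
   IntGenerator MakeAsyncSource(std::shared_future<void> gate) {
     auto counter = std::make_shared<int>(0);
     return [counter, gate]() {
       int value = (*counter)++;
       return std::async(std::launch::async, [value, gate]() {
         gate.wait();
         return value;
       });
     };
   }

   // Mimics the readahead loop: pull `max_readahead` futures without waiting
   // on any of them, then report how many are still unfinished.
   int FillAndCountUnfinished(const IntGenerator& source, int max_readahead,
                              std::deque<std::future<int>>* queue) {
     for (int i = 0; i < max_readahead; i++) {
       queue->push_back(source());
     }
     int unfinished = 0;
     for (auto& fut : *queue) {
       if (fut.wait_for(std::chrono::seconds(0)) != std::future_status::ready) {
         unfinished++;
       }
     }
     return unfinished;
   }

   struct ReadaheadCounts {
     int sync_unfinished;
     int async_unfinished;
   };

   ReadaheadCounts RunReadaheadDemo(int max_readahead) {
     ReadaheadCounts counts{};
     {
       std::deque<std::future<int>> queue;
       counts.sync_unfinished =
           FillAndCountUnfinished(MakeSyncSource(), max_readahead, &queue);
     }
     {
       std::promise<void> gate;
       std::deque<std::future<int>> queue;
       counts.async_unfinished = FillAndCountUnfinished(
           MakeAsyncSource(gate.get_future().share()), max_readahead, &queue);
       gate.set_value();  // let the outstanding tasks complete
       int expected = 0;
       for (auto& fut : queue) {
         int value = fut.get();
         assert(value == expected);
         (void)value;
         expected++;
       }
     }
     return counts;
   }
   ```
   
   Under these assumptions, `RunReadaheadDemo(4)` leaves zero unfinished futures 
for the synchronous source (nothing ran in parallel) and four for the 
asynchronous one (four tasks outstanding at once).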
   
   The second spot is in `MergedGenerator` and I will probably remove it at 
some point as it isn't important and causes more headaches than it should.



