westonpace commented on a change in pull request #10008: URL: https://github.com/apache/arrow/pull/10008#discussion_r612500764
##########
File path: cpp/src/arrow/dataset/dataset.cc
##########
@@ -95,6 +95,33 @@ Result<ScanTaskIterator> InMemoryFragment::Scan(std::shared_ptr<ScanOptions> opt
   return MakeMapIterator(fn, std::move(batches_it));
 }
 
+Result<RecordBatchGenerator> InMemoryFragment::ScanBatchesAsync(
+    const ScanOptions& options) {
+  struct Generator {
+    Future<std::shared_ptr<RecordBatch>> operator()() {
+      if (batch_index >= self->record_batches_.size()) {
+        return AsyncGeneratorEnd<std::shared_ptr<RecordBatch>>();
+      }
+      const auto& next_parent = self->record_batches_[batch_index];
+      if (offset + batch_size < next_parent->num_rows()) {
+        offset += batch_size;
+        auto next = next_parent->Slice(offset, batch_size);
+        return Future<std::shared_ptr<RecordBatch>>::MakeFinished(std::move(next));
+      }
+      batch_index++;
+      auto next = next_parent->Slice(offset, batch_size);
+      return Future<std::shared_ptr<RecordBatch>>::MakeFinished(std::move(next));

Review comment:
Yep. This logic was all backwards. I've since changed it to a while loop. Just pushed the change.
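
For readers following along: the pushed change is not shown in this message, so the following is only a sketch of the while-loop shape the comment describes, not the code that landed in the PR. In the quoted lines, `offset` appears to be advanced before the slice is taken (so the first rows of each parent batch would be skipped), and it does not appear to be reset when `batch_index` moves on. The sketch models the same bookkeeping without Arrow; `Slice`, `SliceCursor`, and the use of bare row counts in place of `RecordBatch` objects are all hypothetical stand-ins.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a RecordBatch slice: which parent batch it
// comes from, where it starts, and how many rows it covers.
struct Slice {
  std::size_t batch_index;
  int64_t offset;
  int64_t length;
};

// Hypothetical cursor over parent batches described only by row counts.
class SliceCursor {
 public:
  SliceCursor(std::vector<int64_t> row_counts, int64_t batch_size)
      : row_counts_(std::move(row_counts)), batch_size_(batch_size) {
    assert(batch_size_ > 0);
  }

  // Returns the next slice, or std::nullopt once every batch is consumed.
  std::optional<Slice> Next() {
    // Skip parents that are exhausted or empty, resetting the offset
    // each time we move to a new parent.
    while (batch_index_ < row_counts_.size() &&
           offset_ >= row_counts_[batch_index_]) {
      ++batch_index_;
      offset_ = 0;
    }
    if (batch_index_ >= row_counts_.size()) {
      return std::nullopt;
    }
    // The last slice of a parent may be shorter than batch_size_.
    const int64_t remaining = row_counts_[batch_index_] - offset_;
    const int64_t length = std::min(batch_size_, remaining);
    Slice next{batch_index_, offset_, length};
    // Advance only after the slice is recorded, so row 0 is not skipped.
    offset_ += length;
    return next;
  }

 private:
  std::vector<int64_t> row_counts_;
  int64_t batch_size_;
  std::size_t batch_index_ = 0;
  int64_t offset_ = 0;
};
```

With parents of 5, 0, and 3 rows and a batch size of 2, repeated calls to `Next()` walk every row exactly once and skip the empty parent. In the real generator each slice would presumably be wrapped in a finished `Future`, as the quoted code does with `MakeFinished`.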