lidavidm commented on a change in pull request #10076:
URL: https://github.com/apache/arrow/pull/10076#discussion_r619478321



##########
File path: cpp/src/arrow/dataset/scanner_test.cc
##########
@@ -115,7 +116,7 @@ class TestScanner : public DatasetFixtureMixinWithParam<TestScannerParams> {
 
     AssertScanBatchesUnorderedEquals(expected.get(), scanner.get(), 1);
   }
-};
+};  // namespace dataset

Review comment:
       Interesting. FWIW, I've only ever used the `format` target from our build scripts, which doesn't have this issue.

##########
File path: cpp/src/arrow/dataset/scanner.cc
##########
@@ -480,13 +492,24 @@ Result<std::shared_ptr<Table>> AsyncScanner::ToTable() {
   return table_fut.result();
 }
 
+Result<EnumeratedRecordBatchGenerator> AsyncScanner::ScanBatchesUnorderedAsync() {
+  return ScanBatchesUnorderedAsync(internal::GetCpuThreadPool());
+}
+
 Result<EnumeratedRecordBatchGenerator> AsyncScanner::ScanBatchesUnorderedAsync(
     internal::Executor* cpu_executor) {
   auto self = shared_from_this();
   ARROW_ASSIGN_OR_RAISE(auto fragment_gen, GetFragments());
   ARROW_ASSIGN_OR_RAISE(auto batch_gen_gen,
                         FragmentsToBatches(self, std::move(fragment_gen)));
-  return MakeConcatenatedGenerator(std::move(batch_gen_gen));
+  auto batch_gen_gen_readahead = MakeSerialReadaheadGenerator(
+      std::move(batch_gen_gen), scan_options_->fragment_readahead);
+  return MakeMergedGenerator(std::move(batch_gen_gen_readahead),
+                             scan_options_->fragment_readahead);
+}

Review comment:
       Ah ok, thanks for the explanation. I think I got mixed up because I thought `MakeMergedGenerator` could also itself queue up to one item per source.
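
To make the distinction concrete, here is a rough, synchronous toy model of what I understand "merge with a subscription limit" to mean. It does not use Arrow's generator types: `Source`, `ToyMergedReader`, and `Drain` are hypothetical names invented for this sketch, and the round-robin order is an assumption (a real async merge would deliver whichever source is ready first). The point is only that the limit bounds how many sources are open at once, and that a limit of 1 degenerates to plain concatenation.

```cpp
#include <cstddef>
#include <deque>
#include <optional>
#include <utility>
#include <vector>

// Toy stand-in for one inner generator (e.g. the batches of one fragment).
// Hypothetical type for illustration; not part of Arrow.
using Source = std::deque<int>;

// Synchronous toy model of merging a sequence of sources while keeping at
// most `max_subscriptions` of them open at once (assumed to be >= 1).
class ToyMergedReader {
 public:
  ToyMergedReader(std::deque<Source> sources, std::size_t max_subscriptions)
      : pending_(std::move(sources)), max_subscriptions_(max_subscriptions) {}

  // Returns the next item, or std::nullopt once every source is drained.
  std::optional<int> Next() {
    while (true) {
      // Open further sources, never exceeding max_subscriptions_.
      while (active_.size() < max_subscriptions_ && !pending_.empty()) {
        active_.push_back(std::move(pending_.front()));
        pending_.pop_front();
      }
      if (active_.empty()) return std::nullopt;
      // Round-robin: take the front source and rotate it to the back.
      Source current = std::move(active_.front());
      active_.pop_front();
      if (current.empty()) continue;  // exhausted: drop it, open the next
      int item = current.front();
      current.pop_front();
      active_.push_back(std::move(current));
      return item;
    }
  }

 private:
  std::deque<Source> pending_;  // sources not yet opened
  std::deque<Source> active_;   // sources currently being read
  std::size_t max_subscriptions_;
};

// Collects everything the reader produces, in delivery order.
std::vector<int> Drain(ToyMergedReader reader) {
  std::vector<int> out;
  while (auto item = reader.Next()) out.push_back(*item);
  return out;
}
```

In this model nothing is buffered per source beyond the sources themselves, so any readahead has to come from a separate stage in front of the merge, which is the role I now take `MakeSerialReadaheadGenerator` to play in the change above.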




