lidavidm commented on a change in pull request #10076:
URL: https://github.com/apache/arrow/pull/10076#discussion_r619478321
##########
File path: cpp/src/arrow/dataset/scanner_test.cc
##########
@@ -115,7 +116,7 @@ class TestScanner : public DatasetFixtureMixinWithParam<TestScannerParams> {
AssertScanBatchesUnorderedEquals(expected.get(), scanner.get(), 1);
}
-};
+}; // namespace dataset
Review comment:
Interesting. FWIW, I've only ever used the `format` target from our build
scripts, which doesn't have this issue.
##########
File path: cpp/src/arrow/dataset/scanner.cc
##########
@@ -480,13 +492,24 @@ Result<std::shared_ptr<Table>> AsyncScanner::ToTable() {
return table_fut.result();
}
+Result<EnumeratedRecordBatchGenerator> AsyncScanner::ScanBatchesUnorderedAsync() {
+ return ScanBatchesUnorderedAsync(internal::GetCpuThreadPool());
+}
+
Result<EnumeratedRecordBatchGenerator> AsyncScanner::ScanBatchesUnorderedAsync(
internal::Executor* cpu_executor) {
auto self = shared_from_this();
ARROW_ASSIGN_OR_RAISE(auto fragment_gen, GetFragments());
ARROW_ASSIGN_OR_RAISE(auto batch_gen_gen,
FragmentsToBatches(self, std::move(fragment_gen)));
- return MakeConcatenatedGenerator(std::move(batch_gen_gen));
+ auto batch_gen_gen_readahead = MakeSerialReadaheadGenerator(
+ std::move(batch_gen_gen), scan_options_->fragment_readahead);
+ return MakeMergedGenerator(std::move(batch_gen_gen_readahead),
+ scan_options_->fragment_readahead);
+}
Review comment:
Ah OK, thanks for the explanation. I got mixed up because I thought
`MakeMergedGenerator` could itself also queue up to one item per source.
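
For anyone following along, here is a rough, synchronous toy model of the bound that the second argument to `MakeMergedGenerator` places on how many fragments are consumed at once. It is only a sketch under stated assumptions: `Batch`, `BatchSource`, `FragmentSource`, `FromVector`, `FromFragments`, and `MergeAll` are hypothetical names made up for illustration, not Arrow APIs. The real generators are asynchronous and unordered, and this toy models neither the buffering done by `MakeSerialReadaheadGenerator` nor any per-source queueing.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

// Toy, synchronous stand-ins. None of these names are Arrow APIs.
using Batch = int;
// A pull-based source of batches: returns std::nullopt once exhausted.
using BatchSource = std::function<std::optional<Batch>()>;
// An outer source that yields one BatchSource per fragment.
using FragmentSource = std::function<std::optional<BatchSource>()>;

// Hypothetical helper: a source that yields the given batches in order.
BatchSource FromVector(std::vector<Batch> batches) {
  auto state = std::make_shared<std::pair<std::vector<Batch>, std::size_t>>(
      std::move(batches), 0);
  return [state]() -> std::optional<Batch> {
    if (state->second >= state->first.size()) return std::nullopt;
    return state->first[state->second++];
  };
}

// Hypothetical helper: an outer source over a list of fragments.
FragmentSource FromFragments(std::vector<std::vector<Batch>> fragments) {
  auto state = std::make_shared<
      std::pair<std::vector<std::vector<Batch>>, std::size_t>>(
      std::move(fragments), 0);
  return [state]() -> std::optional<BatchSource> {
    if (state->second >= state->first.size()) return std::nullopt;
    return FromVector(std::move(state->first[state->second++]));
  };
}

// Drain all batches while keeping at most `max_active` fragments open.
// Open fragments are polled round-robin; when one is exhausted it is
// dropped and the next fragment (if any) is pulled from `fragments`.
std::vector<Batch> MergeAll(FragmentSource fragments, std::size_t max_active) {
  assert(max_active > 0);
  std::deque<BatchSource> active;
  bool outer_done = false;
  auto refill = [&] {
    while (!outer_done && active.size() < max_active) {
      auto next = fragments();
      if (!next) {
        outer_done = true;
        break;
      }
      active.push_back(std::move(*next));
    }
  };

  std::vector<Batch> out;
  refill();
  while (!active.empty()) {
    BatchSource source = std::move(active.front());
    active.pop_front();
    if (auto batch = source()) {
      out.push_back(*batch);
      active.push_back(std::move(source));  // still live: back of the rotation
    }
    refill();  // top up after an exhausted source was dropped
  }
  return out;
}
```

With fragments `{1, 2, 3}`, `{10, 20}`, `{100}` and `max_active = 2`, the toy interleaves the first two fragments and only opens the third once one of them is exhausted; with `max_active = 1` it degenerates to plain concatenation, which is roughly what the removed `MakeConcatenatedGenerator` call did.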