Cerdore commented on issue #39765:
URL: https://github.com/apache/arrow/issues/39765#issuecomment-1910090923
The extra 4 columns returned in the benchmark are:
```
const FieldVector kAugmentedFields{
    field("__fragment_index", int32()),
    field("__batch_index", int32()),
    field("__last_in_fragment", boolean()),
    field("__filename", utf8()),
};
```
In the failure case, 6 columns are returned instead of the expected 2.
This is because `kScanFactory` is used as the scan option, which triggers
the execution of the `MakeScanNode` function. That function adds the
4 augmented columns shown above to the output.
I also examined the `TEST_F(TestReordering, ScanBatches)` test case, which
also invokes the `MakeScanNode` function. The extra columns are added in
that test as well, and the two cases use similar scan options.
MakeScanNode
Related PRs:
[If a projected_schema is not supplied ..](https://github.com/apache/arrow/pull/12466)
[Implement ability to retrieve fragment filename](https://github.com/apache/arrow/pull/12560)