Cerdore commented on issue #39765:
URL: https://github.com/apache/arrow/issues/39765#issuecomment-1910090923

   The extra 4 columns returned in the benchmark are:
   
   ```
   const FieldVector kAugmentedFields{
       field("__fragment_index", int32()),
       field("__batch_index", int32()),
       field("__last_in_fragment", boolean()),
       field("__filename", utf8()),
   };
   ```
   In the failure case, 6 columns are returned instead of the expected 2.
   This is because `kScanFactory` is used as the scan options, which triggers
   execution of the `MakeScanNode` function. That function adds the four
   columns shown above.
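
To make the 6-versus-2 count concrete, here is a minimal sketch that filters the four augmented names out of a list of column names. It uses only the standard library, not the Arrow API. `IsAugmentedField`, `DropAugmentedFields`, and the name list `kAugmentedFieldNames` are hypothetical helpers introduced for illustration; only the four field names come from `kAugmentedFields` above.

```cpp
#include <algorithm>
#include <array>
#include <string>
#include <string_view>
#include <vector>

// The four names from kAugmentedFields above, copied as plain strings.
// (Hypothetical constant; Arrow itself stores them as a FieldVector.)
constexpr std::array<std::string_view, 4> kAugmentedFieldNames{
    "__fragment_index", "__batch_index", "__last_in_fragment", "__filename"};

// Returns true if `name` matches one of the augmented field names exactly.
bool IsAugmentedField(std::string_view name) {
  return std::find(kAugmentedFieldNames.begin(), kAugmentedFieldNames.end(),
                   name) != kAugmentedFieldNames.end();
}

// Hypothetical helper, not part of Arrow: returns the column names with the
// augmented ones removed, preserving the original order.
std::vector<std::string> DropAugmentedFields(
    const std::vector<std::string>& names) {
  std::vector<std::string> kept;
  for (const auto& name : names) {
    if (!IsAugmentedField(name)) kept.push_back(name);
  }
  return kept;
}
```

Given two user columns (say, the placeholder names `a` and `b`) followed by the four augmented names, `DropAugmentedFields` returns just the two user columns. Whether the benchmark should filter the output like this or avoid the augmented fields through its scan options is a separate question that this sketch does not settle.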
   
   I also examined the `TEST_F(TestReordering, ScanBatches)` test case, which
   also invokes `MakeScanNode`. The extra columns are added in that test as
   well, and the two cases use similar scan options.
   
   PRs related to `MakeScanNode`:
   [If a projected_schema is not supplied ..](https://github.com/apache/arrow/pull/12466)
   [Implement ability to retrieve fragment filename](https://github.com/apache/arrow/pull/12560)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
