westonpace commented on issue #35595:
URL: https://github.com/apache/arrow/issues/35595#issuecomment-1552998924
> Is this indeed a bug and my use of the API is correct, are there any
> workarounds for this?
Hmm, the suspicious part to me here is the call to `format->MakeFragment`.
This function is primarily intended for internal use. The normal flow is:
* Create a dataset
* Scan the dataset
Scanning a fragment directly should be technically possible. However, the
call to `MakeFragment` expects to receive the "physical schema". This must be
the schema of the file itself. My best guess is that the schema built by your
`makeTestSchema` does not match the column order stored in the Parquet file.
The schema provided to `scanOpts` is the dataset schema (not the physical
schema), and is free to be in whatever order you want. The method
`InferColumnSchema` is attempting to map between the two. Since it thinks the
physical schema is identical to the dataset schema, it is not doing any
reordering.
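To make the failure mode concrete, here is a toy model in plain C++. It is not Arrow code and not how `InferColumnSchema` is implemented; `ToyFile`, `FieldIndex` and `ReadColumn` are hypothetical names invented for this sketch. It only assumes what is described above: a dataset column is resolved by name against the physical schema the caller supplied, and the data is then fetched by position from the file.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a file (hypothetical, not an Arrow type).
struct ToyFile {
  std::vector<std::string> stored_names;  // real on-disk column order
  std::vector<std::vector<int>> columns;  // columns[i] holds stored_names[i]
};

// Position of `name` in `schema`, or schema.size() if it is absent.
std::size_t FieldIndex(const std::vector<std::string>& schema,
                       const std::string& name) {
  for (std::size_t i = 0; i < schema.size(); ++i) {
    if (schema[i] == name) return i;
  }
  return schema.size();
}

// Resolve `name` against the physical schema the caller *claimed*, then
// fetch the file column at that position. The file's real column names are
// never consulted, so a wrong claim silently yields the wrong data.
std::vector<int> ReadColumn(const ToyFile& file,
                            const std::vector<std::string>& claimed_physical,
                            const std::string& name) {
  assert(file.stored_names.size() == file.columns.size());
  const std::size_t idx = FieldIndex(claimed_physical, name);
  if (idx >= claimed_physical.size() || idx >= file.columns.size()) {
    return {};  // unknown field or out of range: return nothing
  }
  return file.columns[idx];
}
```

In this model, if the file stores its columns as `b, a` but the caller claims the physical schema is `a, b`, then asking for `a` returns the values stored for `b`. Passing the file's real order (`b, a`) as the physical schema returns the right values, whatever order the dataset schema uses.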