kevinwilfong opened a new pull request, #10569: URL: https://github.com/apache/incubator-gluten/pull/10569
## What changes are proposed in this pull request?

This PR changes SubstraitToVeloxPlanConverter to pass the full table schema to the HiveTableHandle, rather than only the schema of the columns we read from the files. This is a prerequisite for index-based column resolution, which covers two cases: mapping between the file schema and the table schema by column position instead of by name, and reading file formats that carry no schema information, such as Text.

To do this I updated VeloxIteratorApi to set the file schema of LocalFilesNodes to the data schema of the Scan (when present). This is similar to what the Iterator API for ClickHouse already does. In VeloxPlanConverter, I then parse that schema and pass it to the SplitInfo. Finally, in SubstraitToVeloxPlanConverter, I extract it from the SplitInfo and pass it to the HiveTableHandle constructor in place of the base schema.

This should have no noticeable effect on existing code paths or file formats, because the table schema is a superset of the base schema and file columns are currently mapped to table columns exclusively by name.

## How was this patch tested?

Ran the existing unit tests. This change is not expected to alter any existing behavior; it is groundwork for future changes.
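To illustrate why index-based resolution needs the full table schema rather than just the projected columns, here is a minimal, self-contained sketch. It does not use the Velox or Gluten APIs: `Schema`, `resolveByName`, `resolveByPosition`, and `example` are hypothetical names introduced only for this illustration, and the sketch assumes the file stores its columns in table order.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a schema: an ordered list of column names.
using Schema = std::vector<std::string>;

// Name-based resolution: look the column up in the file's own schema.
// This only works when the file carries column names.
std::optional<std::size_t> resolveByName(
    const Schema& fileSchema, const std::string& column) {
  for (std::size_t i = 0; i < fileSchema.size(); ++i) {
    if (fileSchema[i] == column) {
      return i;
    }
  }
  return std::nullopt;
}

// Index-based resolution: take the column's position in the given schema
// and assume the file stores its columns in that same order.
// Returns nullopt if the column is unknown or the file has too few columns.
std::optional<std::size_t> resolveByPosition(
    const Schema& schema, std::size_t fileColumnCount,
    const std::string& column) {
  for (std::size_t i = 0; i < schema.size(); ++i) {
    if (schema[i] == column) {
      if (i < fileColumnCount) {
        return i;
      }
      return std::nullopt;
    }
  }
  return std::nullopt;
}

// Usage: the same lookup with the full table schema and with only the
// projected columns.
void example() {
  const Schema tableSchema = {"id", "name", "price"};
  const Schema projectedOnly = {"price"};

  // Full table schema: "price" maps to file column 2.
  assert(resolveByPosition(tableSchema, 3, "price").value() == 2u);

  // Projected columns only: the position is lost and "price" maps to 0,
  // which in this toy file layout is actually the "id" column.
  assert(resolveByPosition(projectedOnly, 3, "price").value() == 0u);

  // A schema-less file (modeled as an empty file schema) cannot be
  // resolved by name at all.
  assert(!resolveByName(Schema{}, "price").has_value());
}
```

With name-based mapping the projected schema is enough, which matches the PR's claim that existing paths are unaffected; the positional lookup only gives the right answer when it is handed the whole table schema.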
