kevinwilfong opened a new pull request, #10569:
URL: https://github.com/apache/incubator-gluten/pull/10569

   ## What changes are proposed in this pull request?
   
   This PR changes SubstraitToVeloxPlanConverter to pass the full table schema
   to the HiveTableHandle, rather than only the schema of the columns we read
   from the files.
   
   This is a prerequisite for supporting index-based column resolution, whether
   that means mapping between the file schema and the table schema by column
   position rather than by name, or reading file formats such as Text that
   carry no schema information.
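
To illustrate why the full table schema matters here, the following is a minimal sketch of the two resolution strategies. It is not Gluten or Velox code: the function names (`resolve_by_name`, `resolve_by_index`, `ordinal`) and the example column names are hypothetical and chosen only for illustration.

```python
from typing import List, Optional

# Hypothetical table schema, in table-definition order.
TABLE_COLUMNS = ["id", "name", "price"]


def resolve_by_name(table_columns: List[str],
                    file_columns: List[str]) -> List[Optional[int]]:
    """For each table column, return the index of the file column with the
    same name, or None if the file has no column with that name."""
    file_index = {name: i for i, name in enumerate(file_columns)}
    return [file_index.get(name) for name in table_columns]


def resolve_by_index(table_columns: List[str],
                     num_file_columns: int) -> List[Optional[int]]:
    """For each table column, return its own ordinal as the file column
    index, or None if the file has fewer columns than the table."""
    return [i if i < num_file_columns else None
            for i in range(len(table_columns))]


def ordinal(column: str, schema: List[str]) -> int:
    """Position of a column within a given schema."""
    return schema.index(column)


# A file whose column names do not match the table: name-based
# resolution finds nothing, position-based resolution still works.
by_name = resolve_by_name(TABLE_COLUMNS, ["_col0", "_col1", "_col2"])
by_index = resolve_by_index(TABLE_COLUMNS, 3)

# A schema-less file (for example delimited text) with only two fields.
text_file = resolve_by_index(TABLE_COLUMNS, 2)

# With only the pruned schema of the columns being read, the ordinal of
# "price" is lost: it is 0 in the pruned schema but 2 in the table.
pruned_ordinal = ordinal("price", ["price"])
table_ordinal = ordinal("price", TABLE_COLUMNS)
```

In this sketch, position-based resolution only gives the right answer when the ordinal is taken from the full table schema, which is the information a pruned schema cannot provide.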
   
   To do this, I updated VeloxIteratorApi to set the file schema of
   LocalFilesNodes to the data schema of the Scan (when present). This is
   similar to what's already done in the Iterator API for ClickHouse. I then
   parse that schema and pass it to the SplitInfo in VeloxPlanConverter.
   Finally, I extract it from the SplitInfo in SubstraitToVeloxPlanConverter
   and pass it to the HiveTableHandle constructor in place of the base schema.
   
   This should not produce any noticeable effect for the existing code
   paths/file formats, as the table schema is a superset of the base schema
   and file columns are currently mapped to table columns exclusively by name.
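
The reasoning above can be checked with a small sketch. It assumes a simplified model in which a schema is a mapping from column name to type and the reader looks up each requested column by name; the helper `map_requested_by_name` and the example schemas are hypothetical and do not correspond to actual Velox code.

```python
from typing import Dict, List, Optional, Tuple


def map_requested_by_name(
    requested: List[str],
    handle_schema: Dict[str, str],
    file_columns: List[str],
) -> List[Tuple[str, str, Optional[int]]]:
    """Resolve each requested column to (name, type, file column index),
    looking up both the type and the file column purely by name."""
    file_index = {name: i for i, name in enumerate(file_columns)}
    return [(name, handle_schema[name], file_index.get(name))
            for name in requested]


# Hypothetical schemas: the base schema holds only the column being read,
# while the table schema is a superset of it.
BASE_SCHEMA = {"price": "double"}
TABLE_SCHEMA = {"id": "bigint", "name": "varchar", "price": "double"}
FILE_COLUMNS = ["id", "name", "price"]

with_base = map_requested_by_name(["price"], BASE_SCHEMA, FILE_COLUMNS)
with_table = map_requested_by_name(["price"], TABLE_SCHEMA, FILE_COLUMNS)
```

Under these assumptions both calls produce the same mapping, because the extra entries in the table schema are never consulted when lookup is by name.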
   
   ## How was this patch tested?
   
   Ran the existing unit tests. This change should not alter any existing
   behavior, but it should enable future changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

