friendlymatthew commented on PR #20822:
URL: https://github.com/apache/datafusion/pull/20822#issuecomment-4025466833

   > > > This looks great to me!
   > > > How do we generate the right ProjectionMask / translate into the right 
leaf column index in Parquet? I don't see that added anywhere but maybe the 
existing code already did that correctly?
   > > 
   > > 
   > > We do not. I added a note in the PR message:
   > > > Note: this does not address the projection side and should not be 
blocked by it. SELECT s['foo'] still reads the entire struct rather than just 
the needed leaf column. That requires separate changes to how the opener builds 
its projection mask.
   > > 
   > > 
   > > I have some ideas about this and will push up a follow-up PR.
   > 
   > The wording is a bit confusing here, because projection can mean one of two 
things: the SELECT part of the query, or the projection of the columns that the 
filters need in order to be evaluated. I assume you are referring to the latter, 
i.e. although we support these filters now, we still read the entire struct 
column and then apply the get field operation in memory?
   
   Yes, exactly. Filters on struct fields are now pushed down to the row-level 
filter, but the projection side still reads the entire struct column. I still 
need to teach DataFusion to project only the needed subcolumns (leaf columns) 
of the struct rather than materializing the whole thing.
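
To make the "leaf column" idea concrete, here is a rough, std-only sketch of how a field path such as `s['foo']` could be resolved to leaf indices, given that Parquet numbers leaf columns depth-first across the schema. The `Field` enum, `example_schema`, and `leaf_indices` are hypothetical names used only for illustration; they are not DataFusion or `parquet` crate types, and this is not the code in the PR. In the real reader, indices like these would presumably be handed to something like `ProjectionMask::leaves`.

```rust
// Toy schema model (hypothetical, for illustration only).
enum Field {
    Leaf(&'static str),
    Struct(&'static str, Vec<Field>),
}

impl Field {
    fn name(&self) -> &'static str {
        match self {
            Field::Leaf(n) | Field::Struct(n, _) => *n,
        }
    }

    // Number of leaf columns underneath this field.
    fn leaf_count(&self) -> usize {
        match self {
            Field::Leaf(_) => 1,
            Field::Struct(_, children) => children.iter().map(Field::leaf_count).sum(),
        }
    }
}

// Resolve `path` to the leaf indices it covers, counting leaves depth-first.
// An unknown path, or one that descends below a leaf, yields an empty Vec.
fn leaf_indices(fields: &[Field], path: &[&str]) -> Vec<usize> {
    resolve(fields, path, 0)
}

fn resolve(fields: &[Field], path: &[&str], mut offset: usize) -> Vec<usize> {
    let Some((head, rest)) = path.split_first() else {
        return Vec::new();
    };
    for field in fields {
        if field.name() == *head {
            return match field {
                // Keep descending while path segments remain.
                Field::Struct(_, children) if !rest.is_empty() => resolve(children, rest, offset),
                // The path tries to go below a leaf: nothing matches.
                Field::Leaf(_) if !rest.is_empty() => Vec::new(),
                // Path ends here: select every leaf under this field.
                _ => (offset..offset + field.leaf_count()).collect(),
            };
        }
        // Skip past all leaves of the sibling we did not match.
        offset += field.leaf_count();
    }
    Vec::new()
}

// Schema: id, s { foo, bar }, name  ->  leaves 0, 1, 2, 3
fn example_schema() -> Vec<Field> {
    vec![
        Field::Leaf("id"),
        Field::Struct("s", vec![Field::Leaf("foo"), Field::Leaf("bar")]),
        Field::Leaf("name"),
    ]
}

fn main() {
    let schema = example_schema();
    println!("{:?}", leaf_indices(&schema, &["s", "foo"])); // [1]
    println!("{:?}", leaf_indices(&schema, &["s"])); // [1, 2]
}
```

Selecting `s` as a whole covers leaves 1 and 2, which is roughly what the reader does today; selecting only `s.foo` would need just leaf 1.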
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

