dbtsai commented on issue #27155: [SPARK-17636][SPARK-25557][SQL] Parquet and 
ORC predicate pushdown in nested fields
URL: https://github.com/apache/spark/pull/27155#issuecomment-574826262
 
 
   Hello @emaynardigs,
   
   Thank you for your contribution; I value your work a lot. In fact, at 
Apple, we are still using an updated version of 
https://github.com/apache/spark/pull/22535 , which is critical to our production 
job. As far as I know, Databricks' runtime also has an implementation that 
takes a similar approach to this issue.
   
   The reason I have been inactive on my previous PR is that I feel adding 
nested support to the current filter API is a short-term solution, since its 
design does not account for these complex use cases. For a better long-term 
solution, I would like to create a new set of FilterV2 APIs in the DSv2 
framework that treat nested columns as first-class citizens. + @cloud-fan 
@rdblue @viirya for feedback on this.
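
   To illustrate the limitation, here is a rough sketch only. It is not the 
actual Spark API and not the WIP code. It assumes the current filter identifies 
a column by a single string name, and it uses hypothetical classes `EqualToV1` 
and `EqualToV2` to contrast that with a filter that carries an explicit path of 
field names:

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class EqualToV1:
    """Hypothetical stand-in for the current shape: the column is one string."""
    attribute: str
    value: Any


@dataclass(frozen=True)
class EqualToV2:
    """Hypothetical FilterV2 shape: the column is a path of field names."""
    field_path: Tuple[str, ...]
    value: Any


# Two different columns: a top-level column literally named "a.b",
# and field "b" nested inside struct column "a".
v1_top_level = EqualToV1("a.b", 1)
v1_nested = EqualToV1("a.b", 1)  # same string, so the two cannot be told apart

v2_top_level = EqualToV2(("a.b",), 1)
v2_nested = EqualToV2(("a", "b"), 1)  # the path keeps the nesting explicit
```

   With a single string, a data source has to guess whether a dot is part of 
the column name or a struct separator; with an explicit path, the two cases 
stay distinct.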
   
   I have already started working on the FilterV2 API, and here is the WIP 
code: https://github.com/dbtsai/spark/pull/10/files . The change is bigger than 
I expected, and I am now debating whether we actually need a new FilterV2 
framework.
   
   Feedback and ideas are welcome.
   
   Thanks.

