gengliangwang edited a comment on pull request #33584: URL: https://github.com/apache/spark/pull/33584#issuecomment-892826981
> In order to lift the above restriction, at the time of checking whether to push down the aggregate, we should have already separated the partition filters and data filters. However, in the current code, we won't separate these two kinds of filters until PruneFileSourcePartitions. This PR proposes to separate partition filters and data filters in PushDownUtils, so we can use this info to determine whether we can push down the aggregate when a filter is involved.

@huaxingao `FileScanBuilder` already has the partition schema and data schema. IIUC we can extract the partition filters from a set of filters without these changes. As @viirya and @sunchao point out, this PR makes the code complicated. Shall we simply add duplicated code to resolve the partition filters when pushing down Aggregation in V2 first? We can look back later and see whether we need the refactoring.
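For illustration, a minimal sketch of what I have in mind, using only the partition schema that `FileScanBuilder` already holds; the `splitFilters` helper below is hypothetical, not an existing Spark method:

```scala
import org.apache.spark.sql.sources.Filter
import org.apache.spark.sql.types.StructType

// Hypothetical helper: split the pushed filters into partition filters
// (those referencing only partition columns) and data filters (the rest).
// It relies only on Filter.references and StructType.fieldNames from
// Spark's public API; the helper itself is illustrative, not Spark code.
def splitFilters(
    filters: Seq[Filter],
    partitionSchema: StructType): (Seq[Filter], Seq[Filter]) = {
  val partitionCols = partitionSchema.fieldNames.toSet
  filters.partition { f =>
    f.references.nonEmpty && f.references.forall(partitionCols.contains)
  }
}
```

Something like this could live next to the aggregate push-down check without touching how PruneFileSourcePartitions handles the split later.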
