huaxingao commented on pull request #33584:
URL: https://github.com/apache/spark/pull/33584#issuecomment-893088473


   @gengliangwang Thanks a lot for taking a look at this problem.
   
   To push down the aggregate, I need the postScanFilter at this line
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala#L78
to be empty.
   
   In JDBC, this postScanFilter is un-pushed filters + untranslated filters.
   In file-based data sources, this postScanFilter is currently data filters +
partition filters + untranslated filters.
   
   My goal is to make the file-based data sources' postScanFilter contain only
data filters + untranslated filters. If this postScanFilter is then empty, I
can push down the aggregate, e.g. `SELECT count(*) FROM t WHERE part_col = 1`
can be pushed down.
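   
   As a rough illustration of that goal, here is a minimal, self-contained
sketch. It does not use Spark's classes: `SimpleFilter`, `PostScanFilterSketch`
and its methods are hypothetical names, and untranslated filters are left out
for brevity.

   ```scala
   // Hypothetical, simplified model of a filter; this is not Spark's Expression.
   final case class SimpleFilter(column: String, value: Int)

   object PostScanFilterSketch {
     // Split filters into (partition filters, data filters) by the column
     // each filter references.
     def split(
         filters: Seq[SimpleFilter],
         partitionCols: Set[String]): (Seq[SimpleFilter], Seq[SimpleFilter]) =
       filters.partition(f => partitionCols.contains(f.column))

     // Assumed proposed behaviour: only data filters stay in postScanFilter.
     def postScanFilter(
         filters: Seq[SimpleFilter],
         partitionCols: Set[String]): Seq[SimpleFilter] =
       split(filters, partitionCols)._2

     // The aggregate can be pushed down only when nothing is left to evaluate
     // after the scan.
     def canPushDownAggregate(
         filters: Seq[SimpleFilter],
         partitionCols: Set[String]): Boolean =
       postScanFilter(filters, partitionCols).isEmpty
   }
   ```

   In this model, `WHERE part_col = 1` on a table partitioned by `part_col`
leaves the postScanFilter empty, while a filter on a data column does not.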
   
   I can add duplicated code to resolve the partition filters, but I would
still need to remove them from the postScanFilter. Once they are removed here,
no partition filters are left by the time `PruneFileSourcePartitions` runs, so
that rule has nothing to do.
   
   If we don't want to touch `PruneFileSourcePartitions`, I guess that instead
of checking whether the postScanFilter is empty with `if filters.isEmpty`, we
could do something like this:
   ```
   if (JDBC)
     if filters.isEmpty
       push down aggregate
   else // file based
     if filters are only partition filters
       push down aggregate
   ```
   This looks hacky though. Please let me know if anybody has a better idea. 
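   
   For concreteness, the branching above could be sketched roughly as below.
This is a self-contained model, not Spark code: `SourceKind`,
`RemainingFilter` and `AggregatePushDownCheck` are hypothetical names used
only for illustration.

   ```scala
   // Hypothetical source kinds; Spark does not define these types.
   sealed trait SourceKind
   case object JdbcSource extends SourceKind
   case object FileBasedSource extends SourceKind

   // Hypothetical model of a filter that remains in postScanFilter.
   final case class RemainingFilter(column: String, isPartitionFilter: Boolean)

   object AggregatePushDownCheck {
     def canPushDown(kind: SourceKind, filters: Seq[RemainingFilter]): Boolean =
       kind match {
         // JDBC: any remaining filter blocks aggregate push down.
         case JdbcSource => filters.isEmpty
         // File based: partition filters are tolerated. forall is true on an
         // empty Seq, so an empty filter list also qualifies.
         case FileBasedSource => filters.forall(_.isPartitionFilter)
       }
   }
   ```

   The per-source match makes explicit that the condition depends on the kind
of data source.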
   
-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


