[GitHub] [spark] LorenzoMartini commented on pull request #36918: [SQL][SPARK-39528] Use V2 Filter in SupportsRuntimeFiltering

GitBox Tue, 10 Jan 2023 03:35:49 -0800


LorenzoMartini commented on PR #36918:
URL: https://github.com/apache/spark/pull/36918#issuecomment-1377123165


   Hi @huaxingao.
   
   We are trying to use Spark DataSource V2 and noticed that the built-in v2 
data sources (e.g. the Parquet one, looking at `ParquetScan`) implement neither 
`SupportsRuntimeFiltering` nor `SupportsRuntimeV2Filtering` by default, which 
creates a large performance difference between the v1 and v2 data sources out 
of the box.
   
   Is there a plan to have them support this? It would be really beneficial for 
file scans to be able to do this, and given that they already benefit from some 
pushdowns, we were wondering why runtime filtering is not implemented. Or 
maybe I am missing something? In that case, it would be great to understand 
how to have Spark file sources take advantage of dynamic partition pruning (DPP).
   
   Thanks!

