Re: [PR] Avoid pushdown of volatile functions to tablescan [datafusion]

via GitHub Tue, 19 Nov 2024 12:43:03 -0800


theirix commented on code in PR #13475:
URL: https://github.com/apache/datafusion/pull/13475#discussion_r1849012118



##########
datafusion/optimizer/src/push_down_filter.rs:
##########
@@ -998,7 +998,19 @@ impl OptimizerRule for PushDownFilter {
                         filter_predicates.len());
                 }
 
-                let zip = filter_predicates.into_iter().zip(results);
+                let zip =
+                    filter_predicates
+                        .into_iter()
+                        .zip(results)
+                        .map(|(expr, res)| {
+                            let filter_pushdown_type = if expr.is_volatile() {

Review Comment:
   As I understand, we can ask a table source to tell if the filter is 
supported even for volatile expressions.
   
   I thought the main difference is which call is more expensive. 
`supports_filters_pushdown` for different sources often provides blanket 
all-supported or all-unsupported for all predicates. So we can save 
`is_volatile` calls by filtering out `Unsupported` values before. In contrast, 
if we check for volatility first and only then ask the source if the filters 
are supported, we can waste `is_volatilty` calls. If the 
`supports_filters_pushdown` is more costly, let's reverse it, of course.
   
   I pushed a change to check the volatility after filtering out unsupported 
predicates. If we prefer the inverse order, I could refactor this optimiser 
quite a bit to:
   - Get a subset of supported predicates
   - Apply the `is_volatile` filter to it
   - Build `new_scan_filters` and `new_predicate`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] Avoid pushdown of volatile functions to tablescan [datafusion]

Reply via email to