[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6799: fix: incorrect nullability of `InList` expr

via GitHub Fri, 30 Jun 2023 12:24:36 -0700


alamb commented on code in PR #6799:
URL: https://github.com/apache/arrow-datafusion/pull/6799#discussion_r1248216604



##########
datafusion/expr/src/expr_schema.rs:
##########
@@ -173,8 +173,30 @@ impl ExprSchemable for Expr {
             Expr::Alias(expr, _)
             | Expr::Not(expr)
             | Expr::Negative(expr)
-            | Expr::Sort(Sort { expr, .. })
-            | Expr::InList(InList { expr, .. }) => expr.nullable(input_schema),
+            | Expr::Sort(Sort { expr, .. }) => expr.nullable(input_schema),
+
+            Expr::InList(InList { expr, list, .. }) => {
+                // Avoid inspecting too many expressions.
+                const MAX_INSPECT_LIMIT: usize = 6;

Review Comment:
   As @jonahgao  hints at , trying to add some sort of cache / memoization is 
going to be challenging given Rust and how DataFusion is structured
   
   I think the current PR solution is good:
   1. It is well tested 
   2. It is conservative (it might say an InList is nullable that isn't, but I 
think that will not generate wrong results, only potentially less optimal plans)
   
   Thus I think we should merge this PR as is and then as a follow on PR we can 
remove the limit or increase it, etc. 
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6799: fix: incorrect nullability of `InList` expr

Reply via email to