Github user maropu commented on a diff in the pull request:
https://github.com/apache/spark/pull/18576#discussion_r177782600
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1309,3 +1311,20 @@ object RemoveRepetitionFromGroupExpressions extends Rule[LogicalPlan] {
}
}
}
+
+/**
+ * Updates nullability in [[AttributeReference]]s if nullability is different between
+ * non-leaf plan's expressions and the children output.
+ */
+object UpdateNullabilityInAttributeReferences extends Rule[LogicalPlan] {
--- End diff --
I dropped the changes to `execution.FilterExec`, but are you suggesting that
we drop the changes to `logical.Filter`, too?
https://github.com/apache/spark/pull/18576/files#diff-72917e7b68f0311b2fb42990e0dc616dR139
I basically agree that the `Join.output` modification is simpler and more
important, but is it okay to ignore nullability in `logical.Filter`? For
example, in the current master,
`QueryPlanConstraints.inferIsNotNullConstraints` appends non-nullable
constraints in `logical.Filter`, and these constraints aren't correctly
propagated to upper plan nodes now. So, I think it'd be better to respect
nullability in both `logical.Join` and `logical.Filter`.
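To make the concern concrete, here is a minimal, self-contained sketch of the
idea. It is a toy model, not the code in this PR: `Attr`, `Plan`, `Relation`,
`Filter`, `Project`, and `UpdateNullability` are hypothetical stand-ins for
the catalyst classes, and attributes are matched by name rather than by
expression ID. It assumes the filter condition has already been reduced to
the set of columns it proves non-null.

```scala
// Toy model only: hypothetical stand-ins, not Spark's catalyst classes.
final case class Attr(name: String, nullable: Boolean)

sealed trait Plan { def output: Seq[Attr] }

final case class Relation(output: Seq[Attr]) extends Plan

// A filter whose condition implies the named columns are not null,
// e.g. `a > 1` implies `a IS NOT NULL`.
final case class Filter(notNullCols: Set[String], child: Plan) extends Plan {
  // Columns proven non-null by the condition are reported as non-nullable.
  def output: Seq[Attr] = child.output.map { a =>
    if (notNullCols.contains(a.name)) a.copy(nullable = false) else a
  }
}

// A parent node holding attribute references that may carry stale nullability.
final case class Project(projectList: Seq[Attr], child: Plan) extends Plan {
  def output: Seq[Attr] = projectList
}

object UpdateNullability {
  // Bottom-up: re-sync each parent's references with its child's output.
  def apply(plan: Plan): Plan = plan match {
    case Project(list, child) =>
      val newChild = apply(child)
      val childNullability = newChild.output.map(a => a.name -> a.nullable).toMap
      val synced = list.map { a =>
        a.copy(nullable = childNullability.getOrElse(a.name, a.nullable))
      }
      Project(synced, newChild)
    case Filter(cols, child) => Filter(cols, apply(child))
    case r: Relation => r
  }
}

object Example {
  private val a = Attr("a", nullable = true)
  private val b = Attr("b", nullable = true)

  // The Project still refers to `a` as nullable although the Filter proves otherwise.
  val plan: Plan = Project(Seq(a, b), Filter(Set("a"), Relation(Seq(a, b))))
  val updated: Plan = UpdateNullability(plan)
}
```

In this sketch, `Example.plan.output` still reports `a` as nullable, while
`Example.updated.output` reports it as non-nullable. If the toy `Filter.output`
ignored `notNullCols`, the update step would have nothing to propagate, which
is the case the comment above is worried about.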