Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20687#discussion_r173012266

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/complexTypesSuite.scala ---
@@ -331,4 +330,31 @@ class ComplexTypesSuite extends PlanTest with ExpressionEvalHelper {
       .analyze
     comparePlans(Optimizer execute rel, expected)
   }
+
+  test("SPARK-23500: Simplify complex ops that aren't at the plan root") {
+    val structRel = relation
+      .select(GetStructField(CreateNamedStruct(Seq("att1", 'nullable_id)), 0, None) as "foo")
+      .groupBy($"foo")("1").analyze
+    val structExpected = relation
+      .select('nullable_id as "foo")
+      .groupBy($"foo")("1").analyze
+    comparePlans(Optimizer execute structRel, structExpected)
+
+    // If nullable attributes aren't used in the 'expected' plans, the array and map test
+    // cases fail because array and map indexing can return null so the output attribute
--- End diff --

`nullable` is mostly calculated on demand, so we don't have rules that change the `nullable` property. In this case, the expression is `Alias(GetArrayItem(CreateArray(Attribute...)))`, which is nullable. After optimization it becomes `Alias(Attribute...)`, which is not nullable (if the attribute itself is not nullable). So `nullable` is updated automatically. I don't know why you hit this issue; please ping us if you can't figure it out, and we can help debug.
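A minimal sketch of the point above, using hypothetical stand-in classes (not Spark's actual `Expression` hierarchy): when `nullable` is a method computed from the children rather than a stored flag, rewriting the tree updates nullability automatically, with no separate rule needed.

```scala
// Hypothetical expression tree; `nullable` is derived on demand, never stored.
sealed trait Expr { def nullable: Boolean }

// A column reference; its nullability comes from the schema.
case class Attr(name: String, attrNullable: Boolean) extends Expr {
  def nullable: Boolean = attrNullable
}

// Array construction itself never produces null here.
case class CreateArr(children: Seq[Expr]) extends Expr {
  def nullable: Boolean = false
}

// Indexing can be out of bounds, so the result is always nullable.
case class GetItem(child: Expr, index: Int) extends Expr {
  def nullable: Boolean = true
}

// An alias simply inherits its child's nullability.
case class AliasE(child: Expr, name: String) extends Expr {
  def nullable: Boolean = child.nullable
}

object NullableDemo extends App {
  val attr = Attr("id", attrNullable = false)

  // Before optimization: Alias(GetItem(CreateArr(attr), 0)) -- nullable.
  val before = AliasE(GetItem(CreateArr(Seq(attr)), 0), "foo")

  // After a simplification rule collapses it to Alias(attr) -- not nullable,
  // because the underlying attribute is not nullable.
  val after = AliasE(attr, "foo")

  println(before.nullable) // true
  println(after.nullable)  // false
}
```

This is why the `expected` plans in the test need nullable attributes: comparing plans requires the output attributes' nullability to match, and the optimized side derives it from the (possibly non-nullable) attribute directly.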