Github user henryr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20687#discussion_r172992262

    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/complexTypesSuite.scala ---
    @@ -331,4 +330,30 @@ class ComplexTypesSuite extends PlanTest with ExpressionEvalHelper {
           .analyze
         comparePlans(Optimizer execute rel, expected)
       }
    +
    +  test("SPARK-23500: Simplify complex ops that aren't at the plan root") {
    +    // If nullable attributes aren't used, the array and map test cases fail because array
    +    // and map indexing can return null so the output is marked nullable.
    --- End diff --

The optimization works either way, but in (for example) the map case, `m1` is marked as nullable in the original plan, presumably because `GetMapValue(CreateMap(...))` can return `null` if the key is not in the map. So for the expected plan to compare equal to the original, it has to read a nullable attribute; otherwise the plans don't pass `comparePlans`. I moved and reworded the comment to hopefully clarify this a bit.

There's an opportunity to tighten this up again after the rule completes, since some attributes may be marked as nullable too conservatively. Do you think that's something we should pursue in this PR?
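To illustrate the nullability point with a minimal sketch (plain Scala, not Catalyst code): indexing a map with an absent key yields no value, which is why `GetMapValue` over a `CreateMap` is conservatively typed as nullable even when every key in the test literal is present.

```scala
// Plain-Scala analogue of the SQL semantics: a map lookup can miss,
// so the result type must admit "no value" (null in SQL, None here).
object MapNullabilitySketch {
  def lookup(m: Map[Int, String], key: Int): Option[String] = m.get(key)

  def main(args: Array[String]): Unit = {
    val m = Map(1 -> "a", 2 -> "b")
    println(lookup(m, 1)) // key present: Some("a")
    println(lookup(m, 3)) // key absent: None -- the "null" case
  }
}
```

This is only an analogy for why the attribute ends up nullable; the actual behavior in Spark is determined by `GetMapValue`'s `nullable` definition in Catalyst.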