Github user henryr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20687#discussion_r172992262
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/complexTypesSuite.scala ---
    @@ -331,4 +330,30 @@ class ComplexTypesSuite extends PlanTest with ExpressionEvalHelper {
           .analyze
         comparePlans(Optimizer execute rel, expected)
       }
    +
    +  test("SPARK-23500: Simplify complex ops that aren't at the plan root") {
    +    // If nullable attributes aren't used, the array and map test cases fail because array
    +    // and map indexing can return null so the output is marked nullable.
    --- End diff --
    
    The optimization works either way, but in (for example) the map case, `m1` is marked as nullable in the original plan because `GetMapValue(CreateMap(...))` can presumably return `null` if the key is not in the map.
    
    So for the expected plan to compare equal to the original, it has to read a nullable attribute; otherwise the plans don't pass `comparePlans`. I moved and reworded the comment to hopefully clarify this a bit.
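    To make the nullability point concrete, here is a toy sketch of the inference (these are stand-in classes, not the real Catalyst expressions): map indexing is always nullable because the lookup can miss, even when every value in the literal map is non-null.
    
    ```scala
    // Hypothetical mini-model of expression nullability; names mirror
    // Catalyst but this is not the real API.
    sealed trait Expr { def nullable: Boolean }
    case class Literal(value: Any, nullable: Boolean = false) extends Expr
    case class CreateMap(entries: Seq[(Expr, Expr)]) extends Expr {
      // The map itself is never null (its values might be).
      val nullable = false
    }
    case class GetMapValue(map: Expr, key: Expr) extends Expr {
      // A lookup can miss, so the result is always nullable.
      val nullable = true
    }
    
    val m = CreateMap(Seq((Literal("a"), Literal(1)), (Literal("b"), Literal(2))))
    println(GetMapValue(m, Literal("c")).nullable) // prints true: the key may be absent
    ```
    
    The same reasoning applies to array indexing, which is why the expected plan must also read a nullable attribute for `comparePlans` to succeed.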
    
    There's an opportunity to fix this up again after the rule completes (since 
some attributes could be marked too conservatively as nullable). Do you think 
that's something we should pursue for this PR?

