GitHub user mn-mikke commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21704#discussion_r200134825
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -2007,7 +2007,14 @@ case class Concat(children: Seq[Expression]) extends Expression {
         }
       }
     
    -  override def dataType: DataType = children.map(_.dataType).headOption.getOrElse(StringType)
    +  override def dataType: DataType = {
    +    val dataTypes = children.map(_.dataType)
    +    dataTypes.headOption.map {
    +      case ArrayType(et, _) =>
    +        ArrayType(et, dataTypes.exists(_.asInstanceOf[ArrayType].containsNull))
    --- End diff --
    
    @ueshin For ```Concat```, ```Coalesce```, etc., that seems to be the case, 
since a coercion rule is executed if there is any nullability difference at any 
level of nesting. But it's not the case for the ```CaseWhenCoercion``` rule, since 
the ```sameType``` method is used for comparison.
    
    I'm wondering: if the goal is to avoid generating extra ```Cast``` 
expressions, shouldn't the other coercion rules use the ```sameType``` method as 
well? Let's assume that the result of ```concat``` is subsequently used by 
```flatten```. Wouldn't that lead to the generation of extra null-safe checks, as 
mentioned 
[here](https://github.com/apache/spark/pull/21704#discussion_r200110924)?
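
To make the distinction concrete, here is a minimal, self-contained sketch of the two ideas discussed above. It does not use Spark's classes: ```DType```, ```StrType```, ```ArrType```, ```sameTypeIgnoringNullability``` and ```concatResultType``` are hypothetical names for a toy model, and the fallback branches are my assumption because the diff above is truncated. The first function compares types while ignoring ```containsNull``` at every nesting level, which is the behavior attributed to ```sameType``` in this thread. The second mirrors the shape of the proposed ```dataType```: the result contains nulls if any child does.

```scala
// Toy model of data types (hypothetical, not Spark's DataType hierarchy).
sealed trait DType
case object StrType extends DType
final case class ArrType(elementType: DType, containsNull: Boolean) extends DType

object NullabilitySketch {
  // Structural comparison that ignores containsNull at every nesting level.
  def sameTypeIgnoringNullability(a: DType, b: DType): Boolean = (a, b) match {
    case (ArrType(ea, _), ArrType(eb, _)) => sameTypeIgnoringNullability(ea, eb)
    case _ => a == b
  }

  // Result type of a concat-like expression: containsNull is true as soon as
  // any array child may contain nulls. The non-array and empty branches are
  // assumptions, since the quoted diff stops before them.
  def concatResultType(children: Seq[DType]): DType = children.headOption match {
    case Some(ArrType(et, _)) =>
      ArrType(et, children.exists {
        case ArrType(_, containsNull) => containsNull
        case _ => false
      })
    case Some(other) => other
    case None => StrType
  }
}
```

In this model, two array types that differ only in ```containsNull``` are unequal under ```==``` but equal under ```sameTypeIgnoringNullability```, so a rule based on the latter would not need to insert a cast, while the merged result type still records that nulls may be present.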

