GitHub user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21732#discussion_r230739979
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala ---
    @@ -207,7 +198,7 @@ case class ExpressionEncoder[T](
       val serializer: Seq[NamedExpression] = {
         val clsName = Utils.getSimpleName(clsTag.runtimeClass)
     
    -    if (isSerializedAsStruct) {
    +    if (isSerializedAsStruct && !classOf[Option[_]].isAssignableFrom(clsTag.runtimeClass)) {
    --- End diff --
    
    I think a few other places need to check for Option as well. I will also add a few more tests to cover some use cases.
    
    One possible place is Dataset.groupByKey. Before that, I may need #22944 to be merged first, so that I can write something like:
    
    ```scala
    // Group by the tuple's String field; "d" is the fallback key for None.
    val ds = Seq(Some(("a", 1)), Some(("b", 2)), Some(("c", 3))).toDS()
    ds.groupByKey(_.map(_._1).getOrElse("d"))
      .agg(sum("value._2").as[Long], sum($"value._2" + 1).as[Long])
    ```
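
    For reference, the Option check added in the diff above can be sketched in isolation. This is only a minimal illustration using plain Scala reflection; `OptionCheckSketch` and `isOptionType` are hypothetical names and not part of Spark, and the real encoder code reads the class from its own `clsTag`.

    ```scala
    import scala.reflect.{ClassTag, classTag}

    object OptionCheckSketch {
      // Hypothetical helper (not a Spark API): true when T's runtime class
      // is Option or one of its subclasses (Some, None.type).
      def isOptionType[T: ClassTag]: Boolean =
        classOf[Option[_]].isAssignableFrom(classTag[T].runtimeClass)

      def main(args: Array[String]): Unit = {
        println(isOptionType[Option[(String, Int)]]) // prints "true"
        println(isOptionType[Some[Int]])             // prints "true"
        println(isOptionType[(String, Int)])         // prints "false"
      }
    }
    ```

    Because the check works on the erased runtime class, it only detects that the top-level type is an Option. It says nothing about the element type inside it.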


---
