koert kuipers created SPARK-15204:
-------------------------------------

             Summary: Nullability is not correct for Aggregator
                 Key: SPARK-15204
                 URL: https://issues.apache.org/jira/browse/SPARK-15204
             Project: Spark
          Issue Type: Bug
          Components: SQL
         Environment: spark-2.0.0-SNAPSHOT
            Reporter: koert kuipers


{noformat}
object SimpleSum extends Aggregator[Row, Int, Int] {
  def zero: Int = 0
  def reduce(b: Int, a: Row) = b + a.getInt(1)
  def merge(b1: Int, b2: Int) = b1 + b2
  def finish(b: Int) = b
  def bufferEncoder: Encoder[Int] = Encoders.scalaInt
  def outputEncoder: Encoder[Int] = Encoders.scalaInt
}

val df = List(("a", 1), ("a", 2), ("a", 3)).toDF("k", "v")
val df1 = df.groupBy("k").agg(SimpleSum.toColumn as "v1")
df1.printSchema
df1.show

root
 |-- k: string (nullable = true)
 |-- v1: integer (nullable = true)

+---+---+
|  k| v1|
+---+---+
|  a|  6|
+---+---+
{noformat}

Notice how v1 has nullable set to true. The default (and expected) behavior for Spark SQL is to mark an Int column as nullable = false, since Scala's Int is a primitive and cannot hold null.
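For reference, the input column itself shows the expected default: a Scala Int field is encoded as a non-nullable integer column. This is a minimal sketch assuming a live SparkSession with spark.implicits._ in scope:

{noformat}
// assuming a SparkSession `spark` and: import spark.implicits._
val df = List(("a", 1), ("a", 2), ("a", 3)).toDF("k", "v")
df.printSchema
// expected:
// root
//  |-- k: string (nullable = true)
//  |-- v: integer (nullable = false)
{noformat}

So v comes in as nullable = false, yet after running the Aggregator the resulting v1 column (also backed by Encoders.scalaInt) comes out nullable = true.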



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
