koert kuipers created SPARK-15204:
-------------------------------------
Summary: Nullability is not correct for Aggregator
Key: SPARK-15204
URL: https://issues.apache.org/jira/browse/SPARK-15204
Project: Spark
Issue Type: Bug
Components: SQL
Environment: spark-2.0.0-SNAPSHOT
Reporter: koert kuipers
{noformat}
import org.apache.spark.sql.{Encoder, Encoders, Row}
import org.apache.spark.sql.expressions.Aggregator
import spark.implicits._

object SimpleSum extends Aggregator[Row, Int, Int] {
  def zero: Int = 0
  def reduce(b: Int, a: Row): Int = b + a.getInt(1)
  def merge(b1: Int, b2: Int): Int = b1 + b2
  def finish(b: Int): Int = b
  def bufferEncoder: Encoder[Int] = Encoders.scalaInt
  def outputEncoder: Encoder[Int] = Encoders.scalaInt
}
val df = List(("a", 1), ("a", 2), ("a", 3)).toDF("k", "v")
val df1 = df.groupBy("k").agg(SimpleSum.toColumn as "v1")
df1.printSchema
df1.show
root
 |-- k: string (nullable = true)
 |-- v1: integer (nullable = true)
+---+---+
| k| v1|
+---+---+
| a| 6|
+---+---+
{noformat}
Notice that v1 has nullable set to true. The default (and expected) behavior for
Spark SQL is to mark an Int column as non-nullable (nullable = false), since a
Scala Int cannot hold null.
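For comparison, here is a minimal sketch (assuming a SparkSession {{spark}} with {{spark.implicits._}} in scope) showing that the Int column of the input DataFrame itself is non-nullable by default, which is the behavior the Aggregator's output column would be expected to match:

{noformat}
// A DataFrame built from Scala tuples: the Int column "v" comes out
// non-nullable because a primitive Int has no null representation.
val df = List(("a", 1), ("a", 2), ("a", 3)).toDF("k", "v")
df.printSchema
// root
//  |-- k: string (nullable = true)
//  |-- v: integer (nullable = false)
{noformat}

The Aggregator above reduces non-nullable Ints with {{Encoders.scalaInt}}, so its result column carrying nullable = true appears to come from the framework, not from the user-supplied encoders.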
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)