Re: [SparkSQL 1.4.0]The result of SUM(xxx) in SparkSQL is 0.0 but not null when the column xxx is all null

2015-07-06 Thread Michael Armbrust
This was a change that was made to match a wrong answer coming from older versions of Hive. Unfortunately, I think it's too late to fix this in the 1.4 branch (as I'd like to avoid changing answers at all in point releases), but in Spark 1.5 we revert to the correct behavior.
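
For readers who want to see the NULL-propagating behavior that the thread subject contrasts with 0.0, here is a minimal sketch. It is not Spark code: it assumes an in-memory SQLite database as a stand-in engine, and reuses the table name test and the columns a and b from the original post. The column types and the coalesce() wrapper are illustrative assumptions, the latter showing one way a query could ask for 0.0 explicitly rather than rely on the engine's default.

```python
import sqlite3

# Assumption: SQLite stands in for the SQL engine here; this is not Spark SQL.
conn = sqlite3.connect(":memory:")

# Recreate the table from the original post (column types are assumed).
conn.execute("CREATE TABLE test (a INTEGER, b REAL)")
conn.executemany("INSERT INTO test VALUES (?, ?)", [(1, None), (2, None)])

# SUM over a column containing only NULLs yields NULL (None in Python).
sum_result = conn.execute("SELECT sum(b) FROM test").fetchone()[0]

# Wrapping the aggregate in coalesce() requests 0.0 explicitly.
coalesced = conn.execute("SELECT coalesce(sum(b), 0.0) FROM test").fetchone()[0]

print(sum_result)  # None
print(coalesced)   # 0.0
```

A query that depends on getting 0.0 for an all-null column can state that with coalesce(), so its result does not change if the engine's default for SUM changes between releases.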

[SparkSQL 1.4.0]The result of SUM(xxx) in SparkSQL is 0.0 but not null when the column xxx is all null

2015-07-03 Thread StanZhai
Hi all,

I have a table named test like this:

| a | b    |
| 1 | null |
| 2 | null |

After upgrading the cluster from Spark 1.3.1 to 1.4.0, I found that the SUM function behaves differently in Spark 1.4 and 1.3.

The SQL is:

select sum(b) from test

In Spark 1.4.0 the result is 0.0, in Spark 1.3.1 the