Abdeali Kothari created SPARK-22448:
---------------------------------------
Summary: Add functions like Mode(), NumNulls(), etc. in Summarizer
Key: SPARK-22448
URL: https://issues.apache.org/jira/browse/SPARK-22448
Project: Spark
Issue Type: New Feature
Components: Optimizer
Affects Versions: 2.2.0
Reporter: Abdeali Kothari

It would be very useful to have a MODE() function among the summary statistics currently supported by Datasets. I can see that the Summarizer has many useful functions in 2.3.0, and it would be useful to add the following to it:
 - Mode - The element that occurs the maximum number of times
 - CSS - Cumulative Sum of Squares ... Sum((x - mean)^2)
 - NumNull - The number of values that are NULL in the column
 - SUM - Just the sum of the column
 ...
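To make the four requested statistics concrete, here is a minimal plain-Python sketch of what each one would compute for a single column. It is not Spark code and does not reflect the Summarizer API; the function name `summarize`, the result keys, and the choice to skip NULLs (None) when computing Mode, CSS, and SUM are all assumptions made for illustration.

```python
from collections import Counter
from typing import Optional, Sequence


def summarize(column: Sequence[Optional[float]]) -> dict:
    """Compute Mode, CSS, NumNull, and SUM for one column.

    Hypothetical helper: NULLs are modeled as None and are skipped
    for every statistic except NumNull (an assumption).
    """
    values = [v for v in column if v is not None]
    num_null = len(column) - len(values)
    total = sum(values)

    if not values:
        # No non-NULL values: Mode and CSS are undefined here.
        return {"mode": None, "css": None, "num_null": num_null, "sum": total}

    mean = total / len(values)
    # CSS as written in the request: Sum((x - mean)^2)
    css = sum((v - mean) ** 2 for v in values)
    # Mode: most frequent element; ties go to the value seen first.
    mode = Counter(values).most_common(1)[0][0]

    return {"mode": mode, "css": css, "num_null": num_null, "sum": total}
```

For example, summarize([1.0, 2.0, 2.0, None, 5.0]) counts one NULL and computes the remaining statistics over the four non-NULL values.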