Andrew Ray created SPARK-21100:
----------------------------------

             Summary: describe should give quartiles similar to Pandas
                 Key: SPARK-21100
                 URL: https://issues.apache.org/jira/browse/SPARK-21100
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.1.1
            Reporter: Andrew Ray
            Priority: Minor

The DataFrame describe method should also include quartiles (25th, 50th, and 75th percentiles) like Pandas.

Example pandas output:

{code}
In [4]: df.describe()
Out[4]:
       Unnamed: 0       displ         year         cyl         cty         hwy
count  234.000000  234.000000   234.000000  234.000000  234.000000  234.000000
mean   117.500000    3.471795  2003.500000    5.888889   16.858974   23.440171
std     67.694165    1.291959     4.509646    1.611534    4.255946    5.954643
min      1.000000    1.600000  1999.000000    4.000000    9.000000   12.000000
25%     59.250000    2.400000  1999.000000    4.000000   14.000000   18.000000
50%    117.500000    3.300000  2003.500000    6.000000   17.000000   24.000000
75%    175.750000    4.600000  2008.000000    8.000000   19.000000   27.000000
max    234.000000    7.000000  2008.000000    8.000000   35.000000   44.000000
{code}
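
For reference, the requested summary rows can be sketched in plain Python. This is only an illustration of the statistics involved, not a proposed Spark implementation: the function name `describe_with_quartiles` is hypothetical, and the quartiles use exact linear interpolation (which is what pandas does by default), whereas an implementation in Spark might reasonably choose approximate percentiles instead.

```python
import statistics


def describe_with_quartiles(values):
    """Return pandas-style summary statistics for a sequence of numbers.

    Hypothetical helper for illustration only. Quartiles are computed with
    linear interpolation between closest ranks (the "inclusive" method),
    and std is the sample standard deviation (n - 1 in the denominator).
    """
    data = sorted(values)
    # quantiles(n=4) returns the three cut points: 25%, 50%, 75%.
    q25, q50, q75 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "count": len(data),
        "mean": statistics.mean(data),
        "std": statistics.stdev(data),
        "min": data[0],
        "25%": q25,
        "50%": q50,
        "75%": q75,
        "max": data[-1],
    }


if __name__ == "__main__":
    # The "Unnamed: 0" column in the example above is simply 1..234.
    for name, value in describe_with_quartiles(range(1, 235)).items():
        print(f"{name:<6}{value:>12.6f}")
```

Run on the integers 1 through 234, this gives the same quartiles as the "Unnamed: 0" column in the pandas output (59.25, 117.5, 175.75).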