Hi,

You don't need to run approxPercentile against a list. Since it is an aggregate function, you can simply run:

// Just to illustrate the idea.
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{col, collect_list}
import org.apache.spark.sql.catalyst.expressions.Literal
import org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile
val approxPercentile = new ApproximatePercentile(col("v1").expr, Literal(percentage))
val agg_approx_percentile = new Column(approxPercentile.toAggregateExpression())
df.groupBy("k1", "k2", "k3").agg(collect_list("v1"), agg_approx_percentile)

Rishi wrote
> I need to compute quantiles in Spark on a numeric field after a group
> by operation. Is there a way to apply the approxPercentile on an
> aggregated list instead of a column?
>
> E.g. the DataFrame looks like
>
> k1 | k2 | k3 | v1
> a1 | b1 | c1 | 879
> a2 | b2 | c2 | 769
> a1 | b1 | c1 | 129
> a2 | b2 | c2 | 323
>
> I need to first run groupBy (k1, k2, k3) and collect_list(v1), and then
> compute quantiles [10th, 50th...] on the list of v1's

-----
Liang-Chi Hsieh | @viirya
Spark Technology Center
http://www.spark.tc/