A more general question: assume I have a JavaRDD<K> that is sorted. How can I convert it into a JavaPairRDD<Integer, K> where the Integer is the index, 0 ... N - 1?

This is easy to do on one machine:

    JavaRDD<K> values = ... // create here

    JavaPairRDD<Integer, K> positions = values.mapToPair(new PairFunction<K, Integer, K>() {
        private int index = 0;

        @Override
        public Tuple2<Integer, K> call(final K t) throws Exception {
            return new Tuple2<Integer, K>(index++, t);
        }
    });

But will this code do the right thing on a cluster?
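To make the concern concrete, here is a minimal sketch in plain Java with no Spark dependency. It assumes the function object is copied to each partition, so every copy starts its own counter at 0. The class PartitionIndexSketch and the methods naiveIndices and offsetIndices are hypothetical names used only for illustration. The partitions are modelled as a list of lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartitionIndexSketch {

    // Models the stateful counter: each partition gets a fresh copy of the
    // function, so the counter restarts at 0 in every partition.
    static <K> List<List<Integer>> naiveIndices(List<List<K>> partitions) {
        List<List<Integer>> result = new ArrayList<>();
        for (List<K> partition : partitions) {
            int index = 0; // fresh counter per partition (assumed behaviour)
            List<Integer> indices = new ArrayList<>();
            for (K ignored : partition) {
                indices.add(index++);
            }
            result.add(indices);
        }
        return result;
    }

    // Two passes: first compute each partition's starting offset from the
    // sizes of the partitions before it, then number elements from that offset.
    static <K> List<List<Integer>> offsetIndices(List<List<K>> partitions) {
        int[] offsets = new int[partitions.size()];
        int running = 0;
        for (int p = 0; p < partitions.size(); p++) {
            offsets[p] = running;
            running += partitions.get(p).size();
        }
        List<List<Integer>> result = new ArrayList<>();
        for (int p = 0; p < partitions.size(); p++) {
            int index = offsets[p];
            List<Integer> indices = new ArrayList<>();
            for (K ignored : partitions.get(p)) {
                indices.add(index++);
            }
            result.add(indices);
        }
        return result;
    }

    public static void main(String[] args) {
        // Three "partitions" of an already sorted data set.
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "b", "c"),
                Arrays.asList("d", "e"),
                Arrays.asList("f"));

        System.out.println(naiveIndices(partitions));  // [[0, 1, 2], [0, 1], [0]]
        System.out.println(offsetIndices(partitions)); // [[0, 1, 2], [3, 4], [5]]
    }
}
```

Under that assumption the naive counter produces duplicate indices across partitions, while the offset version yields 0 ... N - 1. In Spark itself, JavaRDD has a zipWithIndex() method that returns a JavaPairRDD<K, Long>; swapping key and value with mapToPair might then give the index-keyed pairs asked about above.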