I was able to solve it by writing a Java method (to slice and dice the data) and invoking it from Spark's map. This transformed the data much faster than my previous approach.
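
For anyone finding this thread later, here is a minimal sketch of the general pattern, not the actual code. The class name RecordSlicer, the method slice, and the comma-splitting logic are all hypothetical stand-ins for whatever per-record transformation is needed. Plain Java streams are used in main so the snippet runs without a Spark dependency; with Spark on the classpath, the same static method could presumably be passed to map on a JavaRDD<String> as a method reference, as noted in the comment.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class RecordSlicer {
    // Hypothetical per-record transformation: split a comma-separated
    // line into fields and trim surrounding whitespace from each one.
    static List<String> slice(String line) {
        return Arrays.stream(line.split(","))
                .map(String::trim)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample input, made up for illustration only.
        List<String> lines = Arrays.asList("a, b ,c", "1,2, 3");

        // Assumption: with Spark available, the equivalent call on a
        // JavaRDD<String> named rdd would look like
        //   JavaRDD<List<String>> sliced = rdd.map(RecordSlicer::slice);
        // Here java.util.stream stands in for the RDD.
        List<List<String>> sliced = lines.stream()
                .map(RecordSlicer::slice)
                .collect(Collectors.toList());

        System.out.println(sliced);
    }
}
```

Keeping the logic in a static method with no captured state means there is nothing extra to serialize when Spark ships the function to executors, and the method can be unit-tested without a cluster.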
Thanks geoHeil for the pointer.