The function you pass to mapPartitionsWithIndex should return an iterator. This is from the programming guide (http://spark.apache.org/docs/latest/programming-guide.html):

  mapPartitionsWithIndex(func): Similar to mapPartitions, but also provides
  func with an integer value representing the index of the partition, so
  func must be of type (Int, Iterator<T>) => Iterator<U> when running on an
  RDD of type T.

For your case, try something similar to the following:

    val keyval = dRDD.mapPartitionsWithIndex { (ind, iter) =>
      iter.map(x => process(ind, x.trim().split(' ').map(_.toDouble), q, m, r))
    }

-Xiangrui

On Sun, Jul 13, 2014 at 11:26 PM, Madhura <das.madhur...@gmail.com> wrote:
> I have a text file consisting of a large number of random floating values
> separated by spaces. I am loading this file into a RDD in scala.
>
> I have heard of mapPartitionsWithIndex but I haven't been able to implement
> it. For each partition I want to call a method(process in this case) to
> which I want to pass the partition and it's respective index as parameters.
>
> My method returns a pair of values.
> This is what I have done.
>
> val dRDD = sc.textFile("hdfs://master:54310/Data/input*")
> var ind:Int=0
> val keyval= dRDD.mapPartitionsWithIndex((ind,x) => process(ind,x,...))
> val res=keyval.collect()
>
> We are not able to access res(0)._1 and res(0)._2
>
> The error log is as follows.
>
> [error] SimpleApp.scala:420: value trim is not a member of Iterator[String]
> [error] Error occurred in an application involving default arguments.
> [error] val keyval=dRDD.mapPartitionsWithIndex( (ind,x) =>
> process(ind,x.trim().split(' ').map(_.toDouble),q,m,r))
> [error]
> ^
> [error] SimpleApp.scala:425: value mkString is not a member of
> Array[Nothing]
> [error] println(res.mkString(""))
> [error] ^
> [error] /SimpleApp.scala:427: value _1 is not a member of Nothing
> [error] var final= res(0)._1
> [error] ^
> [error] /home/madhura/DTWspark/src/main/scala/SimpleApp.scala:428: value _2
> is not a member of Nothing
> [error] var final1 = res(0)._2 - m +1
> [error] ^
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/mapPartitionsWithIndex-tp9590.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
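
To make the type requirement from the programming guide concrete, here is a minimal sketch that runs without a cluster. It simulates two partitions with plain Scala iterators and applies a function of the shape (Int, Iterator[String]) => Iterator[U] to each. The object name, the two-argument `process`, and its (Int, Double) return type are assumptions for illustration only; the real `process` and its extra arguments (q, m, r) are not shown in this thread. It also assumes values are separated by exactly one space.

```scala
object MapPartitionsWithIndexSketch {
  // Hypothetical stand-in for the poster's process(...). The real signature,
  // including the q, m, r arguments, is not shown in the thread.
  def process(ind: Int, values: Array[Double]): (Int, Double) =
    (ind, values.sum)

  // Same shape that mapPartitionsWithIndex expects on an RDD[String]:
  // it receives the partition index and an iterator over the partition's
  // lines, and it returns an iterator, not a single value.
  val perPartition: (Int, Iterator[String]) => Iterator[(Int, Double)] =
    (ind, iter) =>
      iter.map(line => process(ind, line.trim.split(' ').map(_.toDouble)))

  def main(args: Array[String]): Unit = {
    // Two simulated partitions; each element is one line of the text file.
    val partitions = Seq(
      Iterator("1.0 2.0 3.0", "4.0 5.0"),
      Iterator("0.5 0.5")
    )

    // Apply the function partition by partition, as Spark would.
    val res = partitions.zipWithIndex.flatMap {
      case (iter, ind) => perPartition(ind, iter)
    }

    res.foreach(println)
    // prints:
    // (0,6.0)
    // (0,9.0)
    // (1,1.0)
  }
}
```

With a SparkContext available, the same function value could be passed directly, for example `dRDD.mapPartitionsWithIndex(perPartition).collect()`. Because the element type is then a known pair rather than Nothing, `res(0)._1` and `res(0)._2` should type-check.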