The function you pass to mapPartitionsWithIndex must return an iterator.
This is from the programming guide:

mapPartitionsWithIndex(func): Similar to mapPartitions, but also
provides func with an integer value representing the index of the
partition, so func must be of type (Int, Iterator<T>) => Iterator<U>
when running on an RDD of type T.

For your case, try something similar to the following:

val keyval = dRDD.mapPartitionsWithIndex { (ind, iter) =>
  iter.map(x => process(ind, x.trim().split(' ').map(_.toDouble), q, m, r))
}
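
The function shape can be checked locally without a cluster. The sketch below is only an illustration: this process is a hypothetical stand-in (the real one also takes q, m and r, which are omitted here), and parseLine and the object name are made up for the example. It shows two ways to satisfy the (Int, Iterator<T>) => Iterator<U> requirement: one result per line, or one result per partition wrapped in an Iterator.

```scala
object PartitionSketch {
  // Hypothetical stand-in for the poster's process method; it returns a pair.
  def process(ind: Int, values: Array[Double]): (Int, Double) =
    (ind, values.sum)

  // Parse one line of space-separated numbers, skipping empty tokens.
  def parseLine(line: String): Array[Double] =
    line.trim.split(' ').filter(_.nonEmpty).map(_.toDouble)

  // Variant 1: call process once per line. iter.map already yields an Iterator.
  val perLine: (Int, Iterator[String]) => Iterator[(Int, Double)] =
    (ind, iter) => iter.map(line => process(ind, parseLine(line)))

  // Variant 2: call process once per partition, then wrap the single
  // result in an Iterator so the return type still matches.
  val perPartition: (Int, Iterator[String]) => Iterator[(Int, Double)] =
    (ind, iter) => Iterator(process(ind, iter.flatMap(line => parseLine(line)).toArray))

  def main(args: Array[String]): Unit = {
    val lines = List("1.0 2.0", " 3.5 ")
    println(perLine(0, lines.iterator).toList)      // List((0,3.0), (0,3.5))
    println(perPartition(1, lines.iterator).toList) // List((1,6.5))
  }
}
```

Either function value should be usable as dRDD.mapPartitionsWithIndex(PartitionSketch.perLine) on an RDD of strings. Note that the second variant materializes the whole partition as one array, so it assumes each partition fits in memory.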


On Sun, Jul 13, 2014 at 11:26 PM, Madhura <> wrote:
> I have a text file consisting of a large number of random floating values
> separated by spaces. I am loading this file into a RDD in scala.
> I have heard of mapPartitionsWithIndex but I haven't been able to implement
> it. For each partition I want to call a method(process in this case) to
> which I want to pass the partition and its respective index as parameters.
> My method returns a pair of values.
> This is what I have done.
> val dRDD = sc.textFile("hdfs://master:54310/Data/input*")
> var ind:Int=0
> val keyval= dRDD.mapPartitionsWithIndex((ind,x) => process(ind,x,...))
> val res=keyval.collect()
> We are not able to access res(0)._1 and res(0)._2
> The error log is as follows.
> [error] SimpleApp.scala:420: value trim is not a member of Iterator[String]
> [error] Error occurred in an application involving default arguments.
> [error]     val keyval=dRDD.mapPartitionsWithIndex( (ind,x) =>
> process(ind,x.trim().split(' ').map(_.toDouble),q,m,r))
> [error]
> ^
> [error] SimpleApp.scala:425: value mkString is not a member of
> Array[Nothing]
> [error]       println(res.mkString(""))
> [error]                   ^
> [error] /SimpleApp.scala:427: value _1 is not a member of Nothing
> [error]       var final= res(0)._1
> [error]                             ^
> [error] /home/madhura/DTWspark/src/main/scala/SimpleApp.scala:428: value _2
> is not a member of Nothing
> [error]       var final1 = res(0)._2 - m +1
> [error]                                  ^
> --
> View this message in context: 
> Sent from the Apache Spark User List mailing list archive at
