List(x.next).iterator is giving you the first element from each partition,
which would be 1, 4 and 7 respectively.
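A plain-Scala sketch of what is happening (assuming, as in the thread, an RDD of the numbers 1 to 9 split into three partitions; the nested lists below are a stand-in for those partitions):

```scala
// Hypothetical stand-in for an RDD of 1 to 9 in three partitions.
val partitions = List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 9))

// mapPartitions hands the function one iterator per partition;
// List(x.next()).iterator keeps only the first element of each.
val firstPerPartition = partitions.map { p =>
  val x = p.iterator
  List(x.next()).iterator
}.flatMap(_.toList)
// firstPerPartition == List(1, 4, 7)
```

So the function runs once per partition, not once per element, which is why only 1, 4 and 7 come back.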
On 3/18/15, 10:19 AM, "ashish.usoni" wrote:
>I am trying to understand mapPartitions, but I am still not sure how it
>works.
>
>In the example below it creates three partitions.
>
What's the best way to go from:
RDD[(A, B)] to (RDD[A], RDD[B])
If I do:
def separate[A, B](k: RDD[(A, B)]) = (k.map(_._1), k.map(_._2))
which is the obvious solution, but it runs two separate maps over the data in
the cluster. Can I do some kind of a fold instead:
def separate[A, B](l: List[(A, B)]) =
  l.foldLeft((List.empty[A], List.empty[B])) { case ((as, bs), (a, b)) => (as :+ a, bs :+ b) }
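For what it's worth, the single-pass fold works in plain Scala as sketched below (foldRight keeps the original element order; note the standard library's unzip already does exactly this for ordinary collections):

```scala
// Single pass over the list, accumulating both halves of each pair.
def separate[A, B](l: List[(A, B)]): (List[A], List[B]) =
  l.foldRight((List.empty[A], List.empty[B])) { case ((a, b), (as, bs)) =>
    (a :: as, b :: bs)
  }

val (nums, letters) = separate(List((1, "a"), (2, "b"), (3, "c")))
// nums == List(1, 2, 3), letters == List("a", "b", "c")

// Equivalent one-liner from the standard library:
val (nums2, letters2) = List((1, "a"), (2, "b"), (3, "c")).unzip
```

For an RDD, though, there is no such single-pass split into two RDDs; the two-map version (or caching the input before mapping twice) is the usual approach.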
So the settings page,
http://spark.apache.org/docs/1.2.1/configuration.html, seems not to apply
when running local contexts. I have a shell script that starts my job:
export SPARK_MASTER_OPTS="-Dsun.io.serialization.extendedDebugInfo=true"
export SPARK_WORKER_OPTS="-Dsun.io.serialization.extendedDebugInfo=true"
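One likely explanation: SPARK_MASTER_OPTS and SPARK_WORKER_OPTS configure the standalone cluster daemons, so they have no effect when the master is local[*], where everything runs inside the driver JVM. A sketch of passing the flag to the driver instead (my-job.jar is a placeholder for your application jar):

```shell
# With a local master the whole job runs in the driver JVM,
# so the system property has to reach the driver itself:
spark-submit \
  --master "local[*]" \
  --driver-java-options "-Dsun.io.serialization.extendedDebugInfo=true" \
  my-job.jar
```

The equivalent configuration-file setting is spark.driver.extraJavaOptions.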