Re: filling missing values in a sequence

2016-09-18 Thread sudhindra
Hi, I have coded something like this; please tell me how bad it is. package Spark.spark; import java.util.List; import java.util.function.Function; import org.apache.spark.SparkConf; import org.apache.spark.SparkContext; import org.apache.spark.api.java.JavaRDD; import

Re: filling missing values in a sequence

2016-09-18 Thread Sudhindra Magadi
Hi Jörn, we have a file with a billion records. We want to find out whether any sequence numbers are missing and, if so, which ones. Thanks, Sudhindra On Mon, Sep 19, 2016 at 11:12 AM, Jörn Franke <jornfra...@gmail.com> wrote: > I am not sure what you try to achieve here. Can you please tel

Re: filling missing values in a sequence

2016-09-19 Thread Sudhindra Magadi
Each record has a sequence ID. There are no duplicates. On Mon, Sep 19, 2016 at 11:42 AM, ayan guha <guha.a...@gmail.com> wrote: > And how do you define missing sequence? Can you give an example? > > On Mon, Sep 19, 2016 at 3:48 PM, Sudhindra Magadi <smag...@gmail.com

Re: filling missing values in a sequence

2016-09-19 Thread Sudhindra Magadi
That is correct. On Mon, Sep 19, 2016 at 12:09 PM, ayan guha <guha.a...@gmail.com> wrote: > Ok, so if you see > > 1,3,4,6. > > Will you say 2,5 are missing? > > On Mon, Sep 19, 2016 at 4:15 PM, Sudhindra Magadi <smag...@gmail.com> > wrote: > >> Eac

Re: filling missing values in a sequence

2016-09-19 Thread Sudhindra Magadi
".join(map(str,iterator)) > > Now, you can use RDD operation to run this function on each partition: > > >>> r1 = r.mapPartitions(f) > > Now, you would have local missing values. You can now write them out to a > file. > > On Mon, Sep 19, 2016 at 4:39 PM, Sudhi
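
For readers following the thread: the per-partition function is cut off in the snippet above, so here is a rough sketch of what such a function might look like. It is not the code from the original message. The name missing_in_partition is made up for illustration, and the sketch assumes that the sequence IDs are integers, that each partition is already sorted in ascending order, and that there are no duplicates (as stated earlier in the thread).

```python
def missing_in_partition(iterator):
    """Yield the IDs absent between consecutive values of one sorted partition.

    Hypothetical helper: assumes integer IDs, ascending order, no duplicates.
    """
    previous = None
    for current in iterator:
        if previous is not None:
            # Every integer strictly between two neighbours is a gap.
            yield from range(previous + 1, current)
        previous = current


# The example from the thread: 1,3,4,6 should report 2 and 5 as missing.
print(list(missing_in_partition(iter([1, 3, 4, 6]))))
```

A generator like this could presumably be passed to r.mapPartitions in the same way as f above. Note that it only finds gaps inside a partition; a gap that falls between the last ID of one partition and the first ID of the next would need a separate check.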