quot;.join(map(str,iterator))
>
> Now, you can use RDD operation to run this function on each partition:
>
> >>> r1 = r.mapPartitions(f)
>
> Now, you would have local missing values. You can now write them out to a
> file.
>
> On Mon, Sep 19, 2016 at 4:39 PM, Sudhi
that is correct
On Mon, Sep 19, 2016 at 12:09 PM, ayan guha <guha.a...@gmail.com> wrote:
> Ok, so if you see
>
> 1,3,4,6.
>
> Will you say 2,5 are missing?
>
> On Mon, Sep 19, 2016 at 4:15 PM, Sudhindra Magadi <smag...@gmail.com>
> wrote:
>
>> Eac
Each of the records will be having a sequence id .No duplicates
On Mon, Sep 19, 2016 at 11:42 AM, ayan guha <guha.a...@gmail.com> wrote:
> And how do you define missing sequence? Can you give an example?
>
> On Mon, Sep 19, 2016 at 3:48 PM, Sudhindra Magadi <smag...@gmail.com
Hi Jorn ,
We have a file with billion records.We want to find if there any missing
sequences here .If so what are they ?
Thanks
Sudhindra
On Mon, Sep 19, 2016 at 11:12 AM, Jörn Franke wrote:
> I am not sure what you try to achieve here. Can you please tell us what
> the