best way to generate per key auto increment numerals after sorting

2015-10-19 Thread fahad shah
Hi,

I wanted to ask what's the best way to achieve per-key auto-increment numerals after sorting, e.g.:

raw file:

1,a,b,c,1,1
1,a,b,d,0,0
1,a,b,e,1,0
2,a,e,c,0,0
2,a,f,d,1,0

post-output (the last column is the position number after grouping on first three fields and reverse sorting on last

Re: best way to generate per key auto increment numerals after sorting

2015-10-19 Thread fahad shah
Thanks Davies, groupByKey was throwing the error: "unpack requires a string argument of length 4". Interestingly, I replaced it with sortByKey (which, as I read, also shuffles so that data for the same key ends up on the same partition) and it ran fine. Wondering if this is a bug in groupByKey for Spark 1.3?
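
For readers following the thread, the per-key numbering being asked about can be sketched in plain Python, without Spark. This is only an illustration, not the code from the thread: the original message is cut off after "reverse sorting on last", so sorting in descending order on the last two fields is an assumption, and the helper name `number_within_keys` is made up for this example.

```python
from itertools import groupby

# Sample rows copied from the original question.
raw = """1,a,b,c,1,1
1,a,b,d,0,0
1,a,b,e,1,0
2,a,e,c,0,0
2,a,f,d,1,0"""


def number_within_keys(lines):
    """Append a 1-based position to each row, restarting for every key.

    Key = first three fields (as stated in the question).
    Sort order within a key = last two fields, descending (ASSUMPTION:
    the original message is truncated at this point).
    """
    rows = [line.split(",") for line in lines]

    def key(row):
        return tuple(row[:3])

    # Python's sort is stable, so sort by the secondary criterion first,
    # then by the grouping key.
    rows.sort(key=lambda r: (int(r[4]), int(r[5])), reverse=True)
    rows.sort(key=key)

    numbered = []
    for _, group in groupby(rows, key=key):
        # enumerate restarts at 1 for each group, giving the per-key counter.
        for position, row in enumerate(group, start=1):
            numbered.append(",".join(row + [str(position)]))
    return numbered


numbered = number_within_keys(raw.splitlines())
for line in numbered:
    print(line)
```

In PySpark the same sort-then-enumerate step could presumably be applied to each key's values after a `groupByKey()`, but whether that avoids the error reported above is not something this sketch addresses.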

Re: best way to generate per key auto increment numerals after sorting

2015-10-19 Thread Davies Liu
What's the issue with groupByKey()?

On Mon, Oct 19, 2015 at 1:11 AM, fahad shah wrote:
> Hi
>
> I wanted to ask whats the best way to achieve per key auto increment
> numerals after sorting, for eg. :
>
> raw file:
>
> 1,a,b,c,1,1
> 1,a,b,d,0,0
> 1,a,b,e,1,0
> 2,a,e,c,0,0
>