No, but for a dataset of tens of billions of elements, you cannot fit all the elements in memory on a single host. So at some point, the solution I currently use (not the one I sent), dataset.collect().zipWithIndex, just won't scale. Did you try the code I sent? I think the sortBy is probably in the wrong direction, so change it to sort by -i instead of i.

Guillaume
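
To illustrate why collect() is the bottleneck, here is a minimal sketch of one way to number elements without gathering them on a single host: count each partition, turn the counts into starting offsets, then let each partition number its own elements. This is a plain-Python model, not Spark code and not the code from the earlier message. Partitions are modeled as ordinary lists, and zip_with_index is a hypothetical name. In Spark, the two passes would presumably be per-partition operations such as mapPartitionsWithIndex, with only the per-partition counts sent to the driver.

```python
from itertools import accumulate


def zip_with_index(partitions):
    """Pair each element with a global index (hypothetical helper).

    `partitions` is a list of lists standing in for the partitions of a
    distributed dataset. Only the per-partition counts are gathered
    centrally, never the elements themselves.
    """
    # Pass 1: count the elements of each partition.
    counts = [len(part) for part in partitions]

    # The starting offset of a partition is the total size of all the
    # partitions that come before it.
    offsets = [0] + list(accumulate(counts))[:-1]

    # Pass 2: each partition numbers its own elements from its offset.
    return [
        [(offset + i, line) for i, line in enumerate(part)]
        for offset, part in zip(offsets, partitions)
    ]


# Four partitions of uneven size, including an empty one.
partitions = [["a", "b"], [], ["c"], ["d", "e", "f"]]
indexed = zip_with_index(partitions)
```

The data centralized here grows with the number of partitions rather than the number of elements, which is what lets the idea scale where dataset.collect().zipWithIndex does not.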

