We have similar needs, but IIRC I came to the conclusion that a tail() or drop() method would only work on ordered RDDs, and even then you would still have to figure out which partition is the first one. I ended up deciding it would be best to just drop the header lines from a Scala iterator before creating an RDD from it. Not sure if this was the "right" thing to do, but would that work for you?
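For what it's worth, here is a rough sketch of the drop-before-creating-the-RDD idea. It assumes the input is small enough to be read on the driver as a plain Scala iterator; `DropHeaderExample`, `dropHeader`, and the sample lines are made-up names for illustration, and the Spark call is left as a comment because it needs a SparkContext `sc` that is not set up here.

```scala
object DropHeaderExample {
  // Hypothetical helper: skip the first `n` header lines of a line iterator.
  // Iterator.drop is lazy, so nothing is read until the result is consumed.
  def dropHeader(lines: Iterator[String], n: Int): Iterator[String] =
    lines.drop(n)

  def main(args: Array[String]): Unit = {
    // Stand-in for lines read from a local file, e.g. via
    // scala.io.Source.fromFile(path).getLines().
    val raw = Iterator("id,name", "1,alice", "2,bob")

    // Materialize the remaining records so they can be handed to Spark.
    val records = dropHeader(raw, 1).toList
    records.foreach(println)

    // With a SparkContext `sc` in scope (assumed, not created here):
    // val rdd = sc.parallelize(records)
  }
}
```

Since the header is removed before any partitioning happens, the question of which partition comes first never comes up, at the cost of reading the data on the driver.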
Regards,
Ethan

On Mon, Apr 14, 2014 at 10:24 AM, Philip Ogren <philip.og...@oracle.com> wrote:

> Has there been any thought to adding a tail() method to RDD? It would be
> really handy to skip over the first item in an RDD when it contains header
> information. Even better would be a drop(int) function that would allow
> you to skip over several lines of header information. Our attempts to do
> something equivalent with a filter() call seem a bit contorted. Any
> thoughts?
>
> Thanks,
> Philip