We have similar needs but, IIRC, I came to the conclusion that a tail() or
drop(n) would only work on ordered RDDs, and even then you would still have
to figure out which partition is the first one. I ended up deciding it would
be best to just drop the header lines from a Scala iterator before creating
an RDD based on it. Not sure if this was the "right" thing to do, but would
that work for you?
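
Roughly, the iterator approach looks like the sketch below. It assumes the
lines can be read on the driver before the RDD is built; the dropHeader
helper and the sample data are made up for illustration, and the Spark call
is only mentioned in a comment so the snippet runs without a cluster.

```scala
object HeaderDrop {
  // Hypothetical helper: skip the first `n` header lines of a line iterator.
  // Iterator.drop is lazy, so nothing is consumed until the result is used.
  def dropHeader(lines: Iterator[String], n: Int): Iterator[String] =
    lines.drop(n)

  def main(args: Array[String]): Unit = {
    // Made-up sample standing in for a file with two header lines.
    val raw = Iterator("# exported file", "id,name", "1,alice", "2,bob")

    // Materialize the remaining lines once the headers are gone.
    val body = dropHeader(raw, 2).toList
    body.foreach(println)

    // Assumption: with a SparkContext `sc` in scope, the RDD could then be
    // created from the header-free data, e.g. sc.parallelize(body).
  }
}
```

The catch is that the data passes through the driver, so this only makes
sense when the input is small enough for that.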

Regards,
Ethan


On Mon, Apr 14, 2014 at 10:24 AM, Philip Ogren <philip.og...@oracle.com> wrote:

> Has there been any thought to adding a tail() method to RDD?  It would be
> really handy to skip over the first item in an RDD when it contains header
> information.  Even better would be a drop(int) function that would allow
> you to skip over several lines of header information.  Our attempts to do
> something equivalent with a filter() call seem a bit contorted.  Any
> thoughts?
>
> Thanks,
> Philip
>
