You shouldn't even need the index. Just:
    data.mapPartitions(_.drop(1))

should work, I think.


On Wed, Sep 25, 2013 at 1:52 AM, Michael Kun Yang <[email protected]> wrote:

> Thank you! But can you explain in more detail? I only want to skip the
> first line, not the whole block.
>
>
> On Tue, Sep 24, 2013 at 8:54 PM, Jason Lenderman <[email protected]> wrote:
>
>> Perhaps you could use mapPartitionsWithIndex to do this.
>>
>>
>> On Tue, Sep 24, 2013 at 4:52 PM, Michael Kun Yang
>> <[email protected]> wrote:
>>
>>> Spark's filter can do this job, but it needs to scan every line (row). Is
>>> there a way to just skip the first line in the file?
>>>
>>> Any feedback?
>>>
>>>
>>> On Tue, Sep 24, 2013 at 4:14 PM, Michael Kun Yang
>>> <[email protected]> wrote:
>>>
>>>> Dataframes usually have headers in the first row. How can I avoid
>>>> reading the first row?
>>>> I know in Hadoop, I can figure it out by the line number.
>>>>
>>>> Best

--
Nathan Kronenfeld
Senior Visualization Developer
Oculus Info Inc
2 Berkeley Street, Suite 600,
Toronto, Ontario M5A 4J5
Phone: +1-416-203-3003 x 238
Email: [email protected]
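
One caveat on the `data.mapPartitions(_.drop(1))` suggestion: the function passed to `mapPartitions` runs once per partition, so it drops the first row of every partition, not only the header. The `mapPartitionsWithIndex` idea from earlier in the thread avoids that by dropping a row only in partition 0. Below is a minimal sketch of the difference using plain Scala collections as a stand-in for an RDD, so it runs without Spark. The names `SkipHeader`, `dropHeader`, and the sample rows are hypothetical, and it assumes the header sits at the start of partition 0, which is what you would expect when reading a single text file but not necessarily when reading several files that each carry their own header.

```scala
object SkipHeader {
  // Same shape as the function RDD.mapPartitionsWithIndex expects:
  // (partition index, partition iterator) => iterator.
  // Drops one element from partition 0 only; other partitions pass through.
  def dropHeader[T](index: Int, rows: Iterator[T]): Iterator[T] =
    if (index == 0) rows.drop(1) else rows

  // Stand-in for an RDD: each inner Seq plays the role of one partition.
  // (Hypothetical sample data; the first row of the first partition is the header.)
  val partitions: Seq[Seq[String]] = Seq(
    Seq("name,age", "alice,30"),
    Seq("bob,25", "carol,41")
  )

  // Mimics data.mapPartitions(_.drop(1)): the first row of EVERY partition is lost.
  val droppedEverywhere: Seq[String] =
    partitions.flatMap(p => p.iterator.drop(1))

  // Mimics data.mapPartitionsWithIndex(...): only the header row is lost.
  val droppedHeaderOnly: Seq[String] =
    partitions.zipWithIndex.flatMap { case (p, i) => dropHeader(i, p.iterator) }

  def main(args: Array[String]): Unit = {
    println(droppedEverywhere)
    println(droppedHeaderOnly)
  }
}
```

With a real RDD the equivalent call would be along the lines of `data.mapPartitionsWithIndex((i, rows) => SkipHeader.dropHeader(i, rows))`. Either way only one extra element is skipped per partition iterator, so there is no full scan of the kind `filter` would need.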
