Perhaps you could use mapPartitionsWithIndex to do this.
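
A rough sketch of what I mean, in Python. It assumes a single input file whose header is the first line of partition 0; with several input files each one carries its own header and this would not be enough. The names drop_header, sc, and data.csv are made up for illustration.

```python
from itertools import islice


def drop_header(partition_index, rows):
    """Skip the first row of partition 0; leave all other partitions untouched."""
    if partition_index == 0:
        # Only the first partition is assumed to start with the header line.
        return islice(rows, 1, None)
    return rows


# Assumed usage with PySpark, where `sc` is an existing SparkContext and
# "data.csv" is a hypothetical path:
#   lines = sc.textFile("data.csv")
#   data = lines.mapPartitionsWithIndex(drop_header)
```

Unlike filter, this does not run a predicate on every row: partitions other than the first are handed back as-is, and the first one only has its leading element dropped.
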

On Tue, Sep 24, 2013 at 4:52 PM, Michael Kun Yang <[email protected]> wrote:

> Spark's filter can do this job, but it needs to scan every line (row). Is
> there a way to just skip the first line in the file?
>
> Any feedback?
>
>
> On Tue, Sep 24, 2013 at 4:14 PM, Michael Kun Yang <[email protected]> wrote:
>
>> Dataframes usually have headers in the first row. How can I avoid reading
>> the first row?
>> I know in Hadoop I can figure it out by the line number.
>>
>> Best
