Perhaps you could use mapPartitionsWithIndex to do this.
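
Something along these lines might work. It is only a rough sketch, not tested against a real cluster: the function name drop_header, the SparkContext `sc`, and the path "data.csv" are my own placeholders. It assumes a single input file whose header is the first line, so the header lands in partition 0. All other partitions are passed through untouched, so no per-row predicate is evaluated.

```python
from itertools import islice

def drop_header(index, iterator):
    # Only partition 0 starts at the beginning of the file, so only it
    # loses its first record. Assumes one input file with a one-line header.
    if index == 0:
        return islice(iterator, 1, None)
    return iterator

# Assumed usage with an existing SparkContext `sc` and a hypothetical path:
#   rows = sc.textFile("data.csv").mapPartitionsWithIndex(drop_header)

# Local stand-in for two partitions, to show the behaviour without a cluster.
partitions = [["id,name", "1,a"], ["2,b", "3,c"]]
rows = [line
        for i, part in enumerate(partitions)
        for line in drop_header(i, iter(part))]
print(rows)  # ['1,a', '2,b', '3,c']
```

If the input is several files, each with its own header, this would not be enough, since the later headers would sit in other partitions.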

On Tue, Sep 24, 2013 at 4:52 PM, Michael Kun Yang <[email protected]> wrote:

> Spark's filter can do this job, but it needs to scan every line (row). Is
> there a way to just skip the first line in the file?
>
> Any feedback?
>
>
> On Tue, Sep 24, 2013 at 4:14 PM, Michael Kun Yang <[email protected]> wrote:
>
>> Dataframes usually have headers in the first row. How can I avoid reading
>> the first row?
>> I know that in Hadoop I can figure it out by the line number.
>>
>> Best
>>
>
>
