You shouldn't even need the index.

Just:

data.mapPartitions(_.drop(1))

should work, I think.
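
One caveat worth checking: mapPartitions applies the function to every partition, so _.drop(1) removes the first element of each partition, not only the first line of the file. If only the header should go, the mapPartitionsWithIndex route suggested earlier in the thread may be the safer one. Below is a minimal sketch, assuming a single input file whose first line ends up in partition 0 (the usual case with textFile). The names SkipHeader and skipHeader and the sample rows are made up for illustration, and main uses plain collections as a stand-in for an RDD's partitions so it runs without Spark.

```scala
object SkipHeader {
  // Drop the first element only for partition 0; pass every other
  // partition through untouched. Iterator.drop is lazy, so the
  // remaining rows are not scanned just to skip the header.
  def skipHeader(index: Int, lines: Iterator[String]): Iterator[String] =
    if (index == 0) lines.drop(1) else lines

  def main(args: Array[String]): Unit = {
    // Hypothetical data: two "partitions", header in the first one.
    val partitions = Seq(Seq("id,name", "1,a"), Seq("2,b", "3,c"))
    val kept = partitions.zipWithIndex.flatMap {
      case (rows, i) => skipHeader(i, rows.iterator)
    }
    kept.foreach(println)
  }
}
```

With an RDD called data, as above, the same function could be passed as data.mapPartitionsWithIndex(SkipHeader.skipHeader _). If the input is several files that each carry their own header, this sketch would only remove the first one.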


On Wed, Sep 25, 2013 at 1:52 AM, Michael Kun Yang <[email protected]> wrote:

> Thank you! But can you explain in more detail? I only want to skip the
> first line, not the whole block.
>
>
> On Tue, Sep 24, 2013 at 8:54 PM, Jason Lenderman <[email protected]> wrote:
>
>> Perhaps you could use mapPartitionsWithIndex to do this.
>>
>>
>> On Tue, Sep 24, 2013 at 4:52 PM, Michael Kun Yang 
>> <[email protected]> wrote:
>>
>>> Spark's filter can do this job, but it needs to scan every line (row). Is
>>> there a way to just skip the first line in the file?
>>>
>>> Any feedback?
>>>
>>>
>>> On Tue, Sep 24, 2013 at 4:14 PM, Michael Kun Yang 
>>> <[email protected]> wrote:
>>>
>>>> Dataframes usually have a header in the first row. How can I avoid
>>>> reading the first row?
>>>> I know that in Hadoop I can identify it by the line number.
>>>>
>>>> Best
>>>>
>>>
>>>
>>
>


-- 
Nathan Kronenfeld
Senior Visualization Developer
Oculus Info Inc
2 Berkeley Street, Suite 600,
Toronto, Ontario M5A 4J5
Phone:  +1-416-203-3003 x 238
Email:  [email protected]
