Re: removing first record from RDD[String]

Jörg Schad Tue, 23 Dec 2014 08:55:04 -0800

Hi,
Maybe the drop function would be helpful for you. The post below probably
covers more than you need, but it is still an interesting read:
http://erikerlandson.github.io/blog/2014/07/27/some-implications-of-supporting-the-scala-drop-method-for-spark-rdds/
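
For what it's worth, here is a minimal sketch of the same idea as the snippet
quoted below, written with the standard Iterator.drop. It assumes the header is
the first line of partition 0, which should hold for a single file read with
sc.textFile but is worth checking for your input. The name dropHeader and the
sample rows are made up for illustration, and plain iterators stand in for RDD
partitions so the logic can be checked without a cluster.

```scala
// Hypothetical helper (not from the thread): skip the first element of
// partition 0 only; every other partition is passed through untouched.
def dropHeader(index: Int, rows: Iterator[String]): Iterator[String] =
  if (index == 0) rows.drop(1) else rows // drop(1) is safe on an empty iterator

// Made-up sample data: two "partitions", the first one starting with a header.
val partitions = Seq(Seq("name,age", "ann,31"), Seq("bob,27"))

// Apply the helper to each (partition, index) pair and collect the results.
val noHeader = partitions.zipWithIndex.flatMap {
  case (rows, i) => dropHeader(i, rows.iterator).toList
}
println(noHeader)

// With a real RDD[String] named src, the equivalent call would be:
//   val noHeader = src.mapPartitionsWithIndex((i, rows) => dropHeader(i, rows))
```

Only the first partition is touched, so there is no filter pass over the whole
RDD.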


Joerg

On Tue, Dec 23, 2014 at 5:45 PM, Hao Ren <inv...@gmail.com> wrote:

> Hi,
>
> I guess you would like to remove the header of a CSV file.
>
> You can play with partitions. =)
>
> // src is your RDD[String]; this assumes the header is the first line of partition 0
> val noHeader = src.mapPartitionsWithIndex { (i, iterator) =>
>   if (i == 0 && iterator.hasNext) {
>     iterator.next() // skip the header line
>     iterator
>   } else iterator
> }
>
> Thus, you don't need to filter on the whole RDD. Good luck.
>
> Hao
