Hi, maybe the drop function is helpful for you (this is probably more than you need, but it is still an interesting read): http://erikerlandson.github.io/blog/2014/07/27/some-implications-of-supporting-the-scala-drop-method-for-spark-rdds/
Joerg

On Tue, Dec 23, 2014 at 5:45 PM, Hao Ren <inv...@gmail.com> wrote:
> Hi,
>
> I guess you would like to remove the header of a CSV file.
>
> You can play with partitions. =)
>
> // src is your RDD
> val noHeader = src.mapPartitionsWithIndex(
>   (i, iterator) =>
>     if (i == 0 && iterator.hasNext) {
>       iterator.next
>       iterator
>     } else iterator)
>
> Thus, you don't need to filter on the whole RDD. Good luck.
>
> Hao
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/removing-first-record-from-RDD-String-tp20834p20836.html
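
For what it's worth, the per-partition logic from the quoted snippet can be tried out without a cluster. The following is only a minimal sketch in plain Scala, not Spark code: the object name DropHeaderSketch, the helper dropHeader, and the sample records are all made up for illustration, and the "partitions" are just ordinary sequences standing in for RDD partitions. It uses Iterator.drop(1) in place of the explicit hasNext/next pair, since drop is a no-op on an empty iterator.

```scala
object DropHeaderSketch {
  // Same idea as the quoted closure: skip one record, but only in partition 0.
  // drop(1) is safe on an empty iterator, so no hasNext check is needed.
  def dropHeader(index: Int, records: Iterator[String]): Iterator[String] =
    if (index == 0) records.drop(1) else records

  def main(args: Array[String]): Unit = {
    // Hypothetical data: three stand-in "partitions", header in the first one.
    val partitions = Seq(
      Seq("id,name", "1,a"),
      Seq("2,b", "3,c"),
      Seq("4,d")
    )

    // Apply the function per partition, passing the partition index along,
    // which mimics what mapPartitionsWithIndex does on an RDD.
    val noHeader = partitions.zipWithIndex.flatMap {
      case (part, i) => dropHeader(i, part.iterator).toList
    }

    noHeader.foreach(println)
  }
}
```

On a real RDD[String] the same function should be usable as src.mapPartitionsWithIndex(dropHeader). Note that this relies on the header being the first record of partition 0, which as far as I know holds for a single file read with textFile. With several input files, each file's header would end up in a different partition and would not be removed this way.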