How to call mapPartitions on DataFrame?

unk1102 Wed, 23 Dec 2015 09:44:34 -0800

Hi, I have the following code where I call mapPartitions on an RDD, but I
then need to convert the result back into a DataFrame. Why do I need to
convert the DataFrame into an RDD and back into a DataFrame just to call
mapPartitions? Why can't I call it directly on the DataFrame?


sourceFrame.toJavaRDD().mapPartitions(new FlatMapFunction<Iterator<Row>, Row>() {

   @Override
   public Iterable<Row> call(Iterator<Row> rowIterator) throws Exception {
        // Collect the updated rows for this partition in their own list.
        List<Row> updatedRows = new ArrayList<>();
        while (rowIterator.hasNext()) {
          Row row = rowIterator.next();
          // iterate(...) is my own helper; it returns the new field values for one row.
          List rowAsList = iterate(JavaConversions.seqAsJavaList(row.toSeq()));
          Row updatedRow = RowFactory.create(rowAsList.toArray());
          updatedRows.add(updatedRow);
        }
        return updatedRows;
   }
});


When I look at the method signature on DataFrame, it is:

mapPartitions(scala.Function1<Iterator<Row>,Iterator<R>> f, ClassTag<R> evidence$5)

How do I map the above code onto dataframe.mapPartitions? Please guide me; I
am new to Spark.
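
For reference, the per-partition logic can be sketched without Spark at all. This is only an illustration under assumptions: rows are modelled as plain List<Object> instead of Row, the class name PartitionSketch is made up, and iterate(...) here is a hypothetical stand-in (it upper-cases String fields) for the real helper, which is not shown in this message. The point is the shape of the loop: take one partition's Iterator, transform each row, and collect the results in a separate list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class PartitionSketch {

    // Hypothetical stand-in for the iterate(...) helper from the question:
    // upper-cases every String field and leaves other values untouched.
    public static List<Object> iterate(List<Object> fields) {
        List<Object> out = new ArrayList<>(fields.size());
        for (Object field : fields) {
            out.add(field instanceof String ? ((String) field).toUpperCase() : field);
        }
        return out;
    }

    // Same shape as call(Iterator<Row>) returning Iterable<Row>:
    // consume one partition's iterator and return the rows to emit.
    public static Iterable<List<Object>> processPartition(Iterator<List<Object>> rows) {
        // Accumulate into a list that is never reassigned inside the loop.
        List<List<Object>> updatedRows = new ArrayList<>();
        while (rows.hasNext()) {
            updatedRows.add(iterate(rows.next()));
        }
        return updatedRows;
    }
}
```

Calling PartitionSketch.processPartition(partition.iterator()) on a list of two rows yields two transformed rows in the same order; with Spark, the same loop body would sit inside the anonymous function passed to mapPartitions.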



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-call-mapPartitions-on-DataFrame-tp25791.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.