Hi, I have the following code, where I call mapPartitions on an RDD and then need to convert the result into a DataFrame. Why do I need to convert the DataFrame into an RDD and back into a DataFrame just to call mapPartitions? Why can't I call it directly on the DataFrame?
    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;
    import org.apache.spark.api.java.function.FlatMapFunction;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.RowFactory;
    import scala.collection.JavaConversions;

    sourceFrame.toJavaRDD().mapPartitions(new FlatMapFunction<Iterator<Row>, Row>() {
        @Override
        public Iterable<Row> call(Iterator<Row> rowIterator) throws Exception {
            // Collect the rebuilt rows separately from each row's values.
            List<Row> updatedRows = new ArrayList<>();
            while (rowIterator.hasNext()) {
                Row row = rowIterator.next();
                // iterate(...) is my own helper that transforms the row's values.
                List<Object> rowAsList = iterate(JavaConversions.seqAsJavaList(row.toSeq()));
                updatedRows.add(RowFactory.create(rowAsList.toArray()));
            }
            return updatedRows;
        }
    });

When I look at the method signature on DataFrame, it is:

    mapPartitions(scala.Function1<Iterator<Row>,Iterator<R>> f, ClassTag<R> evidence$5)

How do I map the above code onto dataframe.mapPartitions? Please guide me, I am new to Spark.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-call-mapPartitions-on-DataFrame-tp25791.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
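
To make the intended per-partition logic easier to discuss, here is a minimal sketch of it in plain Java, without Spark on the classpath. It is only an illustration under assumptions: rows are modeled as List<Object> instead of Row, the class name PartitionSketch and the method processPartition are hypothetical, and iterate(...) is a stand-in, since the real helper is not shown in this post. As far as I can tell, in Spark 1.x DataFrame.mapPartitions expects a Scala function plus a ClassTag and returns an RDD rather than a DataFrame, so from Java the javaRDD() route followed by something like sqlContext.createDataFrame(rdd, sourceFrame.schema()) seems the more practical path (sqlContext being whatever SQLContext instance is in scope, and the schema only being reusable if iterate keeps the same columns).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class PartitionSketch {

    // Hypothetical stand-in for the post's iterate(...) helper, whose real
    // behaviour is not shown; here it upper-cases String values only.
    static List<Object> iterate(List<Object> values) {
        List<Object> out = new ArrayList<>(values.size());
        for (Object v : values) {
            out.add(v instanceof String ? ((String) v).toUpperCase() : v);
        }
        return out;
    }

    // Per-partition logic: consume the row iterator once and collect the
    // rebuilt rows in a result list that is separate from each row's values.
    static List<List<Object>> processPartition(Iterator<List<Object>> rows) {
        List<List<Object>> updatedRows = new ArrayList<>();
        while (rows.hasNext()) {
            updatedRows.add(iterate(rows.next()));
        }
        return updatedRows;
    }

    public static void main(String[] args) {
        // Two "rows" standing in for one partition of the DataFrame.
        List<List<Object>> partition = Arrays.asList(
                Arrays.<Object>asList("a", 1),
                Arrays.<Object>asList("b", 2));
        System.out.println(processPartition(partition.iterator()));
    }
}
```

The body of processPartition is what would sit inside FlatMapFunction.call, with RowFactory.create(...) wrapping each transformed list. The key point is that the result list and the per-row value list are two different collections.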