Just some hints: Repartition into 10 partitions? Get the RDD from the DataFrame? What about a foreach over the rows that sends every 100? (I actually just did that.)
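Roughly, those hints could combine into something like the sketch below. It is only an illustration under assumptions: the endpoint URL, the `BatchPoster` object, and the `sendInBatches` and `postJson` helpers are made-up names, and the rows are serialized with `toJSON` purely for the example. Note that `repartition(10)` spreads rows roughly evenly but does not guarantee exactly 100 per partition, so the grouping happens inside each partition and a trailing batch may be smaller than 100.

```scala
import java.net.{HttpURLConnection, URL}
import java.nio.charset.StandardCharsets
import org.apache.spark.sql.SparkSession

object BatchPoster {
  // Hypothetical endpoint (assumption); replace with the real REST API URL.
  val Endpoint = "http://localhost:8080/records"

  // Splits one partition's rows into groups of at most batchSize and
  // passes each group to `post`. Returns the number of batches sent.
  def sendInBatches[T](rows: Iterator[T], batchSize: Int)(post: Seq[T] => Unit): Int = {
    var batches = 0
    rows.grouped(batchSize).foreach { batch =>
      post(batch)
      batches += 1
    }
    batches
  }

  // Hypothetical helper: POSTs a JSON body and returns the HTTP status code.
  def postJson(body: String): Int = {
    val conn = new URL(Endpoint).openConnection().asInstanceOf[HttpURLConnection]
    try {
      conn.setRequestMethod("POST")
      conn.setDoOutput(true)
      conn.setRequestProperty("Content-Type", "application/json")
      val out = conn.getOutputStream
      try out.write(body.getBytes(StandardCharsets.UTF_8)) finally out.close()
      conn.getResponseCode
    } finally conn.disconnect()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("batch-post").getOrCreate()
    import spark.implicits._

    // Stand-in for the real DataFrame with 1000 records.
    val df = (1 to 1000).toDF("id")

    // foreachPartition runs on the executors; nothing is collected to the driver.
    df.repartition(10).toJSON.rdd.foreachPartition { rows =>
      sendInBatches(rows, 100) { batch =>
        postJson(batch.mkString("[", ",", "]"))
      }
      () // foreachPartition expects a Unit result
    }

    spark.stop()
  }
}
```

Going through `.rdd` keeps the call on the single `RDD.foreachPartition` signature, which is the same shape as the RDD code in the question. Error handling and retries for failed posts are left out here.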
jg

> On Oct 26, 2017, at 13:37, Noorul Islam Kamal Malmiyoda <noo...@noorul.com> wrote:
>
> Hi all,
>
> I have a Dataframe with 1000 records. I want to split them into 100
> each and post to rest API.
>
> If it was RDD, I could use something like this
>
> myRDD.foreachRDD {
>   rdd =>
>     rdd.foreachPartition {
>       partition => {
>
> This will ensure that code is executed on executors and not on driver.
>
> Is there any similar approach that we can take for Dataframes? I see
> examples on stackoverflow with collect() which will bring whole data
> to driver.
>
> Thanks and Regards
> Noorul
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org