Just some hints: repartition into 10 partitions? Get the RDD from the DataFrame?

What about a foreach over the rows, sending a batch every 100? (I just did that, actually.)
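
A rough sketch of those hints put together, not a tested production setup. It assumes a local SparkSession and a dummy DataFrame of 1000 ids; `BatchPost`, `postBatch` and `sendInBatches` are hypothetical names made up for illustration, and `postBatch` only prints where a real HTTP client call would go. Repartitioning does not guarantee exactly 100 rows per partition, so the rows are also grouped inside each partition, which gives batches of at most 100.

```scala
import org.apache.spark.sql.{Row, SparkSession}

object BatchPost {
  // Hypothetical stand-in for the REST call; swap in a real HTTP client here.
  def postBatch(batch: Seq[Row]): Unit =
    println(s"posting ${batch.size} records")

  // Walk an iterator in chunks of at most batchSize and hand each chunk to `send`.
  def sendInBatches[T](rows: Iterator[T], batchSize: Int)(send: Seq[T] => Unit): Unit =
    rows.grouped(batchSize).foreach(batch => send(batch))

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, only for trying the idea out.
    val spark = SparkSession.builder()
      .appName("batch-post")
      .master("local[*]")
      .getOrCreate()

    // Dummy data standing in for the 1000-record DataFrame.
    val df = spark.range(1000).toDF("id")

    // foreachPartition runs on the executors, so nothing is collected to the driver.
    df.repartition(10).rdd.foreachPartition { partition =>
      sendInBatches(partition, 100)(postBatch)
    }

    spark.stop()
  }
}
```

Going through `df.rdd` keeps the call identical to the RDD version quoted below; the last batch of a partition may be smaller than 100.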

jg


> On Oct 26, 2017, at 13:37, Noorul Islam Kamal Malmiyoda <noo...@noorul.com> 
> wrote:
> 
> Hi all,
> 
> I have a DataFrame with 1000 records. I want to split them into batches
> of 100 and post each batch to a REST API.
> 
> If it were an RDD, I could use something like this:
> 
>    myRDD.foreachRDD {
>      rdd =>
>        rdd.foreachPartition {
>          partition => {
>            // post the records of this partition to the REST API
>          }
>        }
>    }
> 
> This ensures that the code is executed on the executors and not on the driver.
> 
> Is there a similar approach that we can take for DataFrames? I see
> examples on Stack Overflow that use collect(), which brings the whole
> dataset to the driver.
> 
> Thanks and Regards
> Noorul
> 
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> 

