What is the output format (the value you pass to .format(format))?
________________________________
From: KhajaAsmath Mohammed <mdkhajaasm...@gmail.com>
Sent: Thursday, December 15, 2016 7:54:27 PM
To: user @spark
Subject: Spark Dataframe: Save to hdfs is taking long time

Hi,

I am facing an issue while saving the dataframe back to HDFS. It's taking a long time to run.

val results_dataframe = sqlContext.sql("select gt.*,ct.* from PredictTempTable pt,ClusterTempTable ct,GamificationTempTable gt where gt.vin=pt.vin and pt.cluster=ct.cluster")
results_dataframe.coalesce(numPartitions)
results_dataframe.persist(StorageLevel.MEMORY_AND_DISK)

dataFrame.write.mode(saveMode).format(format)
  .option(Codec, compressCodec) //"org.apache.hadoop.io.compress.snappyCodec"
  .save(outputPath)

It was taking a long time, and the total number of records for this dataframe is 4903764.

I even increased the number of partitions from 10 to 20, still no luck. Can anyone help me in resolving this performance issue?

Thanks,
Asmath
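
One thing worth checking in the snippet above: coalesce returns a new DataFrame and leaves the one it is called on unchanged, so calling it without keeping the result has no effect on the later write. A minimal sketch of capturing the result is below. The object SaveResults, the method save, and the parameter codecKey (standing in for the Codec value in the original) are hypothetical names for illustration only, and the sketch assumes the DataFrame being written is the same one that was coalesced. It may or may not be the cause of the slowness, since the three-table join itself could be the expensive part.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

// Hypothetical helper, not part of the original job.
object SaveResults {
  def save(results: DataFrame,
           numPartitions: Int,
           saveMode: SaveMode,
           format: String,
           codecKey: String,       // assumption: whatever key `Codec` held in the original
           compressCodec: String,
           outputPath: String): Unit = {
    // coalesce does not modify `results`; keep the returned DataFrame
    // and write that one, otherwise the partition count is unchanged.
    val coalesced = results.coalesce(numPartitions)

    // No persist here: the DataFrame is consumed by a single write in this sketch.
    coalesced.write
      .mode(saveMode)
      .format(format)
      .option(codecKey, compressCodec)
      .save(outputPath)
  }
}
```

If the same DataFrame is reused by later actions, persisting the coalesced value before the write would be the place to do it; for a single write, caching is unlikely to help.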