I'm modifying a CSV file that is stored in HDFS and then writing it back to HDFS from Spark.

    import org.apache.hadoop.fs.{FileSystem, Path}

    val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
    csv_file.coalesce(1).write
      .format("csv")
      .mode("overwrite")
      .save("hdfs://localhost:8020/data/temp_insight")
    Thread.sleep(15000)
    println(fs.exists(new Path("/data/temp_insight")))

Output:

    false

While the thread was sleeping for 15 seconds, I checked HDFS with the command:

    hdfs dfs -ls /data/temp_insight

Output:

    18/06/08 17:48:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    -rw-r--r-- 3 abhijeet supergroup 0 2018-06-08 17:48 /data/temp_insight/_SUCCESS
    -rw-r--r-- 3 abhijeet supergroup 201 2018-06-08 17:48 /data/temp_insight/part-00000-7bffb826-f18d-4022-b089-da85565525b7-c000.csv

So the directory and the part file are there, yet fs.exists returns false for the same path.

To cross-check whether the code is actually looking at HDFS, I added one more println statement with a path that already exists in HDFS. It prints true in that case.

So, what could be the reason?

Thanks,
Abhijeet Kumar