I'm using Spark to modify a CSV file stored in HDFS and then write the result 
back to HDFS.
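import org.apache.hadoop.fs.{FileSystem, Path}

val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
// Write the modified data as a single CSV part file, replacing earlier output
csv_file.coalesce(1).write
  .format("csv")
  .mode("overwrite")
  .save("hdfs://localhost:8020/data/temp_insight")
Thread.sleep(15000)
// Check whether the output directory exists
println(fs.exists(new Path("/data/temp_insight")))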
Output:

false
While the thread was sleeping for 15 seconds, I checked HDFS with this command:

hdfs dfs -ls /data/temp_insight
Output:

18/06/08 17:48:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-rw-r--r--   3 abhijeet supergroup          0 2018-06-08 17:48 /data/temp_insight/_SUCCESS
-rw-r--r--   3 abhijeet supergroup        201 2018-06-08 17:48 /data/temp_insight/part-00000-7bffb826-f18d-4022-b089-da85565525b7-c000.csv
To verify that the check really runs against HDFS, I added one more println 
statement to my code with a path that already exists in HDFS. It prints true 
in that case.
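
A minimal sketch of that cross-check follows. It assumes a plain Hadoop 
Configuration in place of spark.sparkContext.hadoopConfiguration, and 
/data/existing_dir is a hypothetical placeholder for the path that was already 
in HDFS. It also prints the filesystem URI that the scheme-less paths resolve 
against, which may help narrow the problem down:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object PathCheck {
  def main(args: Array[String]): Unit = {
    // Assumption: stands in for spark.sparkContext.hadoopConfiguration
    val conf = new Configuration()
    val fs = FileSystem.get(conf)

    // Filesystem that paths without a scheme are resolved against
    println(s"default filesystem: ${fs.getUri}")

    val written = new Path("/data/temp_insight")
    // Hypothetical placeholder for a path known to exist in HDFS
    val existing = new Path("/data/existing_dir")

    // Fully qualified form of the path that is actually being checked
    println(s"checking: ${fs.makeQualified(written)}")
    println(fs.exists(written))
    println(fs.exists(existing))
  }
}
```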

So, what could be the reason?

Thanks,

Abhijeet Kumar
