Hello,


I’m using Spark 2.1.

Once a job completes, I want to write a Parquet file to, let’s say, the folder 
/user/my_user/final_path/


However, I have other jobs reading files from that specific folder, so those 
files need to be completely written by the time they appear there.

So while a file is being written, I need it to go to a temporary location such 
as /user/my_user/tmp_path/, my application's staging directory, or any other 
path that could serve as temporary. Once fully written, the file can then be 
moved to the real destination folder /user/my_user/final_path/
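
The move step I have in mind can be sketched like this. It is only a minimal 
local illustration: the object name AtomicPublish and the part-file name are 
made up, and local temp directories stand in for my HDFS paths. On HDFS the 
equivalent rename would go through org.apache.hadoop.fs.FileSystem.rename, 
which moves a whole directory in one operation:

```scala
import java.nio.file.{Files, Path, StandardCopyOption}

object AtomicPublish {
  def publish(): Path = {
    // Hypothetical local stand-ins for /user/my_user/tmp_path/ and
    // /user/my_user/final_path/
    val tmpDir   = Files.createTempDirectory("tmp_path")
    val finalDir = Files.createTempDirectory("root").resolve("final_path")

    // 1) Write the file completely while it still lives in the temp dir
    Files.write(tmpDir.resolve("part-00000.parquet"), "parquet bytes".getBytes)

    // 2) A single rename publishes the whole directory, so readers of
    //    finalDir never observe a half-written file
    Files.move(tmpDir, finalDir, StandardCopyOption.ATOMIC_MOVE)
    finalDir
  }

  def main(args: Array[String]): Unit =
    println(Files.exists(publish().resolve("part-00000.parquet")))
}
```

Renaming the directory as a unit, rather than moving files one by one, is the 
point: the destination folder either shows nothing or the complete output.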


So I was wondering: is this the default behavior? If not, did I miss an option 
that enables it? I looked through the documentation and in 
org.apache.spark.sql.execution.datasources.parquet.ParquetOptions.scala, but I 
can't find any information about this.

Otherwise, should I save to a temporary location myself and then move the file 
to the final location?


Any input is greatly appreciated,

Yohann
