Re: Does dataframe spark API write/create a single file instead of directory as a result of write operation.

2020-02-22 Thread Nicolas PARIS
> Is there any way to save it as raw_csv file as we do in pandas? I have a
I did write such a function for Scala. Please take a look at https://github.com/EDS-APHP/spark-etl/blob/master/spark-csv/src/main/scala/CSVTool.scala and see writeCsvToLocal: it first writes the CSV to HDFS, and then fetches

Re: Does dataframe spark API write/create a single file instead of directory as a result of write operation.

2020-02-22 Thread JARDIN Yohann
How costly is it for you to move files after generating them with Spark? File systems tend to just update some links under the hood. *Yohann Jardin* On 2/22/2020 at 11:47 AM, Kshitij wrote: That's the alternative, of course. But that is costly when we are dealing with a bunch of files.

Re: Does dataframe spark API write/create a single file instead of directory as a result of write operation.

2020-02-22 Thread Kshitij
That's the alternative, of course. But that is costly when we are dealing with a bunch of files. Thanks. On Sat, Feb 22, 2020, 4:15 PM Sebastian Piu wrote: > I'm not aware of a way to specify the file name on the writer. > Since you'd need to bring all the data into a single node and write from >

Re: Does dataframe spark API write/create a single file instead of directory as a result of write operation.

2020-02-22 Thread Sebastian Piu
I'm not aware of a way to specify the file name on the writer. Since you'd need to bring all the data into a single node and write from there to get a single file out, you could simply move/rename the file that Spark creates, or write the CSV yourself with your library of preference? On Sat, 22 Feb
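
The move/rename approach suggested above could look roughly like the following. This is only a minimal sketch under stated assumptions: the DataFrame was already written as a single partition (so the output directory holds exactly one part-* file), the directory is reachable through the local file system rather than HDFS, and the helper name promote_single_part_file is hypothetical, not part of any Spark API.

```python
import glob
import os
import shutil


def promote_single_part_file(spark_output_dir, target_path):
    """Move the only part-* file out of a Spark output directory.

    Hypothetical helper: assumes the job wrote a single partition to a
    path visible on the local file system.
    """
    # Hidden checksum files (.part-*.crc) and the _SUCCESS marker are not
    # matched by this pattern, so only real data files are counted.
    part_files = sorted(glob.glob(os.path.join(spark_output_dir, "part-*")))
    if len(part_files) != 1:
        raise ValueError(
            f"expected exactly one part file in {spark_output_dir}, "
            f"found {len(part_files)}"
        )
    # On the same file system this is a rename, not a copy of the data.
    shutil.move(part_files[0], target_path)
    # Remove what is left behind (_SUCCESS marker, checksum files).
    shutil.rmtree(spark_output_dir)
    return target_path
```

When source and target sit on the same file system, the move is a metadata-only rename, which is the point made elsewhere in the thread about file systems just updating links. For output that lives on HDFS, the equivalent rename would have to go through the Hadoop file system API instead.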

Re: Does dataframe spark API write/create a single file instead of directory as a result of write operation.

2020-02-22 Thread Kshitij
Is there any way to save it as a raw_csv file, as we do in pandas? I have a script that uses the CSV file for further processing. On Sat, 22 Feb 2020 at 14:31, rahul c wrote: > Hi Kshitij, > > There is an option to suppress the metadata files from being created. > Set the properties below and try. > >