> Is there any way to save it as a raw CSV file as we do in pandas? I have a
> script that uses the CSV file for further processing.
I wrote such a function for Scala. Please take a look at
https://github.com/EDS-APHP/spark-etl/blob/master/spark-csv/src/main/scala/CSVTool.scala
(see writeCsvToLocal): it first writes the CSV to HDFS and then fetches it
to the local file system.
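The write-then-fetch idea can be sketched outside of Spark with plain file operations. The helper below, `merge_part_files`, is a hypothetical illustration of the concept (concatenating the part files a job leaves in an output directory into one local CSV), not the actual CSVTool implementation, which goes through the Hadoop APIs:

```python
import glob
import os
import shutil

def merge_part_files(spark_output_dir, local_csv, header=None):
    """Concatenate Spark-style part files into a single local CSV.

    Mimics what an HDFS getmerge-style fetch does. `merge_part_files`
    is a hypothetical helper for illustration only.
    """
    # Spark names its output files part-00000, part-00001, ...;
    # sorting keeps the row order of the partitions.
    parts = sorted(glob.glob(os.path.join(spark_output_dir, "part-*")))
    with open(local_csv, "wb") as out:
        if header is not None:
            # Spark can write the header into every part file; when it
            # does not, prepend one here exactly once.
            out.write((header + "\n").encode())
        for part in parts:
            with open(part, "rb") as src:
                # Stream each part into the merged file without
                # loading it fully into memory.
                shutil.copyfileobj(src, out)
```

Against real HDFS the fetch step would use the Hadoop FileSystem API (or `hdfs dfs -getmerge`) instead of local `open` calls, but the merge logic is the same.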
How costly is it for you to move files after generating them with Spark?
File systems tend to just update some links under the hood.
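The "just update some links" point can be checked directly: within one file system, a rename only rewrites directory entries, so the file's inode (and its data blocks) never move. A minimal sketch, assuming a POSIX-style file system where `os.stat().st_ino` exposes the inode number:

```python
import os

def rename_keeps_inode(src, dst):
    """Rename src to dst and report whether the inode survived.

    On the same file system, os.rename is a metadata-only operation:
    no data is copied, so the cost is constant regardless of file size.
    """
    inode_before = os.stat(src).st_ino
    os.rename(src, dst)  # directory-entry update only on one filesystem
    return os.stat(dst).st_ino == inode_before
```

This is why renaming Spark's output is cheap on a local disk or within HDFS; only a move *across* file systems forces an actual copy of the data.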
*Yohann Jardin*
On 2/22/2020 at 11:47 AM, Kshitij wrote:
That's the alternative, of course. But that is costly when we are
dealing with a bunch of files.
Thanks.
On Sat, Feb 22, 2020, 4:15 PM Sebastian Piu wrote:
I'm not aware of a way to specify the file name on the writer.
Since you'd need to bring all the data into a single node and write from
there to get a single file out, you could simply move/rename the file that
Spark creates, or write the CSV yourself with your library of preference?
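The move/rename suggestion boils down to: write with a single partition, then rename the one part file Spark leaves behind. A minimal sketch on a local path (the directory name and helper below are illustrative; against HDFS you would rename through the Hadoop FileSystem API instead), assuming the job used `.coalesce(1)` or `.repartition(1)` so exactly one part file exists:

```python
import glob
import os

def rename_spark_output(output_dir, target_csv):
    """Rename the single part file in output_dir to target_csv.

    Assumes the DataFrame was written with .coalesce(1), so the
    directory holds exactly one part-* file; raises otherwise.
    Illustrative helper, local file system only.
    """
    parts = glob.glob(os.path.join(output_dir, "part-*"))
    if len(parts) != 1:
        raise ValueError(f"expected one part file, found {len(parts)}")
    os.rename(parts[0], target_csv)  # cheap: same-filesystem rename
    return target_csv
```

A typical usage would be `df.coalesce(1).write.csv(output_dir)` followed by `rename_spark_output(output_dir, "report.csv")`; the `_SUCCESS` and checksum files Spark also writes are left untouched here.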
On Sat, 22 Feb
Is there any way to save it as a raw CSV file as we do in pandas? I have a
script that uses the CSV file for further processing.
On Sat, 22 Feb 2020 at 14:31, rahul c wrote:
> Hi Kshitij,
>
> There are options to suppress the metadata files from getting created.
> Set the properties below and try.
>
>