To: lk_spark
Cc: user.spark
Subject: RE: Re:RE: how to merge dataframe write output files
Your coalesce should technically work. One thing to check is the executor
overhead memory: it should be configured as about 10% of executor memory.
You might also need to increase maxResultSize (spark.driver.maxResultSize).
The data itself looks fine.
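As a rough sketch of the settings above, something like the following could be
used when launching the shell. The 4096 MB executor size and the 2g result
size are placeholder assumptions, not values from your job. The property name
spark.yarn.executor.memoryOverhead applies to older Spark-on-YARN releases;
from Spark 2.3 it is spark.executor.memoryOverhead.

```shell
# Assumed executor size in MB; adjust to your cluster.
EXECUTOR_MEM_MB=4096
# Overhead as 10% of executor memory (integer MB).
OVERHEAD_MB=$((EXECUTOR_MEM_MB / 10))

# Print the launch command for review; drop the leading "echo" to run it.
echo bin/spark-shell --master yarn \
  --executor-memory "${EXECUTOR_MEM_MB}m" \
  --conf "spark.yarn.executor.memoryOverhead=${OVERHEAD_MB}" \
  --conf "spark.driver.maxResultSize=2g"
```

The overhead value is read as megabytes when no unit is given, so only the
number is passed.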
Subject: Re:RE: how to merge dataframe write output files
Thank you for the reply, Shreya.
It's because the files are too small, and HDFS doesn't like small files.
As for your question: yes, I want to create an ExternalTable on the Parquet
file folder. And how do I use the fragmented files, as you mentioned?
The test case is below:
bin/spark-shell --master yarn