Check these links:
https://stackoverflow.com/questions/31610971/spark-repartition-vs-coalesce
https://medium.com/@mrpowers/managing-spark-partitions-with-coalesce-and-repartition-4050c57ad5c4
On Sun, May 5, 2019 at 11:48, hemant singh ()
wrote:
Based on the size of the output data, you can do the math on how many files
you will need to produce 100 MB files. Once you have the number of files, you
can use coalesce or repartition, depending on whether your job currently
writes more or fewer output partitions than that.
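A minimal sketch of that math in PySpark terms. The DataFrame `df`, the output path, and the size estimate are placeholders; in practice you would estimate the total output size from a previous run or from the input size times your job's compression/selectivity ratio:

```python
import math

def num_output_files(total_output_bytes: int,
                     target_file_bytes: int = 100 * 1024**2) -> int:
    """How many files are needed so each is roughly target_file_bytes."""
    return max(1, math.ceil(total_output_bytes / target_file_bytes))

# Example: a result estimated at ~1.5 GB needs 16 files of ~100 MB each.
n = num_output_files(int(1.5 * 1024**3))

# Applying it before the write (hypothetical df and path):
# fewer partitions than current  -> coalesce (no shuffle),
# more partitions than current   -> repartition (full shuffle).
#
#   if n < df.rdd.getNumPartitions():
#       df = df.coalesce(n)
#   else:
#       df = df.repartition(n)
#   df.write.parquet("s3://bucket/output/")
```

Coalesce is cheaper because it only merges existing partitions, but it cannot increase the partition count; repartition can go either way at the cost of a full shuffle.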
On Sun, 5 May 2019 at 2:21 PM, rajat kumar
wrote:
Hi All,
My Spark SQL job writes its output with the default partitioning and
produces N files.
I want each file in the final result to be about 100 MB in size.
How can I do that?
Thanks
rajat