Hi All,

When I run a Spark job on my local machine (8 cores, 16 GB of RAM) on 6.5 GB of input data, it creates 193 parallel tasks and writes the output into 193 partitions.
How can I change the number of tasks, and consequently the number of output files, to something smaller, say just one? I have put a sketch of what I have in mind below my signature.

Regards,
_________________________________
*Md. Rezaul Karim*, BSc, MSc
PhD Researcher, INSIGHT Centre for Data Analytics
National University of Ireland, Galway
IDA Business Park, Dangan, Galway, Ireland
Web: http://www.reza-analytics.eu/index.html
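
For reference, this is roughly what I am trying, assuming the job goes through the DataFrame API with Parquet input and output (the paths, `df`, and the app name here are just placeholders, not my actual job):

import org.apache.spark.sql.SparkSession

object SingleFileOutput {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SingleFileOutput")
      .master("local[8]") // my 8 local cores
      .getOrCreate()

    // Placeholder input; the real job reads the 6.5 GB dataset.
    val df = spark.read.parquet("/path/to/input")

    // coalesce(1) merges the 193 partitions into one without a full
    // shuffle, so the write below produces a single part file
    // (plus the _SUCCESS marker).
    // repartition(1) would also give one file but forces a shuffle.
    df.coalesce(1)
      .write
      .parquet("/path/to/output")

    spark.stop()
  }
}

Is coalesce(1) the right way to do this, or does it just serialize the whole write through a single task?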