Hi All,

When I run a Spark job on my local machine (8 cores, 16 GB of RAM) on 6.5 GB of input data, it creates 193 parallel tasks and writes the output into 193 partitions.
How can I change the number of tasks, and consequently the number of output files, to something smaller, say just one? I have put a sketch of what I have in mind below my signature.

Regards,
_________________________________
*Md. Rezaul Karim*, BSc, MSc
PhD Researcher, INSIGHT Centre for Data Analytics
National University of Ireland, Galway
IDA Business Park, Dangan, Galway, Ireland
Web: http://www.reza-analytics.eu/index.html
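
For reference, this is roughly what I am trying, assuming the job goes through the DataFrame API with Parquet input and output (the paths, `df`, and the app name here are just placeholders, not my actual job):

import org.apache.spark.sql.SparkSession

object SingleFileOutput {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SingleFileOutput")
      .master("local[8]") // my 8 local cores
      .getOrCreate()

    // Placeholder input; the real job reads the 6.5 GB dataset.
    val df = spark.read.parquet("/path/to/input")

    // coalesce(1) merges the 193 partitions into one without a full
    // shuffle, so the write below produces a single part file
    // (plus the _SUCCESS marker).
    // repartition(1) would also give one file but forces a shuffle.
    df.coalesce(1)
      .write
      .parquet("/path/to/output")

    spark.stop()
  }
}

Is coalesce(1) the right way to do this, or does it just serialize the whole write through a single task?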