Can I reduce shuffling time?

Austin Chungath Tue, 13 Mar 2012 05:25:17 -0700

Hi,
I am running a pig query on around 500 GB input data.
The current block size is 128 MB and split size is the default 128 MB.
I have also specified 16 reducers and around 3800 mappers are running.


Now I observe that shuffling is taking a long time to complete execution,
approximately 25 mins per job.

Can anyone suggest how I can bring down the shuffling time? Is there any
property that I can tweak to improve performance?

Thanks & Regards,
Austin

Can I reduce shuffling time?

Reply via email to