Re: App Master takes ~30min to re-schedule task attempts.

Susheel Kumar Gadalay Wed, 19 Aug 2015 23:05:42 -0700

Change mapreduce.reduce.shuffle.connect.timeout and
mapreduce.reduce.shuffle.read.timeout.
Both default to 180000 ms (3 minutes).
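
As a rough sketch, the two properties could be overridden in
mapred-site.xml (or in the per-job configuration) as shown here. The
30000 ms values are purely illustrative assumptions, not values
recommended in this thread; pick timeouts that suit your cluster.

```xml
<!-- Illustrative override; 30000 ms is an assumed example value. -->
<property>
  <name>mapreduce.reduce.shuffle.connect.timeout</name>
  <!-- Timeout in milliseconds; the default is 180000. -->
  <value>30000</value>
</property>
<property>
  <name>mapreduce.reduce.shuffle.read.timeout</name>
  <!-- Timeout in milliseconds; the default is 180000. -->
  <value>30000</value>
</property>
```

If the job driver uses ToolRunner, the same properties can usually be
passed per job with -D name=value instead of editing the site file.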


On 8/20/15, manoj <[email protected]> wrote:
> Hello all,
>
> I'm running Apache Hadoop 2.6.0.
> I'm trying to remove a node from a Hadoop cluster and then add it back.
> The task attempts on the removed node are rescheduled only after
> 30 min.
>
> During this 30 min period it looks like the App Master keeps trying to
> connect (see the log below) to the node that was removed. After about
> 30 min it reschedules those task attempts from the lost node and
> eventually the job succeeds.
>
> How can I reduce the 30 min wait time?
>
> .....
> ......
> 2015-08-14 11:25:21,662 INFO [ContainerLauncher #7]
> org.apache.hadoop.ipc.Client: Retrying connect to server:
> host172/XX.XX.XX.XX:36158. Already tried 0 time(s); retry policy is
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000
> MILLISECONDS)
> ......
> ......
>
> Thanks
> --Manoj Kumar M
>
