Try lowering mapreduce.reduce.shuffle.connect.timeout and mapreduce.reduce.shuffle.read.timeout. Both default to 180000 milliseconds (3 minutes).
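
For example, a minimal sketch of what the entries could look like in mapred-site.xml. The 30000 values are only illustrative assumptions, not recommendations, and I have not verified that these two settings alone account for the whole 30min delay you are seeing:

```xml
<!-- Sketch only: 30000 ms (30 s) is an assumed example value; tune it for your cluster. -->
<property>
  <name>mapreduce.reduce.shuffle.connect.timeout</name>
  <value>30000</value>
</property>
<property>
  <name>mapreduce.reduce.shuffle.read.timeout</name>
  <value>30000</value>
</property>
```

Both values are in milliseconds. Setting them too low may cause spurious shuffle failures on a busy network.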

On 8/20/15, manoj <[email protected]> wrote:
> Hello all,
>
> I'm running Apache2.6.0.
> I'm trying to remove a node from a Hadoop Cluster and then add it back.
> The taskattempts on the node which was removed are rescheduled only after
> 30min.
>
> During this 30min period it looks like the App Master is trying to connect
> (check the log below) to the same node which was removed, and after about
> 30min it reschedules those taskAttempts from the lost node and eventually
> the job succeeds.
>
> How can I reduce the 30min wait time?
>
> .....
> ......
> 2015-08-14 11:25:21,662 INFO [ContainerLauncher #7]
> org.apache.hadoop.ipc.Client: Retrying connect to server:
> host172/XX.XX.XX.XX:36158. Already tried 0 time(s); retry policy is
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000
> MILLISECONDS)
> ......
> ......
>
> Thanks
> --Manoj Kumar M
