Hello all, I'm running Apache Hadoop 2.6.0. I'm trying to remove a node from a Hadoop cluster and then add it back. The task attempts that were running on the removed node are rescheduled only after 30 minutes.
During this 30-minute period, it looks like the App Master keeps trying to connect (see the log below) to the same node that was removed. After about 30 minutes it reschedules those task attempts from the lost node, and the job eventually succeeds. How can I reduce the 30-minute wait?

.....
......
2015-08-14 11:25:21,662 INFO [ContainerLauncher #7] org.apache.hadoop.ipc.Client: Retrying connect to server: host172/XX.XX.XX.XX:36158. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
......
......

Thanks
--Manoj Kumar M
