[ https://issues.apache.org/jira/browse/MAPREDUCE-5198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy resolved MAPREDUCE-5198.
--------------------------------------
Resolution: Fixed
Fix Version/s: 1.2.0

I just committed this. Thanks Arpit!

PS: I added a javadoc to the new ttReInit param for TT.TIP.jobHasFinished during the commit.

> Race condition in cleanup during task tracker reinit with LinuxTaskController
> ----------------------------------------------------------------------------
>
> Key: MAPREDUCE-5198
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5198
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: tasktracker
> Affects Versions: 1.2.0
> Reporter: Arpit Gupta
> Assignee: Arpit Gupta
> Fix For: 1.2.0
>
> Attachments: MAPREDUCE-5198.patch
>
>
> This was noticed when the job tracker was restarted while jobs were running and then asked the task tracker to reinitialize.
> The task tracker would fail with an error like:
> {code}
> 2013-04-27 20:19:09,627 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /grid/0/hdp/mapred/local,/grid/1/hdp/mapred/local,/grid/2/hdp/mapred/local,/grid/3/hdp/mapred/local,/grid/4/hdp/mapred/local,/grid/5/hdp/mapred/local
> 2013-04-27 20:19:09,628 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 42075 caught: java.nio.channels.ClosedChannelException
>     at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>     at org.apache.hadoop.ipc.Server.channelWrite(Server.java:1717)
>     at org.apache.hadoop.ipc.Server.access$2000(Server.java:98)
>     at org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:744)
>     at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:808)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1433)
> 2013-04-27 20:19:09,628 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 42075: exiting
> 2013-04-27 20:19:10,414 ERROR org.apache.hadoop.mapred.TaskTracker: Got fatal exception while reinitializing TaskTracker:
> org.apache.hadoop.util.Shell$ExitCodeException:
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
>     at org.apache.hadoop.util.Shell.run(Shell.java:182)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
>     at org.apache.hadoop.mapred.LinuxTaskController.deleteAsUser(LinuxTaskController.java:281)
>     at org.apache.hadoop.mapred.TaskTracker.deleteUserDirectories(TaskTracker.java:779)
>     at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:816)
>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2704)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3934)
> {code}