JobTracker shuts down during initialization if the NameNode is down
-------------------------------------------------------------------

                 Key: HADOOP-3027
                 URL: https://issues.apache.org/jira/browse/HADOOP-3027
             Project: Hadoop Core
          Issue Type: Bug
          Components: dfs
            Reporter: Amareshwari Sriramadasu
            Priority: Blocker
             Fix For: 0.17.0


While initializing, the JobTracker tries to connect to the NameNode. If the 
NameNode is still unreachable after the first iteration of the connect loop, 
the JobTracker shuts itself down within a few seconds. This is easy to 
reproduce: start the JobTracker before starting the NameNode. The cause appears 
to be in FileSystem$Cache.get, which calls Runtime.addShutdownHook again on the 
retry even though the same hook was already registered on the first attempt. 
The JVM rejects the duplicate with "IllegalArgumentException: Hook previously 
registered", and the JobTracker logs this as FATAL and shuts down.
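
The JVM behaviour behind this can be shown in isolation. The sketch below is 
not the Hadoop code or the eventual fix; ShutdownHookSketch, registerOnce and 
duplicateRegistrationThrows are hypothetical names used only for illustration. 
It relies on the documented behaviour of Runtime.addShutdownHook, which throws 
IllegalArgumentException when the same Thread is registered twice, and shows 
one possible guard: remember whether the hook has already been registered, so 
that a retry does not attempt to add it again.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration class; not part of Hadoop.
public class ShutdownHookSketch {
    private final Thread hook;
    private final AtomicBoolean registered = new AtomicBoolean(false);

    public ShutdownHookSketch(Runnable cleanup) {
        this.hook = new Thread(cleanup);
    }

    /**
     * Registers the hook with the JVM at most once.
     * Returns true only for the call that performed the registration,
     * so repeated calls (for example from a retry loop) are harmless.
     */
    public boolean registerOnce() {
        if (registered.compareAndSet(false, true)) {
            Runtime.getRuntime().addShutdownHook(hook);
            return true;
        }
        return false;
    }

    /**
     * Reproduces the failure mode: adding the same Thread twice without
     * a guard makes the second call throw IllegalArgumentException.
     */
    public static boolean duplicateRegistrationThrows() {
        Runtime rt = Runtime.getRuntime();
        Thread sameHook = new Thread(() -> { });
        rt.addShutdownHook(sameHook);
        try {
            rt.addShutdownHook(sameHook); // second registration of the same Thread
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        } finally {
            rt.removeShutdownHook(sameHook); // leave no stray hook behind
        }
    }

    public static void main(String[] args) {
        System.out.println("duplicate throws: " + duplicateRegistrationThrows());
        ShutdownHookSketch guard = new ShutdownHookSketch(() -> { });
        System.out.println("first call registered: " + guard.registerOnce());
        System.out.println("second call registered: " + guard.registerOnce());
    }
}
```

Whether the real fix should guard the registration like this, or instead 
register the hook once outside the retry path, is left open here.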

2008-03-17 09:45:20,979 INFO org.apache.hadoop.mapred.JobTracker: JobTracker up at: 9101
2008-03-17 09:45:20,979 INFO org.apache.hadoop.mapred.JobTracker: JobTracker webserver: 50030
2008-03-17 09:45:21,374 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 1 time(s).
2008-03-17 09:45:22,377 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 2 time(s).
2008-03-17 09:45:23,380 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 3 time(s).
2008-03-17 09:45:24,383 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 4 time(s).
2008-03-17 09:45:25,385 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 5 time(s).
2008-03-17 09:45:26,388 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 6 time(s).
2008-03-17 09:45:27,391 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 7 time(s).
2008-03-17 09:45:28,394 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 8 time(s).
2008-03-17 09:45:29,397 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
2008-03-17 09:45:30,402 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 10 time(s).
2008-03-17 09:45:31,406 INFO org.apache.hadoop.mapred.JobTracker: problem cleaning system directory: /tmp/hadoop/mapred/system
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
        at sun.nio.ch.SocketAdaptor.connect(Unknown Source)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:174)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:623)
        at org.apache.hadoop.ipc.Client.call(Client.java:546)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:211)
        at org.apache.hadoop.dfs.$Proxy4.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:312)
        at org.apache.hadoop.dfs.DFSClient.createRPCNamenode(DFSClient.java:94)
        at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:158)
        at org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:69)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1255)
        at org.apache.hadoop.fs.FileSystem.access$400(FileSystem.java:53)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1272)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:191)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:96)
        at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:702)
        at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:135)
        at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:2266)
2008-03-17 09:45:41,410 FATAL org.apache.hadoop.mapred.JobTracker: java.lang.IllegalArgumentException: Hook previously registered
        at java.lang.ApplicationShutdownHooks.add(Unknown Source)
        at java.lang.Runtime.addShutdownHook(Unknown Source)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1269)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:191)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:96)
        at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:702)
        at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:135)
        at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:2266)

2008-03-17 09:45:41,412 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG: 
