Hi,
I have a basic question about Hadoop configuration. Whenever I try to
start the JobTracker, it stays in "initializing" mode forever, and when I
checked the log file, I found the following errors:
several lines like these, one for each of the slaves in my cluster:
2009-12-17 17:47:43,717 INFO org.apache.hadoop.hdfs.DFSClient: Exception in
createBlockOutputStream java.net.SocketTimeoutException: 66000 millis timeout
while waiting for channel to be ready for connect. ch :
java.nio.channels.SocketChannel[connection-pending
remote=/XXX.XXX.XXX.XXX:50010]
2009-12-17 17:47:43,717 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block
blk_7740448897934265604_1010
2009-12-17 17:47:43,720 INFO org.apache.hadoop.hdfs.DFSClient: Waiting to find
target node: XXX.XXX.XXX.XXX:50010
followed by:
2009-12-17 17:47:49,727 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer
Exception: java.io.IOException: Unable to create new block.
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2812)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)
2009-12-17 17:47:49,728 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery
for block blk_7740448897934265604_1010 bad datanode[0] nodes == null
2009-12-17 17:47:49,728 WARN org.apache.hadoop.hdfs.DFSClient: Could not get
block locations. Source file
"${mapred.system.dir}/mapred/system/jobtracker.info" - Aborting...
2009-12-17 17:47:49,728 WARN org.apache.hadoop.mapred.JobTracker: Writing to
file ${fs.default.name}/${mapred.system.dir}/mapred/system/jobtracker.info
failed!
2009-12-17 17:47:49,728 WARN org.apache.hadoop.mapred.JobTracker: FileSystem is
not ready yet!
2009-12-17 17:47:49,749 WARN org.apache.hadoop.mapred.JobTracker: Failed to
initialize recovery manager.
java.net.SocketTimeoutException: 66000 millis timeout while waiting for channel
to be ready for connect. ch :
java.nio.channels.SocketChannel[connection-pending
remote=/XXX.XXX.XXX.XXX:50010]
at
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:213)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2837)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2793)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)
2009-12-17 17:47:59,757 WARN org.apache.hadoop.mapred.JobTracker: Retrying...
Then the whole sequence starts all over again.
I am not sure what the reason for this error is. I tried leaving
mapred.system.dir at its default value, and also overriding it in
mapred-site.xml with both local and shared directories, but to no avail. In
every case this error shows up in the log file: Writing to file
${fs.default.name}/${mapred.system.dir}/mapred/system/jobtracker.info failed!
Is it true that Hadoop appends these values together? What should I do to
avoid this? Does anyone know what I am doing wrong, or what could be causing
these errors?
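For reference, this is roughly what my mapred-site.xml override looked like
(the host name and path are just placeholders, not my real values):

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:54311</value>
  </property>
  <property>
    <!-- I tried this as an HDFS-relative path, as a local path,
         and also removing the property to fall back to the default -->
    <name>mapred.system.dir</name>
    <value>/mapred/system</value>
  </property>
</configuration>
```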
Thanks