Hi All, I am running jobs on a cluster from my application. One of my jobs is failing with a SocketTimeoutException. I have run the job outside Hadoop and it runs fine, but even on a pseudo-distributed cluster it fails on Hadoop with the following errors:
*DATANODE*:

2012-05-07 11:40:35,849 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.72.234.31:50010, storageID=DS-1360089189-10.72.234.31-50010-1336388587612, infoPort=50075, ipcPort=50020):Got exception while serving blk_-6429888481691427193_1002 to /10.72.234.31:
2012-05-07 12:18:23,119 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.72.234.31:50010, storageID=DS-1360089189-10.72.234.31-50010-1336388587612, infoPort=50075, ipcPort=50020):DataXceiver
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.72.234.31:50010 remote=/10.72.234.31:48438]
    at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
    at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
    at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:313)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:401)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:197)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:99)
    at java.lang.Thread.run(Thread.java:679)

*NameNode*:

StateChange: BLOCK* NameSystem.addToInvalidates: blk_6623152329906627130 is added to invalidSet

*TaskTracker*:

2012-05-07 13:37:37,997 INFO org.apache.hadoop.mapred.TaskTracker: Process Thread Dump: lost task
25 active threads
Thread 138 (process reaper):
  State: RUNNABLE
  Blocked count: 0
  Waited count: 0
  Stack:
    java.lang.UNIXProcess.waitForProcessExit(Native Method)
    java.lang.UNIXProcess.access$900(UNIXProcess.java:36)
    java.lang.UNIXProcess$1$1.run(UNIXProcess.java:148)
Thread 137 (JVM Runner jvm_201205071102_0001_m_1572137851 spawned.):
  State: WAITING
  Blocked count: 1
  Waited count: 2
  Waiting on java.lang.UNIXProcess@fbe496b
  Stack:
    java.lang.Object.wait(Native Method)
    java.lang.Object.wait(Object.java:502)
    java.lang.UNIXProcess.waitFor(UNIXProcess.java:181)
    org.apache.hadoop.util.Shell.runCommand(Shell.java:244)
    org.apache.hadoop.util.Shell.run(Shell.java:182)
    org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
    org.apache.hadoop.mapred.DefaultTaskController.launchTask(DefaultTaskController.java:126)
    org.apache.hadoop.mapred.JvmManager$JvmManagerForType$JvmRunner.runChild(JvmManager.java:472)
    org.apache.hadoop.mapred.JvmManager$JvmManagerForType$JvmRunner.run(JvmManager.java:446)

My application does a lot of file and string operations. I have tried increasing the open-file limit to 16384, but I am still stuck with this issue. Please guide me as to what needs to be done to run this job on Hadoop.

Thanks in advance.

Regards,
Ash
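
P.S. For what it's worth, the "480000 millis timeout" in the DataNode log matches the default value of Hadoop's dfs.datanode.socket.write.timeout property (8 minutes), which bounds how long the DataNode waits for a slow reader before aborting the transfer. I am not sure this is the right knob for my case, but as an experiment I was considering raising it (or setting it to 0 to disable the write timeout) in conf/hdfs-site.xml, roughly like this:

```xml
<!-- Sketch only: raises the DataNode's write-side socket timeout.
     The 960000 value (16 minutes) is an arbitrary example, not a
     recommendation; 0 would disable the timeout entirely. -->
<configuration>
  <property>
    <name>dfs.datanode.socket.write.timeout</name>
    <value>960000</value>
  </property>
</configuration>
```

If anyone knows whether tuning this (or the read-side dfs.socket.timeout) is the right approach here, or whether the timeout is just a symptom of the task hanging, please let me know.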