Some of our region servers are suddenly dying. I've pasted the relevant log lines below. I don't see any errors in the datanode logs. Any ideas? Thanks, venkatsh
.....
2010-10-10 12:55:36,664 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2845)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
2010-10-10 12:55:36,664 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-8758558338582893960_95415 bad datanode[0] nodes == null
2010-10-10 12:55:36,665 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/hbase_data/user_activity/compaction.dir/78194102/766401078063435042" - Aborting...
2010-10-10 12:55:36,666 ERROR org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction/Split failed for region user_activity,1286729575294_11655_614aa74e,1286729678877
java.io.EOFException
    at java.io.DataInputStream.readByte(DataInputStream.java:250)
    at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
    at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
    at org.apache.hadoop.io.Text.readString(Text.java:400)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2901)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2826)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
2010-10-10 12:55:40,176 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
2010-10-10 12:55:40,176 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-568910271688144725_95415
2010-10-10 12:55:53,353 DEBUG org.apache.hadoop.hbase.regionserver.Store: closed activities
2010-10-10 12:55:53,353 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed user_activity,1286232613677_albridgew4_18363_c45677e1,1286233511007
2010-10-10 12:55:53,353 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: closing region user_activity,1286202422485_bayequip_15725_a6b7893e,1286203144881
2010-10-10 12:55:53,353 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing user_activity,1286202422485_bayequip_15725_a6b7893e,1286203144881: disabling compactions & flushes
2010-10-10 12:55:53,353 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region, no outstanding scanners on user_activity,1286202422485_bayequip_15725_a6b7893e,1286203144881
2010-10-10 12:55:53,353 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: No more row locks outstanding on region user_activity,1286202422485_15725_a6b7893e,1286203144881
2010-10-10 12:55:53,353 DEBUG org.apache.hadoop.hbase.regionserver.Store: closed activities
2010-10-10 12:55:53,354 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed user_activity,1286202422485_bayequip_15725_a6b7893e,1286203144881
2010-10-10 12:55:53,354 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: aborting server at: 172.29.253.200:60020
2010-10-10 12:55:55,091 INFO org.apache.hadoop.hbase.Leases: regionserver/172.29.253.200:60020.leaseChecker closing leases
2010-10-10 12:55:55,091 INFO org.apache.hadoop.hbase.Leases: regionserver/172.29.253.200:60020.leaseChecker closed leases
2010-10-10 12:55:59,664 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: worker thread exiting
2010-10-10 12:55:59,664 INFO org.apache.zookeeper.ZooKeeper: Closing session: 0x22b967dce5d0001
2010-10-10 12:55:59,664 INFO org.apache.zookeeper.ClientCnxn: Closing ClientCnxn for session: 0x22b967dce5d0001
2010-10-10 12:55:59,669 INFO org.apache.zookeeper.ClientCnxn: Exception while closing send thread for session 0x22b967dce5d0001 : Read error rc = -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
2010-10-10 12:55:59,775 INFO org.apache.zookeeper.ClientCnxn: Disconnecting ClientCnxn for session: 0x22b967dce5d0001
2010-10-10 12:55:59,775 INFO org.apache.zookeeper.ZooKeeper: Session: 0x22b967dce5d0001 closed
2010-10-10 12:55:59,775 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Closed connection with ZooKeeper
2010-10-10 12:55:59,775 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2010-10-10 12:55:59,776 ERROR org.apache.hadoop.hdfs.DFSClient: Exception closing file /hbase_data/user_activity/78194102/activities/8044918410206348854 : java.io.EOFException
java.io.EOFException
    at java.io.DataInputStream.readByte(DataInputStream.java:250)
    at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)