I can't tell from this excerpt; the reason the region server died is higher up in the log. Look for a message like "Dumping metrics"; the reason should be just a few lines above it.
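If the log is large, something along these lines may help pull out the lines just above that message. It is only a rough sketch: the helper names, the context size of 20 lines, and the log path you pass in are my own assumptions, not anything HBase provides.

```python
from collections import deque


def lines_before_marker(lines, marker="Dumping metrics", context=20):
    """Yield (marker_line, preceding_lines) for every line containing marker.

    `context` is an arbitrary choice for how many earlier lines to keep.
    """
    recent = deque(maxlen=context)  # rolling window of the most recent lines
    for line in lines:
        if marker in line:
            yield line, list(recent)
        recent.append(line)


def print_context(path, marker="Dumping metrics", context=20):
    """Print the lines that precede each marker hit in the log at `path`.

    `path` is whatever your region server log file is called (hypothetical).
    """
    with open(path, errors="replace") as log:
        stripped = (line.rstrip("\n") for line in log)
        for hit, before in lines_before_marker(stripped, marker, context):
            print("\n".join(before))
            print(">>> " + hit)
```

Call print_context() with the path to the region server log; the abort reason should show up somewhere in the lines printed before each ">>>" line.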
J-D

On Sun, Oct 10, 2010 at 5:13 PM, Venkatesh <[email protected]> wrote:
>
> Some of the region servers suddenly dying..I've pasted relevant log lines..I
> don't see any error in datanodes
> Any ideas?
> thanks
> venkatsh
>
> .....
>
>
> 2010-10-10 12:55:36,664 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer
> Exception: java.io.IOException: Unable to create new block. at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2845)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
>
> 2010-10-10 12:55:36,664 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery
> for block blk_-8758558338582893960_95415 bad datanode[0] nodes == null
> 2010-10-10 12:55:36,665 WARN org.apache.hadoop.hdfs.DFSClient: Could not get
> block locations. Source file
> "/hbase_data/user_activity/compaction.dir/78194102/766401078063435042" -
> Aborting...
> 2010-10-10 12:55:36,666 ERROR
> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction/Split
> failed for region user_activity,1286729575294_11655_614aa74e,1286729678877
> java.io.EOFException
> at java.io.DataInputStream.readByte(DataInputStream.java:250)
> at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
> at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
> at org.apache.hadoop.io.Text.readString(Text.java:400)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2901)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2826)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
> 2010-10-10 12:55:40,176 INFO org.apache.hadoop.hdfs.DFSClient: Exception in
> createBlockOutputStream java.io.EOFException
> 2010-10-10 12:55:40,176 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning
> block blk_-568910271688144725_95415
>
>
>
> 2010-10-10 12:55:53,353 DEBUG org.apache.hadoop.hbase.regionserver.Store:
> closed activities2010-10-10 12:55:53,353 INFO
> org.apache.hadoop.hbase.regionserver.HRegion: Closed
> user_activity,1286232613677_albridgew4_18363_c45677e1,12862335110072010-10-10
> 12:55:53,353 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer:
> closing region
> user_activity,1286202422485_bayequip_15725_a6b7893e,12862031448812010-10-10
> 12:55:53,353 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing
> user_activity,1286202422485_bayequip_15725_a6b7893e,1286203144881: disabling
> compactions & flushes2010-10-10 12:55:53,353 DEBUG
> org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region, no
> outstanding scanners on
> user_activity,1286202422485_bayequip_15725_a6b7893e,12862031448812010-10-10
> 12:55:53,353 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: No more row
> locks outstanding on region
> user_activity,1286202422485_15725_a6b7893e,12862031448812010-10-10
> 12:55:53,353 DEBUG org.apache.hadoop.hbase.regionserver.Store: closed
> activities2010-10-10 12:55:53,354 INFO
> org.apache.hadoop.hbase.regionserver.HRegion: Closed
> user_activity,1286202422485_bayequip_15725_a6b7893e,1286203144881
> 2010-10-10 12:55:53,354 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: aborting server at:
> 172.29.253.200:60020
> 2010-10-10 12:55:55,091 INFO org.apache.hadoop.hbase.Leases:
> regionserver/172.29.253.200:60020.leaseChecker closing leases2010-10-10
> 12:55:55,091 INFO org.apache.hadoop.hbase.Leases:
> regionserver/172.29.253.200:60020.leaseChecker closed leases
> 2010-10-10 12:55:59,664 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: worker thread exiting
> 2010-10-10 12:55:59,664 INFO org.apache.zookeeper.ZooKeeper: Closing session:
> 0x22b967dce5d0001
> 2010-10-10 12:55:59,664 INFO org.apache.zookeeper.ClientCnxn: Closing
> ClientCnxn for session: 0x22b967dce5d0001
> 2010-10-10 12:55:59,669 INFO org.apache.zookeeper.ClientCnxn: Exception while
> closing send thread for session 0x22b967dce5d0001 : Read error rc = -1
> java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
> 2010-10-10 12:55:59,775 INFO org.apache.zookeeper.ClientCnxn: Disconnecting
> ClientCnxn for session: 0x22b967dce5d0001
> 2010-10-10 12:55:59,775 INFO org.apache.zookeeper.ZooKeeper: Session:
> 0x22b967dce5d0001 closed
> 2010-10-10 12:55:59,775 DEBUG
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Closed connection with
> ZooKeeper
> 2010-10-10 12:55:59,775 INFO org.apache.zookeeper.ClientCnxn: EventThread
> shut down
> 2010-10-10 12:55:59,776 ERROR org.apache.hadoop.hdfs.DFSClient: Exception
> closing file
> /hbase_data/user_activity/78194102/activities/8044918410206348854 :
> java.io.EOFException
> java.io.EOFException at
> java.io.DataInputStream.readByte(DataInputStream.java:250) at
> org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
