Hi. I'm also encountering an error like this. I'm using HBase 0.18.0 and Hadoop 0.18.0. In addition to this error, region servers sometimes die: in the log I see the region server shut down after starting a compaction, because some data blocks are not found.
Best Regards.

On Wed, Oct 8, 2008 at 11:29 PM, stack <[EMAIL PROTECTED]> wrote:

> You should update to 0.2.1 if you can. Make sure you've upped your file
> descriptors too: See http://wiki.apache.org/hadoop/Hbase/FAQ#6. Also see
> how to enable DEBUG in same FAQ.
>
> Something odd is up when you see messages like this out of HDFS: ': No live
> nodes contain current block*'. That's lost data.
>
> Or messages like this, 'compaction completed on region
> search1,r3_1_3_c157476,1223360357528 in 18mins, 39sec' -- i.e. that
> compactions are taking so long -- would seem to indicate your machines are
> severely overloaded or underpowered or both. Can you study load when the
> upload is running on these machines? Perhaps try throttling back to see if
> hbase survives longer?
>
> The regionserver will output thread dump in its RPC layer if critical error
> -- OOME -- or it's been hung up for a long time IIRC.
>
> Check the '.out' logs too for your hbase install to see if they contain any
> errors. Grep the datanode logs too for OOME or "too many open file
> handles".
>
> St.Ack
>
> Rui Xing wrote:
>
>> Hi All,
>>
>> 1). We are doing performance testing on hbase. The environment of the
>> testing is 3 data nodes, and 1 name node distributed on 4 machines. We
>> started one region server on each data node respectively. To insert the
>> data, one insertion client is started on each data node machine. But as
>> the data was inserted, the region servers crashed one by one. One of the
>> reasons is listed as follows:
>>
>> *==>
>> 2008-10-07 14:47:01,519 WARN org.apache.hadoop.dfs.DFSClient: Exception
>> while reading from blk_-806310822584979460 of
>> /hbase/search1/1201761134/col9/mapfiles/3578469984425427480/data from
>> 10.2.6.102:50010: java.io.IOException: Premeture EOF from inputStream*
>>
>> ... ...
>>
>> *2008-10-07 14:47:01,521 INFO org.apache.hadoop.dfs.DFSClient: Could not
>> obtain block blk_-806310822584979460 from any node:
>> java.io.IOExceptionYou
>>
>> 2008-10-07 14:52:25,229 INFO org.apache.hadoop.hbase.regionserver.HRegion:
>> compaction completed on region search1,r3_1_3_c157476,1223360357528 in
>> 18mins, 39sec
>> 2008-10-07 14:52:25,238 INFO
>> org.apache.hadoop.hbase.regionserver.CompactSplitThread:
>> regionserver/0.0.0.0:60020.compactor exiting
>> 2008-10-07 14:52:25,284 INFO org.apache.hadoop.hbase.regionserver.HRegion:
>> closed search1,r3_1_3_c157476,1223360357528
>> 2008-10-07 14:52:25,291 INFO org.apache.hadoop.hbase.regionserver.HRegion:
>> closed -ROOT-,,0
>> 2008-10-07 14:52:25,291 INFO
>> org.apache.hadoop.hbase.regionserver.HRegionServer: aborting server at:
>> 10.2.6.104:60020
>> 2008-10-07 14:52:25,291 INFO
>> org.apache.hadoop.hbase.regionserver.HRegionServer:
>> regionserver/0.0.0.0:60020 exiting
>> 2008-10-07 14:52:25,511 INFO
>> org.apache.hadoop.hbase.regionserver.HRegionServer: Starting shutdown
>> thread.
>> 2008-10-07 14:52:25,511 INFO
>> org.apache.hadoop.hbase.regionserver.HRegionServer: Shutdown thread
>> complete
>> ===<
>>
>> 2). Another question is, under what circumstance will the region server
>> print logs of the thread information as below? It appears among the normal
>> log records.
>> ===>
>> 35 active threads
>> Thread 1281 (IPC Client connection to
>> d3v1.corp.alimama.com/10.2.6.101:54310):
>> State: RUNNABLE
>> Blocked count: 0
>> Waited count: 0
>> Stack:
>> java.util.Hashtable.remove(Hashtable.java:435)
>> org.apache.hadoop.ipc.Client$Connection.run(Client.java:297)
>> ... ...
>> ===<
>>
>> We use hadoop 0.17.1 and hbase 0.2.0. It would be greatly appreciated if
>> any clues can be dropped.
>>
>> Regards,
>> -Ray
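
For anyone following the advice above to grep the regionserver and datanode logs, here is a rough sketch of that check in Python. It is only an illustration: the pattern names and the `scan_lines` / `scan_file` helpers are made up for this example, and the exact strings your Hadoop and HBase versions emit may differ (for instance, the JVM usually reports descriptor exhaustion as "Too many open files", and OOME shows up as `OutOfMemoryError`).

```python
import re
from collections import Counter
from typing import Iterable

# Hypothetical signature table, based on the messages discussed in this thread.
# Adjust the expressions to whatever your own logs actually contain.
PATTERNS = {
    "oome": re.compile(r"OutOfMemoryError"),
    "open_files": re.compile(r"too many open file", re.IGNORECASE),
    "lost_block": re.compile(r"No live nodes contain current block"),
    "block_unavailable": re.compile(r"Could not obtain block"),
}


def scan_lines(lines: Iterable[str]) -> Counter:
    """Count how many lines match each signature in PATTERNS."""
    counts: Counter = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def scan_file(path: str) -> Counter:
    """Scan one log file, tolerating stray bytes in the log."""
    with open(path, errors="replace") as handle:
        return scan_lines(handle)
```

Calling `scan_file` on each regionserver `.log` and `.out` file and each datanode log gives a quick count per signature, which should be enough to tell whether you are looking at memory pressure, descriptor exhaustion, or missing blocks before digging further.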
