Re: region server problem

Slava Gorelik Wed, 08 Oct 2008 14:35:55 -0700

Hi. I'm also encountering an error like this.
I'm using Hbase 0.18.0 and Hadoop 0.18.0.
In addition to this error, region servers sometimes die. In the log I see
the region server shut down after starting a compaction, because some data
blocks are not found.
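
To count how often these symptoms show up in the region server and datanode
logs, a minimal sketch follows. The search strings come from the messages
quoted in this thread. The function name scan_lines and the duration format
(an optional "hrs, " and "mins, " part before the seconds) are assumptions,
not something HBase provides.

```python
import re
from collections import Counter

# Substrings taken from the log messages and advice quoted in this thread.
PATTERNS = {
    "oome": re.compile(r"OutOfMemoryError"),
    "open_files": re.compile(r"too many open file", re.IGNORECASE),
    "lost_block": re.compile(r"No live nodes contain current block"),
    "block_unavailable": re.compile(r"Could not obtain block"),
}

# Assumed duration format: "[Nhrs, ][Nmins, ]Nsec", e.g. "18mins, 39sec".
COMPACTION = re.compile(
    r"compaction completed on region (\S+) in "
    r"(?:(\d+)hrs, )?(?:(\d+)mins, )?(\d+)sec"
)


def scan_lines(lines):
    """Return (symptom counts, [(region, compaction seconds), ...])."""
    counts = Counter()
    compactions = []
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
        match = COMPACTION.search(line)
        if match:
            # Missing hour/minute groups are None and count as zero.
            hrs, mins, secs = (int(g) if g else 0 for g in match.group(2, 3, 4))
            compactions.append((match.group(1), hrs * 3600 + mins * 60 + secs))
    return counts, compactions
```

To use it, pass an open log file, e.g. scan_lines(open(path)), where path is
whichever region server or datanode log you want to inspect. Note that each
log record must be on one line, unlike the wrapped copies in this mail.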


Best Regards.

On Wed, Oct 8, 2008 at 11:29 PM, stack <[EMAIL PROTECTED]> wrote:

> You should update to 0.2.1 if you can.  Make sure you've upped your file
> descriptors too:  See http://wiki.apache.org/hadoop/Hbase/FAQ#6.  Also see
> how to enable DEBUG in same FAQ.
>
> Something odd is up when you see messages like this out of HDFS: ': No live
> nodes contain current block*'.  That's lost data.
>
> Or messages like this, 'compaction completed on region
> search1,r3_1_3_c157476,1223360357528 in 18mins, 39sec' -- i.e. that
> compactions are taking so long -- would seem to indicate your machines are
> severely overloaded or underpowered or both.  Can you study load when the
> upload is running on these machines?  Perhaps try throttling back to see if
> hbase survives longer?
>
> The regionserver will output a thread dump in its RPC layer on a critical
> error -- OOME -- or if it's been hung up for a long time, IIRC.
>
> Check the '.out' logs too for your hbase install to see if they contain any
> errors.  Grep the datanode logs too for OOME or "too many open file
> handles".
>
> St.Ack
>
> Rui Xing wrote:
>
>> Hi All,
>>
>> 1). We are doing performance testing on hbase. The environment of the
>> testing is 3 data nodes, and 1 name node distributed on 4 machines. We
>> started one region server on each data node respectively. To insert the
>> data, one insertion client is started on each data node machine. But as
>> the data was inserted, the region servers crashed one by one. One of the
>> reasons is listed as follows:
>>
>> *==>
>> 2008-10-07 14:47:01,519 WARN org.apache.hadoop.dfs.DFSClient: Exception
>> while reading from blk_-806310822584979460 of
>> /hbase/search1/1201761134/col9/mapfiles/3578469984425427480/data from
>> 10.2.6.102:50010: java.io.IOException: Premeture EOF from inputStream*
>>
>> ... ...
>>
>> *2008-10-07 14:47:01,521 INFO org.apache.hadoop.dfs.DFSClient: Could not
>> obtain block blk_-806310822584979460 from any node:
>>  java.io.IOException: No live nodes contain current block*
>>
>> 2008-10-07 14:52:25,229 INFO org.apache.hadoop.hbase.regionserver.HRegion:
>> compaction completed on region search1,r3_1_3_c157476,1223360357528 in
>> 18mins, 39sec
>> 2008-10-07 14:52:25,238 INFO
>> org.apache.hadoop.hbase.regionserver.CompactSplitThread:
>> regionserver/0.0.0.0:60020.compactor exiting
>> 2008-10-07 14:52:25,284 INFO org.apache.hadoop.hbase.regionserver.HRegion:
>> closed search1,r3_1_3_c157476,1223360357528
>> 2008-10-07 14:52:25,291 INFO org.apache.hadoop.hbase.regionserver.HRegion:
>> closed -ROOT-,,0
>> 2008-10-07 14:52:25,291 INFO
>> org.apache.hadoop.hbase.regionserver.HRegionServer: aborting server at:
>> 10.2.6.104:60020
>> 2008-10-07 14:52:25,291 INFO
>> org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver/
>> 0.0.0.0:60020 exiting
>> 2008-10-07 14:52:25,511 INFO
>> org.apache.hadoop.hbase.regionserver.HRegionServer: Starting shutdown
>> thread.
>> 2008-10-07 14:52:25,511 INFO
>> org.apache.hadoop.hbase.regionserver.HRegionServer: Shutdown thread
>> complete
>> ===<
>>
>> 2). Another question is, under what circumstances will the region server
>> print thread information like the log below? It appears among the normal
>> log records.
>> ===>
>> 35 active threads
>> Thread 1281 (IPC Client connection to
>> d3v1.corp.alimama.com/10.2.6.101:54310
>> ):
>>  State: RUNNABLE
>>  Blocked count: 0
>>  Waited count: 0
>>  Stack:
>>    java.util.Hashtable.remove(Hashtable.java:435)
>>    org.apache.hadoop.ipc.Client$Connection.run(Client.java:297)
>> ... ...
>> ===<
>>
>> We use hadoop 0.17.1 and hbase 0.2.0. It would be greatly appreciated if
>> any clues can be dropped.
>>
>> Regards,
>> -Ray
>>
>>
>>
>
>
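
On the file descriptor advice in the quoted reply, a quick way to see the
limit a process actually gets is sketched below. It reports the limit for the
user running the script, so it would need to be run as the user that starts
HBase and the datanode. The helper names and the 1024 threshold are
assumptions for illustration, not values taken from the FAQ.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def looks_low(soft, threshold=1024):
    """True if the soft limit is at or below the assumed threshold."""
    # RLIM_INFINITY means "unlimited" and must not be treated as a small number.
    return soft != resource.RLIM_INFINITY and soft <= threshold


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("open files: soft=%s hard=%s" % (soft, hard))
    if looks_low(soft):
        print("soft limit looks low for a region server or datanode")
```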
