You'll see this if the server reports to the master after the master
has ruled it 'dead'.

Here is the code that produces the exception:

    if (!isDead(serverName)) return;
    String message = "Server " + what + " rejected; currently processing " +
      serverName + " as dead server";
    LOG.debug(message);
    throw new YouAreDeadException(message);

Servers are put on the 'dead' list when zk reports that their session
has expired.  The master then moves to clean up after the dead server
and process its logs.  If the server reports in during this cleanup,
the master returns the YouAreDeadException.
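
To make the sequence concrete, here is a minimal, self-contained sketch of
the bookkeeping described above.  It is not the HBase source: the class
DeadServerCheckSketch, the methods sessionExpired and cleanupFinished, and
the stand-in exception are names made up for illustration.  Only the check
itself mirrors the snippet quoted earlier.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the master-side bookkeeping; not HBase's real classes.
class DeadServerCheckSketch {

    // Stand-in for org.apache.hadoop.hbase.YouAreDeadException.
    static class YouAreDeadException extends Exception {
        YouAreDeadException(String message) {
            super(message);
        }
    }

    // Servers whose zk session has expired and whose cleanup is still running.
    private final Set<String> deadServers = ConcurrentHashMap.newKeySet();

    // Assumed hook: called when zk reports the server's session has expired.
    void sessionExpired(String serverName) {
        deadServers.add(serverName);
    }

    // Assumed hook: called once cleanup and log processing have finished.
    void cleanupFinished(String serverName) {
        deadServers.remove(serverName);
    }

    boolean isDead(String serverName) {
        return deadServers.contains(serverName);
    }

    // Same shape as the quoted check: reject a report from a server on the dead list.
    void checkIsDead(String serverName, String what) throws YouAreDeadException {
        if (!isDead(serverName)) return;
        throw new YouAreDeadException("Server " + what
            + " rejected; currently processing " + serverName + " as dead server");
    }

    public static void main(String[] args) {
        DeadServerCheckSketch master = new DeadServerCheckSketch();
        String rs = "m0002028.ppops.net,60020,1279237223465";
        master.sessionExpired(rs);            // zk says the session is gone
        try {
            master.checkIsDead(rs, "REPORT"); // the server reports in during cleanup
        } catch (YouAreDeadException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is only the ordering: the rejection depends on
whether the report arrives while the server is still on the dead list.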

Usually the RS has lost its zk session but has yet to realize it.
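
One rough way to check for a stall like this is to scan the regionserver
log for long silences between timestamped lines.  The sketch below assumes
log lines start with a timestamp in the form shown in your paste
(yyyy-MM-dd HH:mm:ss,SSS).  The class LogGapSketch, the method
gapsLongerThan and the 60 second threshold are made up for illustration;
pick a threshold that suits your own session timeout.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: reports gaps between consecutive timestamped log lines.
class LogGapSketch {

    // Timestamp layout assumed from the pasted log, e.g. "2010-07-16 05:49:26,805".
    private static final DateTimeFormatter STAMP =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");
    private static final int STAMP_LENGTH = 23;

    static List<Duration> gapsLongerThan(List<String> lines, Duration threshold) {
        List<Duration> gaps = new ArrayList<>();
        LocalDateTime previous = null;
        for (String line : lines) {
            if (line.length() < STAMP_LENGTH) continue;
            LocalDateTime current;
            try {
                current = LocalDateTime.parse(line.substring(0, STAMP_LENGTH), STAMP);
            } catch (DateTimeParseException e) {
                continue; // no leading timestamp, e.g. a stack trace line
            }
            if (previous != null) {
                Duration gap = Duration.between(previous, current);
                if (gap.compareTo(threshold) > 0) gaps.add(gap);
            }
            previous = current;
        }
        return gaps;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "2010-07-16 05:49:26,805 DEBUG LruBlockCache: Cache Stats: ...",
            "2010-07-16 05:53:23,476 DEBUG LruBlockCache: Cache Stats: ...",
            "2010-07-16 05:53:26,171 INFO ClientCnxn: Client session timed out ...");
        for (Duration gap : gapsLongerThan(lines, Duration.ofSeconds(60))) {
            System.out.println("silent for " + gap.toMillis() + " ms");
        }
    }
}
```

A gap in the log does not say what caused the pause, only that the process
wrote nothing during that window.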

St.Ack

On Thu, Jul 15, 2010 at 11:52 PM, Jinsong Hu <[email protected]> wrote:
> Hi, There:
>  I got some YouAreDeadExceptions with HBase. What can cause them? I notice
> that between 5:49 and 5:53, for 4 minutes, there is no log output. This
> doesn't look like a GC issue: I checked the GC log, and the longest GC
> is only 9.6 seconds.
>
> Jimmy.
>
>
> 2010-07-16 05:49:26,805 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=3.355194MB (3518176), Free=405.4198MB (425113472), Max=408.775MB (428631648), Counts: Blocks=1, Access=2178914, Hit=1034, Miss=2177880, Evictions=0, Evicted=0, Ratios: Hit Ratio=0.04745483165606856%, Miss Ratio=99.95254278182983%, Evicted/Run=NaN
> 2010-07-16 05:53:23,476 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=3.355194MB (3518176), Free=405.4198MB (425113472), Max=408.775MB (428631648), Counts: Blocks=1, Access=2178915, Hit=1035, Miss=2177880, Evictions=0, Evicted=0, Ratios: Hit Ratio=0.04750070511363447%, Miss Ratio=99.95250105857849%, Evicted/Run=NaN
>
> ....
> 2010-07-16 05:53:26,171 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 240540ms for sessionid 0x329c88039b0006c, closing socket connection and attempting reconnect
> 2010-07-16 05:53:27,333 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server t-zookeeper2.cloud.ppops.net/10.110.24.57:2181
> 2010-07-16 05:53:27,334 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to t-zookeeper2.cloud.ppops.net/10.110.24.57:2181, initiating session
> 2010-07-16 05:53:27,335 INFO org.apache.zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x329c88039b0006c has expired, closing socket connection
> 2010-07-16 05:53:27,896 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 240520ms for sessionid 0x129c87a7f98007a, closing socket connection and attempting reconnect
>
>
> 2010-07-16 05:53:39,090 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: Aborting region server serverName=m0002028.ppops.net,60020,1279237223465, load=(requests=952, regions=21, usedHeap=575, maxHeap=2043): Unhandled exception
> org.apache.hadoop.hbase.YouAreDeadException: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing m0002028.ppops.net,60020,1279237223465 as dead server
>       at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:217)
>       at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:271)
>       at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:684)
>       at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:576)
>       at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:919)
>
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>
