After each zookeeper reconnect, I saw same session Id (0x703e...)

What version of zookeeper are you using ? Can you search zookeeper log for this 
session Id to see what happened ?

Thanks

On Apr 29, 2013, at 2:36 AM, Viral Bajaria <[email protected]> wrote:

> On Mon, Apr 29, 2013 at 2:25 AM, Ted Yu <[email protected]> wrote:
> 
>> I noticed the 8 occurrences of 0x703e... following region server name in
>> the abort message.
>> I wonder why the repetition ?
>> 
>> Cheers
> 
> Oh good observation. I just stepped through the logs again and saw that the
> client timeout occurred 8 times before that exception. Maybe that explains
> the repetitive occurrence, but that is definitely weird ? Logs pasted below.
> 
> -Viral
> 
> ===logs===
> 2013-04-29 07:41:03,227 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Serving as
> regionserver1,60020,1367221255417, RPC listening on regionserver1/
> 10.155.234.3:60020, sessionid=0x703e48a8cfd81be6
> 2013-04-29 07:41:03,228 INFO
> org.apache.hadoop.hbase.regionserver.SplitLogWorker: SplitLogWorker
> regionserver1,60020,1367221255417 starting
> 2013-04-29 07:41:03,230 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Registered RegionServer
> MXBean
> 2013-04-29 07:41:05,967 INFO org.apache.hadoop.hbase.util.ChecksumType:
> Checksum can use java.util.zip.CRC32
> 2013-04-29 07:43:47,739 INFO org.apache.zookeeper.ClientCnxn: Client
> session timed out, have not heard from server in 27868ms for sessionid
> 0x703e48a8cfd81be6, closing socket connection and attempting reconnect
> 2013-04-29 07:43:48,776 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket connection to server 10.165.33.180:2181. Will not attempt to
> authenticate using SASL (Unable to locate a login configuration)
> 2013-04-29 07:43:48,777 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to 10.165.33.180:2181, initiating session
> 2013-04-29 07:43:48,782 INFO org.apache.zookeeper.ClientCnxn: Session
> establishment complete on server 10.165.33.180:2181, sessionid =
> 0x703e48a8cfd81be6, negotiated timeout = 40000
> 2013-04-29 07:44:28,337 INFO org.apache.zookeeper.ClientCnxn: Client
> session timed out, have not heard from server in 39427ms for sessionid
> 0x703e48a8cfd81be6, closing socket connection and attempting reconnect
> 2013-04-29 07:44:28,735 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket connection to server 10.152.152.84:2181. Will not attempt to
> authenticate using SASL (Unable to locate a login configuration)
> 2013-04-29 07:44:28,736 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to 10.152.152.84:2181, initiating session
> 2013-04-29 07:44:28,738 INFO org.apache.zookeeper.ClientCnxn: Session
> establishment complete on server 10.152.152.84:2181, sessionid =
> 0x703e48a8cfd81be6, negotiated timeout = 40000
> 2013-04-29 07:46:29,080 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=98.34 MB,
> free=11.61 GB, max=11.71 GB, blocks=0, accesses=0, hits=0, hitRatio=0,
> cachingAccesses=0, cachingHits=0, cachingHitsRatio=0, evictions=0,
> evicted=0, evictedPerRun=NaN
> 2013-04-29 07:46:29,174 INFO org.apache.zookeeper.ClientCnxn: Client
> session timed out, have not heard from server in 40354ms for sessionid
> 0x703e48a8cfd81be6, closing socket connection and attempting reconnect
> 2013-04-29 07:46:29,682 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket connection to server 10.147.128.158:2181. Will not attempt to
> authenticate using SASL (Unable to locate a login configuration)
> 2013-04-29 07:46:29,682 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to 10.147.128.158:2181, initiating session
> 2013-04-29 07:46:29,684 INFO org.apache.zookeeper.ClientCnxn: Session
> establishment complete on server 10.147.128.158:2181, sessionid =
> 0x703e48a8cfd81be6, negotiated timeout = 40000
> 2013-04-29 07:49:10,404 INFO org.apache.zookeeper.ClientCnxn: Client
> session timed out, have not heard from server in 39876ms for sessionid
> 0x703e48a8cfd81be6, closing socket connection and attempting reconnect
> 2013-04-29 07:49:11,503 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket connection to server 10.165.33.180:2181. Will not attempt to
> authenticate using SASL (Unable to locate a login configuration)
> 2013-04-29 07:49:11,504 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to 10.165.33.180:2181, initiating session
> 2013-04-29 07:49:11,506 INFO org.apache.zookeeper.ClientCnxn: Session
> establishment complete on server 10.165.33.180:2181, sessionid =
> 0x703e48a8cfd81be6, negotiated timeout = 40000
> 2013-04-29 07:51:11,462 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=98.34 MB,
> free=11.61 GB, max=11.71 GB, blocks=0, accesses=0, hits=0, hitRatio=0,
> cachingAccesses=0, cachingHits=0, cachingHitsRatio=0, evictions=0,
> evicted=0, evictedPerRun=NaN
> 2013-04-29 07:53:52,158 INFO org.apache.zookeeper.ClientCnxn: Client
> session timed out, have not heard from server in 40195ms for sessionid
> 0x703e48a8cfd81be6, closing socket connection and attempting reconnect
> 2013-04-29 07:53:52,555 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket connection to server 10.152.152.84:2181. Will not attempt to
> authenticate using SASL (Unable to locate a login configuration)
> 2013-04-29 07:53:52,556 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to 10.152.152.84:2181, initiating session
> 2013-04-29 07:53:52,558 INFO org.apache.zookeeper.ClientCnxn: Session
> establishment complete on server 10.152.152.84:2181, sessionid =
> 0x703e48a8cfd81be6, negotiated timeout = 40000
> 2013-04-29 07:55:53,052 INFO org.apache.zookeeper.ClientCnxn: Client
> session timed out, have not heard from server in 26955ms for sessionid
> 0x703e48a8cfd81be6, closing socket connection and attempting reconnect
> 2013-04-29 07:55:53,444 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket connection to server 10.147.128.158:2181. Will not attempt to
> authenticate using SASL (Unable to locate a login configuration)
> 2013-04-29 07:55:53,444 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to 10.147.128.158:2181, initiating session
> 2013-04-29 07:55:53,446 INFO org.apache.zookeeper.ClientCnxn: Session
> establishment complete on server 10.147.128.158:2181, sessionid =
> 0x703e48a8cfd81be6, negotiated timeout = 40000
> 2013-04-29 07:55:55,512 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=98.34 MB,
> free=11.61 GB, max=11.71 GB, blocks=0, accesses=0, hits=0, hitRatio=0,
> cachingAccesses=0, cachingHits=0, cachingHitsRatio=0, evictions=0,
> evicted=0, evictedPerRun=NaN
> 2013-04-29 07:58:33,950 INFO org.apache.zookeeper.ClientCnxn: Client
> session timed out, have not heard from server in 39972ms for sessionid
> 0x703e48a8cfd81be6, closing socket connection and attempting reconnect
> 2013-04-29 07:58:35,096 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket connection to server 10.165.33.180:2181. Will not attempt to
> authenticate using SASL (Unable to locate a login configuration)
> 2013-04-29 07:58:35,097 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to 10.165.33.180:2181, initiating session
> 2013-04-29 07:58:35,098 INFO org.apache.zookeeper.ClientCnxn: Session
> establishment complete on server 10.165.33.180:2181, sessionid =
> 0x703e48a8cfd81be6, negotiated timeout = 40000
> 2013-04-29 08:00:55,512 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=98.34 MB,
> free=11.61 GB, max=11.71 GB, blocks=0, accesses=0, hits=0, hitRatio=0,
> cachingAccesses=0, cachingHits=0, cachingHitsRatio=0, evictions=0,
> evicted=0, evictedPerRun=NaN
> 2013-04-29 08:02:35,674 INFO org.apache.zookeeper.ClientCnxn: Client
> session timed out, have not heard from server in 40592ms for sessionid
> 0x703e48a8cfd81be6, closing socket connection and attempting reconnect
> 2013-04-29 08:02:36,286 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket connection to server 10.152.152.84:2181. Will not attempt to
> authenticate using SASL (Unable to locate a login configuration)
> 2013-04-29 08:02:36,287 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to 10.152.152.84:2181, initiating session
> 2013-04-29 08:02:36,288 INFO org.apache.zookeeper.ClientCnxn: Unable to
> reconnect to ZooKeeper service, session 0x703e48a8cfd81be6 has expired,
> closing socket connection
> 2013-04-29 08:03:16,287 FATAL
> org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server
> regionserver1,60020,1367221255417:
> regionserver:60020-0x703e48a8cfd81be6-0x703e48a8cfd81be6-0x703e48a8cfd81be6-0x703e48a8cfd81be6-0x703e48a8cfd81be6-0x703e48a8cfd81be6-0x703e48a8cfd81be6-0x703e48a8cfd81be6
> regionserver:60020-0x703e48a8cfd81be6-0x703e48a8cfd81be6-0x703e48a8cfd81be6-0x703e48a8cfd81be6-0x703e48a8cfd81be6-0x703e48a8cfd81be6-0x703e48a8cfd81be6-0x703e48a8cfd81be6
> received expired from ZooKeeper, aborting
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired
>        at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:389)
>        at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:286)
>        at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
>        at
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> 2013-04-29 08:03:16,288 FATAL
> org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server
> regionserver1,60020,1367221255417: Unhandled exception:
> org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected;
> currently processing regionserver1,60020,1367221255417 as dead server
> org.apache.hadoop.hbase.YouAreDeadException:
> org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected;
> currently processing regionserver1,60020,1367221255417 as dead server
>        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
>        at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>        at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>        at
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
>        at
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79)
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:880)
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:748)
>        at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.hadoop.ipc.RemoteException:
> org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected;
> currently processing regionserver1,60020,1367221255417 as dead server
>        at
> org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:254)
>        at
> org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:172)
>        at
> org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:984)
>        at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
> 
>        at
> org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1021)
>        at
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
>        at $Proxy8.regionServerReport(Unknown Source)

Reply via email to