master sees HRS znode expire and splits log while the HRS is still running and 
accepting edits
----------------------------------------------------------------------------------------------

                 Key: HBASE-1314
                 URL: https://issues.apache.org/jira/browse/HBASE-1314
             Project: Hadoop HBase
          Issue Type: Bug
    Affects Versions: 0.20.0
            Reporter: Andrew Purtell


This is a ZK session expiration problem. The HRS loses its ephemeral node while
it is still up, running, and accepting edits. The master sees the znode go away
and starts splitting the HRS's logs. After this, all reconstruction logs have to
be manually removed from the region directories or the regions will never deploy
(CRC errors). I think that on HDFS the edits would be lost rather than
corrupted. (I am using an HBase root on the local file system.)
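
To make the suspected failure mode concrete, here is a minimal toy model of the
race. It is only a sketch and assumes that the master splits whatever the log
contains at the moment it reacts to the expired znode, while the HRS keeps
appending. The class, method, and event names are hypothetical and are not HBase
code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the race described above; nothing here is an HBase class. */
final class ZnodeExpiryRaceSketch {

    /**
     * Replays events in order and returns the edits the split never saw.
     * "edit:NAME" means the (still running) HRS appends an edit to its log.
     * "znode-expired" means the master notices the expiry and starts the split.
     * Assumption: the split covers only edits logged before it started.
     */
    public static List<String> editsMissedBySplit(List<String> events) {
        List<String> log = new ArrayList<>();
        int editsSeenBySplit = -1; // -1 means the master never started a split

        for (String event : events) {
            if (event.equals("znode-expired")) {
                if (editsSeenBySplit < 0) {
                    editsSeenBySplit = log.size();
                }
            } else if (event.startsWith("edit:")) {
                // The HRS does not know its znode is gone, so it keeps accepting edits.
                log.add(event.substring("edit:".length()));
            }
        }

        if (editsSeenBySplit < 0) {
            return new ArrayList<>(); // no split happened, nothing was missed
        }
        return new ArrayList<>(log.subList(editsSeenBySplit, log.size()));
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList(
                "edit:a", "edit:b", "znode-expired", "edit:c", "edit:d");
        // Edits c and d arrive after the split started, so the split misses them.
        System.out.println(editsMissedBySplit(events)); // prints [c, d]
    }
}
```

In this model the edits accepted after the expiry are exactly the ones at risk,
which matches the guess above that on HDFS they would be lost.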

HRS ZK session expires, causing its znode to go away:

2009-04-07 03:50:39,953 INFO org.apache.hadoop.hbase.master.ServerManager: 
localhost.localdomain_1239068648333_60020 znode expired
2009-04-07 03:50:40,565 DEBUG org.apache.hadoop.hbase.master.HMaster: 
Processing todo: ProcessServerShutdown of 
localhost.localdomain_1239068648333_60020
2009-04-07 03:50:40,637 INFO 
org.apache.hadoop.hbase.master.RegionServerOperation: process shutdown of 
server localhost.localdomain_1239068648333_60020: logSplit: false, 
rootRescanned: false, numberOfMetaRegions: 1, onlineMetaRegions.size(): 1

But here we have the HRS still reporting in, triggering a 
LeaseStillHeldException:

2009-04-07 03:50:40,826 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 2 on 60000, call regionServerReport(address: 127.0.0.1:60020, 
startcode: 1239068648333, load: (requests=14, regions=7, usedHeap=582, 
maxHeap=888), [Lorg.apache.hadoop.hbase.HMsg;@6da21389, 
[Lorg.apache.hadoop.hbase.HRegionInfo;@2bb0bf9a) from 127.0.0.1:39238: error: 
org.apache.hadoop.hbase.Leases$LeaseStillHeldException
org.apache.hadoop.hbase.Leases$LeaseStillHeldException
        at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:198)
        at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:601)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:909)
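
The report above comes from the same server instance whose znode expired: the
startcode (1239068648333) and port (60020) in the regionServerReport call match
the expired znode name. The sketch below checks this mechanically. It assumes
the name has the form host_startcode_port, which is inferred from the log lines
here and not taken from the HBase source. The class and method names are
hypothetical.

```java
/** Hypothetical helper for comparing log entries; not part of HBase. */
final class ServerNameSketch {

    /**
     * Splits a name of the assumed form host_startcode_port.
     * Parsing goes from the right so a host containing '_' still works.
     * Returns {host, startcode, port} as strings.
     */
    public static String[] parse(String serverName) {
        int portSep = serverName.lastIndexOf('_');
        int codeSep = serverName.lastIndexOf('_', portSep - 1);
        if (codeSep < 0) {
            throw new IllegalArgumentException("not host_startcode_port: " + serverName);
        }
        return new String[] {
            serverName.substring(0, codeSep),
            serverName.substring(codeSep + 1, portSep),
            serverName.substring(portSep + 1)
        };
    }

    /** True if the znode name and the report refer to the same startcode and port. */
    public static boolean sameInstance(String znodeName, long reportStartcode, int reportPort) {
        String[] parts = parse(znodeName);
        return Long.parseLong(parts[1]) == reportStartcode
                && Integer.parseInt(parts[2]) == reportPort;
    }

    public static void main(String[] args) {
        // Values copied from the log excerpts in this report.
        String expired = "localhost.localdomain_1239068648333_60020";
        System.out.println(sameInstance(expired, 1239068648333L, 60020)); // prints true
    }
}
```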

And log splitting starts anyway:

2009-04-07 03:50:41,139 INFO org.apache.hadoop.hbase.regionserver.HLog: 
Splitting 3 log(s) in 
file:/data/hbase/log_localhost.localdomain_1239068648333_60020
2009-04-07 03:50:41,139 DEBUG org.apache.hadoop.hbase.regionserver.HLog: 
Splitting 1 of 3: 
file:/data/hbase/log_localhost.localdomain_1239068648333_60020/hlog.dat.1239075060711
[...]

