[ https://issues.apache.org/jira/browse/HBASE-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-5781:
----------------------------------

    Attachment: hbase-5781.patch

Implemented the fix suggested in the conversation.  Applied it to a 0.92.x-based
HBase and confirmed that hbck's assignment operations worked.

* Fixed a borked test cluster
* On a healthy cluster, used the hbase shell to close a region, ran the updated
 hbck to verify that the problem was detected, then ran 'hbck -fix' to fix the
 assignment; the problem was repaired.

Note: for this to pass on trunk, HBASE-5993 is needed as well.
                
> Zookeeper session got closed while trying to assign the region to RS using 
> hbck -fix
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-5781
>                 URL: https://issues.apache.org/jira/browse/HBASE-5781
>             Project: HBase
>          Issue Type: Bug
>          Components: hbck
>    Affects Versions: 0.90.7, 0.92.1, 0.94.0, 0.96.0
>            Reporter: Kristam Subba Swathi
>            Assignee: Jonathan Hsieh
>            Priority: Critical
>             Fix For: 0.94.0
>
>         Attachments: hbase-5781.patch
>
>
> After running hbck on the cluster, it was found that one region was not
> assigned.
> So hbck -fix was used to fix this.
> But the assignment didn't happen because the ZooKeeper session was closed.
> Please find the attached trace for more details.
> -----------------------------------------
> Trying to fix unassigned region...
> 12/04/03 11:02:57 INFO util.HBaseFsckRepair: Region still in transition, 
> waiting for it to become assigned: {NAME => 
> 'ufdr,002300,1333379123498.00871fbd7583512e12c4eb38e900be8d.', STARTKEY => 
> '002300', ENDKEY => '002311', ENCODED => 00871fbd7583512e12c4eb38e900be8d,}
> 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: 
> Closed zookeeper sessionid=0x236738a2630000a
> 12/04/03 11:02:58 INFO zookeeper.ZooKeeper: Session: 0x236738a2630000a closed
> ERROR: Region { meta => 
> ufdr,010444,1333379123857.01594219211d0035b9586f98954462e1., hdfs => 
> hdfs://10.18.40.25:9000/hbase/ufdr/01594219211d0035b9586f98954462e1, deployed 
> => } not deployed on any region server.
> Trying to fix unassigned region...
> 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: EventThread shut down
> 12/04/03 11:02:58 WARN zookeeper.ZKUtil: hconnection-0x236738a2630000a Unable 
> to set watcher on znode (/hbase)
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
> = Session expired for /hbase
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626)
> at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211)
> at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.forceOfflineInZK(HBaseFsckRepair.java:109)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.fixUnassigned(HBaseFsckRepair.java:92)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:1235)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1351)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1114)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:356)
> at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:375)
> at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2894)
> 12/04/03 11:02:58 ERROR zookeeper.ZooKeeperWatcher: 
> hconnection-0x236738a2630000a Received unexpected KeeperException, 
> re-throwing exception
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
> = Session expired for /hbase
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626)
> at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211)
> at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.forceOfflineInZK(HBaseFsckRepair.java:109)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.fixUnassigned(HBaseFsckRepair.java:92)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:1235)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1351)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1114)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:356)
> at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:375)
> at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2894)
> 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: 
> This client just lost it's session with ZooKeeper, trying to reconnect.
> 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: 
> Trying to reconnect to zookeeper
> 12/04/03 11:02:58 INFO zookeeper.ZooKeeper: Initiating client connection, 
> connectString=10.18.40.21:2181,10.18.40.25:2181,10.18.40.93:2181 
> sessionTimeout=60000 watcher=hconnection
> 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: Opening socket connection to 
> server /10.18.40.93:2181
> 12/04/03 11:02:58 INFO zookeeper.RecoverableZooKeeper: The identifier of this 
> process is 18333@HOST-10-18-40-93
> 12/04/03 11:02:58 WARN client.ZooKeeperSaslClient: SecurityException: 
> java.lang.SecurityException: Unable to locate a login configuration occurred 
> when trying to find JAAS configuration.
> 12/04/03 11:02:58 INFO client.ZooKeeperSaslClient: Client will not 
> SASL-authenticate because the default JAAS configuration section 'Client' 
> could not be found. If you are not using SASL, you may ignore this. On the 
> other hand, if you expected SASL to work, please fix your JAAS configuration.
> 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: Socket connection established to 
> HOST-10-18-40-93/10.18.40.93:2181, initiating session
> 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: Session establishment complete 
> on server HOST-10-18-40-93/10.18.40.93:2181, sessionid = 0x3367392d5140018, 
> negotiated timeout = 40000
> 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: 
> Reconnected successfully. This disconnect could have been caused by a network 
> partition or a long-running GC pause, either way it's recommended that you 
> verify your environment.
> Exception in thread "main" org.apache.hadoop.hbase.MasterNotRunningException
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:686)
> at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211)
> at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.forceOfflineInZK(HBaseFsckRepair.java:109)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.fixUnassigned(HBaseFsckRepair.java:92)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:1235)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1351)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1114)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:356)
> at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:375)
> at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2894)
> Please find the attached file for more details.
>  
