[ https://issues.apache.org/jira/browse/HBASE-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15021535#comment-15021535 ]
Sumit Nigam commented on HBASE-8675:
------------------------------------
Is this error guaranteed to be caused by the Kerberos server being
unreachable? I have a similar problem, but my error message is:
15/11/15 15:46:53 ERROR client.ZooKeeperSaslClient: An error:
(java.security.PrivilegedActionException: javax.security.sasl.SaslException:
GSS initiate failed [Caused by GSSException: No valid credentials provided
(Mechanism level: Connection reset)]) occurred when evaluating Zookeeper Quorum
Member's received SASL token. Zookeeper Client will go to AUTH_FAILED state.
15/11/15 15:46:53 ERROR zookeeper.ClientCnxn: SASL authentication with
Zookeeper Quorum member failed: javax.security.sasl.SaslException: An error:
(java.security.PrivilegedActionException: javax.security.sasl.SaslException:
GSS initiate failed [Caused by GSSException: No valid credentials provided
(Mechanism level: Connection reset)]) occurred when evaluating Zookeeper Quorum
Member's received SASL token. Zookeeper Client will go to AUTH_FAILED state.
The mechanism level points to a connection reset. Is that error being reported
for the connection to the Kerberos server, or for the ZooKeeper client's
inability to connect to the ZooKeeper quorum?
> Two active Hmasters for AUTH_FAILED in secure hbase cluster
> -----------------------------------------------------------
>
> Key: HBASE-8675
> URL: https://issues.apache.org/jira/browse/HBASE-8675
> Project: HBase
> Issue Type: Bug
> Components: master
> Reporter: Liu Shaohui
> Priority: Critical
> Attachments: HBASE-8675-0.94-v1.patch
>
>
> In our production cluster, because of a network problem reaching the Kerberos
> server, the ZooKeeperWatcher in the active HMaster fails to authenticate, gets
> an AUTH_FAILED connection event, and loses the master lock. But the ZooKeeper
> watcher ignores the event, so the old active HMaster stays active. After the
> network problem is fixed, the backup HMaster gets the master lock and becomes
> active. There are then two active HMasters in the cluster.
> 2013-05-30 09:44:21,004 ERROR
> org.apache.zookeeper.client.ZooKeeperSaslClient: An error:
> (java.security.PrivilegedActionException: javax.security.sasl.SaslException:
> GSS initiate failed [Caused by GSSException: No valid credentials provided
> (Mechanism level: krb1.xiaomi.net)]) occurred when evaluating Zookeeper
> Quorum Member's received SASL token. Zookeeper Client will go to AUTH_FAILED
> state.
> 2013-05-30 09:54:07,755 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil:
> hconnection-0x3e10d98be405bc Unable to set watcher on znode /hbase/master
> org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode =
> AuthFailed for /hbase/master
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:123)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1036)
> at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:166)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:231)
> at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:76)
> at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.ensureZookeeperTrackers(HConnectionManager.java:595)
> at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:850)
> at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:825)
> at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:286)
> at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:201)
> at org.apache.hadoop.hbase.catalog.MetaReader.getHTable(MetaReader.java:200)
> at org.apache.hadoop.hbase.catalog.MetaReader.getMetaHTable(MetaReader.java:226)
> at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:705)
> at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:183)
> at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:168)
> at org.apache.hadoop.hbase.master.CatalogJanitor.getSplitParents(CatalogJanitor.java:123)
> at org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:134)
> at org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:92)
> at org.apache.hadoop.hbase.Chore.run(Chore.java:67)
> at java.lang.Thread.run(Thread.java:662)
> I want to simply abort the HMaster server on AuthFailed or SaslAuthenticated.
> Is there a better idea for handling this issue?
> Since ZooKeeperWatcher is used in many classes, will aborting cause more
> problems? Are there any other problems we need to consider?
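
For readers following the thread, the abort-on-AUTH_FAILED idea in the description could be sketched roughly as below. This is only an illustration and not the attached patch: `SessionState`, `Abortable` and `AbortOnAuthFailedWatcher` are hypothetical stand-ins so that the example needs no ZooKeeper or HBase jars, and it assumes the process owner exposes some kind of abort hook. The sketch aborts only on the authentication failure state; whether any other state should also abort is left open.

```java
// Hypothetical stand-in for ZooKeeper's connection states; only the
// states relevant to this issue are listed.
enum SessionState { SYNC_CONNECTED, DISCONNECTED, EXPIRED, AUTH_FAILED }

// Hypothetical abort hook, assumed to be implemented by the master process.
interface Abortable {
    void abort(String why);
}

// Sketch of a watcher that aborts its owner on AUTH_FAILED
// instead of silently ignoring the event.
final class AbortOnAuthFailedWatcher {
    private final Abortable owner;

    AbortOnAuthFailedWatcher(Abortable owner) {
        this.owner = owner;
    }

    // Handles one connection event; returns true if it triggered an abort.
    boolean onConnectionEvent(SessionState state) {
        if (state == SessionState.AUTH_FAILED) {
            // The session can no longer be trusted to hold the master lock,
            // so stop rather than keep acting as the active master.
            owner.abort("ZooKeeper session entered " + state);
            return true;
        }
        return false;
    }
}

final class AbortOnAuthFailedDemo {
    public static void main(String[] args) {
        AbortOnAuthFailedWatcher watcher =
            new AbortOnAuthFailedWatcher(why -> System.out.println("ABORT: " + why));
        watcher.onConnectionEvent(SessionState.SYNC_CONNECTED); // ignored
        watcher.onConnectionEvent(SessionState.AUTH_FAILED);    // triggers the abort hook
    }
}
```

Because the same watcher class is shared by many components, a real change would also have to decide which owners should abort and which should only log the event, which is the concern raised at the end of the description.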
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)