Josh Elser created HBASE-27044:
----------------------------------
Summary: Serialized procedures which point to users from other
Kerberos domains can prevent master startup
Key: HBASE-27044
URL: https://issues.apache.org/jira/browse/HBASE-27044
Project: HBase
Issue Type: Bug
Components: proc-v2
Reporter: Josh Elser
We ran into an interesting bug when test teams were running HBase against cloud
storage without ensuring that the previous location was cleaned. This resulted
in an hbase.rootdir that had:
* A valid HBase MasterData Region
* A valid hbase:meta
* A valid collection of HBase tables
* An empty ZooKeeper
The changes we worked on previously, described in HBASE-24286, were effective
in getting everything _except_ the Procedures back online without issue.
Parsing the existing procedures produced an interesting error:
{noformat}
java.lang.IllegalArgumentException: Illegal principal name hbase/wrong-hostname.domain@WRONG_REALM: org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: No rules applied to hbase/wrong-hostname.domain@WRONG_REALM
    at org.apache.hadoop.security.User.<init>(User.java:51)
    at org.apache.hadoop.security.User.<init>(User.java:43)
    at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1418)
    at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1402)
    at org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.toUserInfo(MasterProcedureUtil.java:60)
    at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.deserializeStateData(ModifyTableProcedure.java:262)
    at org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:294)
    at org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43)
    at org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:411)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$400(ProcedureExecutor.java:78)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$2.load(ProcedureExecutor.java:339)
    at org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:285)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:330)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:600)
    at org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1581)
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:835)
    at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2205)
    at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:514)
    at java.lang.Thread.run(Thread.java:750)
{noformat}
What's actually happening is that we are storing the {{User}} into the
procedure and then relying on UserGroupInformation to parse the {{User}}
protobuf into a UGI to get the "short" username.
When the serialized procedure (whether in the MasterData region or via PV2
WAL files, I think) gets loaded, we end up needing Hadoop auth_to_local
configuration to be able to parse that Kerberos principal back to a name.
However, Hadoop's KerberosName will only unwrap Kerberos principals which match
the local Kerberos realm (defined by the krb5.conf's default_realm,
[ref|https://github.com/frohoff/jdk8u-jdk/blob/master/src/share/classes/sun/security/krb5/Config.java#L978-L983]).
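To illustrate the realm check, here is a minimal, self-contained sketch. It does not call Hadoop at all: {{PrincipalShortName}}, {{strictShortName}}, {{lenientShortName}} and the {{EXAMPLE.COM}} realm are hypothetical names, and the strict method is only a simplified approximation of how a default auth_to_local mapping treats a principal from another realm. The lenient variant shows one way a caller could tolerate such a principal; it is not what HBase does today.

```java
import java.util.Optional;

/**
 * Hypothetical, simplified stand-in for the realm check described in this issue.
 * No Hadoop classes are used; this only mimics the observable behavior.
 */
class PrincipalShortName {

    /**
     * Maps primary[/instance]@REALM to primary, but only when REALM equals the
     * local realm. An empty result corresponds to the "No rules applied" failure.
     */
    static Optional<String> strictShortName(String principal, String localRealm) {
        int at = principal.lastIndexOf('@');
        if (at < 0) {
            return Optional.empty(); // this sketch only handles realm-qualified principals
        }
        String realm = principal.substring(at + 1);
        if (!realm.equals(localRealm)) {
            return Optional.empty(); // foreign realm: no rule matches
        }
        return Optional.of(primaryOf(principal.substring(0, at)));
    }

    /**
     * Assumed lenient variant: if the strict mapping fails, fall back to the
     * primary component instead of throwing.
     */
    static String lenientShortName(String principal, String localRealm) {
        return strictShortName(principal, localRealm).orElseGet(() -> {
            int at = principal.lastIndexOf('@');
            return primaryOf(at < 0 ? principal : principal.substring(0, at));
        });
    }

    /** Returns the part before the first '/', or the whole name if there is none. */
    private static String primaryOf(String nameWithoutRealm) {
        int slash = nameWithoutRealm.indexOf('/');
        return slash < 0 ? nameWithoutRealm : nameWithoutRealm.substring(0, slash);
    }

    public static void main(String[] args) {
        String foreign = "hbase/wrong-hostname.domain@WRONG_REALM";
        // The strict mapping finds no match for a principal from another realm.
        System.out.println(strictShortName(foreign, "EXAMPLE.COM").isPresent());
        // The lenient mapping still yields a displayable owner name.
        System.out.println(lenientShortName(foreign, "EXAMPLE.COM"));
    }
}
```

Since the owner appears to be used only for display, a fallback along these lines would likely be enough to let procedure deserialization proceed, though I have not verified that against the actual code paths.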
The interesting part is that we don't seem to ever use the user _other_ than to
display the {{owner}} attribute for procedures on the HBase UI. There is a
method in hbase-procedure which can filter procedures based on Owner, but I
didn't see any usages of that method.
Given the pushback against HBASE-24286, I assume that, for the same reasons, we
would see pushback against fixing this issue. However, I wanted to call it out
for posterity. The expectation of users is that HBase _should_ implicitly
handle this case.