[
https://issues.apache.org/jira/browse/HBASE-27044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17537812#comment-17537812
]
Josh Elser commented on HBASE-27044:
------------------------------------
We could make a fairly naive change here: return an "unknown" {{User}} when we
fail to parse the serialized protobuf. That would be enough to fix this problem
on the surface.
However, I think this change is missing the root of the problem (the
expectation that HBase should just be able to "reattach" itself to an
hbase.rootdir).
I can't think of any way in which the above exception would be thrown other
than the cloud storage reattachment case I described. I'm happy to put up a
patch to gracefully handle the failure to create the UGI if folks think there
is merit in that.
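
A minimal sketch of what that graceful handling might look like, assuming the owner is only needed for display. This is not the actual HBase code: {{shortNameOrUnknown}}, {{UNKNOWN_OWNER}} and the {{resolver}} parameter are hypothetical names, and the resolver stands in for the UGI-based lookup that throws in the trace quoted below.

```java
import java.util.function.Function;

// Hypothetical sketch: fall back to a placeholder owner instead of failing
// procedure loading when the stored principal cannot be resolved.
final class ProcedureOwnerFallback {
    // Assumed placeholder value; the real name would be decided in a patch.
    static final String UNKNOWN_OWNER = "unknown";

    // 'resolver' stands in for the UGI-based short-name lookup. In the quoted
    // trace the failure surfaces as an IllegalArgumentException, so that is
    // the only exception this sketch catches.
    static String shortNameOrUnknown(String principal,
                                     Function<String, String> resolver) {
        try {
            return resolver.apply(principal);
        } catch (IllegalArgumentException e) {
            // A real patch would presumably log the principal here.
            return UNKNOWN_OWNER;
        }
    }

    public static void main(String[] args) {
        // Resolver that succeeds: the resolved name is passed through.
        System.out.println(shortNameOrUnknown(
            "hbase/host.example.com@EXAMPLE.COM", p -> "hbase"));

        // Resolver that fails the way the stack trace does: we get the
        // placeholder back instead of an exception.
        System.out.println(shortNameOrUnknown(
            "hbase/wrong-hostname.domain@WRONG_REALM",
            p -> { throw new IllegalArgumentException(
                "Illegal principal name " + p); }));
    }
}
```

The trade-off is that the procedure would load with an owner that no longer reflects who submitted it, which seems acceptable only if the owner really is display-only.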
> Serialized procedures which point to users from other Kerberos domains can
> prevent master startup
> -------------------------------------------------------------------------------------------------
>
> Key: HBASE-27044
> URL: https://issues.apache.org/jira/browse/HBASE-27044
> Project: HBase
> Issue Type: Bug
> Components: proc-v2
> Reporter: Josh Elser
> Priority: Major
>
> We ran into an interesting bug when test teams were running HBase against
> cloud storage without ensuring that the previous location was cleaned. This
> resulted in an hbase.rootdir that had:
> * A valid HBase MasterData Region
> * A valid hbase:meta
> * A valid collection of HBase tables
> * An empty ZooKeeper
> The changes we worked on previously, described in HBASE-24286, were effective
> in getting everything _except_ the Procedures back online without issue.
> Parsing the existing procedures produced an interesting error:
> {noformat}
> java.lang.IllegalArgumentException: Illegal principal name
> hbase/wrong-hostname.domain@WRONG_REALM:
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule:
> No rules applied to hbase/wrong-hostname.domain@WRONG_REALM
> at org.apache.hadoop.security.User.<init>(User.java:51)
> at org.apache.hadoop.security.User.<init>(User.java:43)
> at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1418)
> at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1402)
> at org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.toUserInfo(MasterProcedureUtil.java:60)
> at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.deserializeStateData(ModifyTableProcedure.java:262)
> at org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:294)
> at org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43)
> at org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:411)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$400(ProcedureExecutor.java:78)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$2.load(ProcedureExecutor.java:339)
> at org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:285)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:330)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:600)
> at org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1581)
> at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:835)
> at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2205)
> at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:514)
> at java.lang.Thread.run(Thread.java:750)
> {noformat}
> What's actually happening is that we are storing the {{User}} into the
> procedure and then relying on UserGroupInformation to parse the {{User}}
> protobuf into a UGI to get the "short" username.
> When the serialized procedure (whether in the MasterData region or via PV2
> WAL files, I think) gets loaded, we end up needing Hadoop auth_to_local
> configuration to be able to parse that Kerberos principal back to a name.
> However, Hadoop's KerberosName will only unwrap Kerberos principals which
> match the local Kerberos realm (defined by the krb5.conf's default_realm,
> [ref|https://github.com/frohoff/jdk8u-jdk/blob/master/src/share/classes/sun/security/krb5/Config.java#L978-L983]).
> The interesting part is that we don't seem to ever use the user _other_ than
> to display the {{owner}} attribute for procedures on the HBase UI. There is a
> method in hbase-procedure which can filter procedures based on Owner, but I
> didn't see any usages of that method.
> Given the pushback against HBASE-24286, I assume that, for the same reasons,
> we would see pushback against fixing this issue. However, I wanted to call it
> out for posterity. The expectation of users is that HBase _should_ implicitly
> handle this case.
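
To illustrate the realm check described in the quoted description, here is a simplified model of the behaviour, assuming only a default rule is in effect. It is not Hadoop's {{KerberosName}} implementation: {{toShortName}} and its {{defaultRealm}} parameter are hypothetical, and the exception type only mimics the message seen in the trace.

```java
// Simplified, hypothetical model of "only principals in the local realm
// can be shortened". Not Hadoop's KerberosName code.
final class DefaultRealmRuleSketch {

    // Splits 'primary[/instance]@REALM' and returns 'primary' when REALM
    // equals defaultRealm; otherwise fails like the trace in this issue.
    static String toShortName(String principal, String defaultRealm) {
        int at = principal.lastIndexOf('@');
        if (at < 0) {
            // No realm component: treat the name as already short.
            return principal;
        }
        String realm = principal.substring(at + 1);
        if (!realm.equals(defaultRealm)) {
            // Message mirrors the trace; the exception type is an assumption.
            throw new IllegalArgumentException(
                "No rules applied to " + principal);
        }
        String name = principal.substring(0, at);
        int slash = name.indexOf('/');
        // Drop the optional '/instance' (hostname) part.
        return slash < 0 ? name : name.substring(0, slash);
    }

    public static void main(String[] args) {
        // Principal from the cluster's own realm resolves to its primary.
        System.out.println(
            toShortName("hbase/host.example.com@EXAMPLE.COM", "EXAMPLE.COM"));

        // Principal written by a cluster in another realm cannot be resolved.
        try {
            toShortName("hbase/wrong-hostname.domain@WRONG_REALM",
                        "EXAMPLE.COM");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this model, a procedure serialized by a master in one realm cannot have its owner resolved by a master in a different realm unless a matching auth_to_local rule is configured.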