[ 
https://issues.apache.org/jira/browse/HBASE-27044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17537812#comment-17537812
 ] 

Josh Elser commented on HBASE-27044:
------------------------------------

We could make a fairly naive change here: return an "unknown" {{User}} when we 
fail to parse the serialized protobuf. That would be enough to fix this problem 
on the surface.
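
A minimal sketch of what that fallback could look like, using only the JDK so it runs on its own. Everything named here ({{ProcedureOwnerNames}}, {{shortNameOrUnknown}}, {{strictResolver}}, the {{EXAMPLE.COM}} realm) is hypothetical and not HBase or Hadoop API; the resolver just stands in for the UGI call that throws in the stack trace.

```java
import java.util.function.Function;

/** Hypothetical helper for illustration only; not part of HBase. */
final class ProcedureOwnerNames {
  static final String UNKNOWN_OWNER = "unknown";

  private ProcedureOwnerNames() {}

  /**
   * Resolves a short name for a serialized principal. If the resolver
   * rejects the principal, a placeholder owner is returned instead of
   * letting the exception propagate.
   */
  static String shortNameOrUnknown(String principal, Function<String, String> resolver) {
    try {
      return resolver.apply(principal);
    } catch (IllegalArgumentException e) {
      // The reported trace surfaces as IllegalArgumentException, so that is
      // the only failure this sketch swallows.
      return UNKNOWN_OWNER;
    }
  }

  /** Stand-in resolver (assumption): only accepts principals in EXAMPLE.COM. */
  static String strictResolver(String principal) {
    if (!principal.endsWith("@EXAMPLE.COM")) {
      throw new IllegalArgumentException("Illegal principal name " + principal);
    }
    int slash = principal.indexOf('/');
    int end = slash >= 0 ? slash : principal.indexOf('@');
    return principal.substring(0, end);
  }

  public static void main(String[] args) {
    // prints "hbase"
    System.out.println(shortNameOrUnknown(
        "hbase/host.example.com@EXAMPLE.COM", ProcedureOwnerNames::strictResolver));
    // prints "unknown"
    System.out.println(shortNameOrUnknown(
        "hbase/wrong-hostname.domain@WRONG_REALM", ProcedureOwnerNames::strictResolver));
  }
}
```

Since the owner appears to be used only for display, a placeholder value would lose little, but it also hides the mismatch rather than explaining it.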

However, I think this change is missing the root of the problem (the 
expectation that HBase should just be able to "reattach" itself to an 
hbase.rootdir).

I can't think of any way in which the above exception would be thrown other 
than the cloud storage reattachment case I described. I'm happy to put up a 
patch to gracefully handle the failure to create the UGI if folks think there 
is merit in that.

> Serialized procedures which point to users from other Kerberos domains can 
> prevent master startup
> -------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-27044
>                 URL: https://issues.apache.org/jira/browse/HBASE-27044
>             Project: HBase
>          Issue Type: Bug
>          Components: proc-v2
>            Reporter: Josh Elser
>            Priority: Major
>
> We ran into an interesting bug when test teams were running HBase against 
> cloud storage without ensuring that the previous location was cleaned. This 
> resulted in an hbase.rootdir that had:
>  * A valid HBase MasterData Region
>  * A valid hbase:meta
>  * A valid collection of HBase tables
>  * An empty ZooKeeper
> Of the changes that we've worked on previously, those described in 
> HBASE-24286 were effective in getting everything _except_ the Procedures back 
> online without issue. Parsing the existing procedures produced an interesting 
> error:
> {noformat}
> java.lang.IllegalArgumentException: Illegal principal name hbase/wrong-hostname.domain@WRONG_REALM: org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: No rules applied to hbase/wrong-hostname.domain@WRONG_REALM
>       at org.apache.hadoop.security.User.<init>(User.java:51)
>       at org.apache.hadoop.security.User.<init>(User.java:43)
>       at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1418)
>       at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1402)
>       at org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.toUserInfo(MasterProcedureUtil.java:60)
>       at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.deserializeStateData(ModifyTableProcedure.java:262)
>       at org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:294)
>       at org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43)
>       at org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90)
>       at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:411)
>       at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$400(ProcedureExecutor.java:78)
>       at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$2.load(ProcedureExecutor.java:339)
>       at org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:285)
>       at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:330)
>       at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:600)
>       at org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1581)
>       at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:835)
>       at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2205)
>       at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:514)
>       at java.lang.Thread.run(Thread.java:750)
> {noformat}
> What's actually happening is that we are storing the {{User}} into the 
> procedure and then relying on UserGroupInformation to parse the {{User}} 
> protobuf into a UGI to get the "short" username.
> When the serialized procedure (whether in the MasterData region or via PV2 
> WAL files, I think) gets loaded, we end up needing Hadoop auth_to_local 
> configuration to be able to parse that Kerberos principal back to a name. 
> However, Hadoop's KerberosName will only unwrap Kerberos principals which 
> match the local Kerberos realm (defined by the krb5.conf's default_realm, 
> [ref|https://github.com/frohoff/jdk8u-jdk/blob/master/src/share/classes/sun/security/krb5/Config.java#L978-L983]).
> The interesting part is that we don't seem to ever use the user _other_ than 
> to display the {{owner}} attribute for procedures on the HBase UI. There is a 
> method in hbase-procedure which can filter procedures based on Owner, but I 
> didn't see any usages of that method.
> Given the pushback against HBASE-24286, I assume that, for the same reasons, 
> we would see pushback against fixing this issue. However, I wanted to call it 
> out for posterity. The expectation of users is that HBase _should_ implicitly 
> handle this case.
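
To make the realm check in the description above concrete, here is a rough, JDK-only approximation of the default auth_to_local behavior: take the first component of the principal only when its realm matches the local default realm, and fail otherwise. {{DefaultRuleSketch}} and {{toShortName}} are hypothetical names, and the real Hadoop code evaluates a configurable rule list rather than this single check.

```java
/** Simplified, hypothetical model of the default rule; not Hadoop's implementation. */
final class DefaultRuleSketch {
  private DefaultRuleSketch() {}

  /**
   * Returns the first component of a principal such as
   * "service/host@REALM" if REALM equals the given default realm.
   * Throws IllegalArgumentException otherwise, mirroring the top-level
   * exception type seen in the reported stack trace.
   */
  static String toShortName(String principal, String defaultRealm) {
    int at = principal.lastIndexOf('@');
    if (at < 0) {
      // No realm component: treat the input as an already-simple name.
      return principal;
    }
    String realm = principal.substring(at + 1);
    String namePart = principal.substring(0, at);
    if (!realm.equals(defaultRealm)) {
      throw new IllegalArgumentException("No rules applied to " + principal);
    }
    int slash = namePart.indexOf('/');
    return slash < 0 ? namePart : namePart.substring(0, slash);
  }

  public static void main(String[] args) {
    // prints "hbase"
    System.out.println(toShortName("hbase/host.example.com@EXAMPLE.COM", "EXAMPLE.COM"));
    try {
      toShortName("hbase/wrong-hostname.domain@WRONG_REALM", "EXAMPLE.COM");
    } catch (IllegalArgumentException e) {
      // prints "No rules applied to hbase/wrong-hostname.domain@WRONG_REALM"
      System.out.println(e.getMessage());
    }
  }
}
```

Under this model, a procedure serialized in one realm cannot be mapped to a short name in another unless an explicit rule covers the foreign realm, which matches the failure reported above.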



