[
https://issues.apache.org/jira/browse/HBASE-9321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13749125#comment-13749125
]
Gary Helmling commented on HBASE-9321:
--------------------------------------
There's definitely an abstraction mismatch in caching the User at the
HConnection level. RpcClient itself already tracks the User a connection has
authenticated as in its own internal map of {{PoolMap<ConnectionId,
Connection> connections}} (that's RpcClient.Connection there). So you would
then have HConnection (User) -> RpcClient -> RpcClient.Connection (User), which
opens the door for the two User references to disagree.
One option here would be to eliminate usage of User in RpcClient.ConnectionId,
and only keep a reference to User in HConnection. So if you want to
authenticate as a new User, you obtain a new HConnection. User could be
specified explicitly as a parameter to HCM.createConnection(), and the old
signatures could call User.getCurrent() to populate the value.
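Roughly what I have in mind, as a sketch only. SketchUser, SketchConnection and SketchConnectionManager are made-up stand-ins, not the real User, HConnection or HConnectionManager classes, and the real signatures would also take a Configuration:

```java
import java.util.Objects;

// Hypothetical stand-ins only; none of these are actual HBase classes.
class UserPerConnectionSketch {

    /** Stand-in for User. */
    static final class SketchUser {
        private final String name;

        SketchUser(String name) {
            this.name = Objects.requireNonNull(name);
        }

        String getName() {
            return name;
        }

        /** Stand-in for User.getCurrent(); here it just reads a system property. */
        static SketchUser getCurrent() {
            return new SketchUser(System.getProperty("user.name", "unknown"));
        }
    }

    /** Stand-in for HConnection: the only place that holds the user. */
    static final class SketchConnection {
        private final SketchUser user;

        SketchConnection(SketchUser user) {
            this.user = Objects.requireNonNull(user);
        }

        SketchUser getUser() {
            return user;
        }
    }

    /** Stand-in for HCM. */
    static final class SketchConnectionManager {
        /** New-style signature: the caller says who the connection authenticates as. */
        static SketchConnection createConnection(SketchUser user) {
            return new SketchConnection(user);
        }

        /** Old-style signature: falls back to the current user. */
        static SketchConnection createConnection() {
            return createConnection(SketchUser.getCurrent());
        }
    }

    public static void main(String[] args) {
        // A different user means a different connection object.
        SketchConnection asAlice =
            SketchConnectionManager.createConnection(new SketchUser("alice"));
        SketchConnection asDefault = SketchConnectionManager.createConnection();
        System.out.println(asAlice.getUser().getName());
        System.out.println(asDefault.getUser().getName());
    }
}
```

With this shape the lower layers would never need to ask who the user is; they would take it from the connection that owns them.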
I don't know if this winds up in a place that's better or worse than the
current situation, but it seems the most straightforward way of pulling up the
User to eliminate the contention.
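To illustrate why pulling the User up should remove the contention: the stack traces in the issue show threads queued on the UserGroupInformation class monitor on every writeRequest. If the user is resolved once when the connection is created, the per-request path never touches that lock. A toy version, where lookupCurrentUser() and SketchConnection are hypothetical and only mimic a class-level synchronized lookup:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy illustration; these are not HBase or Hadoop classes.
class CachedUserSketch {
    /** Counts how often the synchronized lookup is entered. */
    static final AtomicInteger lookups = new AtomicInteger();

    /** Hypothetical lookup that, like the one in the traces, locks on the class. */
    static synchronized String lookupCurrentUser() {
        lookups.incrementAndGet();
        return System.getProperty("user.name", "unknown");
    }

    /** Hypothetical connection that resolves the user once, at construction. */
    static final class SketchConnection {
        private final String user;

        SketchConnection() {
            this.user = lookupCurrentUser();
        }

        /** Per-request path: reads the cached field, takes no class lock. */
        String writeRequest(String payload) {
            return user + ":" + payload;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SketchConnection conn = new SketchConnection();
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    conn.writeRequest("multi");
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        // The lookup ran only in the constructor, not once per request.
        System.out.println("synchronized lookups: " + lookups.get());
    }
}
```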
> Contention getting the current user in RpcClient$Connection.writeRequest
> ------------------------------------------------------------------------
>
> Key: HBASE-9321
> URL: https://issues.apache.org/jira/browse/HBASE-9321
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.95.2
> Reporter: Jean-Daniel Cryans
> Fix For: 0.98.0, 0.96.0
>
>
> I've been running tests on clusters with "lots" of regions, about 400, and
> I'm seeing weird contention in the client.
> This one I see a lot: hundreds and sometimes thousands of threads are blocked
> like this:
> {noformat}
> "htable-pool4-t74" daemon prio=10 tid=0x00007f2254114000 nid=0x2a99 waiting for monitor entry [0x00007f21f9e94000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>     at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:466)
>     - waiting to lock <0x00000000fb5ad000> (a java.lang.Class for org.apache.hadoop.security.UserGroupInformation)
>     at org.apache.hadoop.hbase.ipc.RpcClient$Connection.writeRequest(RpcClient.java:1013)
>     at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1407)
>     at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1634)
>     at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1691)
>     at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.multi(ClientProtos.java:27339)
>     at org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:105)
>     at org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:43)
>     at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:183)
> {noformat}
> While the holder is doing this:
> {noformat}
> "htable-pool17-t55" daemon prio=10 tid=0x00007f2244408000 nid=0x2a98 runnable [0x00007f21f9f95000]
>    java.lang.Thread.State: RUNNABLE
>     at java.security.AccessController.getStackAccessControlContext(Native Method)
>     at java.security.AccessController.getContext(AccessController.java:487)
>     at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:466)
>     - locked <0x00000000fb5ad000> (a java.lang.Class for org.apache.hadoop.security.UserGroupInformation)
>     at org.apache.hadoop.hbase.ipc.RpcClient$Connection.writeRequest(RpcClient.java:1013)
>     at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1407)
>     at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1634)
>     at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1691)
>     at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.multi(ClientProtos.java:27339)
>     at org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:105)
>     at org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:43)
>     at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:183)
> {noformat}