[ 
https://issues.apache.org/jira/browse/HBASE-28321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17815705#comment-17815705
 ] 

Duo Zhang commented on HBASE-28321:
-----------------------------------

Ah, I found that it is a bit difficult to fall back to randomly selecting a 
server principal when we notice that we are connecting to an old server.

The problem is that the old server will send back a FatalConnectionException 
and then close the connection. Either way, all the pending RPC calls will be 
completed exceptionally in our current architecture.

I will not say it is impossible, but it is really not easy to retry at the 
connection level. Maybe we could add a special hack in AbstractRpcClient: 
when a call finishes with a special exception, we send it again.
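
To make the idea concrete, here is a rough sketch of retrying at the call 
level instead of the connection level. It is not HBase code: 
WrongPrincipalException, callWithRetry, and the principal strings are all 
hypothetical names used only for illustration, and the 'attempt' function 
merely stands in for "open a connection with this principal and send the 
call".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.function.Function;

class PrincipalRetrySketch {

  /** Hypothetical marker: the server rejected the principal we picked. */
  static class WrongPrincipalException extends RuntimeException {
    WrongPrincipalException(String msg) {
      super(msg);
    }
  }

  /**
   * Tries candidates.get(index); if the attempt fails with the marker
   * exception and another candidate is left, sends the call again with the
   * next one. Any other failure is passed through unchanged.
   */
  static <T> CompletableFuture<T> callWithRetry(List<String> candidates, int index,
      Function<String, CompletableFuture<T>> attempt) {
    CompletableFuture<T> result = new CompletableFuture<>();
    attempt.apply(candidates.get(index)).whenComplete((value, error) -> {
      // Dependent stages wrap failures in CompletionException, so unwrap once.
      Throwable cause = error instanceof CompletionException ? error.getCause() : error;
      if (cause == null) {
        result.complete(value);
      } else if (cause instanceof WrongPrincipalException && index + 1 < candidates.size()) {
        callWithRetry(candidates, index + 1, attempt).whenComplete((v, e) -> {
          if (e != null) {
            result.completeExceptionally(e);
          } else {
            result.complete(v);
          }
        });
      } else {
        result.completeExceptionally(cause);
      }
    });
    return result;
  }

  public static void main(String[] args) {
    List<String> tried = new ArrayList<>();
    // Fake server that only accepts the (made-up) region server principal.
    Function<String, CompletableFuture<String>> fakeServer = principal -> {
      tried.add(principal);
      CompletableFuture<String> f = new CompletableFuture<>();
      if (principal.startsWith("hbase-rs/")) {
        f.complete("connected as " + principal);
      } else {
        f.completeExceptionally(new WrongPrincipalException("rejected " + principal));
      }
      return f;
    };
    String reply =
      callWithRetry(List.of("hbase-master/_HOST", "hbase-rs/_HOST"), 0, fakeServer).join();
    System.out.println(tried + " -> " + reply);
  }
}
```

The point of the sketch is only that the caller sees a single future, while 
the retry happens behind it when the special exception shows up. How this 
would fit with the connection being closed by the old server is exactly the 
part that still needs discussion.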

Anyway, let me finish the other logic first, and start a discussion thread on 
the dev list about this.

> RpcConnectionRegistry is broken when security is enabled and we use different 
> principal for master and region server
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-28321
>                 URL: https://issues.apache.org/jira/browse/HBASE-28321
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Client, IPC/RPC, security
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Critical
>
> After introducing RpcConnectionRegistry, we let master and region server both 
> implement ClientMetaService.
> In our current client architecture, when security is enabled, we rely on the 
> record in SecurityInfo to determine the server principal to use. 
> Unfortunately, only one principal can be specified, so if we use different 
> principals for master and region server, either we cannot connect to the 
> master or we cannot connect to the region server.
> And just changing the server principal field in SecurityInfo to an array can 
> not solve the problem, as when connecting, we do not know whether the remote 
> server is a master or region server, so we still can not determine which 
> principal to use...
> Anyway, since this has been in our code base since 2.5.0, it is not a new 
> problem, so just set it as critical, not a blocker. But we should find out 
> the solution ASAP.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
