[ 
https://issues.apache.org/jira/browse/HBASE-27204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17568597#comment-17568597
 ] 

Szabolcs Bukros commented on HBASE-27204:
-----------------------------------------

[~apurtell] Please revert HBASE-24579.

I have now done some of the investigation I should have done 2 years ago and 
found that failing to read a potential error message might not be limited to 
PLAIN SASL. Based on my understanding of the code, this could happen with GSS 
too. After evaluating the final handshake challenge, GssKrb5Client can send a 
gssOutToken back to the server just after setting "completed" to true. 
GssKrb5Server then tries to evaluate that response in doHandshake2, where it 
either fails with an exception or returns null, producing essentially the same 
issue we have with PLAIN SASL: because the client is already complete, the 
potential response is never read.

I think a potential fix would have 3 parts:
 * ServerRpcConnection.saslReadAndProcess could be changed to always return a 
response, even if replyToken is null; maybe just an empty byte array. This would 
make the communication consistent by allowing us to always check the stream for 
a response.
 * HBaseSaslRpcClient.saslConnect could then be extended to track whether 
"readStatus" was called after a response was written. If the client is complete 
but we are still waiting for a response, we could call "readStatus".
 * Netty. Since ServerRpcConnection.saslReadAndProcess is shared between 
the implementations, I assume the issue is present in Netty too, but I do not 
understand that code well enough to propose a solution.
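
To make the first two parts concrete, here is a rough, self-contained sketch in 
plain Java. It is only a toy model under stated assumptions: the class 
SaslReplySketch, its methods, and the status/length framing are all hypothetical 
and are not the actual HBase classes or wire format. It only illustrates the 
idea that the server always frames a reply (an empty array when there is no 
reply token) and that the client reads the pending reply even after it 
considers itself complete.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Toy model of the last SASL step. The framing is hypothetical, not HBase's wire format. */
public class SaslReplySketch {
    static final int SUCCESS = 0;
    static final int ERROR = 1;

    /** Server side: always frame a reply, substituting an empty array for a null token. */
    static byte[] serverReply(byte[] replyToken, String error) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            if (error != null) {
                out.writeInt(ERROR);
                out.writeUTF(error);
            } else {
                byte[] token = (replyToken == null) ? new byte[0] : replyToken;
                out.writeInt(SUCCESS);
                out.writeInt(token.length);
                out.write(token);
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Client side: read exactly one framed reply and surface a server error as an exception. */
    static byte[] readReply(DataInputStream in) throws IOException {
        int status = in.readInt();
        if (status != SUCCESS) {
            throw new IOException("SASL negotiation failed: " + in.readUTF());
        }
        byte[] token = new byte[in.readInt()];
        in.readFully(token);
        return token;
    }

    /**
     * Client side: if a token was written and no reply has been read for it yet,
     * read that reply even though the client already reports completion.
     */
    static String finishNegotiation(boolean clientComplete, boolean replyPending, byte[] wire) {
        if (!(clientComplete && replyPending)) {
            return "nothing to read";
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire))) {
            byte[] token = readReply(in);
            return "ok, reply token length " + token.length;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Normal case: the server has no reply token but still answers, so the read cannot hang.
        System.out.println(finishNegotiation(true, true, serverReply(null, null)));
        // Failure case: the error reaches the client although it is already complete.
        System.out.println(finishNegotiation(true, true, serverReply(null, "GSS handshake failed")));
    }
}
```

Because every token written by the client is answered by exactly one framed 
reply in this model, the client never has to guess whether something is waiting 
on the stream, which is the consistency the first bullet is after.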

What do you think?

> BlockingRpcClient will hang for 20 seconds when SASL is enabled after 
> finishing negotiation
> -------------------------------------------------------------------------------------------
>
>                 Key: HBASE-27204
>                 URL: https://issues.apache.org/jira/browse/HBASE-27204
>             Project: HBase
>          Issue Type: Bug
>          Components: rpc, sasl, security
>            Reporter: Duo Zhang
>            Assignee: Andrew Kyle Purtell
>            Priority: Critical
>             Fix For: 2.5.0, 3.0.0-alpha-4, 2.4.14
>
>
> Found this when implementing HBASE-27185. When running TestSecureIPC, if 
> BlockingRpcClient is used, the tests take much more time compared to 
> NettyRpcClient.
> The problem is that, for normal Kerberos authentication, the last step is 
> the client sending a reply to the server, so after the server receives the 
> last token, it will not write anything back but will expect the client to 
> send the connection header.
> In HBASE-24579, in order to read the error message, we added a readReply 
> after the SaslClient indicates that the negotiation is completed. But as said 
> above, in the normal case the server does not write anything back, so the 
> client will hang there and only throw an exception when the timeout is 
> reached, which is 20 seconds.
> This nearly makes the BlockingRpcClient unusable when SASL is enabled, as it 
> will hang for 20 seconds when connecting...



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
