Sorabh Hamirwasia created DRILL-6187:
----------------------------------------
Summary: Exception in RPC communication between
DataClient/ControlClient and respective servers when bit-to-bit security is on
Key: DRILL-6187
URL: https://issues.apache.org/jira/browse/DRILL-6187
Project: Apache Drill
Issue Type: Bug
Components: Execution - RPC, Security
Reporter: Sorabh Hamirwasia
Assignee: Sorabh Hamirwasia
Fix For: 1.13.0
{color:#000000}Below is a summary of the issue:{color}
{color:#000000}*Scenario:*{color}
{color:#000000}It appears that the first sendRecordBatch request to the Foreman
initiated the authentication handshake. However, before starting the handshake
we establish the connection and store it in a registry. If another record batch
(from a different minor fragment running on the same Drillbit) needs to be sent
in parallel, that sender sees the connection available in the registry and
initiates its send. The second request reaches the Foreman before
authentication has completed, so the Foreman throws the exception below,
stating that an RPC type 3 message is not allowed, and closes the connection.
This also fails the authentication handshake that was in
progress.{color}{color:#000000} Here are the logs with
details:{color}
{color:#000000} {color}
{color:#000000}*The Foreman received the SASL_START message from another
node:*{color}
{color:#000000}*_2018-02-21 18:43:30,759 [_*{color}{color:#000000}_BitServer-4]
TRACE o.a.d.e.r.s.ServerAuthenticationHandler - Received SASL message
SASL_START from /10.10.100.161:35482_{color}
{color:#000000} {color}
{color:#000000}*Around the same time, the Foreman received another message from
the client, of RPC type 3 (SendRecordBatch), which failed because the handshake
had not yet completed:*{color}
{color:#000000} {color}
{color:#000000}*_2018-02-21 18:43:30,762_*{color}{color:#000000} _[BitServer-4]
ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.
Connection: /10.10.100.162:31012 <--> /__10.10.100.161:35482_ _(data server).
Closing connection._{color}
{color:#000000}_io.netty.handler.codec.DecoderException:
org.apache.drill.exec.rpc.RpcException: Request of type 3 is not allowed
without authentication. Client on /__10.10.100.161:35482_ _must authenticate
before making requests. Connection dropped. [Details: Encryption: enabled ,
MaxWrappedSize: 65536 , WrapSizeLimit: 0]_{color}
{color:#000000} {color}
{color:#000000}*The client then receives a channel closed exception:*{color}
{color:#000000} {color}
{color:#000000}*2018-02-21 18:43:30,764 [*{color}{color:#000000}BitClient-4]
WARN o.a.d.exec.rpc.RpcExceptionHandler - Exception occurred with closed
channel. Connection: /_10.10.100.161:35482_ <--> _10.10.100.162:31012_ (data
client){color}
{color:#000000} {color}
{color:#000000}*As a result, the client's initial authentication command also
fails. Since the channel closed exception appears above, I think it triggered
the failure of the authentication request as well:*{color}
{color:#000000} {color}
{color:#000000}_Caused by: org.apache.drill.exec.rpc.RpcException: Command
failed while establishing connection. Failure type AUTHENTICATION._{color}
{color:#000000} _at
org.apache.drill.exec.rpc.RpcException.mapException(RpcException.java:67)
~[drill-rpc-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000} _at
org.apache.drill.exec.rpc.ListeningCommand.connectionFailed(ListeningCommand.java:66)
~[drill-rpc-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000} _at
org.apache.drill.exec.rpc.data.DataTunnel$SendBatchAsyncListen.connectionFailed(DataTunnel.java:166)
~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000} _at
org.apache.drill.exec.rpc.data.DataClient$AuthenticationCommand.connectionSucceeded(DataClient.java:203)
~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000} _at
org.apache.drill.exec.rpc.data.DataClient$AuthenticationCommand.connectionSucceeded(DataClient.java:147)
~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000} _at
org.apache.drill.exec.rpc.ReconnectingConnection$ConnectionListeningFuture.waitAndRun(ReconnectingConnection.java:122)
~[drill-rpc-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000} _at
org.apache.drill.exec.rpc.ReconnectingConnection.runCommand(ReconnectingConnection.java:83)
~[drill-rpc-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000} _at
org.apache.drill.exec.rpc.data.DataTunnel._{color}{color:#000000}*_sendRecordBatch_*{color}{color:#000000}_(DataTunnel.java:84)
~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000} _at
org.apache.drill.exec.ops.AccountingDataTunnel.sendRecordBatch(AccountingDataTunnel.java:45)
~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000} _at
org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:127)
~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000} {color}
{color:#000000}So I think there is a concurrency issue: even though
authentication has not completed, other requests are sent to the remote node as
soon as the TCP connection is available. Instead, they should wait until
authentication completes. For example, the registry should make the TCP
connection available only after authentication has completed.{color}
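
To illustrate the idea of gating the registry on authentication, here is a
minimal sketch. It is not Drill's actual code: AuthGatedRegistry, the connect
and authenticate callbacks, and the endpoint string are all hypothetical names
chosen for illustration. The registry hands out a future that completes only
after both the TCP connect and the handshake have finished, so a second sender
that finds the entry has to wait instead of sending early.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical registry that exposes a connection only after authentication. */
final class AuthGatedRegistry<C> {
  // One future per remote endpoint; it completes once connect AND auth are done.
  private final ConcurrentMap<String, CompletableFuture<C>> ready =
      new ConcurrentHashMap<>();

  /**
   * connect:      opens the TCP connection (hypothetical callback).
   * authenticate: runs the handshake on that connection (hypothetical callback).
   */
  CompletableFuture<C> get(String endpoint,
                           Supplier<CompletableFuture<C>> connect,
                           Function<C, CompletableFuture<Void>> authenticate) {
    CompletableFuture<C> created = new CompletableFuture<>();
    CompletableFuture<C> existing = ready.putIfAbsent(endpoint, created);
    if (existing != null) {
      return existing; // another sender is already connecting: share its future
    }
    try {
      connect.get()
          .thenCompose(conn -> authenticate.apply(conn).thenApply(ignored -> conn))
          .whenComplete((conn, error) -> {
            if (error != null) {
              ready.remove(endpoint, created); // let a later caller retry
              created.completeExceptionally(error);
            } else {
              created.complete(conn); // published only now, after auth
            }
          });
    } catch (RuntimeException e) {
      ready.remove(endpoint, created);
      created.completeExceptionally(e);
    }
    return created;
  }
}

final class AuthGatedRegistryDemo {
  public static void main(String[] args) {
    AuthGatedRegistry<String> registry = new AuthGatedRegistry<>();
    // Stands in for the SASL exchange; completed manually below.
    CompletableFuture<Void> handshake = new CompletableFuture<>();

    CompletableFuture<String> first = registry.get("10.10.100.162:31012",
        () -> CompletableFuture.completedFuture("tcp-connection"),
        conn -> handshake);
    // The second sender must reuse the pending entry, never reconnect.
    CompletableFuture<String> second = registry.get("10.10.100.162:31012",
        () -> { throw new IllegalStateException("must not reconnect"); },
        conn -> handshake);

    System.out.println(second.isDone()); // false: handshake still in progress
    handshake.complete(null);
    System.out.println(second.join());   // tcp-connection
  }
}
```

In this sketch a sender would chain its sendRecordBatch call onto the returned
future (for example with thenAccept), so nothing can be written to the channel
before the handshake has succeeded. A failed handshake removes the entry so the
next caller starts over with a fresh connection.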