[
https://issues.apache.org/jira/browse/DRILL-6187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16391594#comment-16391594
]
ASF GitHub Bot commented on DRILL-6187:
---------------------------------------
Github user vrozov commented on a diff in the pull request:
https://github.com/apache/drill/pull/1145#discussion_r173234604
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/client/DrillClient.java ---
@@ -312,6 +312,11 @@ public synchronized void connect(String connect,
Properties props) throws RpcExc
if (connected) {
return;
}
+
+ if (props == null) {
--- End diff --
My recommendation is to change the other two overloaded methods to pass `new
Properties()` instead of `null`, and to make it explicit that `null` is not
allowed (avoid both passing `null` and checking for `null`).
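
A minimal sketch of that suggestion, not the actual patch. `ClientSketch` is a
hypothetical stand-in for `DrillClient`, and the bodies of the overloads and
the accessor methods are assumptions made only to keep the example
self-contained:

```java
import java.util.Objects;
import java.util.Properties;

// Hypothetical stand-in for DrillClient; only the overload pattern is shown.
class ClientSketch {
  private boolean connected;
  private Properties props;

  // The no-argument overload supplies an empty Properties instead of null.
  public void connect() {
    connect(null, new Properties());
  }

  // Callers of this overload are expected to pass a non-null Properties.
  public void connect(Properties props) {
    connect(null, props);
  }

  public synchronized void connect(String connect, Properties props) {
    // null is rejected up front rather than silently replaced with a default.
    Objects.requireNonNull(props, "props must not be null");
    if (connected) {
      return;
    }
    this.props = props;
    connected = true;
  }

  public synchronized boolean isConnected() {
    return connected;
  }

  public synchronized Properties getProps() {
    return props;
  }
}
```

With this shape the `null` contract lives in one place: the delegating
overloads never produce `null`, and the main method fails fast if a caller
passes one.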
> Exception in RPC communication between DataClient/ControlClient and
> respective servers when bit-to-bit security is on
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: DRILL-6187
> URL: https://issues.apache.org/jira/browse/DRILL-6187
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - RPC, Security
> Reporter: Sorabh Hamirwasia
> Assignee: Sorabh Hamirwasia
> Priority: Major
> Fix For: 1.13.0
>
>
>
> Below is the summary of the issue:
>
> *Scenario:*
> It seems that the first sendRecordBatch was sent to the Foreman, which
> initiated the authentication handshake. But before initiating the handshake
> we establish a connection and store it in a registry. If another record batch
> (from a different minor fragment running on the same Drillbit) is sent in
> parallel, it sees the connection available in the registry and initiates the
> send. This second request reaches the Foreman before authentication has
> completed, so the Foreman throws the exception below, saying that an RPC type
> 3 message is not allowed, and closes the connection. This also fails the
> authentication handshake that was in progress. Here are the logs with
> details:
>
> *Foreman received the SASL_START message from another node:*
> 2018-02-21 18:43:30,759 [BitServer-4] TRACE o.a.d.e.r.s.ServerAuthenticationHandler - Received SASL message SASL_START from /10.10.100.161:35482
>
> *Then, at around the same time, it received another message from the client,
> of RPC type 3 (SendRecordBatch), which fails because the handshake has not
> completed yet:*
>
> 2018-02-21 18:43:30,762 [BitServer-4] ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication. Connection: /10.10.100.162:31012 <--> /10.10.100.161:35482 (data server). Closing connection.
> io.netty.handler.codec.DecoderException: org.apache.drill.exec.rpc.RpcException: Request of type 3 is not allowed without authentication. Client on /10.10.100.161:35482 must authenticate before making requests. Connection dropped. [Details: Encryption: enabled , MaxWrappedSize: 65536 , WrapSizeLimit: 0]
>
> *Then the client receives a channel closed exception:*
>
> 2018-02-21 18:43:30,764 [BitClient-4] WARN o.a.d.exec.rpc.RpcExceptionHandler - Exception occurred with closed channel. Connection: /10.10.100.161:35482 <--> 10.10.100.162:31012 (data client)
>
> *and because of this, its initial authentication command also fails. Given
> the channel closed exception above, I think that is what triggered the
> failure of the authentication request as well.*
>
> Caused by: org.apache.drill.exec.rpc.RpcException: Command failed while establishing connection. Failure type AUTHENTICATION.
>     at org.apache.drill.exec.rpc.RpcException.mapException(RpcException.java:67) ~[drill-rpc-1.12.0-mapr.jar:1.12.0-mapr]
>     at org.apache.drill.exec.rpc.ListeningCommand.connectionFailed(ListeningCommand.java:66) ~[drill-rpc-1.12.0-mapr.jar:1.12.0-mapr]
>     at org.apache.drill.exec.rpc.data.DataTunnel$SendBatchAsyncListen.connectionFailed(DataTunnel.java:166) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>     at org.apache.drill.exec.rpc.data.DataClient$AuthenticationCommand.connectionSucceeded(DataClient.java:203) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>     at org.apache.drill.exec.rpc.data.DataClient$AuthenticationCommand.connectionSucceeded(DataClient.java:147) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>     at org.apache.drill.exec.rpc.ReconnectingConnection$ConnectionListeningFuture.waitAndRun(ReconnectingConnection.java:122) ~[drill-rpc-1.12.0-mapr.jar:1.12.0-mapr]
>     at org.apache.drill.exec.rpc.ReconnectingConnection.runCommand(ReconnectingConnection.java:83) ~[drill-rpc-1.12.0-mapr.jar:1.12.0-mapr]
>     at org.apache.drill.exec.rpc.data.DataTunnel.sendRecordBatch(DataTunnel.java:84) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>     at org.apache.drill.exec.ops.AccountingDataTunnel.sendRecordBatch(AccountingDataTunnel.java:45) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>     at org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:127) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>
> So I think there is a concurrency issue: even though authentication has not
> completed, other requests are sent to the remote node as soon as the TCP
> connection is available. Instead, they should wait until authentication has
> completed. For example, the TCP connection could be made available from the
> registry only once authentication has completed.
>
>
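
The idea in the description above (expose a connection from the registry only
after authentication has completed) can be sketched roughly as follows. This
is an illustration under stated assumptions, not Drill's actual fix:
`AuthGatedRegistry`, `getOrConnect`, and the `connectAndAuthenticate` callback
are hypothetical names, and the callback is assumed to return a future that
completes only after both the TCP connect and the authentication handshake
have succeeded.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical registry: callers never see a raw, unauthenticated connection.
class AuthGatedRegistry<C> {
  // One future per endpoint; it completes only after authentication succeeds.
  private final ConcurrentHashMap<String, CompletableFuture<C>> ready =
      new ConcurrentHashMap<>();

  // The first caller for an endpoint starts connect + authenticate.
  // Later callers get the same future and therefore wait for the handshake.
  CompletableFuture<C> getOrConnect(
      String endpoint,
      Function<String, CompletableFuture<C>> connectAndAuthenticate) {
    return ready.computeIfAbsent(endpoint, connectAndAuthenticate);
  }
}
```

A sender would then chain its request onto the returned future (for example
with `thenAccept`), so a second minor fragment cannot send a record batch
before the handshake finishes. Evicting a failed future so that a later call
can reconnect is deliberately left out of this sketch.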
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)