[
https://issues.apache.org/jira/browse/DRILL-8009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17432988#comment-17432988
]
ASF GitHub Bot commented on DRILL-8009:
---------------------------------------
rymarm commented on pull request #2333:
URL: https://github.com/apache/drill/pull/2333#issuecomment-949656576
Hi @paul-rogers, thank you for your review!
By answering the main questions:
> ...So, the question is, does the isValid() method need to know the state
at this exact instant? Or, is it good enough to know that the connection was OK
when we last checked 30 seconds (or whatever) ago?
I think the answer is pretty easy and very predictable — yes, the method
needs to know the state at this exact instant, as it is required by JDBC API
and is described in [java
documentation](https://docs.oracle.com/javase/8/docs/api/java/sql/Connection.html#isValid-int-).
As we implement some interface we should comply with it, so as users may
simply use Drill JDBC driver without deep knowledge of its implementation and
the method result will be predictable.
------------------------
And let me comment on other parts of your message.
> ...So, the client always knows that the connection is still valid (or was
at the time of the last heartbeat. We can check how frequently they occur; I
think it is on the order of 30 seconds or so, but I may be wrong.
The current implementation does not correspond to what you said. For a now,
only the server always knows that the connection is still valid, because it
receives a heartbeat from the client, while the client doesn't check, whether
the server (drillbit) has answered on the PING with a PONG. Heartbeat mechanism
was implemented in [this
commit](https://github.com/apache/drill/commit/960f876a1945eee4eeb0d6a98c5c58dfe2eea1a9),
and there were missed functions that on the client side checks whether the
server answers for requests.
There are
[`IdleStateHandler`](https://github.com/apache/drill/blob/960f876a1945eee4eeb0d6a98c5c58dfe2eea1a9/exec/java-exec/src/main/java/org/apache/drill/exec/rpc/BasicClient.java#L119)
that sends PING (heartbeat) request if the client doesn't send any request for
a while. But this handler, [doesn't check, whether the client received a PONG
answer](https://github.com/apache/drill/blob/960f876a1945eee4eeb0d6a98c5c58dfe2eea1a9/exec/java-exec/src/main/java/org/apache/drill/exec/rpc/RpcBus.java#L266)
from the server. I may suppose, that author of this logic (heartbeat)
believed, that netty's
[`Future#isSuccess`](https://netty.io/4.1/api/io/netty/util/concurrent/Future.html#isSuccess--)
will return `false` if the server [didn't receive PING
message](https://github.com/apache/drill/blob/960f876a1945eee4eeb0d6a98c5c58dfe2eea1a9/exec/java-exec/src/main/java/org/apache/drill/exec/rpc/BasicClient.java#L123),
but it doesn't: `Future#isSuccess` mean only successful flush to a socket and
it doesn't mean that any data was delivered
([source](https://github.com/netty/netty/issues/10343#issuecomment-641365702)).
Therefore, the client doesn't have any mechanism, that checks whether the
server (drillbit) still answers. And in cases when drillbit is not accessible
now due to network issues, drillbit failing, etc — the client (or HikariCP for
example) will think, that connection is still valid.
> ...I think it is on the order of 30 seconds or so, but I may be wrong.
It is 15 seconds by default and calculates as `drill.exec.rpc.user.timeout`
* 0.5
([souserce](https://github.com/apache/drill/blob/4aefcef2b665c5737471664912a26ef6ed9a6cfc/exec/rpc/src/main/java/org/apache/drill/exec/rpc/BasicClient.java#L87)).
> ...a heartbeat message was backed up behind data batches and the
connection ended up failing as a result.
I even heard about cases, when zookeeper was losing drillbit from a cluster,
because it didn't send heartbeat in time due to garbage collector freeze. And
therefore in infrequent cases, queries may fail, because a foreman decides,
that one of the drillbits is lost.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
> DrillConnectionImpl#isValid() doesn't correspond JDBC API
> ---------------------------------------------------------
>
> Key: DRILL-8009
> URL: https://issues.apache.org/jira/browse/DRILL-8009
> Project: Apache Drill
> Issue Type: Bug
> Components: Client - JDBC
> Reporter: Maksym Rymar
> Assignee: Maksym Rymar
> Priority: Major
>
> {{DrillConnectionImpl#isValid()}} doesn't correspond [Java JDBC API
> documentation|https://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#isValid(int)].
> Current implementation doesn't do actual connection verify and doesn't cover
> cases like drillbit termination and network issues.
> {{java.sql.Connection#isValid()}} (which extends {{DrillConnectionImpl}})
> widely used in JDBC connection pools like HikariCP and without right
> implementation, Drill JDBC driver cannot be used with them, because
> connections will keep alive forever.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)