Hi,
I have been looking into an issue where the native client hangs after
failing to handshake during connection.
We have some boundary conditions where a connection is attempted while
the drill service is starting resulting in handshake failures.
--
DrillClient::connect() ends up ultimately in
DrillClientImpl::recvHandshake().
This function calls async_read() with a callback to
DrillClientImpl::handleHandshake to handle the results of handshaking.
However, on error, DrillClientImpl::handleHandshake ends up calling
handleConnError() which merely calls shutdownSocket() to kill the socket
and set m_bIsConnected to false.
When all that unfolds, DrillClientImpl::validateHandshake() ends up
returning CONN_SUCCESS which is clearly wrong because the handshake failed.
The original caller to DrillClient::connect() thinks everything is
hunky-dorey.
The following added to DrillClientImpl::recvHandshake() after the
m_io_service.run() line to check the result seems to fix the problem:
if (!m_bIsConnected)
{
return CONN_HANDSHAKE_FAILED;
}
...but I wonder if there is a better solution.
Also, although there is the IsActive() member that applications can call
to determine if the connection is up, I wonder if the front-line drill
calls (such as submitQuery()) could check the connection status a matter
of course.
Currently, if you attempt a submitQuery() call when the connection is
down, it just hangs because m_io_service is not running and
m_deadlineTimer never triggers as a fall back.
Opinions?
Cheers,
Ralph Little