Hi,

Thanks for your response:

> The original caller to DrillClient::connect() thinks everything is
> hunky-dory.
>

Yes, that would be a problem. From what I remember, the recvHandshake call
blocks in m_ioservice.run. On return from run, the recvHandshake checks if
the error object m_pError is not null. m_pError is not null iff there has
been an error. Do you see this not working correctly?

Ah yes, I see that this code is compiled out by default unless
WIN32_SHUTDOWN_ON_TIMEOUT is defined.
I enabled that and it works as you say.

> Currently, if you attempt a submitQuery() call when the connection is
> down, it just hangs because m_io_service is not running and m_deadlineTimer
> never triggers as a fall back.
>
> Opinions?
>

It is a good idea to check connection status before sending any message to
the server. LMK if you want to submit a patch :), I can review and merge it
in.

I have added something and will send a patch shortly.
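
To illustrate the kind of check being discussed, here is a minimal sketch
of a "fail fast if the connection is down" guard. It is not the actual
client code: SketchClient, isActive(), markConnected(), markDisconnected()
and SubmitStatus are hypothetical names invented for this example.

```cpp
#include <atomic>
#include <cassert>
#include <string>

// Hypothetical status codes; the real client defines its own.
enum class SubmitStatus { Ok, NotConnected };

// Hypothetical stand-in for the client class, for illustration only.
class SketchClient {
public:
    void markConnected() { m_connected.store(true); }
    void markDisconnected() { m_connected.store(false); }
    bool isActive() const { return m_connected.load(); }

    // Return an error immediately instead of queueing work on a
    // connection that is known to be down.
    SubmitStatus submitQuery(const std::string& plan) {
        assert(!plan.empty());  // precondition assumed by this sketch
        if (!isActive()) {
            return SubmitStatus::NotConnected;
        }
        m_lastPlan = plan;  // placeholder for the real send path
        return SubmitStatus::Ok;
    }

private:
    std::atomic<bool> m_connected{false};
    std::string m_lastPlan;
};
```

With a guard like this the caller gets an error code back right away
rather than waiting on a timer that never fires.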

As an aside, I'm trying to shore up the client's resilience to query
failures from the back end.
If I set a query timeout and then pause the Hadoop back end (running in
a VM) so that it is unresponsive, the application still hangs.
This seems to be because the query timeout is reset every time a
heartbeat (PONG) is received by the Native Client DLL.
So again we get no application-side timeout.
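
The effect described above can be shown with a small model. This is only
a sketch of the suspected behaviour, not the client's real timer code:
QueryDeadline, onPong() and expired() are hypothetical names, and the
resetOnHeartbeat flag exists only to compare the two policies.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical model of a query deadline, for illustration only.
class QueryDeadline {
public:
    QueryDeadline(Clock::time_point start, std::chrono::seconds timeout,
                  bool resetOnHeartbeat)
        : m_deadline(start + timeout),
          m_timeout(timeout),
          m_resetOnHeartbeat(resetOnHeartbeat) {
        assert(timeout.count() > 0);
    }

    // Called whenever a heartbeat reply (PONG) arrives.
    void onPong(Clock::time_point now) {
        if (m_resetOnHeartbeat) {
            // Suspected behaviour: every PONG pushes the deadline out.
            m_deadline = now + m_timeout;
        }
        // Otherwise the PONG only proves the link is alive and the
        // query deadline is left alone.
    }

    bool expired(Clock::time_point now) const { return now >= m_deadline; }

private:
    Clock::time_point m_deadline;
    std::chrono::seconds m_timeout;
    bool m_resetOnHeartbeat;
};
```

In this model, a 10 second timeout with a PONG every 5 seconds never
expires when resetOnHeartbeat is true, which matches the hang seen with
the paused back end. With the flag false, the deadline fires regardless
of heartbeat traffic.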

I still suspect that a number of boundary scenarios could cause the
Native Client to lock up, so I'm looking into adding a "cancel" call to
the application API so that the application can enforce its own timeout
and cancel the pending query.

When I'm happy with what we have, I'll submit a patch for your perusal.

Cheers,
Ralph
