Wenzhe Zhou created KUDU-3366:
---------------------------------

             Summary: KRPC callback function not called when cancelling KRPC
                 Key: KUDU-3366
                 URL: https://issues.apache.org/jira/browse/KUDU-3366
             Project: Kudu
          Issue Type: Bug
          Components: rpc
            Reporter: Wenzhe Zhou


Impala ran into an issue that caused a thread to hang when cancelling a query. 
Impala log messages show that the Impala coordinator called 
RpcController::Cancel() to cancel an RPC, then waited for the RPC callback 
function to be called. But the KRPC callback function was never called, which 
caused the Impala thread to wait forever. See IMPALA-11263.

KRPC cancellation was implemented in KUDU-2065 with patch 
https://gerrit.cloudera.org/#/c/7455/. According to the comments on KUDU-2065, 
the authors decided not to cancel outbound requests in the SENDING state, since 
cancelling calls in that state seemed too complicated, and they expected most 
calls to be drained quickly so that the outbound request would move from 
SENDING to SENT.
However, the reactor thread function ReactorThread::CancelOutboundCall() calls 
Connection::CancelOutboundCall() before calling OutboundCall::Cancel(). 
Connection::CancelOutboundCall() resets car->call to a null pointer, which 
leads Connection::HandleOutboundCallTimeout() to skip calling 
OutboundCall::SetTimedOut(), and Connection::Shutdown() to skip calling 
OutboundCall::SetFailed(). If socket->Writev() fails while the outbound 
request is in the SENDING state, CallTransferCallbacks::NotifyTransferFinished() 
will not be called, hence OutboundCall::SetSent() will not be called. As a 
result, the outbound request can never move from the SENDING state to the SENT 
state, and the KRPC callback function is never called in this corner case.
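
To make the sequence above easier to follow, here is a minimal toy model of it 
in C++. It is a sketch only, not the actual Kudu code: ToyCall, 
ToyCallAwaitingResponse and SimulateCancelThenTimeout are hypothetical names, 
the state list is reduced, and the failed socket->Writev() is modelled simply 
by the call never leaving SENDING. It assumes, as described above, that 
cancellation is ignored in SENDING and that the connection drops its reference 
to the call before the call-level cancel runs.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Reduced set of states for illustration; the real OutboundCall has more.
enum class State { SENDING, SENT, CANCELLED, TIMED_OUT };

// Hypothetical stand-in for OutboundCall.
struct ToyCall {
  State state = State::SENDING;
  std::function<void()> callback;

  bool IsFinished() const {
    return state == State::CANCELLED || state == State::TIMED_OUT;
  }

  // Move to a terminal state and run the user callback exactly once.
  void Finish(State terminal) {
    if (IsFinished()) return;
    state = terminal;
    callback();
  }

  // Assumption from the description: cancellation is a no-op while SENDING.
  void Cancel() {
    if (state == State::SENDING) return;
    Finish(State::CANCELLED);
  }

  void SetTimedOut() { Finish(State::TIMED_OUT); }
};

// Hypothetical stand-in for the connection's per-call bookkeeping entry.
struct ToyCallAwaitingResponse {
  std::shared_ptr<ToyCall> call;
};

// Same order as described: the connection forgets the call first,
// then the call itself is asked to cancel.
void CancelOutboundCall(ToyCallAwaitingResponse& car) {
  std::shared_ptr<ToyCall> call = car.call;
  car.call.reset();
  assert(car.call == nullptr);
  if (call) call->Cancel();
}

// The timeout handler skips entries whose call pointer was reset.
void HandleOutboundCallTimeout(ToyCallAwaitingResponse& car) {
  if (!car.call) return;
  car.call->SetTimedOut();
}

// Returns how many times the callback ran after a cancel followed by a timeout.
int SimulateCancelThenTimeout(State initial) {
  int invoked = 0;
  auto call = std::make_shared<ToyCall>();
  call->state = initial;
  call->callback = [&invoked] { ++invoked; };
  ToyCallAwaitingResponse car{call};

  CancelOutboundCall(car);
  // Modelled assumption: the write failed, so nothing moves the call to SENT.
  HandleOutboundCallTimeout(car);
  return invoked;
}
```

In this model, SimulateCancelThenTimeout(State::SENDING) returns 0, so a caller 
waiting on the callback would wait forever, while 
SimulateCancelThenTimeout(State::SENT) returns 1 because the cancel itself 
completes the call.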



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
