Sailesh Mukil has uploaded a new patch set (#8). Change subject: IMPALA-2592: DataStreamSender::Channel::CloseInternal() does not close the channel on an error. ......................................................................
IMPALA-2592: DataStreamSender::Channel::CloseInternal() does not close the channel on an error. CloseInternal() did not send an EOS RPC if there was an error in sending the previous RPC or an error in flushing the current row batch. It instead just logged the error. Due to this, the receiver on the RPC channel remained open causing the query to hang in the cluster. This fix adds a new function called FlushFinal() to the class DataSink. All the classes that inherit from DataSink (DataStreamSender included) implement this function. It currently is a no-op in the classes other than DataStreamSender; this will be changed in following patches. FlushFinal() can return a status and all the function calls that could fail in DataStreamSender::Close() have been moved to FlushFinal() leaving Close() as a pure teardown function which cannot fail. Due to FlushFinal() being able to return an error status, the errors are sent to the PlanFragmentExecutor which sends these errors in a report via ReportStatusCb() to the coordinator. The coordinator then cancels the query if it receives a report with an error in it. This is how the fix helps in getting rid of the cluster hang. Furthermore, a retry loop is added to send an RPC in FragmentExecState::ReportStatusCb() in case the coordinator is unreachable. If this RPC fails to be sent, there is a chance that the cluster could hang. This will be addressed in a seperate patch. Change-Id: Ia45d45dedb21c8b8abab266e140766eddfa363fb --- M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hbase-table-sink.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-sink.h M be/src/runtime/data-stream-sender.cc M be/src/runtime/data-stream-sender.h M be/src/runtime/plan-fragment-executor.cc M be/src/service/fragment-exec-state.cc 9 files changed, 104 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/05/2205/8 -- To view, visit http://gerrit.cloudera.org:8080/2205 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia45d45dedb21c8b8abab266e140766eddfa363fb Gerrit-PatchSet: 8 Gerrit-Project: Impala Gerrit-Branch: cdh5-2.5.0_5.7.0 Gerrit-Owner: Sailesh Mukil <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Henry Robinson <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]>
