Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12567 )
Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch ...................................................................... IMPALA-8239: Fix handling of failed deserialization of row batch Previously, when a row batch failed to be deserialized in the data stream receiver, we will return the error status to the sender of the row batch without inserting the row batch. The receiver will continue to operate without flagging any error. The assumption is that the sender will eventually cancel the query upon receiving the failed status. Normally, when a caller of GetBatch() successfully dequeues a row batch from the batch queue, it will kick off the draining of the row batches from the deferred queue into the normal batch queue, which will further continue the cycle of draining the deferred queue upon the next call to GetBatch() until the deferred queue becomes empty. When an error is hit when deserializing a deferred batch to be inserted into the batch queue, the existing code will simply not insert the row batch or flag any error. This breaks the cycle of the deferred queue draining as the batch queue may become empty forever. The caller of GetBatch() will block indefinitely until the query is cancelled. The existing code works fine as the expectation is that the query will be cancelled once the sender receives the error status from the RPC response. However, this behavior is still not ideal as it lets a query which has hit a fatal error to hold on to resources for extended period of time. This patch fixes the problem by explicitly recording any error during row batch insertion in an error status object. Callers of GetBatch() will now also poll for this status object while waiting for row batch to show up and bail out early if there is any error. A new test case has been added to simulate the problematic case above. Change-Id: Iaa74937b046d95484887533be548249e96078755 Reviewed-on: http://gerrit.cloudera.org:8080/12567 Reviewed-by: Tim Armstrong <[email protected]> Reviewed-by: Thomas Marshall <[email protected]> Tested-by: Impala Public Jenkins <[email protected]> --- M be/src/exec/exchange-node.cc M be/src/runtime/data-stream-test.cc M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M be/src/runtime/krpc-data-stream-recvr.cc M be/src/runtime/krpc-data-stream-recvr.h A tests/custom_cluster/test_exchange_deferred_batches.py 7 files changed, 157 insertions(+), 47 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Thomas Marshall: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/12567 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755 Gerrit-Change-Number: 12567 Gerrit-PatchSet: 5 Gerrit-Owner: Michael Ho <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Thomas Marshall <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]>
