Hello Sailesh Mukil, Skye Wanderman-Milne, Tim Armstrong,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/2305
to look at the new patch set (#7).
Change subject: IMPALA-2987: Distinguish between already-closed and never-seen
data stream receivers
......................................................................
IMPALA-2987: Distinguish between already-closed and never-seen data stream
receivers
This patch adds an output parameter 'already_unregistered' to
FindRecvrOrWait() to signal to the caller in which of two cases it may
have returned NULL. If 'already_unregistered' is true, the receiver has
already been set up and closed (possibly by cancellation, possibly by
the fragment deliberately closing its inputs in the case of a
limit). This is not an error - cancellation will be signalled to the
sender from the coordinator, and deliberate closure means the
coordinator will tear down the query shortly.
If 'already_unregistered' is set to false by FindRecvrOrWait(), the
DataStreamMgr has never seen the intended receiver. This means the
sender has waited for a full timeout period without the upstream
receiver being established; this signals a likely query setup
problem (as long as datastream_sender_timeout_ms is set sufficiently
large) and so we return an error.
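
For illustration only, the two NULL cases described above could be modelled
roughly as below. This is a simplified sketch, not the code in
data-stream-mgr.cc: StreamMgrSketch, RecvrSketch, SendResult and
TransmitDataSketch are hypothetical names, receivers are keyed by a plain int,
and only FindRecvrOrWait() and 'already_unregistered' come from the patch
description.

```cpp
#include <chrono>
#include <condition_variable>
#include <map>
#include <memory>
#include <mutex>
#include <set>

// Hypothetical stand-in for a data stream receiver.
struct RecvrSketch { int id; };

// Hypothetical, heavily simplified model of the stream manager.
class StreamMgrSketch {
 public:
  explicit StreamMgrSketch(std::chrono::milliseconds sender_timeout)
      : sender_timeout_(sender_timeout) {}

  void Register(int id) {
    std::lock_guard<std::mutex> l(lock_);
    recvrs_[id] = std::make_shared<RecvrSketch>(RecvrSketch{id});
    state_changed_.notify_all();
  }

  // Remembers the id so later senders can tell "closed" from "never seen".
  void Unregister(int id) {
    std::lock_guard<std::mutex> l(lock_);
    if (recvrs_.erase(id) > 0) closed_.insert(id);
    state_changed_.notify_all();
  }

  // Waits up to the sender timeout for the receiver to appear. Returns
  // nullptr if there is no live receiver; in that case
  // *already_unregistered is true iff the receiver was set up and closed.
  std::shared_ptr<RecvrSketch> FindRecvrOrWait(int id, bool* already_unregistered) {
    std::unique_lock<std::mutex> l(lock_);
    state_changed_.wait_for(l, sender_timeout_, [&] {
      return recvrs_.count(id) > 0 || closed_.count(id) > 0;
    });
    auto it = recvrs_.find(id);
    if (it != recvrs_.end()) {
      *already_unregistered = false;
      return it->second;
    }
    *already_unregistered = closed_.count(id) > 0;
    return nullptr;
  }

 private:
  const std::chrono::milliseconds sender_timeout_;
  std::mutex lock_;
  std::condition_variable state_changed_;
  std::map<int, std::shared_ptr<RecvrSketch>> recvrs_;
  std::set<int> closed_;
};

// Hypothetical result codes; the real patch reports a Status instead.
enum class SendResult { kDelivered, kDroppedReceiverClosed, kErrorReceiverNeverSeen };

// Sender-side handling of the two NULL cases.
SendResult TransmitDataSketch(StreamMgrSketch& mgr, int id) {
  bool already_unregistered = false;
  std::shared_ptr<RecvrSketch> recvr = mgr.FindRecvrOrWait(id, &already_unregistered);
  if (recvr != nullptr) return SendResult::kDelivered;
  // Closed receiver: not an error. Never-seen receiver: likely setup problem.
  return already_unregistered ? SendResult::kDroppedReceiverClosed
                              : SendResult::kErrorReceiverNeverSeen;
}
```

In this sketch a sender that finds a closed receiver drops the batch without
raising an error, while a sender that times out without ever seeing the
receiver reports a failure.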
We need to tweak the two timeout parameters here:
* datastream_sender_timeout_ms needs to be large enough to avoid spuriously
reporting problems during query setup (otherwise queries that would have
succeeded, if slowly, will be cancelled unexpectedly).
* STREAM_EXPIRATION_TIME_MS needs to be set high enough that a query
will not continue executing for longer than STREAM_EXPIRATION_TIME_MS
after it closes its input (otherwise the sender will get
already_unregistered=false, and cancel). This case will only trigger
when a sender tries to call TransmitData() after the receiver has been
closed for STREAM_EXPIRATION_TIME_MS; this should not happen in
non-error cases as receivers are not closed before consuming their
entire input.
In this patch the former has been set to 2 minutes, and the latter to 5 minutes.
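
As a rough illustration of why STREAM_EXPIRATION_TIME_MS matters, the sketch
below models the record of closed receivers as a map from receiver id to close
time. The class name, the int ids, the explicit 'now_ms' argument and the
exact boundary comparison are assumptions made for the example; only the two
timeout values (2 and 5 minutes) come from the description above.

```cpp
#include <cstdint>
#include <map>

// Values taken from the patch description, expressed in milliseconds.
const int64_t kDatastreamSenderTimeoutMs = 2 * 60 * 1000;  // 2 minutes
const int64_t kStreamExpirationTimeMs = 5 * 60 * 1000;     // 5 minutes

// Hypothetical model of the record of recently closed receivers.
class ClosedStreamCacheSketch {
 public:
  explicit ClosedStreamCacheSketch(int64_t expiration_ms)
      : expiration_ms_(expiration_ms) {}

  // Records that the receiver 'id' was closed at time 'now_ms'.
  void MarkClosed(int id, int64_t now_ms) { closed_at_ms_[id] = now_ms; }

  // Returns true if 'id' was closed less than expiration_ms_ ago. Once the
  // entry has expired it is forgotten, so the receiver becomes
  // indistinguishable from one that was never seen.
  bool AlreadyUnregistered(int id, int64_t now_ms) {
    auto it = closed_at_ms_.find(id);
    if (it == closed_at_ms_.end()) return false;
    if (now_ms - it->second >= expiration_ms_) {
      closed_at_ms_.erase(it);
      return false;
    }
    return true;
  }

 private:
  const int64_t expiration_ms_;
  std::map<int, int64_t> closed_at_ms_;
};
```

Under these assumptions, a sender that calls in one second after the receiver
closed gets already_unregistered=true, whereas one that calls in five minutes
later gets already_unregistered=false and would be treated as an error.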
Change-Id: Ib1734992c7199b9dd4b03afca5372022051b6fbd
---
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-mgr.h
M common/thrift/generate_error_codes.py
3 files changed, 55 insertions(+), 34 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/05/2305/7
--
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib1734992c7199b9dd4b03afca5372022051b6fbd
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.0
Gerrit-Owner: Henry Robinson <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Skye Wanderman-Milne <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>