Lars Volker has posted comments on this change. Change subject: IMPALA-5890: Abort queries if scanner hits IO errors ......................................................................
Patch Set 5: (5 comments) http://gerrit.cloudera.org:8080/#/c/8011/5/be/src/exec/base-sequence-scanner.cc File be/src/exec/base-sequence-scanner.cc: PS5, Line 177: if (status.IsCancelled() || status.IsMemLimitExceeded()) return status; : : // Log error from file format parsing. : state_->LogError(ErrorMsg(TErrorCode::SEQUENCE_SCANNER_PARSE_ERROR, : stream_->filename(), stream_->file_offset(), : (stream_->eof() ? "(EOF)" : ""))); : : // Make sure errors specified in the status are logged as well : state_->LogError(status.msg()); : : // If abort on error then return, otherwise try to recover. : if (state_->abort_on_error()) return status; : : // Abort scan range for I/O related errors : if (status.code() == TErrorCode::SCANNER_CONTEXT_WRAPPED_IO_ERROR) { : eos_ = true; : return Status::OK(); : } > it does seem like this could be written in terms of RuntimeState::LogOrRetu It looks to me as if the query will be aborted if return a non-OK status here. ProcessSplit() will return that status and it will be handled in hdfs-scan-node.cc:441. This will call SetDoneInternal() and the scanners will stop. Setting eos_ here will make stop ProcessSplit() but make it return Status::OK(). if (!status.ok()) { unique_lock<mutex> l(lock_); // If there was already an error, the main thread will do the cleanup if (!status_.ok()) break; if (status.IsCancelled() && done_) { // Scan node initiated scanner thread cancellation. No need to do anything. break; } // Set status_ before calling SetDone() (which shuts down the RowBatchQueue), // to ensure that GetNextInternal() notices the error status. status_ = status; SetDoneInternal(); break; } That being said, I changed the code to abort the query. http://gerrit.cloudera.org:8080/#/c/8011/5/be/src/exec/scanner-context.cc File be/src/exec/scanner-context.cc: PS5, Line 194: Internal " > would be best to keep "Internal error" on the same line so that it's more e Done Line 284: io_buffer_bytes_left_ = 0; > Maybe add a short comment explaining why this is necessary? Done, thx for the suggestion. Line 371: Status wrapped_status(TErrorCode::SCANNER_CONTEXT_WRAPPED_IO_ERROR, status.msg().msg()); > Maybe reference IMPALA-4697 for context on why we're doing this instead of Removed the code altogether. http://gerrit.cloudera.org:8080/#/c/8011/5/be/src/exec/scanner-context.h File be/src/exec/scanner-context.h: Line 277: Status WrapIoError(const Status& status); > Is there a simple thing we can start with for this patch (rather than intro Done -- To view, visit http://gerrit.cloudera.org:8080/8011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I44dc95184c241fbcdbdbebad54339530680d3509 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
