Dan Hecht has posted comments on this change. Change subject: IMPALA-3385: Fix crashes on accessing error_log ......................................................................
Patch Set 11: (19 comments) Thanks, that's much better. http://gerrit.cloudera.org:8080/#/c/2829/11//COMMIT_MSG Commit Message: Line 18: hdfsScanNode::splitprocess HdfsScanNode::ProcessSplit() Line 23: when returning error_log and return a copy of the RuntimeState::error_log_, fixing a race in the coordinator. http://gerrit.cloudera.org:8080/#/c/2829/11/be/src/exec/hdfs-scan-node.cc File be/src/exec/hdfs-scan-node.cc: Line 1204: ss << endl; is this endl still needed? i think you get one at the end of all logs, it was there to put the ErrorLog() output on a separate line. but you need to double check. http://gerrit.cloudera.org:8080/#/c/2829/11/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: Line 221: // same errors multiple times. This comment should be moved to the header comment for ClearErrorMap(). Also, this comment still doesn't make sense because keeping the keys doesn't prevent the errors from being reported multiple times. that would still be the case if we just did error_log_.clear(). Is it trying to say: Reset all messages so that GENERAL errors are not reported multiple times and reset the counts so that non-GENERAL errors are not counted multiple times. But keep the keys so that we remember how many distinct non-GENERAL errors have been reported. http://gerrit.cloudera.org:8080/#/c/2829/11/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: Line 120: void GetErrors(ErrorLogMap* errors); Since this is no longer a simple accessor, add a function comment and move down to be near the other error log functions: Copy error_log_ to '*errors'. Line 278: unreported_error_idx_ delete http://gerrit.cloudera.org:8080/#/c/2829/11/be/src/util/error-util-test.cc File be/src/util/error-util-test.cc: Line 49: cleared[TErrorCode::RPC_TIMEOUT].messages.clear(); What is this code trying to verify? Shouldn't it use ClearErrorMap() rather than calling clear() directly? Line 70: right kinda weird to be using 'right' on the left hand side. Consider just creating a new map. Line 86: ASSERT_EQ(0, cleared[TErrorCode::RPC_TIMEOUT].messages.size()); order these cases consistently with lines 72-86. That makes it read much faster to see that they are expecting the same results. http://gerrit.cloudera.org:8080/#/c/2829/11/be/src/util/error-util.cc File be/src/util/error-util.cc: Line 136: LogEntry log_entry Line 142: { missing space Line 171: DCHECK_EQ(source.messages.empty()^(source.count == 0), 0); DCHECK_EQ(source.messages.empty(), source.count == 0); Line 202: return errors.size(); one line (and braces are required for multi-line if) http://gerrit.cloudera.org:8080/#/c/2829/11/tests/query_test/test_errorlog.py File tests/query_test/test_errorlog.py: Line 24: add a brief summary of what this test wants to accomplish, and reference the JIRA. Line 29: class TestErrorLogs(ImpalaTestSuite): what happens when you run this test without your fixes? does it reproduce the crash? Line 40: if os.getenv("TARGET_FILESYSTEM") in ["s3", "isilon", "local"]: return could we just remove hbase dimension when running on these systems, or is that not easy to do? Line 49: location = vector.get_value('location') Add comment: # Node ID 0 is the scan node. Line 67: cancel_result = self.client.cancel(handle) why do we do this? is it just because otherwise the query runs for a long time, or does it test something important. add a comment. Line 69: assert cancel_result.status_code == 0,\ what gets returned if the query had completed before the cancel rpc? will we still return 0 status_code in that case? -- To view, visit http://gerrit.cloudera.org:8080/2829 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3a7e3d22e26147ada780aae5aed1f2e25a515afc Gerrit-PatchSet: 11 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Huaisi Xu <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Huaisi Xu <[email protected]> Gerrit-HasComments: Yes
