Dan Hecht has posted comments on this change.

Change subject: IMPALA-3385: Fix crashes on accessing error_log
......................................................................


Patch Set 5:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/2829/5/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

Line 223: }
Returning a copy is kind of surprising. How about making the interface mirror 
GetUnreportedErrors(). I.e. replace this with a:

void GetErrors(ErrorLogMap* errors);

That way, it's clear to readers of the coordinator code that a copy is 
happening.  Also, we should just get rid of the ErrorLogIsEmpty() method -- 
merging an empty map is fine.


Line 242:   // errors multiple times.
This comment doesn't make sense, as far as I see.  What goes wrong if we just 
do error_log_.clear()?  I think we can just clear the error_log_ map. It will 
affect ErrorCount() but that seems okay (and current semantics of ErrorCount() 
are weird anyway -- general errors decrease but non-general do not given the 
current code).


http://gerrit.cloudera.org:8080/#/c/2829/5/be/src/util/error-util.cc
File be/src/util/error-util.cc:

Line 135:     if (v.second.count == 0) continue;
please also test that you don't break ErrorLog functionality.  For example, 
looking at AppendError(), we don't maintain the count for GENERAL errorcode, so 
it looks like this change would break the GENERAL case (we'll always skip it).


Line 140:     } else {
DCHECK_GT(v.second.messages, 0);
DCHECK(!v.second.messages.empty());


Line 142:       if (v.second.count < 2) {
if (v.second.count == 1) {


Line 159:     if (v.second.count == 0) continue;
remove - this doesn't work for GENERAL. see above.


Line 161:     if (v.first == TErrorCode::GENERAL) {
DCHECK_EQ(v.second.count, 0);


Line 167:         (*left)[v.first].count += v.second.count;
this is still broken if 'left' was cleared.

I think what we should do is maintain the invariant:
for GENERAL: count == 0, messages.size() > 0
otherwise: count > 0, messages.size() == 1

and then add DCHECKs to these routines to check those invariants (as noted by 
other comments), and also fix GetUnreportedErrors() to just clear the map.


Line 168:       } else {
DCHECK_GT(v.second.count, 0);
DCHECK(!v.second.messages.empty());


Line 181:     if (it != map->end()) {
DCHECK(!it->second.messages.empty());


Line 193:       errors.find(TErrorCode::GENERAL)->second.messages.size() - 1 : 
0;
this code looks wrong -- messages.size() can be 0, so general_errors will 
underflow. Anyway, this will be fixed if we just clear the map in 
GetUnreporterErrors() so that the invariants above are held.


-- 
To view, visit http://gerrit.cloudera.org:8080/2829
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3a7e3d22e26147ada780aae5aed1f2e25a515afc
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Huaisi Xu <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Huaisi Xu <[email protected]>
Gerrit-HasComments: Yes

Reply via email to