Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
......................................................................


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/7730/3//COMMIT_MSG
Commit Message:

Line 28: variables or direct declarations in classes.
> Yeah, it does seem like there's an important distinction there.
I labeled the locations that are eligible for fault injection, and then added a 
parameter to turn on thread creation fault injection. This should make testing 
much easier.


http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

Line 361:       COUNTER_ADD(&active_scanner_thread_counter_, -1);
> Oh I see. Seems ok. I'm a little skeptical that it's worth trying to recove
There are arguments in either direction. Given that we think this should be 
rare, I think it makes sense to go ahead and abort the query. I changed this 
code and kudu-scan-node.cc to do that.


http://gerrit.cloudera.org:8080/#/c/7730/4/be/src/exec/kudu-scan-node.cc
File be/src/exec/kudu-scan-node.cc:

Line 164:       UndoGetNextScanToken(token);
> In this case it does seem simpler to fail the query, since it we mess this 
Changed this to fail the query.


http://gerrit.cloudera.org:8080/#/c/7730/4/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

Line 330:       // Either get this to work or remove before merge.
> Still relevant?
Removed


Line 350:     Status instance_status = entry.second->WaitForPrepare();
> I think we should preserve the postcondition that all fragment instances ha
That makes sense. One thing that I noticed when making this change is that in 
some circumstances the fragment instance will get into an RPC call before the 
query starts cancelling. In this case, the RPC call needs to timeout for the 
query to complete its cancel. The query does eventually end and nothing crashes.


http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/util/thread-pool.h
File be/src/util/thread-pool.h:

Line 61: 
> Yeah I think that would make most sense to me - reduces the number of diffe
Fixed this by doing Shutdown() + Join() in the error condition.


http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/util/thread.cc
File be/src/util/thread.cc:

Line 301:   unique_ptr<Thread> t(new Thread(category, name));
> Sounds like a good idea to me.
Added.


-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to