Tim Armstrong has posted comments on this change. Change subject: IMPALA-5750: Catch exceptions from boost thread creation ......................................................................
Patch Set 2: (12 comments) http://gerrit.cloudera.org:8080/#/c/7730/2//COMMIT_MSG Commit Message: Line 34: 1. QueryState::StartFInstances() needs appropriate > I did that for query-state.cc, but I think that is probably the most common Yeah makes sense. http://gerrit.cloudera.org:8080/#/c/7730/3//COMMIT_MSG Commit Message: Line 28: as arguments to the Create method, this eliminates For testing, it might be worth adding a stress startup option to make thread creation randomly fail. It seems hard to make a non-flaky automated test but it would be interesting to run some queries against it and see what the behaviour is. http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/exec/hdfs-scan-node.cc File be/src/exec/hdfs-scan-node.cc: Line 361: COUNTER_ADD(num_scanner_threads_started_counter_, -1); It seems ok to omit fixing up num_scanner_threads_started_counter_. I don't think it makes a difference whether it counts the number of successful starts versus number of attempted starts. Although this way is fine too. http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/exec/kudu-scan-node.cc File be/src/exec/kudu-scan-node.cc: Line 154: ++num_active_scanners_; This is protected by lock_ so we could just wait until the thread is created successfully to update it Line 170: COUNTER_ADD(num_scanner_threads_started_counter_, -1); Same as the HDFS scan node - it seems ok to omit this fixup. http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/rpc/thrift-thread.cc File be/src/rpc/thrift-thread.cc: Line 37: if (!status.ok()) throw atc::SystemResourceException("Thread::Create() failed"); This could use some explanation, since it's an unusual idiom in the code. I.e. why do we need to throw an exception and how do we expect thrift to handle it. http://gerrit.cloudera.org:8080/#/c/7730/2/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: Line 359: Status prepare_status; It seems like there's a fair bit of commonality between the case when Prepare() fails and the case when thread creating fails. Maybe we can share more of the code. http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/service/impala-hs2-server.cc File be/src/service/impala-hs2-server.cc: Line 469: HS2_RETURN_ERROR(return_val, status.GetDetail(), SQLSTATE_GENERAL_ERROR); It looks like we have identical error-handling code in three places. May be cleaner to combine into an error handling block at the end of the function + gotos. PS3, Line 476: (void) discard_result http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/util/thread-pool.h File be/src/util/thread-pool.h: Line 61: Shutdown(); How do things get cleaned up on an error? Does Shutdown() handle that case? http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/util/thread.cc File be/src/util/thread.cc: Line 301: } catch (boost::thread_resource_error& e) { Might be worth adding a status code to common/thrift/generate_error_codes.py for this error http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/util/thread.h File be/src/util/thread.h: PS3, Line 69: we'd normally use pointers for output arguments - https://google.github.io/styleguide/cppguide.html#Reference_Arguments -- To view, visit http://gerrit.cloudera.org:8080/7730 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
