Joe McDonnell has posted comments on this change. Change subject: IMPALA-5750: Catch exceptions from boost thread creation ......................................................................
Patch Set 6: (11 comments) http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/exec/kudu-scan-node.cc File be/src/exec/kudu-scan-node.cc: PS6, Line 159: Status status; : status = > nit: combine these lines Done PS6, Line 163: num_scanner_threads_started_counter_ > here and other scan node: I guess we undo this rather than just waiting unt I checked, and this is only used as a statistic and for making the names of the threads unique. I changed this to increment after successful thread create (here and for hdfs-scan-node.cc). (A side effect is that the thread indexes start from zero.) http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/runtime/query-exec-mgr.cc File be/src/runtime/query-exec-mgr.cc: PS6, Line 59: RETURN_IF_ERROR(Thread::Create("query-exec-mgr", : Substitute("start-query-finstances-$0", PrintId(query_id)), : &QueryExecMgr::StartQueryHelper, this, qs, &t, true)); > Yes, that looks like a bug. I will fix in the next upload. Done http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: PS6, Line 199: // If this fragment instance is not prepared, it must be reporting an error. : DCHECK(fis->IsPrepared() || !status.ok()); > If fis==null case doesn't report the error, then that's a bug we need to fi Looking into a fix. I will also do fault injection at that point to see what happens. The two might be slightly different, because in that case no fragment instances have been started. PS6, Line 331: // Remove fis from fis_map_ and fragment_map_ : fis_map_.erase(fis->instance_id()); : fis_list.pop_back(); > why don't we just do this on the success path? is it to try to be consisten I checked and all other uses of these two things check that the instances_preapred_promise_ is set. So, I moved it to the success path, since that is safe. http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/runtime/thread-resource-mgr.h File be/src/runtime/thread-resource-mgr.h: PS6, Line 115: bool in_callback = false); > this seems complicated. why is it needed now? ReleaseThreadToken calls InvokeCallbacks. The two locations where we use this (hdfs-scan-node.cc and kudu-scan-node.cc) are in functions that are registered as callbacks. If we try to release inside the callback, this is mutual recursion. http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/service/child-query.h File be/src/service/child-query.h: Line 159: Status ExecAsync(std::vector<ChildQuery>&& child_queries); > WARN_UNUSED_RESULT (or are we relying on the Status annotation now?) Added WARN_UNUSED_RESULT. http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: PS6, Line 598: Thread::Create("query-exec-state", "wait-thread", : &ClientRequestState::Wait, this, &wait_thread_); > Good point. I will add it to the list of locations that are eligible for fa Done http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/service/impala-beeswax-server.cc File be/src/service/impala-beeswax-server.cc: PS6, Line 71: Status status; : status = request_state->WaitAsync(); > nit: combine Done http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/util/thread.cc File be/src/util/thread.cc: Line 304: if ((nanos % 100) == 1) { > Good point, will make that change. Changed to use rand() Line 327: thread->swap(t); > *thread = move(t) seems clearer Done -- To view, visit http://gerrit.cloudera.org:8080/7730 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
