Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/17872 )
Change subject: IMPALA-10811 RPC to submit query getting stuck for AWS NLB forever ...................................................................... Patch Set 14: (4 comments) SessionState is not one to one with a SQL statement execution. A SessionState corresponds to multiple SQL statements, both sequentially and simultaneously. I think we're a level too high. We're a level above ClientRequestState, which is one-to-one with SQL statement execution. The lifecycle of the SessionState object is not a good fit for per-statement things. I think the approach that ClientRequestState::ExecAsyncQueryOrDmlRequest() takes is the right level. It keeps the thread on the ClientRequestState, which makes queries independent from each other. http://gerrit.cloudera.org:8080/#/c/17872/14/be/src/service/impala-hs2-server.cc File be/src/service/impala-hs2-server.cc: http://gerrit.cloudera.org:8080/#/c/17872/14/be/src/service/impala-hs2-server.cc@600 PS14, Line 600: { : unique_lock<mutex> unique_lock(session->lock); : session->execute_statement_info.return_val = return_val; : session->execute_statement_info.state = : SessionState::ExecuteStatementInfo::ExecutionState::DONE; : } : session->execute_statement_info.cv.NotifyAll(); ExecuteStatementCommonInternal() is being used from the synchronous and the asynchronous case. One issue is that the synchronous case is not using the ExecuteStatementInfo, but it still does these statements to manipulate it. I think child queries could cause problems for their parent queries. The HS2_NOTIFY_AND_RETURN_IF_ERROR macro also manipulate the ExecuteStatementInfo. http://gerrit.cloudera.org:8080/#/c/17872/14/be/src/service/impala-hs2-server.cc@669 PS14, Line 669: if (session->execute_statement_info.state < target_state) { : session->execute_statement_info.cv.Wait(unique_lock); : } When waiting for a particular state, you want to double-check that you actually ended up in that state. If you know you can't be woken up without going to the right state, then assert it. wait() assert(state == desired_state) If you think you could be woken up without a state transition, do a while loop: while (state != desired_state) { wait() } http://gerrit.cloudera.org:8080/#/c/17872/14/be/src/service/impala-hs2-server.cc@682 PS14, Line 682: session->execute_statement_info.thread.release(); unique_ptr::release() won't delete the object that the unique_ptr is currently pointing to. The caller is responsible for freeing the object returned by release(). https://en.cppreference.com/w/cpp/memory/unique_ptr/release What you want is unique_ptr::reset(). http://gerrit.cloudera.org:8080/#/c/17872/14/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/17872/14/be/src/service/impala-server.h@663 PS14, Line 663: ExecuteStatementInfo execute_statement_info; I think we have a cardinality problem here. A SessionState is not a one-to-one relationship to a SQL statement execution. A session can have multiple SQL statements in flight simultaneously. That's already a problem with child queries, but this is also true in general apart from child queries. ClientRequestState is one-to-one with a SQL statement execution. -- To view, visit http://gerrit.cloudera.org:8080/17872 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib57e86926a233ef13d27a9ec8d9c36d33a88a44e Gerrit-Change-Number: 17872 Gerrit-PatchSet: 14 Gerrit-Owner: Qifan Chen <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Qifan Chen <[email protected]> Gerrit-Comment-Date: Thu, 07 Oct 2021 05:03:16 +0000 Gerrit-HasComments: Yes
