Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17872 )

Change subject: IMPALA-10811 RPC to submit query getting stuck for AWS NLB 
forever
......................................................................


Patch Set 20:

(13 comments)

Address all comments except for the one that runs CREATE TABLE AS SELECT in 
async.

http://gerrit.cloudera.org:8080/#/c/17872/15/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/17872/15/be/src/service/client-request-state.cc@720
PS15, Line 720:
              :   // Set the results to be reported to the client.
              :   SetResultSet(catalog_op_executor_->ddl_exec_response());
              :   return Status::OK();
              : }
              :
              : void ClientRequestState::ExecDdlRequestImplAsync() {
              :   string op_type = catalog_op_type() == TCatalogOpType::DDL ?
              :
> I hacked a bit on this, and the modifications are not that big. This is wha
Thanks a lot Joe for this.

I'll integrate your change in a separate patch set for this JIRA.


http://gerrit.cloudera.org:8080/#/c/17872/17/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/17872/17/be/src/service/client-request-state.cc@643
PS17, Line 643: DdlRequ
> Nit: We have an unrelated feature called "sync_ddl". If we can avoid simila
Done. Change the naming slightly to make it clearer. 

Impl for implementation

ExecDdlRequestImplAsync()
ExecDdlRequestImplSync()
ShouldRunExecDdlAsync()


http://gerrit.cloudera.org:8080/#/c/17872/17/be/src/service/client-request-state.cc@688
PS17, Line 688:   Status status = 
catalog_op_executor_->Exec(exec_request_->catalog_op_request);
              :   {
> Same as other DebugAction location. Same fixes.
This debug action was removed since it is in the sync path.


http://gerrit.cloudera.org:8080/#/c/17872/17/be/src/service/client-request-state.cc@728
PS17, Line 728:
> For this async code, I think we are better off with a void return value, be
Done


http://gerrit.cloudera.org:8080/#/c/17872/17/be/src/service/client-request-state.cc@735
PS17, Line 735:   Status status = 
catalog_op_executor_->Exec(exec_request_->catalog_op_request);
              :   {
> For here, I think you can use DebugActionNoFail(), since we are using this
Done


http://gerrit.cloudera.org:8080/#/c/17872/17/be/src/service/client-request-state.cc@744
PS17, Line 744:       exec_request_->query_options.sync_ddl);
              :   {
              :     lock_guard<mutex> l(lock_);
> An important thing here is that nothing is actually using the return value
Good point. Done


http://gerrit.cloudera.org:8080/#/c/17872/17/be/src/service/client-request-state.cc@778
PS17, Line 778:       RETURN_IF_ERROR(Thread::Create("impala-server", 
"async_exec_thread_",
              :           &ClientRequestState::ExecDdlRequestImplAsync, this, 
&async_exec_thread
> Please go look at the comment that I made on the previous upload about ABOR
Done


http://gerrit.cloudera.org:8080/#/c/17872/17/be/src/service/client-request-state.cc@781
PS17, Line 781:     return Status::OK();
> Do this before spawning the thread (similar to how ExecAsyncQueryOrDmlReque
Okay. Changed the order and set the state to PENDING to account for the 
possibility of failure to fork the thread.


http://gerrit.cloudera.org:8080/#/c/17872/19/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/17872/19/be/src/service/client-request-state.cc@643
PS19, Line 643: Status ClientRequestState::ExecDdlRequestImplSync() {
> LOAD DATA might also reset metadata through CatalogOpExecutor::Exec which m
Interesting.

By looking at the code, it seems the real load data is done in java 
(https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/service/Frontend.java#L748).

After that, the meta-data refresh is done with Catalog Op Exec call 
(https://github.com/apache/impala/blob/master/be/src/service/client-request-state.cc#L259).

Looks like some more research is needed to correctly handle AWS NLB-type 
timeout for data loading in general. My guess is that frontend_->LoadData() 
probably should be running in a thread too.

For this JIRA, I will document the limitation in the commit message and file a 
new JIRA.


http://gerrit.cloudera.org:8080/#/c/17872/19/be/src/service/client-request-state.cc@776
PS19, Line 776:     {
> I think earlier this was not being counted for 'EXEC_TIME_LIMIT_S', but aft
Did a test with CREAT TABLE with execution time limit and found the limit is 
tracked by admission control regardless of how the ddl is executed. See 
https://github.com/apache/impala/blob/master/be/src/service/impala-server.cc#L1336,
 where queries_by_timestamp_ is updated with the DDL query.

Thus I think we are OK here.


http://gerrit.cloudera.org:8080/#/c/17872/19/be/src/service/client-request-state.cc@779
PS19, Line 779:           &ClientRequestState::ExecDdlRequestImplAsync, this, 
&async_exec_thread_));
> Thread spawned just above (Line 776) might finish off before the execution
Done. See my reply to Joe's comment in this area.


http://gerrit.cloudera.org:8080/#/c/17872/19/tests/metadata/test_ddl.py
File tests/metadata/test_ddl.py:

http://gerrit.cloudera.org:8080/#/c/17872/19/tests/metadata/test_ddl.py@913
PS19, Line 913:     
cls.ImpalaTestMatrix.add_dimension(create_client_protocol_dimension())
              :     
cls.ImpalaTestMatrix.add_dimension(create_exec_option_dimension(
              :         sync_ddl=[0]))
> Test dimensions run tests multiple times with different configurations. If
Good to know!

Done.


http://gerrit.cloudera.org:8080/#/c/17872/19/tests/metadata/test_ddl.py@921
PS19, Line 921: dding/dropping of .jar and .so in the lib cache.
> You can omit multiple_impalad.
Done



--
To view, visit http://gerrit.cloudera.org:8080/17872
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib57e86926a233ef13d27a9ec8d9c36d33a88a44e
Gerrit-Change-Number: 17872
Gerrit-PatchSet: 20
Gerrit-Owner: Qifan Chen <[email protected]>
Gerrit-Reviewer: Amogh Margoor <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Comment-Date: Tue, 12 Oct 2021 15:49:48 +0000
Gerrit-HasComments: Yes

Reply via email to