Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/17872 )
Change subject: IMPALA-10811 RPC to submit query getting stuck for AWS NLB forever ...................................................................... Patch Set 29: (5 comments) I want this code review to converge and get merged. What I'm looking for is for the next upload to be a "clean" upload. Address every code review comment completely. No additional changes. http://gerrit.cloudera.org:8080/#/c/17872/21/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/17872/21/be/src/service/client-request-state.cc@713 PS21, Line 713: : // If this is a CTAS request, there will usually be more work to do : // after executing the CREATE TABLE statement (the INSERT portion of the operation). : // The exception is if the user specified IF NOT EXISTS and the table already : // existed, in which case we do not execute the INSERT. : i > I see. Thanks a lot for the description. I'm glad that you changed out of the INITIALIZED state before the end of Exec(). Now, the only thing remaining is to solve the race condition. :-) I understand your argument completely, and I still disagree with you. My code review comments stand. I realize that this is not the answer that you want. Nonetheless, this is a code review, and I need you to respect my decision. http://gerrit.cloudera.org:8080/#/c/17872/29/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/17872/29/be/src/service/client-request-state.cc@700 PS29, Line 700: // For any non-CTAS DDLs, transition to RUNNING. For CTAS DDLs, transition : // to RUNNING during FinishExecQueryOrDmlRequest() called by ExecQueryOrDmlRequest(). : if (!is_CTAS) UpdateNonErrorExecState(ExecState::RUNNING); I am going to write this comment once more: The state transition for non-CTAS should go directly from INITIALIZED to RUNNING with no time spent in PENDING. The transition should occur on the main thread prior to spawning the async thread. See my previous comment for the if/then/else structure that I'm looking for. http://gerrit.cloudera.org:8080/#/c/17872/29/tests/hs2/test_hs2.py File tests/hs2/test_hs2.py: http://gerrit.cloudera.org:8080/#/c/17872/29/tests/hs2/test_hs2.py@368 PS29, Line 368: execute_statement_resp = self.execute_statement( : "create table {0}.alltypes_clone as select * from \ : functional_parquet.alltypes".format(unique_database)) Take a timestamp before this statement and after, then take the diff and make sure it is less than 5 seconds (or 3 seconds or something). http://gerrit.cloudera.org:8080/#/c/17872/29/tests/metadata/test_ddl.py File tests/metadata/test_ddl.py: http://gerrit.cloudera.org:8080/#/c/17872/29/tests/metadata/test_ddl.py@903 PS29, Line 903: # IMPALA-10811: RPC to submit query getting stuck for AWS NLB forever : # Test HS2, Beeswax and HS2-HTTP three clients. : class TestAsyncDDL(TestDdlBase): : @classmethod : def get_workload(self): : return 'functional-query' : : @classmethod : def add_test_dimensions(cls): : super(TestAsyncDDL, cls).add_test_dimensions() : cls.ImpalaTestMatrix.add_dimension(create_client_protocol_dimension()) : cls.ImpalaTestMatrix.add_dimension(create_exec_option_dimension( : sync_ddl=[0])) : : def test_async_ddl(self, vector, unique_database): : self.run_test_case('QueryTest/async_ddl', vector, use_db=unique_database) Create an test in tests/hs2/test_hs2.py that does the equivalent of test_get_operation_status_for_async_ddl() for the alter table case we are doing here. The test_hs2.py tests are more powerful, because they can test the state transitions directly (and the time it takes to do Exec()). If we test CTAS and non-CTAS in that way, then this test is no longer necessary. Remove this test. http://gerrit.cloudera.org:8080/#/c/17872/29/tests/metadata/test_metadata_query_statements.py File tests/metadata/test_metadata_query_statements.py: http://gerrit.cloudera.org:8080/#/c/17872/29/tests/metadata/test_metadata_query_statements.py@153 PS29, Line 153: def test_async_ddl_with_JDBC(self, vector, unique_database): : self.exec_with_jdbc("drop table if exists {0}.test_table".format(unique_database)) : self.exec_with_jdbc_and_compare_result( : "create table {0}.test_table(a int)".format(unique_database), : "'Table has been created.'") : : self.exec_with_jdbc("drop table if exists {0}.alltypes_clone".format(unique_database)) : self.exec_with_jdbc_and_compare_result( : "create table {0}.alltypes_clone as select * from\ : functional_parquet.alltypes".format(unique_database), : "'Inserted 7300 row(s)'") Metadata query statements are things like "describe {table_name}" and "describe database {database_name}". This is not doing metadata query statements, so this is not the place for this test. I don't think this provides additional value beyond the other tests that we have. JDBC is using HS2 under the covers just like Impyla. Remove this test and the corresponding changes in tests/common/impala_test_suite.py. -- To view, visit http://gerrit.cloudera.org:8080/17872 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib57e86926a233ef13d27a9ec8d9c36d33a88a44e Gerrit-Change-Number: 17872 Gerrit-PatchSet: 29 Gerrit-Owner: Qifan Chen <[email protected]> Gerrit-Reviewer: Amogh Margoor <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Qifan Chen <[email protected]> Gerrit-Comment-Date: Tue, 19 Oct 2021 18:59:43 +0000 Gerrit-HasComments: Yes
