Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/20770 )
Change subject: IMPALA-12426: Query History Table ...................................................................... Patch Set 41: (24 comments) http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.h File be/src/service/workload-management.h: http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.h@40 PS41, Line 40: wm Can this renamed with something more explanatory? Perhaps "management" or "history"? http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.h@102 PS41, Line 102: std::string Looks like most columns are primitive types. So this can be replaced with TPrimitiveType. Also clarify in comment if these column types are nullable or not. http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.h@248 PS41, Line 248: if (ctx.record->base_state->stmt_type == TStmtType::SET) { : ctx.sql << "SET"; : } else if (ctx.record->base_state->stmt_type == TStmtType::QUERY) { : ctx.sql << "QUERY"; : } else if (ctx.record->base_state->stmt_type == TStmtType::DML) { : ctx.sql << "DML"; : } else if (ctx.record->base_state->stmt_type == TStmtType::DDL) { : ctx.sql << "DDL"; : } else if (ctx.record->base_state->stmt_type == TStmtType::EXPLAIN) { : ctx.sql << "EXPLAIN"; : } else if (ctx.record->base_state->stmt_type == TStmtType::ADMIN_FN) { : ctx.sql << "ADMIN"; : } else if (ctx.record->base_state->stmt_type == TStmtType::CONVERT) { : ctx.sql << "CONVERT"; : } else if (ctx.record->base_state->stmt_type == TStmtType::LOAD) { : ctx.sql << "LOAD"; : } else if (ctx.record->base_state->stmt_type == TStmtType::TESTCASE) { : ctx.sql << "TESTCASE"; : } else { : ctx.sql << "N/A"; : } This can be feed directly to StringStream sql. This the how generated Types_types.cpp looks like: const char* _kTStmtTypeNames[] = { "QUERY", "DDL", "DML", "EXPLAIN", "LOAD", "SET", "ADMIN_FN", "TESTCASE", "CONVERT", "UNKNOWN" }; const std::map<int, const char*> _TStmtType_VALUES_TO_NAMES(::apache::thrift::TEnumIterator(10, _kTStmtTypeValues, _kTStmtTypeNames), ::apache::thrift::TEnumIterator(-1, nullptr, nullptr)); std::ostream& operator<<(std::ostream& out, const TStmtType::type& val) { std::map<int, const char*>::const_iterator it = _TStmtType_VALUES_TO_NAMES.find(val); if (it != _TStmtType_VALUES_TO_NAMES.end()) { out << it->second; } else { out << static_cast<int>(val); } return out; } std::string to_string(const TStmtType::type& val) { std::map<int, const char*>::const_iterator it = _TStmtType_VALUES_TO_NAMES.find(val); if (it != _TStmtType_VALUES_TO_NAMES.end()) { return std::string(it->second); } else { return std::to_string(static_cast<int>(val)); } } http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc File be/src/service/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@63 PS41, Line 63: if (boost::algorithm::any_of(val.cbegin(), val.cend(), boost::is_any_of("\"';\n"))) { : LOG(ERROR) << "Invalid value for --" << name << ": must not contain quotes or " : "semicolons."; : return false; : } : : if (!val.empty()) return true; : : LOG(ERROR) << "Invalid value for --" << name << ": must not be empty."; : return false; What if this constrained further to only contain alphanumeric and underscore chars? ie., pass regex "^[a-zA-Z0-9_]*$". This validator might be reusable by other backend flag like cluster_id. http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@79 PS41, Line 79: if (val.find_first_of('\'') != string::npos) { : LOG(ERROR) << "Invalid value for --" << name << ": must not contain single quotes."; : return false; : } : : if (val.find_first_of('"') != string::npos) { : LOG(ERROR) << "Invalid value for --" << name << ": must not contain double quotes."; : return false; : } : : if (val.find_first_of('\n') != string::npos) { : LOG(ERROR) << "Invalid value for --" << name << ": must not contain newlines."; : return false; : } : : return true; Is this mainly validate if val is a legal POSIX path? Can this simplified by creating boost::path and check if is_absolute() == True? https://www.boost.org/doc/libs/1_84_0/libs/filesystem/doc/reference.html#path-is_absolute http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@140 PS41, Line 140: DEFINE_int32_hidden(query_log_shutdown_deadline_s, 30, "Number of seconds to wait for " : "the queue of completed queries to be drained to the query log table before timing " : "out and continuing the shutdown process. The completed queries drain process runs " : "after the shutdown process completes, thus the max shutdown time is extended by the " : "value specified in this flag."); Should this be validated against query_log_write_interval_s? query_log_write_interval_s < query_log_shutdown_deadline_s. http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@382 PS41, Line 382: completed_queries_ticker_ = make_unique<TickerSecondsBool>( : FLAGS_query_log_write_interval_s, completed_queries_cv_, completed_queries_lock_); : ABORT_IF_ERROR(completed_queries_ticker_->Start("impala-server", : "completed-queries-ticker")); Should this moved inside if, right after L379? http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@391 PS41, Line 391: lock_guard<mutex> l(completed_queries_threadstate_mu_); Add DCHECK after this statement: DCHECK(completed_queries_thread_state_ != SHUTDOWN); http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@452 PS41, Line 452: } After this loop, DCHECK that COMPLETED_QUERIES_QUEUED value >= size of queries_to_insert. http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@474 PS41, Line 474: ImpaladMetrics::COMPLETED_QUERIES_QUEUED->Increment( : queries_to_insert.size() * -1); DCHECK that COMPLETED_QUERIES_QUEUED >= 0 after this. http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@479 PS41, Line 479: LOG(WARNING) << "failed to write completed queries table=\"" << table_name << : "\" record_count=\"" << queries_to_insert.size() << "\""; COMPLETED_QUERIES_QUEUED is not decremented here. Are they enqueued again and retried later? If yes, please mention that fact in this WARNING log. And what prevent completed_queries_ from being retried indefinitely? http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@481 PS41, Line 481: LOG(WARNING) << ret_status.GetDetail(); Unnecessary extra indentation. http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/util/thread.h File be/src/util/thread.h: http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/util/thread.h@229 PS41, Line 229: /// Can be used to track the lifecycle of a thread. : enum ThreadState { There are some specific *ThreadState defined already such as ScannerThreadState, LIRSThreadState. Maybe this should be kept specific to workload-management (ie., renamed to ManagementTreadState) and moved to workload-management.h http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py File tests/custom_cluster/test_query_log.py: http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@33 PS41, Line 33: class TestQueryLogTable(CustomClusterTestSuite): : """Tests to assert the query log table is correctly populated.""" : : WM_DB = "sys" : QUERY_TBL = "{0}.impala_query_log".format(WM_DB) : PROTOCOL_BEESWAX = ["beeswax"] : PROTOCOL_HS2 = ["hs2"] : PROTOCOL_ALL = [PROTOCOL_BEESWAX[0], PROTOCOL_HS2[0]] : : @CustomClusterTestSuite.with_args(impalad_args="--enable_workload_mgmt " : "--query_log_write_interval_s=1 " : "--cluster_id=test_max_select " : "--shutdown_grace_period_s=10 " : "--shutdown_deadline_s=60", : catalogd_args="--enable_workload_mgmt", : impalad_graceful_shutdown=True) : @pytest.mark.parametrize("client_protocol", PROTOCOL_BEESWAX) : def test_query_log_table_almost_max_select(self, client_protocol): Impala test framework use ImpalaTestMatrix to parameterize its test function parameters generate permutation of them depending on selected exploration strategy (core vs pairwise vs exhaustive). For example, this can be written as follow: class TestQueryLogTableBeeswaxClient(ImpalaTestSuite): @classmethod def add_test_dimensions(cls): super(TestQueryLogTableBeeswaxClient, cls).add_test_dimensions() cls.ImpalaTestMatrix.add_dimension( ImpalaTestDimension('client_protocol', 'beeswax') @CustomClusterTestSuite.with_args(impalad_args="--enable_workload_mgmt " "--query_log_write_interval_s=1 " "--cluster_id=test_max_select " "--shutdown_grace_period_s=10 " "--shutdown_deadline_s=60", catalogd_args="--enable_workload_mgmt", impalad_graceful_shutdown=True) def test_query_log_table_almost_max_select(self, vector): client_protocol = vector.get_value('client_protocol') ... If testing against all client, then the test class can be expressed like this: class TestQueryLogTableAllClients(ImpalaTestSuite): @classmethod def add_test_dimensions(cls): super(TestQueryLogTableAllClients, cls).add_test_dimensions() cls.ImpalaTestMatrix.add_dimension( ImpalaTestDimension('client_protocol', *PROTOCOL_ALL) That being said, this TestQueryLogTable can be split into 3 test classes based on client protocol they are testing: TestQueryLogTableBeeswaxClient, TestQueryLogTableHS2Client, and TestQueryLogTableAllClients. test_query_log_table_query_select can also be contained into its own test class because it has buffer_pool_limit dimension. http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@80 PS41, Line 80: assert data[0] == "16774084" : assert data[1] == "935" Add assert message, mention what indices 0 and 1 are about. http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@168 PS41, Line 168: @pytest.mark.parametrize("client_protocol", PROTOCOL_ALL) : def test_query_log_table_dml(self, client_protocol): : """Asserts the values written to the query log table match the values from the : query profile.""" Can this merged with test_query_log_table_ddl? This looks like a superset of test_query_log_table_ddl. http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@199 PS41, Line 199: SCRATCH_DIR Move this constant above, near class name. http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@411 PS41, Line 411: CACHE_DIR Move this constant above, near class name. http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@431 PS41, Line 431: # Create the test table. : create_tbl_sql = "create table {0} (id INT, product_name STRING) " \ : "partitioned by (category INT)".format(tbl_name) : create_tbl_results = client.execute(create_tbl_sql) : assert create_tbl_results.success : : # Insert some rows into the test table. : insert_sql = "insert into {0} (id,category,product_name) VALUES ".format(tbl_name) : for i in range(1, 11): : for j in range(1, 11): : if i * j > 1: : insert_sql += "," : : random_product_name = "".join(choice(string.ascii_letters) : for _ in range(10)) : insert_sql += "({0},{1},'{2}')".format((i * j), i, random_product_name) : : insert_results = client.execute(insert_sql) : assert insert_results.success Can this test use existing table such as functional.alltypestiny? http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@473 PS41, Line 473: data[42] Explain what data[42] is. http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@835 PS41, Line 835: 7 nit: "timeout=7", for clarity. http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@839 PS41, Line 839: FLUSH_INTERNAL_CLUSTER_ID = "test_query_log_max_records_" + str(int(time())) : QUERY_COUNT = 2 Move these constant above, near class name. Maybe name them specific to test method using it. Also, why use "str(int(time()))" for cluster_id? Can this and other test use fixed cluster_id? http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@908 PS41, Line 908: OTHER_TBL = "completed_queries_table_{0}".format(int(time())) Move constant above, near class name. Maybe name them specific to test method using it. http://gerrit.cloudera.org:8080/#/c/20770/41/tests/util/workload_management.py File tests/util/workload_management.py: http://gerrit.cloudera.org:8080/#/c/20770/41/tests/util/workload_management.py@74 PS41, Line 74: data[index] Array data is returned to caller of assert_query() function. Can data changed from array to map of {column_label: value} for easier inspection by caller? -- To view, visit http://gerrit.cloudera.org:8080/20770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 Gerrit-Change-Number: 20770 Gerrit-PatchSet: 41 Gerrit-Owner: Jason Fehr <[email protected]> Gerrit-Reviewer: Andrew Sherman <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Jason Fehr <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Comment-Date: Mon, 11 Mar 2024 21:33:55 +0000 Gerrit-HasComments: Yes
