Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/20762 )
Change subject: IMPALA-12540: Query Live Table ...................................................................... Patch Set 41: (32 comments) Upload incoming. I still need to add some of the requested tests. http://gerrit.cloudera.org:8080/#/c/20762/41//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20762/41//COMMIT_MSG@19 PS41, Line 19: : Query: explain select * from sys.impala_query_live > Consider adding planner test for queries against sys.impala_query_live. Done. http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/exec/system-table-scan-node.h File be/src/exec/system-table-scan-node.h: http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/exec/system-table-scan-node.h@40 PS41, Line 40: by fetching more data from metrics. > Nit: this comment does not make sense in this context. Possibly a copy-and Done http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/exec/system-table-scanner.h File be/src/exec/system-table-scanner.h: http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/exec/system-table-scanner.h@42 PS41, Line 42: bool eos() const { return eos_; } > nit: add noexcept? Done http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/exec/system-table-scanner.cc File be/src/exec/system-table-scanner.cc: http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/exec/system-table-scanner.cc@26 PS41, Line 26: #include "runtime/decimal-value.h" > unused Done http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/exec/system-table-scanner.cc@27 PS41, Line 27: #include "runtime/decimal-value.inline.h" > unused Done http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/exec/system-table-scanner.cc@71 PS41, Line 71: MaterializeNextRow > WriteStringSlot? Done http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/service/fe-support.cc File be/src/service/fe-support.cc: http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/service/fe-support.cc@655 PS41, Line 655: for (const auto& it : membership_snapshot->current_backends) { : if (it.second.has_is_coordinator() && it.second.is_coordinator()) { : VLOG_QUERY << "Found coordinator " : << it.second.address().hostname() << ":" << it.second.address().port(); : coordinators.emplace_back(FromNetworkAddressPB(it.second.address())); : } : } > What do you think about pushing this to ClusterMembershipMgr or ClusterMemb Makes sense, I've moved it there. http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/service/impala-server.h@509 PS41, Line 509: size_t NumLiveQueries(); > Please mark this function const noexcept. It's not though. It has to take a lock (which is not marked mutable). Locking can throw an exception. http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/service/impala-server.h@508 PS41, Line 508: /// Return the number of live queries managed by this server. : size_t NumLiveQueries(); > mention that this is implemented in workload-management.cc Done http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/service/impala-server.h@1138 PS41, Line 1138: /// Returns a list of completed queries that have not yet been written to storage. > Please document that this function makes a copy of the completed queries qu Done http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/service/impala-server.h@1139 PS41, Line 1139: std::vector<std::shared_ptr<QueryStateExpanded>> GetCompletedQueries(); > Please mark this function const noexcept. The completed_queries_lock_ will Do we really want to do that? lock() can throw an exception. We do occasionally mark locks as mutable, but not consistently. What's the rationale for doing so here? http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/service/impala-server.h@1138 PS41, Line 1138: /// Returns a list of completed queries that have not yet been written to storage. : std::vector<std::shared_ptr<QueryStateExpanded>> GetCompletedQueries(); > mention that this is implemented in workload-management.cc Done http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/util/sharded-query-map-util.h File be/src/util/sharded-query-map-util.h: http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/util/sharded-query-map-util.h@59 PS41, Line 59: size_t Count() { > Can this function be marked const noexcept? It acquires a lock, which is not const or noexcept. http://gerrit.cloudera.org:8080/#/c/20762/41/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/20762/41/common/thrift/CatalogObjects.thrift@65 PS41, Line 65: SYSTEM_TABLE = 8 > Add a brief comment Done http://gerrit.cloudera.org:8080/#/c/20762/41/common/thrift/CatalogObjects.thrift@672 PS41, Line 672: QUERY_LIVE > Add a brief comment Done http://gerrit.cloudera.org:8080/#/c/20762/41/common/thrift/StatestoreService.thrift File common/thrift/StatestoreService.thrift: http://gerrit.cloudera.org:8080/#/c/20762/41/common/thrift/StatestoreService.thrift@207 PS41, Line 207: // A list of network addresses : struct TAddressesList { : 1: required list<Types.TNetworkAddress> addresses; : } > Move this to Types.thrift? Done http://gerrit.cloudera.org:8080/#/c/20762/41/common/thrift/SystemTables.thrift File common/thrift/SystemTables.thrift: http://gerrit.cloudera.org:8080/#/c/20762/41/common/thrift/SystemTables.thrift@21 PS41, Line 21: enum TQueryColumn { > nit: I think TSystemTableColumn is a better fit. SystemTable is a generic concept that may be used for other things (like a MetricScanner). Maybe TQueryTableColumn? http://gerrit.cloudera.org:8080/#/c/20762/41/common/thrift/SystemTables.thrift@21 PS41, Line 21: enum TQueryColumn { > Add a note that this must be kept in sync with workload-management-fields.c Done http://gerrit.cloudera.org:8080/#/c/20762/41/fe/src/main/java/org/apache/impala/catalog/Db.java File fe/src/main/java/org/apache/impala/catalog/Db.java: http://gerrit.cloudera.org:8080/#/c/20762/41/fe/src/main/java/org/apache/impala/catalog/Db.java@134 PS41, Line 134: "sys" > Nit: should we have a constant for this? Yeah, added one. http://gerrit.cloudera.org:8080/#/c/20762/41/fe/src/main/java/org/apache/impala/catalog/SystemTable.java File fe/src/main/java/org/apache/impala/catalog/SystemTable.java: http://gerrit.cloudera.org:8080/#/c/20762/41/fe/src/main/java/org/apache/impala/catalog/SystemTable.java@52 PS41, Line 52: private static Map<String, TSystemTableName> SYSTEM_TABLE_NAME_MAP = > could be final? Done http://gerrit.cloudera.org:8080/#/c/20762/41/fe/src/main/java/org/apache/impala/catalog/SystemTable.java@52 PS41, Line 52: private static Map<String, TSystemTableName> SYSTEM_TABLE_NAME_MAP = : ImmutableMap.of(QUERY_LIVE, TSystemTableName.QUERY_LIVE); > And assert map size is 1 This is an ImmutableMap. I'm confused by your comment Riza. It seems correct by definition. And the test I've added to PlannerTest won't work if they're wrong. http://gerrit.cloudera.org:8080/#/c/20762/41/fe/src/main/java/org/apache/impala/catalog/SystemTable.java@184 PS41, Line 184: String, String > unnecessary? They both appear to be required. I was mimicking CatalogOpExecutor::createMetaStoreTable. http://gerrit.cloudera.org:8080/#/c/20762/41/fe/src/main/java/org/apache/impala/catalog/SystemTable.java@186 PS41, Line 186: FieldSchema > unnecessary? Required. http://gerrit.cloudera.org:8080/#/c/20762/41/fe/src/main/java/org/apache/impala/planner/SystemTableScanNode.java File fe/src/main/java/org/apache/impala/planner/SystemTableScanNode.java: http://gerrit.cloudera.org:8080/#/c/20762/41/fe/src/main/java/org/apache/impala/planner/SystemTableScanNode.java@21 PS41, Line 21: import java.util.Map; > unused Done http://gerrit.cloudera.org:8080/#/c/20762/41/fe/src/main/java/org/apache/impala/planner/SystemTableScanNode.java@23 PS41, Line 23: import org.slf4j.Logger; > unused Done http://gerrit.cloudera.org:8080/#/c/20762/41/fe/src/main/java/org/apache/impala/planner/SystemTableScanNode.java@24 PS41, Line 24: import org.slf4j.LoggerFactory; > unused Done http://gerrit.cloudera.org:8080/#/c/20762/41/fe/src/main/java/org/apache/impala/planner/SystemTableScanNode.java@43 PS41, Line 43: import org.apache.impala.thrift.TSystemTableName; > unused Done http://gerrit.cloudera.org:8080/#/c/20762/41/fe/src/main/java/org/apache/impala/planner/SystemTableScanNode.java@46 PS41, Line 46: import com.google.common.collect.ImmutableMap; > unused Done http://gerrit.cloudera.org:8080/#/c/20762/41/tests/custom_cluster/test_query_live.py File tests/custom_cluster/test_query_live.py: http://gerrit.cloudera.org:8080/#/c/20762/41/tests/custom_cluster/test_query_live.py@30 PS41, Line 30: vector > unused Done http://gerrit.cloudera.org:8080/#/c/20762/41/tests/custom_cluster/test_query_live.py@33 PS41, Line 33: fite > nit: fit? Done http://gerrit.cloudera.org:8080/#/c/20762/41/tests/custom_cluster/test_query_live.py@36 PS41, Line 36: assert_query('sys.impala_query_live', self.client, 'test_query_live_1', > Is there any race where the data about the query can get written before we It includes currently running queries - which will be true about itself - as well as queries that have completed but not been written to impala_query_log in a batch. Batch interval is controlled by query_log_write_interval_s, which defaults to 300s. It starts when the cluster is started, so we have 5 minutes to start the cluster and run this test before results start getting written out. http://gerrit.cloudera.org:8080/#/c/20762/41/tests/custom_cluster/test_query_live.py@64 PS41, Line 64: vector > unused Done -- To view, visit http://gerrit.cloudera.org:8080/20762 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743 Gerrit-Change-Number: 20762 Gerrit-PatchSet: 41 Gerrit-Owner: Michael Smith <[email protected]> Gerrit-Reviewer: Andrew Sherman <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Jason Fehr <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Comment-Date: Wed, 20 Mar 2024 23:53:39 +0000 Gerrit-HasComments: Yes
