Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20762 )

Change subject: IMPALA-12540: Query Live Table
......................................................................


Patch Set 41:

(12 comments)

Will do more thorough review tomorrow. In the meantime, here are some of my 
comments.

http://gerrit.cloudera.org:8080/#/c/20762/41//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20762/41//COMMIT_MSG@19
PS41, Line 19:
             : Query: explain select * from sys.impala_query_live
Consider adding planner test for queries against sys.impala_query_live.

Also add test that show error if user attempt to create sys.impala_query_live 
by themself.


http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/exec/system-table-scan-node.h
File be/src/exec/system-table-scan-node.h:

http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/exec/system-table-scan-node.h@43
PS41, Line 43: NYI
nit: Not Yet Implemented?


http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/exec/system-table-scan-node.cc
File be/src/exec/system-table-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/exec/system-table-scan-node.cc@90
PS41, Line 90: NYI
nit: spell it out.


http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/exec/system-table-scan-node.cc@91
PS41, Line 91: NYI
nit: Replace with

Status(ErrorMsg(TErrorCode::NOT_IMPLEMENTED_ERROR,...));


http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/service/impala-server.h@508
PS41, Line 508: /// Return the number of live queries managed by this server.
              :   size_t NumLiveQueries();
mention that this is implemented in workload-management.cc


http://gerrit.cloudera.org:8080/#/c/20762/41/be/src/service/impala-server.h@1138
PS41, Line 1138: /// Returns a list of completed queries that have not yet been 
written to storage.
               :   std::vector<std::shared_ptr<QueryStateExpanded>> 
GetCompletedQueries();
mention that this is implemented in workload-management.cc


http://gerrit.cloudera.org:8080/#/c/20762/41/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/20762/41/common/thrift/CatalogObjects.thrift@672
PS41, Line 672: QUERY_LIVE
Since QUERY_LIVE name is constant, should this be a const instead?

const string QUERY_LIVE = "impala_query_live"


http://gerrit.cloudera.org:8080/#/c/20762/41/fe/src/main/java/org/apache/impala/catalog/SystemTable.java
File fe/src/main/java/org/apache/impala/catalog/SystemTable.java:

http://gerrit.cloudera.org:8080/#/c/20762/41/fe/src/main/java/org/apache/impala/catalog/SystemTable.java@176
PS41, Line 176:
              :   private static org.apache.hadoop.hive.metastore.api.Table
              :       createMetastoreTable(String dbName, String tableName, 
String owner) {
This is not registering the table to HMS, isn't it?


http://gerrit.cloudera.org:8080/#/c/20762/41/fe/src/main/java/org/apache/impala/planner/SystemTableScanNode.java
File fe/src/main/java/org/apache/impala/planner/SystemTableScanNode.java:

http://gerrit.cloudera.org:8080/#/c/20762/41/fe/src/main/java/org/apache/impala/planner/SystemTableScanNode.java@121
PS41, Line 121: 1024L * 1024L
SQL, PLAN, and EXEC_SUMMARY can be long. Has this take account for that?


http://gerrit.cloudera.org:8080/#/c/20762/41/tests/custom_cluster/test_query_live.py
File tests/custom_cluster/test_query_live.py:

http://gerrit.cloudera.org:8080/#/c/20762/41/tests/custom_cluster/test_query_live.py@24
PS41, Line 24: class TestQueryLive(CustomClusterTestSuite):
Add test querying sys.impala_query_live over cluster with multiple coordinators.


http://gerrit.cloudera.org:8080/#/c/20762/41/tests/custom_cluster/test_query_live.py@27
PS41, Line 27: 
@CustomClusterTestSuite.with_args(impalad_args="--enable_workload_mgmt "
             :                                                  
"--cluster_id=test_query_live_1",
             :                                     
catalogd_args="--enable_workload_mgmt")
Can this and the other test work without workload management enabled?
I guess GetCompletedQueries will return empty list in that case, but live 
queries can still be served?
If it does work, please add EE test for that scenario.


http://gerrit.cloudera.org:8080/#/c/20762/41/tests/custom_cluster/test_query_live.py@33
PS41, Line 33: fite
nit: fit?



--
To view, visit http://gerrit.cloudera.org:8080/20762
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743
Gerrit-Change-Number: 20762
Gerrit-PatchSet: 41
Gerrit-Owner: Michael Smith <[email protected]>
Gerrit-Reviewer: Andrew Sherman <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Wed, 20 Mar 2024 01:01:29 +0000
Gerrit-HasComments: Yes

Reply via email to