Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/23811 )
Change subject: IMPALA-14641: Add detailed operation page for Catalogd operations ...................................................................... Patch Set 5: (15 comments) http://gerrit.cloudera.org:8080/#/c/23811/5/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/23811/5/be/src/catalog/catalog-server.cc@1579 PS5, Line 1579: string CatalogServer::FormatTimelineFromJson(const string& timeline_json) { Processing the thrift JSON string is cumbersome. We can do things easier by processing TEventSequence directly. Replied in the comment of JniCatalog.thrift. http://gerrit.cloudera.org:8080/#/c/23811/5/be/src/catalog/catalog-server.cc@1693 PS5, Line 1693: if (PrintId(record.query_id) == query_id_str && record.thread_id == thread_id) { I just realized (query_id, thread_id) can be duplicated for different RPCs if they are from the same query and happen to be processed by the same thrift service thread. E.g. in the new test, two operations of CTAS have the same query id and thread id. So using (query_id, thread_id) as the key for in-flight RPCs is OK. But it's not enough for finished RPCs. We can add startTime in the request URL to distinguish them. http://gerrit.cloudera.org:8080/#/c/23811/5/common/thrift/JniCatalog.thrift File common/thrift/JniCatalog.thrift: http://gerrit.cloudera.org:8080/#/c/23811/5/common/thrift/JniCatalog.thrift@916 PS5, Line 916: 12: optional string timeline We convert TEventSequence to JSON string in frontend and then convert it to human-readable string in backend. I think we can pass TEventSequence to the backend directly and convert it to human-readable string by this code: https://github.com/apache/impala/blob/3e29f8aaa8f1a02630505866de45a918918aa216/be/src/util/runtime-profile.cc#L1574-L1586 We can extract this code into a util method, e.g. PrettyPrintTimeline(ostream* s, const string& prefix, const string& name, int64_t duration, vector<pair<string, int64_t>>& events). Both RuntimeProfile::EventSequence and TEventSequence can generate parameters to use this method. http://gerrit.cloudera.org:8080/#/c/23811/5/fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOperationTracker.java File fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOperationTracker.java: http://gerrit.cloudera.org:8080/#/c/23811/5/fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOperationTracker.java@91 PS5, Line 91: public RpcKey(TUniqueId queryId, long threadId) { : queryId_ = queryId; : threadId_ = threadId; : } I think we don't need this since all the usages on this are still in the same thread. http://gerrit.cloudera.org:8080/#/c/23811/5/fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOperationTracker.java@168 PS5, Line 168: // If not in-flight, try to find in finished operations When will we update a finished record? We'd better avoid it. http://gerrit.cloudera.org:8080/#/c/23811/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/23811/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@449 PS5, Line 449: public TDdlExecResponse execDdlRequest(TDdlExecRequest ddlRequest) nit: add @VisibleForTesting annotaion since this is only used by tests now. http://gerrit.cloudera.org:8080/#/c/23811/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@451 PS5, Line 451: "Catalog Server Operation" nit: use CATALOG_TIMELINE_NAME http://gerrit.cloudera.org:8080/#/c/23811/3/fe/src/main/java/org/apache/impala/service/JniCatalog.java File fe/src/main/java/org/apache/impala/service/JniCatalog.java: http://gerrit.cloudera.org:8080/#/c/23811/3/fe/src/main/java/org/apache/impala/service/JniCatalog.java@319 PS3, Line 319: * Gets the current catalog version. > Thanks for the detailed breakdown of the steps. However, I just wanted to a TheadLocal variables are harder to maintain, e.g., lifecycle management needs more attention, dependencies are hidden, etc. As we can simply use method parameters here, no needs to use ThreadLocal variables. http://gerrit.cloudera.org:8080/#/c/23811/5/fe/src/main/java/org/apache/impala/service/JniCatalog.java File fe/src/main/java/org/apache/impala/service/JniCatalog.java: http://gerrit.cloudera.org:8080/#/c/23811/5/fe/src/main/java/org/apache/impala/service/JniCatalog.java@352 PS5, Line 352: "Catalog Server Operation" The purpose of creating the catalogTimeline here is to add the TCatalogOpRecord in CatalogOperationTracker and also add this timeline object with it. So we don't need a separated map (inFlightTimelines_) in CatalogOperationTracker. nit: use CATALOG_TIMELINE_NAME instead. http://gerrit.cloudera.org:8080/#/c/23811/5/fe/src/main/java/org/apache/impala/service/JniCatalog.java@354 PS5, Line 354: () -> { : return catalogOpExecutor_.execDdlRequest(params, catalogTimeline); : } nit: simply it to () -> catalogOpExecutor_.execDdlRequest(params, catalogTimeline) http://gerrit.cloudera.org:8080/#/c/23811/5/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/23811/5/tests/webserver/test_web_pages.py@1209 PS5, Line 1209: .format(unique_database)) Get the query id from the result and verify it matches what we get from the WebUI. res = self.execute_query(...) query_id = res.query_id http://gerrit.cloudera.org:8080/#/c/23811/5/tests/webserver/test_web_pages.py@1235 PS5, Line 1235: query_id, thread_id)) We just verify one operation. Please verify both of them. http://gerrit.cloudera.org:8080/#/c/23811/5/tests/webserver/test_web_pages.py@1259 PS5, Line 1259: new fields nit: These are new fields added by this commit. But not new fields after this commit is merged. Please remove it. http://gerrit.cloudera.org:8080/#/c/23811/5/tests/webserver/test_web_pages.py@1326 PS5, Line 1326: handle = self.client.execute_async(refresh_stmt) Get the query id from 'handle' and verify it matches what we get from the WebUI. query_id = self.client.handle_id(handle) http://gerrit.cloudera.org:8080/#/c/23811/3/www/operation_detail.tmpl File www/operation_detail.tmpl: http://gerrit.cloudera.org:8080/#/c/23811/3/www/operation_detail.tmpl@133 PS3, Line 133: outDiv.appendChild(ul); : } else { : var div = document.createElement('div'); : div.style.wordBreak = 'break-word'; : div.textContent = detailsText; : outDiv.appendChild(div); : } : }); : </script> : </td> : </tr> : </table> : </div> : </div> : </div> : : {{?timeline}} : <div class="panel panel-info"> : <div class="card"> : <div class="card-header"> : <h5 class="card-title">Execution Timeline</h5> : </div> : <div class="card-body"> : <pre style="background-color: #f5f5f5; padding: 15px; border-radius: 5px; max-height: 500px; overflow-y: auto; font-family: monospace; white-space: pre;">{{timeline}}</pre> : </div> : </div> : </div> : {{/timeline}} : : <div style="margin-top: 20px;"> : <a href="/operations" class="btn btn-primary">Back to Operations List</a> : </div> : {{/error}} : : {{> www/common-footer.tmpl }} : : : : : : : : : : : : : : : : : : : : : > Done I see you wrote a C++ implementation of this. I think we don't need it. Replied in the comment of JniCatalog.thrift. -- To view, visit http://gerrit.cloudera.org:8080/23811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib80ce3b5873672065b04b00a21d3419a1db0969c Gerrit-Change-Number: 23811 Gerrit-PatchSet: 5 Gerrit-Owner: Arnab Karmakar <[email protected]> Gerrit-Reviewer: Arnab Karmakar <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Surya Hebbar <[email protected]> Gerrit-Comment-Date: Fri, 30 Jan 2026 08:53:13 +0000 Gerrit-HasComments: Yes
