Wang Xixu has posted comments on this change. ( http://gerrit.cloudera.org:8080/18846 )
Change subject: [Client] Add query id to trace the whole query process ...................................................................... Patch Set 28: (13 comments) http://gerrit.cloudera.org:8080/#/c/18846/27//COMMIT_MSG Commit Message: PS27: > Could you be more specific in describing the use case with Impala? Where d In Impala, a sql will be translated into a query plan. The query plan contains multiple execution units and will be executed distributly in different hosts of a cluster. There exists an unique query id to trace the execution process of every execution unit. It is useful to analyze which step get stucked or failed, when a slow or failed query happened. Query id can associate all log records in different hosts, which belongs to one query. Kudu is the scan node of a query plan, when defines Kudu table as external table in Impala. And a scanning on a table will be splited into multiple scanning on different tablets. The scanner id is used to trace a scanning on a tablet. But there is a gap between Impala and Kudu. We can't trace the whole scanning process in Kudu of a query from Impala. There is not unique id to associate the execution of all scanners in Kudu. We can trace all query log in Impala, but can't associate with scanning log in Kudu. http://gerrit.cloudera.org:8080/#/c/18846/27//COMMIT_MSG@14 PS27, Line 14: stucked or failed, when a > nit: in this patch, no Impala code is involved, as I can see Done http://gerrit.cloudera.org:8080/#/c/18846/27//COMMIT_MSG@15 PS27, Line 15: Quer > duplicated 'Kudu'? Done http://gerrit.cloudera.org:8080/#/c/18846/27//COMMIT_MSG@16 PS27, Line 16: ne query. > useful in troubleshooting and debugging. Done http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/client-test.cc File src/kudu/client/client-test.cc: http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/client-test.cc@1278 PS27, Line 1278: the speci > the specified Done http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/client-test.cc@1288 PS27, Line 1288: ASSERT_STRINGS_ANY_MATCH(capture_logs.logged_msgs(), : "Query id: .* tablet id:"); : ASSERT_STRINGS_ANY_MATCH(capture_logs.logged_msgs(), : "Query id: .* scanner id:"); > This matches anything, so 'test_query_id' could match this as well. If so, Well, you are right. Here only want to prove that query id is generated when scan without setting a query id. Query id is generated randomly, it is difficult to match precisely. http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/client.h File src/kudu/client/client.h: http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/client.h@2801 PS27, Line 2801: whole scannin > nit: whole scanning process ? Done http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/client.h@2802 PS27, Line 2802: automatic > automatically by the client library code Done http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/client.h@2817 PS27, Line 2817: > nit: drop this extra part? Done http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/client.h@3375 PS27, Line 3375: > nit: whole scanning process ? Done http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/scan_token-test.cc File src/kudu/client/scan_token-test.cc: http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/scan_token-test.cc@598 PS27, Line 598: ASSERT_STRINGS_ANY_MATCH(capture_logs.logged_msgs(), : "Query id: .* tablet id:"); : ASSERT_STRINGS_ANY_MATCH(capture_logs.logged_msgs(), : "Query id: .* scanner id:"); > This matches anything, right? How is this useful then? Here only want to prove that query id is generated when scan without setting a query id. Query id is generated randomly, it is difficult to match precisely. http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/tserver/tablet_server-test.cc File src/kudu/tserver/tablet_server-test.cc: http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/tserver/tablet_server-test.cc@2870 PS27, Line 2870: // Test for a scan with query id in server side. : TEST_F(TabletServerTest, TestScanWithQueryId) { : InsertTestRowsDirect(0, 100); : : ScanRequestPB req; : NewScanRequestPB* scan = req.mutable_new_scan_request(); : scan->set_tablet_id(kTabletId); : req.set_batch_size_bytes(0); // so it won't return data right away : ASSERT_OK(SchemaToColumnPBs(schema_, scan->mutable_projected_columns())); : req.set_query_id("query_id_for_test"); : : StringVectorSink capture_logs; : ScopedRegisterSink reg(&capture_logs); : : ScanResponsePB resp; : RpcController rpc; : // Send the call : { : SCOPED_TRACE(SecureDebugString(req)); : ASSERT_OK(proxy_->Scan(req, &resp, &rpc)); : > What are we trying to verify here? I'm not sure I'm seeing anything specif Ok. Here add a check about log containing string: query_id_for_test. http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/tserver/tablet_service.cc File src/kudu/tserver/tablet_service.cc: http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/tserver/tablet_service.cc@3039 PS27, Line 3039: LOG(INFO) << Substitute("Query id: $0, scanner id: $1.", : req->query_id(), req->scanner_id()); > Why to keep thing along with the trace statement at line 3075? If the scan Here binding query id with scanner id for troubleshooting and debuging, although a lot of log will be produced. -- To view, visit http://gerrit.cloudera.org:8080/18846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9dbae801596726fec1c85ee547128da3179345d9 Gerrit-Change-Number: 18846 Gerrit-PatchSet: 28 Gerrit-Owner: Wang Xixu <[email protected]> Gerrit-Reviewer: Abhishek Chennaka <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Jian Zhang <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy <[email protected]> Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Wang Xixu <[email protected]> Gerrit-Reviewer: Yifan Zhang <[email protected]> Gerrit-Reviewer: Yingchun Lai <[email protected]> Gerrit-Reviewer: Yuqi Du <[email protected]> Gerrit-Comment-Date: Tue, 10 Jan 2023 08:29:44 +0000 Gerrit-HasComments: Yes
