Wang Xixu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18846 )

Change subject: [Client] Add query id to trace the whole query process
......................................................................


Patch Set 28:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/18846/27//COMMIT_MSG
Commit Message:

PS27:
> Could you be more specific in describing the use case with Impala?  Where d
In Impala, a sql will be translated into a query plan. The
query plan contains multiple execution units and will be
executed distributly in different hosts of a cluster. There
exists an unique query id to trace the execution process of
every execution unit. It is useful to analyze which step get
stucked or failed, when a slow or failed query happened.
Query id can associate all log records in different hosts,
which belongs to one query.

Kudu is the scan node of a query plan, when defines Kudu table
as external table in Impala. And a scanning on a table will be
splited into multiple scanning on different tablets. The scanner
id is used to trace a scanning on a tablet.

But there is a gap between Impala and Kudu. We can't trace the
whole scanning process in Kudu of a query from Impala. There is
not unique id to associate the execution of all scanners in Kudu.

We can trace all query log in Impala, but can't associate with scanning log in 
Kudu.


http://gerrit.cloudera.org:8080/#/c/18846/27//COMMIT_MSG@14
PS27, Line 14: stucked or failed, when a
> nit: in this patch, no Impala code is involved, as I can see
Done


http://gerrit.cloudera.org:8080/#/c/18846/27//COMMIT_MSG@15
PS27, Line 15: Quer
> duplicated 'Kudu'?
Done


http://gerrit.cloudera.org:8080/#/c/18846/27//COMMIT_MSG@16
PS27, Line 16: ne query.
> useful in troubleshooting and debugging.
Done


http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/client-test.cc@1278
PS27, Line 1278: the speci
> the specified
Done


http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/client-test.cc@1288
PS27, Line 1288:   ASSERT_STRINGS_ANY_MATCH(capture_logs.logged_msgs(),
               :                            "Query id: .* tablet id:");
               :   ASSERT_STRINGS_ANY_MATCH(capture_logs.logged_msgs(),
               :                            "Query id: .* scanner id:");
> This matches anything, so 'test_query_id' could match this as well. If so,
Well, you are right. Here only want to prove that query id is generated when 
scan without setting a query id. Query id is generated randomly, it is 
difficult to match precisely.


http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/client.h
File src/kudu/client/client.h:

http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/client.h@2801
PS27, Line 2801: whole scannin
> nit: whole scanning process ?
Done


http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/client.h@2802
PS27, Line 2802: automatic
> automatically by the client library code
Done


http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/client.h@2817
PS27, Line 2817:
> nit: drop this extra part?
Done


http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/client.h@3375
PS27, Line 3375:
> nit: whole scanning process ?
Done


http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/scan_token-test.cc
File src/kudu/client/scan_token-test.cc:

http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/client/scan_token-test.cc@598
PS27, Line 598:     ASSERT_STRINGS_ANY_MATCH(capture_logs.logged_msgs(),
              :                              "Query id: .* tablet id:");
              :     ASSERT_STRINGS_ANY_MATCH(capture_logs.logged_msgs(),
              :                              "Query id: .* scanner id:");
> This matches anything, right?  How is this useful then?
Here only want to prove that query id is generated when scan without setting a 
query id. Query id is generated randomly, it is difficult to match precisely.


http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/tserver/tablet_server-test.cc
File src/kudu/tserver/tablet_server-test.cc:

http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/tserver/tablet_server-test.cc@2870
PS27, Line 2870: // Test for a scan with query id in server side.
               : TEST_F(TabletServerTest, TestScanWithQueryId) {
               :   InsertTestRowsDirect(0, 100);
               :
               :   ScanRequestPB req;
               :   NewScanRequestPB* scan = req.mutable_new_scan_request();
               :   scan->set_tablet_id(kTabletId);
               :   req.set_batch_size_bytes(0); // so it won't return data 
right away
               :   ASSERT_OK(SchemaToColumnPBs(schema_, 
scan->mutable_projected_columns()));
               :   req.set_query_id("query_id_for_test");
               :
               :   StringVectorSink capture_logs;
               :   ScopedRegisterSink reg(&capture_logs);
               :
               :   ScanResponsePB resp;
               :   RpcController rpc;
               :   // Send the call
               :   {
               :     SCOPED_TRACE(SecureDebugString(req));
               :     ASSERT_OK(proxy_->Scan(req, &resp, &rpc));
               :
> What are we trying to verify here?  I'm not sure I'm seeing anything specif
Ok. Here add a check about log containing string: query_id_for_test.


http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/18846/27/src/kudu/tserver/tablet_service.cc@3039
PS27, Line 3039:   LOG(INFO) << Substitute("Query id: $0, scanner id: $1.",
               :                           req->query_id(), req->scanner_id());
> Why to keep thing along with the trace statement at line 3075?  If the scan
Here binding query id with scanner id for troubleshooting and debuging, 
although a lot of log will be produced.



--
To view, visit http://gerrit.cloudera.org:8080/18846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9dbae801596726fec1c85ee547128da3179345d9
Gerrit-Change-Number: 18846
Gerrit-PatchSet: 28
Gerrit-Owner: Wang Xixu <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Jian Zhang <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <[email protected]>
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Wang Xixu <[email protected]>
Gerrit-Reviewer: Yifan Zhang <[email protected]>
Gerrit-Reviewer: Yingchun Lai <[email protected]>
Gerrit-Reviewer: Yuqi Du <[email protected]>
Gerrit-Comment-Date: Tue, 10 Jan 2023 08:29:44 +0000
Gerrit-HasComments: Yes

Reply via email to