Yingchun Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/12516 )
Change subject: [tools] Add tool 'perf table_scan' ...................................................................... Patch Set 8: (16 comments) http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/kudu-tool-test.cc File src/kudu/tools/kudu-tool-test.cc: http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/kudu-tool-test.cc@410 PS7, Line 410: int64_t value = min_value; > Looks like every iteration of the loop winds up erasing 'line', which is al Done http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/kudu-tool-test.cc@416 PS7, Line 416: auto& c > projected Done http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/kudu-tool-test.cc@438 PS7, Line 438: } : } : i > Why is this wrapped in this condition? I don't see any place in the test se There are default values for `FLAGS_show_value`, for `table scan` it's `true`, for `perf scan_table` it's `false`. http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/kudu-tool-test.cc@1809 PS7, Line 1809: NO_FATALS(RunLoadgen(1, { "--keep_auto_table=true" }, kTableName)); > Wrap in NO_FATALS. Done http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/table_scanner.h File src/kudu/tools/table_scanner.h: http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/table_scanner.h@42 PS7, Line 42: namespace kudu { : namespace tools { : class TableScanner { : public: : TableScanner(client::sp::shared_ptr<kudu::client::KuduClient> client, std::string table_name): : total_count_(0), : client_(std::move(client)), : > Nit: stylistically we've actually been trending in the opposite direction: Done http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/table_scanner.h@51 PS7, Line 51: } > Nit: public/private should be indented by an additional character. Done http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/table_scanner.h@59 PS7, Line 59: > Should doc what happens if this is called vs. not called. Should also doc t Done http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/table_scanner.h@64 PS7, Line 64: } : : pr > This doesn't appear to be used; could you remove it until such a time as so It is used in src/kudu/tools/tool_action_perf.cc, L584 http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/table_scanner.h@72 PS7, Line 72: client::sp::shared_ptr<kudu::client::KuduClient> client_; > Nit: don't need this private; there's one on L68. Done http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/table_scanner.h@79 PS7, Line 79: }; > Now it protects output to out_, right? Done http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/table_scanner.cc File src/kudu/tools/table_scanner.cc: http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/table_scanner.cc@90 PS7, Line 90: "Predicates can be combined together with predicate operators using the syntax\n" > Kind of wish we'd rename this to "show_values" or even "show_row_data" to b Done http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/tool_action_common.h File src/kudu/tools/tool_action_common.h: http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/tool_action_common.h@147 PS7, Line 147: // Creates a Kudu client connected to the cluster whose master addresses are defined by > Nit: reword as "Creates a Kudu client connected to the cluster whose master Done http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/tool_action_perf.cc File src/kudu/tools/tool_action_perf.cc: http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/tool_action_perf.cc@575 PS7, Line 575: // It's necessary to have read-what-you-write behavior here. Since > I don't understand this. The tool isn't doing any writing and this KuduClie TableScanner instance doesn't write anything itself, but the KuduClient instance is reused here which has written some data in previous steps. http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/tool_action_perf.cc@575 PS7, Line 575: // It's necessary to have read-what-you-write behavior here. Since : // tablet leader can change and there might be replica propagation delays, : // set the snapshot to the latest observed one to get correct row count. : // NOTE: +1 is due to the current implementation of the scanner, read at : // snapshot 'N + 1' will wait for all transactions whose snapshot is lower than : // 'N + 1' to complete. > Part (or maybe all?) of this comment now belongs within TableScanner.Run(). Of course, 'Due to KUDU-1656...' should be removed now since KUDU-1656 has been resolved. http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/tool_action_perf.cc@581 PS7, Line 581: scanner.SetSnapshot(client->GetLatestObservedTimestamp() + 1); > Can you be more specific? Done http://gerrit.cloudera.org:8080/#/c/12516/7/src/kudu/tools/tool_action_perf.cc@758 PS7, Line 758: "This can be useful to check for row count skew across different tablets, " : "or whether there is a long latency tail when scanning different tables.") : .AddRequiredParameter({ kMasterAddressesArg, kMasterAddressesArgDesc }) > Nit: reword as "Show row count and scanning time of tablets in a table. Thi Done -- To view, visit http://gerrit.cloudera.org:8080/12516 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibebed05d9a91ae3f126d3cf2d92f66852b253edd Gerrit-Change-Number: 12516 Gerrit-PatchSet: 8 Gerrit-Owner: Yingchun Lai <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Reviewer: Will Berkeley <[email protected]> Gerrit-Reviewer: Yingchun Lai <[email protected]> Gerrit-Comment-Date: Mon, 18 Mar 2019 14:23:04 +0000 Gerrit-HasComments: Yes
