Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14897 )

Change subject: [master] KUDU-3016 don't lump together updates on all tablet 
reports
......................................................................


Patch Set 6:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/common/row_operations.cc
File src/kudu/common/row_operations.cc:

http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/common/row_operations.cc@105
PS6, Line 105: bool RowOperationsPBEncoder::empty() const {
             :   return pb_->mutable_rows()->empty();
             : }
Can be inlined in the header.


http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/common/row_operations.cc@192
PS6, Line 192: rows_size_delta
Nit: 'row_size_delta'?


http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/common/row_operations.cc@214
PS6, Line 214:   if (isset_bitmap_size) {
             :     *isset_bitmap_size = isset_size;
             :   }
             :   if (isnull_bitmap_size) {
             :     *isnull_bitmap_size = isnull_size;
             :   }
Do we need these to be nullable or can we assume (and DCHECK) that they're 
always provided?


http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/integration-tests/master-stress-test.cc
File src/kudu/integration-tests/master-stress-test.cc:

http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/integration-tests/master-stress-test.cc@596
PS6, Line 596: Kudu master
Nit: can elide this.


http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/integration-tests/master-stress-test.cc@599
PS6, Line 599: result
resulting


http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/integration-tests/master-stress-test.cc@604
PS6, Line 604: class MasterRpcSizeLimitStressTest : public KuduTest {
Would the ExternalMiniClusterITestBase fixture allow you to shorten Prepare()?


http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/integration-tests/master-stress-test.cc@611
PS6, Line 611:   void TearDown() override {
             :     if (cluster_) {
             :       cluster_->Shutdown();
             :     }
             :     client_.reset();
             :     KuduTest::TearDown();
             :   }
Doesn't all this happen naturally?


http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/integration-tests/master-stress-test.cc@624
PS6, Line 624: buidls
builds


http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/integration-tests/master-stress-test.cc@635
PS6, Line 635:       // Set custom
Finish writing this?


http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/integration-tests/master-stress-test.cc@682
PS6, Line 682:         .wait(true)
This is the default and can be omitted.


http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/integration-tests/master-stress-test.cc@697
PS6, Line 697: TEST_F(MasterRpcSizeLimitStressTest, TabletReports) {
I don't think this test makes sense here. Stress tests (i.e. *-stress-test) are 
intended to be black box tests that run for a fixed amount of time (usually 
pretty long) and place stress on various components; here you're testing 
correctness and a particular bug fix.


http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/integration-tests/master-stress-test.cc@709
PS6, Line 709: re-electing
re-elect


http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/integration-tests/master-stress-test.cc@712
PS6, Line 712:   for (auto idx = 0; idx < kNumTabletServers; ++idx) {
Could we do one loop to pause all tservers, then sleep, then another loop to 
resume all of them? That'd be net less sleeping so the test would run faster.


http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/master/catalog_manager.cc@4305
PS6, Line 4305: transitional
transient


http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/master/catalog_manager.cc@4305
PS6, Line 4305: next successfully processed
              :     // tablet report from the tablet server will fix the 
partial update.
This raises an interesting question: if chunk 2 of 2 fails and the full tablet 
report is retried, will chunk 1 of 2 no-op the second time around? Or will it 
be rewritten?


http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/master/sys_catalog.cc
File src/kudu/master/sys_catalog.cc:

http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/master/sys_catalog.cc@838
PS6, Line 838:   size_t req_size = req->ByteSizeLong();
Could we just recompute this after adding each tablet instead of all that 
estimation logic? It's probably more expensive but it's a lot simpler, and DDLs 
aren't really a hot path. Unless you think N calls to ByteSizeLong() on a huge 
tablet report is too expensive...

I also wonder whether we can use the previous row's size as a proxy for the 
next row's size, thus avoiding the recomputation of both Estimate() and Add(). 
How different can they be?


http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/master/sys_catalog.cc@850
PS6, Line 850:         do_encode = false;
Could we update excess_tablets and break here? Then maybe avoid needing 
do_encode altogether?


http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/master/sys_catalog.cc@942
PS6, Line 942: decltype(tablets_info)
This seems unnecessarily magical; why not just write out 
'vector<scoped_refptr<TabletInfo>>'?


http://gerrit.cloudera.org:8080/#/c/14897/6/src/kudu/master/sys_catalog.cc@949
PS6, Line 949: should
Nit: will ?



--
To view, visit http://gerrit.cloudera.org:8080/14897
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I83e8ca4bc8db7cab8fee6b4a40f48adc8752e7c5
Gerrit-Change-Number: 14897
Gerrit-PatchSet: 6
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 10 Jan 2020 10:29:36 +0000
Gerrit-HasComments: Yes

Reply via email to