Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/11869 )
Change subject: [compaction] KUDU-1400: Improve rowset compaction policy to consider merging small DRSs ...................................................................... Patch Set 4: (14 comments) http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/compaction_policy-test.cc File src/kudu/tablet/compaction_policy-test.cc: http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/compaction_policy-test.cc@321 PS4, Line 321: if doing so : // doesn't actually reduce the number of rowsets > Does it make sense to add a test to see how the compaction algorithm behave An original test, TestBudgetedSelection, tests the case of a small number of overlapping rowsets. I don't think having many rowsets overlapping tests the code in any new way beyond having 3. http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/compaction_policy-test.cc@336 PS4, Line 336: [0 - 1] > nit: it seems this one is extra -- the index starts with 1. Instead, it se Done http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/compaction_policy-test.cc@345 PS4, Line 345: ASSERT_EQ(quality, 0.0); > nit: the expected values comes first for XXX_EQ() test macros Done. Here and elsewhere. http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/compaction_policy-test.cc@344 PS4, Line 344: ASSERT_TRUE(picked.empty()); : ASSERT_EQ(quality, 0.0); > nit: think it's worth using EXPECT* here so we can run over all rowsets, ev No, because the expectation for the loops is that if it fails one one iteration, it fails for all subsequent ones, so the extra EXPECT_* failures would just be noise. http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/compaction_policy-test.cc@381 PS4, Line 381: if (divisor > 2) { : ASSERT_EQ(rowsets.size(), picked.size()); > I think that isn't what we want. As the number of rowsets increases, the (N I think Alexey is asking about something else. http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/compaction_policy-test.cc@381 PS4, Line 381: if (divisor > 2) { : ASSERT_EQ(rowsets.size(), picked.size()); > Should it also hold if the divisor is 3 in case of 3 rowsets? Or it doesn' You're asking if 3 adjacent rowsets of size 32/3 MB each should be compacted together. That's a decision about how important we think small rowset compaction is. Maybe they should be, maybe not, but the various choices I'm enforcing in this test have already exhausted most of my ability to tune the weights and coefficients, and think they represent reasonable choices, so whatever actually does happen is fine with me. http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/compaction_policy-test.cc@391 PS4, Line 391: FLAGS_budgeted_compaction_target_rowset_size > nit: does it make sense to use 'target_size_bytes' as well? Done http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/compaction_policy.cc File src/kudu/tablet/compaction_policy.cc: http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/compaction_policy.cc@126 PS4, Line 126: const RowSetInfo* > nit: not related to this particular patch, but if item_type is typedef'ed, Done http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/compaction_policy.cc@268 PS4, Line 268: double MaybePenalizeWideSolution(double sum_width, double union_width) { : return sum_width <= union_width ? sum_width - union_width : : sum_width - kSupportAdjust * union_width; : } > This is worth a comment. My attempt: Done http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/rowset_info.h File src/kudu/tablet/rowset_info.h: http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/rowset_info.h@87 PS4, Line 87: by this RowSet > nit: not your fault, but mind tweaking this to be "is spanned by the min/ma Done http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/rowset_info.cc File src/kudu/tablet/rowset_info.cc: http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/rowset_info.cc@65 PS4, Line 65: compaction_small_rowset_tradeoff > +1, same above? Done http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/rowset_info.cc@65 PS4, Line 65: compaction_small_rowset_tradeoff > nit: is it worth marking this flag with the 'runtime' tag? Done http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/rowset_info.cc@90 PS4, Line 90: --compaction_small_rowset_tradeoff=$0 must be less than " : "--compaction_minimum_improvement=$1 in order to prevent pointless " : "compactions; if you know what you are doing, pass " : "--compaction_force_small_rowset_tradeoff to permit this", > nit: consistent -flag_dashes with the above flag? Done http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/rowset_info.cc@230 PS4, Line 230: const auto size_score = : 1 - static_cast<double>(size_bytes) / target_size_bytes; > I understand the sentiment behind wanting to just point to a design doc for Done -- To view, visit http://gerrit.cloudera.org:8080/11869 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7b421c6ed77d28ebab9b91a4d6fcb1e825997e6c Gerrit-Change-Number: 11869 Gerrit-PatchSet: 4 Gerrit-Owner: Will Berkeley <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Reviewer: Will Berkeley <[email protected]> Gerrit-Comment-Date: Fri, 16 Nov 2018 19:06:08 +0000 Gerrit-HasComments: Yes
