Adar Dembo has posted comments on this change. Change subject: KUDU-1952 Remove round-robin for block placement ......................................................................
Patch Set 14: (27 comments) OK, I reviewed everything. It'd be nice to see a directed unit test for DataDirManager (time for data_dirs-test.cc?). Some of what I saw in block_manager-test (such as the lifecycle functions for data dir groups) would be better served there, as that stuff is really independent of the block manager. And I'd especially like to see directed testing for the "select one of two" functionality. http://gerrit.cloudera.org:8080/#/c/6636/14/src/kudu/fs/block_manager-stress-test.cc File src/kudu/fs/block_manager-stress-test.cc: PS14, Line 433: "test_tablet" Define this once in the test fixture and use it in both places. http://gerrit.cloudera.org:8080/#/c/6636/14/src/kudu/fs/block_manager-test.cc File src/kudu/fs/block_manager-test.cc: Line 118: CHECK_OK(bm_->dd_manager()->LoadDataDirGroupFromPB("test_tablet", test_group_pb_)); Why not RETURN_NOT_OK here too? Line 126: void RunBlockDistributionTest(const vector<string>& paths); Maybe add some coverage for DeleteDataDirGroup too? Line 141: ASSERT_STR_CONTAINS(s.ToString(), "DataDirGroup not found for tablet"); Create a separate test for this; don't overload setUp() with tests. Line 155: int count_files(Env* env, const string& path, const string& instance_name) { Should be CountFiles(). Also, perhaps make it a member of BlockManagerTest so you don't have to pass in 'env'? PS14, Line 171: instance_name Isn't this always "block_manager_instance"? If so, can you hardcode it? Ideally you'd make the existing definition of kInstanceMetadataFileName from data_dirs.cc public in data_dirs.h and use it here. PS14, Line 190: CHECK_OK ASSERT_OK PS14, Line 228: CHECK_OK ASSERT_OK Line 284: CHECK_OK(bm_->dd_manager()->CreateDataDirGroup("multipath_test")); ASSERT_OK PS14, Line 313: CHECK_OK ASSERT_OK PS14, Line 385: CreateBlockOptions({ "test_tablet" }) Maybe you can define this once as a BlockManagerTest member and refer to it in each test, instead of repeating it over and over? http://gerrit.cloudera.org:8080/#/c/6636/14/src/kudu/fs/data_dirs.cc File src/kudu/fs/data_dirs.cc: PS14, Line 160: A group is represented by a list of indices within the list of : // all UUIDs found in the PathSetPB. Maybe "A group is represented by a list of 2-byte indices, which index into the list of all UUIDs found in PathSetPB." Line 478: tablets_by_uuid_idx_map_[uuid_idx].insert(tablet_id); Should this use InsertOrDie(), to emphasize that there's no reason the set insert should fail? InsertOrDie(&tablets_by_uuid_idx_map_[uuid_idx], tablet_id); Line 528: for (int16_t uuid_index : group->uuid_index_list()) { Should be uint16_t Line 529: tablets_by_uuid_idx_map_[uuid_index].erase(tablet_id); Should we use FindOrDie() to get the set of tablets out, since it'd be an error for that to fail? Line 573: candidate = data_dir_by_uuid_idx_[dir_uuids[iter1]]; FindOrDie(). L580 too. Line 590: tablets_by_uuid_idx_map_[uuid1].size() > tablets_by_uuid_idx_map_[uuid2].size()) { Use FindOrDie() here too. http://gerrit.cloudera.org:8080/#/c/6636/14/src/kudu/fs/data_dirs.h File src/kudu/fs/data_dirs.h: Line 27: #include "kudu/fs/fs.pb.h" Do you still need this? Maybe you can get away with forward-declaring DataDirGroupPB? PS14, Line 162: Adds data directories to a specific tablet's dir group, until the limit : // specified by fs_target_data_dirs_per_tablet, or until there is no more space. Yes, but we should also emphasize that this also creates a dir group for the tablet. Otherwise someone might get confused and think they can call this function to add directories to an existing dir group. PS14, Line 170: tablet Nit: tablet's dir group. PS14, Line 171: tablet will be untracked. Nit: "tablet's dir group will be deleted" (to emphasize the symmetry with DeleteDataDirGroup). PS14, Line 174: tracking Nit: since this method is no longer UntrackTablet, I think it'd be clearer if we shifted away from the "track/untrack a tablet" terminology and towards "create/delete a tablet's dir group" terminology. That applies here and elsewhere in the comments. Line 221: bool GetDirForGroupUnlocked(const std::vector<uint16_t>& group, uint16_t* uuid_idx); Maybe the parameters would be more clear as 'uuid_idxes' and 'new_uuid_idx'? I was a little surprised to see a vector<uint16_t> named 'group'; I thought it meant there was a "typedef vector<uint16_t> DataDirGroup" somewhere that didn't make it out here. http://gerrit.cloudera.org:8080/#/c/6636/14/src/kudu/fs/fs.proto File src/kudu/fs/fs.proto: PS14, Line 117: indexes Nit: indices or indexes? Pick one and apply it consistently here (and wherever else you refer to this concept). http://gerrit.cloudera.org:8080/#/c/6636/14/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: PS14, Line 67: "test_tablet" Define just once. PS14, Line 247: CreateBlockOptions({ "test_tablet" } Maybe define just once and use repeatedly? http://gerrit.cloudera.org:8080/#/c/6636/14/src/kudu/tablet/tablet_metadata.cc File src/kudu/tablet/tablet_metadata.cc: Line 96: fs_manager->dd_manager()->DeleteDataDirGroup(tablet_id); So a convention we like to use in these situations is to wrap the cleanup that needs to happen on failure in an RAII object. We use MakeScopedCleanup for that (see util/scoped_cleanup.h). See MessengerBuilder::Build() for an example. -- To view, visit http://gerrit.cloudera.org:8080/6636 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b Gerrit-PatchSet: 14 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew Wong <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Jean-Daniel Cryans <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
