Todd Lipcon has posted comments on this change.

Change subject: KUDU-1952 Remove round-robin for block placement
......................................................................


Patch Set 35:

(11 comments)

only about 50% through, but figured I'd post a couple comments I have so far

http://gerrit.cloudera.org:8080/#/c/6636/17//COMMIT_MSG
Commit Message:

PS17, Line 27:  When loading
             : tablet data from a previous version of Kudu, the tablet's 
metadata
             : will not have a DataDirGroup. One will be generated containing 
all
             : data directories, as the tablet's data may already be spread 
across
             : any number of disks.
> Won't the superblock be rewritten with the old version of Kudu when new blo
yea, I think you're right here. One thing to watch out for in other cases is 
that, if you deserialize a PB which has some fields that are unknown, and then 
serialize it back out, it actually preserves the fields (in proto2). This 
changed with protobuf 3, but they're thinking of adding back that behavior 
again in 3.4.

In the case of TabletMetadata, though, it seems like we deserialize into our 
own structures, and then create a new PB on flush, rather than actually holding 
onto the original PB. So, in a downgrade-and-flush scenario, we will drop those 
fields as you described, and should be fine.


http://gerrit.cloudera.org:8080/#/c/6636/35//COMMIT_MSG
Commit Message:

PS35, Line 35: failure disk-failure
nit: typo


http://gerrit.cloudera.org:8080/#/c/6636/30/src/kudu/fs/block_manager-test.cc
File src/kudu/fs/block_manager-test.cc:

PS30, Line 192: bl
'num_blocks_per_dir'?


http://gerrit.cloudera.org:8080/#/c/6636/17/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

PS17, Line 53: 
> At least while the server is alive and the disk is still there, we do need 
hrm, but even then, don't we need the "placeholder" in the array such that the 
later indexes don't get "shifted forward"?


http://gerrit.cloudera.org:8080/#/c/6636/35/src/kudu/tablet/tablet_bootstrap-test.cc
File src/kudu/tablet/tablet_bootstrap-test.cc:

PS35, Line 207:   // Bootstrapping the tablet while it is copying will attempt 
to create a data
              :   // dir group for the tablet when one already exists.
              :   ASSERT_TRUE(s.IsAlreadyPresent()) << "Expected already 
present: " << s.ToString();
Not sure about this change.

The test is supposed to be simulating the case where a server crashes mid-copy, 
and then restarts. Maybe this isn't realistic anymore (because we would 
roll-forward a COPYING into DELETED at actual startup)? But I dont think 
AlreadyPresent makes much sense to propagate up here.


http://gerrit.cloudera.org:8080/#/c/6636/35/src/kudu/tablet/tablet_metadata.cc
File src/kudu/tablet/tablet_metadata.cc:

PS35, Line 86: RETURN_NOT_OK
can you PREPEND here to give better context? same below


PS35, Line 199:   // Keep a copy of the old data dir group in case of flush 
failure.
              :   DataDirGroupPB pb;
              :   bool old_group_exists = 
fs_manager_->dd_manager()->GetDataDirGroupPBForTablet(tablet_id_, &pb);
              : 
              :   // Remove the tablet's data dir group metadata tracked by the 
DataDirManager.
              :   fs_manager_->dd_manager()->DeleteDataDirGroup(tablet_id_);
              :   if (delete_type == TABLET_DATA_COPYING) {
              :     // If the tablet deletion is a part of a tablet copy, a 
data dir group must
              :     // created so the blocks can be placed for the new tablet.
              :     
RETURN_NOT_OK(fs_manager_->dd_manager()->CreateDataDirGroup(tablet_id_));
              :   }
              : 
              :   auto revert_group_cleanup = MakeScopedCleanup([&]() {
              :     fs_manager_->dd_manager()->DeleteDataDirGroup(tablet_id_);
              :     if (old_group_exists) {
              :       
fs_manager_->dd_manager()->LoadDataDirGroupFromPB(tablet_id_, pb);
              :     }
              :   });
Instead of doing this here, would it work to instead add a new method like 
'ReallocateDataDirGroup' which is called separately by TabletCopy right before 
it starts writing out new blocks? It smells a little funny to me to piggy-back 
this on the deletion instead of on the setting-up of the new tablet metadata.

In other words, when we call TabletCopyClient::Finish(), we're already using 
ReplaceSuperBlock at that point which deletes the data dir group and loads the 
new one from the superblock. So, if the new super block has been allocated a 
new group by TabletCopyClient::Start (eg near where it does clear_rowsets(), 
clear_orphaned_blocks(), etc) then this would be handled somewhat naturally.

That said, if you considered that and went this way, happy to discuss


Line 388:       RETURN_NOT_OK(fs_manager_->dd_manager()->LoadDataDirGroupFromPB(
PREPEND


PS35, Line 393:       // data data 
typo


PS35, Line 398: true)
we try to avoid boolean parameters to change behavior. Even though it seems a 
little silly, would be good to introduce an enum for this so the parameter 
becomes descriptive (like ACROSS_ALL_DISKS or something)


PS35, Line 629: pb.mutable_data_dir_group());
note that this will end up adding an empty data_dir_group field to 'pb' even if 
it didn't add any elements. I'm not sure if that's intended but I'm guessing 
not?


-- 
To view, visit http://gerrit.cloudera.org:8080/6636
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
Gerrit-PatchSet: 35
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Jean-Daniel Cryans <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes

Reply via email to