Andrew Wong has posted comments on this change.

Change subject: KUDU-1952 Remove round-robin for block placement
......................................................................


Patch Set 18:

(33 comments)

Failure from LINT and a memory leak in ASAN data_dirs-test dist-test (looking 
into it).

http://gerrit.cloudera.org:8080/#/c/6636/17//COMMIT_MSG
Commit Message:

PS17, Line 9: mitigate the
            : single-disk failure
> nit: we aren't mitigating the failure, but rather mitigating the effects of
Done


PS17, Line 27:  When loading
             : tablet data from a previous version of Kudu, the tablet's 
superblock
             : will not have a DataDirGroup. One will be generated containing 
all
             : data directories, as the tablet's data may already be spread 
across
             : any number of disks.
> what happens if we add a table on the new version, then downgrade, run for 
Won't the superblock be rewritten with the old version of Kudu when new blocks 
are written? I thought the superblock was always updated as new things get 
flushed.
If that's the case, shouldn't the metadata be replaced with a group-less 
version, and upon upgrading, we should be create a group with all dirs.


http://gerrit.cloudera.org:8080/#/c/6636/17/src/kudu/fs/block_manager-test.cc
File src/kudu/fs/block_manager-test.cc:

Line 133:   Status CountFilesCb(int* num_files, Env::FileType type,
> would using Env::Walk() make this easier?
Done


Line 428:   size_t size1 = 5;
> Before asserting on the status' string, assert on its type (i.e. ASSERT_TRU
Done


http://gerrit.cloudera.org:8080/#/c/6636/18/src/kudu/fs/block_manager-test.cc
File src/kudu/fs/block_manager-test.cc:

PS18, Line 154: t CountFiles_OLD(const string& path) {
              :     vector<string> child_paths;
              :     Status s = env_->GetChildren(path, &child_paths);
              :     if (!s.ok()) {
              :       return 0;
              :     }
              :     int count = 0;
              :     for (const string& child_path : child_paths) {
              :       if (child_path == "." || child_path == ".." || child_path 
== kInstanceMetadataFileName) {
              :         continue;
              :       }
              :       string full_child_path = JoinPathSegments(path, 
child_path);
              :       bool is_dir;
              :       s = env_->IsDirectory(full_child_path, &is_dir);
              :       if (is_dir) {
              :         // Count the files in the child directories.
              :         count += CountFiles_OLD(full_child_path);
              :       } else {
              :         // Increment if the child is a file.
              :         count++;
              :       }
              :     }
              :     return count;
              :   }
Removed.


http://gerrit.cloudera.org:8080/#/c/6636/17/src/kudu/fs/data_dirs-test.cc
File src/kudu/fs/data_dirs-test.cc:

PS17, Line 56:   }
             : 
             :  pr
> You can omit this since it doesn't do anything beyond what the base class d
Done


PS17, Line 79:   Status s = dd_manager_->GetNextDataDir(non_existent_opts, 
nullptr);
             :   ASSERT_STR_CONTAIN
> Would it be possible to instantiate a DataDirManager and forgo the block ma
The main reason I added a blockmanager since setting up the env seemed to be 
nicely handled.


Line 96:   }
> Test the Status programmatically before testing its string representation.
Done


PS17, Line 115: TEST_F(DataDirGroupTest, TestDeleteDataDirGroup) {
              :   ASSERT_OK(dd_manager_->CreateDataDirGroup(test_
> Why is it necessary to set these? Won't GetNextDataDir() return Status::OK(
Left in from when I was experimenting with the flags. Removed


Line 138:   // Add 20 tablets, each with size 3.
> Maybe do this once in the test constructor, and do it again in each test th
Done


http://gerrit.cloudera.org:8080/#/c/6636/17/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

PS17, Line 62: riped ac
> maybe even experimental, considering this isn't quite usable yet?
Done


Line 188: 
> ??
Good question. Removed.


PS17, Line 254:   if (metric_entity) {
              :     metrics_.reset(new Data
> I don't think it makes sense to modify process-level state deep in an objec
RNG is used for creating groups and selecting directories within groups. I 
added this here so my dist-test runs would differ between runs. Without it, the 
groups would always be the same.

I've changed it to use GetRandomSeed32 and random_shuffle, which afaik don't 
modify state.


Line 394:     InsertOrDie(&uuid_idx_by_dd, dd.get(), idx);
> Don't need std:: prefixes here.
Done


PS17, Line 410: 
> i think this and the one below could be a bit more specific and refer to da
Good call, done. Also removed "tracking" vernacular, since it's no longer used 
in function names.


PS17, Line 422: 
              : 
              :   // Adjust the disk group size to fit within the total number 
of data dirs.
              :   i
> use std::min?
Done, but a little ugly due to static casting.


Line 460:     // natural tablet_id, select a data dir randomly.
> maybe DCHECK(IsGTest()) from test_util_prod.h?
Done


Line 479:     DataDir* candidate = FindOrDie(data_dir_by_uuid_idx_, uuid_idx);
> FindOrDie() here?
Done


PS17, Line 518: _uuid_idx) {
> Instead of the separate bool, could you wrap new_uuid_idx in a boost::optio
Done


Line 535:     }
> why not also just skip adding the full ones here? might make the lower part
Hmm, mainly wanted to avoid calling RefreshIsFull() on all dirs if possible, 
since it's a disk operation, but from what I've read, it doesn't seem to be a 
very expensive call.

Good point that below would be simpler. Done.


Line 548:     } else if (FindOrDie(tablets_by_uuid_idx_map_, 
candidate_indices[0]).size() <
> Why do we need to refresh?
If we don't refresh, there is a possibility that the group can be assigned a 
directory that's full. If there's a mix of full and non-full disks and we could 
happen to assign all full disks to the tablet.

As an example, DataDirGroupTest.TestFullDisk, which expects groups to not be 
created given full disks, will fail and create a group even when there is no 
space if this is not ALWAYS.

My worry is that if we leave fullness checking to when we create blocks, we may 
end up creating a group that has no space when there actually isn't any on the 
selected disks.


http://gerrit.cloudera.org:8080/#/c/6636/18/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

Line 545:   } else  {
> warning: don't use else after return [readability-else-after-return]
Done


http://gerrit.cloudera.org:8080/#/c/6636/17/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

Line 51: 
> I don't understand why you had to move DataDirGroup back to the header. The
No, forward declaration usually doesn't work for STL containers since I'm not 
using pointers.


PS17, Line 53:  list of 
> maybe missed something somewhere in the design, but when we get to removing
At least while the server is alive and the disk is still there, we do need to 
keep all the dead UUIDs so the indices within the PathSetPB are still valid.

I think the only time we can remove the UUID is when the dead disk is 
physically removed and we're guaranteed nothing will be put on it. Will think 
about this more and update the doc.


PS17, Line 66: 
> std::move
Done


Line 70:   void CopyToPB(DataDirGroupPB* pb) const {
> can just DCHECK(pb);
Done


Line 73:     for (uint16_t uuid_idx : uuid_indices_) {
> I think you can write this as group->add_uuid_indices(uuid_idx)
Done


Line 75:     }
> nit: pb->Swap(group); to avoid an extra allocation
Good call.


PS17, Line 251:  is an optional output denoting which uuid_idx should be
              :   // added to 'group_indices'. If there are no available
> why not return a bad status instead of a second out-param?
The only error that can be returned atm is an env IOError during dir fullness 
refresh. Being full isn't necessarily an error; it just means the group is at 
capacity, and that it should stop adding dirs.

Going with Adar's proposal with boost::optional


Line 259:   // call will reflect the same initial state of directory load.
> warning: function 'kudu::fs::DataDirManager::GetDirForGroupUnlocked' has a 
Done


http://gerrit.cloudera.org:8080/#/c/6636/17/src/kudu/fs/file_block_manager.h
File src/kudu/fs/file_block_manager.h:

PS17, Line 67: CreateBlockOptions
> typo?
? Not sure I see the typo. The caller of CreateBlock() provides a 
CreateBlockOptions indicating the tablet the block belongs to (and more in the 
future, eg disk class).


http://gerrit.cloudera.org:8080/#/c/6636/17/src/kudu/fs/fs.proto
File src/kudu/fs/fs.proto:

Line 116: // This is stored in each TabletSuperBlockPB.
> can you note where this ends up stored?
Noted that it's stored in the TabletSuperBlockPB.


http://gerrit.cloudera.org:8080/#/c/6636/17/src/kudu/tablet/delta_compaction.cc
File src/kudu/tablet/delta_compaction.cc:

Line 65:     string tablet_id)
> nit: pass by value and then std::move below
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6636
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
Gerrit-PatchSet: 18
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <davidral...@gmail.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jdcry...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-HasComments: Yes

Reply via email to