Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )
Change subject: Consider the available space when selecting data dirs for blocks. ...................................................................... Patch Set 7: (9 comments) I'll look into the test failure some more. If I check out this patch, I see it failing a small percentage (2-6%) of the time. http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.h File src/kudu/fs/data_dirs.h: http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.h@207 PS7, Line 207: available_bytes_ nit: wrap this in ''s like the other variable names. http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.h@212 PS7, Line 212: , updated by RefreshIsFull. nit: remove this, or update it to RefreshAvailableSpace http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.h@338 PS7, Line 338: and give preference to which with more free space : // from the specified option's data dir group. nit: Could you rephrase a bit? "and gives preference to those with more free space in the specified option's data dir group." http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.h@471 PS7, Line 471: as the number of unique tablets in the directory. : // The resulting behavior fills directories that have fewer tablets stored on : // them while not completely neglecting those with more tablets. nit: update this? http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.cc File src/kudu/fs/data_dirs.cc: http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.cc@90 PS7, Line 90: " : "During this time, will not really check disk space when using EXPIRED_ONLY." nit: I would remove this sentence. These descriptions usually try to be user-understandable, and EXPIRED_ONLY isn't really user-facing. http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.cc@110 PS7, Line 110: : : nit: can you remove this extraneous newline? http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.cc@264 PS7, Line 264: last_check_is_full_ nit: maybe we should rename this to last_space_check_ or something similar http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.cc@1002 PS7, Line 1002: vector<int> random_indices(candidate_indices.size()); : iota(random_indices.begin(), random_indices.end(), 0); Could we just shuffle candidate_indices directly? And then compare dirs candidate_indices[0] and candidate_indices[1]? http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.cc@1083 PS7, Line 1083: int tablets_in_first = FindOrDie(tablets_by_uuid_idx_map_, candidate_indices[0]).size(); : int tablets_in_second = FindOrDie(tablets_by_uuid_idx_map_, candidate_indices[1]).size(); : int selected_index = 0; : if (tablets_in_first == tablets_in_second) { : int64_t space_in_first = FindOrDie(data_dir_by_uuid_idx_, : candidate_indices[0])->available_bytes(); : int64_t space_in_second = FindOrDie(data_dir_by_uuid_idx_, : candidate_indices[1])->available_bytes(); : selected_index = space_in_first > space_in_second ? 0 : 1; : } else { : selected_index = tablets_in_first < tablets_in_second ? 0 : 1; : } : group_indices->push_back(candidate_indices[selected_index]); : candidate_indices.erase(candidate_indices.begin() + selected_index); : } To be clear, this is selecting by the tablet count first, and then breaking ties by space. Is that preferred over selecting by space, and then breaking ties by tablet count? This second scenario isn't as likely in production, but it definitely happens in tests since internal mini clusters share a single disk. -- To view, visit http://gerrit.cloudera.org:8080/13975 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529 Gerrit-Change-Number: 13975 Gerrit-PatchSet: 7 Gerrit-Owner: ZhangYao <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yingchun Lai <[email protected]> Gerrit-Reviewer: ZhangYao <[email protected]> Gerrit-Comment-Date: Thu, 08 Aug 2019 18:20:05 +0000 Gerrit-HasComments: Yes
