Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )
Change subject: Consider the available space when selecting data dirs for blocks. ...................................................................... Patch Set 10: (9 comments) http://gerrit.cloudera.org:8080/#/c/13975/10//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13975/10//COMMIT_MSG@7 PS10, Line 7: Consider the available space when selecting data dirs for blocks. Could you tag KUDU-2901 here? http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.h File src/kudu/fs/data_dirs.h: http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.h@338 PS10, Line 338: // Returns a random directory and gives preference to those with more free Nit: "...directory, giving preference to..." http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.h@472 PS10, Line 472: // If those two directories have same load, will prefer the one with more : // free space. Nit: Rephrase as "Ties are broken by choosing the directory with more free space." http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc File src/kudu/fs/data_dirs.cc: http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc@90 PS10, Line 90: "Number of seconds we cache the disk available space in the block manager. "); Nit: "Number of seconds we cache available disk space in the block manager." http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc@269 PS10, Line 269: int64_t available_bytes; Nit: keeping in with "is_full_" and "is_full_new", perhaps name this "available_bytes_new"? http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc@984 PS10, Line 984: vector<int> candidate_indices; Wouldn't this be simpler as a vector<DataDir*>? Seems like we could avoid a few FindOrDie() calls that way. http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc@999 PS10, Line 999: shuffle(candidate_indices.begin(), candidate_indices.end(), default_random_engine(rng_.Next())); This can be moved into the if block to avoid a no-op call when candidate_indices is empty. http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/util/env_util.h File src/kudu/util/env_util.h: http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/util/env_util.h@56 PS10, Line 56: // If available_bytes is set, will get the free bytes of the disk space. Nit: "If 'available_bytes' is not null, it will contain the amount of free disk space (in bytes) in 'path' when the function finishes". http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/util/env_util.cc File src/kudu/util/env_util.cc: http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/util/env_util.cc@159 PS10, Line 159: if (available_bytes != nullptr) { : *available_bytes = free_bytes; : } Don't we want this to happen even if we return an ENOSPC IOError on L154? Otherwise won't 'available_bytes' (in data_dirs.cc) contain a garbage value? Curious why this wasn't caught in a unit test; could you make sure we cover this case? -- To view, visit http://gerrit.cloudera.org:8080/13975 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529 Gerrit-Change-Number: 13975 Gerrit-PatchSet: 10 Gerrit-Owner: ZhangYao <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yingchun Lai <[email protected]> Gerrit-Reviewer: ZhangYao <[email protected]> Gerrit-Comment-Date: Tue, 13 Aug 2019 06:18:05 +0000 Gerrit-HasComments: Yes
