Todd Lipcon has submitted this change and it was merged.

Change subject: disk failure: make DataDirManager failure-aware
......................................................................
disk failure: make DataDirManager failure-aware

The DataDirManager must record what directories are unhealthy in order
to avoid placing new data on failed disks. This patch achieves this by
maintaining a set of UUID indices in the DataDirManager that correspond
to failed directories. Additionally, a count of the number of known
failed directories is maintained as a metric.

Tests are added to data_dirs-test to ensure that failed directories are
not used and are not returned as part of newly created DataDirGroups. If
no healthy directories exist, callers will return an IOError with posix
code ENODEV.

Change-Id: Iee212793152de5de5198751d649ab34fb97f6aa2
Reviewed-on: http://gerrit.cloudera.org:8080/7028
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon <[email protected]>
---
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
4 files changed, 221 insertions(+), 41 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/7028
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Iee212793152de5de5198751d649ab34fb97f6aa2
Gerrit-PatchSet: 16
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
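
For readers without the diff at hand, the bookkeeping described in the commit message (a set of failed UUID indices, a failed-directory count, and an ENODEV error when nothing healthy remains) might look roughly like the sketch below. This is not the code from the patch: the names FailureAwareDirs, SketchStatus, MarkFailed, and PickHealthyDir are hypothetical, the status type is a simplified stand-in rather than Kudu's Status class, and the "pick the first healthy directory" policy is an assumption made only to keep the example short.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a status object; not Kudu's Status class.
struct SketchStatus {
  bool ok;
  int posix_code;  // 0 on success, an errno value on failure
  std::string message;
};

// Hypothetical, simplified model of a failure-aware directory manager.
class FailureAwareDirs {
 public:
  explicit FailureAwareDirs(std::vector<std::string> dirs)
      : dirs_(std::move(dirs)) {}

  // Records the directory at 'uuid_idx' as failed. Marking the same
  // directory twice does not bump the counter a second time.
  void MarkFailed(uint16_t uuid_idx) {
    assert(uuid_idx < dirs_.size());
    if (failed_.insert(uuid_idx).second) {
      ++failed_count_;
    }
  }

  bool IsFailed(uint16_t uuid_idx) const {
    return failed_.count(uuid_idx) > 0;
  }

  // Writes the index of the first healthy directory to '*uuid_idx'.
  // If every directory has failed, returns an error carrying ENODEV.
  SketchStatus PickHealthyDir(uint16_t* uuid_idx) const {
    for (std::size_t i = 0; i < dirs_.size(); ++i) {
      const uint16_t idx = static_cast<uint16_t>(i);
      if (!IsFailed(idx)) {
        *uuid_idx = idx;
        return {true, 0, ""};
      }
    }
    return {false, ENODEV, "no healthy data directories available"};
  }

  // Stand-in for the "known failed directories" metric.
  int64_t failed_count() const { return failed_count_; }

 private:
  std::vector<std::string> dirs_;
  std::set<uint16_t> failed_;  // UUID indices of failed directories
  int64_t failed_count_ = 0;
};
```

Keeping the failed indices in a set makes MarkFailed idempotent, so the counter reflects distinct failed directories even if the same disk error is reported more than once.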
