Andrew Wong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10340 )
Change subject: KUDU-2359: allow startup with missing data dirs
......................................................................

KUDU-2359: allow startup with missing data dirs

Context
-------
As a part of previous disk failure work, Kudu currently supports opening
the FS layout in the face of EIO, EROFS, etc. POSIX codes when reading
FS and block manager instance files. The directory manager, in such
cases, labels the data directory in which the bad file resides as failed
in memory. The check that enforces consistency between path instance
metadata files (PIMFs) accounts for such failed directories.

Separately, the introduction of the `kudu fs update_dirs` tool expanded
the logic used to open the FS layout to serve two additional purposes
when running the tool:
- To open the FS layout as the user specified it on a best-effort basis,
  ignoring any inconsistencies across PIMFs. This mode allows Kudu to
  stage the requested FS layout and test whether existing tablets would
  break under the given layout.
- To actually update the FS layout on disk to match the user input.
  Walking down the FS-layout-opening path in this mode, Kudu will create
  any missing files or directories it encounters along the way. In this
  mode, the PIMF consistency check is performed after updating the
  appropriate instance files.

The above behavior is currently encapsulated in FsManager::Open() and
the consequent call to DataDirManager::Open().

As mentioned in the JIRA, and not accounted for by this previous work, a
disk failure can present itself as a failure to read directory entries
(POSIX code ENOENT), leading to NotFound errors. Therein lies some
conflict in the above logic:
- when opening the FS layout normally (i.e. not running the update tool)
  and a NotFound error is encountered (e.g. the user asks for
  --fs_data_dirs=/a,/b but only /a can be found), it would be desirable
  to treat the affected directory the way we currently treat failed
  directories
- however, when updating directories, NotFound statuses are still meant
  to indicate missing directories to be created.

The questions then become, "How should we treat ENOENT/NotFound errors?
Should they be treated differently from disk failures? If so, how?"

There are a couple of options to consider:
1. Lump ENOENT in with the list of POSIX codes that signify a disk
   error. This change would mean that semantically, a failed directory
   is no different from a missing one, and all of the above-mentioned
   codepaths must be updated to account for this. Additionally, this
   would affect all other codepaths that may yield ENOENT -- codepaths
   that previously triggered disk failure handling would also trigger
   that handling for ENOENT errors.
2. Treat NotFound errors as a special case during update/open only.
   When reading FS or block manager instance files, extend disk error
   checking to also check for NotFound errors, and treat the errors
   similarly, creating in-memory "unhealthy" instances. To open the FS
   normally, these unhealthy instances will simply correspond to failed
   directories. To update the FS layout, we can leverage the fact that
   the FS update tool doesn't support running with any failed
   directories: attempting to create all directories and files for _all_
   unhealthy instances (both missing and failed) will correctly yield
   success in the face of only missing directories and failure in the
   presence of any failed directory.

This patch implements Option 2 instead of Option 1. ENOENT is still a
file-specific code, and so lumping it in to trigger DataDir-wide error
handling, even if it may be indicative of disk-wide failure, seems
inappropriate. It could also be argued that if NotFound errors are
discovered at server runtime (e.g. because some malicious user is
deleting files left and right), maybe the best course of action wouldn't
be to treat the directory as failed, but to crash Kudu altogether.

Notes on the new stuff
----------------------
The newly supported scenario is different from starting up Kudu with
extra or missing entries in `fs_data_dirs`, which is still not supported
unless running the update tool. Examples:
- If an existing server were configured with --fs_data_dirs=/a,/b,/c,
  and it were restarted such that only /a,/b existed on disk, Kudu would
  start up and list /a,/b,/c, and note that /c is failed.
- If the above server were restarted with --fs_data_dirs=/a,/b, even if
  only /a,/b existed on disk, Kudu would fail to start up until running
  `kudu fs update_dirs <other flags> --fs_data_dirs=/a,/b`

Some changes in this patch include:
- methods involved in loading PIMFs now treat missing instance files as
  "unhealthy", the same way they treat files that fail due to disk
  errors
- DataDirManager::LoadInstances() has been updated to treat missing
  PIMFs as "unhealthy", the same way it treats PIMFs that yield disk
  errors, returning the successfully loaded, "unhealthy" instances
  - a side effect of this is that all user-specified data dirs will have
    in-memory DataDirs created for them, even if they don't exist on
    disk
- various codepaths that previously ended FsManager::Open() with an
  IOError/Corruption because all drives were failed will now return
  NotFound, indicating Kudu should attempt to create a new FS layout.
  This means that a server that has lost all of its data dirs to disk
  failures is semantically equivalent to a brand new server; Kudu will
  attempt to create a new FS layout in these cases
- as a byproduct of the above changes, when opening the FS layout in
  ENFORCE_CONSISTENCY mode with an extra entry in `fs_data_dirs`, Kudu
  will fail later than before, at the integrity check, and yield an
  IOError instead of a NotFound error
- the UUID and UUID index assignment for missing directories has been
  updated when opening the speculative directory manager; see
  DataDirManager::Open() for more details.

Change-Id: I61a71265c3cc34a7b72320149770a814ec7f8351
Reviewed-on: http://gerrit.cloudera.org:8080/10340
Reviewed-by: Adar Dembo <[email protected]>
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon <[email protected]>
---
M src/kudu/fs/block_manager_util.cc
M src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/tools/kudu-tool-test.cc
7 files changed, 268 insertions(+), 143 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified
  Todd Lipcon: Looks good to me, but someone else must approve

--
To view, visit http://gerrit.cloudera.org:8080/10340
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I61a71265c3cc34a7b72320149770a814ec7f8351
Gerrit-Change-Number: 10340
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <[email protected]>
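
Illustrative sketch (not part of the patch): the Option 2 behavior
described in the commit message can be modeled roughly as below. This is
a toy Python model, not Kudu's C++ code. Every name in it (LoadResult,
Instance, load_instances, open_layout, create_dir, update_layout) is
hypothetical and chosen only for illustration; the real logic lives in
FsManager::Open() and DataDirManager::Open(), and PIMF consistency
checks are omitted here.

```python
from dataclasses import dataclass
from enum import Enum


class LoadResult(Enum):
    """Hypothetical outcome of reading a data dir's instance file."""
    OK = "ok"
    NOT_FOUND = "not_found"    # e.g. ENOENT: file or directory is missing
    DISK_ERROR = "disk_error"  # e.g. EIO, EROFS


@dataclass
class Instance:
    path: str
    healthy: bool


def load_instances(dir_states):
    """Create an in-memory instance for every requested dir.

    Missing and failed dirs are both marked unhealthy, so every
    user-specified dir gets an instance even if it is not on disk.
    """
    return [Instance(p, s is LoadResult.OK) for p, s in dir_states.items()]


def open_layout(dir_states):
    """Normal open: unhealthy instances are reported as failed dirs.

    If no dir is healthy, signal NotFound so that the caller may try to
    create a brand new FS layout.
    """
    instances = load_instances(dir_states)
    if not any(i.healthy for i in instances):
        raise FileNotFoundError("no healthy data dirs found")
    return [i.path for i in instances if not i.healthy]


def create_dir(dir_states, path):
    """Toy stand-in for creating a dir and its instance file on disk."""
    if dir_states[path] is LoadResult.DISK_ERROR:
        raise OSError(f"cannot create files in failed directory {path}")
    dir_states[path] = LoadResult.OK


def update_layout(dir_states):
    """Update mode: try to create every unhealthy instance.

    This succeeds when dirs are merely missing and fails as soon as a
    truly failed dir is encountered.
    """
    for inst in load_instances(dir_states):
        if not inst.healthy:
            create_dir(dir_states, inst.path)
```

In this model, opening {/a: OK, /b: OK, /c: NOT_FOUND} reports /c as
failed, while running the update over the same state creates /c. An
update over a state containing a DISK_ERROR directory raises instead.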
