[ 
https://issues.apache.org/jira/browse/KUDU-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16988571#comment-16988571
 ] 

ASF subversion and git services commented on KUDU-2993:
-------------------------------------------------------

Commit 83f65ffc187638b376a06b7fc3888369764e9333 in kudu's branch 
refs/heads/master from Andrew Wong
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=83f65ff ]

KUDU-2993: don't require update_dirs to fix directory inconsistencies

This patch removes the ENFORCE_CONSISTENCY behavior when opening the
DataDirManager. By default, the FS layout will be opened with the new
UPDATE_AND_IGNORE_FAILURE mode, wherein:
- We update the PIMFs if we notice any are missing or their metadata is
  not consistent with the actual set of directory UUIDs.
- We tolerate failures when creating and updating the PIMFs.

This also maintains the previous UPDATE_ON_DISK behavior as
UPDATE_AND_ERROR_ON_FAILURE, wherein a disk failure during the update
would halt any further updates and revert any metadata changes thus far.
This is only used by the 'update_dirs' tool to maintain existing
behavior.
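
To make the difference between the two modes concrete, here is a minimal
sketch. It is not Kudu's actual code: the enum values mirror the mode names
above, but `Pimf`, `UpdateResult`, `UpdatePimfs`, and the `write` callback are
hypothetical names introduced only for illustration, and the rollback shown is
in-memory only.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins for the modes described above; not Kudu's real API.
enum class UpdateMode { kUpdateAndIgnoreFailure, kUpdateAndErrorOnFailure };

// Simplified view of one path instance metadata file (PIMF).
struct Pimf {
  std::string dir;
  std::vector<std::string> all_uuids;  // UUIDs this PIMF believes exist.
};

struct UpdateResult {
  bool ok;
  int num_updated;
};

// Brings every PIMF in line with 'actual_uuids'. 'write' stands in for
// persisting one PIMF and returns false to simulate a disk failure.
UpdateResult UpdatePimfs(std::vector<Pimf>& pimfs,
                         const std::vector<std::string>& actual_uuids,
                         UpdateMode mode,
                         const std::function<bool(const Pimf&)>& write) {
  const std::vector<Pimf> backup = pimfs;  // Kept for rollback.
  int num_updated = 0;
  for (auto& p : pimfs) {
    if (p.all_uuids == actual_uuids) continue;  // Already consistent.
    Pimf candidate = p;
    candidate.all_uuids = actual_uuids;
    if (write(candidate)) {
      p = candidate;
      ++num_updated;
    } else if (mode == UpdateMode::kUpdateAndErrorOnFailure) {
      // Halt further updates and revert the changes made so far
      // (in memory only in this sketch).
      pimfs = backup;
      return {false, 0};
    }
    // kUpdateAndIgnoreFailure: leave this PIMF stale and keep going.
  }
  return {true, num_updated};
}
```

With a `write` callback that fails for one directory, the first mode returns
success with the remaining PIMFs updated, while the second returns failure
with every PIMF restored to its original contents.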

Since we now rewrite the PIMFs to be consistent by default, the
"integrity check" is now gone. This check was previously useful to
ensure that the 'all_uuids' fields matched for every PIMF, which ensured
that every data directory that was expected to exist actually existed.
This was important for a couple of reasons:
- When a single missing data directory spelled failure for the entire
  node, starting up with even a single "inconsistent" directory would
  break all tablets on the tserver.
- The file block manager requires that the UUID indexes used by the
  DataDirManager are static. These indexes are defined by the ordering
  of the UUIDs in the PIMFs, so we used the integrity check to ensure
  the ordering was consistent across PIMFs.

Now that Kudu tablets can start up with missing directories, the first
reason no longer applies.

The second is trickier. To work around it, I've kept the
essence of the UUID indexing for the file block manager, though I've
made the "integrity checking" virtually non-existent. For the log block
manager, I've made the UUID indexing much simpler: rather than relying
on the integrity check, we'll now always assign a PIMF a UUID, even if
we couldn't read one from disk.
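
One way to picture the "always assign a UUID" idea is sketched below. This is
an illustration under assumptions, not the patch's implementation:
`DirInstance`, `MakePlaceholderUuid`, and `AssignUuidIndexes` are hypothetical
names, an empty string stands in for "no UUID could be read from disk", and
the placeholder generator is a counter rather than a real UUID generator.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical view of one data directory at startup.
struct DirInstance {
  std::string path;
  std::string uuid_from_disk;  // Empty if the PIMF could not be read.
};

// Placeholder generator for illustration; real code would create a true UUID.
std::string MakePlaceholderUuid(int n) {
  return "generated-" + std::to_string(n);
}

// Gives every directory a UUID (read from disk or freshly assigned) and maps
// each UUID to an index based on the directory's position in 'dirs'.
std::map<std::string, int> AssignUuidIndexes(
    const std::vector<DirInstance>& dirs) {
  std::map<std::string, int> idx_by_uuid;
  int next_generated = 0;
  for (const auto& d : dirs) {
    const std::string uuid = d.uuid_from_disk.empty()
                                 ? MakePlaceholderUuid(next_generated++)
                                 : d.uuid_from_disk;
    const int idx = static_cast<int>(idx_by_uuid.size());
    const bool inserted = idx_by_uuid.emplace(uuid, idx).second;
    assert(inserted);  // This sketch assumes UUIDs are unique.
    (void)inserted;
  }
  return idx_by_uuid;
}
```

Because every directory ends up with a UUID, no cross-file consistency check
is needed before indexes can be handed out, which is the simplification the
paragraph above describes.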

Tests:
- Updated a few tests that previously enforced consistency among PIMFs
  to instead check for the correct instance-updating behavior.
- Added a test to check that failures while updating the PIMFs don't
  stop us from opening the FS layout.
- Added a test that checks that adding/removing data directories on a
  tserver affects and fails tablets as expected.
- Added a test to make sure that this doesn't completely break the file
  block manager. Given we don't expect heavy usage of the FBM, I didn't
  do extensive testing when the PIMFs are tampered with.
- Added a test to ensure we don't regress the rollback behavior of
  the 'update_dirs' tool in the face of a disk failure.

Change-Id: Ic3027e7edb5c60e96ced6160fec1a380b38353a5
Reviewed-on: http://gerrit.cloudera.org:8080/14760
Reviewed-by: Alexey Serbin <[email protected]>
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong <[email protected]>


> Allow Kudu to start up with a fresh data directory without running update_dirs
> ------------------------------------------------------------------------------
>
>                 Key: KUDU-2993
>                 URL: https://issues.apache.org/jira/browse/KUDU-2993
>             Project: Kudu
>          Issue Type: Improvement
>          Components: fs
>            Reporter: Andrew Wong
>            Assignee: Andrew Wong
>            Priority: Major
>
> In the event of a disk failure, the current workflow is:
>  # The Kudu operator shuts down Kudu for a maintenance window
>  # The data center operator replaces their disk
>  # The Kudu operator runs {{fs update_dirs}}
>  # The Kudu operator restarts Kudu
> Step 3 is unlike what most systems do. As an operator, it would be nice to 
> not have to do it. Once my disk is replaced, Kudu should just know that it's 
> OK to start up (e.g. because it notices a completely empty disk where it 
> expected an existing one), and perhaps run the {{update_dirs}} tool 
> automatically.
> An argument could be made that we shouldn't do this if we're not sure that 
> the operator wants to, as replacing a disk may result in failed tablets. If 
> the missing directory was caused by a simple user input error, maybe we 
> shouldn't have run the tool and failed some tablets. But given many Kudu 
> operators automate their deployment of Kudu, it's hard to think of a time 
> when they _wouldn't_ want to have Kudu run the tool.
> In case the tool fails because the "missing" directory turned out to be a 
> failed disk, we should simply start Kudu up with the data dir marked failed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
