Hello Dan Burkert, Todd Lipcon, Kudu Jenkins,
I'd like you to reexamine a change. Please visit
to look at the new patch set (#8).
Change subject: master: add read-write lock to serialize operations around elections
master: add read-write lock to serialize operations around elections
This rigmarole began with an investigation into a test failure, which
led to a new integration test that hammers VisitTablesAndTablets() while
creating tables. That test revealed other locking issues, which brings us
to where we are now.
This patch introduces a read-write lock to serialize all master operations
so that they fall on one side or the other of a leader election. The idea
is to avoid performing operations concurrently with a reload of the master
metadata; doing so can lead to problems in Shutdown() and (very rarely,
perhaps only conceptually) to inconsistent on-disk state.
I was hoping this lock could replace the fencing done by leader_ready_term_,
but eventually reasoned that we need both; without leader_ready_term_
fencing, the master's consensus state machine could fool an operation into
thinking the master became the leader before the metadata was reloaded.
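
To make the interplay concrete, here is a minimal C++17 sketch of the lock and
the term fencing working together. It is not the code in this patch: Catalog,
ObserveNewTerm(), ReloadMetadata(), RunOperation(), and the Status values are
hypothetical names, and std::shared_mutex stands in for whatever read-write
lock the master actually uses. Only leader_ready_term_ is taken from the
description above. The shared side is acquired with a non-blocking try, so an
operation that races with a reload fails fast.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <shared_mutex>

// Hypothetical stand-in for the master's catalog; all names are illustrative.
class Catalog {
 public:
  enum class Status { kOk, kBusy, kNotLeader };

  // Consensus reports that this node is the leader for `term`. The metadata
  // has not been reloaded yet, so operations must still be refused.
  void ObserveNewTerm(std::int64_t term) { current_term_.store(term); }

  // Reloads metadata under the exclusive side of the lock, then publishes
  // the term for which the in-memory state is valid.
  void ReloadMetadata() {
    std::unique_lock<std::shared_mutex> l(leader_lock_);
    // ... reload tables and tablets from disk here ...
    leader_ready_term_.store(current_term_.load());
  }

  // Runs one operation on the shared side of the lock.
  Status RunOperation() {
    std::shared_lock<std::shared_mutex> l(leader_lock_, std::try_to_lock);
    if (!l.owns_lock()) {
      return Status::kBusy;  // exclusive side is held, e.g. by a reload
    }
    if (leader_ready_term_.load() != current_term_.load()) {
      return Status::kNotLeader;  // leader per consensus, metadata not ready
    }
    // ... perform the operation against the loaded metadata ...
    return Status::kOk;
  }

 private:
  std::shared_mutex leader_lock_;
  std::atomic<std::int64_t> current_term_{0};
  std::atomic<std::int64_t> leader_ready_term_{-1};
};
```

With the lock alone, an operation that starts after ObserveNewTerm() but
before ReloadMetadata() would acquire the shared side uncontended and run
against stale metadata; the term comparison is what rejects it.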
Three other things of note here:
- The new lock is acquired via TryLock() so that, if the lock cannot be
acquired, the RPC fails instead of blocking. A future patch modifies
TSHeartbeat() to partially accept heartbeats even if the master is a
follower; TryLock() means that a transitioning leader that is pelted with
RPCs won't fill up its service queue and can still process heartbeats.
- TableInfo's AddTask() and RemoveTask() methods no longer hold the table's
lock when adding a ref to or removing a ref from the task, respectively. This is
the fix for the original test failure.
- When reloading metadata, we now abort all outstanding table tasks to
avoid orphaning them.
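
The AddTask()/RemoveTask() change can be illustrated with a small C++ sketch
of the general pattern: mutate the task set under the table's lock, but adjust
the task's ref count outside it. This is an assumption-laden illustration, not
the patch itself. The Task class, its AddRef()/Release()/refs() methods,
pending_tasks_, lock_, and num_tasks() are hypothetical, and std::mutex stands
in for the real lock type. A common motivation for this pattern is that
dropping a ref may run arbitrary cleanup code that should not execute while
the table's lock is held; the description above does not state the exact
failure mode.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <unordered_set>

// Hypothetical ref-counted task. A real task would likely destroy itself
// when its count reaches zero; this one only tracks the count.
class Task {
 public:
  void AddRef() { refs_.fetch_add(1); }
  void Release() { refs_.fetch_sub(1); }
  int refs() const { return refs_.load(); }

 private:
  std::atomic<int> refs_{0};
};

// Hypothetical table that tracks its outstanding tasks.
class TableInfo {
 public:
  void AddTask(Task* task) {
    task->AddRef();  // ref taken before the lock is acquired
    bool inserted;
    {
      std::lock_guard<std::mutex> l(lock_);
      inserted = pending_tasks_.insert(task).second;
    }
    if (!inserted) task->Release();  // already tracked: undo the extra ref
  }

  void RemoveTask(Task* task) {
    std::size_t erased;
    {
      std::lock_guard<std::mutex> l(lock_);
      erased = pending_tasks_.erase(task);
    }
    if (erased > 0) task->Release();  // ref dropped after the lock is released
  }

  std::size_t num_tasks() const {
    std::lock_guard<std::mutex> l(lock_);
    return pending_tasks_.size();
  }

 private:
  mutable std::mutex lock_;
  std::unordered_set<Task*> pending_tasks_;
};
```

The lock scope covers only the set mutation, so any work triggered by a ref
change never runs inside the table's critical section.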
6 files changed, 334 insertions(+), 144 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/50/3550/8
To view, visit http://gerrit.cloudera.org:8080/3550
Gerrit-Owner: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <d...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dral...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>