CephFS currently deadlocks under CTDB's ping_pong POSIX locking test
when run concurrently on multiple nodes.
The deadlock is caused by failed removal of a waiting_locks entry when
the waiting lock is merged with an existing lock, e.g.:

Initial MDS state (two clients, same file):
held_locks -- start: 0, length: 1, client: 4116, pid: 7899, type: 2
              start: 2, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4116, pid: 7899, type: 2

Waiting lock entry 4116@1:1 fires:
handle_client_file_setlock: start: 1, length: 1,
                            client: 4116, pid: 7899, type: 2

MDS state after lock is obtained:
held_locks -- start: 0, length: 2, client: 4116, pid: 7899, type: 2
              start: 2, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4116, pid: 7899, type: 2

Note that the waiting 4116@1:1 lock entry is merged with the existing
4116@0:1 held lock to become a 4116@0:2 held lock. However, the
now-handled 4116@1:1 waiting_locks entry remains.

When handling a lock request, the MDS calls adjust_locks() to merge
the new lock with any available neighbours. If the new lock is merged,
adjust_locks() rewrites its start and length, so the subsequent
remove_waiting() call no longer locates the corresponding waiting_locks
entry.

This fix ensures that the waiting_locks entry is removed prior to
modification during merge.
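
For illustration only, below is a minimal standalone sketch of both
orderings. It is not the MDS code: toy_lock, toy_merge(),
toy_remove_waiting() and run() are hypothetical stand-ins for
ceph_filelock, the merge done in adjust_locks() and remove_waiting(),
assuming only that the held/waiting tables are multimaps keyed by lock
start (as the held_locks.insert() calls in the diff suggest).

// Standalone illustration; toy_* names are hypothetical stand-ins,
// not the flock.cc API.
#include <cstdint>
#include <iostream>
#include <map>
#include <utility>

struct toy_lock {
  uint64_t start;
  uint64_t length;
  int64_t client;
};

typedef std::multimap<uint64_t, toy_lock> toy_lock_map;  // keyed by start

// Stand-in for the merge performed by adjust_locks(): coalesce new_lock
// with a directly preceding held lock of the same owner by rewriting
// new_lock's start/length in place.
static void toy_merge(toy_lock_map& held, toy_lock& new_lock) {
  for (toy_lock_map::iterator it = held.begin(); it != held.end(); ++it) {
    toy_lock& old_lock = it->second;
    if (old_lock.client == new_lock.client &&
        old_lock.start + old_lock.length == new_lock.start) {
      new_lock.length += new_lock.start - old_lock.start;  // 1 -> 2
      new_lock.start = old_lock.start;                     // 1 -> 0
      held.erase(it);
      break;
    }
  }
}

// Stand-in for remove_waiting(): drop the waiting entry matching the lock
// being granted, looked up by its *current* start and length.
static bool toy_remove_waiting(toy_lock_map& waiting, const toy_lock& fl) {
  std::pair<toy_lock_map::iterator, toy_lock_map::iterator> range =
      waiting.equal_range(fl.start);
  for (toy_lock_map::iterator it = range.first; it != range.second; ++it) {
    if (it->second.length == fl.length && it->second.client == fl.client) {
      waiting.erase(it);
      return true;
    }
  }
  return false;  // no match: the stale entry is left behind
}

static void run(bool remove_before_merge) {
  toy_lock_map held, waiting;
  toy_lock held_lock = {0, 1, 4116};  // held 4116@0:1
  toy_lock new_lock = {1, 1, 4116};   // waiting 4116@1:1, now being granted
  held.insert(std::make_pair(held_lock.start, held_lock));
  waiting.insert(std::make_pair(new_lock.start, new_lock));

  bool removed = false;
  if (remove_before_merge)                          // fixed ordering
    removed = toy_remove_waiting(waiting, new_lock);
  toy_merge(held, new_lock);                        // new_lock becomes 4116@0:2
  held.insert(std::make_pair(new_lock.start, new_lock));
  if (!remove_before_merge)                         // old ordering
    removed = toy_remove_waiting(waiting, new_lock);

  std::cout << (remove_before_merge ? "before merge: " : "after merge:  ")
            << "removed=" << removed
            << " stale_waiting=" << waiting.size() << std::endl;
}

int main() {
  run(false);  // old ordering:   removed=0 stale_waiting=1 (entry leaks)
  run(true);   // fixed ordering: removed=1 stale_waiting=0
  return 0;
}

With the old ordering the stale 4116@1:1 entry survives in waiting_locks;
removing it before the merge, as the patch below does, finds and drops it.
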
Signed-off-by: David Disseldorp <[email protected]>
---
src/mds/flock.cc | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/src/mds/flock.cc b/src/mds/flock.cc
index e83c5ee..5e329af 100644
--- a/src/mds/flock.cc
+++ b/src/mds/flock.cc
@@ -75,12 +75,14 @@ bool ceph_lock_state_t::add_lock(ceph_filelock& new_lock,
       } else {
         //yay, we can insert a shared lock
         dout(15) << "inserting shared lock" << dendl;
+        remove_waiting(new_lock);
         adjust_locks(self_overlapping_locks, new_lock, neighbor_locks);
         held_locks.insert(pair<uint64_t, ceph_filelock>(new_lock.start, new_lock));
         ret = true;
       }
     }
   } else { //no overlapping locks except our own
+    remove_waiting(new_lock);
     adjust_locks(self_overlapping_locks, new_lock, neighbor_locks);
     dout(15) << "no conflicts, inserting " << new_lock << dendl;
     held_locks.insert(pair<uint64_t, ceph_filelock>
@@ -89,7 +91,6 @@ bool ceph_lock_state_t::add_lock(ceph_filelock& new_lock,
   }
   if (ret) {
     ++client_held_lock_counts[(client_t)new_lock.client];
-    remove_waiting(new_lock);
   }
   else if (wait_on_fail && !replay)
     ++client_waiting_lock_counts[(client_t)new_lock.client];
@@ -306,7 +307,7 @@ void ceph_lock_state_t::adjust_locks(list<multimap<uint64_t, ceph_filelock>::ite
     old_lock = &(*iter)->second;
     old_lock_client = old_lock->client;
     dout(15) << "lock to coalesce: " << *old_lock << dendl;
-    /* because if it's a neibhoring lock there can't be any self-overlapping
+    /* because if it's a neighboring lock there can't be any self-overlapping
        locks that covered it */
     if (old_lock->type == new_lock.type) { //merge them
       if (0 == new_lock.length) {
--
1.8.1.4