On 2016-10-28 22:24:52 [+0000], Trond Myklebust wrote:
> > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> > index 5f4281ec5f72..a442d9867942 100644
> > --- a/fs/nfs/nfs4state.c
> > +++ b/fs/nfs/nfs4state.c
> > @@ -1497,8 +1498,18 @@ static int nfs4_reclaim_open_state(struct nfs4_state_owner *sp, const struct nfs
> >  	 * recovering after a network partition or a reboot from a
> >  	 * server that doesn't support a grace period.
> >  	 */
> > +	/*
> > +	 * XXX
> > +	 * This mutex is wrong. It protects against multiple writers. However
> 
> There is only 1 recovery thread per client/server pair, which is why we
> know there is only a single writer. No need for a mutex here.
This isn't documented but it should be.

> > +	 * write_seqlock() should have been used for this task. This would avoid
> > +	 * preemption while the seqlock is held which is good because the writer
> > +	 * shouldn't be preempted as it would let the reader spin for no good
> 
> Recovery involves sending RPC calls to a server. There is no avoiding
> preemption if we have to recover state. The point of the sequence counter
> is to signal to processes that may have raced with this recovery thread
> that they may need to replay their locks.

Then the sequence counter is the wrong mechanism here. As I explained: the
call can be preempted, and once it is, the reader will spin and waste
cycles. What is wrong with a rwsem? Aren't the readers preemptible there?

> > +	 * reason. There are a few memory allocations and kthread_run() so we
> > +	 * have this mutex now.
> > +	 */
> > +	mutex_lock(&sp->so_reclaim_seqlock_mutex);
> > +	write_seqcount_begin(&sp->so_reclaim_seqlock.seqcount);
> >  	spin_lock(&sp->so_lock);
> > -	raw_write_seqcount_begin(&sp->so_reclaim_seqcount);
> >  restart:
> >  	list_for_each_entry(state, &sp->so_states, open_states) {
> >  		if (!test_and_clear_bit(ops->state_flag_bit, &state->flags))
> > @@ -1566,14 +1577,14 @@ static int nfs4_reclaim_open_state(struct nfs4_state_owner *sp, const struct nfs
> >  			spin_lock(&sp->so_lock);
> >  			goto restart;
> >  		}
> > -	raw_write_seqcount_end(&sp->so_reclaim_seqcount);
> >  	spin_unlock(&sp->so_lock);
> > +	write_seqcount_end(&sp->so_reclaim_seqlock.seqcount);
> 
> This will reintroduce lockdep checking. We don’t need or want that...

Why isn't lockdep welcome? And what kind of warning will it introduce? How
do I trigger it? I tried various things but never got close to this.

Sebastian