There's one potential memory ordering problem regarding
eb::blocking_writers, although this hasn't been hit in practice. On TSO
(total store ordering) architectures, the writes will always be seen in
the right order.
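For reference, the two stores in question, as they appear in
btrfs_set_lock_blocking_write (surrounding assertions elided, so this is
a condensed sketch rather than the verbatim code):

	/* Program order: mark the writer as blocking first ... */
	WRITE_ONCE(eb->blocking_writers, 1);
	/* ... then release the spinning rwlock */
	write_unlock(&eb->lock);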
In case the calls in btrfs_set_lock_blocking_write are seen in this
(wrong) order on another CPU:

0:  /* blocking_writers == 0 */
1:  write_unlock()
2:  blocking_writers = 1

btrfs_try_tree_read_lock, btrfs_tree_read_lock_atomic or
btrfs_try_tree_write_lock would have to squeeze all of this between
1: and 2:

- check if blocking_writers is 0 (it is, continue)
- try read lock, read lock or write lock (succeeds after 1:)
- check blocking_writers again (continue)

(A condensed sketch of this check/trylock/recheck pattern follows the
patch below.)

All of that assumes that the reordered writes can survive for quite some
time (unlikely if they're still in the internal store buffer). Another
point against observing that in practice is that blocking_writers and
the lock are on the same cacheline (64B), so it's more likely both
values are stored in order, or some sort of pending write flush will
store both blocking_writers and the rwlock before the try lock happens.
Assuming the CPUs work like that and don't hide other surprises.

struct extent_buffer {
        u64                        start;                /*     0     8 */
        long unsigned int          len;                  /*     8     8 */
        long unsigned int          bflags;               /*    16     8 */
        struct btrfs_fs_info *     fs_info;              /*    24     8 */
        spinlock_t                 refs_lock;            /*    32     4 */
        atomic_t                   refs;                 /*    36     4 */
        atomic_t                   io_pages;             /*    40     4 */
        int                        read_mirror;          /*    44     4 */
        struct callback_head       callback_head __attribute__((__aligned__(8))); /*    48    16 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        pid_t                      lock_owner;           /*    64     4 */
        int                        blocking_writers;     /*    68     4 */
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        atomic_t                   blocking_readers;     /*    72     4 */
        bool                       lock_nested;          /*    76     1 */

        /* XXX 1 byte hole, try to pack */

        short int                  log_index;            /*    78     2 */
        rwlock_t                   lock;                 /*    80     8 */
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        wait_queue_head_t          write_lock_wq;        /*    88    24 */
        wait_queue_head_t          read_lock_wq;         /*   112    24 */
        /* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */
        struct page *              pages[16];            /*   136   128 */

        /* size: 264, cachelines: 5, members: 18 */
        /* sum members: 263, holes: 1, sum holes: 1 */
        /* forced alignments: 1 */
        /* last cacheline: 8 bytes */
} __attribute__((__aligned__(8)));

Add the barriers for correctness' sake.

Signed-off-by: David Sterba <dste...@suse.com>
---
 fs/btrfs/locking.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c
index a12f3edc3505..e0e0430577aa 100644
--- a/fs/btrfs/locking.c
+++ b/fs/btrfs/locking.c
@@ -110,6 +110,18 @@ void btrfs_set_lock_blocking_write(struct extent_buffer *eb)
 		btrfs_assert_spinning_writers_put(eb);
 		btrfs_assert_tree_locked(eb);
 		WRITE_ONCE(eb->blocking_writers, 1);
+		/*
+		 * Writers must be updated before the lock is released so
+		 * that other threads see that in order. Otherwise this could
+		 * theoretically lead to blocking_writers still set to 1,
+		 * which would be unexpected eg. for spinning locks.
+		 *
+		 * There are no pairing read barriers for unlocked access as
+		 * we don't strictly need them for correctness (eg. in
+		 * btrfs_try_tree_read_lock), and the unlock/lock is an
+		 * implied barrier in other cases.
+		 */
+		smp_wmb();
 		write_unlock(&eb->lock);
 	}
 }
@@ -316,7 +328,9 @@ void btrfs_tree_unlock(struct extent_buffer *eb)
 		/*
 		 * We need to order modifying blocking_writers above with
 		 * actually waking up the sleepers to ensure they see the
-		 * updated value of blocking_writers
+		 * updated value of blocking_writers.
+		 *
+		 * cond_wake_up calls smp_mb()
 		 */
 		cond_wake_up(&eb->write_lock_wq);
 	} else {
--
2.23.0
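As a side note, the check/trylock/recheck pattern referenced in the
scenario above, as a condensed sketch (modeled on
btrfs_try_tree_read_lock; lock accounting, assertions and tracepoints
elided, so not the verbatim kernel code):

	int btrfs_try_tree_read_lock(struct extent_buffer *eb)
	{
		if (eb->blocking_writers)	/* 1st check: would see 0 */
			return 0;

		if (!read_trylock(&eb->lock))	/* succeeds after the unlock */
			return 0;

		if (eb->blocking_writers) {	/* recheck blocking_writers */
			read_unlock(&eb->lock);
			return 0;
		}
		return 1;
	}

With the smp_wmb() added by the patch, the blocking_writers store is
ordered before the rwlock store, so a trylock that succeeds after the
unlock is not expected to still find the stale zero at the recheck.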