On Tue, Jan 12, 2021 at 11:17:45AM +0200, Nikolay Borisov wrote:
> 
> 
> On 11.01.21 г. 23:50 ч., David Sterba wrote:
> > On Mon, Jan 11, 2021 at 10:33:42AM +0200, Nikolay Borisov wrote:
> >> On 8.01.21 г. 18:01 ч., David Sterba wrote:
> >>> On Fri, Dec 18, 2020 at 02:24:20PM -0500, Josef Bacik wrote:
> >>>> @@ -2043,23 +2043,22 @@ int btrfs_commit_transaction(struct 
> >>>> btrfs_trans_handle *trans)
> >>>>          btrfs_trans_release_metadata(trans);
> >>>>          trans->block_rsv = NULL;
> >>>>  
> >>>> -        /* make a pass through all the delayed refs we have so far
> >>>> -         * any runnings procs may add more while we are here
> >>>> -         */
> >>>> -        ret = btrfs_run_delayed_refs(trans, 0);
> >>>> -        if (ret) {
> >>>> -                btrfs_end_transaction(trans);
> >>>> -                return ret;
> >>>> -        }
> >>>> -
> >>>> -        cur_trans = trans->transaction;
> >>>> -
> >>>>          /*
> >>>> -         * set the flushing flag so procs in this transaction have to
> >>>> -         * start sending their work down.
> >>>> +         * We only want one transaction commit doing the flushing so we 
> >>>> do not
> >>>> +         * waste a bunch of time on lock contention on the extent root 
> >>>> node.
> >>>>           */
> >>>> -        cur_trans->delayed_refs.flushing = 1;
> >>>> -        smp_wmb();
> >>>
> >>> This barrier obviously separates the flushing = 1 and the rest of the
> >>> code, now implemented as test_and_set_bit, which implies full barrier.
> >>>
> >>> However, hunk in btrfs_should_end_transaction removes the barrier and
> >>> I'm not sure whether this is correct:
> >>>
> >>> - smp_mb();
> >>>   if (cur_trans->state >= TRANS_STATE_COMMIT_START ||
> >>> -     cur_trans->delayed_refs.flushing)
> >>> +     test_bit(BTRFS_DELAYED_REFS_FLUSHING,
> >>> +              &cur_trans->delayed_refs.flags))
> >>>           return true;
> >>>
> >>> This is never called under locks so we don't have complete
> >>> synchronization of neither the transaction state nor the flushing bit.
> >>> btrfs_should_end_transaction is merely a hint and not called in critical
> >>> places so we could probably afford to keep it without a barrier, or keep
> >>> it with comment(s).
> >>
> >> I think the point is moot in this case, because the test_bit either sees
> >> the flag or it doesn't. It's not possible for the flag to be set AND
> >> should_end_transaction return false that would be gross violation of
> >> program correctness.
> > 
> > So that's for the flushing part, but what about cur_trans->state?
> 
> Looking at the code, the barrier was there to order the publishing of
> the delayed_ref.flushing (now replaced by the bit flag) against
> surrounding code.
> 
> So independently of this patch, let's reason about trans state. In
> should_end_transaction it's read without holding any locks. (U)
> 
> It's modified in btrfs_cleanup_transaction without holding the
> fs_info->trans_lock (U), but the STATE_ERROR flag is going to be set.
> 
> set in cleanup_transaction under fs_info->trans_lock (L)
> set in btrfs_commit_trans to COMMIT_START under fs_info->trans_lock.(L)
> set in btrfs_commit_trans to COMMIT_DOING under fs_info->trans_lock.(L)
> set in btrfs_commit_trans to COMMIT_UNBLOCK under fs_info->trans_lock.(L)
> 
> set in btrfs_commit_trans to COMMIT_COMPLETED without locks but at this
> point the transaction is finished and fs_info->running_trans is NULL (U
> but irrelevant).
> 
> So by the looks of it we can have a concurrent READ race with a Write,
> due to reads not taking a lock. In this case what we want to ensure is
> we either see new or old state. I consulted with Will Deacon and he said
> that in such a case we'd want to annotate the accesses to ->state with
> (READ|WRITE)_ONCE so as to avoid a theoretical tear, in this case I
> don't think this could happen but I imagine at some point kcsan would
> flag such an access as racy (which it is).

Thanks for the analysis, I've copied it to the changelog as there's
probably no shorter way to explain it.

Reply via email to