On Sat, May 30, 2026, at 9:02 AM, John Groves wrote:
> On 26/05/26 05:16PM, Dave Jiang wrote:
> > 
> > 
> > On 5/22/26 12:19 PM, John Groves wrote:
> > > From: John Groves <[email protected]>
> > > 
> > > Clear holder_ops before holder_data so that a concurrent fs_dax_get()
> > > cannot have its newly installed holder_ops overwritten. Also add a
> > > kerneldoc comment documenting that fs_put_dax() must only be called
> > > by the current holder.
> > > 
> > > Fixes: eec38f5d86d27 ("dax: add fs_dax_get() for devdax")
> > > Signed-off-by: John Groves <[email protected]>
> > 
> > Couple things from Claude that may be worth taking a look at:
> > 
> >   1. Memory ordering is now load-bearing and missing
> > 
> >   The whole correctness argument depends on the reader observing holder_ops 
> > =
> >   NULL before observing holder_data = NULL. The patch uses a plain store
> >   followed by cmpxchg. On x86 plain stores are ordered, but on arm64/ppc 
> > they
> >   are not — the reader can observe cmpxchg's release of holder_data while 
> > still
> >   seeing the old holder_ops. That puts us back in the dangerous 
> > (holder_data ==
> >   NULL, holder_ops == old) state on weakly-ordered arches.
> > 
> >   Required:
> > 
> >   smp_store_release(&dax_dev->holder_ops, NULL);   /* publish ops=NULL 
> > first */
> >   cmpxchg(&dax_dev->holder_data, holder, NULL);    /* then release 
> > holder_data
> >   */
> 
> Updating to WRITE_ONCE(), which I think is the right choice
> 
> > 
> >   And the reader in dax_holder_notify_failure should use
> >   smp_load_acquire/READ_ONCE because today it reads dax_dev->holder_ops 
> > twice
> >   (line 334 and line 339), allowing tearing or stale-cache reads. 
> > Pre-existing
> >   weakness, but this patch is what makes the ordering matter.
> > 
> >   kill_dax (line 461-462) has the same naked-store pattern — it should be 
> > made
> >   consistent.
> 
> Will study this and post a separate patch for kill_dax if I think it's
> warranted
> 

Fixing kill_dax() isn't necessary because it does a synchronize_srcu() after
clear_bit(DAXDEV_ALIVE). So it can't race with fs_dax_get()...

Thanks,
John

Reply via email to