On 26/06/12 11:08AM, Richard Cheng wrote:
> On Thu, Jun 11, 2026 at 05:31:59PM +0800, John Groves wrote:
> > From: John Groves <[email protected]>
> >
> > Fix memory_failure offset calculation for multi-range devices. The old code
> > subtracted ranges[0].range.start from the faulting PFN's physical address,
> > which produces an incorrect (inflated) logical offset when the PFN falls in
> > ranges[1] or beyond due to physical gaps between ranges. Add
> > fsdev_pfn_to_offset() to walk the range list and compute the correct
> > device-linear byte offset.
> >
> > Walk the pagemap's own range array (pgmap->ranges[]) rather than
> > dev_dax->ranges[]. The pgmap copy is the immutable snapshot populated at
> > probe and is never mutated afterwards, whereas dev_dax->ranges[] can be
> > krealloc()'d by a concurrent sysfs mapping_store() (under dax_region_rwsem,
> > which this ->memory_failure callback does not hold). For dynamic devices the
> > two arrays are identical, so the reported offset is unchanged for the
> > multi-range case this targets.
> >
> > Fixes: d5406bd458b0a ("dax: add fsdev.c driver for fs-dax on character dax")
> >
> > Suggested-by: Richard Cheng <[email protected]>
> > Reviewed-by: Dave Jiang <[email protected]>
> > Reviewed-by: Alison Schofield <[email protected]>
> > Signed-off-by: John Groves <[email protected]>
> > ---
> >  drivers/dax/fsdev.c | 17 ++++++++++++++++-
> >  1 file changed, 16 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/dax/fsdev.c b/drivers/dax/fsdev.c
> > index 188b2526bee45..2c5de3d80a618 100644
> > --- a/drivers/dax/fsdev.c
> > +++ b/drivers/dax/fsdev.c
> > @@ -135,11 +135,26 @@ static void fsdev_clear_ops(void *data)
> >   * The core mm code in free_zone_device_folio() handles the wake_up_var()
> >   * directly for this memory type.
> >   */
> > +static u64 fsdev_pfn_to_offset(struct dev_pagemap *pgmap, unsigned long
> > pfn)
> > +{
> > +	phys_addr_t phys = PFN_PHYS(pfn);
> > +	u64 offset = 0;
> > +
> > +	for (int i = 0; i < pgmap->nr_range; i++) {
> > +		struct range *range = &pgmap->ranges[i];
> > +
> > +		if (phys >= range->start && phys <= range->end)
> > +			return offset + (phys - range->start);
> > +		offset += range_len(range);
> > +	}
> > +	return -1ULL;
> > +}
> > +
> >  static int fsdev_pagemap_memory_failure(struct dev_pagemap *pgmap,
> >  		unsigned long pfn, unsigned long nr_pages, int mf_flags)
> >  {
> >  	struct dev_dax *dev_dax = pgmap->owner;
> > -	u64 offset = PFN_PHYS(pfn) - dev_dax->ranges[0].range.start;
> > +	u64 offset = fsdev_pfn_to_offset(pgmap, pfn);
>
> Hi John,
>
> I think this regresses static devices. pgmap->ranges[0].start can sit
> data_offset below it on a static device, so the new offset = old + data_offset,
> and XFS poisons the wrong blocks.
>
> The gap walk only helps dynamic devices where data_offset == 0. Maybe walk
> pgmap->ranges and subtract the probe's data_offset.
>
> --Richard

Ugh, right. Subtracting the data_offset would require newly stashing it
somewhere the ->memory_failure callback could reach. So I'm reverting to
walking dev_dax->ranges[] -- the maybe-race there is the same one the
pre-existing single-range code already had.

I'd like to land this series before going too much farther down the
suspected pre-existing issues rabbit hole :D

Note: the current version of this patch (switching to pgmap->ranges) might
have been a bit much for keeping Dave and Alison's RB tags - but I'm
reverting back to what they reviewed for V6.

Thanks,
John

<snip>
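
For anyone following along, here is a rough userspace sketch of the two effects
discussed above. It is not kernel code: struct range and range_len() are
simplified stand-ins, phys_to_offset() is a hypothetical name for the same walk
the quoted fsdev_pfn_to_offset() does, and every address (including the 0x200
data_offset) is made up purely for illustration.

```c
#include <stdint.h>

/* Simplified stand-in for the kernel's struct range (end is inclusive). */
struct range {
	uint64_t start;
	uint64_t end;
};

static uint64_t range_len(const struct range *r)
{
	return r->end - r->start + 1;
}

/*
 * Hypothetical helper: the same walk as the quoted fsdev_pfn_to_offset(),
 * but over a plain array and taking a physical address instead of a PFN.
 * Returns UINT64_MAX when phys is not inside any range.
 */
static uint64_t phys_to_offset(const struct range *ranges, int nr_range,
			       uint64_t phys)
{
	uint64_t offset = 0;

	for (int i = 0; i < nr_range; i++) {
		if (phys >= ranges[i].start && phys <= ranges[i].end)
			return offset + (phys - ranges[i].start);
		offset += range_len(&ranges[i]);
	}
	return UINT64_MAX;
}

/* Made-up dynamic device: two 4 KiB ranges with a physical gap between. */
static const struct range dynamic_ranges[] = {
	{ 0x1000, 0x1fff },
	{ 0x4000, 0x4fff },
};

/*
 * Made-up static device with an assumed data_offset of 0x200: the pgmap
 * range starts data_offset below the dev_dax range.
 */
static const struct range static_pgmap[] = { { 0x1000, 0x1fff } };
static const struct range static_devdax[] = { { 0x1200, 0x1fff } };
```

With these numbers, a fault at 0x4100 on the dynamic device gives 0x3100 with
the old ranges[0] subtraction but 0x1100 with the walk, which is the
multi-range fix. On the static device, a fault at 0x1300 gives 0x100 when
walking the dev_dax ranges but 0x300 when walking the pgmap ranges, i.e. old +
data_offset, which is the regression Richard pointed out.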

