On Thu, Jun 11, 2026 at 05:31:59PM +0800, John Groves wrote:
> From: John Groves <[email protected]>
> 
> Fix memory_failure offset calculation for multi-range devices. The old code
> subtracted ranges[0].range.start from the faulting PFN's physical address,
> which produces an incorrect (inflated) logical offset when the PFN falls in
> ranges[1] or beyond due to physical gaps between ranges. Add
> fsdev_pfn_to_offset() to walk the range list and compute the correct
> device-linear byte offset.
> 
> Walk the pagemap's own range array (pgmap->ranges[]) rather than
> dev_dax->ranges[]. The pgmap copy is the immutable snapshot populated at
> probe and is never mutated afterwards, whereas dev_dax->ranges[] can be
> krealloc()'d by a concurrent sysfs mapping_store() (under dax_region_rwsem,
> which this ->memory_failure callback does not hold). For dynamic devices the
> two arrays are identical, so the reported offset is unchanged for the
> multi-range case this targets.
> 
> Fixes: d5406bd458b0a ("dax: add fsdev.c driver for fs-dax on character dax")
> 
> Suggested-by: Richard Cheng <[email protected]>
> Reviewed-by: Dave Jiang <[email protected]>
> Reviewed-by: Alison Schofield <[email protected]>
> Signed-off-by: John Groves <[email protected]>
> ---
>  drivers/dax/fsdev.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/dax/fsdev.c b/drivers/dax/fsdev.c
> index 188b2526bee45..2c5de3d80a618 100644
> --- a/drivers/dax/fsdev.c
> +++ b/drivers/dax/fsdev.c
> @@ -135,11 +135,26 @@ static void fsdev_clear_ops(void *data)
>   * The core mm code in free_zone_device_folio() handles the wake_up_var()
>   * directly for this memory type.
>   */
> +static u64 fsdev_pfn_to_offset(struct dev_pagemap *pgmap, unsigned long pfn)
> +{
> +     phys_addr_t phys = PFN_PHYS(pfn);
> +     u64 offset = 0;
> +
> +     for (int i = 0; i < pgmap->nr_range; i++) {
> +             struct range *range = &pgmap->ranges[i];
> +
> +             if (phys >= range->start && phys <= range->end)
> +                     return offset + (phys - range->start);
> +             offset += range_len(range);
> +     }
> +     return -1ULL;
> +}
> +
>  static int fsdev_pagemap_memory_failure(struct dev_pagemap *pgmap,
>               unsigned long pfn, unsigned long nr_pages, int mf_flags)
>  {
>       struct dev_dax *dev_dax = pgmap->owner;
> -     u64 offset = PFN_PHYS(pfn) - dev_dax->ranges[0].range.start;
> +     u64 offset = fsdev_pfn_to_offset(pgmap, pfn);

Hi John,

I think this regresses static devices. On a static device,
pgmap->ranges[0].start can sit data_offset below
dev_dax->ranges[0].range.start, so the new offset is the old offset plus
data_offset, and XFS poisons the wrong blocks.

The gap walk only helps dynamic devices, where data_offset == 0. Maybe walk
pgmap->ranges[] and subtract the probe's data_offset?
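
To make the arithmetic concrete, here is a rough, untested userspace model of
the two calculations. It is only a sketch: struct range is a stand-in for the
kernel's, walk_offset_minus_data() is a hypothetical name, and data_offset is
assumed to be whatever value probe used to skip the metadata area. How a fault
inside the metadata area should be reported is also an assumption here.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's struct range (end is inclusive). */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Same walk as fsdev_pfn_to_offset() in the patch, taking a physical address. */
static uint64_t walk_offset(const struct range *ranges, int nr, uint64_t phys)
{
	uint64_t offset = 0;

	for (int i = 0; i < nr; i++) {
		if (phys >= ranges[i].start && phys <= ranges[i].end)
			return offset + (phys - ranges[i].start);
		/* Accumulate the length of every range we skip over. */
		offset += ranges[i].end - ranges[i].start + 1;
	}
	return UINT64_MAX;	/* not in any range */
}

/*
 * Hypothetical variant: walk the pgmap ranges, then subtract data_offset so
 * the result is relative to the start of the data area.
 */
static uint64_t walk_offset_minus_data(const struct range *ranges, int nr,
				       uint64_t phys, uint64_t data_offset)
{
	uint64_t offset = walk_offset(ranges, nr, phys);

	/* Assumption: a hit below data_offset is treated as "no offset". */
	if (offset == UINT64_MAX || offset < data_offset)
		return UINT64_MAX;
	return offset - data_offset;
}
```

With a made-up static device whose pgmap range starts at 0x100000000 and whose
data_offset is 0x200000, a fault at 0x100300000 gives 0x300000 from the plain
walk, while the old calculation and the adjusted walk both give 0x100000.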

--Richard

>       u64 len = nr_pages << PAGE_SHIFT;
>  
>       return dax_holder_notify_failure(dev_dax->dax_dev, offset,
> -- 
> 2.53.0
> 
