From: Naman Jain <[email protected]> Sent: Tuesday, March 31, 2026 
10:40 PM
> 
> When registering VTL0 memory via MSHV_ADD_VTL0_MEMORY, the kernel
> computes pgmap->vmemmap_shift as the number of trailing zeros in the
> OR of start_pfn and last_pfn, intending to use the largest compound
> page order both endpoints are aligned to.
> 
> However, this value is not clamped to MAX_FOLIO_ORDER, so a
> sufficiently aligned range (e.g. physical range 0x800000000000-
> 0x800080000000, corresponding to start_pfn=0x800000000 with 35
> trailing zeros) can produce a shift larger than what
> memremap_pages() accepts, triggering a WARN and returning -EINVAL:
> 
>   WARNING: ... memremap_pages+0x512/0x650
>   requested folio size unsupported
> 
> The MAX_FOLIO_ORDER check was added by
> commit 646b67d57589 ("mm/memremap: reject unreasonable folio/compound
> page sizes in memremap_pages()").
> When CONFIG_HAVE_GIGANTIC_FOLIOS=y, CONFIG_SPARSEMEM_VMEMMAP=y, and
> CONFIG_HUGETLB_PAGE is not set, MAX_FOLIO_ORDER resolves to
> (PUD_SHIFT - PAGE_SHIFT) = 18. Any range whose PFN alignment exceeds
> order 18 hits this path.

I'm not clear on what point you are making with this specific
configuration that results in MAX_FOLIO_ORDER being 18. Is it just
an example? Is 18 the largest expected value for MAX_FOLIO_ORDER?
And note that PUD_SHIFT and PAGE_SHIFT might have different values
on arm64 with a page size other than 4K.

> 
> Fix this by clamping vmemmap_shift to MAX_FOLIO_ORDER so we always
> request the largest order the kernel supports, rather than an
> out-of-range value.
> 
> Also fix the error path to propagate the actual error code from
> devm_memremap_pages() instead of hard-coding -EFAULT, which was
> masking the real -EINVAL return.
> 
> Fixes: 7bfe3b8ea6e3 ("Drivers: hv: Introduce mshv_vtl driver")
> Cc: <[email protected]>
> Signed-off-by: Naman Jain <[email protected]>
> ---
>  drivers/hv/mshv_vtl_main.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hv/mshv_vtl_main.c b/drivers/hv/mshv_vtl_main.c
> index 5856975f32e12..255fed3a740c1 100644
> --- a/drivers/hv/mshv_vtl_main.c
> +++ b/drivers/hv/mshv_vtl_main.c
> @@ -405,8 +405,12 @@ static int mshv_vtl_ioctl_add_vtl0_mem(struct mshv_vtl 
> *vtl, void __user *arg)
>       /*
>        * Determine the highest page order that can be used for the given 
> memory range.
>        * This works best when the range is aligned; i.e. both the start and 
> the length.
> +      * Clamp to MAX_FOLIO_ORDER to avoid a WARN in memremap_pages() when 
> the range
> +      * alignment exceeds the maximum supported folio order for this kernel 
> config.
>        */
> -     pgmap->vmemmap_shift = count_trailing_zeros(vtl0_mem.start_pfn | 
> vtl0_mem.last_pfn);
> +     pgmap->vmemmap_shift = min_t(unsigned long,
> +                                  count_trailing_zeros(vtl0_mem.start_pfn | 
> vtl0_mem.last_pfn),
> +                                  MAX_FOLIO_ORDER);

Is it necessary to use min_t() here, or would min() work?  Neither 
count_trailing_zeros()
nor MAX_FOLIO_ORDER is ever negative, so it seems like just min() would work 
with
no potential for doing a bogus comparison or assignment.

The shift is calculated using the originally passed in start_pfn and last_pfn, 
while the
"range" struct in pgmap has an "end" value that is one page less. So is the 
idea to
go ahead and create the mapping with folios of a size that includes that last 
page,
and then just waste the last page of the last folio?

>       dev_dbg(vtl->module_dev,
>               "Add VTL0 memory: start: 0x%llx, end_pfn: 0x%llx, page order: 
> %lu\n",
>               vtl0_mem.start_pfn, vtl0_mem.last_pfn, pgmap->vmemmap_shift);
> @@ -415,7 +419,7 @@ static int mshv_vtl_ioctl_add_vtl0_mem(struct mshv_vtl 
> *vtl, void __user *arg)
>       if (IS_ERR(addr)) {
>               dev_err(vtl->module_dev, "devm_memremap_pages error: %ld\n", 
> PTR_ERR(addr));
>               kfree(pgmap);
> -             return -EFAULT;
> +             return PTR_ERR(addr);
>       }
> 
>       /* Don't free pgmap, since it has to stick around until the memory
> 
> base-commit: 36ece9697e89016181e5ae87510e40fb31d86f2b
> --
> 2.43.0
> 


Reply via email to