On 2025/7/30 14:54, Hugh Dickins wrote:
On Mon, 28 Jul 2025, Baolin Wang wrote:

Commit acd7ccb284b8 ("mm: shmem: add large folio support for tmpfs")
extended the 'huge=' option to allow large folios of any size for tmpfs.
tmpfs now derives a highest-order hint from the size of the write() or
fallocate() request, and then tries each allowable large order.

However, when the i915 driver allocates shmem memory, it provides no hint
about the size of the large folio to allocate, so PMD-sized shmem folios
cannot be allocated, which in turn hurts GPU performance.
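
The effect of a missing size hint can be illustrated with a small userspace
sketch. This is not the mm/shmem.c code: the helper names, the 4 KiB page
size, and the PMD order of 9 are all assumptions chosen for illustration, and
the real kernel logic applies further checks that are omitted here.

```c
#include <stdint.h>

#define SK_PAGE_SHIFT 12u /* assumption: 4 KiB base pages */
#define SK_PMD_ORDER   9u /* assumption: 2 MiB PMD = 512 base pages */

/*
 * Hypothetical helper: highest folio order suggested by a request that
 * ends at byte offset end_byte, for the page at page-cache index `index`.
 * An end of 0 means "no hint" and yields order 0 (a single page).
 */
static unsigned int sketch_order_hint(uint64_t index, uint64_t end_byte)
{
    uint64_t start = index << SK_PAGE_SHIFT;
    uint64_t pages;
    unsigned int order = 0;

    if (end_byte <= start)
        return 0;

    /* floor(log2(pages)), capped at the assumed PMD order */
    pages = (end_byte - start) >> SK_PAGE_SHIFT;
    while (order < SK_PMD_ORDER && (pages >> (order + 1)) != 0)
        order++;
    return order;
}

/*
 * Hypothetical helper: starting from the hinted order, walk down and
 * return the first order whose naturally aligned folio fits entirely
 * below limit_pages. Order 0 is the fallback.
 */
static unsigned int sketch_pick_order(uint64_t index, unsigned int hint,
                                      uint64_t limit_pages)
{
    for (unsigned int order = hint; order > 0; order--) {
        uint64_t nr = 1ull << order;
        uint64_t aligned = index & ~(nr - 1);

        if (aligned + nr <= limit_pages)
            return order;
    }
    return 0;
}
```

Under these assumptions, an end of 0 gives order 0 regardless of how large the
object is, while an end of 2 MiB starting from index 0 gives order 9, which is
the PMD-sized case the i915 driver wants.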

To fix this issue, add the 'end' information for shmem_read_folio_gfp() to help
allocate PMD-sized large folios. Additionally, use the maximum allocation chunk
(via mapping_max_folio_size()) to determine the size of the large folios to
allocate in the i915 driver.

Fixes: acd7ccb284b8 ("mm: shmem: add large folio support for tmpfs")
Reported-by: Patryk Kowalczyk <pat...@kowalczyk.ws>
Reported-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
Tested-by: Patryk Kowalczyk <pat...@kowalczyk.ws>
Signed-off-by: Baolin Wang <baolin.w...@linux.alibaba.com>
---
  drivers/gpu/drm/drm_gem.c                 | 2 +-
  drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 7 ++++++-
  drivers/gpu/drm/ttm/ttm_backup.c          | 2 +-
  include/linux/shmem_fs.h                  | 4 ++--
  mm/shmem.c                                | 7 ++++---
  5 files changed, 14 insertions(+), 8 deletions(-)

I know I said "I shall not object to a temporary workaround to suit the
i915 driver", but really, I have to question this patch.  Why should any
change be required at the drivers/gpu/drm end?

And in drivers/gpu/drm/{i915,v3d} I find they are using huge=within_size:
I had been complaining about the userspace regression in huge=always,
and thought it had been changed to behave like huge=within_size,
but apparently huge=within_size has itself regressed too.

I'm preparing an RFC patch to discuss this.

Please explain why the below is not a better patch for i915 and v3d
(but still a temporary workaround, because the root of the within_size
regression must lie deeper, in the handling of write_end versus i_size).

OK. This looks good to me. Patryk, could you try Hugh's simple patch? Thanks.

---
  mm/shmem.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 3a5a65b1f41a..c67dfc17a819 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -5928,8 +5928,8 @@ struct folio *shmem_read_folio_gfp(struct address_space *mapping,
        struct folio *folio;
        int error;
-       error = shmem_get_folio_gfp(inode, index, 0, &folio, SGP_CACHE,
-                                   gfp, NULL, NULL);
+       error = shmem_get_folio_gfp(inode, index, i_size_read(inode),
+                                   &folio, SGP_CACHE, gfp, NULL, NULL);
        if (error)
                return ERR_PTR(error);
