David Hildenbrand wrote:
[..]
> > Maybe there is something missing in ZONE_DEVICE freeing/splitting code
> > of large folios, where we should do the same, to make sure that all
> > page->memcg_data is actually 0?
> > 
> > I assume so. Let me dig.
> > 
> 
> I suspect this should do the trick:
> 
> diff --git a/fs/dax.c b/fs/dax.c
> index af5045b0f476e..8dffffef70d21 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -397,6 +397,10 @@ static inline unsigned long dax_folio_put(struct folio *folio)
>          if (!order)
>                  return 0;
>   
> +#ifdef NR_PAGES_IN_LARGE_FOLIO
> +       folio->_nr_pages = 0;
> +#endif

I assume this new fs/dax.c instance of this pattern motivates a
folio_set_nr_pages() helper to hide the ifdef?
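
For illustration, a minimal sketch of what such a helper might look
like. This is only an assumption about its shape: the name
folio_set_nr_pages() is the one floated above, and the struct folio
below is a userspace mock with just the field in question, not the
kernel definition.

```c
#include <stddef.h>

/* Assumed defined for this sketch; in the kernel this depends on config. */
#define NR_PAGES_IN_LARGE_FOLIO

/* Mock of struct folio: only the field this sketch touches (hypothetical layout). */
struct folio {
#ifdef NR_PAGES_IN_LARGE_FOLIO
	unsigned int _nr_pages;
#endif
	unsigned long flags;
};

/*
 * Hypothetical helper: stores nr_pages when the field exists and
 * compiles to a no-op otherwise, so callers never carry the ifdef.
 */
static inline void folio_set_nr_pages(struct folio *folio,
				      unsigned int nr_pages)
{
#ifdef NR_PAGES_IN_LARGE_FOLIO
	folio->_nr_pages = nr_pages;
#else
	(void)folio;
	(void)nr_pages;
#endif
}
```

With something like that, the hunk above would reduce to a single
folio_set_nr_pages(folio, 0) call.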

It is concerning that fs/dax.c misses common expectations like this,
but I think that is the nature of bypassing the page allocator to get
folios.

However, this raises the question of whether fixing it here is
sufficient for other ZONE_DEVICE folio cases. I did not immediately find
a place where other ZONE_DEVICE users might be calling
prep_compound_page() and leaving stale tail page metadata lying around.
Alistair?

> +
>          for (i = 0; i < (1UL << order); i++) {
>                  struct dev_pagemap *pgmap = page_pgmap(&folio->page);
>                  struct page *page = folio_page(folio, i);
> 
> 
> Alternatively (in the style of fa23a338de93aa03eb0b6146a0440f5762309f85)
> 
> diff --git a/fs/dax.c b/fs/dax.c
> index af5045b0f476e..a1e354b748522 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -412,6 +412,9 @@ static inline unsigned long dax_folio_put(struct folio *folio)
>                   */
>                  new_folio->pgmap = pgmap;
>                  new_folio->share = 0;
> +#ifdef CONFIG_MEMCG
> +               new_folio->memcg_data = 0;
> +#endif

This looks correct, but I like the first option because I would never
expect a dax-page to need to worry about being part of a memcg.

>                  WARN_ON_ONCE(folio_ref_count(new_folio));
>          }
>   
> 
> 
> -- 
> Cheers,
> 
> David / dhildenb

Thanks for the help, David!
