On Sat, Feb 12, 2022 at 6:08 PM Muchun Song <songmuc...@bytedance.com> wrote:
>
> On Fri, Feb 11, 2022 at 8:37 PM Joao Martins <joao.m.mart...@oracle.com> 
> wrote:
> >
> > On 2/11/22 07:54, Muchun Song wrote:
> > > On Fri, Feb 11, 2022 at 3:34 AM Joao Martins <joao.m.mart...@oracle.com> 
> > > wrote:
> > > [...]
> > >>  pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, 
> > >> int node,
> > >> -                                      struct vmem_altmap *altmap)
> > >> +                                      struct vmem_altmap *altmap,
> > >> +                                      struct page *block)
> > >
> > > Why not use the name of "reuse" instead of "block"?
> > > Seems like "reuse" is more clear.
> > >
> > Good idea, let me rename that to @reuse.
> >
> > >>  {
> > >>         pte_t *pte = pte_offset_kernel(pmd, addr);
> > >>         if (pte_none(*pte)) {
> > >>                 pte_t entry;
> > >>                 void *p;
> > >>
> > >> -               p = vmemmap_alloc_block_buf(PAGE_SIZE, node, altmap);
> > >> -               if (!p)
> > >> -                       return NULL;
> > >> +               if (!block) {
> > >> +                       p = vmemmap_alloc_block_buf(PAGE_SIZE, node, 
> > >> altmap);
> > >> +                       if (!p)
> > >> +                               return NULL;
> > >> +               } else {
> > >> +                       /*
> > >> +                        * When a PTE/PMD entry is freed from the init_mm
> > >> +                        * there's a a free_pages() call to this page 
> > >> allocated
> > >> +                        * above. Thus this get_page() is paired with the
> > >> +                        * put_page_testzero() on the freeing path.
> > >> +                        * This can only called by certain ZONE_DEVICE 
> > >> path,
> > >> +                        * and through vmemmap_populate_compound_pages() 
> > >> when
> > >> +                        * slab is available.
> > >> +                        */
> > >> +                       get_page(block);
> > >> +                       p = page_to_virt(block);
> > >> +               }
> > >>                 entry = pfn_pte(__pa(p) >> PAGE_SHIFT, PAGE_KERNEL);
> > >>                 set_pte_at(&init_mm, addr, pte, entry);
> > >>         }
> > >> @@ -609,7 +624,8 @@ pgd_t * __meminit vmemmap_pgd_populate(unsigned long 
> > >> addr, int node)
> > >>  }
> > >>
> > >>  static int __meminit vmemmap_populate_address(unsigned long addr, int 
> > >> node,
> > >> -                                             struct vmem_altmap *altmap)
> > >> +                                             struct vmem_altmap *altmap,
> > >> +                                             struct page *reuse, struct 
> > >> page **page)
> > >
> > > We can remove the last argument (struct page **page) if we change
> > > the return type to "pte_t *".  More simple, don't you think?
> > >
> >
> > Hmmm, perhaps it is simpler, specially provided the only error code is 
> > ENOMEM.
> >
> > Albeit perhaps what we want is a `struct page *` rather than a pte.
>
> The caller can extract `struct page` from a pte.
>
> [...]
>
> > >> -       if (vmemmap_populate(start, end, nid, altmap))
> > >> +       if (pgmap && pgmap_vmemmap_nr(pgmap) > 1 && !altmap)
> > >
> > > Should we add a judgment like "is_power_of_2(sizeof(struct page))" since
> > > this optimization is only applied when the size of the struct page does 
> > > not
> > > cross page boundaries?
> >
> > Totally miss that -- let me make that adjustment.
> >
> > Can I ask which architectures/conditions this happens?
>
> E.g. arm64 when !CONFIG_MEMCG.

Plus !CONFIG_SLUB even on x86_64.

>
> Thanks.

Reply via email to