Re: [RFC 1/7] mm: introduce MADV_COOL

Minchan Kim Tue, 28 May 2019 05:39:45 -0700

On Tue, May 28, 2019 at 08:15:23PM +0800, Hillf Danton wrote:
< snip >
> > > > +
> > > > +                       get_page(page);
> > > > +                       spin_unlock(ptl);
> > > > +                       lock_page(page);
> > > > +                       err = split_huge_page(page);
> > > > +                       unlock_page(page);
> > > > +                       put_page(page);
> > > > +                       if (!err)
> > > > +                               goto regular_page;
> > > > +                       return 0;
> > > > +               }
> > > > +
> > > > +               pmdp_test_and_clear_young(vma, addr, pmd);
> > > > +               deactivate_page(page);
> > > > +huge_unlock:
> > > > +               spin_unlock(ptl);
> > > > +               return 0;
> > > > +       }
> > > > +
> > > > +       if (pmd_trans_unstable(pmd))
> > > > +               return 0;
> > > > +
> > > > +regular_page:
> > >
> > > Take a look at pending signal?
> >
> > Do you have a reason to check for a pending signal here? I'd like to
> > understand your requirement so we can find the best place to handle it.
> >
> IMO we could bail out without doing any work if a fatal signal is pending.
> And, if it makes sense to you, we can do that before the hard work.


Makes sense, especially for swapping out.
I will add it in the next revision.
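
A minimal userspace sketch of the "bail out before the hard work" idea,
not the code of the next revision. In the kernel the check would be
fatal_signal_pending(current); here fatal_pending_model and
cool_range_model are hypothetical stand-ins so the control flow can be
compiled and tested outside the kernel:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's fatal_signal_pending(current). */
bool fatal_pending_model = false;

/*
 * Hypothetical model of a per-PMD walk callback: check for a fatal
 * signal once, before touching any entry, and report -EINTR so the
 * caller can stop the whole walk.
 * Returns the number of entries processed, or -EINTR.
 */
long cool_range_model(int *young, size_t nr)
{
	size_t i;
	long done = 0;

	if (fatal_pending_model)
		return -EINTR;		/* bail out before the hard work */

	for (i = 0; i < nr; i++) {
		if (!young[i])
			continue;	/* nothing to clear for this entry */
		young[i] = 0;		/* models clearing the young bit */
		done++;
	}
	return done;
}
```

The check sits ahead of the loop so that nothing is modified when the
task is already dying; whether the real callback should return an error
or 0 in that case is left open here.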

> 
> > >
> > > > +       orig_pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> > > > +       for (pte = orig_pte; addr < end; pte++, addr += PAGE_SIZE) {
> > >
> > > s/end/next/ ?
> >
> > Why do you think it should be next?
> >
> Simply based on the following line; I am afraid that next != end:
>       > > > + next = pmd_addr_end(addr, end);

pmd_addr_end() returns the smaller of the next PMD boundary and end, so
end is the more appropriate bound here.
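
For reference, a userspace sketch of the clamping pmd_addr_end() does.
It assumes a PMD_SHIFT of 21 (2 MiB per PMD, as on x86-64 with 4 KiB
pages), and pmd_addr_end_model is a hypothetical name, not the kernel
macro itself:

```c
#include <stdio.h>

/* Assumption: one PMD entry covers 2 MiB (x86-64, 4 KiB pages). */
#define PMD_SHIFT 21
#define PMD_SIZE  (1UL << PMD_SHIFT)
#define PMD_MASK  (~(PMD_SIZE - 1))

/*
 * Userspace model of pmd_addr_end(): the next PMD boundary after addr,
 * clamped to end. Comparing "boundary - 1" with "end - 1" keeps the
 * result correct when the boundary wraps to 0 at the top of the
 * address space.
 */
unsigned long pmd_addr_end_model(unsigned long addr, unsigned long end)
{
	unsigned long boundary = (addr + PMD_SIZE) & PMD_MASK;

	return (boundary - 1 < end - 1) ? boundary : end;
}
```

With these assumptions, a range that ends inside the current PMD yields
end itself, and a range that extends past it yields the PMD boundary, so
the result is never larger than end.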

> 
> > > > +               ptent = *pte;
> > > > +
> > > > +               if (pte_none(ptent))
> > > > +                       continue;
> > > > +
> > > > +               if (!pte_present(ptent))
> > > > +                       continue;
> > > > +
> > > > +               page = vm_normal_page(vma, addr, ptent);
> > > > +               if (!page)
> > > > +                       continue;
> > > > +
> > > > +               if (page_mapcount(page) > 1)
> > > > +                       continue;
> > > > +
> > > > +               ptep_test_and_clear_young(vma, addr, pte);
> > > > +               deactivate_page(page);
> > > > +       }
> > > > +
> > > > +       pte_unmap_unlock(orig_pte, ptl);
> > > > +       cond_resched();
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static long madvise_cool(struct vm_area_struct *vma,
> > > > +                       unsigned long start_addr, unsigned long end_addr)
> > > > +{
> > > > +       struct mm_struct *mm = vma->vm_mm;
> > > > +       struct mmu_gather tlb;
> > > > +
> > > > +       if (vma->vm_flags & (VM_LOCKED|VM_HUGETLB|VM_PFNMAP))
> > > > +               return -EINVAL;
> > >
> > > No service in case of VM_IO?
> >
> > I don't know whether a VM_IO vma would have regular LRU pages; I just
> > followed the usual convention for DONTNEED and FREE.
> > Do you have anything in mind?
> >
> I want to skip a mapping set up for DMA.

Do you mean that pages in a VM_IO vma are not on the LRU list?
Or that pages in the vma are always pinned, so it is not worth
deactivating or reclaiming them?
