On Mon, Nov 15, 2021 at 01:11:32PM +0100, Joao Martins wrote:
> On 11/12/21 16:34, Jason Gunthorpe wrote:
> > On Fri, Nov 12, 2021 at 04:08:24PM +0100, Joao Martins wrote:
> >
> >> diff --git a/drivers/dax/device.c b/drivers/dax/device.c
> >> index a65c67ab5ee0..0c2ac97d397d 100644
> >> +++ b/drivers/dax/device.c
> >> @@ -192,6 +192,42 @@ static vm_fault_t __dev_dax_pud_fault(struct dev_dax
> >> *dev_dax,
> >> }
> >> #endif /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
> >>
> >> +static void set_page_mapping(struct vm_fault *vmf, pfn_t pfn,
> >> + unsigned long fault_size,
> >> + struct address_space *f_mapping)
> >> +{
> >> + unsigned long i;
> >> + pgoff_t pgoff;
> >> +
> >> + pgoff = linear_page_index(vmf->vma, ALIGN(vmf->address, fault_size));
> >> +
> >> + for (i = 0; i < fault_size / PAGE_SIZE; i++) {
> >> + struct page *page;
> >> +
> >> + page = pfn_to_page(pfn_t_to_pfn(pfn) + i);
> >> + if (page->mapping)
> >> + continue;
> >> + page->mapping = f_mapping;
> >> + page->index = pgoff + i;
> >> + }
> >> +}
> >> +
> >> +static void set_compound_mapping(struct vm_fault *vmf, pfn_t pfn,
> >> + unsigned long fault_size,
> >> + struct address_space *f_mapping)
> >> +{
> >> + struct page *head;
> >> +
> >> + head = pfn_to_page(pfn_t_to_pfn(pfn));
> >> + head = compound_head(head);
> >> + if (head->mapping)
> >> + return;
> >> +
> >> + head->mapping = f_mapping;
> >> + head->index = linear_page_index(vmf->vma,
> >> + ALIGN(vmf->address, fault_size));
> >> +}
> >
> > Should this stuff be setup before doing vmf_insert_pfn_XX?
> >
>
> Interestingly filesystem-dax does this, but not device-dax.
I think it may be a bug ?
> set_page_mapping/set_compound_mapping() could be moved to before and
> then torn down on @rc != VM_FAULT_NOPAGE (failure). I am not sure
> what's the benefit in this series.. besides the ordering (that you
> hinted below) ?
Well, it should probably be fixed in a precursor patch.
I think the general idea is that page->mapping/index are stable once
the page is published in a PTE?
> > In normal cases the page should be returned in the vmf and populated
> > to the page tables by the core code after all this is done.
>
> So I suppose by call sites examples as 'core code' is either hugetlbfs call to
> __filemap_add_folio() (on hugetlbfs fault handler), shmem_add_to_page_cache()
> or
> anon-equivalent.
I was talking more about the normal page insertion flow which is done
by setting vmf->page and then returning. finish_fault() will install
the PTE
If this is the best way then I would expect some future where maybe
there is a vmf->folio and finish_fault() will install a PUD/PMD/PTE
and we don't call vmf_insert_pfnxx in DAX.
Jason