On 11/15/21 17:49, Jason Gunthorpe wrote:
> On Mon, Nov 15, 2021 at 01:11:32PM +0100, Joao Martins wrote:
>> On 11/12/21 16:34, Jason Gunthorpe wrote:
>>> On Fri, Nov 12, 2021 at 04:08:24PM +0100, Joao Martins wrote:
>>>> diff --git a/drivers/dax/device.c b/drivers/dax/device.c
>>>> index a65c67ab5ee0..0c2ac97d397d 100644
>>>> +++ b/drivers/dax/device.c
>>>> @@ -192,6 +192,42 @@ static vm_fault_t __dev_dax_pud_fault(struct dev_dax
>>>> *dev_dax,
>>>> }
>>>> #endif /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
>>>>
>>>> +static void set_page_mapping(struct vm_fault *vmf, pfn_t pfn,
>>>> + unsigned long fault_size,
>>>> + struct address_space *f_mapping)
>>>> +{
>>>> + unsigned long i;
>>>> + pgoff_t pgoff;
>>>> +
>>>> + pgoff = linear_page_index(vmf->vma, ALIGN(vmf->address, fault_size));
>>>> +
>>>> + for (i = 0; i < fault_size / PAGE_SIZE; i++) {
>>>> + struct page *page;
>>>> +
>>>> + page = pfn_to_page(pfn_t_to_pfn(pfn) + i);
>>>> + if (page->mapping)
>>>> + continue;
>>>> + page->mapping = f_mapping;
>>>> + page->index = pgoff + i;
>>>> + }
>>>> +}
>>>> +
>>>> +static void set_compound_mapping(struct vm_fault *vmf, pfn_t pfn,
>>>> + unsigned long fault_size,
>>>> + struct address_space *f_mapping)
>>>> +{
>>>> + struct page *head;
>>>> +
>>>> + head = pfn_to_page(pfn_t_to_pfn(pfn));
>>>> + head = compound_head(head);
>>>> + if (head->mapping)
>>>> + return;
>>>> +
>>>> + head->mapping = f_mapping;
>>>> + head->index = linear_page_index(vmf->vma,
>>>> + ALIGN(vmf->address, fault_size));
>>>> +}
>>>
>>> Should this stuff be setup before doing vmf_insert_pfn_XX?
>>>
>>
>> Interestingly filesystem-dax does this, but not device-dax.
>
> I think it may be a bug ?
>
Possibly.
Dan, any thoughts (see also below) ? You probably hold all that
history since its inception on commit 2232c6382a4 ("device-dax: Enable
page_mapping()")
and commit 35de299547d1 ("device-dax: Set page->index").
>> set_page_mapping/set_compound_mapping() could be moved to before and
>> then torn down on @rc != VM_FAULT_NOPAGE (failure). I am not sure
>> what's the benefit in this series.. besides the ordering (that you
>> hinted below) ?
>
> Well, it should probably be fixed in a precursor patch.
>
Yeap -- I would move page_mapping prior to introduce set_compound_mapping().
Now I am thinking again and with that logic it makes more sense to ammend inside
set_page_mapping() -- should have less nested around in the fault handler.
> I think the general idea is that page->mapping/index are stable once
> the page is published in a PTE?
>
/me nods
>>> In normal cases the page should be returned in the vmf and populated
>>> to the page tables by the core code after all this is done.
>>
>> So I suppose by call sites examples as 'core code' is either hugetlbfs call
>> to
>> __filemap_add_folio() (on hugetlbfs fault handler),
>> shmem_add_to_page_cache() or
>> anon-equivalent.
>
> I was talking more about the normal page insertion flow which is done
> by setting vmf->page and then returning. finish_fault() will install
> the PTE
>
I misunderstood you earlier -- I thought you were suggesting me to look at
how mapping/index is set (in the context of the flow you just described)
Joao