On 03/09/2017 05:03 PM, Dan Williams wrote:
> On Thu, Mar 9, 2017 at 3:56 PM, Dave Jiang <[email protected]> wrote:
>> Jeff Moyer reports that:
>> "
>> With a device dax alignment of 4KB or 2MB, I get sigbus when running the
>> attached fio job file for the current kernel (4.11.0-rc1+).  If I
>> specify an alignment of 1GB, it works.
>>
>> I turned on debug output, and saw that it was failing in the huge fault
>> code.
>>
>> [ 4614.138357] dax dax1.0: dax_open
>> [ 4614.154838] dax dax1.0: dax_mmap
>> [ 4614.171898] dax dax1.0: dax_dev_huge_fault: fio: write (0x7f08f0a00000 - 0x7f0ce0800000)
>> [ 4614.211720] dax dax1.0: __dax_dev_pud_fault: phys_to_pgoff(0xffffffffcf600) failed
>> [ 4614.568911] dax dax1.0: dax_release
>>
>> fio config to reproduce:
>> [global]
>> ioengine=dev-dax
>> direct=0
>> filename=/dev/dax0.0
>> bs=2m
>>
>> [write]
>> rw=write
>>
>> [read]
>> stonewall
>> rw=read
>> "
>>
>> It looks like the code does not fall back at all when handling faults. The
>> fault handlers are missing boundary checks against the VMA. Also, we need to
>> fall back if the requested alignment is less than the PMD/PUD entry size.
>>
>> Reported-by: Jeff Moyer <[email protected]>
>> Signed-off-by: Dave Jiang <[email protected]>
>> ---
>>  drivers/dax/dax.c |   22 ++++++++++++++++++++--
>>  1 file changed, 20 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c
>> index 174690a..5c7998e 100644
>> --- a/drivers/dax/dax.c
>> +++ b/drivers/dax/dax.c
>> @@ -480,12 +480,21 @@ static int __dax_dev_pmd_fault(struct dax_dev *dax_dev, struct vm_fault *vmf)
>>                 return VM_FAULT_SIGBUS;
>>         }
>>
>> +       /* fall back if we are outside of the VMA */
>> +       if (pmd_addr < vmf->vma->vm_start)
>> +               return VM_FAULT_FALLBACK;
>> +       if ((pmd_addr + PMD_SIZE) > vmf->vma->vm_end)
>> +               return VM_FAULT_FALLBACK;
> 
> I think these are already handled by check_vma(), and we only want to
> fallback if dax_region->align < PMD_SIZE, otherwise we're violating
> device-dax's guaranteed fault granularity.

I was seeing pud_addr < vmf->vma->vm_start in the pud handler. It
doesn't look like check_vma() checks the boundaries. Also, it wouldn't
know the pud_addr or pmd_addr. Or are you saying it doesn't matter?

> 
>> +
>> +       if (dax_region->align < PMD_SIZE)
>> +               return VM_FAULT_FALLBACK;
>> +
>>         pgoff = linear_page_index(vmf->vma, pmd_addr);
>>         phys = pgoff_to_phys(dax_dev, pgoff, PMD_SIZE);
>>         if (phys == -1) {
>>                 dev_dbg(dev, "%s: pgoff_to_phys(%#lx) failed\n", __func__,
>>                                 pgoff);
>> -               return VM_FAULT_SIGBUS;
>> +               return VM_FAULT_FALLBACK;
> 
> If we get past all the checks above then this is a failure.
> 
>>         }
>>
>>         pfn = phys_to_pfn_t(phys, dax_region->pfn_flags);
>> @@ -519,12 +528,21 @@ static int __dax_dev_pud_fault(struct dax_dev *dax_dev, struct vm_fault *vmf)
>>                 return VM_FAULT_SIGBUS;
>>         }
>>
>> +       /* fall back if we are outside of the VMA */
>> +       if (pud_addr < vmf->vma->vm_start)
>> +               return VM_FAULT_FALLBACK;
>> +       if ((pud_addr + PUD_SIZE) > vmf->vma->vm_end)
>> +               return VM_FAULT_FALLBACK;
>> +
>> +       if (dax_region->align < PUD_SIZE)
>> +               return VM_FAULT_FALLBACK;
>> +
>>         pgoff = linear_page_index(vmf->vma, pud_addr);
>>         phys = pgoff_to_phys(dax_dev, pgoff, PUD_SIZE);
>>         if (phys == -1) {
>>                 dev_dbg(dev, "%s: pgoff_to_phys(%#lx) failed\n", __func__,
>>                                 pgoff);
>> -               return VM_FAULT_SIGBUS;
>> +               return VM_FAULT_FALLBACK;
> 
> Same comments as above. The dax_region->align check should catch all
> the valid fallback cases and check_vma() should clean up everything
> else.
> 
>>         }
>>
>>         pfn = phys_to_pfn_t(phys, dax_region->pfn_flags);
>>
_______________________________________________
Linux-nvdimm mailing list
[email protected]
https://lists.01.org/mailman/listinfo/linux-nvdimm
