Hi,
The MMUPageSize field in /proc/<pid>/smaps is 4 kB for a mapping backed
by a DAX filesystem (xfs, ext4) on a namespace created with 2MiB alignment.
That is understandable, because the vm_operations_struct of both xfs and
ext4 lacks vm_ops->pagesize(), a callback that could retrieve the
alignment value from the driver, as dax_vm_ops does for device-dax.
1GiB aligned /dev/dax2.0 :
7f19c0000000-7f1a40000000 rw-s 00000000 00:06 928839 /dev/dax2.0
Size: 2097152 kB
KernelPageSize: 1048576 kB
MMUPageSize: 1048576 kB
/mnt_nm4/file2GB backed by /dev/pmem4 that is 2MiB aligned :
7fd3d5600000-7fd415600000 rw-p 00000000 103:03 195 /mnt_nm4/file2GB
Size: 1048576 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
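For anyone who wants to check the same fields programmatically, here is a minimal userspace sketch that pulls a "Name: <n> kB" value out of smaps-style text. The helper name smaps_field_kb() is made up for illustration; it only looks at the first matching line in the buffer it is given and assumes the value is reported in kB, as in the output above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return the value in kB of the first
 * "Name:   <n> kB" line in smaps-style text, or -1 if the field
 * is absent or has no numeric value.
 */
long smaps_field_kb(const char *text, const char *name)
{
    const char *line = text;
    size_t name_len;

    assert(text != NULL && name != NULL);
    name_len = strlen(name);

    while (line != NULL && *line != '\0') {
        /*
         * Match only at the start of a line and require the ':' right
         * after the name, so "Size" does not match "KernelPageSize".
         */
        if (strncmp(line, name, name_len) == 0 && line[name_len] == ':') {
            long kb;

            if (sscanf(line + name_len + 1, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        /* Advance to the character after the next newline, if any. */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return -1;
}
```

To use it on a live process, read the smaps entry of interest into a NUL-terminated buffer and call smaps_field_kb(buf, "MMUPageSize"); for the pmem4-backed mapping above it would return 4.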
Things work because xfs_file_mmap() always does:
    if (IS_DAX(file_inode(filp)))
        vma->vm_flags |= VM_HUGEPAGE;
Since xfs knows it supports only two page sizes when S_DAX is set, 4K or
2M, it can always try dax_iomap_pmd_fault() first and, if that fails,
fall back to dax_iomap_pte_fault().
If the /dev/pmem device is created with 4K alignment and a 4K page size
was intended, the extra code executed on every page fault is
unnecessary, right?
7efdce200000-7efe0e200000 rw-p 00000000 103:02 195 /mnt_nm3_4Kalign/file_2G
Size: 1048576 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
..
VmFlags: rd wr mr mw me ac sd mm hg <= VM_HUGEPAGE set in 4K alignment
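The "hg" code on the VmFlags line is how smaps reports the huge page advise flag. A small sketch for testing a VmFlags line for a given two-letter code follows; vmflags_has() is a hypothetical helper name, and it assumes the codes are separated by spaces or tabs as in the output above.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if `code` appears as a whole token on a
 * "VmFlags: ..." line, 0 otherwise.
 */
int vmflags_has(const char *vmflags_line, const char *code)
{
    const char *p;
    size_t code_len;

    assert(vmflags_line != NULL && code != NULL);
    code_len = strlen(code);
    if (code_len == 0)
        return 0;

    /* Skip the "VmFlags:" label when it is present. */
    p = strchr(vmflags_line, ':');
    p = (p != NULL) ? p + 1 : vmflags_line;

    while (*p != '\0') {
        size_t tok_len;

        /* Skip separators, then measure the next token. */
        while (*p == ' ' || *p == '\t')
            p++;
        tok_len = strcspn(p, " \t\n");
        if (tok_len == 0)
            break;
        if (tok_len == code_len && strncmp(p, code, code_len) == 0)
            return 1;
        p += tok_len;
    }
    return 0;
}
```

Calling vmflags_has(line, "hg") on the VmFlags line of the 4K-aligned mapping returns 1, which is the point being made: the hint is set even though only 4K mappings can succeed.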
Would it make sense to add a vm_ops->pagesize() op to xfs, and have
xfs_file_mmap() check pagesize() instead of always setting
VM_HUGEPAGE? The same applies to ext4.
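To make the idea concrete, here is a userspace model of the proposed flow. It is not kernel code and not a patch: every model_* name, the dev_align field, and the flag value are stand-ins invented for illustration, and the rule "alignment of at least 2M means a 2M page size" is my assumption about what such a callback might report.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE   (4UL << 10)   /* 4K base page */
#define MODEL_PMD_SIZE    (2UL << 20)   /* 2M PMD-sized page */
#define MODEL_VM_HUGEPAGE 0x1UL         /* stand-in, not the kernel's value */

struct model_vma;

/* Model of a vm_operations_struct that carries a pagesize() callback. */
struct model_vm_ops {
    unsigned long (*pagesize)(struct model_vma *vma);
};

struct model_vma {
    unsigned long flags;
    unsigned long dev_align;            /* hypothetical: namespace alignment */
    const struct model_vm_ops *ops;
};

/* Assumed policy: report 2M only when the backing device is 2M aligned. */
unsigned long model_dax_pagesize(struct model_vma *vma)
{
    return vma->dev_align >= MODEL_PMD_SIZE ? MODEL_PMD_SIZE
                                            : MODEL_PAGE_SIZE;
}

/* Use the callback when one is provided, else fall back to the base page. */
unsigned long model_vma_pagesize(struct model_vma *vma)
{
    if (vma->ops != NULL && vma->ops->pagesize != NULL)
        return vma->ops->pagesize(vma);
    return MODEL_PAGE_SIZE;
}

/* Model of an mmap handler that sets the huge page hint only if useful. */
void model_file_mmap(struct model_vma *vma, const struct model_vm_ops *ops)
{
    assert(vma != NULL);
    vma->ops = ops;
    if (model_vma_pagesize(vma) >= MODEL_PMD_SIZE)
        vma->flags |= MODEL_VM_HUGEPAGE;
}
```

In this model a mapping on a 2M-aligned device gets the hint and reports 2M, while one on a 4K-aligned device keeps flags clear and reports 4K, so the same callback would feed both the mmap decision and the page size shown in procfs.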
Besides daxfs no longer making assumptions about the page size, doing so
would have the added benefit of exposing the true MMUPageSize to users
through procfs.
thanks!
-jane