[PATCH v5 4/4] mm/nvdimm: Pick the right alignment default when creating dax devices

2019-08-09 Thread Aneesh Kumar K.V
Allow arch to provide the supported alignments and use hugepage alignment only if we support hugepage. Right now we depend on compile time configs whereas this patch switches this to runtime discovery. Architectures like ppc64 can have THP enabled in code, but then can have hugepage size disabled by

[PATCH v5 3/4] mm/nvdimm: Use correct #defines instead of open coding

2019-08-09 Thread Aneesh Kumar K.V
Use PAGE_SIZE instead of SZ_4K and sizeof(struct page) instead of 64. If we have a kernel built with a different struct page size, the previous patch should handle marking the namespace disabled. Signed-off-by: Aneesh Kumar K.V --- drivers/nvdimm/label.c | 2 +- drivers/nvdimm/namespace_de

[PATCH v5 1/4] nvdimm: Consider probe return -EOPNOTSUPP as success

2019-08-09 Thread Aneesh Kumar K.V
This patch adds -EOPNOTSUPP as a return value from the probe callback to indicate we were not able to initialize a namespace due to a pfn superblock feature/version mismatch. We want to consider this a probe success so that we can create a new namespace seed and thereby avoid marking the failed namespace as the se
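The rule described in this snippet can be illustrated with a minimal sketch. The helper name `sketch_probe_counts_as_success()` is hypothetical and does not come from the patch; it only shows the idea that a probe returning 0 or -EOPNOTSUPP is treated as a success, while any other negative errno remains a failure.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not the kernel's actual code): decide whether a
 * probe return value should be treated as success. -EOPNOTSUPP is
 * accepted so that a namespace with a mismatched pfn superblock does
 * not prevent a new namespace seed from being created.
 */
bool sketch_probe_counts_as_success(int rc)
{
	return rc == 0 || rc == -EOPNOTSUPP;
}
```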

[PATCH v5 2/4] mm/nvdimm: Add page size and struct page size to pfn superblock

2019-08-09 Thread Aneesh Kumar K.V
This is needed so that we don't wrongly initialize a namespace which doesn't have enough space reserved for holding struct pages with the current kernel. Signed-off-by: Aneesh Kumar K.V --- drivers/nvdimm/pfn.h | 5 - drivers/nvdimm/pfn_devs.c | 27 ++- 2 files

[PATCH v5 0/4] Mark the namespace disabled on pfn superblock mismatch

2019-08-09 Thread Aneesh Kumar K.V
We add new members to pfn superblock (PAGE_SIZE and struct page size) in this series. This is now checked while initializing the namespace. If we find a mismatch we mark the namespace disabled. This series also handles configs where hugepage support is not enabled by default. This can result in
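As a rough illustration of the check this cover letter describes, the sketch below compares the recorded page size and struct page size against the running kernel's values. The struct and function names are hypothetical, the fields are assumed to be already converted to host byte order, and treating any difference as a mismatch is an assumption rather than a statement of what the series does in every case.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the on-media pfn superblock. */
struct sketch_pfn_sb {
	uint32_t page_size;        /* page size of the kernel that created the namespace */
	uint16_t page_struct_size; /* sizeof(struct page) in that kernel */
};

/*
 * Return 0 when the recorded values match the running kernel, or
 * -EOPNOTSUPP on a mismatch so the caller can mark the namespace
 * disabled instead of initializing it wrongly.
 */
int sketch_pfn_sb_check(const struct sketch_pfn_sb *sb,
			uint32_t cur_page_size,
			uint16_t cur_page_struct_size)
{
	if (sb->page_size != cur_page_size)
		return -EOPNOTSUPP;
	if (sb->page_struct_size != cur_page_struct_size)
		return -EOPNOTSUPP;
	return 0;
}
```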

[PATCH v5] mm/nvdimm: Fix endian conversion issues 

2019-08-09 Thread Aneesh Kumar K.V
nd_label->dpa issue was observed when trying to enable the namespace created with little-endian kernel on a big-endian kernel. That made me run `sparse` on the rest of the code and other changes are the result of that. Fixes: d9b83c756953 ("libnvdimm, btt: rework error clearing") Fixes: 9dedc73a46
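The class of bug mentioned here comes from reading a little-endian on-media field without converting it to host byte order. In the kernel this conversion is done with helpers such as `le64_to_cpu()`; the portable sketch below, with the hypothetical name `sketch_le64_to_host()`, shows the same idea in plain C and gives the same result on little-endian and big-endian hosts.

```c
#include <stdint.h>

/*
 * Decode a 64-bit value stored little-endian (least significant byte
 * first) into host byte order. Building the value byte by byte avoids
 * any dependence on the endianness of the CPU running the code.
 */
uint64_t sketch_le64_to_host(const unsigned char bytes[8])
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | bytes[i];
	return v;
}
```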

[PATCH v5] mm/nvdimm: Use correct alignment when looking at first pfn from a region

2019-08-09 Thread Aneesh Kumar K.V
vmem_altmap_offset() adjusts the section aligned base_pfn offset. So we need to make sure we account for the same when computing base_pfn. i.e., for the altmap_valid case, our pfn_first should be: pfn_first = altmap->base_pfn + vmem_altmap_offset(altmap); Signed-off-by: Aneesh Kumar K.V --- Changes fr
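The formula quoted in this snippet can be worked through with a small stand-alone sketch. The struct below is a hypothetical, trimmed-down mirror of the altmap, and the offset is assumed to be the sum of the reserved pages and the pages set aside for the memmap; only the final addition is taken from the snippet itself.

```c
/* Hypothetical, trimmed-down mirror of the altmap fields used here. */
struct sketch_altmap {
	unsigned long base_pfn; /* section aligned base pfn */
	unsigned long reserve;  /* pages reserved at the start */
	unsigned long free;     /* pages set aside for the memmap */
};

/* Assumed behaviour of vmem_altmap_offset(): pages to skip past base_pfn. */
unsigned long sketch_altmap_offset(const struct sketch_altmap *altmap)
{
	return altmap->reserve + altmap->free;
}

/* pfn_first = altmap->base_pfn + vmem_altmap_offset(altmap) */
unsigned long sketch_pfn_first(const struct sketch_altmap *altmap)
{
	return altmap->base_pfn + sketch_altmap_offset(altmap);
}
```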

[PATCH] nvdimm: Initialize bad block for volatile namespaces

2019-08-09 Thread Aneesh Kumar K.V
We do check for a bad block during namespace init and that uses the region bad block list. We need to initialize the bad block list for volatile regions for this to work. We also observe a lockdep warning as below because the lock is not initialized correctly since we skip bad block init for volatile regions

Re: [PATCH] mm/memremap: Fix reuse of pgmap instances with internal references

2019-08-09 Thread Christoph Hellwig
Looks good: Reviewed-by: Christoph Hellwig ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm

[RFC PATCH v2 02/19] fs/locks: Add Exclusive flag to user Layout lease

2019-08-09 Thread ira . weiny
From: Ira Weiny Add an exclusive lease flag which indicates that the layout mechanism cannot be broken. Exclusive layout leases allow the file system to know that pages may be GUP pinned and that attempts to change the layout, i.e. truncate, should be failed. A process which attempts to break it'

[RFC PATCH v2 04/19] mm/gup: Ensure F_LAYOUT lease is held prior to GUP'ing pages

2019-08-09 Thread ira . weiny
From: Ira Weiny On FS DAX files, users must inform the file system that they intend to take long term GUP pins on the file pages. Failure to do so should result in an error. Ensure that a F_LAYOUT lease exists at the time the GUP call is made. If not, return EPERM. Signed-off-by: Ira Weiny --- Chan
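The policy stated in this snippet reduces to a simple predicate, sketched below. The function name and its boolean parameters are hypothetical and stand in for the state the real GUP code would inspect; only the rule itself (a long term pin on an FS DAX page without a layout lease yields EPERM) comes from the snippet.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical predicate: a long term pin on an FS DAX page is only
 * allowed when the caller holds a layout lease on the file. Every other
 * combination is left alone by this check.
 */
int sketch_check_longterm_pin(bool fs_dax, bool longterm, bool has_layout_lease)
{
	if (fs_dax && longterm && !has_layout_lease)
		return -EPERM;
	return 0;
}
```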

[RFC PATCH v2 12/19] mm/gup: Prep put_user_pages() to take an vaddr_pin struct

2019-08-09 Thread ira . weiny
From: Ira Weiny Once callers start to use vaddr_pin the put_user_pages calls will need to have access to this data coming in. Prep put_user_pages() for this data. Signed-off-by: Ira Weiny --- include/linux/mm.h | 20 +--- mm/gup.c | 122 -

[RFC PATCH v2 11/19] mm/gup: Pass follow_page_context further down the call stack

2019-08-09 Thread ira . weiny
From: Ira Weiny In preparation for passing more information (vaddr_pin) into follow_page_pte(), follow_devmap_pud(), and follow_devmap_pmd(). Signed-off-by: Ira Weiny --- include/linux/huge_mm.h | 17 - mm/gup.c| 31 +++ mm/huge_memor

[RFC PATCH v2 09/19] mm/gup: Introduce vaddr_pin structure

2019-08-09 Thread ira . weiny
From: Ira Weiny Some subsystems need to pass owning file information to GUP calls to allow for GUP to associate the "owning file" to any files being pinned within the GUP call. Introduce an object to specify this information and pass it down through some of the GUP call stack. Signed-off-by: Ir

[RFC PATCH v2 05/19] fs/ext4: Teach ext4 to break layout leases

2019-08-09 Thread ira . weiny
From: Ira Weiny ext4 must attempt to break a layout lease if it is held to know if the layout can be modified. Split out the logic to determine if a mapping is DAX, export it, and then break layout leases if a mapping is DAX. Signed-off-by: Ira Weiny --- Changes from RFC v1: Based on

[RFC PATCH v2 03/19] mm/gup: Pass flags down to __gup_device_huge* calls

2019-08-09 Thread ira . weiny
From: Ira Weiny In order to support checking for a layout lease on a FS DAX inode these calls need to know if FOLL_LONGTERM was specified. Signed-off-by: Ira Weiny --- mm/gup.c | 26 +- 1 file changed, 17 insertions(+), 9 deletions(-) diff --git a/mm/gup.c b/mm/gup.c i

[RFC PATCH v2 06/19] fs/ext4: Teach dax_layout_busy_page() to operate on a sub-range

2019-08-09 Thread ira . weiny
From: Ira Weiny Callers of dax_layout_busy_page() are only rarely operating on the entire file of concern. Teach dax_layout_busy_page() to operate on a sub-range of the address_space provided. Specifying 0 - ULONG_MAX however, will continue to operate on the "entire file" and XFS is split out t

[RFC PATCH v2 10/19] mm/gup: Pass a NULL vaddr_pin through GUP fast

2019-08-09 Thread ira . weiny
From: Ira Weiny Internally GUP fast needs to know that fast users will not support file pins. Pass NULL for vaddr_pin through the fast call stack so that the pin code can return an error if it encounters file backed memory within the address range. Signed-off-by: Ira Weiny --- mm/gup.c | 65 +

[RFC PATCH v2 07/19] fs/xfs: Teach xfs to use new dax_layout_busy_page()

2019-08-09 Thread ira . weiny
From: Ira Weiny dax_layout_busy_page() can now operate on a sub-range of the address_space provided. Have xfs specify the sub range to dax_layout_busy_page() Signed-off-by: Ira Weiny --- fs/xfs/xfs_file.c | 19 +-- fs/xfs/xfs_inode.h | 5 +++-- fs/xfs/xfs_ioctl.c | 15 ++

[RFC PATCH v2 08/19] fs/xfs: Fail truncate if page lease can't be broken

2019-08-09 Thread ira . weiny
From: Ira Weiny If pages are under a lease, fail the truncate operation. We change the order of lease breaks to directly fail the operation if the lease exists. Select EXPORT_BLOCK_OPS for FS_DAX to ensure that xfs_break_lease_layouts() is defined for FS_DAX as well as pNFS. Signed-off-by: Ira

[RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-)

2019-08-09 Thread ira . weiny
From: Ira Weiny Pre-requisites: Based on mmotm tree. Based on the feedback from LSFmm, the LWN article, the RFC series since then, and a ton of scenarios I've worked in my mind and/or tested...[1] Solution summary: The real issue is that there is no use ca

[RFC PATCH v2 13/19] {mm,file}: Add file_pins objects

2019-08-09 Thread ira . weiny
From: Ira Weiny User page pins (aka GUP) need to track file information of files being pinned by those calls. Depending on the needs of the caller this information is stored in 1 of 2 ways. 1) Some subsystems like RDMA associate GUP pins with file descriptors which can be passed around to o

[RFC PATCH v2 01/19] fs/locks: Export F_LAYOUT lease to user space

2019-08-09 Thread ira . weiny
From: Ira Weiny In order to support an opt-in policy for users to allow long term pins of FS DAX pages we need to export the LAYOUT lease to user space. This is the first of 2 new lease flags which must be used to allow a long term pin to be made on a file. After the complete series: 0) Regist

[RFC PATCH v2 19/19] mm/gup: Remove FOLL_LONGTERM DAX exclusion

2019-08-09 Thread ira . weiny
From: Ira Weiny Now that there is a mechanism for users to safely take LONGTERM pins on FS DAX pages, remove the FS DAX exclusion from the GUP implementation. Special processing remains in effect for CONFIG_CMA NOTE: Some callers still fail because the vaddr_pin information has not been passed

[RFC PATCH v2 16/19] RDMA/uverbs: Add back pointer to system file object

2019-08-09 Thread ira . weiny
From: Ira Weiny In order for MRs to be tracked against the open verbs context the ufile needs to have a pointer to hand to the GUP code. No references need to be taken as this should be valid for the lifetime of the context. Signed-off-by: Ira Weiny --- drivers/infiniband/core/uverbs.h |

[RFC PATCH v2 18/19] {mm,procfs}: Add display file_pins proc

2019-08-09 Thread ira . weiny
From: Ira Weiny Now that we have the file pins information stored, add a new procfs entry to display them to the user. NOTE: output will be dependent on where the file pin is tied to. Some processes may have the pin associated with a file descriptor in which case that file is reported as well. O

[RFC PATCH v2 14/19] fs/locks: Associate file pins while performing GUP

2019-08-09 Thread ira . weiny
From: Ira Weiny When a file backed area is being pinned, add the appropriate file pin information to the appropriate file or mm owner. This information can then be used by admins to determine who is causing a failure to change the layout of a file. Signed-off-by: Ira Weiny --- fs/locks.c

[RFC PATCH v2 15/19] mm/gup: Introduce vaddr_pin_pages()

2019-08-09 Thread ira . weiny
From: Ira Weiny The addition of FOLL_LONGTERM has taken on additional meaning for CMA pages. In addition subsystems such as RDMA require new information to be passed to the GUP interface to track file owning information. As such a simple FOLL_LONGTERM flag is no longer sufficient for these user

[RFC PATCH v2 17/19] RDMA/umem: Convert to vaddr_[pin|unpin]* operations.

2019-08-09 Thread ira . weiny
From: Ira Weiny In order to properly track the pinning information we need to keep a vaddr_pin object around. Store that within the umem object directly. The vaddr_pin object allows the GUP code to associate any files it pins with the RDMA file descriptor associated with this GUP. Furthermore,

Re: [RFC PATCH v2 08/19] fs/xfs: Fail truncate if page lease can't be broken

2019-08-09 Thread Dave Chinner
On Fri, Aug 09, 2019 at 03:58:22PM -0700, ira.we...@intel.com wrote: > From: Ira Weiny > > If pages are under a lease fail the truncate operation. We change the order > of > lease breaks to directly fail the operation if the lease exists. > > Select EXPORT_BLOCK_OPS for FS_DAX to ensure that x

Re: [RFC PATCH v2 07/19] fs/xfs: Teach xfs to use new dax_layout_busy_page()

2019-08-09 Thread Dave Chinner
On Fri, Aug 09, 2019 at 03:58:21PM -0700, ira.we...@intel.com wrote: > From: Ira Weiny > > dax_layout_busy_page() can now operate on a sub-range of the > address_space provided. > > Have xfs specify the sub range to dax_layout_busy_page() Hmmm. I've got patches that change all these XFS interfa

Re: [RFC PATCH v2 01/19] fs/locks: Export F_LAYOUT lease to user space

2019-08-09 Thread Dave Chinner
On Fri, Aug 09, 2019 at 03:58:15PM -0700, ira.we...@intel.com wrote: > From: Ira Weiny > > In order to support an opt-in policy for users to allow long term pins > of FS DAX pages we need to export the LAYOUT lease to user space. > > This is the first of 2 new lease flags which must be used to a

Re: [RFC PATCH v2 09/19] mm/gup: Introduce vaddr_pin structure

2019-08-09 Thread John Hubbard
On 8/9/19 3:58 PM, ira.we...@intel.com wrote: > From: Ira Weiny > > Some subsystems need to pass owning file information to GUP calls to > allow for GUP to associate the "owning file" to any files being pinned > within the GUP call. > > Introduce an object to specify this information and pass it

Re: [RFC PATCH v2 10/19] mm/gup: Pass a NULL vaddr_pin through GUP fast

2019-08-09 Thread John Hubbard
On 8/9/19 3:58 PM, ira.we...@intel.com wrote: > From: Ira Weiny > > Internally GUP fast needs to know that fast users will not support file > pins. Pass NULL for vaddr_pin through the fast call stack so that the > pin code can return an error if it encounters file backed memory within > the addr

Re: [RFC PATCH v2 15/19] mm/gup: Introduce vaddr_pin_pages()

2019-08-09 Thread John Hubbard
On 8/9/19 3:58 PM, ira.we...@intel.com wrote: > From: Ira Weiny > > The addition of FOLL_LONGTERM has taken on additional meaning for CMA > pages. > > In addition subsystems such as RDMA require new information to be passed > to the GUP interface to track file owning information. As such a simp

Re: [RFC PATCH v2 11/19] mm/gup: Pass follow_page_context further down the call stack

2019-08-09 Thread John Hubbard
On 8/9/19 3:58 PM, ira.we...@intel.com wrote: > From: Ira Weiny > > In preparation for passing more information (vaddr_pin) into > follow_page_pte(), follow_devmap_pud(), and follow_devmap_pmd(). > > Signed-off-by: Ira Weiny > --- > include/linux/huge_mm.h | 17 - > mm/gup.c

Re: [RFC PATCH v2 12/19] mm/gup: Prep put_user_pages() to take an vaddr_pin struct

2019-08-09 Thread John Hubbard
On 8/9/19 3:58 PM, ira.we...@intel.com wrote: > From: Ira Weiny > > Once callers start to use vaddr_pin the put_user_pages calls will need > to have access to this data coming in. Prep put_user_pages() for this > data. > > Signed-off-by: Ira Weiny > --- > include/linux/mm.h | 20 +--- >

Re: [PATCH v5 2/4] mm/nvdimm: Add page size and struct page size to pfn superblock

2019-08-09 Thread Aneesh Kumar K.V
"Aneesh Kumar K.V" writes: case PFN_MODE_PMEM: > @@ -475,6 +484,20 @@ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char > *sig) > align = 1UL << ilog2(offset); > mode = le32_to_cpu(pfn_sb->mode); > > + if (le32_to_cpu(pfn_sb->page_size) != PAGE_SIZE) { > +