Re: [PATCH v7 09/12] xfs: wire up ->lease_direct()

2017-10-08 Thread Dave Chinner
On Fri, Oct 06, 2017 at 03:36:06PM -0700, Dan Williams wrote:
> A 'lease_direct' lease requires that the vma have a valid MAP_DIRECT
> mapping established. For xfs we establish a new lease and then check if
> the MAP_DIRECT mapping has been broken. We want to be sure that the
> process will receive notification that the MAP_DIRECT mapping is being
> torn down so it knows why other code paths are throwing failures.
> 
> For example in the RDMA/ibverbs case we want ibv_reg_mr() to fail if the
> MAP_DIRECT mapping is invalid or in the process of being invalidated.
> 
> Cc: Jan Kara 
> Cc: Jeff Moyer 
> Cc: Christoph Hellwig 
> Cc: Dave Chinner 
> Cc: "Darrick J. Wong" 
> Cc: Ross Zwisler 
> Cc: Jeff Layton 
> Cc: "J. Bruce Fields" 
> Signed-off-by: Dan Williams 
> ---
>  fs/xfs/xfs_file.c |   28 
>  1 file changed, 28 insertions(+)
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index e35518600e28..823b65f17429 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1166,6 +1166,33 @@ xfs_filemap_direct_close(
>   put_map_direct_vma(vma->vm_private_data);
>  }
>  
> +static struct lease_direct *
> +xfs_filemap_direct_lease(
> + struct vm_area_struct   *vma,
> + void(*break_fn)(void *),
> + void*owner)
> +{
> + struct lease_direct *ld;
> +
> + ld = map_direct_lease(vma, break_fn, owner);
> +
> + if (IS_ERR(ld))
> + return ld;
> +
> + /*
> +  * We now have an established lease while the base MAP_DIRECT
> +  * lease was not broken. So, we know that the "lease holder" will
> +  * receive a SIGIO notification when the lease is broken and
> +  * take any necessary cleanup actions.
> +  */
> + if (!is_map_direct_broken(vma->vm_private_data))
> + return ld;
> +
> + map_direct_lease_destroy(ld);
> +
> + return ERR_PTR(-ENXIO);

What's any of this got to do with XFS? Shouldn't it be in generic
code, and called generic_filemap_direct_lease()?

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH v7 06/12] xfs: wire up MAP_DIRECT

2017-10-08 Thread Dave Chinner
On Fri, Oct 06, 2017 at 03:35:49PM -0700, Dan Williams wrote:
> MAP_DIRECT is an mmap(2) flag with the following semantics:
> 
>   MAP_DIRECT
>   When specified with MAP_SHARED_VALIDATE, sets up a file lease with the
>   same lifetime as the mapping. Unlike a typical F_RDLCK lease this lease
>   is broken when a "lease breaker" attempts to write(2), change the block
>   map (fallocate), or change the size of the file. Otherwise the mechanism
>   of a lease break is identical to the typical lease break case where the
>   lease needs to be removed (munmap) within the number of seconds
>   specified by /proc/sys/fs/lease-break-time. If the lease holder fails to
>   remove the lease in time the kernel will invalidate the mapping and
>   force all future accesses to the mapping to trigger SIGBUS.
> 
>   In addition to lease break timeouts causing faults in the mapping to
>   result in SIGBUS, other states of the file will trigger SIGBUS at fault
>   time:
> 
>   * The file is not DAX capable
>   * The file has reflinked (copy-on-write) blocks
>   * The fault would trigger the filesystem to allocate blocks
>   * The fault would trigger the filesystem to perform extent conversion
> 
>   In other words, MAP_DIRECT expects and enforces a fully allocated file
>   where faults can be satisfied without modifying block map metadata.
> 
>   An unprivileged process may establish a MAP_DIRECT mapping on a file
>   whose UID (owner) matches the filesystem UID of the  process. A process
>   with the CAP_LEASE capability may establish a MAP_DIRECT mapping on
>   arbitrary files
> 
>   ERRORS
>   EACCES Beyond the typical mmap(2) conditions that trigger EACCES
>   MAP_DIRECT also requires the permission to set a file lease.
> 
>   EOPNOTSUPP The filesystem explicitly does not support the flag
> 
>   SIGBUS Attempted to write a MAP_DIRECT mapping at a file offset that
>  might require block-map updates, or the lease timed out and the
>  kernel invalidated the mapping.
> 
> Cc: Jan Kara 
> Cc: Arnd Bergmann 
> Cc: Jeff Moyer 
> Cc: Christoph Hellwig 
> Cc: Dave Chinner 
> Cc: Alexander Viro 
> Cc: "Darrick J. Wong" 
> Cc: Ross Zwisler 
> Cc: Jeff Layton 
> Cc: "J. Bruce Fields" 
> Signed-off-by: Dan Williams 
> ---
>  fs/xfs/Kconfig  |2 -
>  fs/xfs/xfs_file.c   |  102 
> +++
>  include/linux/mman.h|3 +
>  include/uapi/asm-generic/mman.h |1 
>  4 files changed, 106 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/Kconfig b/fs/xfs/Kconfig
> index f62fc6629abb..f8765653a438 100644
> --- a/fs/xfs/Kconfig
> +++ b/fs/xfs/Kconfig
> @@ -112,4 +112,4 @@ config XFS_ASSERT_FATAL
>  
>  config XFS_LAYOUT
>   def_bool y
> - depends on EXPORTFS_BLOCK_OPS
> + depends on EXPORTFS_BLOCK_OPS || FS_DAX
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index ebdd0bd2b261..e35518600e28 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -40,12 +40,22 @@
>  #include "xfs_iomap.h"
>  #include "xfs_reflink.h"
>  
> +#include 
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  static const struct vm_operations_struct xfs_file_vm_ops;
> +static const struct vm_operations_struct xfs_file_vm_direct_ops;
> +
> +static inline bool
> +is_xfs_map_direct(
> + struct vm_area_struct *vma)
> +{
> + return vma->vm_ops == _file_vm_direct_ops;
> +}

Namespacing (xfs_vma_is_direct) and whitespace damage.

>  
>  /*
>   * Clear the specified ranges to zero through either the pagecache or DAX.
> @@ -1008,6 +1018,26 @@ xfs_file_llseek(
>   return vfs_setpos(file, offset, inode->i_sb->s_maxbytes);
>  }
>  
> +static int
> +xfs_vma_checks(
> + struct vm_area_struct   *vma,
> + struct inode*inode)

Exactly what are we checking for - function name doesn't tell me,
and there's no comments, either?

> +{
> + if (!is_xfs_map_direct(vma))
> + return 0;
> +
> + if (!is_map_direct_valid(vma->vm_private_data))
> + return VM_FAULT_SIGBUS;
> +
> + if (xfs_is_reflink_inode(XFS_I(inode)))
> + return VM_FAULT_SIGBUS;
> +
> + if (!IS_DAX(inode))
> + return VM_FAULT_SIGBUS;

And how do we get is_xfs_map_direct() set to true if we don't have a
DAX inode or the inode has shared extents?

> +
> + return 0;
> +}
> +
>  /*
>   * Locking for serialisation of IO during page faults. This results in a lock
>   * ordering of:
> @@ -1024,6 +1054,7 @@ __xfs_filemap_fault(
>   enum page_entry_sizepe_size,
>   boolwrite_fault)
>  {
> + struct vm_area_struct   *vma = vmf->vma;
>   struct inode*inode = file_inode(vmf->vma->vm_file);

You 

Re: [PATCH v7 03/12] fs: introduce i_mapdcount

2017-10-08 Thread Dave Chinner
On Fri, Oct 06, 2017 at 03:35:32PM -0700, Dan Williams wrote:
> When ->iomap_begin() sees this count being non-zero and determines that
> the block map of the file needs to be modified to satisfy the I/O
> request it will instead return an error. This is needed for MAP_DIRECT
> where, due to locking constraints, we can't rely on xfs_break_layouts()
> to protect against allocating write-faults either from the process that
> setup the MAP_DIRECT mapping nor other processes that have the file
> mapped.  xfs_break_layouts() requires XFS_IOLOCK which is problematic to
> mix with the XFS_MMAPLOCK in the fault path.
> 
> Cc: Jan Kara 
> Cc: Jeff Moyer 
> Cc: Christoph Hellwig 
> Cc: Dave Chinner 
> Cc: "Darrick J. Wong" 
> Cc: Ross Zwisler 
> Cc: Jeff Layton 
> Cc: "J. Bruce Fields" 
> Signed-off-by: Dan Williams 
> ---
>  fs/xfs/xfs_iomap.c |9 +
>  include/linux/fs.h |   31 +++
>  2 files changed, 40 insertions(+)
> 
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index a1909bc064e9..6816f8ebbdcf 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -1053,6 +1053,15 @@ xfs_file_iomap_begin(
>   goto out_unlock;
>   }
>   /*
> +  * If a file has MAP_DIRECT mappings disable block map
> +  * updates. This should only effect mmap write faults as
> +  * other paths are protected by an FL_LAYOUT lease.
> +  */
> + if (i_mapdcount_read(inode)) {
> + error = -ETXTBSY;
> + goto out_unlock;
> + }

That looks really fragile. For one, it's going to miss modifications
to reflinked files altogether. Ignoring that, however, I don't want to
have to care one bit about the internals of the MAP_DIRECT
implementation in the filesystem code. Hide it behind something with
an obvious name that returns the appropriate error and the
filesystem code becomes self documenting:

if ((flags & IOMAP_WRITE) && imap_needs_alloc(inode, , nimaps)) {
.
error = iomap_can_allocate(inode);
if (error)
goto out_unlock;

Then you can put all the MAP_DIRECT stuff and the comments
explaining what is does inside the generic function that determines
if we are allowed to allocate on that inode or not.

> + /*
>* We cap the maximum length we map here to MAX_WRITEBACK_PAGES
>* pages to keep the chunks of work done where somewhat 
> symmetric
>* with the work writeback does. This is a completely arbitrary
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index c2b9bf3dc4e9..f83871b188ff 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -642,6 +642,9 @@ struct inode {
>   atomic_ti_count;
>   atomic_ti_dio_count;
>   atomic_ti_writecount;
> +#ifdef CONFIG_FS_DAX
> + atomic_ti_mapdcount;/* count of MAP_DIRECT vmas */
> +#endif

Is there any way to avoid growing the struct inode for this?

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


linux-nvdimm现场质量问题分析与解决

2017-10-08 Thread 鲍娴
您好!linux-nvdimm
工厂现场质量问题分析与解决训练营
广州  即将开班;-详情请阅读附件大纲
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH v8 2/2] IB/core: use MAP_DIRECT to fix / enable RDMA to DAX mappings

2017-10-08 Thread Dan Williams
On Sat, Oct 7, 2017 at 11:45 PM, kbuild test robot <l...@intel.com> wrote:
> Hi Dan,
>
> [auto build test ERROR on rdma/master]
> [also build test ERROR on v4.14-rc3 next-20170929]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]

This was a fixed up resend of patch [v7 11/12]. It's not clear how to
teach the kbuild robot to be aware of patch-replies to individual
patches in the series. I.e. reworked patches without resending the
complete series?

> url:
> https://github.com/0day-ci/linux/commits/Dan-Williams/iommu-up-level-sg_num_pages-from-amd-iommu/20171008-133505
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma.git 
> master
> config: i386-randconfig-n0-201741 (attached as .config)
> compiler: gcc-4.8 (Debian 4.8.4-1) 4.8.4
> reproduce:
> # save the attached .config to linux build tree
> make ARCH=i386
>
> All errors (new ones prefixed by >>):
>
>>> drivers/infiniband/core/umem.c:39:29: fatal error: linux/mapdirect.h: No 
>>> such file or directory
> #include 

mapdirect.h indeed does not exist when missing the earlier patches in
the series. It would be slick if the 0day-robot read the the
"in-reply-to" header and auto replaced a patch in a series, but that
would be a feature approaching magic.

>compilation terminated.
>
> vim +39 drivers/infiniband/core/umem.c
>
>   > 39  #include 
> 40  #include 
> 41  #include 
> 42  #include 
> 43  #include 
> 44
>
> ---
> 0-DAY kernel test infrastructureOpen Source Technology Center
> https://lists.01.org/pipermail/kbuild-all   Intel Corporation
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH v8 2/2] IB/core: use MAP_DIRECT to fix / enable RDMA to DAX mappings

2017-10-08 Thread kbuild test robot
Hi Dan,

[auto build test ERROR on rdma/master]
[also build test ERROR on v4.14-rc3 next-20170929]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Dan-Williams/iommu-up-level-sg_num_pages-from-amd-iommu/20171008-133505
base:   https://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma.git master
config: i386-randconfig-n0-201741 (attached as .config)
compiler: gcc-4.8 (Debian 4.8.4-1) 4.8.4
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

>> drivers/infiniband/core/umem.c:39:29: fatal error: linux/mapdirect.h: No 
>> such file or directory
#include 
^
   compilation terminated.

vim +39 drivers/infiniband/core/umem.c

  > 39  #include 
40  #include 
41  #include 
42  #include 
43  #include 
44  

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm