___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH 05/11] filesystem-dax: set page->index

2018-05-29 Thread Dan Williams
On Wed, May 23, 2018 at 1:40 AM, Jan Kara  wrote:
> On Tue 22-05-18 07:39:57, Dan Williams wrote:
>> In support of enabling memory_failure() handling for filesystem-dax
>> mappings, set ->index to the pgoff of the page. The rmap implementation
>> requires ->index to bound the search through the vma interval tree. The
>> index is set and cleared at dax_associate_entry() and
>> dax_disassociate_entry() time respectively.
>>
>> Cc: Jan Kara 
>> Cc: Christoph Hellwig 
>> Cc: Matthew Wilcox 
>> Cc: Ross Zwisler 
>> Signed-off-by: Dan Williams 
>> ---
>>  fs/dax.c |   11 ---
>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/dax.c b/fs/dax.c
>> index aaec72ded1b6..2e4682cd7c69 100644
>> --- a/fs/dax.c
>> +++ b/fs/dax.c
>> @@ -319,18 +319,22 @@ static unsigned long dax_radix_end_pfn(void *entry)
>>   for (pfn = dax_radix_pfn(entry); \
>>   pfn < dax_radix_end_pfn(entry); pfn++)
>>
>> -static void dax_associate_entry(void *entry, struct address_space *mapping)
>> +static void dax_associate_entry(void *entry, struct address_space *mapping,
>> + struct vm_area_struct *vma, unsigned long address)
>>  {
>> - unsigned long pfn;
>> + unsigned long size = dax_entry_size(entry), pfn, index;
>> + int i = 0;
>>
>>   if (IS_ENABLED(CONFIG_FS_DAX_LIMITED))
>>   return;
>>
>> + index = linear_page_index(vma, address & ~(size - 1));
>>   for_each_mapped_pfn(entry, pfn) {
>>   struct page *page = pfn_to_page(pfn);
>>
>>   WARN_ON_ONCE(page->mapping);
>>   page->mapping = mapping;
>> + page->index = index + i++;
>>   }
>>  }
>
> Hum, this just made me think: How is this going to work with XFS reflink?
> In fact is not the page->mapping association already broken by XFS reflink?
> Because with reflink we can have two or more mappings pointing to the same
> physical blocks (i.e., pages in DAX case)...

Good question. I assume we are ok in the non-DAX reflink case because
rmap of failing / poison pages is only relative to the specific page
cache page for a given inode in the reflink. However, DAX would seem
to break this because we only get one shared 'struct page' for all
possible mappings of the physical file block. I think this means that
iterating over the rmap of "where is this page mapped" would require
iterating over the other "sibling" inodes that know about the given
physical file block.

As far as I can see reflink+dax would require teaching kernel code
paths that ->mapping may not be a singular relationship. Something
along the lines of what Jerome was presenting at LSF to create a
special value to indicate, "call back into the filesystem (or the page
owner)" to perform this operation.

In the meantime the kernel crashes when userspace accesses poisoned
pmem via DAX. I assume that reworking rmap for the dax+reflink case
should not block dax poison handling? Yell if you disagree.
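
For readers following the quoted hunk, the index arithmetic can be
sketched in userspace C. This is a minimal sketch, not kernel code: the
sketch_ names, the 4 KiB page size, and the simplified vma structure are
assumptions for illustration. Only the `address & ~(size - 1)` rounding
and the `index + i++` assignment come from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12UL	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the two vm_area_struct fields the math needs. */
struct sketch_vma {
	unsigned long vm_start;	/* first virtual address of the mapping */
	unsigned long vm_pgoff;	/* file offset of vm_start, in pages */
};

/* Userspace model of linear_page_index(): virtual address -> file page offset. */
static unsigned long sketch_linear_page_index(const struct sketch_vma *vma,
					      unsigned long address)
{
	return ((address - vma->vm_start) >> SKETCH_PAGE_SHIFT) + vma->vm_pgoff;
}

/*
 * Model of the loop in the hunk: round the faulting address down to the
 * start of the entry (size must be a power of two), then hand out
 * consecutive indexes, one per page covered by the entry.
 * Returns the number of indexes written to out[].
 */
static size_t sketch_assign_indexes(const struct sketch_vma *vma,
				    unsigned long address, unsigned long size,
				    unsigned long *out, size_t out_len)
{
	unsigned long index;
	size_t i, nr_pages = size >> SKETCH_PAGE_SHIFT;

	assert(size != 0 && (size & (size - 1)) == 0);

	index = sketch_linear_page_index(vma, address & ~(size - 1));
	for (i = 0; i < nr_pages && i < out_len; i++)
		out[i] = index + i;
	return i;
}
```

With vm_start 0x40000000, vm_pgoff 16 and a fault at 0x40201000 inside a
2 MiB entry, the first page of the entry gets index 528 and each
following page gets the next index.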


[ndctl PATCH v2] ndctl, list: display the 'map' location in listings

2018-05-29 Thread Vishal Verma
For 'fsdax' and 'devdax' namespaces, a 'map' location may be specified
for storing the page structures. This can be 'mem', for system RAM, or
'dev', for using pmem as the backing storage. Once set, there was no way
to tell, using ndctl, which of the two locations a namespace was
configured for. Add this in util_namespace_to_json so that all
namespace listings contain the map location.

Reported-by: "Yigal Korman" 
Cc: Dan Williams 
Signed-off-by: Vishal Verma 
---
 util/json.c | 32 +++-
 1 file changed, 27 insertions(+), 5 deletions(-)

v2: Also account for memmap=ss!nn or legacy-e820 namespaces. (Dan)

diff --git a/util/json.c b/util/json.c
index c606e1c..b020300 100644
--- a/util/json.c
+++ b/util/json.c
@@ -667,11 +667,17 @@ struct json_object *util_namespace_to_json(struct ndctl_namespace *ndns,
 {
struct json_object *jndns = json_object_new_object();
struct json_object *jobj, *jbbs = NULL;
+   const char *locations[] = {
+   [NDCTL_PFN_LOC_NONE] = "none",
+   [NDCTL_PFN_LOC_RAM] = "mem",
+   [NDCTL_PFN_LOC_PMEM] = "dev",
+   };
unsigned long long size = ULLONG_MAX;
unsigned int sector_size = UINT_MAX;
enum ndctl_namespace_mode mode;
const char *bdev = NULL, *name;
unsigned int bb_count = 0;
+   enum ndctl_pfn_loc loc;
struct ndctl_btt *btt;
struct ndctl_pfn *pfn;
struct ndctl_dax *dax;
@@ -693,33 +699,49 @@ struct json_object *util_namespace_to_json(struct ndctl_namespace *ndns,
mode = ndctl_namespace_get_mode(ndns);
switch (mode) {
case NDCTL_NS_MODE_MEMORY:
-   if (pfn) /* dynamic memory mode */
+   jobj = json_object_new_string("fsdax");
+   if (jobj)
+   json_object_object_add(jndns, "mode", jobj);
+   loc = ndctl_pfn_get_location(pfn);
+   if (pfn) { /* dynamic memory mode */
size = ndctl_pfn_get_size(pfn);
-   else /* native/static memory mode */
+   jobj = json_object_new_string(locations[loc]);
+   } else { /* native/static memory mode */
size = ndctl_namespace_get_size(ndns);
-   jobj = json_object_new_string("fsdax");
+   jobj = json_object_new_string("mem");
+   }
+   if (jobj)
+   json_object_object_add(jndns, "map", jobj);
break;
case NDCTL_NS_MODE_DAX:
if (!dax)
goto err;
size = ndctl_dax_get_size(dax);
jobj = json_object_new_string("devdax");
+   if (jobj)
+   json_object_object_add(jndns, "mode", jobj);
+   loc = ndctl_dax_get_location(dax);
+   jobj = json_object_new_string(locations[loc]);
+   if (jobj)
+   json_object_object_add(jndns, "map", jobj);
break;
case NDCTL_NS_MODE_SAFE:
if (!btt)
goto err;
jobj = json_object_new_string("sector");
+   if (jobj)
+   json_object_object_add(jndns, "mode", jobj);
size = ndctl_btt_get_size(btt);
break;
case NDCTL_NS_MODE_RAW:
size = ndctl_namespace_get_size(ndns);
jobj = json_object_new_string("raw");
+   if (jobj)
+   json_object_object_add(jndns, "mode", jobj);
break;
default:
jobj = NULL;
}
-   if (jobj)
-   json_object_object_add(jndns, "mode", jobj);
 
if (size < ULLONG_MAX) {
jobj = util_json_object_size(size, flags);
-- 
2.17.0
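
The lookup that the patch adds can be sketched on its own, outside of
ndctl. This is a minimal sketch under stated assumptions: the sketch_
enum and function names are hypothetical and do not exist in libndctl.
The three strings and the "mem" fallback for namespaces without a pfn
instance (memmap=ss!nn or legacy-e820) are taken from the patch.

```c
#include <assert.h>

/* Hypothetical mirror of the location enum used by the patch. */
enum sketch_pfn_loc {
	SKETCH_LOC_NONE,
	SKETCH_LOC_RAM,
	SKETCH_LOC_PMEM,
	SKETCH_LOC_COUNT,
};

/* Designated initializers keep each string tied to its enum value. */
static const char *const sketch_locations[SKETCH_LOC_COUNT] = {
	[SKETCH_LOC_NONE] = "none",
	[SKETCH_LOC_RAM] = "mem",
	[SKETCH_LOC_PMEM] = "dev",
};

/*
 * has_pfn models whether a pfn (or dax) instance backs the namespace.
 * Without one, the page structures live in system RAM, so report "mem".
 */
static const char *sketch_map_string(int has_pfn, enum sketch_pfn_loc loc)
{
	if (!has_pfn)
		return "mem";
	assert((unsigned int)loc < SKETCH_LOC_COUNT);
	return sketch_locations[loc];
}
```

Checking for the instance before reading its location keeps the
static-mode case from ever indexing the table with an undefined value.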



Re: [PATCH v2 2/7] dax: change bdev_dax_supported() to support boolean returns

2018-05-29 Thread Ross Zwisler
On Tue, May 29, 2018 at 02:25:10PM -0700, Darrick J. Wong wrote:
> On Tue, May 29, 2018 at 01:51:01PM -0600, Ross Zwisler wrote:
> > From: Dave Jiang 
> > 
> > The function's return values are confusing given the way the function is
> > named. We expect a true or false return value, but it actually returns
> > 0/-errno, which makes the calling code hard to read. Change the function
> > to return a bool: true if DAX is supported, false if it is not.
> > 
> > Signed-off-by: Dave Jiang 
> > Signed-off-by: Ross Zwisler 
> 
> Looks ok, do you want me to pull the first two patches through the xfs
> tree?
> 
> Reviewed-by: Darrick J. Wong 

Thanks for the review.

I'm not sure what's best.  If you do that then Mike will need to have a DM
branch for the rest of the series based on your stable commits, yea?

Mike what would you prefer?


Re: [PATCH v2 2/7] dax: change bdev_dax_supported() to support boolean returns

2018-05-29 Thread Darrick J. Wong
On Tue, May 29, 2018 at 01:51:01PM -0600, Ross Zwisler wrote:
> From: Dave Jiang 
> 
> The function's return values are confusing given the way the function is
> named. We expect a true or false return value, but it actually returns
> 0/-errno, which makes the calling code hard to read. Change the function
> to return a bool: true if DAX is supported, false if it is not.
> 
> Signed-off-by: Dave Jiang 
> Signed-off-by: Ross Zwisler 

Looks ok, do you want me to pull the first two patches through the xfs
tree?

Reviewed-by: Darrick J. Wong 

--D

> ---
>  drivers/dax/super.c | 16 
>  fs/ext2/super.c |  3 +--
>  fs/ext4/super.c |  3 +--
>  fs/xfs/xfs_ioctl.c  |  4 ++--
>  fs/xfs/xfs_super.c  | 12 ++--
>  include/linux/dax.h |  8 
>  6 files changed, 22 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> index 3943feb9a090..1d7bd96511f0 100644
> --- a/drivers/dax/super.c
> +++ b/drivers/dax/super.c
> @@ -80,9 +80,9 @@ EXPORT_SYMBOL_GPL(fs_dax_get_by_bdev);
>   * This is a library function for filesystems to check if the block device
>   * can be mounted with dax option.
>   *
> - * Return: negative errno if unsupported, 0 if supported.
> + * Return: true if supported, false if unsupported
>   */
> -int __bdev_dax_supported(struct block_device *bdev, int blocksize)
> +bool __bdev_dax_supported(struct block_device *bdev, int blocksize)
>  {
>   struct dax_device *dax_dev;
>   pgoff_t pgoff;
> @@ -95,21 +95,21 @@ int __bdev_dax_supported(struct block_device *bdev, int blocksize)
>   if (blocksize != PAGE_SIZE) {
>   pr_debug("%s: error: unsupported blocksize for dax\n",
>   bdevname(bdev, buf));
> - return -EINVAL;
> + return false;
>   }
>  
>   err = bdev_dax_pgoff(bdev, 0, PAGE_SIZE, &pgoff);
>   if (err) {
>   pr_debug("%s: error: unaligned partition for dax\n",
>   bdevname(bdev, buf));
> - return err;
> + return false;
>   }
>  
>   dax_dev = dax_get_by_host(bdev->bd_disk->disk_name);
>   if (!dax_dev) {
>   pr_debug("%s: error: device does not support dax\n",
>   bdevname(bdev, buf));
> - return -EOPNOTSUPP;
> + return false;
>   }
>  
>   id = dax_read_lock();
> @@ -121,7 +121,7 @@ int __bdev_dax_supported(struct block_device *bdev, int blocksize)
>   if (len < 1) {
>   pr_debug("%s: error: dax access failed (%ld)\n",
>   bdevname(bdev, buf), len);
> - return len < 0 ? len : -EIO;
> + return false;
>   }
>  
>   if (IS_ENABLED(CONFIG_FS_DAX_LIMITED) && pfn_t_special(pfn)) {
> @@ -139,10 +139,10 @@ int __bdev_dax_supported(struct block_device *bdev, int blocksize)
>   } else {
>   pr_debug("%s: error: dax support not enabled\n",
>   bdevname(bdev, buf));
> - return -EOPNOTSUPP;
> + return false;
>   }
>  
> - return 0;
> + return true;
>  }
>  EXPORT_SYMBOL_GPL(__bdev_dax_supported);
>  #endif
> diff --git a/fs/ext2/super.c b/fs/ext2/super.c
> index 9627c3054b5c..c09289a42dc5 100644
> --- a/fs/ext2/super.c
> +++ b/fs/ext2/super.c
> @@ -961,8 +961,7 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
>   blocksize = BLOCK_SIZE << le32_to_cpu(sbi->s_es->s_log_block_size);
>  
>   if (sbi->s_mount_opt & EXT2_MOUNT_DAX) {
> - err = bdev_dax_supported(sb->s_bdev, blocksize);
> - if (err) {
> + if (!bdev_dax_supported(sb->s_bdev, blocksize)) {
>   ext2_msg(sb, KERN_ERR,
>   "DAX unsupported by block device. Turning off DAX.");
>   sbi->s_mount_opt &= ~EXT2_MOUNT_DAX;
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 089170e99895..2e1622907f4a 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -3732,8 +3732,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
>   " that may contain inline data");
>   sbi->s_mount_opt &= ~EXT4_MOUNT_DAX;
>   }
> - err = bdev_dax_supported(sb->s_bdev, blocksize);
> - if (err) {
> + if (!bdev_dax_supported(sb->s_bdev, blocksize)) {
>   ext4_msg(sb, KERN_ERR,
>   "DAX unsupported by block device. Turning off DAX.");
>   sbi->s_mount_opt &= ~EXT4_MOUNT_DAX;
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index 0effd46b965f..2c70a0a4f59f 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -1103,8 +1103,8 @@ xfs_ioctl_setattr_dax_invalidate(
>   if (fa->fsx_xflags & FS_XFLAG_DAX) {
>   if 

Re: [PATCH v10] mm: introduce MEMORY_DEVICE_FS_DAX and CONFIG_DEV_PAGEMAP_OPS

2018-05-29 Thread Dan Williams
On Wed, May 23, 2018 at 11:50 AM, Gerald Schaefer
 wrote:
> On Tue, 22 May 2018 08:28:06 +0200
> Christoph Hellwig  wrote:
>
>> On Mon, May 21, 2018 at 11:04:10AM +0200, Jan Kara wrote:
>> > We definitely do have customers using "execute in place" on s390x from
>> > dcssblk. I've got about two bug reports for it when customers were updating
>> > from old kernels using original XIP to kernels using DAX. So we need to
>> > keep that working.
>>
>> That is all good and fine, but I think the time has come where s390 needs
>> to migrate to provide the pmem API so that we can get rid of these
>> special cases.  Especially given that the old XIP/legacy DAX has all
>> kinds of known bugs at this point in time.
>
> I haven't yet looked at this patch series, but I can feel that this
> FS_DAX_LIMITED workaround is beginning to cause some headaches, apart
> from being quite ugly of course.
>
> Just to make sure I still understand the basic problem, which I thought
> was missing struct pages for the dcssblk memory, what exactly do you
> mean with "provide the pmem API", is there more to do?

No, just 'struct page' is needed.

What used to be the pmem API is now pushed down into the dax_operations
provided by the device driver. dcssblk is free to just redirect to the
generic implementations for copy_from_iter() and copy_to_iter(), and
be done. I.e. we've removed the "pmem API" requirement.

> I do have a prototype patch lying around that adds struct pages, but
> didn't yet have time to fully test/complete it. Of course we initially
> introduced XIP as a mechanism to reduce memory consumption, and that
> is probably the use case for the remaining customer(s). Adding struct
> pages would somehow reduce that benefit, but as long as we can still
> "execute in place", I guess it will be OK.

The pmem driver has the option to allocate the 'struct page' map out
of pmem directly. If the overhead of having the map in System RAM is
too high it could borrow the same approach, but that adds another
degree of configuration complexity freedom.
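
The "redirect to the generic implementations" idea can be sketched as a
small operations table. This is a minimal userspace sketch, not the
kernel interface: struct sketch_dax_ops, sketch_generic_copy() and
sketch_dcssblk_ops are hypothetical names, and the plain memcpy() stands
in for whatever the generic copy_from_iter()/copy_to_iter() helpers do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified operations table supplied by a driver. */
struct sketch_dax_ops {
	size_t (*copy_from)(void *dst, const void *src, size_t len);
	size_t (*copy_to)(void *dst, const void *src, size_t len);
};

/* Stand-in for a generic copy helper that any driver may reuse. */
static size_t sketch_generic_copy(void *dst, const void *src, size_t len)
{
	assert(dst != NULL && src != NULL);
	memcpy(dst, src, len);
	return len;
}

/* A driver with no special copy requirements points at the generic helper. */
static const struct sketch_dax_ops sketch_dcssblk_ops = {
	.copy_from = sketch_generic_copy,
	.copy_to = sketch_generic_copy,
};
```

The point of the table is that callers only go through the function
pointers, so a driver opts in by filling two slots and needs no
pmem-specific code of its own.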


Re: [qemu PATCH v4 0/4] support NFIT platform capabilities

2018-05-29 Thread Ross Zwisler
Ping on this series.  Rob, I think I've addressed all your feedback.  Can you
please verify?

Thanks,
- Ross

On Mon, May 21, 2018 at 10:31:59AM -0600, Ross Zwisler wrote:
> Changes since v3:
>  * Updated the text in docs/nvdimm.txt to make it clear that the value
>being passed in on the command line is an integer made up of various
>bit fields. (Rob Elliott)
>  
>  * Updated the "Highest Valid Capability" byte to be dynamic based on
>the highest valid bit in the user's input. (Rob Elliott)
> 
> ---
> 
> The first 2 patches in this series clean up some things I noticed while
> coding.
> 
> Patch 3 adds support for the new Platform Capabilities Structure, which
> was added to the NFIT in ACPI 6.2 Errata A.  We add a machine command
> line option "nvdimm-cap":
> 
> -machine pc,accel=kvm,nvdimm,nvdimm-cap=2
> 
> which allows the user to pass in a value for this structure.  When such
> a value is passed in we will generate the new NFIT subtable.
> 
> Patch 4 adds code to the "make check" self test infrastructure so that
> we generate the new Platform Capabilities Structure, and adds it to the
> expected NFIT output so that we test for it.
> 
> Ross Zwisler (4):
>   nvdimm: fix typo in label-size definition
>   tests/.gitignore: add entry for generated file
>   nvdimm, acpi: support NFIT platform capabilities
>   ACPI testing: test NFIT platform capabilities
> 
>  docs/nvdimm.txt   |  27 
>  hw/acpi/nvdimm.c  |  45 +++---
>  hw/i386/pc.c  |  31 +++
>  hw/mem/nvdimm.c   |   2 +-
>  include/hw/i386/pc.h  |   1 +
>  include/hw/mem/nvdimm.h   |   7 +-
>  tests/.gitignore  |   1 +
>  tests/acpi-test-data/pc/NFIT.dimmpxm  | Bin 224 -> 240 bytes
>  tests/acpi-test-data/q35/NFIT.dimmpxm | Bin 224 -> 240 bytes
>  tests/bios-tables-test.c  |   2 +-
>  10 files changed, 109 insertions(+), 7 deletions(-)
> 
> -- 
> 2.14.3
> 
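
One way to derive a "Highest Valid Capability" value from the user's
input is sketched below. This is a minimal sketch that assumes the
field holds the 0-based index of the most significant set bit of the
nvdimm-cap value; sketch_highest_valid_cap() is a hypothetical name and
the return value for an input of 0 is an arbitrary choice of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the 0-based index of the highest set bit in capabilities,
 * or -1 if no bit is set (assumption: no capabilities requested).
 */
static int sketch_highest_valid_cap(uint32_t capabilities)
{
	int bit = -1;

	while (capabilities) {
		capabilities >>= 1;
		bit++;
	}
	return bit;
}
```

Under that assumption, the nvdimm-cap=2 example from the cover letter
sets only bit 1, so the derived value is 1, and nvdimm-cap=3 gives the
same result because bit 1 is still the highest bit set.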


[PATCH v2 0/7] Fix DM DAX handling

2018-05-29 Thread Ross Zwisler
Changes from v1:
 * Reworked patches 1 and 2 so that the __bdev_dax_supported() function
   stays hidden behind the bdev_dax_supported() wrapper.  This is needed
   to prevent compilation errors in configs where CONFIG_FS_DAX isn't
   defined. (0-day)

 * Added Eric's Reviewed-by to patch 1.  I did this in spite of the
   bdev_dax_supported() changes because they were minor and I think
   Eric's review was focused on the XFS parts.

---

This series fixes a few issues that I found with DM's handling of DAX
devices.  Here are some of the issues I found:

 * We can create a dm-stripe or dm-linear device which is made up of an
   fsdax PMEM namespace and a raw PMEM namespace but which can hold a
   filesystem mounted with the -o dax mount option.  DAX operations to
   the raw PMEM namespace part lack struct page and can fail in
   interesting/unexpected ways when doing things like fork(), examining
   memory with gdb, etc.

 * We can create a dm-stripe or dm-linear device which is made up of an
   fsdax PMEM namespace and a BRD ramdisk which can hold a filesystem
   mounted with the -o dax mount option.  All I/O to this filesystem
   will fail.

 * In DM you can't transition a dm target which could possibly support
   DAX (mode DM_TYPE_DAX_BIO_BASED) to one which can't support DAX
   (mode DM_TYPE_BIO_BASED), even if you never use DAX.

The first 2 patches in this series are prep work from Darrick and Dave
which improve bdev_dax_supported().  The last 5 patches fix the above
mentioned problems in DM.  I feel that this series simplifies the
handling of DAX devices in DM, and the last 5 DM-related patches have a
net code reduction of 50 lines.

Darrick J. Wong (1):
  fs: allow per-device dax status checking for filesystems

Dave Jiang (1):
  dax: change bdev_dax_supported() to support boolean returns

Ross Zwisler (5):
  dm: fix test for DAX device support
  dm: prevent DAX mounts if not supported
  dm: remove DM_TYPE_DAX_BIO_BASED dm_queue_mode
  dm-snap: remove unnecessary direct_access() stub
  dm-error: remove unnecessary direct_access() stub

 drivers/dax/super.c   | 40 
 drivers/md/dm-ioctl.c | 16 ++--
 drivers/md/dm-snap.c  |  8 
 drivers/md/dm-table.c | 29 +++--
 drivers/md/dm-target.c|  7 ---
 drivers/md/dm.c   |  7 ++-
 fs/ext2/super.c   |  3 +--
 fs/ext4/super.c   |  3 +--
 fs/xfs/xfs_ioctl.c|  3 ++-
 fs/xfs/xfs_iops.c | 30 +-
 fs/xfs/xfs_super.c| 10 --
 include/linux/dax.h   | 11 ++-
 include/linux/device-mapper.h |  8 ++--
 13 files changed, 88 insertions(+), 87 deletions(-)

-- 
2.14.3



[PATCH v2 7/7] dm-error: remove unnecessary direct_access() stub

2018-05-29 Thread Ross Zwisler
This stub was added so that we could use dm-error with
DM_TYPE_DAX_BIO_BASED mode devices.  That mode and the transition issues
associated with it no longer exist, so we can remove this dead code.

Signed-off-by: Ross Zwisler 
---
 drivers/md/dm-target.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/md/dm-target.c b/drivers/md/dm-target.c
index 314d17ca6466..c4dbc15f7862 100644
--- a/drivers/md/dm-target.c
+++ b/drivers/md/dm-target.c
@@ -140,12 +140,6 @@ static void io_err_release_clone_rq(struct request *clone)
 {
 }
 
-static long io_err_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
-   long nr_pages, void **kaddr, pfn_t *pfn)
-{
-   return -EIO;
-}
-
 static struct target_type error_target = {
.name = "error",
.version = {1, 5, 0},
@@ -155,7 +149,6 @@ static struct target_type error_target = {
.map  = io_err_map,
.clone_and_map_rq = io_err_clone_and_map_rq,
.release_clone_rq = io_err_release_clone_rq,
-   .direct_access = io_err_dax_direct_access,
 };
 
 int __init dm_target_init(void)
-- 
2.14.3



[PATCH v2 1/7] fs: allow per-device dax status checking for filesystems

2018-05-29 Thread Ross Zwisler
From: "Darrick J. Wong" 

Change bdev_dax_supported so it takes a bdev parameter.  This enables
multi-device filesystems like xfs to check that a dax device can work for
the particular filesystem.  Once that's in place, actually fix all the
parts of XFS where we need to be able to distinguish between datadev and
rtdev.

This patch fixes the problem where we screw up the dax support checking
in xfs if the datadev and rtdev have different dax capabilities.

Signed-off-by: Darrick J. Wong 
[rez: Re-added __bdev_dax_supported() for !CONFIG_FS_DAX cases]
Signed-off-by: Ross Zwisler 
Reviewed-by: Eric Sandeen 
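
The difference between a per-superblock and a per-device check can be
sketched as follows. This is a minimal userspace sketch: the sketch_
structures and functions are hypothetical, and the device selection only
mimics the spirit of xfs_find_bdev_for_inode() rather than reproducing
it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a block device: only its DAX capability matters. */
struct sketch_bdev {
	bool dax_capable;
};

/* Hypothetical model of a filesystem with a data and a realtime device. */
struct sketch_fs {
	const struct sketch_bdev *datadev;
	const struct sketch_bdev *rtdev;	/* may be NULL */
};

/* Pick the device that actually backs the file. */
static const struct sketch_bdev *sketch_bdev_for_file(const struct sketch_fs *fs,
						       bool is_realtime)
{
	return (is_realtime && fs->rtdev) ? fs->rtdev : fs->datadev;
}

/* Per-device check: the answer depends on the device, not the superblock. */
static bool sketch_file_can_use_dax(const struct sketch_fs *fs, bool is_realtime)
{
	const struct sketch_bdev *bdev = sketch_bdev_for_file(fs, is_realtime);

	assert(bdev != NULL);
	return bdev->dax_capable;
}
```

With a DAX-capable datadev and a non-capable rtdev, a regular file may
use DAX while a realtime file may not, which a single check against the
superblock's device cannot express.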
---
 drivers/dax/super.c | 26 +-
 fs/ext2/super.c |  2 +-
 fs/ext4/super.c |  2 +-
 fs/xfs/xfs_ioctl.c  |  3 ++-
 fs/xfs/xfs_iops.c   | 30 +-
 fs/xfs/xfs_super.c  | 10 --
 include/linux/dax.h |  9 +
 7 files changed, 55 insertions(+), 27 deletions(-)

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 2b2332b605e4..3943feb9a090 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -74,7 +74,7 @@ EXPORT_SYMBOL_GPL(fs_dax_get_by_bdev);
 
 /**
  * __bdev_dax_supported() - Check if the device supports dax for filesystem
- * @sb: The superblock of the device
+ * @bdev: block device to check
  * @blocksize: The block size of the device
  *
  * This is a library function for filesystems to check if the block device
@@ -82,33 +82,33 @@ EXPORT_SYMBOL_GPL(fs_dax_get_by_bdev);
  *
  * Return: negative errno if unsupported, 0 if supported.
  */
-int __bdev_dax_supported(struct super_block *sb, int blocksize)
+int __bdev_dax_supported(struct block_device *bdev, int blocksize)
 {
-   struct block_device *bdev = sb->s_bdev;
struct dax_device *dax_dev;
pgoff_t pgoff;
int err, id;
void *kaddr;
pfn_t pfn;
long len;
+   char buf[BDEVNAME_SIZE];
 
if (blocksize != PAGE_SIZE) {
-   pr_debug("VFS (%s): error: unsupported blocksize for dax\n",
-   sb->s_id);
+   pr_debug("%s: error: unsupported blocksize for dax\n",
+   bdevname(bdev, buf));
return -EINVAL;
}
 
err = bdev_dax_pgoff(bdev, 0, PAGE_SIZE, &pgoff);
if (err) {
-   pr_debug("VFS (%s): error: unaligned partition for dax\n",
-   sb->s_id);
+   pr_debug("%s: error: unaligned partition for dax\n",
+   bdevname(bdev, buf));
return err;
}
 
dax_dev = dax_get_by_host(bdev->bd_disk->disk_name);
if (!dax_dev) {
-   pr_debug("VFS (%s): error: device does not support dax\n",
-   sb->s_id);
+   pr_debug("%s: error: device does not support dax\n",
+   bdevname(bdev, buf));
return -EOPNOTSUPP;
}
 
@@ -119,8 +119,8 @@ int __bdev_dax_supported(struct super_block *sb, int blocksize)
put_dax(dax_dev);
 
if (len < 1) {
-   pr_debug("VFS (%s): error: dax access failed (%ld)\n",
-   sb->s_id, len);
+   pr_debug("%s: error: dax access failed (%ld)\n",
+   bdevname(bdev, buf), len);
return len < 0 ? len : -EIO;
}
 
@@ -137,8 +137,8 @@ int __bdev_dax_supported(struct super_block *sb, int blocksize)
} else if (pfn_t_devmap(pfn)) {
/* pass */;
} else {
-   pr_debug("VFS (%s): error: dax support not enabled\n",
-   sb->s_id);
+   pr_debug("%s: error: dax support not enabled\n",
+   bdevname(bdev, buf));
return -EOPNOTSUPP;
}
 
diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index de1694512f1f..9627c3054b5c 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -961,7 +961,7 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
blocksize = BLOCK_SIZE << le32_to_cpu(sbi->s_es->s_log_block_size);
 
if (sbi->s_mount_opt & EXT2_MOUNT_DAX) {
-   err = bdev_dax_supported(sb, blocksize);
+   err = bdev_dax_supported(sb->s_bdev, blocksize);
if (err) {
ext2_msg(sb, KERN_ERR,
"DAX unsupported by block device. Turning off DAX.");
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index eb104e8476f0..089170e99895 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3732,7 +3732,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
" that may contain inline data");
sbi->s_mount_opt &= ~EXT4_MOUNT_DAX;
}
-   err = bdev_dax_supported(sb, blocksize);
+   err = bdev_dax_supported(sb->s_bdev, blocksize);
  

[PATCH v2 3/7] dm: fix test for DAX device support

2018-05-29 Thread Ross Zwisler
Currently device_supports_dax() just checks to see if the QUEUE_FLAG_DAX
flag is set on the device's request queue to decide whether or not the
device supports filesystem DAX.  This is insufficient because there are
devices like PMEM namespaces in raw mode which have QUEUE_FLAG_DAX set but
which don't actually support DAX.

This means that you could create a dm-linear device, for example, where the
first part of the dm-linear device was a PMEM namespace in fsdax mode and
the second part was a PMEM namespace in raw mode.  Both DM and the
filesystem you put on that dm-linear device would think the whole device
supports DAX, which would lead to bad behavior once your raw PMEM namespace
part using DAX needed struct page for something.

Fix this by using bdev_dax_supported() like filesystems do at mount time.
This checks for raw mode and also performs other tests like checking to
make sure the dax_direct_access() path works.

Signed-off-by: Ross Zwisler 
Fixes: commit 545ed20e6df6 ("dm: add infrastructure for DAX support")
---
 drivers/md/dm-table.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 0589a4da12bb..5bb994b012ca 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -885,9 +885,7 @@ EXPORT_SYMBOL_GPL(dm_table_set_type);
 static int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
   sector_t start, sector_t len, void *data)
 {
-   struct request_queue *q = bdev_get_queue(dev->bdev);
-
-   return q && blk_queue_dax(q);
+   return bdev_dax_supported(dev->bdev, PAGE_SIZE);
 }
 
 static bool dm_table_supports_dax(struct dm_table *t)
-- 
2.14.3
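
The gap between the old and new tests can be sketched in userspace. This
is a minimal sketch with hypothetical sketch_ names: the has_struct_page
field is a simplification of what bdev_dax_supported() verifies, and the
rule that a table supports DAX only when every device passes is an
assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_dev {
	bool queue_flag_dax;	/* models QUEUE_FLAG_DAX on the request queue */
	bool has_struct_page;	/* false for e.g. a raw-mode PMEM namespace */
};

/* Old test: trust the queue flag alone. */
static bool sketch_old_supports_dax(const struct sketch_dev *dev)
{
	return dev->queue_flag_dax;
}

/* New test, in the spirit of bdev_dax_supported(): the flag is not enough. */
static bool sketch_new_supports_dax(const struct sketch_dev *dev)
{
	return dev->queue_flag_dax && dev->has_struct_page;
}

/* A table passes only if it is non-empty and every device passes the test. */
static bool sketch_table_supports_dax(const struct sketch_dev *devs, size_t n,
				      bool (*test)(const struct sketch_dev *))
{
	size_t i;

	assert(devs != NULL && test != NULL);
	for (i = 0; i < n; i++)
		if (!test(&devs[i]))
			return false;
	return n > 0;
}
```

For a table made of an fsdax namespace followed by a raw namespace, the
old test accepts the whole table while the new test rejects it because
of the second device.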



[PATCH v2 2/7] dax: change bdev_dax_supported() to support boolean returns

2018-05-29 Thread Ross Zwisler
From: Dave Jiang 

The function's return values are confusing given the way the function is
named. We expect a true or false return value, but it actually returns
0/-errno, which makes the calling code hard to read. Change the function
to return a bool: true if DAX is supported, false if it is not.

Signed-off-by: Dave Jiang 
Signed-off-by: Ross Zwisler 
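
The two return conventions can be compared side by side in a small
userspace sketch. The sketch_ functions and their parameters are
hypothetical stand-ins, not the kernel functions, and only two of the
failure cases from the patch are modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define SKETCH_PAGE_SIZE 4096	/* assumption: 4 KiB pages */

/* Old convention: 0 means "supported", a negative errno means "unsupported". */
static int sketch_dax_supported_errno(int blocksize, bool has_dax_dev)
{
	if (blocksize != SKETCH_PAGE_SIZE)
		return -EINVAL;
	if (!has_dax_dev)
		return -EOPNOTSUPP;
	return 0;
}

/* New convention: the name reads as a question and the result answers it. */
static bool sketch_dax_supported(int blocksize, bool has_dax_dev)
{
	return sketch_dax_supported_errno(blocksize, has_dax_dev) == 0;
}
```

Under the old convention, `if (sketch_dax_supported_errno(...))` is
taken when DAX is *not* supported, which is the inversion the patch
removes. The cost is that callers can no longer tell why the check
failed.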
---
 drivers/dax/super.c | 16 
 fs/ext2/super.c |  3 +--
 fs/ext4/super.c |  3 +--
 fs/xfs/xfs_ioctl.c  |  4 ++--
 fs/xfs/xfs_super.c  | 12 ++--
 include/linux/dax.h |  8 
 6 files changed, 22 insertions(+), 24 deletions(-)

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 3943feb9a090..1d7bd96511f0 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -80,9 +80,9 @@ EXPORT_SYMBOL_GPL(fs_dax_get_by_bdev);
  * This is a library function for filesystems to check if the block device
  * can be mounted with dax option.
  *
- * Return: negative errno if unsupported, 0 if supported.
+ * Return: true if supported, false if unsupported
  */
-int __bdev_dax_supported(struct block_device *bdev, int blocksize)
+bool __bdev_dax_supported(struct block_device *bdev, int blocksize)
 {
struct dax_device *dax_dev;
pgoff_t pgoff;
@@ -95,21 +95,21 @@ int __bdev_dax_supported(struct block_device *bdev, int blocksize)
if (blocksize != PAGE_SIZE) {
pr_debug("%s: error: unsupported blocksize for dax\n",
bdevname(bdev, buf));
-   return -EINVAL;
+   return false;
}
 
err = bdev_dax_pgoff(bdev, 0, PAGE_SIZE, &pgoff);
if (err) {
pr_debug("%s: error: unaligned partition for dax\n",
bdevname(bdev, buf));
-   return err;
+   return false;
}
 
dax_dev = dax_get_by_host(bdev->bd_disk->disk_name);
if (!dax_dev) {
pr_debug("%s: error: device does not support dax\n",
bdevname(bdev, buf));
-   return -EOPNOTSUPP;
+   return false;
}
 
id = dax_read_lock();
@@ -121,7 +121,7 @@ int __bdev_dax_supported(struct block_device *bdev, int blocksize)
if (len < 1) {
pr_debug("%s: error: dax access failed (%ld)\n",
bdevname(bdev, buf), len);
-   return len < 0 ? len : -EIO;
+   return false;
}
 
if (IS_ENABLED(CONFIG_FS_DAX_LIMITED) && pfn_t_special(pfn)) {
@@ -139,10 +139,10 @@ int __bdev_dax_supported(struct block_device *bdev, int blocksize)
} else {
pr_debug("%s: error: dax support not enabled\n",
bdevname(bdev, buf));
-   return -EOPNOTSUPP;
+   return false;
}
 
-   return 0;
+   return true;
 }
 EXPORT_SYMBOL_GPL(__bdev_dax_supported);
 #endif
diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index 9627c3054b5c..c09289a42dc5 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -961,8 +961,7 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
blocksize = BLOCK_SIZE << le32_to_cpu(sbi->s_es->s_log_block_size);
 
if (sbi->s_mount_opt & EXT2_MOUNT_DAX) {
-   err = bdev_dax_supported(sb->s_bdev, blocksize);
-   if (err) {
+   if (!bdev_dax_supported(sb->s_bdev, blocksize)) {
ext2_msg(sb, KERN_ERR,
"DAX unsupported by block device. Turning off DAX.");
sbi->s_mount_opt &= ~EXT2_MOUNT_DAX;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 089170e99895..2e1622907f4a 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3732,8 +3732,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
" that may contain inline data");
sbi->s_mount_opt &= ~EXT4_MOUNT_DAX;
}
-   err = bdev_dax_supported(sb->s_bdev, blocksize);
-   if (err) {
+   if (!bdev_dax_supported(sb->s_bdev, blocksize)) {
ext4_msg(sb, KERN_ERR,
"DAX unsupported by block device. Turning off DAX.");
sbi->s_mount_opt &= ~EXT4_MOUNT_DAX;
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 0effd46b965f..2c70a0a4f59f 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1103,8 +1103,8 @@ xfs_ioctl_setattr_dax_invalidate(
if (fa->fsx_xflags & FS_XFLAG_DAX) {
if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode)))
return -EINVAL;
-   if (bdev_dax_supported(xfs_find_bdev_for_inode(VFS_I(ip)),
-   sb->s_blocksize) < 0)
+   if (!bdev_dax_supported(xfs_find_bdev_for_inode(VFS_I(ip)),
+

[PATCH v2 4/7] dm: prevent DAX mounts if not supported

2018-05-29 Thread Ross Zwisler
Currently the code in dm_dax_direct_access() only checks whether the target
type has a direct_access() operation defined, not whether the underlying
block devices all support DAX.  This latter property can be seen by looking
at whether we set the QUEUE_FLAG_DAX request queue flag when creating the
DM device.

This is problematic if we have, for example, a dm-linear device made up of
a PMEM namespace in fsdax mode followed by a ramdisk from BRD.
QUEUE_FLAG_DAX won't be set on the dm-linear device's request queue, but
we have a working direct_access() entry point and the first member of the
dm-linear set *does* support DAX.

This allows the user to create a filesystem on the dm-linear device, and
then mount it with DAX.  The filesystem's bdev_dax_supported() test will
pass because it'll operate on the first member of the dm-linear device,
which happens to be a fsdax PMEM namespace.

All DAX I/O will then fail to that dm-linear device because the lack of
QUEUE_FLAG_DAX prevents fs_dax_get_by_bdev() from working.  This means that
the struct dax_device isn't ever set in the filesystem, so
dax_direct_access() will always return -EOPNOTSUPP.

By failing out of dm_dax_direct_access() if QUEUE_FLAG_DAX isn't set we let
the filesystem know we don't support DAX at mount time.  The filesystem
will then silently fall back and remove the dax mount option, causing it to
work properly.

Signed-off-by: Ross Zwisler 
Fixes: commit 545ed20e6df6 ("dm: add infrastructure for DAX support")
---
 drivers/md/dm.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 0a7b0107ca78..9728433362d1 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1050,14 +1050,13 @@ static long dm_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
 
if (!ti)
goto out;
-   if (!ti->type->direct_access)
+   if (!blk_queue_dax(md->queue))
goto out;
len = max_io_len(sector, ti) / PAGE_SECTORS;
if (len < 1)
goto out;
nr_pages = min(len, nr_pages);
-   if (ti->type->direct_access)
-   ret = ti->type->direct_access(ti, pgoff, nr_pages, kaddr, pfn);
+   ret = ti->type->direct_access(ti, pgoff, nr_pages, kaddr, pfn);
 
  out:
dm_put_live_table(md, srcu_idx);
-- 
2.14.3
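
The difference between the old and new checks in the patch above can be
sketched with a small userspace model.  This is only an illustration, not
kernel code: every type and function name below (model_member,
model_dm_device, model_queue_dax() and so on) is hypothetical, and the rule
that the queue flag is set only when every member supports DAX is taken from
the commit message rather than from the DM source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one underlying block device of a DM table. */
struct model_member {
	bool supports_dax;
};

/* Hypothetical stand-in for a mapped device such as dm-linear. */
struct model_dm_device {
	const struct model_member *members;
	size_t nr_members;
	/* Models "ti->type->direct_access" being non-NULL. */
	bool target_has_direct_access;
};

/*
 * Models QUEUE_FLAG_DAX: per the commit message it reflects whether
 * all underlying block devices support DAX.
 */
bool model_queue_dax(const struct model_dm_device *dev)
{
	assert(dev != NULL);
	if (dev->nr_members == 0)
		return false;
	for (size_t i = 0; i < dev->nr_members; i++) {
		if (!dev->members[i].supports_dax)
			return false;
	}
	return true;
}

/* Old check: only asks whether the target type has a direct_access op. */
bool old_check_allows_dax(const struct model_dm_device *dev)
{
	assert(dev != NULL);
	return dev->target_has_direct_access;
}

/* New check: asks whether the queue flag (all members DAX) is set. */
bool new_check_allows_dax(const struct model_dm_device *dev)
{
	return model_queue_dax(dev);
}
```

With a PMEM member followed by a ramdisk member, the old check in this model
still reports DAX as usable while the new check does not, which is the
mismatch the commit message describes.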



Re: [ndctl PATCH] ndctl, list: display the 'map' location in listings

2018-05-29 Thread Dan Williams
On Tue, May 29, 2018 at 11:52 AM, Verma, Vishal L
 wrote:
> On Mon, 2018-05-28 at 11:04 -0700, Dan Williams wrote:
>> On Fri, May 25, 2018 at 3:36 PM Vishal Verma 
>> wrote:
>>
>> > For 'fsdax' and 'devdax' namespaces, a 'map' location may be specified
>> > for page structures storage. This can be 'mem', for system RAM, or
>> > 'dev', for using pmem as the backing storage. Once set, there was no
>> > way to tell, using ndctl, which of the two locations a namespace was
>> > configured for. Add this in util_namespace_to_json so that all
>> > namespace listings contain the map location.
>> > Reported-by: "Yigal Korman" 
>> > Cc: Dan Williams 
>> > Signed-off-by: Vishal Verma 
>> > ---
>> >   util/json.c | 18 ++
>> >   1 file changed, 18 insertions(+)
>> > diff --git a/util/json.c b/util/json.c
>> > index c606e1c..17dd90c 100644
>> > --- a/util/json.c
>> > +++ b/util/json.c
>> > @@ -667,11 +667,17 @@ struct json_object *util_namespace_to_json(struct ndctl_namespace *ndns,
>> >   {
>> >  struct json_object *jndns = json_object_new_object();
>> >  struct json_object *jobj, *jbbs = NULL;
>> > +   const char *locations[] = {
>> > +   [NDCTL_PFN_LOC_NONE] = "none",
>> > +   [NDCTL_PFN_LOC_RAM] = "mem",
>> > +   [NDCTL_PFN_LOC_PMEM] = "dev",
>> > +   };
>> >  unsigned long long size = ULLONG_MAX;
>> >  unsigned int sector_size = UINT_MAX;
>> >  enum ndctl_namespace_mode mode;
>> >  const char *bdev = NULL, *name;
>> >  unsigned int bb_count = 0;
>> > +   enum ndctl_pfn_loc loc;
>> >  struct ndctl_btt *btt;
>> >  struct ndctl_pfn *pfn;
>> >  struct ndctl_dax *dax;
>> > @@ -749,6 +755,12 @@ struct json_object *util_namespace_to_json(struct ndctl_namespace *ndns,
>> >  jobj = util_raw_uuid(ndns);
>> >  if (jobj)
>> >  json_object_object_add(jndns, "raw_uuid", jobj);
>> > +   loc = ndctl_pfn_get_location(pfn);
>> > +   jobj = json_object_new_string(locations[loc]);
>> > +   if (!jobj)
>> > +   goto err;
>> > +   if (jobj)
>> > +   json_object_object_add(jndns, "map", jobj);
>> >  bdev = ndctl_pfn_get_block_device(pfn);
>> >  } else if (dax) {
>> >  struct daxctl_region *dax_region;
>> > @@ -763,6 +775,12 @@ struct json_object *util_namespace_to_json(struct ndctl_namespace *ndns,
>> >  jobj = util_raw_uuid(ndns);
>> >  if (jobj)
>> >  json_object_object_add(jndns, "raw_uuid", jobj);
>> > +   loc = ndctl_dax_get_location(dax);
>> > +   jobj = json_object_new_string(locations[loc]);
>> > +   if (!jobj)
>> > +   goto err;
>> > +   if (jobj)
>> > +   json_object_object_add(jndns, "map", jobj);
>> >  if ((flags & UTIL_JSON_DAX) && dax_region) {
>> >  jobj = util_daxctl_region_to_json(dax_region, NULL,
>> >  flags);
>>
>> There appears to be one case missing in this:
>>
>>  case NDCTL_NS_MODE_MEMORY:
>>  if (pfn) /* dynamic memory mode */
>>  size = ndctl_pfn_get_size(pfn);
>>  else /* native/static memory mode */
>>  size = ndctl_namespace_get_size(ndns);
>>  jobj = json_object_new_string("fsdax");
>>  break;
>>
>> In the "/* native/static memory mode */" configuration we should emit a
>> 'map:"mem"' indication.
>
> Ah good catch. Is this for legacy/labelless namespaces that have been
> configured into fsdax mode?

This is strictly for memmap=ss!nn and legacy-e820 defined namespaces
where the assumption is that they are small and never need to have the
map allocated anywhere else but System RAM.
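
A minimal sketch of the rule suggested above, assuming a simplified
userspace model rather than the real ndctl or json-c APIs: when an fsdax
namespace has a pfn instance, report that instance's location; when it has
none (the native/static memory mode), report "mem".  The enum, struct and
function names here are hypothetical stand-ins, and the model only applies
to a namespace already known to be in fsdax mode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the three locations named in the patch. */
enum model_pfn_loc {
	MODEL_LOC_NONE,
	MODEL_LOC_RAM,
	MODEL_LOC_PMEM,
};

/* Hypothetical stand-in for a pfn instance attached to a namespace. */
struct model_pfn {
	enum model_pfn_loc loc;
};

/* Same strings the patch uses for the "map" listing field. */
static const char *const model_loc_names[] = {
	[MODEL_LOC_NONE] = "none",
	[MODEL_LOC_RAM] = "mem",
	[MODEL_LOC_PMEM] = "dev",
};

/*
 * For an fsdax namespace: with a pfn instance, use its location;
 * without one (native/static memory mode), the map lives in System RAM.
 */
enum model_pfn_loc model_effective_map(const struct model_pfn *pfn)
{
	if (pfn)
		return pfn->loc;
	return MODEL_LOC_RAM;
}

/* Translate a location into the string emitted in the listing. */
const char *model_map_name(enum model_pfn_loc loc)
{
	assert((size_t)loc <
	       sizeof(model_loc_names) / sizeof(model_loc_names[0]));
	return model_loc_names[loc];
}
```

Keeping the fallback in one helper means both the pfn-backed and the
static case go through the same string table, so the listing cannot end up
with a location name outside the three defined in the patch.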


Re: [ndctl PATCH] ndctl, list: display the 'map' location in listings

2018-05-29 Thread Verma, Vishal L
On Mon, 2018-05-28 at 11:04 -0700, Dan Williams wrote:
> On Fri, May 25, 2018 at 3:36 PM Vishal Verma 
> wrote:
> 
> > For 'fsdax' and 'devdax' namespaces, a 'map' location may be specified
> > for page structures storage. This can be 'mem', for system RAM, or
> > 'dev', for using pmem as the backing storage. Once set, there was no
> > way to tell, using ndctl, which of the two locations a namespace was
> > configured for. Add this in util_namespace_to_json so that all
> > namespace listings contain the map location.
> > Reported-by: "Yigal Korman" 
> > Cc: Dan Williams 
> > Signed-off-by: Vishal Verma 
> > ---
> >   util/json.c | 18 ++
> >   1 file changed, 18 insertions(+)
> > diff --git a/util/json.c b/util/json.c
> > index c606e1c..17dd90c 100644
> > --- a/util/json.c
> > +++ b/util/json.c
> > @@ -667,11 +667,17 @@ struct json_object *util_namespace_to_json(struct ndctl_namespace *ndns,
> >   {
> >  struct json_object *jndns = json_object_new_object();
> >  struct json_object *jobj, *jbbs = NULL;
> > +   const char *locations[] = {
> > +   [NDCTL_PFN_LOC_NONE] = "none",
> > +   [NDCTL_PFN_LOC_RAM] = "mem",
> > +   [NDCTL_PFN_LOC_PMEM] = "dev",
> > +   };
> >  unsigned long long size = ULLONG_MAX;
> >  unsigned int sector_size = UINT_MAX;
> >  enum ndctl_namespace_mode mode;
> >  const char *bdev = NULL, *name;
> >  unsigned int bb_count = 0;
> > +   enum ndctl_pfn_loc loc;
> >  struct ndctl_btt *btt;
> >  struct ndctl_pfn *pfn;
> >  struct ndctl_dax *dax;
> > @@ -749,6 +755,12 @@ struct json_object *util_namespace_to_json(struct ndctl_namespace *ndns,
> >  jobj = util_raw_uuid(ndns);
> >  if (jobj)
> >  json_object_object_add(jndns, "raw_uuid", jobj);
> > +   loc = ndctl_pfn_get_location(pfn);
> > +   jobj = json_object_new_string(locations[loc]);
> > +   if (!jobj)
> > +   goto err;
> > +   if (jobj)
> > +   json_object_object_add(jndns, "map", jobj);
> >  bdev = ndctl_pfn_get_block_device(pfn);
> >  } else if (dax) {
> >  struct daxctl_region *dax_region;
> > @@ -763,6 +775,12 @@ struct json_object *util_namespace_to_json(struct ndctl_namespace *ndns,
> >  jobj = util_raw_uuid(ndns);
> >  if (jobj)
> >  json_object_object_add(jndns, "raw_uuid", jobj);
> > +   loc = ndctl_dax_get_location(dax);
> > +   jobj = json_object_new_string(locations[loc]);
> > +   if (!jobj)
> > +   goto err;
> > +   if (jobj)
> > +   json_object_object_add(jndns, "map", jobj);
> >  if ((flags & UTIL_JSON_DAX) && dax_region) {
> >  jobj = util_daxctl_region_to_json(dax_region, NULL,
> >  flags);
> 
> There appears to be one case missing in this:
> 
>  case NDCTL_NS_MODE_MEMORY:
>  if (pfn) /* dynamic memory mode */
>  size = ndctl_pfn_get_size(pfn);
>  else /* native/static memory mode */
>  size = ndctl_namespace_get_size(ndns);
>  jobj = json_object_new_string("fsdax");
>  break;
> 
> In the "/* native/static memory mode */" configuration we should emit a
> 'map:"mem"' indication.

Ah good catch. Is this for legacy/labelless namespaces that have been
configured into fsdax mode?

I'll fixup and send a new version.


Re: 184306651 Business Process and Organizational Change

2018-05-29 Thread 魏先生
 