RE: [External] [PATCH v2 2/6] s390, dcssblk: kaddr and pfn can be NULL to ->direct_access()

2018-07-27 Thread Huaisheng HS1 Ye
Dear Maintainers of Linux-s390,

Greetings.

May I have your ack's for this patch? The whole series would be applied
to libnvdimm if it could get your approval.
And any suggestion is welcome.

Cheers,
Huaisheng Ye

> From: Huaisheng Ye 
> Sent: Thursday, July 26, 2018 12:29 AM
> 
> From: Huaisheng Ye 
> 
> dcssblk_direct_access() needs to check the validity of pointers kaddr
> and pfn for NULL assignment. If anyone equals to NULL, it doesn't need
> to calculate the value.
> 
> If either of them is equal to NULL, that is to say callers may
> have no need for kaddr or pfn, so this patch is prepared for allowing
> them to pass in NULL instead of having to pass in a pointer or local
> variable that they then just throw away.
> 
> Signed-off-by: Huaisheng Ye 
> ---
>  drivers/s390/block/dcssblk.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c
> index ed60728..23e526c 100644
> --- a/drivers/s390/block/dcssblk.c
> +++ b/drivers/s390/block/dcssblk.c
> @@ -922,9 +922,11 @@ static DEVICE_ATTR(save, S_IWUSR | S_IRUSR, 
> dcssblk_save_show,
>   unsigned long dev_sz;
> 
>   dev_sz = dev_info->end - dev_info->start + 1;
> - *kaddr = (void *) dev_info->start + offset;
> - *pfn = __pfn_to_pfn_t(PFN_DOWN(dev_info->start + offset),
> - PFN_DEV|PFN_SPECIAL);
> + if (kaddr)
> + *kaddr = (void *) dev_info->start + offset;
> + if (pfn)
> + *pfn = __pfn_to_pfn_t(PFN_DOWN(dev_info->start + offset),
> + PFN_DEV|PFN_SPECIAL);
> 
>   return (dev_sz - offset) / PAGE_SIZE;
>  }
> --
> 1.8.3.1
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH] ipc/shm.c add ->pagesize function to shm_vm_ops

2018-07-27 Thread Jane Chu

Hi, Andrew,

On 7/27/2018 2:50 PM, Andrew Morton wrote:


On Fri, 27 Jul 2018 15:17:27 -0600 Jane Chu  wrote:


Commit 05ea88608d4e13 (mm, hugetlbfs: introduce ->pagesize() to
vm_operations_struct) adds a new ->pagesize() function to
hugetlb_vm_ops, intended to cover all hugetlbfs backed files.

That was merged three months ago.  Can you suggest why this was only
noticed now?


The issue was recently reported by a QA engineer running Oracle database
test in Oracle Linux. He first noticed the issue in upstream 4.17, then 4.18,
but because the issue wasn't in Oracle product, it wasn't reported, not
until I cherry picked the patch into Oracle Linux recently.


What workload triggered this?  I see no cc:stable, but 4.17 is affected?


It's Oracle database workload. Large shared memory segments(SGAs) were created
and shared among dozens to hundreds of processes. The crash occurs when the
test stops the database workload.  I do not have access to the test source.
Yes, 4.17 is affected.


With System V shared memory model, if "huge page" is specified,
the "shared memory" is backed by hugetlbfs files, but the mappings
initiated via shmget/shmat have their original vm_ops overwritten
with shm_vm_ops, so we need to add a ->pagesize function to shm_vm_ops.
Otherwise, vma_kernel_pagesize() returns PAGE_SIZE given a hugetlbfs
backed vma, result in below BUG:

fs/hugetlbfs/inode.c
 443 if (unlikely(page_mapped(page))) {
 444 BUG_ON(truncate_op);

OK, help me out here.  How does an incorrect return value from
vma_kernel_pagesize() result in remove_inode_hugepages() deciding that
it's truncating a mapped page?


To be honest, I don't have a satisfactory answer to how the wrong
pagesize causes a page that's about to be truncated remain mapped.
I relied on the hind sight of BUG_ON(truncate_op).

At a time I inserted dump_stack() into vma_kernel_pagesize() as Mike
suggested to try to dig out more,

unsigned long vma_kernel_pagesize(struct vm_area_struct *vma)
{
-   if (vma->vm_ops && vma->vm_ops->pagesize)
+   if (vma->vm_ops && vma->vm_ops->pagesize) {
return vma->vm_ops->pagesize(vma);
+} else if (is_vm_hugetlb_page(vma)) {
+   struct hstate *hstate;
+   dump_stack();
+   hstate = hstate_vma(vma);
+   return 1UL << huge_page_shift(hstate);
+   }
return PAGE_SIZE;
}

There were too many stack traces that clogged the console, I didn't
capture the entire output, perhaps I should go back to capture them.

Any other ideas?

Regards,
-jane


___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


[PATCH v2 5/6] ndctl: add support for secure erase

2018-07-27 Thread Dave Jiang
Add support for secure erase to libndctl and also command line option
of "secure-erase" for ndctl. This will initiate the request to securely
erase a DIMM. ndctl does not actually handle the verification of the
security. That is handled by the kernel and the key upcall mechanism.

Signed-off-by: Dave Jiang 
---
 Documentation/ndctl/Makefile.am|1 +
 Documentation/ndctl/ndctl-secure-erase.txt |   21 +
 builtin.h  |1 +
 ndctl/dimm.c   |   21 +
 ndctl/lib/dimm.c   |5 +
 ndctl/lib/libndctl.sym |1 +
 ndctl/libndctl.h   |1 +
 ndctl/ndctl.c  |1 +
 8 files changed, 52 insertions(+)
 create mode 100644 Documentation/ndctl/ndctl-secure-erase.txt

diff --git a/Documentation/ndctl/Makefile.am b/Documentation/ndctl/Makefile.am
index 98348178..8a84c11c 100644
--- a/Documentation/ndctl/Makefile.am
+++ b/Documentation/ndctl/Makefile.am
@@ -49,6 +49,7 @@ man1_MANS = \
ndctl-update-security.1 \
ndctl-disable-security.1 \
ndctl-freeze-security.1 \
+   ndctl-secure-erase.1 \
ndctl-list.1
 
 CLEANFILES = $(man1_MANS)
diff --git a/Documentation/ndctl/ndctl-secure-erase.txt 
b/Documentation/ndctl/ndctl-secure-erase.txt
new file mode 100644
index ..8f82248a
--- /dev/null
+++ b/Documentation/ndctl/ndctl-secure-erase.txt
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0
+
+ndctl-secure-erase(1)
+=
+
+NAME
+
+ndctl-secure-erase - securely erasing the user data on the NVDIMM
+
+SYNOPSIS
+
+[verse]
+'ndctl secure-erase' 
+
+DESCRIPTION
+---
+Provide a generic interface to securely erase NVDIMM. This does not change
+the label storage area. The use of this depends on support from the underlying
+libndctl, kernel, as well as the platform itself.
+
+include::../copyright.txt[]
diff --git a/builtin.h b/builtin.h
index 6e6ae069..a9b6443e 100644
--- a/builtin.h
+++ b/builtin.h
@@ -50,4 +50,5 @@ int cmd_inject_smart(int argc, const char **argv, void *ctx);
 int cmd_update_security(int argc, const char **argv, void *ctx);
 int cmd_disable_security(int argc, const char **argv, void *ctx);
 int cmd_freeze_security(int argc, const char **argv, void *ctx);
+int cmd_secure_erase(int argc, const char **argv, void *ctx);
 #endif /* _NDCTL_BUILTIN_H_ */
diff --git a/ndctl/dimm.c b/ndctl/dimm.c
index 49097cff..45c9efce 100644
--- a/ndctl/dimm.c
+++ b/ndctl/dimm.c
@@ -857,6 +857,17 @@ static int action_security_freeze(struct ndctl_dimm *dimm,
return rc;
 }
 
+static int action_secure_erase(struct ndctl_dimm *dimm,
+   struct action_context *actx)
+{
+   int rc;
+
+   rc = ndctl_dimm_secure_erase(dimm);
+   if (rc < 0)
+   error("Failed to secure erase for %s\n",
+   ndctl_dimm_get_devname(dimm));
+   return rc;
+}
 static struct parameters {
const char *bus;
const char *outfile;
@@ -1244,3 +1255,13 @@ int cmd_freeze_security(int argc, const char **argv, 
void *ctx)
count > 1 ? "s" : "");
return count >= 0 ? 0 : EXIT_FAILURE;
 }
+
+int cmd_secure_erase(int argc, const char **argv, void *ctx)
+{
+   int count = dimm_action(argc, argv, ctx, action_secure_erase, 
base_options,
+   "ndctl secure-erase  [..] 
[]");
+
+   fprintf(stderr, "secure erased %d nmem%s.\n", count >= 0 ? count : 0,
+   count > 1 ? "s" : "");
+   return count >= 0 ? 0 : EXIT_FAILURE;
+}
diff --git a/ndctl/lib/dimm.c b/ndctl/lib/dimm.c
index 2ff68daa..bb896d21 100644
--- a/ndctl/lib/dimm.c
+++ b/ndctl/lib/dimm.c
@@ -625,3 +625,8 @@ NDCTL_EXPORT int ndctl_dimm_freeze_security(struct 
ndctl_dimm *dimm)
 {
return ndctl_dimm_write_security(dimm, "freeze");
 }
+
+NDCTL_EXPORT int ndctl_dimm_secure_erase(struct ndctl_dimm *dimm)
+{
+   return ndctl_dimm_write_security(dimm, "erase");
+}
diff --git a/ndctl/lib/libndctl.sym b/ndctl/lib/libndctl.sym
index 613d9bb2..08727f94 100644
--- a/ndctl/lib/libndctl.sym
+++ b/ndctl/lib/libndctl.sym
@@ -375,4 +375,5 @@ global:
ndctl_dimm_set_change_key;
ndctl_dimm_disable_security;
ndctl_dimm_freeze_security;
+   ndctl_dimm_secure_erase;
 } LIBNDCTL_16;
diff --git a/ndctl/libndctl.h b/ndctl/libndctl.h
index 4615ba09..88ba8a29 100644
--- a/ndctl/libndctl.h
+++ b/ndctl/libndctl.h
@@ -663,6 +663,7 @@ int ndctl_dimm_get_security_state(struct ndctl_dimm *dimm, 
char *state);
 int ndctl_dimm_set_change_key(struct ndctl_dimm *dimm);
 int ndctl_dimm_disable_security(struct ndctl_dimm *dimm);
 int ndctl_dimm_freeze_security(struct ndctl_dimm *dimm);
+int ndctl_dimm_secure_erase(struct ndctl_dimm *dimm);
 
 #ifdef __cplusplus
 } /* extern "C" */
diff --git a/ndctl/ndctl.c b/ndctl/ndctl.c
index 5a1a81e4..b1a91033 100644
--- 

[PATCH v2 6/6] ndctl: add request-key upcall reference app

2018-07-27 Thread Dave Jiang
Adding a reference upcall helper for request-key in order to retrieve the
security passphrase from userspace to provide to the kernel. The reference
app uses keyutils API to respond to the upcall from the kernel and is
invoked by /sbin/request-key of the keyutils.

Signed-off-by: Dave Jiang 
---
 Documentation/ndctl/Makefile.am   |3 -
 Documentation/ndctl/nvdimm-upcall.txt |   33 
 configure.ac  |1 
 ndctl.spec.in |2 
 ndctl/Makefile.am |5 +
 ndctl/nvdimm-upcall.c |  138 +
 6 files changed, 181 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/ndctl/nvdimm-upcall.txt
 create mode 100644 ndctl/nvdimm-upcall.c

diff --git a/Documentation/ndctl/Makefile.am b/Documentation/ndctl/Makefile.am
index 8a84c11c..d61e5f9a 100644
--- a/Documentation/ndctl/Makefile.am
+++ b/Documentation/ndctl/Makefile.am
@@ -50,7 +50,8 @@ man1_MANS = \
ndctl-disable-security.1 \
ndctl-freeze-security.1 \
ndctl-secure-erase.1 \
-   ndctl-list.1
+   ndctl-list.1 \
+   nvdimm-upcall.1
 
 CLEANFILES = $(man1_MANS)
 
diff --git a/Documentation/ndctl/nvdimm-upcall.txt 
b/Documentation/ndctl/nvdimm-upcall.txt
new file mode 100644
index ..bbf52fd1
--- /dev/null
+++ b/Documentation/ndctl/nvdimm-upcall.txt
@@ -0,0 +1,33 @@
+// SPDX-License-Identifier: GPL-2.0
+
+nvdimm-upcall(1)
+
+
+NAME
+
+nvdimm-upcall - upcall helper for keyutils
+
+SYNOPSIS
+
+[verse]
+'nvdimm-upcall' [key_id]
+
+DESCRIPTION
+---
+nvdimm-upcall is called by request-key from keyutils and not meant to be used
+directly by the user. It expects to read from /etc/nvdimm.passwd to retrieve
+the description and passphrase for the NVDIMM key.
+
+The nvdimm.passwd is formatted as:
+:
+cdab-0a-07e0-feff:
+
+In order for this util to be called, /etc/request-key.conf must be appended
+with the following line:
+create   logon   nvdimm*  *   /usr/sbin/nvdimm-upcall %k
+
+include::../copyright.txt[]
+
+SEE ALSO
+
+http://pmem.io/documents/NVDIMM_DSM_Interface-V1.7.pdf
diff --git a/configure.ac b/configure.ac
index cf442607..4b125919 100644
--- a/configure.ac
+++ b/configure.ac
@@ -114,6 +114,7 @@ PKG_CHECK_MODULES([KMOD], [libkmod])
 PKG_CHECK_MODULES([UDEV], [libudev])
 PKG_CHECK_MODULES([UUID], [uuid])
 PKG_CHECK_MODULES([JSON], [json-c])
+PKG_CHECK_MODULES([KEYUTILS], [keyutils])
 
 AC_ARG_WITH([bash-completion-dir],
AS_HELP_STRING([--with-bash-completion-dir[=PATH]],
diff --git a/ndctl.spec.in b/ndctl.spec.in
index e2c879ca..342f01d7 100644
--- a/ndctl.spec.in
+++ b/ndctl.spec.in
@@ -20,6 +20,7 @@ BuildRequires:pkgconfig(libudev)
 BuildRequires: pkgconfig(uuid)
 BuildRequires: pkgconfig(json-c)
 BuildRequires: pkgconfig(bash-completion)
+BuildRequires:  pkgconfig(keyutils)
 
 %description
 Utility library for managing the "libnvdimm" subsystem.  The "libnvdimm"
@@ -116,6 +117,7 @@ make check
 %{_bindir}/ndctl
 %{_mandir}/man1/ndctl*
 %{bashcompdir}/
+%{_sbindir}/nvdimm-upcall
 
 %files -n daxctl
 %defattr(-,root,root)
diff --git a/ndctl/Makefile.am b/ndctl/Makefile.am
index 0f568719..d6401df9 100644
--- a/ndctl/Makefile.am
+++ b/ndctl/Makefile.am
@@ -1,6 +1,7 @@
 include $(top_srcdir)/Makefile.am.in
 
 bin_PROGRAMS = ndctl
+sbin_PROGRAMS = nvdimm-upcall
 
 ndctl_SOURCES = ndctl.c \
bus.c \
@@ -41,3 +42,7 @@ ndctl_SOURCES += ../test/libndctl.c \
 ../test/core.c \
 test.c
 endif
+
+nvdimm_upcall_SOURCES = nvdimm-upcall.c
+
+nvdimm_upcall_LDADD = $(KEYUTILS_LIBS)
diff --git a/ndctl/nvdimm-upcall.c b/ndctl/nvdimm-upcall.c
new file mode 100644
index ..9406b23a
--- /dev/null
+++ b/ndctl/nvdimm-upcall.c
@@ -0,0 +1,138 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2018 Intel Corporation. All rights reserved. */
+
+/*
+ * Used by /sbin/request-key for handling nvdimm upcall of key requests
+ * for security DSMs.
+ */
+
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define PASSPHRASE_SIZE32
+#define PASS_PATH  "/etc/nvdimm.passwd"
+
+static FILE *fp;
+
+static int get_passphrase_from_id(const char *id, char *pass, int *psize)
+{
+   ssize_t rc = 0;
+   size_t size;
+   char *line = NULL;
+   char *tmp, *id_tok, *secret;
+   int found = 0;
+
+   fp = fopen(PASS_PATH, "r+");
+   if (!fp) {
+   syslog(LOG_ERR, "fopen: %s\n", strerror(errno));
+   return -errno;
+   }
+
+   while ((rc = getline(, , fp)) != -1) {
+   id_tok = strtok_r(line, ":", );
+   if (!id_tok)
+   break;
+   if (strcmp(id_tok, id) == 0) {
+   secret = tmp;
+   found = 1;
+   rc 

[PATCH v2 2/6] ndctl: add update to security support

2018-07-27 Thread Dave Jiang
Add API call for triggering sysfs knob to update the security for a DIMM
in libndctl. Also add the ndctl "update-security" to trigger that as well.
ndctl does not actually handle the passphrase file. It only initiates the
update and expects all the necessary mechanisms are already in place.

Signed-off-by: Dave Jiang 
---
 Documentation/ndctl/Makefile.am   |1 +
 Documentation/ndctl/ndctl-update-security.txt |   21 +
 builtin.h |1 +
 ndctl/dimm.c  |   22 ++
 ndctl/lib/dimm.c  |   20 
 ndctl/lib/libndctl.sym|1 +
 ndctl/libndctl.h  |1 +
 ndctl/ndctl.c |1 +
 8 files changed, 68 insertions(+)
 create mode 100644 Documentation/ndctl/ndctl-update-security.txt

diff --git a/Documentation/ndctl/Makefile.am b/Documentation/ndctl/Makefile.am
index 4fd9636f..02405c44 100644
--- a/Documentation/ndctl/Makefile.am
+++ b/Documentation/ndctl/Makefile.am
@@ -46,6 +46,7 @@ man1_MANS = \
ndctl-inject-error.1 \
ndctl-inject-smart.1 \
ndctl-update-firmware.1 \
+   ndctl-update-security.1 \
ndctl-list.1
 
 CLEANFILES = $(man1_MANS)
diff --git a/Documentation/ndctl/ndctl-update-security.txt 
b/Documentation/ndctl/ndctl-update-security.txt
new file mode 100644
index ..38ca02fa
--- /dev/null
+++ b/Documentation/ndctl/ndctl-update-security.txt
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0
+
+ndctl-update-security(1)
+
+
+NAME
+
+ndctl-update-security - enabling or update the security passphrase for an 
NVDIMM
+
+SYNOPSIS
+
+[verse]
+'ndctl update-security' 
+
+DESCRIPTION
+---
+Provide a generic interface for enabling or updating security passphrase for
+NVDIMM. The use of this depends on support from the underlying libndctl, 
kernel,
+as well as the platform itself.
+
+include::../copyright.txt[]
diff --git a/builtin.h b/builtin.h
index d3cc7239..e6dc5340 100644
--- a/builtin.h
+++ b/builtin.h
@@ -47,4 +47,5 @@ int cmd_bat(int argc, const char **argv, void *ctx);
 #endif
 int cmd_update_firmware(int argc, const char **argv, void *ctx);
 int cmd_inject_smart(int argc, const char **argv, void *ctx);
+int cmd_update_security(int argc, const char **argv, void *ctx);
 #endif /* _NDCTL_BUILTIN_H_ */
diff --git a/ndctl/dimm.c b/ndctl/dimm.c
index 97643a3c..274714d6 100644
--- a/ndctl/dimm.c
+++ b/ndctl/dimm.c
@@ -821,6 +821,18 @@ static int action_update(struct ndctl_dimm *dimm, struct 
action_context *actx)
return rc;
 }
 
+static int action_security_update(struct ndctl_dimm *dimm,
+   struct action_context *actx)
+{
+   int rc;
+
+   rc = ndctl_dimm_set_change_key(dimm);
+   if (rc < 0)
+   error("Failed to update security for %s\n",
+   ndctl_dimm_get_devname(dimm));
+   return rc;
+}
+
 static struct parameters {
const char *bus;
const char *outfile;
@@ -1178,3 +1190,13 @@ int cmd_update_firmware(int argc, const char **argv, 
void *ctx)
count > 1 ? "s" : "");
return count >= 0 ? 0 : EXIT_FAILURE;
 }
+
+int cmd_update_security(int argc, const char **argv, void *ctx)
+{
+   int count = dimm_action(argc, argv, ctx, action_security_update, 
base_options,
+   "ndctl update-security  [..] 
[]");
+
+   fprintf(stderr, "security updated %d nmem%s.\n", count >= 0 ? count : 0,
+   count > 1 ? "s" : "");
+   return count >= 0 ? 0 : EXIT_FAILURE;
+}
diff --git a/ndctl/lib/dimm.c b/ndctl/lib/dimm.c
index 8ed58559..ca4d439b 100644
--- a/ndctl/lib/dimm.c
+++ b/ndctl/lib/dimm.c
@@ -595,3 +595,23 @@ NDCTL_EXPORT int ndctl_dimm_get_security_state(struct 
ndctl_dimm *dimm,
 
return sysfs_read_attr(ctx, path, state);
 }
+
+static int ndctl_dimm_write_security(struct ndctl_dimm *dimm, const char *cmd)
+{
+   struct ndctl_ctx *ctx = ndctl_dimm_get_ctx(dimm);
+   char *path = dimm->dimm_buf;
+   int len = dimm->buf_len;
+
+   if (snprintf(path, len, "%s/security", dimm->dimm_path) >= len) {
+   err(ctx, "%s: buffer too small!\n",
+   ndctl_dimm_get_devname(dimm));
+   return -ERANGE;
+   }
+
+   return sysfs_write_attr(ctx, path, cmd);
+}
+
+NDCTL_EXPORT int ndctl_dimm_set_change_key(struct ndctl_dimm *dimm)
+{
+   return ndctl_dimm_write_security(dimm, "update");
+}
diff --git a/ndctl/lib/libndctl.sym b/ndctl/lib/libndctl.sym
index 8040d7de..e2a359b8 100644
--- a/ndctl/lib/libndctl.sym
+++ b/ndctl/lib/libndctl.sym
@@ -372,4 +372,5 @@ LIBNDCTL_17 {
 global:
ndctl_dimm_smart_inject_supported;
ndctl_dimm_get_security_state;
+   ndctl_dimm_set_change_key;
 } LIBNDCTL_16;
diff --git a/ndctl/libndctl.h b/ndctl/libndctl.h

[PATCH v2 0/6] ndctl: add security support

2018-07-27 Thread Dave Jiang
The following series implements mechanisms that utilize the sysfs knobs
provided by the kernel in order to support the Intel DSM v1.7 spec
that provides security to NVDIMM. The following abilities are added:
1. display security state
2. update security
3. disable security
4. freeze security
5. secure erase

Also a reference helper app is provided to retrieve security information
through the keyutils and kernel key management API.

v2:
- Fixup the upcall util to match recent kernel updates for nvdimm security.

---

Dave Jiang (6):
  ndctl: add support for display security state
  ndctl: add update to security support
  ndctl: add disable security support
  ndctl: add support for freeze security
  ndctl: add support for secure erase
  ndctl: add request-key upcall reference app


 Documentation/ndctl/Makefile.am|7 +
 Documentation/ndctl/ndctl-disable-security.txt |   21 
 Documentation/ndctl/ndctl-freeze-security.txt  |   21 
 Documentation/ndctl/ndctl-list.txt |8 +
 Documentation/ndctl/ndctl-secure-erase.txt |   21 
 Documentation/ndctl/ndctl-update-security.txt  |   21 
 Documentation/ndctl/nvdimm-upcall.txt  |   33 ++
 builtin.h  |4 +
 configure.ac   |1 
 ndctl.spec.in  |2 
 ndctl/Makefile.am  |5 +
 ndctl/dimm.c   |   87 +++
 ndctl/lib/dimm.c   |   51 +
 ndctl/lib/libndctl.sym |5 +
 ndctl/libndctl.h   |5 +
 ndctl/ndctl.c  |4 +
 ndctl/nvdimm-upcall.c  |  138 
 util/json.c|8 +
 18 files changed, 441 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/ndctl/ndctl-disable-security.txt
 create mode 100644 Documentation/ndctl/ndctl-freeze-security.txt
 create mode 100644 Documentation/ndctl/ndctl-secure-erase.txt
 create mode 100644 Documentation/ndctl/ndctl-update-security.txt
 create mode 100644 Documentation/ndctl/nvdimm-upcall.txt
 create mode 100644 ndctl/nvdimm-upcall.c

--
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


[PATCH v2 3/6] ndctl: add disable security support

2018-07-27 Thread Dave Jiang
Add support for disable security to libndctl and also command line option
of "disable-security" for ndctl. This provides a way to disable security
on the nvdimm. ndctl does not handle the actual processing of the
passphrase. It only starts the request.

Signed-off-by: Dave Jiang 
---
 Documentation/ndctl/Makefile.am|1 +
 Documentation/ndctl/ndctl-disable-security.txt |   21 +
 builtin.h  |1 +
 ndctl/dimm.c   |   22 ++
 ndctl/lib/dimm.c   |5 +
 ndctl/lib/libndctl.sym |1 +
 ndctl/libndctl.h   |1 +
 ndctl/ndctl.c  |1 +
 8 files changed, 53 insertions(+)
 create mode 100644 Documentation/ndctl/ndctl-disable-security.txt

diff --git a/Documentation/ndctl/Makefile.am b/Documentation/ndctl/Makefile.am
index 02405c44..b1cb3b5b 100644
--- a/Documentation/ndctl/Makefile.am
+++ b/Documentation/ndctl/Makefile.am
@@ -47,6 +47,7 @@ man1_MANS = \
ndctl-inject-smart.1 \
ndctl-update-firmware.1 \
ndctl-update-security.1 \
+   ndctl-disable-security.1 \
ndctl-list.1
 
 CLEANFILES = $(man1_MANS)
diff --git a/Documentation/ndctl/ndctl-disable-security.txt 
b/Documentation/ndctl/ndctl-disable-security.txt
new file mode 100644
index ..651f8d03
--- /dev/null
+++ b/Documentation/ndctl/ndctl-disable-security.txt
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0
+
+ndctl-disable-security(1)
+
+
+NAME
+
+ndctl-disable-security - enabling or disable security for an NVDIMM
+
+SYNOPSIS
+
+[verse]
+'ndctl disable-security' 
+
+DESCRIPTION
+---
+Provide a generic interface for disabling security for NVDIMM. The use of this
+depends on support from the underlying libndctl, kernel, as well as the
+platform itself.
+
+include::../copyright.txt[]
diff --git a/builtin.h b/builtin.h
index e6dc5340..e8ba9aee 100644
--- a/builtin.h
+++ b/builtin.h
@@ -48,4 +48,5 @@ int cmd_bat(int argc, const char **argv, void *ctx);
 int cmd_update_firmware(int argc, const char **argv, void *ctx);
 int cmd_inject_smart(int argc, const char **argv, void *ctx);
 int cmd_update_security(int argc, const char **argv, void *ctx);
+int cmd_disable_security(int argc, const char **argv, void *ctx);
 #endif /* _NDCTL_BUILTIN_H_ */
diff --git a/ndctl/dimm.c b/ndctl/dimm.c
index 274714d6..e64f6717 100644
--- a/ndctl/dimm.c
+++ b/ndctl/dimm.c
@@ -833,6 +833,18 @@ static int action_security_update(struct ndctl_dimm *dimm,
return rc;
 }
 
+static int action_security_disable(struct ndctl_dimm *dimm,
+   struct action_context *actx)
+{
+   int rc;
+
+   rc = ndctl_dimm_disable_security(dimm);
+   if (rc < 0)
+   error("Failed to disable security for %s\n",
+   ndctl_dimm_get_devname(dimm));
+   return rc;
+}
+
 static struct parameters {
const char *bus;
const char *outfile;
@@ -1200,3 +1212,13 @@ int cmd_update_security(int argc, const char **argv, 
void *ctx)
count > 1 ? "s" : "");
return count >= 0 ? 0 : EXIT_FAILURE;
 }
+
+int cmd_disable_security(int argc, const char **argv, void *ctx)
+{
+   int count = dimm_action(argc, argv, ctx, action_security_disable, 
base_options,
+   "ndctl disable-security  [..] 
[]");
+
+   fprintf(stderr, "security disabled %d nmem%s.\n", count >= 0 ? count : 
0,
+   count > 1 ? "s" : "");
+   return count >= 0 ? 0 : EXIT_FAILURE;
+}
diff --git a/ndctl/lib/dimm.c b/ndctl/lib/dimm.c
index ca4d439b..13c2806c 100644
--- a/ndctl/lib/dimm.c
+++ b/ndctl/lib/dimm.c
@@ -615,3 +615,8 @@ NDCTL_EXPORT int ndctl_dimm_set_change_key(struct 
ndctl_dimm *dimm)
 {
return ndctl_dimm_write_security(dimm, "update");
 }
+
+NDCTL_EXPORT int ndctl_dimm_disable_security(struct ndctl_dimm *dimm)
+{
+   return ndctl_dimm_write_security(dimm, "disable");
+}
diff --git a/ndctl/lib/libndctl.sym b/ndctl/lib/libndctl.sym
index e2a359b8..1a919567 100644
--- a/ndctl/lib/libndctl.sym
+++ b/ndctl/lib/libndctl.sym
@@ -373,4 +373,5 @@ global:
ndctl_dimm_smart_inject_supported;
ndctl_dimm_get_security_state;
ndctl_dimm_set_change_key;
+   ndctl_dimm_disable_security;
 } LIBNDCTL_16;
diff --git a/ndctl/libndctl.h b/ndctl/libndctl.h
index 663847b6..1ca6cb18 100644
--- a/ndctl/libndctl.h
+++ b/ndctl/libndctl.h
@@ -661,6 +661,7 @@ struct ndctl_cmd 
*ndctl_dimm_cmd_new_ack_shutdown_count(struct ndctl_dimm *dimm)
 int ndctl_dimm_fw_update_supported(struct ndctl_dimm *dimm);
 int ndctl_dimm_get_security_state(struct ndctl_dimm *dimm, char *state);
 int ndctl_dimm_set_change_key(struct ndctl_dimm *dimm);
+int ndctl_dimm_disable_security(struct ndctl_dimm *dimm);
 
 #ifdef __cplusplus
 } /* extern "C" */
diff --git 

[PATCH v2 4/6] ndctl: add support for freeze security

2018-07-27 Thread Dave Jiang
Add support for freeze security to libndctl and also command line option
of "freeze-security" for ndctl. This will lock the ability to make changes
to the NVDIMM security.

Signed-off-by: Dave Jiang 
---
 Documentation/ndctl/Makefile.am   |1 +
 Documentation/ndctl/ndctl-freeze-security.txt |   21 +
 builtin.h |1 +
 ndctl/dimm.c  |   22 ++
 ndctl/lib/dimm.c  |5 +
 ndctl/lib/libndctl.sym|1 +
 ndctl/libndctl.h  |1 +
 ndctl/ndctl.c |1 +
 8 files changed, 53 insertions(+)
 create mode 100644 Documentation/ndctl/ndctl-freeze-security.txt

diff --git a/Documentation/ndctl/Makefile.am b/Documentation/ndctl/Makefile.am
index b1cb3b5b..98348178 100644
--- a/Documentation/ndctl/Makefile.am
+++ b/Documentation/ndctl/Makefile.am
@@ -48,6 +48,7 @@ man1_MANS = \
ndctl-update-firmware.1 \
ndctl-update-security.1 \
ndctl-disable-security.1 \
+   ndctl-freeze-security.1 \
ndctl-list.1
 
 CLEANFILES = $(man1_MANS)
diff --git a/Documentation/ndctl/ndctl-freeze-security.txt 
b/Documentation/ndctl/ndctl-freeze-security.txt
new file mode 100644
index ..343bb019
--- /dev/null
+++ b/Documentation/ndctl/ndctl-freeze-security.txt
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0
+
+ndctl-freeze-security(1)
+
+
+NAME
+
+ndctl-freeze-security - enabling or freeze the security for an NVDIMM
+
+SYNOPSIS
+
+[verse]
+'ndctl freeze-security' 
+
+DESCRIPTION
+---
+Provide a generic interface to freeze the security for NVDIMM. The use of this
+depends on support from the underlying libndctl, kernel, as well as the
+platform itself.
+
+include::../copyright.txt[]
diff --git a/builtin.h b/builtin.h
index e8ba9aee..6e6ae069 100644
--- a/builtin.h
+++ b/builtin.h
@@ -49,4 +49,5 @@ int cmd_update_firmware(int argc, const char **argv, void 
*ctx);
 int cmd_inject_smart(int argc, const char **argv, void *ctx);
 int cmd_update_security(int argc, const char **argv, void *ctx);
 int cmd_disable_security(int argc, const char **argv, void *ctx);
+int cmd_freeze_security(int argc, const char **argv, void *ctx);
 #endif /* _NDCTL_BUILTIN_H_ */
diff --git a/ndctl/dimm.c b/ndctl/dimm.c
index e64f6717..49097cff 100644
--- a/ndctl/dimm.c
+++ b/ndctl/dimm.c
@@ -845,6 +845,18 @@ static int action_security_disable(struct ndctl_dimm *dimm,
return rc;
 }
 
+static int action_security_freeze(struct ndctl_dimm *dimm,
+   struct action_context *actx)
+{
+   int rc;
+
+   rc = ndctl_dimm_freeze_security(dimm);
+   if (rc < 0)
+   error("Failed to freeze security for %s\n",
+   ndctl_dimm_get_devname(dimm));
+   return rc;
+}
+
 static struct parameters {
const char *bus;
const char *outfile;
@@ -1222,3 +1234,13 @@ int cmd_disable_security(int argc, const char **argv, 
void *ctx)
count > 1 ? "s" : "");
return count >= 0 ? 0 : EXIT_FAILURE;
 }
+
+int cmd_freeze_security(int argc, const char **argv, void *ctx)
+{
+   int count = dimm_action(argc, argv, ctx, action_security_freeze, 
base_options,
+   "ndctl freeze-security  [..] 
[]");
+
+   fprintf(stderr, "security freezed %d nmem%s.\n", count >= 0 ? count : 0,
+   count > 1 ? "s" : "");
+   return count >= 0 ? 0 : EXIT_FAILURE;
+}
diff --git a/ndctl/lib/dimm.c b/ndctl/lib/dimm.c
index 13c2806c..2ff68daa 100644
--- a/ndctl/lib/dimm.c
+++ b/ndctl/lib/dimm.c
@@ -620,3 +620,8 @@ NDCTL_EXPORT int ndctl_dimm_disable_security(struct 
ndctl_dimm *dimm)
 {
return ndctl_dimm_write_security(dimm, "disable");
 }
+
+NDCTL_EXPORT int ndctl_dimm_freeze_security(struct ndctl_dimm *dimm)
+{
+   return ndctl_dimm_write_security(dimm, "freeze");
+}
diff --git a/ndctl/lib/libndctl.sym b/ndctl/lib/libndctl.sym
index 1a919567..613d9bb2 100644
--- a/ndctl/lib/libndctl.sym
+++ b/ndctl/lib/libndctl.sym
@@ -374,4 +374,5 @@ global:
ndctl_dimm_get_security_state;
ndctl_dimm_set_change_key;
ndctl_dimm_disable_security;
+   ndctl_dimm_freeze_security;
 } LIBNDCTL_16;
diff --git a/ndctl/libndctl.h b/ndctl/libndctl.h
index 1ca6cb18..4615ba09 100644
--- a/ndctl/libndctl.h
+++ b/ndctl/libndctl.h
@@ -662,6 +662,7 @@ int ndctl_dimm_fw_update_supported(struct ndctl_dimm *dimm);
 int ndctl_dimm_get_security_state(struct ndctl_dimm *dimm, char *state);
 int ndctl_dimm_set_change_key(struct ndctl_dimm *dimm);
 int ndctl_dimm_disable_security(struct ndctl_dimm *dimm);
+int ndctl_dimm_freeze_security(struct ndctl_dimm *dimm);
 
 #ifdef __cplusplus
 } /* extern "C" */
diff --git a/ndctl/ndctl.c b/ndctl/ndctl.c
index eeef41da..5a1a81e4 100644
--- a/ndctl/ndctl.c
+++ b/ndctl/ndctl.c
@@ -90,6 +90,7 @@ 

[PATCH v2 1/6] ndctl: add support for display security state

2018-07-27 Thread Dave Jiang
Adding libndctl API call for retrieving security state for a DIMM and also
adding support to ndctl list for displaying security state.

Signed-off-by: Dave Jiang 
---
 Documentation/ndctl/ndctl-list.txt |8 
 ndctl/lib/dimm.c   |   16 
 ndctl/lib/libndctl.sym |1 +
 ndctl/libndctl.h   |1 +
 util/json.c|8 
 5 files changed, 34 insertions(+)

diff --git a/Documentation/ndctl/ndctl-list.txt 
b/Documentation/ndctl/ndctl-list.txt
index 13ebdcdd..57eac1fe 100644
--- a/Documentation/ndctl/ndctl-list.txt
+++ b/Documentation/ndctl/ndctl-list.txt
@@ -97,6 +97,14 @@ include::xable-region-options.txt[]
 -D::
 --dimms::
Include dimm info in the listing
+[verse]
+{
+  "dev":"nmem0",
+  "id":"cdab-0a-07e0-",
+  "handle":0,
+  "phys_id":0,
+  "security_state:":"disabled"
+}
 
 -H::
 --health::
diff --git a/ndctl/lib/dimm.c b/ndctl/lib/dimm.c
index b3e032e0..8ed58559 100644
--- a/ndctl/lib/dimm.c
+++ b/ndctl/lib/dimm.c
@@ -579,3 +579,19 @@ NDCTL_EXPORT unsigned long ndctl_dimm_get_available_labels(
 
return strtoul(buf, NULL, 0);
 }
+
+NDCTL_EXPORT int ndctl_dimm_get_security_state(struct ndctl_dimm *dimm,
+   char *state)
+{
+   struct ndctl_ctx *ctx = ndctl_dimm_get_ctx(dimm);
+   char *path = dimm->dimm_buf;
+   int len = dimm->buf_len;
+
+   if (snprintf(path, len, "%s/security", dimm->dimm_path) >= len) {
+   err(ctx, "%s: buffer too small!\n",
+   ndctl_dimm_get_devname(dimm));
+   return -ERANGE;
+   }
+
+   return sysfs_read_attr(ctx, path, state);
+}
diff --git a/ndctl/lib/libndctl.sym b/ndctl/lib/libndctl.sym
index 8932ef66..8040d7de 100644
--- a/ndctl/lib/libndctl.sym
+++ b/ndctl/lib/libndctl.sym
@@ -371,4 +371,5 @@ global:
 LIBNDCTL_17 {
 global:
ndctl_dimm_smart_inject_supported;
+   ndctl_dimm_get_security_state;
 } LIBNDCTL_16;
diff --git a/ndctl/libndctl.h b/ndctl/libndctl.h
index 8a96c846..d48513ce 100644
--- a/ndctl/libndctl.h
+++ b/ndctl/libndctl.h
@@ -659,6 +659,7 @@ unsigned long long ndctl_cmd_fw_fquery_get_fw_rev(struct 
ndctl_cmd *cmd);
 enum ND_FW_STATUS ndctl_cmd_fw_xlat_firmware_status(struct ndctl_cmd *cmd);
 struct ndctl_cmd *ndctl_dimm_cmd_new_ack_shutdown_count(struct ndctl_dimm 
*dimm);
 int ndctl_dimm_fw_update_supported(struct ndctl_dimm *dimm);
+int ndctl_dimm_get_security_state(struct ndctl_dimm *dimm, char *state);
 
 #ifdef __cplusplus
 } /* extern "C" */
diff --git a/util/json.c b/util/json.c
index 13324588..90f05528 100644
--- a/util/json.c
+++ b/util/json.c
@@ -164,6 +164,7 @@ struct json_object *util_dimm_to_json(struct ndctl_dimm 
*dimm,
unsigned int handle = ndctl_dimm_get_handle(dimm);
unsigned short phys_id = ndctl_dimm_get_phys_id(dimm);
struct json_object *jobj;
+   char security_state[32];
 
if (!jdimm)
return NULL;
@@ -243,6 +244,13 @@ struct json_object *util_dimm_to_json(struct ndctl_dimm 
*dimm,
json_object_object_add(jdimm, "flag_smart_event", jobj);
}
 
+   if (ndctl_dimm_get_security_state(dimm, security_state) == 0) {
+   jobj = json_object_new_string(security_state);
+   if (!jobj)
+   goto err;
+   json_object_object_add(jdimm, "security_state:", jobj);
+   }
+
return jdimm;
  err:
json_object_put(jdimm);

___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH] ipc/shm.c add ->pagesize function to shm_vm_ops

2018-07-27 Thread Andrew Morton
On Fri, 27 Jul 2018 15:17:27 -0600 Jane Chu  wrote:

> Commit 05ea88608d4e13 (mm, hugetlbfs: introduce ->pagesize() to
> vm_operations_struct) adds a new ->pagesize() function to
> hugetlb_vm_ops, intended to cover all hugetlbfs backed files.

That was merged three months ago.  Can you suggest why this was only
noticed now?

What workload triggered this?  I see no cc:stable, but 4.17 is affected?

> With System V shared memory model, if "huge page" is specified,
> the "shared memory" is backed by hugetlbfs files, but the mappings
> initiated via shmget/shmat have their original vm_ops overwritten
> with shm_vm_ops, so we need to add a ->pagesize function to shm_vm_ops.
> Otherwise, vma_kernel_pagesize() returns PAGE_SIZE given a hugetlbfs
> backed vma, result in below BUG:
> 
> fs/hugetlbfs/inode.c
> 443 if (unlikely(page_mapped(page))) {
> 444 BUG_ON(truncate_op);

OK, help me out here.  How does an incorrect return value from
vma_kernel_pagesize() result in remove_inode_hugepages() deciding that
it's truncating a mapped page?


___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH] ipc/shm.c add ->pagesize function to shm_vm_ops

2018-07-27 Thread Mike Kravetz
On 07/27/2018 02:17 PM, Jane Chu wrote:
> Commit 05ea88608d4e13 (mm, hugetlbfs: introduce ->pagesize() to
> vm_operations_struct) adds a new ->pagesize() function to
> hugetlb_vm_ops, intended to cover all hugetlbfs backed files.

Thanks Jane!
Adding Dan on Cc as he authored 05ea88608d4e13.  Note that this is the
same type of omission that was made when adding ->split() to
vm_operations_struct. :(  That is why I suggested adding the comment
above vm_operations_struct.

> With System V shared memory model, if "huge page" is specified,
> the "shared memory" is backed by hugetlbfs files, but the mappings
> initiated via shmget/shmat have their original vm_ops overwritten
> with shm_vm_ops, so we need to add a ->pagesize function to shm_vm_ops.
> Otherwise, vma_kernel_pagesize() returns PAGE_SIZE given a hugetlbfs
> backed vma, result in below BUG:
> 
> fs/hugetlbfs/inode.c
> 443 if (unlikely(page_mapped(page))) {
> 444 BUG_ON(truncate_op);
> 
> [  242.268342] hugetlbfs: oracle (4592): Using mlock ulimits for SHM_HUGETLB 
> is deprecated
> [  282.653208] [ cut here ]
> [  282.708447] kernel BUG at fs/hugetlbfs/inode.c:444!
> [  282.818957] Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 ...
> [  284.025873] CPU: 35 PID: 5583 Comm: oracle_5583_sbt Not tainted 
> 4.14.35-1829.el7uek.x86_64 #2
> [  284.246609] task: 9bf0507aaf80 task.stack: a9e625628000
> [  284.317455] RIP: 0010:remove_inode_hugepages+0x3db/0x3e2
> 
> [  285.292389] Call Trace:
> [  285.321630]  hugetlbfs_evict_inode+0x1e/0x3e
> [  285.372707]  evict+0xdb/0x1af
> [  285.408185]  iput+0x1a2/0x1f7
> [  285.443661]  dentry_unlink_inode+0xc6/0xf0
> [  285.492661]  __dentry_kill+0xd8/0x18d
> [  285.536459]  dput+0x1b5/0x1ed
> [  285.571939]  __fput+0x18b/0x216
> [  285.609495]  fput+0xe/0x10
> [  285.646030]  task_work_run+0x90/0xa7
> [  285.688788]  exit_to_usermode_loop+0xdd/0x116
> [  285.740905]  do_syscall_64+0x187/0x1ae
> [  285.785740]  entry_SYSCALL_64_after_hwframe+0x150/0x0

We will need the tag,
Fixes: 05ea88608d4e13 ("mm, hugetlbfs: introduce ->pagesize() to 
vm_operations_struct")
and,
CC: sta...@vger.kernel.org

> Suggested-by: Mike Kravetz 
> Signed-off-by: Jane Chu 
> ---
>  include/linux/mm.h |  7 +++
>  ipc/shm.c  | 12 
>  2 files changed, 19 insertions(+)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index a0fbb9ffe380..0c759379f2d9 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -387,6 +387,13 @@ enum page_entry_size {
>   * These are the virtual MM functions - opening of an area, closing and
>   * unmapping it (needed to keep files on disk up-to-date etc), pointer
>   * to the functions called when a no-page or a wp-page exception occurs.
> + *
> + * Note, when a new function is introduced to vm_operations_struct and
> + * added to hugetlb_vm_ops, please consider adding the function to
> + * shm_vm_ops. This is because under System V memory model, though
> + * mappings created via shmget/shmat with "huge page" specified are
> + * backed by hugetlbfs files, their original vm_ops are overwritten with
> + * shm_vm_ops.
>   */
>  struct vm_operations_struct {
>   void (*open)(struct vm_area_struct * area);
> diff --git a/ipc/shm.c b/ipc/shm.c
> index 051a3e1fb8df..fefa00d310fb 100644
> --- a/ipc/shm.c
> +++ b/ipc/shm.c
> @@ -427,6 +427,17 @@ static int shm_split(struct vm_area_struct *vma, 
> unsigned long addr)
>   return 0;
>  }
>  
> +static unsigned long shm_pagesize(struct vm_area_struct *vma)
> +{
> + struct file *file = vma->vm_file;
> + struct shm_file_data *sfd = shm_file_data(file);
> +
> + if (sfd->vm_ops->pagesize)
> + return sfd->vm_ops->pagesize(vma);
> +
> + return PAGE_SIZE;
> +}
> +
>  #ifdef CONFIG_NUMA
>  static int shm_set_policy(struct vm_area_struct *vma, struct mempolicy *new)
>  {
> @@ -554,6 +565,7 @@ static const struct vm_operations_struct shm_vm_ops = {
>   .close  = shm_close,/* callback for when the vm-area is released */
>   .fault  = shm_fault,
>   .split  = shm_split,
> + .pagesize = shm_pagesize,
>  #if defined(CONFIG_NUMA)
>   .set_policy = shm_set_policy,
>   .get_policy = shm_get_policy,
> 

Reviewed-by: Mike Kravetz 
-- 
Mike Kravetz
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


[ndctl PATCH] ndctl inject-smart: add an option to uninject smart fields

2018-07-27 Thread Vishal Verma
The smart injection command in ndctl was missing an option to uninject
the different inject-able fields. The APIs in libndctl already allow for
this by setting the 'enable' flag to false. Add a *-uninject argument
for each of the injectable fields that sets this flag and disables
injected values. Also add an "uninject-all" option to uninject all
fields.

Signed-off-by: Vishal Verma 
---
 Documentation/ndctl/ndctl-inject-smart.txt | 20 ++
 ndctl/inject-smart.c   | 61 --
 2 files changed, 78 insertions(+), 3 deletions(-)

diff --git a/Documentation/ndctl/ndctl-inject-smart.txt 
b/Documentation/ndctl/ndctl-inject-smart.txt
index 20a1621..d28be46 100644
--- a/Documentation/ndctl/ndctl-inject-smart.txt
+++ b/Documentation/ndctl/ndctl-inject-smart.txt
@@ -54,6 +54,9 @@ OPTIONS
Enable or disable the smart media temperature alarm. Options are
'on' or 'off'.
 
+--media-temperature-uninject::
+   Uninject any media temperature previously injected.
+
 -c::
 --ctrl-temperature=::
Inject  for the controller temperature smart attribute.
@@ -66,6 +69,9 @@ OPTIONS
Enable or disable the smart controller temperature alarm. Options are
'on' or 'off'.
 
+--ctrl-temperature-uninject::
+   Uninject any controller temperature previously injected.
+
 -s::
 --spares=::
Inject  for the spares smart attribute.
@@ -77,14 +83,28 @@ OPTIONS
 --spares-alarm=::
Enable or disable the smart spares alarm. Options are 'on' or 'off'.
 
+--spares-uninject::
+   Uninject any spare percentage previously injected.
+
 -f::
 --fatal::
Set the flag to spoof fatal health status.
 
+--fatal-uninject::
+   Uninject the fatal health status flag.
+
 -U::
 --unsafe-shutdown::
Set the flag to spoof an unsafe shutdown on the next power down.
 
+--unsafe-shutdown-uninject::
+   Uninject the unsafe shutdown flag.
+
+-N::
+--uninject-all::
+   Uninject all possible smart fields/values irrespective of whether
+   they have been previously injected or not.
+
 -v::
 --verbose::
Emit debug messages for the error injection process
diff --git a/ndctl/inject-smart.c b/ndctl/inject-smart.c
index edb87e1..f5a3a6f 100644
--- a/ndctl/inject-smart.c
+++ b/ndctl/inject-smart.c
@@ -44,6 +44,12 @@ static struct parameters {
const char *spares_alarm;
bool fatal;
bool unsafe_shutdown;
+   bool media_temperature_uninject;
+   bool ctrl_temperature_uninject;
+   bool spares_uninject;
+   bool fatal_uninject;
+   bool unsafe_shutdown_uninject;
+   bool uninject_all;
 } param;
 
 static struct smart_ctx {
@@ -76,6 +82,8 @@ OPT_STRING('M', "media-temperature-threshold", \
 OPT_STRING('\0', "media-temperature-alarm", _temperature_alarm, \
"smart media temperature alarm", \
"enable or disable the smart media temperature alarm"), \
+OPT_BOOLEAN('\0', "media-temperature-uninject", \
+   _temperature_uninject, "uninject media temperature"), \
 OPT_STRING('c', "ctrl-temperature", _temperature, \
"smart controller temperature attribute", \
"inject a value for smart controller temperature"), \
@@ -86,6 +94,8 @@ OPT_STRING('C', "ctrl-temperature-threshold", \
 OPT_STRING('\0', "ctrl-temperature-alarm", _temperature_alarm, \
"smart controller temperature alarm", \
"enable or disable the smart controller temperature alarm"), \
+OPT_BOOLEAN('\0', "ctrl-temperature-uninject", \
+   _temperature_uninject, "uninject controller temperature"), \
 OPT_STRING('s', "spares", , \
"smart spares attribute", \
"inject a value for smart spares"), \
@@ -95,9 +105,17 @@ OPT_STRING('S', "spares-threshold", 
_threshold, \
 OPT_STRING('\0', "spares-alarm", _alarm, \
"smart spares alarm", \
"enable or disable the smart spares alarm"), \
+OPT_BOOLEAN('\0', "spares-uninject", \
+   _uninject, "uninject spare percentage"), \
 OPT_BOOLEAN('f', "fatal", , "inject fatal smart health status"), \
+OPT_BOOLEAN('F', "fatal-uninject", \
+   _uninject, "uninject fatal health condition"), \
 OPT_BOOLEAN('U', "unsafe-shutdown", _shutdown, \
-   "inject smart unsafe shutdown status")
+   "inject smart unsafe shutdown status"), \
+OPT_BOOLEAN('\0', "unsafe-shutdown-uninject", \
+   _shutdown_uninject, "uninject unsafe shutdown status"), \
+OPT_BOOLEAN('N', "uninject-all", \
+   _all, "uninject all possible fields")
 
 
 static const struct option smart_opts[] = {
@@ -183,6 +201,21 @@ static inline void enable_inject(void)
} \
 }
 
+#define smart_param_setup_uninj(arg) \
+{ \
+   if (param.arg##_uninject) { \
+   /* Ensure user didn't set inject and uninject together */ \
+   if (param.arg) { \
+   error("Cannot use %s inject and uninject together\n", \
+   #arg); \
+   return -EINVAL; \
+   } \

ndctl create-namespace region order

2018-07-27 Thread Elliott, Robert (Persistent Memory)
ndctl create-namespace doesn't walk through the regions
from lowest to highest; in this example it's going
1,3,5,0,2,4,6:

$ sudo ndctl create-namespace --verbose
ROB: parsing bus id=0 path=ndbus0
ROB: parsing region id=1 path=region1
namespace_create:770: region1: insufficient capacity size: 0 avail: 0
ROB: parsing region id=3 path=region3
namespace_create:770: region3: insufficient capacity size: 0 avail: 0
ROB: parsing region id=5 path=region5
namespace_create:770: region5: insufficient capacity size: 0 avail: 0
ROB: parsing region id=0 path=region0
namespace_create:770: region0: insufficient capacity size: 0 avail: 0
ROB: parsing region id=2 path=region2

That makes the results unpredictable.

That's from do_xaction_namespace():

ndctl_bus_foreach(ctx, bus) {
printf("ROB: parsing bus id=%d path=%s\n", ndctl_bus_get_id(bus), 
ndctl_bus_get_devname(bus));
if (!util_bus_filter(bus, param.bus))
continue;

ndctl_region_foreach(bus, region) {
printf("ROB: parsing region id=%d path=%s\n", ndctl_region_get_id(region), 
ndctl_region_get_devname(region));
if (!util_region_filter(region, param.region))

The same order is used for other commands like
disable-namespace and destroy-namespace.

Questions
=
1. Could the busses and regions (anything accessed with
a _foreach() macro) be sorted in ascending order so this
and other commands are more predictable?

2. For create-namespace with no region arguments, rather than
just creating one random namespace, how about creating namespaces
on each available region?

3. The manpage suggests that "-r all" does that, but it seems to 
just make every region number pass the filter in that foreach
loop (i.e., it has no effect for the create-namespace command,
which just creates one and quits).

---
Robert Elliott, HPE Persistent Memory

___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


[PATCH] ipc/shm.c add ->pagesize function to shm_vm_ops

2018-07-27 Thread Jane Chu
Commit 05ea88608d4e13 (mm, hugetlbfs: introduce ->pagesize() to
vm_operations_struct) adds a new ->pagesize() function to
hugetlb_vm_ops, intended to cover all hugetlbfs backed files.

With System V shared memory model, if "huge page" is specified,
the "shared memory" is backed by hugetlbfs files, but the mappings
initiated via shmget/shmat have their original vm_ops overwritten
with shm_vm_ops, so we need to add a ->pagesize function to shm_vm_ops.
Otherwise, vma_kernel_pagesize() returns PAGE_SIZE given a hugetlbfs
backed vma, result in below BUG:

fs/hugetlbfs/inode.c
443 if (unlikely(page_mapped(page))) {
444 BUG_ON(truncate_op);

[  242.268342] hugetlbfs: oracle (4592): Using mlock ulimits for SHM_HUGETLB is 
deprecated
[  282.653208] [ cut here ]
[  282.708447] kernel BUG at fs/hugetlbfs/inode.c:444!
[  282.818957] Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 ...
[  284.025873] CPU: 35 PID: 5583 Comm: oracle_5583_sbt Not tainted 
4.14.35-1829.el7uek.x86_64 #2
[  284.246609] task: 9bf0507aaf80 task.stack: a9e625628000
[  284.317455] RIP: 0010:remove_inode_hugepages+0x3db/0x3e2

[  285.292389] Call Trace:
[  285.321630]  hugetlbfs_evict_inode+0x1e/0x3e
[  285.372707]  evict+0xdb/0x1af
[  285.408185]  iput+0x1a2/0x1f7
[  285.443661]  dentry_unlink_inode+0xc6/0xf0
[  285.492661]  __dentry_kill+0xd8/0x18d
[  285.536459]  dput+0x1b5/0x1ed
[  285.571939]  __fput+0x18b/0x216
[  285.609495]  fput+0xe/0x10
[  285.646030]  task_work_run+0x90/0xa7
[  285.688788]  exit_to_usermode_loop+0xdd/0x116
[  285.740905]  do_syscall_64+0x187/0x1ae
[  285.785740]  entry_SYSCALL_64_after_hwframe+0x150/0x0

Suggested-by: Mike Kravetz 
Signed-off-by: Jane Chu 
---
 include/linux/mm.h |  7 +++
 ipc/shm.c  | 12 
 2 files changed, 19 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a0fbb9ffe380..0c759379f2d9 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -387,6 +387,13 @@ enum page_entry_size {
  * These are the virtual MM functions - opening of an area, closing and
  * unmapping it (needed to keep files on disk up-to-date etc), pointer
  * to the functions called when a no-page or a wp-page exception occurs.
+ *
+ * Note, when a new function is introduced to vm_operations_struct and
+ * added to hugetlb_vm_ops, please consider adding the function to
+ * shm_vm_ops. This is because under System V memory model, though
+ * mappings created via shmget/shmat with "huge page" specified are
+ * backed by hugetlbfs files, their original vm_ops are overwritten with
+ * shm_vm_ops.
  */
 struct vm_operations_struct {
void (*open)(struct vm_area_struct * area);
diff --git a/ipc/shm.c b/ipc/shm.c
index 051a3e1fb8df..fefa00d310fb 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -427,6 +427,17 @@ static int shm_split(struct vm_area_struct *vma, unsigned 
long addr)
return 0;
 }
 
+static unsigned long shm_pagesize(struct vm_area_struct *vma)
+{
+   struct file *file = vma->vm_file;
+   struct shm_file_data *sfd = shm_file_data(file);
+
+   if (sfd->vm_ops->pagesize)
+   return sfd->vm_ops->pagesize(vma);
+
+   return PAGE_SIZE;
+}
+
 #ifdef CONFIG_NUMA
 static int shm_set_policy(struct vm_area_struct *vma, struct mempolicy *new)
 {
@@ -554,6 +565,7 @@ static const struct vm_operations_struct shm_vm_ops = {
.close  = shm_close,/* callback for when the vm-area is released */
.fault  = shm_fault,
.split  = shm_split,
+   .pagesize = shm_pagesize,
 #if defined(CONFIG_NUMA)
.set_policy = shm_set_policy,
.get_policy = shm_get_policy,
-- 
2.15.GIT

___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


[fstests PATCH v3] generic/999: test DAX DMA vs truncate/hole-punch

2018-07-27 Thread Ross Zwisler
From: Ross Zwisler 

This adds a regression test for the following series:

https://lists.01.org/pipermail/linux-nvdimm/2018-July/016842.html

which adds synchronization between DAX DMA in ext4 and truncate/hole-punch.
The intention of the test is to test those specific changes, but it runs
fine both with XFS and without DAX so I've put it in the generic tests
instead of ext4 and not restricted it to only DAX configurations.

When run with v4.18-rc6 + DAX + ext4, this test will hit the following
WARN_ON_ONCE() in dax_disassociate_entry():

WARN_ON_ONCE(trunc && page_ref_count(page) > 1);

If you change this to a WARN_ON() instead, you can see that each of the
four paths being exercised in this test hits that condition many times in
the one second that the subtest is being run.

Signed-off-by: Ross Zwisler 
---

Changes since v2:
 - Added detailed description to tests/generic/999 explaining the purpose
   of the test (Eryu).

 - Added _require_xfs_io_command for falloc, fpunch, fcollapse and fzero
   to account for filesystems that don't support these commands (Eryu).

 - Incorporated other feedback from Eryu.  Thanks, Eryu, for the review.

---
 .gitignore |   1 +
 src/Makefile   |   2 +-
 src/t_mmap_collision.c | 235 +
 tests/generic/999  |  64 ++
 tests/generic/999.out  |   2 +
 tests/generic/group|   1 +
 6 files changed, 304 insertions(+), 1 deletion(-)
 create mode 100644 src/t_mmap_collision.c
 create mode 100755 tests/generic/999
 create mode 100644 tests/generic/999.out

diff --git a/.gitignore b/.gitignore
index efc73a7c..ea1aac8a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -125,6 +125,7 @@
 /src/t_holes
 /src/t_immutable
 /src/t_locks_execve
+/src/t_mmap_collision
 /src/t_mmap_cow_race
 /src/t_mmap_dio
 /src/t_mmap_fallocate
diff --git a/src/Makefile b/src/Makefile
index 9e971bcc..41826585 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -16,7 +16,7 @@ TARGETS = dirstress fill fill2 getpagesize holes lstat64 \
holetest t_truncate_self t_mmap_dio af_unix t_mmap_stale_pmd \
t_mmap_cow_race t_mmap_fallocate fsync-err t_mmap_write_ro \
t_ext4_dax_journal_corruption t_ext4_dax_inline_corruption \
-   t_ofd_locks t_locks_execve
+   t_ofd_locks t_locks_execve t_mmap_collision
 
 LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize preallo_rw_pattern_reader \
preallo_rw_pattern_writer ftrunc trunc fs_perms testx looptest \
diff --git a/src/t_mmap_collision.c b/src/t_mmap_collision.c
new file mode 100644
index ..d547bc05
--- /dev/null
+++ b/src/t_mmap_collision.c
@@ -0,0 +1,235 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2018 Intel Corporation.
+ *
+ * As of kernel version 4.18-rc6 Linux has an issue with ext4+DAX where DMA
+ * and direct I/O operations aren't synchronized with respect to operations
+ * which can change the block mappings of an inode.  This means that we can
+ * schedule an I/O for an inode and have the block mapping for that inode
+ * change before the I/O is actually complete.  So, blocks which were once
+ * allocated to a given inode and then freed could still have I/O operations
+ * happening to them.  If these blocks have also been reallocated to a
+ * different inode, this interaction can lead to data corruption.
+ *
+ * This test exercises four of the paths in ext4 which hit this issue.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define PAGE(a) ((a)*0x1000)
+#define FILE_SIZE PAGE(4)
+
+void *dax_data;
+int nodax_fd;
+int dax_fd;
+bool done;
+
+#define err_exit(op)  \
+{ \
+   fprintf(stderr, "%s %s: %s\n", __func__, op, strerror(errno));\
+   exit(1);  \
+}
+
+#if defined(FALLOC_FL_PUNCH_HOLE) && defined(FALLOC_FL_KEEP_SIZE)
+void punch_hole_fn(void *ptr)
+{
+   ssize_t read;
+   int rc;
+
+   while (!done) {
+   read = 0;
+
+   do {
+   rc = pread(nodax_fd, dax_data + read, FILE_SIZE - read,
+   read);
+   if (rc > 0)
+   read += rc;
+   } while (rc > 0);
+
+   if (read != FILE_SIZE || rc != 0)
+   err_exit("pread");
+
+   rc = fallocate(dax_fd,
+   FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+   0, FILE_SIZE);
+   if (rc < 0)
+   err_exit("fallocate");
+
+   usleep(rand() % 1000);
+   }
+}
+#else
+void punch_hole_fn(void *ptr) { }
+#endif
+
+#if defined(FALLOC_FL_ZERO_RANGE) && defined(FALLOC_FL_KEEP_SIZE)
+void zero_range_fn(void *ptr)
+{

Re: [PATCH v2 5/6] md/dm-writecache: Don't request pointer dummy_addr when not required

2018-07-27 Thread Mike Snitzer
On Wed, Jul 25 2018 at 12:28pm -0400,
Huaisheng Ye  wrote:

> From: Huaisheng Ye 
> 
> Function persistent_memory_claim doesn't need to get local pointer
> dummy_addr from direct_access. Using NULL instead of having to pass
> in a useless local pointer that caller then just throw away.
> 
> Suggested-by: Ross Zwisler 
> Signed-off-by: Huaisheng Ye 

Acked-by: Mike Snitzer 
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH] acpi/nfit: queue issuing of ars when an uc error notification comes in

2018-07-27 Thread Verma, Vishal L


On Fri, 2018-07-27 at 09:04 -0700, Dave Jiang wrote:
> When the ACPI UC error notifier gets called and ARS_REQ bit is set
> with the passed in flag, we can receive -EBUSY if ARS_REQ bit is already
> set for the nfit_spa->ars_state. When that happens, the ARS request is
> dropped. That can potentially cause us to miss the unreported errors that
> the on going ARS request does not receive. Add an ARS_REQ_REDO state that
> will request short ARS upon ARS completion to grab any errors we missed.
> 
> Signed-off-by: Dave Jiang 
> ---
>  drivers/acpi/nfit/core.c |   12 +---
>  drivers/acpi/nfit/nfit.h |1 +
>  2 files changed, 10 insertions(+), 3 deletions(-)

Looks good to me, you can add:
Reviewed-by: Vishal Verma 

> 
> diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
> index 274f636cd3d2..def64259f9a7 100644
> --- a/drivers/acpi/nfit/core.c
> +++ b/drivers/acpi/nfit/core.c
> @@ -2567,7 +2567,12 @@ static void ars_complete(struct acpi_nfit_desc 
> *acpi_desc,
>   test_bit(ARS_SHORT, _spa->ars_state)
>   ? "short" : "long");
>   clear_bit(ARS_SHORT, _spa->ars_state);
> - set_bit(ARS_DONE, _spa->ars_state);
> + if (test_and_clear_bit(ARS_REQ_REDO, _spa->ars_state)) {
> + set_bit(ARS_SHORT, _spa->ars_state);
> + set_bit(ARS_REQ, _spa->ars_state);
> + dev_dbg(dev, "ARS: processing scrub request received while in 
> progress\n");
> + } else
> + set_bit(ARS_DONE, _spa->ars_state);
>  }
>  
>  static int ars_status_process_records(struct acpi_nfit_desc *acpi_desc)
> @@ -3264,9 +3269,10 @@ int acpi_nfit_ars_rescan(struct acpi_nfit_desc 
> *acpi_desc, unsigned long flags)
>   if (test_bit(ARS_FAILED, _spa->ars_state))
>   continue;
>  
> - if (test_and_set_bit(ARS_REQ, _spa->ars_state))
> + if (test_and_set_bit(ARS_REQ, _spa->ars_state)) {
>   busy++;
> - else {
> + set_bit(ARS_REQ_REDO, _spa->ars_state);
> + } else {
>   if (test_bit(ARS_SHORT, ))
>   set_bit(ARS_SHORT, _spa->ars_state);
>   scheduled++;
> diff --git a/drivers/acpi/nfit/nfit.h b/drivers/acpi/nfit/nfit.h
> index a97ff42fe311..d1274ea2d251 100644
> --- a/drivers/acpi/nfit/nfit.h
> +++ b/drivers/acpi/nfit/nfit.h
> @@ -119,6 +119,7 @@ enum nfit_dimm_notifiers {
>  
>  enum nfit_ars_state {
>   ARS_REQ,
> + ARS_REQ_REDO,
>   ARS_DONE,
>   ARS_SHORT,
>   ARS_FAILED,
> 
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


[ndctl PATCH] ndctl, inject-smart: Fix man page to match the current behavior

2018-07-27 Thread Vishal Verma
The inject-smart man page had a -H option for health status, which was
from a previous development iteration, and not how it actually works.
Fix it to advertise a -f flag for setting fatal status, and remove an
unnecessart '=' from the unsafe-shutdown option as it doesn't accept any
additional arguments.

Signed-off-by: Vishal Verma 
---
 Documentation/ndctl/ndctl-inject-smart.txt | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/Documentation/ndctl/ndctl-inject-smart.txt 
b/Documentation/ndctl/ndctl-inject-smart.txt
index 0dc6481..20a1621 100644
--- a/Documentation/ndctl/ndctl-inject-smart.txt
+++ b/Documentation/ndctl/ndctl-inject-smart.txt
@@ -77,13 +77,12 @@ OPTIONS
 --spares-alarm=::
Enable or disable the smart spares alarm. Options are 'on' or 'off'.
 
--H::
---health=::
-   Smart attribute for health status. Provide either 'fatal' or 'nominal'
-   to set the state of the attribute.
+-f::
+--fatal::
+   Set the flag to spoof fatal health status.
 
 -U::
---unsafe-shutdown=::
+--unsafe-shutdown::
Set the flag to spoof an unsafe shutdown on the next power down.
 
 -v::
-- 
2.14.4

___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


[ndctl PATCH] ndctl: deprecate undocumented short-options

2018-07-27 Thread Vishal Verma
The inject-smart and monitor man pages refrained from displaying short
options for various arguments (alarms, daemon) due to a lack of a coherent
letter that could be made to relate to the event name. It was expected
that the user will always use the long option for these. Since the
OPT_STRING helper refused to take an empty string for the short option,
we used bogus characters for each of them.

However there is a better way to provide no short options, and that is
by using '\0' for the short option field to OPT_STRING. Replace the
bogus characters with '\0' so that the 'short help' also becomes
consistent with the man pages.

Cc: Cc: QI Fuli 
Signed-off-by: Vishal Verma 
---
 ndctl/inject-smart.c | 7 ---
 ndctl/monitor.c  | 2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/ndctl/inject-smart.c b/ndctl/inject-smart.c
index 006ea2a..edb87e1 100644
--- a/ndctl/inject-smart.c
+++ b/ndctl/inject-smart.c
@@ -73,7 +73,7 @@ OPT_STRING('M', "media-temperature-threshold", \
_temperature_threshold, \
"set smart media temperature threshold", \
"set threshold value for smart media temperature"), \
-OPT_STRING('x', "media-temperature-alarm", _temperature_alarm, \
+OPT_STRING('\0', "media-temperature-alarm", _temperature_alarm, \
"smart media temperature alarm", \
"enable or disable the smart media temperature alarm"), \
 OPT_STRING('c', "ctrl-temperature", _temperature, \
@@ -83,7 +83,7 @@ OPT_STRING('C', "ctrl-temperature-threshold", \
_temperature_threshold, \
"set smart controller temperature threshold", \
"set threshold value for smart controller temperature"), \
-OPT_STRING('y', "ctrl-temperature-alarm", _temperature_alarm, \
+OPT_STRING('\0', "ctrl-temperature-alarm", _temperature_alarm, \
"smart controller temperature alarm", \
"enable or disable the smart controller temperature alarm"), \
 OPT_STRING('s', "spares", , \
@@ -92,13 +92,14 @@ OPT_STRING('s', "spares", , \
 OPT_STRING('S', "spares-threshold", _threshold, \
"set smart spares threshold", \
"set a threshold value for smart spares"), \
-OPT_STRING('z', "spares-alarm", _alarm, \
+OPT_STRING('\0', "spares-alarm", _alarm, \
"smart spares alarm", \
"enable or disable the smart spares alarm"), \
 OPT_BOOLEAN('f', "fatal", , "inject fatal smart health status"), \
 OPT_BOOLEAN('U', "unsafe-shutdown", _shutdown, \
"inject smart unsafe shutdown status")
 
+
 static const struct option smart_opts[] = {
SMART_OPTIONS(),
OPT_END(),
diff --git a/ndctl/monitor.c b/ndctl/monitor.c
index b97d1ea..c6419ad 100644
--- a/ndctl/monitor.c
+++ b/ndctl/monitor.c
@@ -583,7 +583,7 @@ int cmd_monitor(int argc, const char **argv, void *ctx)
"where to output the monitor's notification"),
OPT_FILENAME('c', "config-file", _file,
"config-file", "override the default config"),
-   OPT_BOOLEAN('x', "daemon", ,
+   OPT_BOOLEAN('\0', "daemon", ,
"run ndctl monitor as a daemon"),
OPT_BOOLEAN('u', "human", ,
"use human friendly output formats"),
-- 
2.14.4

___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


[ndctl PATCH] ndctl, documentation: document the label-version option for init-labels

2018-07-27 Thread Vishal Verma
Add the missing --label-verison option to the ndctl init-labels man
page.

Signed-off-by: Vishal Verma 
---
 Documentation/ndctl/ndctl-init-labels.txt | 5 +
 1 file changed, 5 insertions(+)

diff --git a/Documentation/ndctl/ndctl-init-labels.txt 
b/Documentation/ndctl/ndctl-init-labels.txt
index 736d52d..733ef0e 100644
--- a/Documentation/ndctl/ndctl-init-labels.txt
+++ b/Documentation/ndctl/ndctl-init-labels.txt
@@ -83,6 +83,11 @@ include::labels-options.txt[]
be an existing / valid namespace index.  Warning, this will
destroy all defined namespaces on the dimm.
 
+-V::
+--label-version::
+   Initialize with a specific version of labels from the namespace
+   label specification. Defaults to 1.1
+
 include::../copyright.txt[]
 
 SEE ALSO
-- 
2.14.4

___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH v4 0/2] ext4: fix DAX dma vs truncate/hole-punch

2018-07-27 Thread Ross Zwisler
+ fsdevel and the xfs list.

On Wed, Jul 25, 2018 at 4:28 PM Ross Zwisler
 wrote:
> On Wed, Jul 11, 2018 at 10:17:41AM +0200, Jan Kara wrote:
> > On Tue 10-07-18 13:10:29, Ross Zwisler wrote:
> > > Changes since v3:
> > >  * Added an ext4_break_layouts() call to ext4_insert_range() to ensure
> > >that the {ext4,xfs}_break_layouts() calls have the same meaning.
> > >(Dave, Darrick and Jan)
> >
> > How about the occasional WARN_ON_ONCE you mention below. Were you able to
> > hunt them down?
>
> The root cause of this issue is that while the ei->i_mmap_sem provides
> synchronization between ext4_break_layouts() and page faults, it doesn't
> provide synchronize us with the direct I/O path.  This exact same issue exists
> in XFS AFAICT, with the synchronization tool there being the XFS_MMAPLOCK.
>
> This allows the direct I/O path to do I/O and raise & lower page->_refcount
> while we're executing a truncate/hole punch.  This leads to us trying to free
> a page with an elevated refcount.
>
> Here's one instance of the race:
>
> CPU 0   CPU 1
> -   -
> ext4_punch_hole()
>   ext4_break_layouts() # all pages have refcount=1
>
> ext4_direct_IO()
>   ... lots of layers ...
>   follow_page_pte()
> get_page() # elevates refcount
>
>   truncate_pagecache_range()
>... a few layers ...
>dax_disassociate_entry() # sees elevated refcount, WARN_ON_ONCE()
>
> A similar race occurs when the refcount is being dropped while we're running
> ext4_break_layouts(), and this is the one that my test was actually hitting:
>
> CPU 0   CPU 1
> -   -
> ext4_direct_IO()
>   ... lots of layers ...
>   follow_page_pte()
> get_page()
> # elevates refcount of page X
> ext4_punch_hole()
>   ext4_break_layouts() # two pages, X & Y, have refcount == 2
> __wait_var_event() # called for page X
>
>   __put_devmap_managed_page()
>   # drops refcount of X to 1
>
># __wait_var_events() checks X's refcount in "if (condition)", and breaks.
># We never actually called ext4_wait_dax_page(), so 'retry' in
># ext4_break_layouts() is still false.  Exit do/while loop in
># ext4_break_layouts, never attempting to wait on page Y which still has an
># elevated refcount of 2.
>
>   truncate_pagecache_range()
>... a few layers ...
>dax_disassociate_entry() # sees elevated refcount for Y, WARN_ON_ONCE()
>
> This second race can be fixed with the patch at the end of this function,
> which I think should go in, unless there is a benfit to the current retry
> scheme which relies on the 'retry' variable in {ext4,xfs}_break_layouts()?
> With this patch applied I've been able to run my unit test through
> thousands of iterations, where it used to failed consistently within 10 or
> so.
>
> Even so, I wonder if the real solution is to add synchronization between
> the direct I/O path and {ext4,xfs}_break_layouts()?  Other ideas on how
> this should be handled?
>
> --- >8 ---
>
> From a4519b0f40362f0a63ae96acaf986092aff0f0d3 Mon Sep 17 00:00:00 2001
> From: Ross Zwisler 
> Date: Wed, 25 Jul 2018 16:16:05 -0600
> Subject: [PATCH] ext4: Close race between direct IO and ext4_break_layouts()
>
> If the refcount of a page is lowered between the time that it is returned
> by dax_busy_page() and when the refcount is again checked in
> ext4_break_layouts() => ___wait_var_event(), the waiting function
> ext4_wait_dax_page() will never be called.  This means that
> ext4_break_layouts() will still have 'retry' set to false, so we'll stop
> looping and never check the refcount of other pages in this inode.
>
> Instead, always continue looping as long as dax_layout_busy_page() gives us
> a page which it found with an elevated refcount.
>
> Note that this works around the race exposed by my unit test, but I think
> that there is another race that needs to be addressed, probably with
> additional synchronization added between direct I/O and
> {ext4,xfs}_break_layouts().
>
> Signed-off-by: Ross Zwisler 
> ---
>  fs/ext4/inode.c | 9 +++--
>  1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 27e9643bc13b..fedb88104bbf 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4193,9 +4193,8 @@ int ext4_update_disksize_before_punch(struct inode 
> *inode, loff_t offset,
> return 0;
>  }
>
> -static void ext4_wait_dax_page(struct ext4_inode_info *ei, bool *did_unlock)
> +static void 

[PATCH] acpi/nfit: queue issuing of ars when an uc error notification comes in

2018-07-27 Thread Dave Jiang
When the ACPI UC error notifier gets called and ARS_REQ bit is set
with the passed in flag, we can receive -EBUSY if ARS_REQ bit is already
set for the nfit_spa->ars_state. When that happens, the ARS request is
dropped. That can potentially cause us to miss the unreported errors that
the on going ARS request does not receive. Add an ARS_REQ_REDO state that
will request short ARS upon ARS completion to grab any errors we missed.

Signed-off-by: Dave Jiang 
---
 drivers/acpi/nfit/core.c |   12 +---
 drivers/acpi/nfit/nfit.h |1 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 274f636cd3d2..def64259f9a7 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -2567,7 +2567,12 @@ static void ars_complete(struct acpi_nfit_desc 
*acpi_desc,
test_bit(ARS_SHORT, _spa->ars_state)
? "short" : "long");
clear_bit(ARS_SHORT, _spa->ars_state);
-   set_bit(ARS_DONE, _spa->ars_state);
+   if (test_and_clear_bit(ARS_REQ_REDO, _spa->ars_state)) {
+   set_bit(ARS_SHORT, _spa->ars_state);
+   set_bit(ARS_REQ, _spa->ars_state);
+   dev_dbg(dev, "ARS: processing scrub request received while in 
progress\n");
+   } else
+   set_bit(ARS_DONE, _spa->ars_state);
 }
 
 static int ars_status_process_records(struct acpi_nfit_desc *acpi_desc)
@@ -3264,9 +3269,10 @@ int acpi_nfit_ars_rescan(struct acpi_nfit_desc 
*acpi_desc, unsigned long flags)
if (test_bit(ARS_FAILED, _spa->ars_state))
continue;
 
-   if (test_and_set_bit(ARS_REQ, _spa->ars_state))
+   if (test_and_set_bit(ARS_REQ, _spa->ars_state)) {
busy++;
-   else {
+   set_bit(ARS_REQ_REDO, _spa->ars_state);
+   } else {
if (test_bit(ARS_SHORT, ))
set_bit(ARS_SHORT, _spa->ars_state);
scheduled++;
diff --git a/drivers/acpi/nfit/nfit.h b/drivers/acpi/nfit/nfit.h
index a97ff42fe311..d1274ea2d251 100644
--- a/drivers/acpi/nfit/nfit.h
+++ b/drivers/acpi/nfit/nfit.h
@@ -119,6 +119,7 @@ enum nfit_dimm_notifiers {
 
 enum nfit_ars_state {
ARS_REQ,
+   ARS_REQ_REDO,
ARS_DONE,
ARS_SHORT,
ARS_FAILED,

___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm