RE: [External] [PATCH v2 2/6] s390, dcssblk: kaddr and pfn can be NULL to ->direct_access()
Dear Maintainers of Linux-s390, Greetings. May I have your ack's for this patch? The whole series would be applied to libnvdimm if it could get your approval. And any suggestion is welcome. Cheers, Huaisheng Ye > From: Huaisheng Ye > Sent: Thursday, July 26, 2018 12:29 AM > > From: Huaisheng Ye > > dcssblk_direct_access() needs to check the validity of pointers kaddr > and pfn for NULL assignment. If anyone equals to NULL, it doesn't need > to calculate the value. > > If either of them is equal to NULL, that is to say callers may > have no need for kaddr or pfn, so this patch is prepared for allowing > them to pass in NULL instead of having to pass in a pointer or local > variable that they then just throw away. > > Signed-off-by: Huaisheng Ye > --- > drivers/s390/block/dcssblk.c | 8 +--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c > index ed60728..23e526c 100644 > --- a/drivers/s390/block/dcssblk.c > +++ b/drivers/s390/block/dcssblk.c > @@ -922,9 +922,11 @@ static DEVICE_ATTR(save, S_IWUSR | S_IRUSR, > dcssblk_save_show, > unsigned long dev_sz; > > dev_sz = dev_info->end - dev_info->start + 1; > - *kaddr = (void *) dev_info->start + offset; > - *pfn = __pfn_to_pfn_t(PFN_DOWN(dev_info->start + offset), > - PFN_DEV|PFN_SPECIAL); > + if (kaddr) > + *kaddr = (void *) dev_info->start + offset; > + if (pfn) > + *pfn = __pfn_to_pfn_t(PFN_DOWN(dev_info->start + offset), > + PFN_DEV|PFN_SPECIAL); > > return (dev_sz - offset) / PAGE_SIZE; > } > -- > 1.8.3.1 ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm
Re: [PATCH] ipc/shm.c add ->pagesize function to shm_vm_ops
Hi, Andrew, On 7/27/2018 2:50 PM, Andrew Morton wrote: On Fri, 27 Jul 2018 15:17:27 -0600 Jane Chu wrote: Commit 05ea88608d4e13 (mm, hugetlbfs: introduce ->pagesize() to vm_operations_struct) adds a new ->pagesize() function to hugetlb_vm_ops, intended to cover all hugetlbfs backed files. That was merged three months ago. Can you suggest why this was only noticed now? The issue was recently reported by a QA engineer running Oracle database test in Oracle Linux. He first noticed the issue in upstream 4.17, then 4.18, but because the issue wasn't in Oracle product, it wasn't reported, not until I cherry picked the patch into Oracle Linux recently. What workload triggered this? I see no cc:stable, but 4.17 is affected? It's Oracle database workload. Large shared memory segments(SGAs) were created and shared among dozens to hundreds of processes. The crash occurs when the test stops the database workload. I do not have access to the test source. Yes, 4.17 is affected. With System V shared memory model, if "huge page" is specified, the "shared memory" is backed by hugetlbfs files, but the mappings initiated via shmget/shmat have their original vm_ops overwritten with shm_vm_ops, so we need to add a ->pagesize function to shm_vm_ops. Otherwise, vma_kernel_pagesize() returns PAGE_SIZE given a hugetlbfs backed vma, result in below BUG: fs/hugetlbfs/inode.c 443 if (unlikely(page_mapped(page))) { 444 BUG_ON(truncate_op); OK, help me out here. How does an incorrect return value from vma_kernel_pagesize() result in remove_inode_hugepages() deciding that it's truncating a mapped page? To be honest, I don't have a satisfactory answer to how the wrong pagesize causes a page that's about to be truncated remain mapped. I relied on the hind sight of BUG_ON(truncate_op). At a time I inserted dump_stack() into vma_kernel_pagesize() as Mike suggested to try to dig out more, unsigned long vma_kernel_pagesize(struct vm_area_struct *vma) { - if (vma->vm_ops && vma->vm_ops->pagesize) + if (vma->vm_ops && vma->vm_ops->pagesize) { return vma->vm_ops->pagesize(vma); +} else if (is_vm_hugetlb_page(vma)) { + struct hstate *hstate; + dump_stack(); + hstate = hstate_vma(vma); + return 1UL << huge_page_shift(hstate); + } return PAGE_SIZE; } There were too many stack traces that clogged the console, I didn't capture the entire output, perhaps I should go back to capture them. Any other ideas? Regards, -jane ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm
[PATCH v2 5/6] ndctl: add support for secure erase
Add support for secure erase to libndctl and also command line option of "secure-erase" for ndctl. This will initiate the request to securely erase a DIMM. ndctl does not actually handle the verification of the security. That is handled by the kernel and the key upcall mechanism. Signed-off-by: Dave Jiang --- Documentation/ndctl/Makefile.am|1 + Documentation/ndctl/ndctl-secure-erase.txt | 21 + builtin.h |1 + ndctl/dimm.c | 21 + ndctl/lib/dimm.c |5 + ndctl/lib/libndctl.sym |1 + ndctl/libndctl.h |1 + ndctl/ndctl.c |1 + 8 files changed, 52 insertions(+) create mode 100644 Documentation/ndctl/ndctl-secure-erase.txt diff --git a/Documentation/ndctl/Makefile.am b/Documentation/ndctl/Makefile.am index 98348178..8a84c11c 100644 --- a/Documentation/ndctl/Makefile.am +++ b/Documentation/ndctl/Makefile.am @@ -49,6 +49,7 @@ man1_MANS = \ ndctl-update-security.1 \ ndctl-disable-security.1 \ ndctl-freeze-security.1 \ + ndctl-secure-erase.1 \ ndctl-list.1 CLEANFILES = $(man1_MANS) diff --git a/Documentation/ndctl/ndctl-secure-erase.txt b/Documentation/ndctl/ndctl-secure-erase.txt new file mode 100644 index ..8f82248a --- /dev/null +++ b/Documentation/ndctl/ndctl-secure-erase.txt @@ -0,0 +1,21 @@ +// SPDX-License-Identifier: GPL-2.0 + +ndctl-secure-erase(1) += + +NAME + +ndctl-secure-erase - securely erasing the user data on the NVDIMM + +SYNOPSIS + +[verse] +'ndctl secure-erase' + +DESCRIPTION +--- +Provide a generic interface to securely erase NVDIMM. This does not change +the label storage area. The use of this depends on support from the underlying +libndctl, kernel, as well as the platform itself. + +include::../copyright.txt[] diff --git a/builtin.h b/builtin.h index 6e6ae069..a9b6443e 100644 --- a/builtin.h +++ b/builtin.h @@ -50,4 +50,5 @@ int cmd_inject_smart(int argc, const char **argv, void *ctx); int cmd_update_security(int argc, const char **argv, void *ctx); int cmd_disable_security(int argc, const char **argv, void *ctx); int cmd_freeze_security(int argc, const char **argv, void *ctx); +int cmd_secure_erase(int argc, const char **argv, void *ctx); #endif /* _NDCTL_BUILTIN_H_ */ diff --git a/ndctl/dimm.c b/ndctl/dimm.c index 49097cff..45c9efce 100644 --- a/ndctl/dimm.c +++ b/ndctl/dimm.c @@ -857,6 +857,17 @@ static int action_security_freeze(struct ndctl_dimm *dimm, return rc; } +static int action_secure_erase(struct ndctl_dimm *dimm, + struct action_context *actx) +{ + int rc; + + rc = ndctl_dimm_secure_erase(dimm); + if (rc < 0) + error("Failed to secure erase for %s\n", + ndctl_dimm_get_devname(dimm)); + return rc; +} static struct parameters { const char *bus; const char *outfile; @@ -1244,3 +1255,13 @@ int cmd_freeze_security(int argc, const char **argv, void *ctx) count > 1 ? "s" : ""); return count >= 0 ? 0 : EXIT_FAILURE; } + +int cmd_secure_erase(int argc, const char **argv, void *ctx) +{ + int count = dimm_action(argc, argv, ctx, action_secure_erase, base_options, + "ndctl secure-erase [..] []"); + + fprintf(stderr, "secure erased %d nmem%s.\n", count >= 0 ? count : 0, + count > 1 ? "s" : ""); + return count >= 0 ? 0 : EXIT_FAILURE; +} diff --git a/ndctl/lib/dimm.c b/ndctl/lib/dimm.c index 2ff68daa..bb896d21 100644 --- a/ndctl/lib/dimm.c +++ b/ndctl/lib/dimm.c @@ -625,3 +625,8 @@ NDCTL_EXPORT int ndctl_dimm_freeze_security(struct ndctl_dimm *dimm) { return ndctl_dimm_write_security(dimm, "freeze"); } + +NDCTL_EXPORT int ndctl_dimm_secure_erase(struct ndctl_dimm *dimm) +{ + return ndctl_dimm_write_security(dimm, "erase"); +} diff --git a/ndctl/lib/libndctl.sym b/ndctl/lib/libndctl.sym index 613d9bb2..08727f94 100644 --- a/ndctl/lib/libndctl.sym +++ b/ndctl/lib/libndctl.sym @@ -375,4 +375,5 @@ global: ndctl_dimm_set_change_key; ndctl_dimm_disable_security; ndctl_dimm_freeze_security; + ndctl_dimm_secure_erase; } LIBNDCTL_16; diff --git a/ndctl/libndctl.h b/ndctl/libndctl.h index 4615ba09..88ba8a29 100644 --- a/ndctl/libndctl.h +++ b/ndctl/libndctl.h @@ -663,6 +663,7 @@ int ndctl_dimm_get_security_state(struct ndctl_dimm *dimm, char *state); int ndctl_dimm_set_change_key(struct ndctl_dimm *dimm); int ndctl_dimm_disable_security(struct ndctl_dimm *dimm); int ndctl_dimm_freeze_security(struct ndctl_dimm *dimm); +int ndctl_dimm_secure_erase(struct ndctl_dimm *dimm); #ifdef __cplusplus } /* extern "C" */ diff --git a/ndctl/ndctl.c b/ndctl/ndctl.c index 5a1a81e4..b1a91033 100644 ---
[PATCH v2 6/6] ndctl: add request-key upcall reference app
Adding a reference upcall helper for request-key in order to retrieve the security passphrase from userspace to provide to the kernel. The reference app uses keyutils API to respond to the upcall from the kernel and is invoked by /sbin/request-key of the keyutils. Signed-off-by: Dave Jiang --- Documentation/ndctl/Makefile.am |3 - Documentation/ndctl/nvdimm-upcall.txt | 33 configure.ac |1 ndctl.spec.in |2 ndctl/Makefile.am |5 + ndctl/nvdimm-upcall.c | 138 + 6 files changed, 181 insertions(+), 1 deletion(-) create mode 100644 Documentation/ndctl/nvdimm-upcall.txt create mode 100644 ndctl/nvdimm-upcall.c diff --git a/Documentation/ndctl/Makefile.am b/Documentation/ndctl/Makefile.am index 8a84c11c..d61e5f9a 100644 --- a/Documentation/ndctl/Makefile.am +++ b/Documentation/ndctl/Makefile.am @@ -50,7 +50,8 @@ man1_MANS = \ ndctl-disable-security.1 \ ndctl-freeze-security.1 \ ndctl-secure-erase.1 \ - ndctl-list.1 + ndctl-list.1 \ + nvdimm-upcall.1 CLEANFILES = $(man1_MANS) diff --git a/Documentation/ndctl/nvdimm-upcall.txt b/Documentation/ndctl/nvdimm-upcall.txt new file mode 100644 index ..bbf52fd1 --- /dev/null +++ b/Documentation/ndctl/nvdimm-upcall.txt @@ -0,0 +1,33 @@ +// SPDX-License-Identifier: GPL-2.0 + +nvdimm-upcall(1) + + +NAME + +nvdimm-upcall - upcall helper for keyutils + +SYNOPSIS + +[verse] +'nvdimm-upcall' [key_id] + +DESCRIPTION +--- +nvdimm-upcall is called by request-key from keyutils and not meant to be used +directly by the user. It expects to read from /etc/nvdimm.passwd to retrieve +the description and passphrase for the NVDIMM key. + +The nvdimm.passwd is formatted as: +: +cdab-0a-07e0-feff: + +In order for this util to be called, /etc/request-key.conf must be appended +with the following line: +create logon nvdimm* * /usr/sbin/nvdimm-upcall %k + +include::../copyright.txt[] + +SEE ALSO + +http://pmem.io/documents/NVDIMM_DSM_Interface-V1.7.pdf diff --git a/configure.ac b/configure.ac index cf442607..4b125919 100644 --- a/configure.ac +++ b/configure.ac @@ -114,6 +114,7 @@ PKG_CHECK_MODULES([KMOD], [libkmod]) PKG_CHECK_MODULES([UDEV], [libudev]) PKG_CHECK_MODULES([UUID], [uuid]) PKG_CHECK_MODULES([JSON], [json-c]) +PKG_CHECK_MODULES([KEYUTILS], [keyutils]) AC_ARG_WITH([bash-completion-dir], AS_HELP_STRING([--with-bash-completion-dir[=PATH]], diff --git a/ndctl.spec.in b/ndctl.spec.in index e2c879ca..342f01d7 100644 --- a/ndctl.spec.in +++ b/ndctl.spec.in @@ -20,6 +20,7 @@ BuildRequires:pkgconfig(libudev) BuildRequires: pkgconfig(uuid) BuildRequires: pkgconfig(json-c) BuildRequires: pkgconfig(bash-completion) +BuildRequires: pkgconfig(keyutils) %description Utility library for managing the "libnvdimm" subsystem. The "libnvdimm" @@ -116,6 +117,7 @@ make check %{_bindir}/ndctl %{_mandir}/man1/ndctl* %{bashcompdir}/ +%{_sbindir}/nvdimm-upcall %files -n daxctl %defattr(-,root,root) diff --git a/ndctl/Makefile.am b/ndctl/Makefile.am index 0f568719..d6401df9 100644 --- a/ndctl/Makefile.am +++ b/ndctl/Makefile.am @@ -1,6 +1,7 @@ include $(top_srcdir)/Makefile.am.in bin_PROGRAMS = ndctl +sbin_PROGRAMS = nvdimm-upcall ndctl_SOURCES = ndctl.c \ bus.c \ @@ -41,3 +42,7 @@ ndctl_SOURCES += ../test/libndctl.c \ ../test/core.c \ test.c endif + +nvdimm_upcall_SOURCES = nvdimm-upcall.c + +nvdimm_upcall_LDADD = $(KEYUTILS_LIBS) diff --git a/ndctl/nvdimm-upcall.c b/ndctl/nvdimm-upcall.c new file mode 100644 index ..9406b23a --- /dev/null +++ b/ndctl/nvdimm-upcall.c @@ -0,0 +1,138 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright(c) 2018 Intel Corporation. All rights reserved. */ + +/* + * Used by /sbin/request-key for handling nvdimm upcall of key requests + * for security DSMs. + */ + + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define PASSPHRASE_SIZE32 +#define PASS_PATH "/etc/nvdimm.passwd" + +static FILE *fp; + +static int get_passphrase_from_id(const char *id, char *pass, int *psize) +{ + ssize_t rc = 0; + size_t size; + char *line = NULL; + char *tmp, *id_tok, *secret; + int found = 0; + + fp = fopen(PASS_PATH, "r+"); + if (!fp) { + syslog(LOG_ERR, "fopen: %s\n", strerror(errno)); + return -errno; + } + + while ((rc = getline(, , fp)) != -1) { + id_tok = strtok_r(line, ":", ); + if (!id_tok) + break; + if (strcmp(id_tok, id) == 0) { + secret = tmp; + found = 1; + rc
[PATCH v2 2/6] ndctl: add update to security support
Add API call for triggering sysfs knob to update the security for a DIMM in libndctl. Also add the ndctl "update-security" to trigger that as well. ndctl does not actually handle the passphrase file. It only initiates the update and expects all the necessary mechanisms are already in place. Signed-off-by: Dave Jiang --- Documentation/ndctl/Makefile.am |1 + Documentation/ndctl/ndctl-update-security.txt | 21 + builtin.h |1 + ndctl/dimm.c | 22 ++ ndctl/lib/dimm.c | 20 ndctl/lib/libndctl.sym|1 + ndctl/libndctl.h |1 + ndctl/ndctl.c |1 + 8 files changed, 68 insertions(+) create mode 100644 Documentation/ndctl/ndctl-update-security.txt diff --git a/Documentation/ndctl/Makefile.am b/Documentation/ndctl/Makefile.am index 4fd9636f..02405c44 100644 --- a/Documentation/ndctl/Makefile.am +++ b/Documentation/ndctl/Makefile.am @@ -46,6 +46,7 @@ man1_MANS = \ ndctl-inject-error.1 \ ndctl-inject-smart.1 \ ndctl-update-firmware.1 \ + ndctl-update-security.1 \ ndctl-list.1 CLEANFILES = $(man1_MANS) diff --git a/Documentation/ndctl/ndctl-update-security.txt b/Documentation/ndctl/ndctl-update-security.txt new file mode 100644 index ..38ca02fa --- /dev/null +++ b/Documentation/ndctl/ndctl-update-security.txt @@ -0,0 +1,21 @@ +// SPDX-License-Identifier: GPL-2.0 + +ndctl-update-security(1) + + +NAME + +ndctl-update-security - enabling or update the security passphrase for an NVDIMM + +SYNOPSIS + +[verse] +'ndctl update-security' + +DESCRIPTION +--- +Provide a generic interface for enabling or updating security passphrase for +NVDIMM. The use of this depends on support from the underlying libndctl, kernel, +as well as the platform itself. + +include::../copyright.txt[] diff --git a/builtin.h b/builtin.h index d3cc7239..e6dc5340 100644 --- a/builtin.h +++ b/builtin.h @@ -47,4 +47,5 @@ int cmd_bat(int argc, const char **argv, void *ctx); #endif int cmd_update_firmware(int argc, const char **argv, void *ctx); int cmd_inject_smart(int argc, const char **argv, void *ctx); +int cmd_update_security(int argc, const char **argv, void *ctx); #endif /* _NDCTL_BUILTIN_H_ */ diff --git a/ndctl/dimm.c b/ndctl/dimm.c index 97643a3c..274714d6 100644 --- a/ndctl/dimm.c +++ b/ndctl/dimm.c @@ -821,6 +821,18 @@ static int action_update(struct ndctl_dimm *dimm, struct action_context *actx) return rc; } +static int action_security_update(struct ndctl_dimm *dimm, + struct action_context *actx) +{ + int rc; + + rc = ndctl_dimm_set_change_key(dimm); + if (rc < 0) + error("Failed to update security for %s\n", + ndctl_dimm_get_devname(dimm)); + return rc; +} + static struct parameters { const char *bus; const char *outfile; @@ -1178,3 +1190,13 @@ int cmd_update_firmware(int argc, const char **argv, void *ctx) count > 1 ? "s" : ""); return count >= 0 ? 0 : EXIT_FAILURE; } + +int cmd_update_security(int argc, const char **argv, void *ctx) +{ + int count = dimm_action(argc, argv, ctx, action_security_update, base_options, + "ndctl update-security [..] []"); + + fprintf(stderr, "security updated %d nmem%s.\n", count >= 0 ? count : 0, + count > 1 ? "s" : ""); + return count >= 0 ? 0 : EXIT_FAILURE; +} diff --git a/ndctl/lib/dimm.c b/ndctl/lib/dimm.c index 8ed58559..ca4d439b 100644 --- a/ndctl/lib/dimm.c +++ b/ndctl/lib/dimm.c @@ -595,3 +595,23 @@ NDCTL_EXPORT int ndctl_dimm_get_security_state(struct ndctl_dimm *dimm, return sysfs_read_attr(ctx, path, state); } + +static int ndctl_dimm_write_security(struct ndctl_dimm *dimm, const char *cmd) +{ + struct ndctl_ctx *ctx = ndctl_dimm_get_ctx(dimm); + char *path = dimm->dimm_buf; + int len = dimm->buf_len; + + if (snprintf(path, len, "%s/security", dimm->dimm_path) >= len) { + err(ctx, "%s: buffer too small!\n", + ndctl_dimm_get_devname(dimm)); + return -ERANGE; + } + + return sysfs_write_attr(ctx, path, cmd); +} + +NDCTL_EXPORT int ndctl_dimm_set_change_key(struct ndctl_dimm *dimm) +{ + return ndctl_dimm_write_security(dimm, "update"); +} diff --git a/ndctl/lib/libndctl.sym b/ndctl/lib/libndctl.sym index 8040d7de..e2a359b8 100644 --- a/ndctl/lib/libndctl.sym +++ b/ndctl/lib/libndctl.sym @@ -372,4 +372,5 @@ LIBNDCTL_17 { global: ndctl_dimm_smart_inject_supported; ndctl_dimm_get_security_state; + ndctl_dimm_set_change_key; } LIBNDCTL_16; diff --git a/ndctl/libndctl.h b/ndctl/libndctl.h
[PATCH v2 0/6] ndctl: add security support
The following series implements mechanisms that utilize the sysfs knobs provided by the kernel in order to support the Intel DSM v1.7 spec that provides security to NVDIMM. The following abilities are added: 1. display security state 2. update security 3. disable security 4. freeze security 5. secure erase Also a reference helper app is provided to retrieve security information through the keyutils and kernel key management API. v2: - Fixup the upcall util to match recent kernel updates for nvdimm security. --- Dave Jiang (6): ndctl: add support for display security state ndctl: add update to security support ndctl: add disable security support ndctl: add support for freeze security ndctl: add support for secure erase ndctl: add request-key upcall reference app Documentation/ndctl/Makefile.am|7 + Documentation/ndctl/ndctl-disable-security.txt | 21 Documentation/ndctl/ndctl-freeze-security.txt | 21 Documentation/ndctl/ndctl-list.txt |8 + Documentation/ndctl/ndctl-secure-erase.txt | 21 Documentation/ndctl/ndctl-update-security.txt | 21 Documentation/ndctl/nvdimm-upcall.txt | 33 ++ builtin.h |4 + configure.ac |1 ndctl.spec.in |2 ndctl/Makefile.am |5 + ndctl/dimm.c | 87 +++ ndctl/lib/dimm.c | 51 + ndctl/lib/libndctl.sym |5 + ndctl/libndctl.h |5 + ndctl/ndctl.c |4 + ndctl/nvdimm-upcall.c | 138 util/json.c|8 + 18 files changed, 441 insertions(+), 1 deletion(-) create mode 100644 Documentation/ndctl/ndctl-disable-security.txt create mode 100644 Documentation/ndctl/ndctl-freeze-security.txt create mode 100644 Documentation/ndctl/ndctl-secure-erase.txt create mode 100644 Documentation/ndctl/ndctl-update-security.txt create mode 100644 Documentation/ndctl/nvdimm-upcall.txt create mode 100644 ndctl/nvdimm-upcall.c -- ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm
[PATCH v2 3/6] ndctl: add disable security support
Add support for disable security to libndctl and also command line option of "disable-security" for ndctl. This provides a way to disable security on the nvdimm. ndctl does not handle the actual processing of the passphrase. It only starts the request. Signed-off-by: Dave Jiang --- Documentation/ndctl/Makefile.am|1 + Documentation/ndctl/ndctl-disable-security.txt | 21 + builtin.h |1 + ndctl/dimm.c | 22 ++ ndctl/lib/dimm.c |5 + ndctl/lib/libndctl.sym |1 + ndctl/libndctl.h |1 + ndctl/ndctl.c |1 + 8 files changed, 53 insertions(+) create mode 100644 Documentation/ndctl/ndctl-disable-security.txt diff --git a/Documentation/ndctl/Makefile.am b/Documentation/ndctl/Makefile.am index 02405c44..b1cb3b5b 100644 --- a/Documentation/ndctl/Makefile.am +++ b/Documentation/ndctl/Makefile.am @@ -47,6 +47,7 @@ man1_MANS = \ ndctl-inject-smart.1 \ ndctl-update-firmware.1 \ ndctl-update-security.1 \ + ndctl-disable-security.1 \ ndctl-list.1 CLEANFILES = $(man1_MANS) diff --git a/Documentation/ndctl/ndctl-disable-security.txt b/Documentation/ndctl/ndctl-disable-security.txt new file mode 100644 index ..651f8d03 --- /dev/null +++ b/Documentation/ndctl/ndctl-disable-security.txt @@ -0,0 +1,21 @@ +// SPDX-License-Identifier: GPL-2.0 + +ndctl-disable-security(1) + + +NAME + +ndctl-disable-security - enabling or disable security for an NVDIMM + +SYNOPSIS + +[verse] +'ndctl disable-security' + +DESCRIPTION +--- +Provide a generic interface for disabling security for NVDIMM. The use of this +depends on support from the underlying libndctl, kernel, as well as the +platform itself. + +include::../copyright.txt[] diff --git a/builtin.h b/builtin.h index e6dc5340..e8ba9aee 100644 --- a/builtin.h +++ b/builtin.h @@ -48,4 +48,5 @@ int cmd_bat(int argc, const char **argv, void *ctx); int cmd_update_firmware(int argc, const char **argv, void *ctx); int cmd_inject_smart(int argc, const char **argv, void *ctx); int cmd_update_security(int argc, const char **argv, void *ctx); +int cmd_disable_security(int argc, const char **argv, void *ctx); #endif /* _NDCTL_BUILTIN_H_ */ diff --git a/ndctl/dimm.c b/ndctl/dimm.c index 274714d6..e64f6717 100644 --- a/ndctl/dimm.c +++ b/ndctl/dimm.c @@ -833,6 +833,18 @@ static int action_security_update(struct ndctl_dimm *dimm, return rc; } +static int action_security_disable(struct ndctl_dimm *dimm, + struct action_context *actx) +{ + int rc; + + rc = ndctl_dimm_disable_security(dimm); + if (rc < 0) + error("Failed to disable security for %s\n", + ndctl_dimm_get_devname(dimm)); + return rc; +} + static struct parameters { const char *bus; const char *outfile; @@ -1200,3 +1212,13 @@ int cmd_update_security(int argc, const char **argv, void *ctx) count > 1 ? "s" : ""); return count >= 0 ? 0 : EXIT_FAILURE; } + +int cmd_disable_security(int argc, const char **argv, void *ctx) +{ + int count = dimm_action(argc, argv, ctx, action_security_disable, base_options, + "ndctl disable-security [..] []"); + + fprintf(stderr, "security disabled %d nmem%s.\n", count >= 0 ? count : 0, + count > 1 ? "s" : ""); + return count >= 0 ? 0 : EXIT_FAILURE; +} diff --git a/ndctl/lib/dimm.c b/ndctl/lib/dimm.c index ca4d439b..13c2806c 100644 --- a/ndctl/lib/dimm.c +++ b/ndctl/lib/dimm.c @@ -615,3 +615,8 @@ NDCTL_EXPORT int ndctl_dimm_set_change_key(struct ndctl_dimm *dimm) { return ndctl_dimm_write_security(dimm, "update"); } + +NDCTL_EXPORT int ndctl_dimm_disable_security(struct ndctl_dimm *dimm) +{ + return ndctl_dimm_write_security(dimm, "disable"); +} diff --git a/ndctl/lib/libndctl.sym b/ndctl/lib/libndctl.sym index e2a359b8..1a919567 100644 --- a/ndctl/lib/libndctl.sym +++ b/ndctl/lib/libndctl.sym @@ -373,4 +373,5 @@ global: ndctl_dimm_smart_inject_supported; ndctl_dimm_get_security_state; ndctl_dimm_set_change_key; + ndctl_dimm_disable_security; } LIBNDCTL_16; diff --git a/ndctl/libndctl.h b/ndctl/libndctl.h index 663847b6..1ca6cb18 100644 --- a/ndctl/libndctl.h +++ b/ndctl/libndctl.h @@ -661,6 +661,7 @@ struct ndctl_cmd *ndctl_dimm_cmd_new_ack_shutdown_count(struct ndctl_dimm *dimm) int ndctl_dimm_fw_update_supported(struct ndctl_dimm *dimm); int ndctl_dimm_get_security_state(struct ndctl_dimm *dimm, char *state); int ndctl_dimm_set_change_key(struct ndctl_dimm *dimm); +int ndctl_dimm_disable_security(struct ndctl_dimm *dimm); #ifdef __cplusplus } /* extern "C" */ diff --git
[PATCH v2 4/6] ndctl: add support for freeze security
Add support for freeze security to libndctl and also command line option of "freeze-security" for ndctl. This will lock the ability to make changes to the NVDIMM security. Signed-off-by: Dave Jiang --- Documentation/ndctl/Makefile.am |1 + Documentation/ndctl/ndctl-freeze-security.txt | 21 + builtin.h |1 + ndctl/dimm.c | 22 ++ ndctl/lib/dimm.c |5 + ndctl/lib/libndctl.sym|1 + ndctl/libndctl.h |1 + ndctl/ndctl.c |1 + 8 files changed, 53 insertions(+) create mode 100644 Documentation/ndctl/ndctl-freeze-security.txt diff --git a/Documentation/ndctl/Makefile.am b/Documentation/ndctl/Makefile.am index b1cb3b5b..98348178 100644 --- a/Documentation/ndctl/Makefile.am +++ b/Documentation/ndctl/Makefile.am @@ -48,6 +48,7 @@ man1_MANS = \ ndctl-update-firmware.1 \ ndctl-update-security.1 \ ndctl-disable-security.1 \ + ndctl-freeze-security.1 \ ndctl-list.1 CLEANFILES = $(man1_MANS) diff --git a/Documentation/ndctl/ndctl-freeze-security.txt b/Documentation/ndctl/ndctl-freeze-security.txt new file mode 100644 index ..343bb019 --- /dev/null +++ b/Documentation/ndctl/ndctl-freeze-security.txt @@ -0,0 +1,21 @@ +// SPDX-License-Identifier: GPL-2.0 + +ndctl-freeze-security(1) + + +NAME + +ndctl-freeze-security - enabling or freeze the security for an NVDIMM + +SYNOPSIS + +[verse] +'ndctl freeze-security' + +DESCRIPTION +--- +Provide a generic interface to freeze the security for NVDIMM. The use of this +depends on support from the underlying libndctl, kernel, as well as the +platform itself. + +include::../copyright.txt[] diff --git a/builtin.h b/builtin.h index e8ba9aee..6e6ae069 100644 --- a/builtin.h +++ b/builtin.h @@ -49,4 +49,5 @@ int cmd_update_firmware(int argc, const char **argv, void *ctx); int cmd_inject_smart(int argc, const char **argv, void *ctx); int cmd_update_security(int argc, const char **argv, void *ctx); int cmd_disable_security(int argc, const char **argv, void *ctx); +int cmd_freeze_security(int argc, const char **argv, void *ctx); #endif /* _NDCTL_BUILTIN_H_ */ diff --git a/ndctl/dimm.c b/ndctl/dimm.c index e64f6717..49097cff 100644 --- a/ndctl/dimm.c +++ b/ndctl/dimm.c @@ -845,6 +845,18 @@ static int action_security_disable(struct ndctl_dimm *dimm, return rc; } +static int action_security_freeze(struct ndctl_dimm *dimm, + struct action_context *actx) +{ + int rc; + + rc = ndctl_dimm_freeze_security(dimm); + if (rc < 0) + error("Failed to freeze security for %s\n", + ndctl_dimm_get_devname(dimm)); + return rc; +} + static struct parameters { const char *bus; const char *outfile; @@ -1222,3 +1234,13 @@ int cmd_disable_security(int argc, const char **argv, void *ctx) count > 1 ? "s" : ""); return count >= 0 ? 0 : EXIT_FAILURE; } + +int cmd_freeze_security(int argc, const char **argv, void *ctx) +{ + int count = dimm_action(argc, argv, ctx, action_security_freeze, base_options, + "ndctl freeze-security [..] []"); + + fprintf(stderr, "security freezed %d nmem%s.\n", count >= 0 ? count : 0, + count > 1 ? "s" : ""); + return count >= 0 ? 0 : EXIT_FAILURE; +} diff --git a/ndctl/lib/dimm.c b/ndctl/lib/dimm.c index 13c2806c..2ff68daa 100644 --- a/ndctl/lib/dimm.c +++ b/ndctl/lib/dimm.c @@ -620,3 +620,8 @@ NDCTL_EXPORT int ndctl_dimm_disable_security(struct ndctl_dimm *dimm) { return ndctl_dimm_write_security(dimm, "disable"); } + +NDCTL_EXPORT int ndctl_dimm_freeze_security(struct ndctl_dimm *dimm) +{ + return ndctl_dimm_write_security(dimm, "freeze"); +} diff --git a/ndctl/lib/libndctl.sym b/ndctl/lib/libndctl.sym index 1a919567..613d9bb2 100644 --- a/ndctl/lib/libndctl.sym +++ b/ndctl/lib/libndctl.sym @@ -374,4 +374,5 @@ global: ndctl_dimm_get_security_state; ndctl_dimm_set_change_key; ndctl_dimm_disable_security; + ndctl_dimm_freeze_security; } LIBNDCTL_16; diff --git a/ndctl/libndctl.h b/ndctl/libndctl.h index 1ca6cb18..4615ba09 100644 --- a/ndctl/libndctl.h +++ b/ndctl/libndctl.h @@ -662,6 +662,7 @@ int ndctl_dimm_fw_update_supported(struct ndctl_dimm *dimm); int ndctl_dimm_get_security_state(struct ndctl_dimm *dimm, char *state); int ndctl_dimm_set_change_key(struct ndctl_dimm *dimm); int ndctl_dimm_disable_security(struct ndctl_dimm *dimm); +int ndctl_dimm_freeze_security(struct ndctl_dimm *dimm); #ifdef __cplusplus } /* extern "C" */ diff --git a/ndctl/ndctl.c b/ndctl/ndctl.c index eeef41da..5a1a81e4 100644 --- a/ndctl/ndctl.c +++ b/ndctl/ndctl.c @@ -90,6 +90,7 @@
[PATCH v2 1/6] ndctl: add support for display security state
Adding libndctl API call for retrieving security state for a DIMM and also adding support to ndctl list for displaying security state. Signed-off-by: Dave Jiang --- Documentation/ndctl/ndctl-list.txt |8 ndctl/lib/dimm.c | 16 ndctl/lib/libndctl.sym |1 + ndctl/libndctl.h |1 + util/json.c|8 5 files changed, 34 insertions(+) diff --git a/Documentation/ndctl/ndctl-list.txt b/Documentation/ndctl/ndctl-list.txt index 13ebdcdd..57eac1fe 100644 --- a/Documentation/ndctl/ndctl-list.txt +++ b/Documentation/ndctl/ndctl-list.txt @@ -97,6 +97,14 @@ include::xable-region-options.txt[] -D:: --dimms:: Include dimm info in the listing +[verse] +{ + "dev":"nmem0", + "id":"cdab-0a-07e0-", + "handle":0, + "phys_id":0, + "security_state:":"disabled" +} -H:: --health:: diff --git a/ndctl/lib/dimm.c b/ndctl/lib/dimm.c index b3e032e0..8ed58559 100644 --- a/ndctl/lib/dimm.c +++ b/ndctl/lib/dimm.c @@ -579,3 +579,19 @@ NDCTL_EXPORT unsigned long ndctl_dimm_get_available_labels( return strtoul(buf, NULL, 0); } + +NDCTL_EXPORT int ndctl_dimm_get_security_state(struct ndctl_dimm *dimm, + char *state) +{ + struct ndctl_ctx *ctx = ndctl_dimm_get_ctx(dimm); + char *path = dimm->dimm_buf; + int len = dimm->buf_len; + + if (snprintf(path, len, "%s/security", dimm->dimm_path) >= len) { + err(ctx, "%s: buffer too small!\n", + ndctl_dimm_get_devname(dimm)); + return -ERANGE; + } + + return sysfs_read_attr(ctx, path, state); +} diff --git a/ndctl/lib/libndctl.sym b/ndctl/lib/libndctl.sym index 8932ef66..8040d7de 100644 --- a/ndctl/lib/libndctl.sym +++ b/ndctl/lib/libndctl.sym @@ -371,4 +371,5 @@ global: LIBNDCTL_17 { global: ndctl_dimm_smart_inject_supported; + ndctl_dimm_get_security_state; } LIBNDCTL_16; diff --git a/ndctl/libndctl.h b/ndctl/libndctl.h index 8a96c846..d48513ce 100644 --- a/ndctl/libndctl.h +++ b/ndctl/libndctl.h @@ -659,6 +659,7 @@ unsigned long long ndctl_cmd_fw_fquery_get_fw_rev(struct ndctl_cmd *cmd); enum ND_FW_STATUS ndctl_cmd_fw_xlat_firmware_status(struct ndctl_cmd *cmd); struct ndctl_cmd *ndctl_dimm_cmd_new_ack_shutdown_count(struct ndctl_dimm *dimm); int ndctl_dimm_fw_update_supported(struct ndctl_dimm *dimm); +int ndctl_dimm_get_security_state(struct ndctl_dimm *dimm, char *state); #ifdef __cplusplus } /* extern "C" */ diff --git a/util/json.c b/util/json.c index 13324588..90f05528 100644 --- a/util/json.c +++ b/util/json.c @@ -164,6 +164,7 @@ struct json_object *util_dimm_to_json(struct ndctl_dimm *dimm, unsigned int handle = ndctl_dimm_get_handle(dimm); unsigned short phys_id = ndctl_dimm_get_phys_id(dimm); struct json_object *jobj; + char security_state[32]; if (!jdimm) return NULL; @@ -243,6 +244,13 @@ struct json_object *util_dimm_to_json(struct ndctl_dimm *dimm, json_object_object_add(jdimm, "flag_smart_event", jobj); } + if (ndctl_dimm_get_security_state(dimm, security_state) == 0) { + jobj = json_object_new_string(security_state); + if (!jobj) + goto err; + json_object_object_add(jdimm, "security_state:", jobj); + } + return jdimm; err: json_object_put(jdimm); ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm
Re: [PATCH] ipc/shm.c add ->pagesize function to shm_vm_ops
On Fri, 27 Jul 2018 15:17:27 -0600 Jane Chu wrote: > Commit 05ea88608d4e13 (mm, hugetlbfs: introduce ->pagesize() to > vm_operations_struct) adds a new ->pagesize() function to > hugetlb_vm_ops, intended to cover all hugetlbfs backed files. That was merged three months ago. Can you suggest why this was only noticed now? What workload triggered this? I see no cc:stable, but 4.17 is affected? > With System V shared memory model, if "huge page" is specified, > the "shared memory" is backed by hugetlbfs files, but the mappings > initiated via shmget/shmat have their original vm_ops overwritten > with shm_vm_ops, so we need to add a ->pagesize function to shm_vm_ops. > Otherwise, vma_kernel_pagesize() returns PAGE_SIZE given a hugetlbfs > backed vma, result in below BUG: > > fs/hugetlbfs/inode.c > 443 if (unlikely(page_mapped(page))) { > 444 BUG_ON(truncate_op); OK, help me out here. How does an incorrect return value from vma_kernel_pagesize() result in remove_inode_hugepages() deciding that it's truncating a mapped page? ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm
Re: [PATCH] ipc/shm.c add ->pagesize function to shm_vm_ops
On 07/27/2018 02:17 PM, Jane Chu wrote: > Commit 05ea88608d4e13 (mm, hugetlbfs: introduce ->pagesize() to > vm_operations_struct) adds a new ->pagesize() function to > hugetlb_vm_ops, intended to cover all hugetlbfs backed files. Thanks Jane! Adding Dan on Cc as he authored 05ea88608d4e13. Note that this is the same type of omission that was made when adding ->split() to vm_operations_struct. :( That is why I suggested adding the comment above vm_operations_struct. > With System V shared memory model, if "huge page" is specified, > the "shared memory" is backed by hugetlbfs files, but the mappings > initiated via shmget/shmat have their original vm_ops overwritten > with shm_vm_ops, so we need to add a ->pagesize function to shm_vm_ops. > Otherwise, vma_kernel_pagesize() returns PAGE_SIZE given a hugetlbfs > backed vma, result in below BUG: > > fs/hugetlbfs/inode.c > 443 if (unlikely(page_mapped(page))) { > 444 BUG_ON(truncate_op); > > [ 242.268342] hugetlbfs: oracle (4592): Using mlock ulimits for SHM_HUGETLB > is deprecated > [ 282.653208] [ cut here ] > [ 282.708447] kernel BUG at fs/hugetlbfs/inode.c:444! > [ 282.818957] Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 ... > [ 284.025873] CPU: 35 PID: 5583 Comm: oracle_5583_sbt Not tainted > 4.14.35-1829.el7uek.x86_64 #2 > [ 284.246609] task: 9bf0507aaf80 task.stack: a9e625628000 > [ 284.317455] RIP: 0010:remove_inode_hugepages+0x3db/0x3e2 > > [ 285.292389] Call Trace: > [ 285.321630] hugetlbfs_evict_inode+0x1e/0x3e > [ 285.372707] evict+0xdb/0x1af > [ 285.408185] iput+0x1a2/0x1f7 > [ 285.443661] dentry_unlink_inode+0xc6/0xf0 > [ 285.492661] __dentry_kill+0xd8/0x18d > [ 285.536459] dput+0x1b5/0x1ed > [ 285.571939] __fput+0x18b/0x216 > [ 285.609495] fput+0xe/0x10 > [ 285.646030] task_work_run+0x90/0xa7 > [ 285.688788] exit_to_usermode_loop+0xdd/0x116 > [ 285.740905] do_syscall_64+0x187/0x1ae > [ 285.785740] entry_SYSCALL_64_after_hwframe+0x150/0x0 We will need the tag, Fixes: 05ea88608d4e13 ("mm, hugetlbfs: introduce ->pagesize() to vm_operations_struct") and, CC: sta...@vger.kernel.org > Suggested-by: Mike Kravetz > Signed-off-by: Jane Chu > --- > include/linux/mm.h | 7 +++ > ipc/shm.c | 12 > 2 files changed, 19 insertions(+) > > diff --git a/include/linux/mm.h b/include/linux/mm.h > index a0fbb9ffe380..0c759379f2d9 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -387,6 +387,13 @@ enum page_entry_size { > * These are the virtual MM functions - opening of an area, closing and > * unmapping it (needed to keep files on disk up-to-date etc), pointer > * to the functions called when a no-page or a wp-page exception occurs. > + * > + * Note, when a new function is introduced to vm_operations_struct and > + * added to hugetlb_vm_ops, please consider adding the function to > + * shm_vm_ops. This is because under System V memory model, though > + * mappings created via shmget/shmat with "huge page" specified are > + * backed by hugetlbfs files, their original vm_ops are overwritten with > + * shm_vm_ops. > */ > struct vm_operations_struct { > void (*open)(struct vm_area_struct * area); > diff --git a/ipc/shm.c b/ipc/shm.c > index 051a3e1fb8df..fefa00d310fb 100644 > --- a/ipc/shm.c > +++ b/ipc/shm.c > @@ -427,6 +427,17 @@ static int shm_split(struct vm_area_struct *vma, > unsigned long addr) > return 0; > } > > +static unsigned long shm_pagesize(struct vm_area_struct *vma) > +{ > + struct file *file = vma->vm_file; > + struct shm_file_data *sfd = shm_file_data(file); > + > + if (sfd->vm_ops->pagesize) > + return sfd->vm_ops->pagesize(vma); > + > + return PAGE_SIZE; > +} > + > #ifdef CONFIG_NUMA > static int shm_set_policy(struct vm_area_struct *vma, struct mempolicy *new) > { > @@ -554,6 +565,7 @@ static const struct vm_operations_struct shm_vm_ops = { > .close = shm_close,/* callback for when the vm-area is released */ > .fault = shm_fault, > .split = shm_split, > + .pagesize = shm_pagesize, > #if defined(CONFIG_NUMA) > .set_policy = shm_set_policy, > .get_policy = shm_get_policy, > Reviewed-by: Mike Kravetz -- Mike Kravetz ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm
[ndctl PATCH] ndctl inject-smart: add an option to uninject smart fields
The smart injection command in ndctl was missing an option to uninject the different inject-able fields. The APIs in libndctl already allow for this by setting the 'enable' flag to false. Add a *-uninject argument for each of the injectable fields that sets this flag and disables injected values. Also add an "uninject-all" option to uninject all fields. Signed-off-by: Vishal Verma --- Documentation/ndctl/ndctl-inject-smart.txt | 20 ++ ndctl/inject-smart.c | 61 -- 2 files changed, 78 insertions(+), 3 deletions(-) diff --git a/Documentation/ndctl/ndctl-inject-smart.txt b/Documentation/ndctl/ndctl-inject-smart.txt index 20a1621..d28be46 100644 --- a/Documentation/ndctl/ndctl-inject-smart.txt +++ b/Documentation/ndctl/ndctl-inject-smart.txt @@ -54,6 +54,9 @@ OPTIONS Enable or disable the smart media temperature alarm. Options are 'on' or 'off'. +--media-temperature-uninject:: + Uninject any media temperature previously injected. + -c:: --ctrl-temperature=:: Inject for the controller temperature smart attribute. @@ -66,6 +69,9 @@ OPTIONS Enable or disable the smart controller temperature alarm. Options are 'on' or 'off'. +--ctrl-temperature-uninject:: + Uninject any controller temperature previously injected. + -s:: --spares=:: Inject for the spares smart attribute. @@ -77,14 +83,28 @@ OPTIONS --spares-alarm=:: Enable or disable the smart spares alarm. Options are 'on' or 'off'. +--spares-uninject:: + Uninject any spare percentage previously injected. + -f:: --fatal:: Set the flag to spoof fatal health status. +--fatal-uninject:: + Uninject the fatal health status flag. + -U:: --unsafe-shutdown:: Set the flag to spoof an unsafe shutdown on the next power down. +--unsafe-shutdown-uninject:: + Uninject the unsafe shutdown flag. + +-N:: +--uninject-all:: + Uninject all possible smart fields/values irrespective of whether + they have been previously injected or not. + -v:: --verbose:: Emit debug messages for the error injection process diff --git a/ndctl/inject-smart.c b/ndctl/inject-smart.c index edb87e1..f5a3a6f 100644 --- a/ndctl/inject-smart.c +++ b/ndctl/inject-smart.c @@ -44,6 +44,12 @@ static struct parameters { const char *spares_alarm; bool fatal; bool unsafe_shutdown; + bool media_temperature_uninject; + bool ctrl_temperature_uninject; + bool spares_uninject; + bool fatal_uninject; + bool unsafe_shutdown_uninject; + bool uninject_all; } param; static struct smart_ctx { @@ -76,6 +82,8 @@ OPT_STRING('M', "media-temperature-threshold", \ OPT_STRING('\0', "media-temperature-alarm", _temperature_alarm, \ "smart media temperature alarm", \ "enable or disable the smart media temperature alarm"), \ +OPT_BOOLEAN('\0', "media-temperature-uninject", \ + _temperature_uninject, "uninject media temperature"), \ OPT_STRING('c', "ctrl-temperature", _temperature, \ "smart controller temperature attribute", \ "inject a value for smart controller temperature"), \ @@ -86,6 +94,8 @@ OPT_STRING('C', "ctrl-temperature-threshold", \ OPT_STRING('\0', "ctrl-temperature-alarm", _temperature_alarm, \ "smart controller temperature alarm", \ "enable or disable the smart controller temperature alarm"), \ +OPT_BOOLEAN('\0', "ctrl-temperature-uninject", \ + _temperature_uninject, "uninject controller temperature"), \ OPT_STRING('s', "spares", , \ "smart spares attribute", \ "inject a value for smart spares"), \ @@ -95,9 +105,17 @@ OPT_STRING('S', "spares-threshold", _threshold, \ OPT_STRING('\0', "spares-alarm", _alarm, \ "smart spares alarm", \ "enable or disable the smart spares alarm"), \ +OPT_BOOLEAN('\0', "spares-uninject", \ + _uninject, "uninject spare percentage"), \ OPT_BOOLEAN('f', "fatal", , "inject fatal smart health status"), \ +OPT_BOOLEAN('F', "fatal-uninject", \ + _uninject, "uninject fatal health condition"), \ OPT_BOOLEAN('U', "unsafe-shutdown", _shutdown, \ - "inject smart unsafe shutdown status") + "inject smart unsafe shutdown status"), \ +OPT_BOOLEAN('\0', "unsafe-shutdown-uninject", \ + _shutdown_uninject, "uninject unsafe shutdown status"), \ +OPT_BOOLEAN('N', "uninject-all", \ + _all, "uninject all possible fields") static const struct option smart_opts[] = { @@ -183,6 +201,21 @@ static inline void enable_inject(void) } \ } +#define smart_param_setup_uninj(arg) \ +{ \ + if (param.arg##_uninject) { \ + /* Ensure user didn't set inject and uninject together */ \ + if (param.arg) { \ + error("Cannot use %s inject and uninject together\n", \ + #arg); \ + return -EINVAL; \ + } \
ndctl create-namespace region order
ndctl create-namespace doesn't walk through the regions from lowest to highest; in this example it's going 1,3,5,0,2,4,6: $ sudo ndctl create-namespace --verbose ROB: parsing bus id=0 path=ndbus0 ROB: parsing region id=1 path=region1 namespace_create:770: region1: insufficient capacity size: 0 avail: 0 ROB: parsing region id=3 path=region3 namespace_create:770: region3: insufficient capacity size: 0 avail: 0 ROB: parsing region id=5 path=region5 namespace_create:770: region5: insufficient capacity size: 0 avail: 0 ROB: parsing region id=0 path=region0 namespace_create:770: region0: insufficient capacity size: 0 avail: 0 ROB: parsing region id=2 path=region2 That makes the results unpredictable. That's from do_xaction_namespace(): ndctl_bus_foreach(ctx, bus) { printf("ROB: parsing bus id=%d path=%s\n", ndctl_bus_get_id(bus), ndctl_bus_get_devname(bus)); if (!util_bus_filter(bus, param.bus)) continue; ndctl_region_foreach(bus, region) { printf("ROB: parsing region id=%d path=%s\n", ndctl_region_get_id(region), ndctl_region_get_devname(region)); if (!util_region_filter(region, param.region)) The same order is used for other commands like disable-namespace and destroy-namespace. Questions = 1. Could the busses and regions (anything accessed with a _foreach() macro) be sorted in ascending order so this and other commands are more predictable? 2. For create-namespace with no region arguments, rather than just creating one random namespace, how about creating namespaces on each available region? 3. The manpage suggests that "-r all" does that, but it seems to just make every region number pass the filter in that foreach loop (i.e., it has no effect for the create-namespace command, which just creates one and quits). --- Robert Elliott, HPE Persistent Memory ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm
[PATCH] ipc/shm.c add ->pagesize function to shm_vm_ops
Commit 05ea88608d4e13 (mm, hugetlbfs: introduce ->pagesize() to vm_operations_struct) adds a new ->pagesize() function to hugetlb_vm_ops, intended to cover all hugetlbfs backed files. With System V shared memory model, if "huge page" is specified, the "shared memory" is backed by hugetlbfs files, but the mappings initiated via shmget/shmat have their original vm_ops overwritten with shm_vm_ops, so we need to add a ->pagesize function to shm_vm_ops. Otherwise, vma_kernel_pagesize() returns PAGE_SIZE given a hugetlbfs backed vma, result in below BUG: fs/hugetlbfs/inode.c 443 if (unlikely(page_mapped(page))) { 444 BUG_ON(truncate_op); [ 242.268342] hugetlbfs: oracle (4592): Using mlock ulimits for SHM_HUGETLB is deprecated [ 282.653208] [ cut here ] [ 282.708447] kernel BUG at fs/hugetlbfs/inode.c:444! [ 282.818957] Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 ... [ 284.025873] CPU: 35 PID: 5583 Comm: oracle_5583_sbt Not tainted 4.14.35-1829.el7uek.x86_64 #2 [ 284.246609] task: 9bf0507aaf80 task.stack: a9e625628000 [ 284.317455] RIP: 0010:remove_inode_hugepages+0x3db/0x3e2 [ 285.292389] Call Trace: [ 285.321630] hugetlbfs_evict_inode+0x1e/0x3e [ 285.372707] evict+0xdb/0x1af [ 285.408185] iput+0x1a2/0x1f7 [ 285.443661] dentry_unlink_inode+0xc6/0xf0 [ 285.492661] __dentry_kill+0xd8/0x18d [ 285.536459] dput+0x1b5/0x1ed [ 285.571939] __fput+0x18b/0x216 [ 285.609495] fput+0xe/0x10 [ 285.646030] task_work_run+0x90/0xa7 [ 285.688788] exit_to_usermode_loop+0xdd/0x116 [ 285.740905] do_syscall_64+0x187/0x1ae [ 285.785740] entry_SYSCALL_64_after_hwframe+0x150/0x0 Suggested-by: Mike Kravetz Signed-off-by: Jane Chu --- include/linux/mm.h | 7 +++ ipc/shm.c | 12 2 files changed, 19 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index a0fbb9ffe380..0c759379f2d9 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -387,6 +387,13 @@ enum page_entry_size { * These are the virtual MM functions - opening of an area, closing and * unmapping it (needed to keep files on disk up-to-date etc), pointer * to the functions called when a no-page or a wp-page exception occurs. + * + * Note, when a new function is introduced to vm_operations_struct and + * added to hugetlb_vm_ops, please consider adding the function to + * shm_vm_ops. This is because under System V memory model, though + * mappings created via shmget/shmat with "huge page" specified are + * backed by hugetlbfs files, their original vm_ops are overwritten with + * shm_vm_ops. */ struct vm_operations_struct { void (*open)(struct vm_area_struct * area); diff --git a/ipc/shm.c b/ipc/shm.c index 051a3e1fb8df..fefa00d310fb 100644 --- a/ipc/shm.c +++ b/ipc/shm.c @@ -427,6 +427,17 @@ static int shm_split(struct vm_area_struct *vma, unsigned long addr) return 0; } +static unsigned long shm_pagesize(struct vm_area_struct *vma) +{ + struct file *file = vma->vm_file; + struct shm_file_data *sfd = shm_file_data(file); + + if (sfd->vm_ops->pagesize) + return sfd->vm_ops->pagesize(vma); + + return PAGE_SIZE; +} + #ifdef CONFIG_NUMA static int shm_set_policy(struct vm_area_struct *vma, struct mempolicy *new) { @@ -554,6 +565,7 @@ static const struct vm_operations_struct shm_vm_ops = { .close = shm_close,/* callback for when the vm-area is released */ .fault = shm_fault, .split = shm_split, + .pagesize = shm_pagesize, #if defined(CONFIG_NUMA) .set_policy = shm_set_policy, .get_policy = shm_get_policy, -- 2.15.GIT ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm
[fstests PATCH v3] generic/999: test DAX DMA vs truncate/hole-punch
From: Ross Zwisler This adds a regression test for the following series: https://lists.01.org/pipermail/linux-nvdimm/2018-July/016842.html which adds synchronization between DAX DMA in ext4 and truncate/hole-punch. The intention of the test is to test those specific changes, but it runs fine both with XFS and without DAX so I've put it in the generic tests instead of ext4 and not restricted it to only DAX configurations. When run with v4.18-rc6 + DAX + ext4, this test will hit the following WARN_ON_ONCE() in dax_disassociate_entry(): WARN_ON_ONCE(trunc && page_ref_count(page) > 1); If you change this to a WARN_ON() instead, you can see that each of the four paths being exercised in this test hits that condition many times in the one second that the subtest is being run. Signed-off-by: Ross Zwisler --- Changes since v2: - Added detailed description to tests/generic/999 explaining the purpose of the test (Eryu). - Added _require_xfs_io_command for falloc, fpunch, fcollapse and fzero to account for filesystems that don't support these commands (Eryu). - Incorporated other feedback from Eryu. Thanks, Eryu, for the review. --- .gitignore | 1 + src/Makefile | 2 +- src/t_mmap_collision.c | 235 + tests/generic/999 | 64 ++ tests/generic/999.out | 2 + tests/generic/group| 1 + 6 files changed, 304 insertions(+), 1 deletion(-) create mode 100644 src/t_mmap_collision.c create mode 100755 tests/generic/999 create mode 100644 tests/generic/999.out diff --git a/.gitignore b/.gitignore index efc73a7c..ea1aac8a 100644 --- a/.gitignore +++ b/.gitignore @@ -125,6 +125,7 @@ /src/t_holes /src/t_immutable /src/t_locks_execve +/src/t_mmap_collision /src/t_mmap_cow_race /src/t_mmap_dio /src/t_mmap_fallocate diff --git a/src/Makefile b/src/Makefile index 9e971bcc..41826585 100644 --- a/src/Makefile +++ b/src/Makefile @@ -16,7 +16,7 @@ TARGETS = dirstress fill fill2 getpagesize holes lstat64 \ holetest t_truncate_self t_mmap_dio af_unix t_mmap_stale_pmd \ t_mmap_cow_race t_mmap_fallocate fsync-err t_mmap_write_ro \ t_ext4_dax_journal_corruption t_ext4_dax_inline_corruption \ - t_ofd_locks t_locks_execve + t_ofd_locks t_locks_execve t_mmap_collision LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize preallo_rw_pattern_reader \ preallo_rw_pattern_writer ftrunc trunc fs_perms testx looptest \ diff --git a/src/t_mmap_collision.c b/src/t_mmap_collision.c new file mode 100644 index ..d547bc05 --- /dev/null +++ b/src/t_mmap_collision.c @@ -0,0 +1,235 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2018 Intel Corporation. + * + * As of kernel version 4.18-rc6 Linux has an issue with ext4+DAX where DMA + * and direct I/O operations aren't synchronized with respect to operations + * which can change the block mappings of an inode. This means that we can + * schedule an I/O for an inode and have the block mapping for that inode + * change before the I/O is actually complete. So, blocks which were once + * allocated to a given inode and then freed could still have I/O operations + * happening to them. If these blocks have also been reallocated to a + * different inode, this interaction can lead to data corruption. + * + * This test exercises four of the paths in ext4 which hit this issue. + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define PAGE(a) ((a)*0x1000) +#define FILE_SIZE PAGE(4) + +void *dax_data; +int nodax_fd; +int dax_fd; +bool done; + +#define err_exit(op) \ +{ \ + fprintf(stderr, "%s %s: %s\n", __func__, op, strerror(errno));\ + exit(1); \ +} + +#if defined(FALLOC_FL_PUNCH_HOLE) && defined(FALLOC_FL_KEEP_SIZE) +void punch_hole_fn(void *ptr) +{ + ssize_t read; + int rc; + + while (!done) { + read = 0; + + do { + rc = pread(nodax_fd, dax_data + read, FILE_SIZE - read, + read); + if (rc > 0) + read += rc; + } while (rc > 0); + + if (read != FILE_SIZE || rc != 0) + err_exit("pread"); + + rc = fallocate(dax_fd, + FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, + 0, FILE_SIZE); + if (rc < 0) + err_exit("fallocate"); + + usleep(rand() % 1000); + } +} +#else +void punch_hole_fn(void *ptr) { } +#endif + +#if defined(FALLOC_FL_ZERO_RANGE) && defined(FALLOC_FL_KEEP_SIZE) +void zero_range_fn(void *ptr) +{
Re: [PATCH v2 5/6] md/dm-writecache: Don't request pointer dummy_addr when not required
On Wed, Jul 25 2018 at 12:28pm -0400, Huaisheng Ye wrote: > From: Huaisheng Ye > > Function persistent_memory_claim doesn't need to get local pointer > dummy_addr from direct_access. Using NULL instead of having to pass > in a useless local pointer that caller then just throw away. > > Suggested-by: Ross Zwisler > Signed-off-by: Huaisheng Ye Acked-by: Mike Snitzer ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm
Re: [PATCH] acpi/nfit: queue issuing of ars when an uc error notification comes in
On Fri, 2018-07-27 at 09:04 -0700, Dave Jiang wrote: > When the ACPI UC error notifier gets called and ARS_REQ bit is set > with the passed in flag, we can receive -EBUSY if ARS_REQ bit is already > set for the nfit_spa->ars_state. When that happens, the ARS request is > dropped. That can potentially cause us to miss the unreported errors that > the on going ARS request does not receive. Add an ARS_REQ_REDO state that > will request short ARS upon ARS completion to grab any errors we missed. > > Signed-off-by: Dave Jiang > --- > drivers/acpi/nfit/core.c | 12 +--- > drivers/acpi/nfit/nfit.h |1 + > 2 files changed, 10 insertions(+), 3 deletions(-) Looks good to me, you can add: Reviewed-by: Vishal Verma > > diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c > index 274f636cd3d2..def64259f9a7 100644 > --- a/drivers/acpi/nfit/core.c > +++ b/drivers/acpi/nfit/core.c > @@ -2567,7 +2567,12 @@ static void ars_complete(struct acpi_nfit_desc > *acpi_desc, > test_bit(ARS_SHORT, _spa->ars_state) > ? "short" : "long"); > clear_bit(ARS_SHORT, _spa->ars_state); > - set_bit(ARS_DONE, _spa->ars_state); > + if (test_and_clear_bit(ARS_REQ_REDO, _spa->ars_state)) { > + set_bit(ARS_SHORT, _spa->ars_state); > + set_bit(ARS_REQ, _spa->ars_state); > + dev_dbg(dev, "ARS: processing scrub request received while in > progress\n"); > + } else > + set_bit(ARS_DONE, _spa->ars_state); > } > > static int ars_status_process_records(struct acpi_nfit_desc *acpi_desc) > @@ -3264,9 +3269,10 @@ int acpi_nfit_ars_rescan(struct acpi_nfit_desc > *acpi_desc, unsigned long flags) > if (test_bit(ARS_FAILED, _spa->ars_state)) > continue; > > - if (test_and_set_bit(ARS_REQ, _spa->ars_state)) > + if (test_and_set_bit(ARS_REQ, _spa->ars_state)) { > busy++; > - else { > + set_bit(ARS_REQ_REDO, _spa->ars_state); > + } else { > if (test_bit(ARS_SHORT, )) > set_bit(ARS_SHORT, _spa->ars_state); > scheduled++; > diff --git a/drivers/acpi/nfit/nfit.h b/drivers/acpi/nfit/nfit.h > index a97ff42fe311..d1274ea2d251 100644 > --- a/drivers/acpi/nfit/nfit.h > +++ b/drivers/acpi/nfit/nfit.h > @@ -119,6 +119,7 @@ enum nfit_dimm_notifiers { > > enum nfit_ars_state { > ARS_REQ, > + ARS_REQ_REDO, > ARS_DONE, > ARS_SHORT, > ARS_FAILED, > ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm
[ndctl PATCH] ndctl, inject-smart: Fix man page to match the current behavior
The inject-smart man page had a -H option for health status, which was from a previous development iteration, and not how it actually works. Fix it to advertise a -f flag for setting fatal status, and remove an unnecessart '=' from the unsafe-shutdown option as it doesn't accept any additional arguments. Signed-off-by: Vishal Verma --- Documentation/ndctl/ndctl-inject-smart.txt | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/Documentation/ndctl/ndctl-inject-smart.txt b/Documentation/ndctl/ndctl-inject-smart.txt index 0dc6481..20a1621 100644 --- a/Documentation/ndctl/ndctl-inject-smart.txt +++ b/Documentation/ndctl/ndctl-inject-smart.txt @@ -77,13 +77,12 @@ OPTIONS --spares-alarm=:: Enable or disable the smart spares alarm. Options are 'on' or 'off'. --H:: ---health=:: - Smart attribute for health status. Provide either 'fatal' or 'nominal' - to set the state of the attribute. +-f:: +--fatal:: + Set the flag to spoof fatal health status. -U:: ---unsafe-shutdown=:: +--unsafe-shutdown:: Set the flag to spoof an unsafe shutdown on the next power down. -v:: -- 2.14.4 ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm
[ndctl PATCH] ndctl: deprecate undocumented short-options
The inject-smart and monitor man pages refrained from displaying short options for various arguments (alarms, daemon) due to a lack of a coherent letter that could be made to relate to the event name. It was expected that the user will always use the long option for these. Since the OPT_STRING helper refused to take an empty string for the short option, we used bogus characters for each of them. However there is a better way to provide no short options, and that is by using '\0' for the short option field to OPT_STRING. Replace the bogus characters with '\0' so that the 'short help' also becomes consistent with the man pages. Cc: Cc: QI Fuli Signed-off-by: Vishal Verma --- ndctl/inject-smart.c | 7 --- ndctl/monitor.c | 2 +- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/ndctl/inject-smart.c b/ndctl/inject-smart.c index 006ea2a..edb87e1 100644 --- a/ndctl/inject-smart.c +++ b/ndctl/inject-smart.c @@ -73,7 +73,7 @@ OPT_STRING('M', "media-temperature-threshold", \ _temperature_threshold, \ "set smart media temperature threshold", \ "set threshold value for smart media temperature"), \ -OPT_STRING('x', "media-temperature-alarm", _temperature_alarm, \ +OPT_STRING('\0', "media-temperature-alarm", _temperature_alarm, \ "smart media temperature alarm", \ "enable or disable the smart media temperature alarm"), \ OPT_STRING('c', "ctrl-temperature", _temperature, \ @@ -83,7 +83,7 @@ OPT_STRING('C', "ctrl-temperature-threshold", \ _temperature_threshold, \ "set smart controller temperature threshold", \ "set threshold value for smart controller temperature"), \ -OPT_STRING('y', "ctrl-temperature-alarm", _temperature_alarm, \ +OPT_STRING('\0', "ctrl-temperature-alarm", _temperature_alarm, \ "smart controller temperature alarm", \ "enable or disable the smart controller temperature alarm"), \ OPT_STRING('s', "spares", , \ @@ -92,13 +92,14 @@ OPT_STRING('s', "spares", , \ OPT_STRING('S', "spares-threshold", _threshold, \ "set smart spares threshold", \ "set a threshold value for smart spares"), \ -OPT_STRING('z', "spares-alarm", _alarm, \ +OPT_STRING('\0', "spares-alarm", _alarm, \ "smart spares alarm", \ "enable or disable the smart spares alarm"), \ OPT_BOOLEAN('f', "fatal", , "inject fatal smart health status"), \ OPT_BOOLEAN('U', "unsafe-shutdown", _shutdown, \ "inject smart unsafe shutdown status") + static const struct option smart_opts[] = { SMART_OPTIONS(), OPT_END(), diff --git a/ndctl/monitor.c b/ndctl/monitor.c index b97d1ea..c6419ad 100644 --- a/ndctl/monitor.c +++ b/ndctl/monitor.c @@ -583,7 +583,7 @@ int cmd_monitor(int argc, const char **argv, void *ctx) "where to output the monitor's notification"), OPT_FILENAME('c', "config-file", _file, "config-file", "override the default config"), - OPT_BOOLEAN('x', "daemon", , + OPT_BOOLEAN('\0', "daemon", , "run ndctl monitor as a daemon"), OPT_BOOLEAN('u', "human", , "use human friendly output formats"), -- 2.14.4 ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm
[ndctl PATCH] ndctl, documentation: document the label-version option for init-labels
Add the missing --label-verison option to the ndctl init-labels man page. Signed-off-by: Vishal Verma --- Documentation/ndctl/ndctl-init-labels.txt | 5 + 1 file changed, 5 insertions(+) diff --git a/Documentation/ndctl/ndctl-init-labels.txt b/Documentation/ndctl/ndctl-init-labels.txt index 736d52d..733ef0e 100644 --- a/Documentation/ndctl/ndctl-init-labels.txt +++ b/Documentation/ndctl/ndctl-init-labels.txt @@ -83,6 +83,11 @@ include::labels-options.txt[] be an existing / valid namespace index. Warning, this will destroy all defined namespaces on the dimm. +-V:: +--label-version:: + Initialize with a specific version of labels from the namespace + label specification. Defaults to 1.1 + include::../copyright.txt[] SEE ALSO -- 2.14.4 ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm
Re: [PATCH v4 0/2] ext4: fix DAX dma vs truncate/hole-punch
+ fsdevel and the xfs list. On Wed, Jul 25, 2018 at 4:28 PM Ross Zwisler wrote: > On Wed, Jul 11, 2018 at 10:17:41AM +0200, Jan Kara wrote: > > On Tue 10-07-18 13:10:29, Ross Zwisler wrote: > > > Changes since v3: > > > * Added an ext4_break_layouts() call to ext4_insert_range() to ensure > > >that the {ext4,xfs}_break_layouts() calls have the same meaning. > > >(Dave, Darrick and Jan) > > > > How about the occasional WARN_ON_ONCE you mention below. Were you able to > > hunt them down? > > The root cause of this issue is that while the ei->i_mmap_sem provides > synchronization between ext4_break_layouts() and page faults, it doesn't > provide synchronize us with the direct I/O path. This exact same issue exists > in XFS AFAICT, with the synchronization tool there being the XFS_MMAPLOCK. > > This allows the direct I/O path to do I/O and raise & lower page->_refcount > while we're executing a truncate/hole punch. This leads to us trying to free > a page with an elevated refcount. > > Here's one instance of the race: > > CPU 0 CPU 1 > - - > ext4_punch_hole() > ext4_break_layouts() # all pages have refcount=1 > > ext4_direct_IO() > ... lots of layers ... > follow_page_pte() > get_page() # elevates refcount > > truncate_pagecache_range() >... a few layers ... >dax_disassociate_entry() # sees elevated refcount, WARN_ON_ONCE() > > A similar race occurs when the refcount is being dropped while we're running > ext4_break_layouts(), and this is the one that my test was actually hitting: > > CPU 0 CPU 1 > - - > ext4_direct_IO() > ... lots of layers ... > follow_page_pte() > get_page() > # elevates refcount of page X > ext4_punch_hole() > ext4_break_layouts() # two pages, X & Y, have refcount == 2 > __wait_var_event() # called for page X > > __put_devmap_managed_page() > # drops refcount of X to 1 > ># __wait_var_events() checks X's refcount in "if (condition)", and breaks. ># We never actually called ext4_wait_dax_page(), so 'retry' in ># ext4_break_layouts() is still false. Exit do/while loop in ># ext4_break_layouts, never attempting to wait on page Y which still has an ># elevated refcount of 2. > > truncate_pagecache_range() >... a few layers ... >dax_disassociate_entry() # sees elevated refcount for Y, WARN_ON_ONCE() > > This second race can be fixed with the patch at the end of this function, > which I think should go in, unless there is a benfit to the current retry > scheme which relies on the 'retry' variable in {ext4,xfs}_break_layouts()? > With this patch applied I've been able to run my unit test through > thousands of iterations, where it used to failed consistently within 10 or > so. > > Even so, I wonder if the real solution is to add synchronization between > the direct I/O path and {ext4,xfs}_break_layouts()? Other ideas on how > this should be handled? > > --- >8 --- > > From a4519b0f40362f0a63ae96acaf986092aff0f0d3 Mon Sep 17 00:00:00 2001 > From: Ross Zwisler > Date: Wed, 25 Jul 2018 16:16:05 -0600 > Subject: [PATCH] ext4: Close race between direct IO and ext4_break_layouts() > > If the refcount of a page is lowered between the time that it is returned > by dax_busy_page() and when the refcount is again checked in > ext4_break_layouts() => ___wait_var_event(), the waiting function > ext4_wait_dax_page() will never be called. This means that > ext4_break_layouts() will still have 'retry' set to false, so we'll stop > looping and never check the refcount of other pages in this inode. > > Instead, always continue looping as long as dax_layout_busy_page() gives us > a page which it found with an elevated refcount. > > Note that this works around the race exposed by my unit test, but I think > that there is another race that needs to be addressed, probably with > additional synchronization added between direct I/O and > {ext4,xfs}_break_layouts(). > > Signed-off-by: Ross Zwisler > --- > fs/ext4/inode.c | 9 +++-- > 1 file changed, 3 insertions(+), 6 deletions(-) > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index 27e9643bc13b..fedb88104bbf 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -4193,9 +4193,8 @@ int ext4_update_disksize_before_punch(struct inode > *inode, loff_t offset, > return 0; > } > > -static void ext4_wait_dax_page(struct ext4_inode_info *ei, bool *did_unlock) > +static void
[PATCH] acpi/nfit: queue issuing of ars when an uc error notification comes in
When the ACPI UC error notifier gets called and ARS_REQ bit is set with the passed in flag, we can receive -EBUSY if ARS_REQ bit is already set for the nfit_spa->ars_state. When that happens, the ARS request is dropped. That can potentially cause us to miss the unreported errors that the on going ARS request does not receive. Add an ARS_REQ_REDO state that will request short ARS upon ARS completion to grab any errors we missed. Signed-off-by: Dave Jiang --- drivers/acpi/nfit/core.c | 12 +--- drivers/acpi/nfit/nfit.h |1 + 2 files changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c index 274f636cd3d2..def64259f9a7 100644 --- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -2567,7 +2567,12 @@ static void ars_complete(struct acpi_nfit_desc *acpi_desc, test_bit(ARS_SHORT, _spa->ars_state) ? "short" : "long"); clear_bit(ARS_SHORT, _spa->ars_state); - set_bit(ARS_DONE, _spa->ars_state); + if (test_and_clear_bit(ARS_REQ_REDO, _spa->ars_state)) { + set_bit(ARS_SHORT, _spa->ars_state); + set_bit(ARS_REQ, _spa->ars_state); + dev_dbg(dev, "ARS: processing scrub request received while in progress\n"); + } else + set_bit(ARS_DONE, _spa->ars_state); } static int ars_status_process_records(struct acpi_nfit_desc *acpi_desc) @@ -3264,9 +3269,10 @@ int acpi_nfit_ars_rescan(struct acpi_nfit_desc *acpi_desc, unsigned long flags) if (test_bit(ARS_FAILED, _spa->ars_state)) continue; - if (test_and_set_bit(ARS_REQ, _spa->ars_state)) + if (test_and_set_bit(ARS_REQ, _spa->ars_state)) { busy++; - else { + set_bit(ARS_REQ_REDO, _spa->ars_state); + } else { if (test_bit(ARS_SHORT, )) set_bit(ARS_SHORT, _spa->ars_state); scheduled++; diff --git a/drivers/acpi/nfit/nfit.h b/drivers/acpi/nfit/nfit.h index a97ff42fe311..d1274ea2d251 100644 --- a/drivers/acpi/nfit/nfit.h +++ b/drivers/acpi/nfit/nfit.h @@ -119,6 +119,7 @@ enum nfit_dimm_notifiers { enum nfit_ars_state { ARS_REQ, + ARS_REQ_REDO, ARS_DONE, ARS_SHORT, ARS_FAILED, ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm