[PATCH 1/6] libnvdimm: Add of_node to region and bus descriptors

2018-03-23 Thread Oliver O'Halloran
We want to be able to cross reference the region and bus devices
with the device tree node that they were spawned from. libNVDIMM
handles creating the actual devices for these internally, so we
need to pass in a pointer to the relevant node in the descriptor.

Signed-off-by: Oliver O'Halloran 
Acked-by: Dan Williams 
---
 drivers/nvdimm/bus.c | 1 +
 drivers/nvdimm/region_devs.c | 1 +
 include/linux/libnvdimm.h | 3 +++
 3 files changed, 5 insertions(+)

diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index 78eabc3a1ab1..c6106914f396 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -358,6 +358,7 @@ struct nvdimm_bus *nvdimm_bus_register(struct device *parent,
nvdimm_bus->dev.release = nvdimm_bus_release;
nvdimm_bus->dev.groups = nd_desc->attr_groups;
nvdimm_bus->dev.bus = &nvdimm_bus_type;
+   nvdimm_bus->dev.of_node = nd_desc->of_node;
dev_set_name(&nvdimm_bus->dev, "ndbus%d", nvdimm_bus->id);
rc = device_register(&nvdimm_bus->dev);
if (rc) {
diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index e6d01911e092..2f1d5771100e 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -1005,6 +1005,7 @@ static struct nd_region *nd_region_create(struct nvdimm_bus *nvdimm_bus,
dev->parent = &nvdimm_bus->dev;
dev->type = dev_type;
dev->groups = ndr_desc->attr_groups;
+   dev->of_node = ndr_desc->of_node;
nd_region->ndr_size = resource_size(ndr_desc->res);
nd_region->ndr_start = ndr_desc->res->start;
nd_device_register(dev);
diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
index ff855ed965fb..f61cb5050297 100644
--- a/include/linux/libnvdimm.h
+++ b/include/linux/libnvdimm.h
@@ -76,12 +76,14 @@ typedef int (*ndctl_fn)(struct nvdimm_bus_descriptor *nd_desc,
struct nvdimm *nvdimm, unsigned int cmd, void *buf,
unsigned int buf_len, int *cmd_rc);
 
+struct device_node;
 struct nvdimm_bus_descriptor {
const struct attribute_group **attr_groups;
unsigned long bus_dsm_mask;
unsigned long cmd_mask;
struct module *module;
char *provider_name;
+   struct device_node *of_node;
ndctl_fn ndctl;
int (*flush_probe)(struct nvdimm_bus_descriptor *nd_desc);
int (*clear_to_send)(struct nvdimm_bus_descriptor *nd_desc,
@@ -123,6 +125,7 @@ struct nd_region_desc {
int num_lanes;
int numa_node;
unsigned long flags;
+   struct device_node *of_node;
 };
 
 struct device;
-- 
2.9.5

___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


[PATCH 2/6] libnvdimm: Add nd_region_destroy()

2018-03-23 Thread Oliver O'Halloran
Currently there's no way to remove a region from an nvdimm_bus without
tearing down the whole bus. This patch adds an API for removing a single
region from the bus so that we can implement a sensible unbind operation
for the of_nd_region platform driver.

Signed-off-by: Oliver O'Halloran 
---
 drivers/nvdimm/region_devs.c | 6 ++
 include/linux/libnvdimm.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index 2f1d5771100e..76f46fd1fae0 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -1039,6 +1039,12 @@ struct nd_region *nvdimm_blk_region_create(struct nvdimm_bus *nvdimm_bus,
 }
 EXPORT_SYMBOL_GPL(nvdimm_blk_region_create);
 
+void nd_region_destroy(struct nd_region *region)
+{
+   nd_device_unregister(&region->dev, ND_SYNC);
+}
+EXPORT_SYMBOL_GPL(nd_region_destroy);
+
 struct nd_region *nvdimm_volatile_region_create(struct nvdimm_bus *nvdimm_bus,
struct nd_region_desc *ndr_desc)
 {
diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
index f61cb5050297..df21ca176e98 100644
--- a/include/linux/libnvdimm.h
+++ b/include/linux/libnvdimm.h
@@ -192,6 +192,7 @@ struct nd_region *nvdimm_blk_region_create(struct nvdimm_bus *nvdimm_bus,
struct nd_region_desc *ndr_desc);
 struct nd_region *nvdimm_volatile_region_create(struct nvdimm_bus *nvdimm_bus,
struct nd_region_desc *ndr_desc);
+void nd_region_destroy(struct nd_region *region);
 void *nd_region_provider_data(struct nd_region *nd_region);
 void *nd_blk_region_provider_data(struct nd_blk_region *ndbr);
 void nd_blk_region_set_provider_data(struct nd_blk_region *ndbr, void *data);
-- 
2.9.5



[PATCH 3/6] libnvdimm: Add device-tree based driver

2018-03-23 Thread Oliver O'Halloran
This patch adds preliminary device-tree bindings for the NVDIMM driver.
Currently this only supports one bus (created at probe time) to which all
regions are added, with individual regions created by a platform
device driver.

Signed-off-by: Oliver O'Halloran 
---
I suspect the platform driver should be holding a reference to the
created region. I left that out here since previously Dan has said
he'd rather keep the struct device internal to libnvdimm and the only
other way a region device can disappear is when the bus is unregistered.
---
 MAINTAINERS|   8 +++
 drivers/nvdimm/Kconfig |  10 
 drivers/nvdimm/Makefile|   1 +
 drivers/nvdimm/of_nvdimm.c | 130 +
 4 files changed, 149 insertions(+)
 create mode 100644 drivers/nvdimm/of_nvdimm.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 4e62756936fa..e3fc47fbfc7a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8035,6 +8035,14 @@ Q:   https://patchwork.kernel.org/project/linux-nvdimm/list/
 S: Supported
 F: drivers/nvdimm/pmem*
 
+LIBNVDIMM: DEVICETREE BINDINGS
+M: Oliver O'Halloran 
+L: linux-nvdimm@lists.01.org
+Q: https://patchwork.kernel.org/project/linux-nvdimm/list/
+S: Supported
+F: drivers/nvdimm/of_nvdimm.c
+F: Documentation/devicetree/bindings/nvdimm/nvdimm-bus.txt
+
 LIBNVDIMM: NON-VOLATILE MEMORY DEVICE SUBSYSTEM
 M: Dan Williams 
 L: linux-nvdimm@lists.01.org
diff --git a/drivers/nvdimm/Kconfig b/drivers/nvdimm/Kconfig
index a65f2e1d9f53..505a9bbbe49f 100644
--- a/drivers/nvdimm/Kconfig
+++ b/drivers/nvdimm/Kconfig
@@ -102,4 +102,14 @@ config NVDIMM_DAX
 
  Select Y if unsure
 
+config OF_NVDIMM
+   tristate "Device-tree support for NVDIMMs"
+   depends on OF
+   default LIBNVDIMM
+   help
+ Allows byte addressable persistent memory regions to be described in the
+ device-tree.
+
+ Select Y if unsure.
+
 endif
diff --git a/drivers/nvdimm/Makefile b/drivers/nvdimm/Makefile
index 70d5f3ad9909..fd6a5838aa25 100644
--- a/drivers/nvdimm/Makefile
+++ b/drivers/nvdimm/Makefile
@@ -4,6 +4,7 @@ obj-$(CONFIG_BLK_DEV_PMEM) += nd_pmem.o
 obj-$(CONFIG_ND_BTT) += nd_btt.o
 obj-$(CONFIG_ND_BLK) += nd_blk.o
 obj-$(CONFIG_X86_PMEM_LEGACY) += nd_e820.o
+obj-$(CONFIG_OF_NVDIMM) += of_nvdimm.o
 
 nd_pmem-y := pmem.o
 
diff --git a/drivers/nvdimm/of_nvdimm.c b/drivers/nvdimm/of_nvdimm.c
new file mode 100644
index ..79c28291f420
--- /dev/null
+++ b/drivers/nvdimm/of_nvdimm.c
@@ -0,0 +1,130 @@
+// SPDX-License-Identifier: GPL-2.0+
+
+#define pr_fmt(fmt) "of_nvdimm: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * Container bus stuff.  For now we just chunk regions into a default
+ * bus with no ndctl support. In the future we'll add some mechanism
+ * for dispatching regions into the correct bus type, but this is useful
+ * for now.
+ */
+struct nvdimm_bus_descriptor bus_desc;
+struct nvdimm_bus *bus;
+
+/* region driver */
+
+static const struct attribute_group *region_attr_groups[] = {
+   &nd_region_attribute_group,
+   &nd_device_attribute_group,
+   NULL,
+};
+
+static const struct attribute_group *bus_attr_groups[] = {
+   &nvdimm_bus_attribute_group,
+   NULL,
+};
+
+static int of_nd_region_probe(struct platform_device *pdev)
+{
+   struct nd_region_desc ndr_desc;
+   struct resource temp_res;
+   struct nd_region *region;
+   struct device_node *np;
+
+   np = dev_of_node(&pdev->dev);
+   if (!np)
+   return -ENXIO;
+
+   pr_err("registering region for %pOF\n", np);
+
+   if (of_address_to_resource(np, 0, &temp_res)) {
+   pr_warn("Unable to parse reg[0] for %pOF\n", np);
+   return -ENXIO;
+   }
+
+   memset(&ndr_desc, 0, sizeof(ndr_desc));
+   ndr_desc.res = &temp_res;
+   ndr_desc.of_node = np;
+   ndr_desc.attr_groups = region_attr_groups;
+   ndr_desc.numa_node = of_node_to_nid(np);
+   set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
+
+   /*
+* NB: libnvdimm copies the data from ndr_desc into its own structures
+* so passing stack pointers is fine.
+*/
+   if (of_get_property(np, "volatile", NULL))
+   region = nvdimm_volatile_region_create(bus, &ndr_desc);
+   else
+   region = nvdimm_pmem_region_create(bus, &ndr_desc);
+
+   pr_warn("registered pmem region %px\n", region);
+   if (!region)
+   return -ENXIO;
+
+   platform_set_drvdata(pdev, region);
+
+   return 0;
+}
+
+static int of_nd_region_remove(struct platform_device *pdev)
+{
+   struct nd_region *r = platform_get_drvdata(pdev);
+
+   nd_region_destroy(r);
+
+   return 0;
+}
+
+static const struct of_device_id of_nd_region_match[] = {
+   { .compatible = "nvdimm-region" },
+   { },
+};
+
+static struct platform_driver of_nd_region_driver = {
+  

[PATCH 4/6] libnvdimm/of: Symlink platform and region devices

2018-03-23 Thread Oliver O'Halloran
Add a direct link between the region and the platform device that
created it.

Signed-off-by: Oliver O'Halloran 
---
 drivers/nvdimm/of_nvdimm.c   | 11 +++
 drivers/nvdimm/region_devs.c | 13 +
 include/linux/libnvdimm.h |  1 +
 3 files changed, 25 insertions(+)

diff --git a/drivers/nvdimm/of_nvdimm.c b/drivers/nvdimm/of_nvdimm.c
index 79c28291f420..28f4ca23a690 100644
--- a/drivers/nvdimm/of_nvdimm.c
+++ b/drivers/nvdimm/of_nvdimm.c
@@ -37,6 +37,7 @@ static int of_nd_region_probe(struct platform_device *pdev)
struct resource temp_res;
struct nd_region *region;
struct device_node *np;
+   int rc;
 
np = dev_of_node(>dev);
if (!np)
@@ -71,6 +72,15 @@ static int of_nd_region_probe(struct platform_device *pdev)
 
platform_set_drvdata(pdev, region);
 
+   /*
+* Add a symlink to the ndbus region object. Without this there's no
+* simple way to go from the platform device to the region it spawned.
+*/
+   rc = sysfs_create_link(&pdev->dev.kobj,
+   nd_region_kobj(region), "region");
+   if (rc)
+   pr_warn("Failed to create symlink to region (rc = %d)!\n", rc);
+
return 0;
 }
 
@@ -78,6 +88,7 @@ static int of_nd_region_remove(struct platform_device *pdev)
 {
struct nd_region *r = platform_get_drvdata(pdev);
 
+   sysfs_delete_link(&pdev->dev.kobj, nd_region_kobj(r), "region");
nd_region_destroy(r);
 
return 0;
diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index 76f46fd1fae0..af09acc1d93b 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -1054,6 +1054,19 @@ struct nd_region *nvdimm_volatile_region_create(struct nvdimm_bus *nvdimm_bus,
 }
 EXPORT_SYMBOL_GPL(nvdimm_volatile_region_create);
 
+struct kobject *nd_region_kobj(struct nd_region *region)
+{
+   /*
+* region init is async so we need to explicitly synchronise
+* to prevent handing out a kobj reference before device_add()
+* has been run
+*/
+   nd_synchronize();
+
+   return &region->dev.kobj;
+}
+EXPORT_SYMBOL_GPL(nd_region_kobj);
+
 /**
  * nvdimm_flush - flush any posted write queues between the cpu and pmem media
  * @nd_region: blk or interleaved pmem region
diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
index df21ca176e98..a4b3663bac38 100644
--- a/include/linux/libnvdimm.h
+++ b/include/linux/libnvdimm.h
@@ -172,6 +172,7 @@ struct nvdimm_bus_descriptor *to_nd_desc(struct nvdimm_bus *nvdimm_bus);
 struct device *to_nvdimm_bus_dev(struct nvdimm_bus *nvdimm_bus);
 const char *nvdimm_name(struct nvdimm *nvdimm);
 struct kobject *nvdimm_kobj(struct nvdimm *nvdimm);
+struct kobject *nd_region_kobj(struct nd_region *region);
 unsigned long nvdimm_cmd_mask(struct nvdimm *nvdimm);
 void *nvdimm_provider_data(struct nvdimm *nvdimm);
struct nvdimm *nvdimm_create(struct nvdimm_bus *nvdimm_bus, void *provider_data,
-- 
2.9.5



[PATCH 6/6] doc/devicetree: NVDIMM region documentation

2018-03-23 Thread Oliver O'Halloran
Add device-tree binding documentation for the nvdimm region driver.

Cc: devicet...@vger.kernel.org
Signed-off-by: Oliver O'Halloran 
---
 .../devicetree/bindings/nvdimm/nvdimm-region.txt   | 45 ++
 1 file changed, 45 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/nvdimm/nvdimm-region.txt

diff --git a/Documentation/devicetree/bindings/nvdimm/nvdimm-region.txt b/Documentation/devicetree/bindings/nvdimm/nvdimm-region.txt
new file mode 100644
index ..02091117ff16
--- /dev/null
+++ b/Documentation/devicetree/bindings/nvdimm/nvdimm-region.txt
@@ -0,0 +1,45 @@
+Device-tree bindings for NVDIMM memory regions
+----------------------------------------------
+
+Non-volatile DIMMs are memory modules used to provide (cacheable) main memory
+that retains its contents across power cycles. In more practical terms, they
+are a kind of storage device whose contents can be accessed by the CPU
+directly, rather than indirectly via a storage controller or similar. An
+nvdimm-region specifies a physical address range that is hosted on an NVDIMM
+device.
+
+Bindings for the region nodes:
+------------------------------
+
+Required properties:
+   - compatible = "nvdimm-region"
+
+   - reg = ;
+   The system physical address range of this nvdimm region.
+
+Optional properties:
+   - Any relevant NUMA associativity properties for the target platform.
+   - A "volatile" property indicating that this region is actually in
+ normal DRAM and does not require cache flushes after each write.
+
+A complete example:
+
+-------------------
+/ {
+   #size-cells = <2>;
+   #address-cells = <2>;
+
+   platform {
+   region@5000 {
+   compatible = "nvdimm-region";
+   reg = <0x0001 0x 0x 0x4000>;
+
+   };
+
+   region@6000 {
+   compatible = "nvdimm-region";
+   reg = <0x0001 0x 0x 0x4000>;
+   volatile;
+   };
+   };
+};
-- 
2.9.5



[PATCH 5/6] powerpc/powernv: Create platform devs for nvdimm buses

2018-03-23 Thread Oliver O'Halloran
Scan the devicetree for nvdimm-bus compatible nodes and create
a platform device for each.

Signed-off-by: Oliver O'Halloran 
---
 arch/powerpc/platforms/powernv/opal.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c
index c15182765ff5..a16f4b63ccf2 100644
--- a/arch/powerpc/platforms/powernv/opal.c
+++ b/arch/powerpc/platforms/powernv/opal.c
@@ -821,6 +821,9 @@ static int __init opal_init(void)
/* Create i2c platform devices */
opal_pdev_init("ibm,opal-i2c");
 
+   /* Handle non-volatile memory devices */
+   opal_pdev_init("nvdimm-region");
+
/* Setup a heartbeat thread if requested by OPAL */
opal_init_heartbeat();
 
-- 
2.9.5



[ndctl PATCH] ndctl, documentation: update copyright

2018-03-23 Thread Dan Williams
Use SPDX identifiers in the documentation source, and move the
"Copyright" boilerplate that is emitted into the man page into a common
header file.

Signed-off-by: Dan Williams 
---
 Documentation/copyright.txt |8 
 Documentation/daxctl/Makefile.am|3 ++-
 Documentation/daxctl/daxctl-list.txt|9 +++--
 Documentation/daxctl/daxctl.txt |4 
 Documentation/ndctl/Makefile.am |3 ++-
 Documentation/ndctl/dimm-description.txt|2 ++
 Documentation/ndctl/human-option.txt|2 ++
 Documentation/ndctl/labels-description.txt  |2 ++
 Documentation/ndctl/labels-options.txt  |2 ++
 Documentation/ndctl/namespace-description.txt   |2 ++
 Documentation/ndctl/ndctl-check-labels.txt  |9 +++--
 Documentation/ndctl/ndctl-check-namespace.txt   |9 +++--
 Documentation/ndctl/ndctl-create-namespace.txt  |9 +++--
 Documentation/ndctl/ndctl-destroy-namespace.txt |9 +++--
 Documentation/ndctl/ndctl-disable-dimm.txt  |9 +++--
 Documentation/ndctl/ndctl-disable-namespace.txt |9 +++--
 Documentation/ndctl/ndctl-disable-region.txt|9 +++--
 Documentation/ndctl/ndctl-enable-dimm.txt   |9 +++--
 Documentation/ndctl/ndctl-enable-namespace.txt  |9 +++--
 Documentation/ndctl/ndctl-enable-region.txt |9 +++--
 Documentation/ndctl/ndctl-init-labels.txt   |9 +++--
 Documentation/ndctl/ndctl-inject-error.txt  |9 +++--
 Documentation/ndctl/ndctl-inject-smart.txt  |9 +++--
 Documentation/ndctl/ndctl-list.txt  |9 +++--
 Documentation/ndctl/ndctl-read-labels.txt   |9 +++--
 Documentation/ndctl/ndctl-update-firmware.txt   |9 +++--
 Documentation/ndctl/ndctl-write-labels.txt  |2 ++
 Documentation/ndctl/ndctl-zero-labels.txt   |9 +++--
 Documentation/ndctl/ndctl.txt   |9 +++--
 Documentation/ndctl/region-description.txt  |2 ++
 Documentation/ndctl/xable-dimm-options.txt  |2 ++
 Documentation/ndctl/xable-namespace-options.txt |2 ++
 Documentation/ndctl/xable-region-options.txt|2 ++
 33 files changed, 93 insertions(+), 116 deletions(-)
 create mode 100644 Documentation/copyright.txt

diff --git a/Documentation/copyright.txt b/Documentation/copyright.txt
new file mode 100644
index ..1d603645211c
--- /dev/null
+++ b/Documentation/copyright.txt
@@ -0,0 +1,8 @@
+// SPDX-License-Identifier: GPL-2.0
+
+COPYRIGHT
+---------
+Copyright (c) 2016 - 2018, Intel Corporation. License GPLv2: GNU GPL
+version 2 .  This is free software:
+you are free to change and redistribute it.  There is NO WARRANTY, to
+the extent permitted by law.
diff --git a/Documentation/daxctl/Makefile.am b/Documentation/daxctl/Makefile.am
index 5913c94ca3be..259dafd18e5e 100644
--- a/Documentation/daxctl/Makefile.am
+++ b/Documentation/daxctl/Makefile.am
@@ -22,6 +22,7 @@ CLEANFILES = $(man1_MANS)
 
 XML_DEPS = \
../../version.m4 \
+   ../copyright.txt \
Makefile \
asciidoc.conf
 
@@ -33,6 +34,6 @@ RM ?= rm -f
--unsafe -adaxctl_version=$(VERSION) -o $@+ $< && \
mv $@+ $@
 
-%.1: %.xml
+%.1: %.xml $(XML_DEPS)
$(AM_V_GEN)$(RM) $@ && \
$(XMLTO) -o . -m ../manpage-normal.xsl man $<
diff --git a/Documentation/daxctl/daxctl-list.txt b/Documentation/daxctl/daxctl-list.txt
index 6249645f4bbd..99ca4a1b4f28 100644
--- a/Documentation/daxctl/daxctl-list.txt
+++ b/Documentation/daxctl/daxctl-list.txt
@@ -1,3 +1,5 @@
+// SPDX-License-Identifier: GPL-2.0
+
 daxctl-list(1)
==============
 
@@ -93,9 +95,4 @@ OPTIONS
   "size":"30.57 GiB (32.83 GB)"
 }
 
-COPYRIGHT
----------
-Copyright (c) 2016 - 2017, Intel Corporation. License GPLv2: GNU GPL
-version 2 .  This is free software:
-you are free to change and redistribute it.  There is NO WARRANTY, to
-the extent permitted by law.
+include::../copyright.txt[]
diff --git a/Documentation/daxctl/daxctl.txt b/Documentation/daxctl/daxctl.txt
index 5da9746ea62a..f81b161c9771 100644
--- a/Documentation/daxctl/daxctl.txt
+++ b/Documentation/daxctl/daxctl.txt
@@ -1,3 +1,5 @@
+// SPDX-License-Identifier: GPL-2.0
+
 daxctl(1)
=========
 
@@ -27,6 +29,8 @@ the Linux kernel Device-DAX facility. This facility enables 
DAX mappings
 of performance / feature differentiated memory without need of a
 filesystem.
 
+include::../copyright.txt[]
+
 SEE ALSO
 
 linkdaxctl:ndctl-create-namespace[1],
diff --git a/Documentation/ndctl/Makefile.am b/Documentation/ndctl/Makefile.am
index 27b207698d5a..9acb4acd966a 100644
--- a/Documentation/ndctl/Makefile.am
+++ b/Documentation/ndctl/Makefile.am
@@ -40,6 +40,7 @@ XML_DEPS = \
../../version.m4 \
Makefile \
asciidoc.conf 

Re: [PATCH 2/6] libnvdimm: Add nd_region_destroy()

2018-03-23 Thread Dan Williams
On Fri, Mar 23, 2018 at 1:12 AM, Oliver O'Halloran  wrote:
> Currently there's no way to remove a region from an nvdimm_bus without
> tearing down the whole bus. This patch adds an API for removing a single
> region from the bus so that we can implement a sensible unbind operation
> for the of_nd_region platform driver.
>
> Signed-off-by: Oliver O'Halloran 
> ---
>  drivers/nvdimm/region_devs.c | 6 ++
>  include/linux/libnvdimm.h| 1 +
>  2 files changed, 7 insertions(+)
>
> diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
> index 2f1d5771100e..76f46fd1fae0 100644
> --- a/drivers/nvdimm/region_devs.c
> +++ b/drivers/nvdimm/region_devs.c
> @@ -1039,6 +1039,12 @@ struct nd_region *nvdimm_blk_region_create(struct nvdimm_bus *nvdimm_bus,
>  }
>  EXPORT_SYMBOL_GPL(nvdimm_blk_region_create);
>
> +void nd_region_destroy(struct nd_region *region)

Let's put this in the "nvdimm_" namespace so it pairs with the
nvdimm_*_region_create() apis.


Re: [PATCH 3/6] libnvdimm: Add device-tree based driver

2018-03-23 Thread Dan Williams
On Fri, Mar 23, 2018 at 1:12 AM, Oliver O'Halloran  wrote:
> This patch adds peliminary device-tree bindings for the NVDIMM driver.

*preliminary

> Currently this only supports one bus (created at probe time) which all
> regions are added to with individual regions being created by a platform
> device driver.
>
> Signed-off-by: Oliver O'Halloran 
> ---
> I suspect the platform driver should be holding a reference to the
> created region. I left that out here since previously Dan has said
> he'd rather keep the struct device internal to libnvdimm and the only
> other way a region device can disappear is when the bus is unregistered.

Hmm, but this still seems broken if bus teardown races region
teardown. I think the more natural model, given the way libnvdimm is
structured, to have each of these of_nd_region instances create their
own nvdimm bus. Thoughts?


Re: [PATCH 4/6] libnvdimm/of: Symlink platform and region devices

2018-03-23 Thread Dan Williams
On Fri, Mar 23, 2018 at 1:12 AM, Oliver O'Halloran  wrote:
> Add a direct link between the region and the platform device that
> created it.
>

This linking would not be needed if of_nd_regions each lived on their own bus.


Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-23 Thread Logan Gunthorpe


On 23/03/18 03:50 PM, Bjorn Helgaas wrote:
> Popping way up the stack, my original point was that I'm trying to
> remove restrictions on what devices can participate in peer-to-peer
> DMA.  I think it's fairly clear that in conventional PCI, any devices
> in the same PCI hierarchy, i.e., below the same host-to-PCI bridge,
> should be able to DMA to each other.

Yup, we are working on this.

> The routing behavior of PCIe is supposed to be compatible with
> conventional PCI, and I would argue that this effectively requires
> multi-function PCIe devices to have the internal routing required to
> avoid the route-to-self issue.

That would be very nice but many devices do not support the internal
route. We've had to work around this in the past and, as I mentioned
earlier, NVMe devices have a flag indicating support. However, if a
device wants to be involved in P2P it must support it and we can exclude
devices that don't support it by simply not enabling their drivers.

Logan


[ndctl PATCH] ndctl: fail NUMA filtering when unsupported

2018-03-23 Thread Ross Zwisler
For systems that don't support NUMA, numactl gives a loud and fatal error:

  # numactl -N 0 ls
  numactl: This system does not support NUMA policy

Follow this model in ndctl for NUMA based filtering:

  # ./ndctl/ndctl list --numa-node=0
Error: This system does not support NUMA

This is done instead of just quietly filtering out all dimms, regions and
namespaces because the NUMA node they were trying to match didn't exist in
the system.

libnuma tests whether NUMA is enabled via the get_mempolicy() syscall,
passing in all NULLs and 0s for arguments to always get the default policy.
See numa_available() in numa(3) and in the numactl source.

ndctl checks sysfs for the existence of the /sys/devices/system/node
directory to avoid a dependency on libnuma.  If we had a dependency on
libnuma we would have to choose whether this was fulfilled or not at
compile time, which would potentially mean that we could be on a
NUMA-enabled kernel but with an ndctl where NUMA support was disabled.
It's better to always have NUMA support in ndctl and only depend on the
kernel config.

I've inspected the code for both get_mempolicy() and the code that creates
the /sys/devices/system/node directory, and they both seem to completely
rely on CONFIG_NUMA being defined.  If CONFIG_NUMA is set, get_mempolicy()
will always be able to return a default policy and /sys/devices/system/node
will always exist.  Otherwise, both checks will always fail.  So, numactl
and ndctl should always agree on whether NUMA is supported on a given
system.

Signed-off-by: Ross Zwisler 
Suggested-by: Dan Williams 
---

v3: Changed back to checking /sys/devices/system/node instead of using
libnuma, and added more info to the changelog.
---
 util/filter.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/util/filter.c b/util/filter.c
index 291d7ed..6ab391a 100644
--- a/util/filter.c
+++ b/util/filter.c
@@ -14,7 +14,10 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -328,6 +331,13 @@ int util_filter_walk(struct ndctl_ctx *ctx, struct util_filter_ctx *fctx,
}
 
if (param->numa_node && strcmp(param->numa_node, "all") != 0) {
+   struct stat st;
+
+   if (stat("/sys/devices/system/node", &st) != 0) {
+   error("This system does not support NUMA");
+   return -EINVAL;
+   }
+
numa_node = strtol(param->numa_node, &end, 0);
if (end == param->numa_node || end[0]) {
error("invalid numa_node: '%s'\n", param->numa_node);
-- 
2.14.3



Re: [ndctl PATCH] ndctl: fail NUMA filtering when unsupported

2018-03-23 Thread Dan Williams
On Fri, Mar 23, 2018 at 4:08 PM, Ross Zwisler wrote:
> For systems that don't support NUMA, numactl gives a loud and fatal error:
>
>   # numactl -N 0 ls
>   numactl: This system does not support NUMA policy
>
> Follow this model in ndctl for NUMA based filtering:
>
>   # ./ndctl/ndctl list --numa-node=0
> Error: This system does not support NUMA
>
> This is done instead of just quietly filtering out all dimms, regions and
> namespaces because the NUMA node they were trying to match didn't exist in
> the system.
>
> libnuma tests whether NUMA is enabled via the get_mempolicy() syscall,
> passing in all NULLs and 0s for arguments to always get the default policy.
> See numa_available() in numa(3) and in the numactl source.
>
> ndctl checks sysfs for the existence of the /sys/devices/system/node
> directory to avoid a dependency on libnuma.  If we had a dependency on
> libnuma we would have to choose whether this was fulfilled or not at
> compile time, which would potentially mean that we could be on a
> NUMA-enabled kernel but with an ndctl where NUMA support was disabled.
> It's better to always have NUMA support in ndctl and only depend on the
> kernel config.
>
> I've inspected the code for both get_mempolicy() and the code that creates
> the /sys/devices/system/node directory, and they both seem to completely
> rely on CONFIG_NUMA being defined.  If CONFIG_NUMA is set, get_mempolicy()
> will always be able to return a default policy and /sys/devices/system/node
> will always exist.  Otherwise, both checks will always fail.  So, numactl
> and ndctl should always agree on whether NUMA is supported on a given
> system.
>
> Signed-off-by: Ross Zwisler 
> Suggested-by: Dan Williams 
> ---
>
> v3: Changed back to checking /sys/devices/system/node instead of using
> libnuma, and added more info to the changelog.

Looks good, applied.


Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-23 Thread Bjorn Helgaas
On Thu, Mar 22, 2018 at 10:57:32PM +0000, Stephen Bates wrote:
> >  I've seen the response that peers directly below a Root Port could not
> > DMA to each other through the Root Port because of the "route to self"
> > issue, and I'm not disputing that.  
> 
> Bjorn 
> 
> You asked me for a reference to RTS in the PCIe specification. As
> luck would have it I ended up in an Irish bar with Peter Onufryk
> this week at OCP Summit. We discussed the topic. It is not
> explicitly referred to as "Route to Self" and it's certainly not
> explicit (or obvious) but r6.2.8.1 of the PCIe 4.0 specification
> discusses error conditions for virtual PCI bridges. One of these
> conditions (given in the very first bullet in that section) applies
> to a request that is destined for the same port it came in on. When
> this occurs the request must be terminated as a UR.

Thanks for that reference!

I suspect figure 10-3 in sec 10.1.1 might also be relevant, although
it's buried in the ATS section.  It shows internal routing between
functions of a multifunction device.  That suggests that the functions
should be able to DMA to each other without those transactions ever
appearing on the link.

Popping way up the stack, my original point was that I'm trying to
remove restrictions on what devices can participate in peer-to-peer
DMA.  I think it's fairly clear that in conventional PCI, any devices
in the same PCI hierarchy, i.e., below the same host-to-PCI bridge,
should be able to DMA to each other.

The routing behavior of PCIe is supposed to be compatible with
conventional PCI, and I would argue that this effectively requires
multi-function PCIe devices to have the internal routing required to
avoid the route-to-self issue.

Bjorn


Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-23 Thread Bjorn Helgaas
On Fri, Mar 23, 2018 at 03:59:14PM -0600, Logan Gunthorpe wrote:
> On 23/03/18 03:50 PM, Bjorn Helgaas wrote:
> > Popping way up the stack, my original point was that I'm trying to
> > remove restrictions on what devices can participate in
> > peer-to-peer DMA.  I think it's fairly clear that in conventional
> > PCI, any devices in the same PCI hierarchy, i.e., below the same
> > host-to-PCI bridge, should be able to DMA to each other.
> 
> Yup, we are working on this.
> 
> > The routing behavior of PCIe is supposed to be compatible with
> > conventional PCI, and I would argue that this effectively requires
> > multi-function PCIe devices to have the internal routing required
> > to avoid the route-to-self issue.
> 
> That would be very nice but many devices do not support the internal
> route. We've had to work around this in the past and, as I mentioned
> earlier, NVMe devices have a flag indicating support. However,
> if a device wants to be involved in P2P it must support it and we
> can exclude devices that don't support it by simply not enabling
> their drivers.

Do you think these devices that don't support internal DMA between
functions are within spec, or should we handle them as exceptions,
e.g., via quirks?

If NVMe defines a flag indicating peer-to-peer support, that would
suggest to me that these devices are within spec.

I looked up the CMBSZ register you mentioned (NVMe 1.3a, sec 3.1.12).
You must be referring to the WDS, RDS, LISTS, CQS, and SQS bits.  If
WDS is set, the controller supports having Write-related data and
metadata in the Controller Memory Buffer.  That would mean the driver
could put certain queues in controller memory instead of in host
memory.  The controller could then read the queue from its own
internal memory rather than issuing a PCIe transaction to read it from
host memory.

That makes sense to me, but I don't see the connection to
peer-to-peer.  There's no multi-function device in this picture, so
it's not about internal DMA between functions.

WDS, etc., tell us about capabilities of the controller.  If WDS is
set, the CPU (or a peer PCIe device) can write things to controller
memory.  If it is clear, neither the CPU nor a peer device can put
things there.  So it doesn't seem to tell us anything about
peer-to-peer specifically.  It looks like information needed by the
NVMe driver, but not by the PCI core.

Bjorn