Re: [PATCH v6 10/42] powerpc/powernv: pnv_ioda_setup_dma() configure one PE only

2015-08-12 Thread Gavin Shan
On Tue, Aug 11, 2015 at 12:39:02PM +1000, Alexey Kardashevskiy wrote:
On 08/11/2015 10:29 AM, Gavin Shan wrote:
On Mon, Aug 10, 2015 at 07:31:11PM +1000, Alexey Kardashevskiy wrote:
On 08/06/2015 02:11 PM, Gavin Shan wrote:
The original implementation of pnv_ioda_setup_dma() iterates the
list of PEs and configures the DMA32 space for them one by one.
The function was designed to be called during PHB fixup time.
When configuring a PE's DMA32 space in pcibios_setup_bridge(), in
order to support PCI hotplug, we have to make the function PE
oriented.

This renames pnv_ioda_setup_dma() to pnv_ioda1_setup_dma() and
adds one more argument (struct pnv_ioda_pe *pe) to it. The caller,
pnv_pci_ioda_setup_DMA(), gets the PE from the list and passes it to
this function or to pnv_pci_ioda2_setup_dma_pe(). The patch shouldn't
cause behavioral changes.

Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com
---
  arch/powerpc/platforms/powernv/pci-ioda.c | 75 
 +++
  1 file changed, 36 insertions(+), 39 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index 8456f37..cd22002 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2443,52 +2443,29 @@ static void pnv_pci_ioda2_setup_dma_pe(struct 
pnv_phb *phb,
	pnv_ioda_setup_bus_dma(pe, pe->pbus);
  }

-static void pnv_ioda_setup_dma(struct pnv_phb *phb)
+static unsigned int pnv_ioda1_setup_dma(struct pnv_phb *phb,
+   struct pnv_ioda_pe *pe,
+   unsigned int base)
  {
	struct pci_controller *hose = phb->hose;
-   struct pnv_ioda_pe *pe;
-   unsigned int dma_weight;
+   unsigned int dma_weight, segs;

/* Calculate the PHB's DMA weight */
dma_weight = pnv_ioda_phb_dma_weight(phb);
	pr_info("PCI%04x has %ld DMA32 segments, total weight %d\n",
		hose->global_number, phb->ioda.dma32_segcount, dma_weight);

-   pnv_pci_ioda_setup_opal_tce_kill(phb);
-
-   /* Walk our PE list and configure their DMA segments, hand them
-* out one base segment plus any residual segments based on
-* weight
-*/
-   list_for_each_entry(pe, &phb->ioda.pe_dma_list, dma_link) {
-   if (!pe->dma32_weight)
-   continue;
-
-   /*
-* For IODA2 compliant PHB3, we needn't care about the weight.
-* The all available 32-bits DMA space will be assigned to
-* the specific PE.
-*/
-   if (phb->type == PNV_PHB_IODA1) {
-   unsigned int segs, base = 0;
-
-   if (pe->dma32_weight <
-   dma_weight / phb->ioda.dma32_segcount)
-   segs = 1;
-   else
-   segs = (pe->dma32_weight *
-   phb->ioda.dma32_segcount) / dma_weight;
-
-   pe_info(pe, "DMA32 weight %d, assigned %d segments\n",
-   pe->dma32_weight, segs);
-   pnv_pci_ioda_setup_dma_pe(phb, pe, base, segs);
+   if (pe->dma32_weight <
+   dma_weight / phb->ioda.dma32_segcount)

Can be one line now.


Indeed.

+   segs = 1;
+   else
+   segs = (pe->dma32_weight *
+   phb->ioda.dma32_segcount) / dma_weight;
+   pe_info(pe, "DMA weight %d, assigned %d segments\n",
+   pe->dma32_weight, segs);
+   pnv_pci_ioda_setup_dma_pe(phb, pe, base, segs);


Why not merge pnv_ioda1_setup_dma() into pnv_pci_ioda_setup_dma_pe()?


There're two reasons:
- They're logically separate. One calculates the number of DMA32 segments
  required; the other allocates TCE32 tables and configures devices with them.
- In the PCI hotplug path, I need pnv_ioda1_setup_dma(), which takes pe as
  a parameter.


And why does the hotplug path not care about DMA weight?


PHB3 doesn't care about DMA weight, but P7IOC does.



-   base += segs;
-   } else {
-   pe_info(pe, "Assign DMA32 space\n");
-   pnv_pci_ioda2_setup_dma_pe(phb, pe);
-   }
-   }
+   return segs;
  }

  #ifdef CONFIG_PCI_MSI
@@ -2955,12 +2932,32 @@ static void pnv_pci_ioda_setup_DMA(void)
  {
struct pci_controller *hose, *tmp;
struct pnv_phb *phb;
+   struct pnv_ioda_pe *pe;
+   unsigned int base;

	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
-   pnv_ioda_setup_dma(hose->private_data);
+   phb = hose->private_data;
+   pnv_pci_ioda_setup_opal_tce_kill(phb);
+
+   base = 0;
+   list_for_each_entry(pe, &phb->ioda.pe_dma_list, dma_link) {
+   if (!pe->dma32_weight)
+   continue;
+
+   switch (phb->type) {
+   case PNV_PHB_IODA1:
+   base += pnv_ioda1_setup_dma(phb, pe, base);


This @base handling seems to never be tested between 8..11 as [PATCH v6 11/42]
powerpc/powernv: Trace DMA32 segments consumed by 

[v2 01/11] powerpc: re-add devm_ioremap_prot()

2015-08-12 Thread Roy Pledge
From: Emil Medve emilian.me...@freescale.com

devm_ioremap_prot() was removed in commit dedd24a12,
and was introduced in commit b41e5fffe8.

This reverts commit dedd24a12fe6735898feeb06184ee346907abb5d.

Signed-off-by: Emil Medve emilian.me...@freescale.com
---
 arch/powerpc/include/asm/io.h |3 +++
 arch/powerpc/lib/Makefile |1 +
 arch/powerpc/lib/devres.c |   43 +
 3 files changed, 47 insertions(+)
 create mode 100644 arch/powerpc/lib/devres.c

diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
index a8d2ef3..9eaf301 100644
--- a/arch/powerpc/include/asm/io.h
+++ b/arch/powerpc/include/asm/io.h
@@ -855,6 +855,9 @@ static inline void * bus_to_virt(unsigned long address)
 
 #define clrsetbits_8(addr, clear, set) clrsetbits(8, addr, clear, set)
 
+void __iomem *devm_ioremap_prot(struct device *dev, resource_size_t offset,
+   size_t size, unsigned long flags);
+
 #endif /* __KERNEL__ */
 
 #endif /* _ASM_POWERPC_IO_H */
diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index a47e142..7ae60f0 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
obj-y += string.o alloc.o crtsavres.o ppc_ksyms.o code-patching.o \
	 feature-fixups.o
 
 obj-$(CONFIG_PPC32)	+= div64.o copy_32.o
+obj-$(CONFIG_HAS_IOMEM)	+= devres.o
 
obj64-y	+= copypage_64.o copyuser_64.o usercopy_64.o mem_64.o hweight_64.o \
   copyuser_power7.o string_64.o copypage_power7.o memcpy_power7.o \
diff --git a/arch/powerpc/lib/devres.c b/arch/powerpc/lib/devres.c
new file mode 100644
index 000..8df55fc
--- /dev/null
+++ b/arch/powerpc/lib/devres.c
@@ -0,0 +1,43 @@
+/*
+ * Copyright (C) 2008 Freescale Semiconductor, Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/device.h>	/* devres_*(), devm_ioremap_release() */
+#include <linux/gfp.h>
+#include <linux/io.h>		/* ioremap_prot() */
+#include <linux/export.h>	/* EXPORT_SYMBOL() */
+
+/**
+ * devm_ioremap_prot - Managed ioremap_prot()
+ * @dev: Generic device to remap IO address for
+ * @offset: BUS offset to map
+ * @size: Size of map
+ * @flags: Page flags
+ *
+ * Managed ioremap_prot().  Map is automatically unmapped on driver
+ * detach.
+ */
+void __iomem *devm_ioremap_prot(struct device *dev, resource_size_t offset,
+size_t size, unsigned long flags)
+{
+   void __iomem **ptr, *addr;
+
+   ptr = devres_alloc(devm_ioremap_release, sizeof(*ptr), GFP_KERNEL);
+   if (!ptr)
+   return NULL;
+
+   addr = ioremap_prot(offset, size, flags);
+   if (addr) {
+   *ptr = addr;
+   devres_add(dev, ptr);
+   } else
+   devres_free(ptr);
+
+   return addr;
+}
+EXPORT_SYMBOL(devm_ioremap_prot);
-- 
1.7.9.5
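
A quick sketch of how a driver might consume this managed API once it is
back; foo_probe and its platform device are hypothetical, and the
_PAGE_NO_CACHE | _PAGE_GUARDED flags are just the typical powerpc choice
for device memory:

#include <linux/platform_device.h>
#include <asm/io.h>
#include <asm/pgtable.h>	/* _PAGE_NO_CACHE, _PAGE_GUARDED */

/* Hypothetical probe: map registers through the managed ioremap. */
static int foo_probe(struct platform_device *pdev)
{
	struct resource *res;
	void __iomem *regs;

	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
	if (!res)
		return -ENODEV;

	/* Unmapped automatically on driver detach - no iounmap() needed */
	regs = devm_ioremap_prot(&pdev->dev, res->start,
				 resource_size(res),
				 _PAGE_NO_CACHE | _PAGE_GUARDED);
	if (!regs)
		return -ENOMEM;

	return 0;
}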


[v2 10/11] soc/qman: Add HOTPLUG_CPU support to the QMan driver

2015-08-12 Thread Roy Pledge
From: Hai-Ying Wang haiying.w...@freescale.com

Add support for CPU hotplug for the DPAA 1.0 Queue Manager
driver.

Signed-off-by: Hai-Ying Wang haiying.w...@freescale.com
Signed-off-by: Emil Medve emilian.me...@freescale.com
Signed-off-by: Roy Pledge roy.ple...@freescale.com
---
 drivers/soc/fsl/qbman/qman_portal.c |   43 +++
 1 file changed, 43 insertions(+)

diff --git a/drivers/soc/fsl/qbman/qman_portal.c 
b/drivers/soc/fsl/qbman/qman_portal.c
index ad9e3ba..85acba2 100644
--- a/drivers/soc/fsl/qbman/qman_portal.c
+++ b/drivers/soc/fsl/qbman/qman_portal.c
@@ -474,6 +474,46 @@ static void qman_offline_cpu(unsigned int cpu)
}
 }
 
+#ifdef CONFIG_HOTPLUG_CPU
+static void qman_online_cpu(unsigned int cpu)
+{
+   struct qman_portal *p;
+   const struct qm_portal_config *pcfg;
+
+   p = (struct qman_portal *)affine_portals[cpu];
+   if (p) {
+   pcfg = qman_get_qm_portal_config(p);
+   if (pcfg) {
+   irq_set_affinity(pcfg->public_cfg.irq, cpumask_of(cpu));
+   qman_portal_update_sdest(pcfg, cpu);
+   }
+   }
+}
+
+static int qman_hotplug_cpu_callback(struct notifier_block *nfb,
+unsigned long action, void *hcpu)
+{
+   unsigned int cpu = (unsigned long)hcpu;
+
+   switch (action) {
+   case CPU_ONLINE:
+   case CPU_ONLINE_FROZEN:
+   qman_online_cpu(cpu);
+   break;
+   case CPU_DOWN_PREPARE:
+   case CPU_DOWN_PREPARE_FROZEN:
+   qman_offline_cpu(cpu);
+   default:
+   break;
+   }
+   return NOTIFY_OK;
+}
+
+static struct notifier_block qman_hotplug_cpu_notifier = {
+   .notifier_call = qman_hotplug_cpu_callback,
+};
+#endif /* CONFIG_HOTPLUG_CPU */
+
 __init int qman_init(void)
 {
struct cpumask slave_cpus;
@@ -597,6 +637,9 @@ __init int qman_init(void)
	cpumask_andnot(&offline_cpus, cpu_possible_mask, cpu_online_mask);
	for_each_cpu(cpu, &offline_cpus)
qman_offline_cpu(cpu);
+#ifdef CONFIG_HOTPLUG_CPU
+   register_hotcpu_notifier(qman_hotplug_cpu_notifier);
+#endif
return 0;
 }
 
-- 
1.7.9.5


[v2 09/11] soc/bman: Add HOTPLUG_CPU support to the BMan driver

2015-08-12 Thread Roy Pledge
From: Hai-Ying Wang haiying.w...@freescale.com

Add support for CPU hotplug for the DPAA 1.0 Buffer Manager
driver

Signed-off-by: Hai-Ying Wang haiying.w...@freescale.com
Signed-off-by: Emil Medve emilian.me...@freescale.com
Signed-off-by: Roy Pledge roy.ple...@freescale.com
---
 drivers/soc/fsl/qbman/bman_portal.c |   40 +++
 drivers/soc/fsl/qbman/dpaa_sys.h|3 +++
 2 files changed, 43 insertions(+)

diff --git a/drivers/soc/fsl/qbman/bman_portal.c 
b/drivers/soc/fsl/qbman/bman_portal.c
index 62d8f64..f33d671 100644
--- a/drivers/soc/fsl/qbman/bman_portal.c
+++ b/drivers/soc/fsl/qbman/bman_portal.c
@@ -129,6 +129,42 @@ static void __cold bman_offline_cpu(unsigned int cpu)
}
 }
 
+#ifdef CONFIG_HOTPLUG_CPU
+static void __cold bman_online_cpu(unsigned int cpu)
+{
+   struct bman_portal *p = (struct bman_portal *)affine_bportals[cpu];
+   const struct bm_portal_config *pcfg;
+
+   if (p) {
+   pcfg = bman_get_bm_portal_config(p);
+   if (pcfg)
+   irq_set_affinity(pcfg->public_cfg.irq, cpumask_of(cpu));
+   }
+}
+
+static int __cold bman_hotplug_cpu_callback(struct notifier_block *nfb,
+   unsigned long action, void *hcpu)
+{
+   unsigned int cpu = (unsigned long)hcpu;
+
+   switch (action) {
+   case CPU_ONLINE:
+   case CPU_ONLINE_FROZEN:
+   bman_online_cpu(cpu);
+   break;
+   case CPU_DOWN_PREPARE:
+   case CPU_DOWN_PREPARE_FROZEN:
+   bman_offline_cpu(cpu);
+   }
+
+   return NOTIFY_OK;
+}
+
+static struct notifier_block bman_hotplug_cpu_notifier = {
+   .notifier_call = bman_hotplug_cpu_callback,
+};
+#endif /* CONFIG_HOTPLUG_CPU */
+
 static int __cold bman_portal_probe(struct platform_device *of_dev)
 {
	struct device *dev = &of_dev->dev;
@@ -342,6 +378,10 @@ static int __init bman_portal_driver_register(struct 
platform_driver *drv)
	for_each_cpu(cpu, &offline_cpus)
bman_offline_cpu(cpu);
 
+#ifdef CONFIG_HOTPLUG_CPU
+   register_hotcpu_notifier(bman_hotplug_cpu_notifier);
+#endif
+
bman_seed_bpid_range(0, bman_pool_max);
 
return 0;
diff --git a/drivers/soc/fsl/qbman/dpaa_sys.h b/drivers/soc/fsl/qbman/dpaa_sys.h
index 0dd341c..d1da092 100644
--- a/drivers/soc/fsl/qbman/dpaa_sys.h
+++ b/drivers/soc/fsl/qbman/dpaa_sys.h
@@ -43,6 +43,9 @@
 #include <linux/vmalloc.h>
 #include <linux/platform_device.h>
 #include <linux/ctype.h>
+#ifdef CONFIG_HOTPLUG_CPU
+#include <linux/cpu.h>
+#endif
 
 #include asm/pgtable.h
 
-- 
1.7.9.5


Re: RFC: prepare for struct scatterlist entries without page backing

2015-08-12 Thread Julian Calaby
Hi,

On Wed, Aug 12, 2015 at 10:42 PM, Boaz Harrosh b...@plexistor.com wrote:
 On 08/12/2015 10:05 AM, Christoph Hellwig wrote:
 It turns out most DMA mapping implementations can handle SGLs without
 page structures with some fairly simple mechanical work.  Most of it
 is just about consistently using sg_phys.  For implementations that
 need to flush caches we need a new helper that skips these cache
 flushes if an entry doesn't have a kernel virtual address.

 However the ccio (parisc) and sba_iommu (parisc & ia64) IOMMUs seem
 to operate mostly on virtual addresses.  It's a fairly odd concept
 that I don't fully grasp, so I'll need some help with those if we want
 to bring this forward.

 Additionally this series skips ARM entirely for now.  The reason is
 that most arm implementations of the .map_sg operation just iterate
 over all entries and call ->map_page for it, which means we'd need
 to convert those to a ->map_pfn similar to Dan's previous approach.


[snip]

 It is a bit of work but is worth while, and accelerating tremendously
 lots of workloads. Not like this abomination which only branches
 things more and more, and making things fatter and slower.

As a random guy reading a big bunch of patches on code I know almost
nothing about, parts of this comment really resonated with me:
overall, we seem to be adding a lot of if statements to code that
appears to be in a hot path.

I.e. ~90% of this patch set seems to be just mechanically dropping
BUG_ON()s and converting open coded stuff to use accessor functions
(which should be macros or get inlined, right?) - and the remaining
bit is not flushing if we don't have a physical page somewhere.

Would it make sense to split this patch set into a few bits: one to
drop all the useless BUG_ON()s, one to convert all the open coded
stuff to accessor functions, then another to do the actual page-less
sg stuff?
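
For concreteness, the accessor conversion in question amounts to something
like this sketch (illustrative only, not a particular hunk from the series):

#include <linux/scatterlist.h>

static dma_addr_t entry_phys(struct scatterlist *sg)
{
	/* Open-coded form, requires a page-backed entry:
	 *	page_to_phys(sg_page(sg)) + sg->offset
	 * Accessor form, also valid for an entry without a struct page: */
	return sg_phys(sg);
}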

Thanks,

-- 
Julian Calaby

Email: julian.cal...@gmail.com
Profile: http://www.google.com/profiles/julian.calaby/

Re: [PATCH v6 08/42] powerpc/powernv: Calculate PHB's DMA weight dynamically

2015-08-12 Thread Gavin Shan
On Mon, Aug 10, 2015 at 07:21:12PM +1000, Alexey Kardashevskiy wrote:
On 08/06/2015 02:11 PM, Gavin Shan wrote:
For P7IOC, the whole available DMA32 space, which is below the
MEM32 space, is divided evenly into 256MB segments. The number
of contiguous segments assigned to one particular PE depends on
the PE's DMA weight, which is calculated based on the type of each
PCI device contained in the PE, and the PHB's DMA weight, which is
the accumulative DMA weight of the PEs contained in the PHB. It means
that the PHB's DMA weight calculation depends on existing PEs,
which works perfectly now, but isn't hotplug friendly. As the
whole available DMA32 space can be assigned to one PE on PHB3,
we don't have the issue there.

The patch calculates PHB's DMA weight based on the PCI devices
contained in the PHB dynamically so that it's hotplug friendly.

Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com
---
  arch/powerpc/platforms/powernv/pci-ioda.c | 88 
 +++
  arch/powerpc/platforms/powernv/pci.h  |  6 ---
  2 files changed, 43 insertions(+), 51 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index 713f4b4..7342cfd 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -927,6 +927,9 @@ static void pnv_ioda_link_pe_by_weight(struct pnv_phb 
*phb,

  static unsigned int pnv_ioda_dma_weight(struct pci_dev *dev)
  {
+ struct pci_controller *hose = pci_bus_to_host(dev->bus);
+ struct pnv_phb *phb = hose->private_data;
+
  /* This is quite simplistic. The base weight of a device
   * is 10. 0 means no DMA is to be accounted for it.
   */
@@ -939,14 +942,34 @@ static unsigned int pnv_ioda_dma_weight(struct pci_dev 
*dev)
  if (dev->class == PCI_CLASS_SERIAL_USB_UHCI ||
  dev->class == PCI_CLASS_SERIAL_USB_OHCI ||
  dev->class == PCI_CLASS_SERIAL_USB_EHCI)
- return 3;
+ return 3 * phb->ioda.tce32_count;

  /* Increase the weight of RAID (includes Obsidian) */
  if ((dev->class >> 8) == PCI_CLASS_STORAGE_RAID)
- return 15;
+ return 15 * phb->ioda.tce32_count;

  /* Default */
- return 10;
+ return 10 * phb->ioda.tce32_count;
+}
+
+static int __pnv_ioda_phb_dma_weight(struct pci_dev *pdev, void *data)
+{
+ unsigned int *dma_weight = data;
+
+ *dma_weight += pnv_ioda_dma_weight(pdev);
+ return 0;
+}
+
+static unsigned int pnv_ioda_phb_dma_weight(struct pnv_phb *phb)
+{
+ unsigned int dma_weight = 0;
+
+ if (!phb->hose->bus)
+ return 0;
+
+ pci_walk_bus(phb->hose->bus,
+  __pnv_ioda_phb_dma_weight, &dma_weight);
+ return dma_weight;
  }

  #ifdef CONFIG_PCI_IOV
@@ -1097,14 +1120,6 @@ static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, 
bool all)
  /* Put PE to the list */
  list_add_tail(&pe->list, &phb->ioda.pe_list);

- /* Account for one DMA PE if at least one DMA capable device exist
-  * below the bridge
-  */
- if (pe->dma_weight != 0) {
- phb->ioda.dma_weight += pe->dma_weight;
- phb->ioda.dma_pe_count++;
- }
-
  /* Link the PE */
  pnv_ioda_link_pe_by_weight(phb, pe);
  }
@@ -2431,24 +2446,13 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb 
*phb,
  static void pnv_ioda_setup_dma(struct pnv_phb *phb)
  {
  struct pci_controller *hose = phb->hose;
- unsigned int residual, remaining, segs, tw, base;
  struct pnv_ioda_pe *pe;
+ unsigned int dma_weight;

- /* If we have more PE# than segments available, hand out one
-  * per PE until we run out and let the rest fail. If not,
-  * then we assign at least one segment per PE, plus more based
-  * on the amount of devices under that PE
-  */
- if (phb->ioda.dma_pe_count > phb->ioda.tce32_count)
- residual = 0;
- else
- residual = phb->ioda.tce32_count -
- phb->ioda.dma_pe_count;
-
- pr_info("PCI: Domain %04x has %ld available 32-bit DMA segments\n",
- hose->global_number, phb->ioda.tce32_count);
- pr_info("PCI: %d PE# for a total weight of %d\n",
- phb->ioda.dma_pe_count, phb->ioda.dma_weight);
+ /* Calculate the PHB's DMA weight */
+ dma_weight = pnv_ioda_phb_dma_weight(phb);
+ pr_info("PCI%04x has %ld DMA32 segments, total weight %d\n",
+ hose->global_number, phb->ioda.tce32_count, dma_weight);

  pnv_pci_ioda_setup_opal_tce_kill(phb);

@@ -2456,22 +2460,9 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
   * out one base segment plus any residual segments based on
   * weight
   */
- remaining = phb->ioda.tce32_count;
- tw = phb->ioda.dma_weight;
- base = 0;
  list_for_each_entry(pe, &phb->ioda.pe_dma_list, dma_link) {
  if (!pe->dma_weight)
  continue;
- if (!remaining) {
- pe_warn(pe, "No DMA32 resources 

[v2 07/11] soc/bman: Add debugfs support for the BMan driver

2015-08-12 Thread Roy Pledge
From: Geoff Thorpe geoff.tho...@freescale.com

Add debugfs support for querying the state of hardware based
Buffer Manager pools used in DPAA 1.0.

Signed-off-by: Geoff Thorpe geoff.tho...@freescale.com
Signed-off-by: Emil Medve emilian.me...@freescale.com
Signed-off-by: Roy Pledge roy.ple...@freescale.com
---
 drivers/soc/fsl/qbman/Kconfig|7 ++
 drivers/soc/fsl/qbman/Makefile   |1 +
 drivers/soc/fsl/qbman/bman-debugfs.c |  117 ++
 drivers/soc/fsl/qbman/bman_api.c |   19 ++
 drivers/soc/fsl/qbman/dpaa_sys.h |7 +-
 5 files changed, 145 insertions(+), 6 deletions(-)
 create mode 100644 drivers/soc/fsl/qbman/bman-debugfs.c

diff --git a/drivers/soc/fsl/qbman/Kconfig b/drivers/soc/fsl/qbman/Kconfig
index 1f2063a..919ef15 100644
--- a/drivers/soc/fsl/qbman/Kconfig
+++ b/drivers/soc/fsl/qbman/Kconfig
@@ -54,6 +54,13 @@ config FSL_BMAN_TEST_THRESH
  drainer thread, and the other threads that they observe exactly
  the depletion state changes that are expected.
 
+config FSL_BMAN_DEBUGFS
+   tristate "BMan debugfs support"
+   depends on DEBUG_FS
+   default n
+   help
+   BMan debugfs support
+
 config FSL_QMAN
	bool "QMan device management"
default n
diff --git a/drivers/soc/fsl/qbman/Makefile b/drivers/soc/fsl/qbman/Makefile
index 82f5482..2b53fbc 100644
--- a/drivers/soc/fsl/qbman/Makefile
+++ b/drivers/soc/fsl/qbman/Makefile
@@ -9,6 +9,7 @@ obj-$(CONFIG_FSL_BMAN_TEST) += bman-test.o
 bman-test-y = bman_test.o
 bman-test-$(CONFIG_FSL_BMAN_TEST_API)  += bman_test_api.o
 bman-test-$(CONFIG_FSL_BMAN_TEST_THRESH)   += bman_test_thresh.o
+obj-$(CONFIG_FSL_BMAN_DEBUGFS) += bman-debugfs.o
 
 obj-$(CONFIG_FSL_QMAN)			+= qman_api.o qman_utils.o qman_driver.o
 obj-$(CONFIG_FSL_QMAN_CONFIG)  += qman.o qman_portal.o
diff --git a/drivers/soc/fsl/qbman/bman-debugfs.c 
b/drivers/soc/fsl/qbman/bman-debugfs.c
new file mode 100644
index 000..b384f47
--- /dev/null
+++ b/drivers/soc/fsl/qbman/bman-debugfs.c
@@ -0,0 +1,117 @@
+/* Copyright 2010 - 2015 Freescale Semiconductor, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *  notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in the
+ *  documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the
+ *  names of its contributors may be used to endorse or promote products
+ *  derived from this software without specific prior written permission.
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License (GPL) as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF 
THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include bman_priv.h
+
+static struct dentry *dfs_root; /* debugfs root directory */
+
+/* Query Buffer Pool State */
+
+static int query_bp_state_show(struct seq_file *file, void *offset)
+{
+   int ret;
+   struct bm_pool_state state;
+   int i, j;
+   u32 mask;
+
+   memset(&state, 0, sizeof(state));
+   ret = bman_query_pools(&state);
+   if (ret) {
+   seq_printf(file, "Error %d\n", ret);
+   return ret;
+   }
+
+   seq_puts(file, "bp_id  free_buffers_avail  bp_depleted\n");
+   for (i = 0; i < 2; i++) {
+   mask = 0x8000;
+   for (j = 0; j < 32; j++) {
+   seq_printf(file,
+  "%-2u   %-3s %-3s\n",
+(i * 32) + j,
+state.as.state.__state[i] & mask ? "no" : "yes",
+state.ds.state.__state[i] & mask ? "yes" : "no");
+mask >>= 1;
+ 

[v2 11/11] soc/qman: add qman_delete_cgr_safe()

2015-08-12 Thread Roy Pledge
From: Madalin Bucur madalin.bu...@freescale.com

Add qman_delete_cgr_safe() that can be called from any CPU.
This in turn schedules qman_delete_cgr() on the proper CPU.

Signed-off-by: Madalin Bucur madalin.bu...@freescale.com
Signed-off-by: Roy Pledge roy.ple...@freescale.com
---
 drivers/soc/fsl/qbman/qman_api.c |   46 ++
 1 file changed, 46 insertions(+)

diff --git a/drivers/soc/fsl/qbman/qman_api.c b/drivers/soc/fsl/qbman/qman_api.c
index d4f9be0..1dd60f2 100644
--- a/drivers/soc/fsl/qbman/qman_api.c
+++ b/drivers/soc/fsl/qbman/qman_api.c
@@ -2463,6 +2463,8 @@ EXPORT_SYMBOL(qman_modify_cgr);
QM_CHANNEL_SWPORTAL0))
 #define PORTAL_IDX(n) (n->config->public_cfg.channel - QM_CHANNEL_SWPORTAL0)
 
+static u8 qman_cgr_cpus[__CGR_NUM];
+
 int qman_create_cgr(struct qman_cgr *cgr, u32 flags,
struct qm_mcc_initcgr *opts)
 {
@@ -2479,7 +2481,10 @@ int qman_create_cgr(struct qman_cgr *cgr, u32 flags,
	if (cgr->cgrid >= __CGR_NUM)
return -EINVAL;
 
+   preempt_disable();
p = get_affine_portal();
+   qman_cgr_cpus[cgr->cgrid] = smp_processor_id();
+   preempt_enable();
 
	memset(&local_opts, 0, sizeof(struct qm_mcc_initcgr));
	cgr->chan = p->config->public_cfg.channel;
@@ -2621,6 +2626,47 @@ put_portal:
 }
 EXPORT_SYMBOL(qman_delete_cgr);
 
+struct cgr_comp {
+   struct qman_cgr *cgr;
+   struct completion completion;
+};
+
+static int qman_delete_cgr_thread(void *p)
+{
+   struct cgr_comp *cgr_comp = (struct cgr_comp *)p;
+   int res;
+
+   res = qman_delete_cgr((struct qman_cgr *)cgr_comp->cgr);
+   complete(&cgr_comp->completion);
+
+   return res;
+}
+
+void qman_delete_cgr_safe(struct qman_cgr *cgr)
+{
+   struct task_struct *thread;
+   struct cgr_comp cgr_comp;
+
+   preempt_disable();
+   if (qman_cgr_cpus[cgr->cgrid] != smp_processor_id()) {
+   init_completion(&cgr_comp.completion);
+   cgr_comp.cgr = cgr;
+   thread = kthread_create(qman_delete_cgr_thread, &cgr_comp,
+   "cgr_del");
+
+   if (likely(!IS_ERR(thread))) {
+   kthread_bind(thread, qman_cgr_cpus[cgr->cgrid]);
+   wake_up_process(thread);
+   wait_for_completion(&cgr_comp.completion);
+   preempt_enable();
+   return;
+   }
+   }
+   qman_delete_cgr(cgr);
+   preempt_enable();
+}
+EXPORT_SYMBOL(qman_delete_cgr_safe);
+
 int qman_set_wpm(int wpm_enable)
 {
return qm_set_wpm(wpm_enable);
-- 
1.7.9.5
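
A minimal usage sketch (the caller is hypothetical): the point of the API
is that teardown no longer has to run on the CPU that created the CGR, it
only has to be able to sleep while the helper waits for the completion:

/* Hypothetical teardown path, callable from any CPU. */
static void foo_release_cgr(struct qman_cgr *cgr)
{
	qman_delete_cgr_safe(cgr);
}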


Re: [PATCH v6 07/42] powerpc/powernv: Improve IO and M32 mapping

2015-08-12 Thread Gavin Shan
On Tue, Aug 11, 2015 at 12:32:13PM +1000, Alexey Kardashevskiy wrote:
On 08/11/2015 10:12 AM, Gavin Shan wrote:
On Mon, Aug 10, 2015 at 05:40:08PM +1000, Alexey Kardashevskiy wrote:
On 08/06/2015 02:11 PM, Gavin Shan wrote:
There're 3 windows (IO, M32 and M64) for PHB, root port and upstream

These are actually IO, non-prefetchable and prefetchable windows which happen
to be IO, 32bit and 64bit windows but this has nothing to do with the M32/M64
BAR registers in P7IOC/PHB3, do I understand this correctly?


In pci-ioda.c, we have the definitions below, which were made up when
developing the code and don't come from any specification:

IO  - resources with the IO property
M32 - 32-bit or non-prefetchable resources
M64 - 64-bit and prefetchable resources


This is what I am saying - it is incorrect and confusing. M32/M64 are PHB3
register names and associated windows (with M in the beginning) but not
device resources.


I don't see how it's incorrect and confusing. M32/M64 are not PHB3
register names. Also, a device resource is either IO, 32-bit prefetchable
memory, 32-bit non-prefetchable memory, 64-bit non-prefetchable memory,
or 64-bit prefetchable memory. They match with IO, M32, M64.
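
A sketch of that classification in terms of the standard resource flags
(illustrative; pnv_res_class is made up, the real code uses helpers such
as pnv_pci_is_mem_pref_64()):

#include <linux/ioport.h>

static const char *pnv_res_class(const struct resource *res)
{
	if (res->flags & IORESOURCE_IO)
		return "IO";
	if ((res->flags & IORESOURCE_MEM) &&
	    (res->flags & IORESOURCE_PREFETCH) &&
	    (res->flags & IORESOURCE_MEM_64))
		return "M64";	/* 64-bit and prefetchable */
	return "M32";		/* 32-bit or non-prefetchable memory */
}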


port of the PCIE switch behind root port. In order to support PCI
hotplug, we extend the start/end address of those 3 windows of root
port or upstream port to the start/end address of the 3 PHB's windows.
The current implementation, assigning IO or M32 segment based on the
bridge's windows, isn't reliable.

The patch fixes above issue by calculating PE's consumed IO or M32
segments from its contained devices, no PCI bridge windows involved
if the PE doesn't contain all the subordinate PCI buses.

Please, rephrase it. How can PCI bridges be involved in PE consumption?


Ok. Will add something like below:

if the PE, corresponding to the PCI bus, doesn't contain all the subordinate
PCI buses.


No, my question was about PCI bridge windows involved - what do you do to
the windows if PE does not own all child buses?


All of it is about the original implementation: when the PE doesn't include
all child buses, the resource consumed by the PE is the resources assigned to
the current PCI bus, excluding the resources assigned to the child buses.
Note that PCI bridge windows are actually the PCI bus's resources.


Otherwise,
the PCI bridge windows still contribute to PE's consumed IO or M32
segments.

PCI bridge windows themselves consume PEs? Is that correct?


PCI bridge windows consume IO, M32, M64 segments, not PEs.

Ah, right.



Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com
---
  arch/powerpc/platforms/powernv/pci-ioda.c | 136 
 +-
  1 file changed, 79 insertions(+), 57 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index 488a53e..713f4b4 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2844,75 +2844,97 @@ static void pnv_pci_ioda_fixup_iov_resources(struct 
pci_dev *pdev)
  }
  #endif /* CONFIG_PCI_IOV */

-/*
- * This function is supposed to be called on basis of PE from top
- * to bottom style. So the the I/O or MMIO segment assigned to
- * parent PE could be overrided by its child PEs if necessary.
- */
-static void pnv_ioda_setup_pe_seg(struct pci_controller *hose,
- struct pnv_ioda_pe *pe)
+static int pnv_ioda_setup_one_res(struct pci_controller *hose,
+ struct pnv_ioda_pe *pe,
+ struct resource *res)
  {
	struct pnv_phb *phb = hose->private_data;
struct pci_bus_region region;
-   struct resource *res;
-   int i, index;
-   unsigned int segsize;
+   unsigned int index, segsize;
unsigned long *segmap, *pe_segmap;
uint16_t win;
int64_t rc;

-   /*
-* NOTE: We only care PCI bus based PE for now. For PCI
-* device based PE, for example SRIOV sensitive VF should
-* be figured out later.
-*/
-   BUG_ON(!(pe->flags & (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL)));
+   /* Check if we need map the resource */
+   if (!res->parent || !res->flags || res->start > res->end)

res->start >= res->end ?


No, res->start == res->end is valid.


+   return 0;

-   pci_bus_for_each_resource(pe->pbus, res, i) {
-   if (!res || !res->flags ||
-   res->start > res->end)
-   continue;
+   if (res->flags & IORESOURCE_IO) {
+   region.start = res->start - phb->ioda.io_pci_base;
+   region.end   = res->end - phb->ioda.io_pci_base;
+   segsize  = phb->ioda.io_segsize;
+   segmap   = phb->ioda.io_segmap;
+   pe_segmap= pe->io_segmap;
+   win  = OPAL_IO_WINDOW_TYPE;
+   } else if ((res->flags & IORESOURCE_MEM) &&
+  !pnv_pci_is_mem_pref_64(res->flags)) {
+   region.start = res->start -
+  hose->mem_offset[0] -
+  phb->ioda.m32_pci_base;
+   region.end   = 

[v2 05/11] soc/bman: Add self-tester for BMan driver

2015-08-12 Thread Roy Pledge
From: Geoff Thorpe geoff.tho...@freescale.com

Add a self test for the DPAA 1.0 Buffer Manager driver. This
test ensures that the driver can properly acquire and release
buffers using the BMan portal infrastructure.

Signed-off-by: Geoff Thorpe geoff.tho...@freescale.com
Signed-off-by: Emil Medve emilian.me...@freescale.com
Signed-off-by: Roy Pledge roy.ple...@freescale.com
---
 drivers/soc/fsl/qbman/Kconfig|   26 
 drivers/soc/fsl/qbman/Makefile   |4 +
 drivers/soc/fsl/qbman/bman_test.c|   56 +
 drivers/soc/fsl/qbman/bman_test.h|   34 +
 drivers/soc/fsl/qbman/bman_test_api.c|  184 +++
 drivers/soc/fsl/qbman/bman_test_thresh.c |  198 ++
 drivers/soc/fsl/qbman/dpaa_sys.h |1 +
 7 files changed, 503 insertions(+)
 create mode 100644 drivers/soc/fsl/qbman/bman_test.c
 create mode 100644 drivers/soc/fsl/qbman/bman_test.h
 create mode 100644 drivers/soc/fsl/qbman/bman_test_api.c
 create mode 100644 drivers/soc/fsl/qbman/bman_test_thresh.c

diff --git a/drivers/soc/fsl/qbman/Kconfig b/drivers/soc/fsl/qbman/Kconfig
index 1ff52a8..1f2063a 100644
--- a/drivers/soc/fsl/qbman/Kconfig
+++ b/drivers/soc/fsl/qbman/Kconfig
@@ -28,6 +28,32 @@ config FSL_BMAN_PORTAL
help
FSL BMan portal driver
 
+config FSL_BMAN_TEST
+   tristate "BMan self-tests"
+   default n
+   help
+   Compile self-test code
+
+config FSL_BMAN_TEST_API
+   bool "High-level API self-test"
+   depends on FSL_BMAN_TEST
+   default y
+   help
+   This requires the presence of cpu-affine portals, and performs
+   high-level API testing with them (whichever portal(s) are affine
+   to the cpu(s) the test executes on).
+
+config FSL_BMAN_TEST_THRESH
+   bool "Thresholds self-test"
+   depends on FSL_BMAN_TEST
+   default y
+   help
+ Multi-threaded (SMP) test of BMan pool depletion. A pool is seeded
+ before multiple threads (one per cpu) create pool objects to track
+ depletion state changes. The pool is then drained to empty by a
+ drainer thread, and the other threads that they observe exactly
+ the depletion state changes that are expected.
+
 config FSL_QMAN
	bool "QMan device management"
default n
diff --git a/drivers/soc/fsl/qbman/Makefile b/drivers/soc/fsl/qbman/Makefile
index 0d96598..04509c3 100644
--- a/drivers/soc/fsl/qbman/Makefile
+++ b/drivers/soc/fsl/qbman/Makefile
@@ -5,6 +5,10 @@ obj-$(CONFIG_FSL_BMAN) += bman.o
 obj-$(CONFIG_FSL_BMAN_PORTAL)  += bman-portal.o
 bman-portal-y				 = bman_portal.o bman_api.o \
   bman_utils.o
+obj-$(CONFIG_FSL_BMAN_TEST)+= bman-test.o
+bman-test-y = bman_test.o
+bman-test-$(CONFIG_FSL_BMAN_TEST_API)  += bman_test_api.o
+bman-test-$(CONFIG_FSL_BMAN_TEST_THRESH)   += bman_test_thresh.o
 
 obj-$(CONFIG_FSL_QMAN)			+= qman_api.o qman_utils.o qman_driver.o
 obj-$(CONFIG_FSL_QMAN_CONFIG)  += qman.o qman_portal.o
diff --git a/drivers/soc/fsl/qbman/bman_test.c 
b/drivers/soc/fsl/qbman/bman_test.c
new file mode 100644
index 000..9298093
--- /dev/null
+++ b/drivers/soc/fsl/qbman/bman_test.c
@@ -0,0 +1,56 @@
+/* Copyright 2008 - 2015 Freescale Semiconductor, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *  notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in the
+ *  documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the
+ *  names of its contributors may be used to endorse or promote products
+ *  derived from this software without specific prior written permission.
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License (GPL) as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, 

[v2 02/11] soc/fsl: Introduce DPAA BMan device management driver

2015-08-12 Thread Roy Pledge
From: Geoff Thorpe geoff.tho...@freescale.com

This driver enables the Freescale DPAA 1.0 Buffer Manager block. BMan
is a hardware buffer pool manager that allows accelerators
connected to the SoC datapath to acquire and release buffers during
data processing.

Signed-off-by: Geoff Thorpe geoff.tho...@freescale.com
Signed-off-by: Emil Medve emilian.me...@freescale.com
Signed-off-by: Roy Pledge roy.ple...@freescale.com
---
 drivers/soc/Kconfig   |1 +
 drivers/soc/Makefile  |1 +
 drivers/soc/fsl/Kconfig   |5 +
 drivers/soc/fsl/Makefile  |3 +
 drivers/soc/fsl/qbman/Kconfig |   25 ++
 drivers/soc/fsl/qbman/Makefile|1 +
 drivers/soc/fsl/qbman/bman.c  |  553 +
 drivers/soc/fsl/qbman/bman_priv.h |   53 
 drivers/soc/fsl/qbman/dpaa_sys.h  |   55 
 9 files changed, 697 insertions(+)
 create mode 100644 drivers/soc/fsl/Kconfig
 create mode 100644 drivers/soc/fsl/Makefile
 create mode 100644 drivers/soc/fsl/qbman/Kconfig
 create mode 100644 drivers/soc/fsl/qbman/Makefile
 create mode 100644 drivers/soc/fsl/qbman/bman.c
 create mode 100644 drivers/soc/fsl/qbman/bman_priv.h
 create mode 100644 drivers/soc/fsl/qbman/dpaa_sys.h

diff --git a/drivers/soc/Kconfig b/drivers/soc/Kconfig
index 96ddecb..4e3c8f4 100644
--- a/drivers/soc/Kconfig
+++ b/drivers/soc/Kconfig
@@ -1,6 +1,7 @@
 menu "SOC (System On Chip) specific Drivers"
 
 source "drivers/soc/mediatek/Kconfig"
+source "drivers/soc/fsl/Kconfig"
 source "drivers/soc/qcom/Kconfig"
 source "drivers/soc/sunxi/Kconfig"
 source "drivers/soc/ti/Kconfig"
diff --git a/drivers/soc/Makefile b/drivers/soc/Makefile
index 7dc7c0d..7adcd97 100644
--- a/drivers/soc/Makefile
+++ b/drivers/soc/Makefile
@@ -3,6 +3,7 @@
 #
 
 obj-$(CONFIG_ARCH_MEDIATEK)	+= mediatek/
+obj-$(CONFIG_FSL_SOC)		+= fsl/
 obj-$(CONFIG_ARCH_QCOM)	+= qcom/
 obj-$(CONFIG_ARCH_SUNXI)	+= sunxi/
 obj-$(CONFIG_ARCH_TEGRA)	+= tegra/
diff --git a/drivers/soc/fsl/Kconfig b/drivers/soc/fsl/Kconfig
new file mode 100644
index 000..daa9c0d
--- /dev/null
+++ b/drivers/soc/fsl/Kconfig
@@ -0,0 +1,5 @@
+menu "Freescale SOC (System On Chip) specific Drivers"
+
+source "drivers/soc/fsl/qbman/Kconfig"
+
+endmenu
diff --git a/drivers/soc/fsl/Makefile b/drivers/soc/fsl/Makefile
new file mode 100644
index 000..19e74bb
--- /dev/null
+++ b/drivers/soc/fsl/Makefile
@@ -0,0 +1,3 @@
+# Common
+obj-$(CONFIG_FSL_DPA)  += qbman/
+
diff --git a/drivers/soc/fsl/qbman/Kconfig b/drivers/soc/fsl/qbman/Kconfig
new file mode 100644
index 000..be4ae01
--- /dev/null
+++ b/drivers/soc/fsl/qbman/Kconfig
@@ -0,0 +1,25 @@
+menuconfig FSL_DPA
+   bool "Freescale DPAA support"
+   depends on FSL_SOC || COMPILE_TEST
+   default n
+   help
+   FSL Data-Path Acceleration Architecture drivers
+
+   These are not the actual Ethernet driver(s)
+
+if FSL_DPA
+
+config FSL_DPA_CHECKING
+   bool "additional driver checking"
+   default n
+   help
+   Compiles in additional checks to sanity-check the drivers and
+   any use of it by other code. Not recommended for performance
+
+config FSL_BMAN
+   tristate "BMan device management"
+   default n
+   help
+   FSL DPAA BMan driver
+
+endif # FSL_DPA
diff --git a/drivers/soc/fsl/qbman/Makefile b/drivers/soc/fsl/qbman/Makefile
new file mode 100644
index 000..02014d9
--- /dev/null
+++ b/drivers/soc/fsl/qbman/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_FSL_BMAN) += bman.o
diff --git a/drivers/soc/fsl/qbman/bman.c b/drivers/soc/fsl/qbman/bman.c
new file mode 100644
index 000..9a500ce
--- /dev/null
+++ b/drivers/soc/fsl/qbman/bman.c
@@ -0,0 +1,553 @@
+/* Copyright (c) 2009 - 2015 Freescale Semiconductor, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *  notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in the
+ *  documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the
+ *  names of its contributors may be used to endorse or promote products
+ *  derived from this software without specific prior written permission.
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License (GPL) as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ 

Re: [PATCH v2 05/10] cxl: Refactor adaptor init/teardown

2015-08-12 Thread Daniel Axtens
 The function above doesn't even use the 'rc' value.

Darn, you're right.

I'll fix that in a new version.

-- 
Regards,
Daniel

-- 
Regards,
Daniel



Re: RFC: prepare for struct scatterlist entries without page backing

2015-08-12 Thread Grant Grundler
On Wed, Aug 12, 2015 at 10:00 AM, James Bottomley
james.bottom...@hansenpartnership.com wrote:
 On Wed, 2015-08-12 at 09:05 +0200, Christoph Hellwig wrote:
...
 However the ccio (parisc) and sba_iommu (parisc & ia64) IOMMUs seem
 to operate mostly on virtual addresses.  It's a fairly odd concept
 that I don't fully grasp, so I'll need some help with those if we want
 to bring this forward.

James explained the primary function of IOMMUs on parisc (DMA-Cache
coherency) much better than I ever could.

Three more observations:
1) the IOMMU can be bypassed by 64-bit DMA devices on IA64.

2) IOMMU enables 32-bit DMA devices to reach > 32-bit physical memory
and thus avoid bounce buffers. parisc and older IA-64 have some
32-bit PCI devices - e.g. IDE boot HDD.

3) IOMMU acts as a proxy for IO devices by fetching cachelines of data
for PA-RISC systems whose memory controllers ONLY serve cacheline
sized transactions. ie. 32-bit DMA results in the IOMMU fetching the
cacheline and updating just the 32-bits in a DMA cache coherent
fashion.

Bonus thought:
4) IOMMU can improve DMA performance in some cases using hints
provided by the OS (e.g. prefetching DMA data or using READ_CURRENT
bus transactions instead of normal memory fetches.)

cheers,
grant

Re: [PATCH v6 11/42] powerpc/powernv: Trace DMA32 segments consumed by PE

2015-08-12 Thread Gavin Shan
On Mon, Aug 10, 2015 at 07:43:48PM +1000, Alexey Kardashevskiy wrote:
On 08/06/2015 02:11 PM, Gavin Shan wrote:
On P7IOC, the whole DMA32 space is divided evenly into 256MB segments.
Each PE can consume one or multiple DMA32 segments. Current code
doesn't trace the available DMA32 segments and those consumed by
one particular PE. It's conflicting with PCI hotplug.

The patch introduces one bitmap to PHB to trace the available
DMA32 segments for allocation, more fields to struct pnv_ioda_pe
to trace the consumed DMA32 segments by the PE, which is going to
be released when the PE is destroyed at PCI unplugging time.

Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com
---
  arch/powerpc/platforms/powernv/pci-ioda.c | 40 
 +++
  arch/powerpc/platforms/powernv/pci.h  |  4 +++-
  2 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index cd22002..57ba8fd 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1946,6 +1946,7 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb 
*phb,

  /* Grab a 32-bit TCE table */
  pe->dma32_seg = base;
+ pe->dma32_segcount = segs;
  pe_info(pe, " Setting up 32-bit TCE table at %08x..%08x\n",
  (base << 28), ((base + segs) << 28) - 1);

@@ -2006,8 +2007,13 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb 
*phb,
  return;
   fail:
  /* XXX Failure: Try to fallback to 64-bit only ? */
- if (pe->dma32_seg >= 0)
+ if (pe->dma32_seg >= 0) {
+ bitmap_clear(phb->ioda.dma32_segmap,
+  pe->dma32_seg, pe->dma32_segcount);
  pe->dma32_seg = -1;
+ pe->dma32_segcount = 0;
+ }
+
  if (tce_mem)
  __free_pages(tce_mem, get_order(TCE32_TABLE_SIZE * segs));
  if (tbl) {
@@ -2443,12 +2449,11 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb 
*phb,
  pnv_ioda_setup_bus_dma(pe, pe->pbus);
  }

-static unsigned int pnv_ioda1_setup_dma(struct pnv_phb *phb,
- struct pnv_ioda_pe *pe,
- unsigned int base)
+static void pnv_ioda1_setup_dma(struct pnv_phb *phb,
+ struct pnv_ioda_pe *pe)
  {
  struct pci_controller *hose = phb->hose;
- unsigned int dma_weight, segs;
+ unsigned int dma_weight, base, segs;

  /* Calculate the PHB's DMA weight */
  dma_weight = pnv_ioda_phb_dma_weight(phb);
@@ -2461,11 +2466,28 @@ static unsigned int pnv_ioda1_setup_dma(struct 
pnv_phb *phb,
  else
  segs = (pe->dma32_weight *
  phb->ioda.dma32_segcount) / dma_weight;
+
+ /*
+  * Allocate DMA32 segments. We might not have enough
+  * resources available. However we expect at least one
+  * to be available.
+  */
+ do {
+ base = bitmap_find_next_zero_area(phb->ioda.dma32_segmap,
+   phb->ioda.dma32_segcount,
+   0, segs, 0);
+ if (base < phb->ioda.dma32_segcount) {
+ bitmap_set(phb->ioda.dma32_segmap, base, segs);
+ break;
+ }
+ } while (--segs);


If segs==0 before entering the loop, the loop will execute 0xfffe times.
Make it for(;segs;--segs){ }.


segs is always equal to or bigger than 1 when entering the loop.

+
+ if (WARN_ON(!segs))
+ return;
+
  pe_info(pe, "DMA weight %d, assigned %d segments\n",
  pe->dma32_weight, segs);
  pnv_pci_ioda_setup_dma_pe(phb, pe, base, segs);
-
- return segs;
  }

  #ifdef CONFIG_PCI_MSI
@@ -2933,20 +2955,18 @@ static void pnv_pci_ioda_setup_DMA(void)
  struct pci_controller *hose, *tmp;
  struct pnv_phb *phb;
  struct pnv_ioda_pe *pe;
- unsigned int base;

  list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
  phb = hose->private_data;
  pnv_pci_ioda_setup_opal_tce_kill(phb);

- base = 0;
  list_for_each_entry(pe, &phb->ioda.pe_dma_list, dma_link) {
  if (!pe->dma32_weight)
  continue;

  switch (phb->type) {
  case PNV_PHB_IODA1:
- base += pnv_ioda1_setup_dma(phb, pe, base);
+ pnv_ioda1_setup_dma(phb, pe);
  break;
  case PNV_PHB_IODA2:
  pnv_pci_ioda2_setup_dma_pe(phb, pe);
diff --git a/arch/powerpc/platforms/powernv/pci.h 
b/arch/powerpc/platforms/powernv/pci.h
index 574fe43..1dc9578 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -65,6 +65,7 @@ struct pnv_ioda_pe {

  /* Base iommu table, ie, 4K TCEs, 32-bit DMA */
  int 

Re: [PATCH v6 12/42] powerpc/powernv: Increase PE# capacity

2015-08-12 Thread Gavin Shan
On Tue, Aug 11, 2015 at 12:47:25PM +1000, Alexey Kardashevskiy wrote:
On 08/11/2015 10:38 AM, Gavin Shan wrote:
On Mon, Aug 10, 2015 at 07:53:02PM +1000, Alexey Kardashevskiy wrote:
On 08/06/2015 02:11 PM, Gavin Shan wrote:
Each PHB maintains an array helping to translate RID (Request
ID) to PE# with the assumption that PE# takes 8 bits, indicating
that we can't have more than 256 PEs. However, pci_dn->pe_number
already had 4 bytes for the PE#.

The patch extends the PE# capacity so that each of them will be
4 bytes long. Then we can use IODA_INVALID_PE to check whether one entry
in phb->pe_rmap[] is valid or not.

Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com
---
  arch/powerpc/platforms/powernv/pci-ioda.c | 8 ++--
  arch/powerpc/platforms/powernv/pci.h  | 7 +++
  2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index 57ba8fd..3094c61 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -786,7 +786,7 @@ static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, 
struct pnv_ioda_pe *pe)

/* Clear the reverse map */
	for (rid = pe->rid; rid < rid_end; rid++)
-   phb->ioda.pe_rmap[rid] = 0;
+   phb->ioda.pe_rmap[rid] = IODA_INVALID_PE;

/* Release from all parents PELT-V */
while (parent) {
@@ -3134,7 +3134,7 @@ static void __init pnv_pci_init_ioda_phb(struct 
device_node *np,
unsigned long size, pemap_off;
const __be64 *prop64;
const __be32 *prop32;
-   int len;
+   int len, i;
u64 phb_id;
void *aux;
long rc;
@@ -3201,6 +3201,10 @@ static void __init pnv_pci_init_ioda_phb(struct 
device_node *np,
if (prop32)
	phb->ioda.reserved_pe = be32_to_cpup(prop32);

+   /* Invalidate RID to PE# mapping */
+   for (i = 0; i < ARRAY_SIZE(phb->ioda.pe_rmap); ++i)
+   phb->ioda.pe_rmap[i] = IODA_INVALID_PE;
+
/* Parse 64-bit MMIO range */
pnv_ioda_parse_m64_window(phb);

diff --git a/arch/powerpc/platforms/powernv/pci.h 
b/arch/powerpc/platforms/powernv/pci.h
index 1dc9578..6f8568e 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -175,11 +175,10 @@ struct pnv_phb {
struct list_headpe_list;
struct mutexpe_list_mutex;

-   /* Reverse map of PEs, will have to extend if
-* we are to support more than 256 PEs, indexed
-* bus { bus, devfn }
+   /* Reverse map of PEs, indexed by
+* { bus, devfn }
 */
-   unsigned char   pe_rmap[0x1];
+   int pe_rmap[0x1];


256k seems to be a waste when only a tiny fraction of it will ever be used.
Using include/linux/hashtable.h makes sense here, and if you use a hashtable,
you won't have to initialize anything with IODA_INVALID_PE.


I'm not sure if I follow your idea completely. With a hash table to trace
the RID mapping here, won't more memory be needed if all PCI bus numbers (0
to 255) are valid? It means a hash table doesn't have an advantage in
memory consumption.

You need 3 bytes - one for the bus and two for devfn - which makes it a perfect
32bit hash key, and you only store existing devices in the hash so you do not
waste memory.


You didn't answer my concern yet: more memory will be needed if all PCI bus
numbers (0 to 255) are valid. Also, 2 bytes are enough: one byte for the
bus number, another byte for the devfn. Why do we need 3 bytes here?

How many bits of the 16 bits (2 bytes) are used as the hash key? I believe it
shouldn't be all of them because a lot of memory would be consumed for the hash
bucket heads. In most cases, we have bus-level PEs. So it sounds
reasonable to use the devfn as the hash key, which is one byte long. In this
case, 2KB (256 * 8) is used for the hash bucket heads without any node
populated in the table yet.

Every node would be represented by the data struct below, each of which consumes
24 bytes. If the PHB has 5 PCI buses, which is commonly seen, the total consumed
memory will be:

2KB for hash bucket head
30KB for hash nodes: (24 * 256 * 5)

struct pnv_ioda_rid {
   int bdfn;
   int pe_number;
   struct hlist_node node;
};

Don't forget the extra complexity needed to maintain the conflict list in one
bucket. So I don't see the benefit of using a hashtable here.
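
For reference, the hashtable variant under discussion would look roughly
like the sketch below (based on include/linux/hashtable.h; the function
names are illustrative):

#include <linux/hashtable.h>
#include <linux/slab.h>

struct pnv_ioda_rid {
	int bdfn;			/* { bus, devfn } key */
	int pe_number;
	struct hlist_node node;
};

static DEFINE_HASHTABLE(pe_rmap_hash, 8);	/* 256 buckets, 2KB of heads */

static void pe_rmap_add(int bdfn, int pe_number)
{
	struct pnv_ioda_rid *r = kzalloc(sizeof(*r), GFP_KERNEL);

	if (!r)
		return;
	r->bdfn = bdfn;
	r->pe_number = pe_number;
	hash_add(pe_rmap_hash, &r->node, bdfn);
}

static int pe_rmap_lookup(int bdfn)
{
	struct pnv_ioda_rid *r;

	hash_for_each_possible(pe_rmap_hash, r, node, bdfn)
		if (r->bdfn == bdfn)
			return r->pe_number;
	return -1;			/* i.e. IODA_INVALID_PE */
}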


On the other hand, searching in hash table buckets
has to iterate a list of conflicting items (keys), which is slow compared
to what we have.

How often do you expect this code to execute? Isn't it setup-time and
hotplug only? Unless it is thousands of times per second, it is not an issue
here.


I was intending to say: a hashtable is more complex than an array. The data
struct can be as simple as an array. I don't see why we'd bother to have a
hashtable here. However, you're correct, the code is just executed at
system 

[v2 00/11] Freescale DPAA QBMan Drivers

2015-08-12 Thread Roy Pledge

The Freescale Data Path Acceleration Architecture (DPAA) is a set of hardware 
components on specific QorIQ multicore processors. This architecture provides 
the infrastructure to support simplified sharing of networking interfaces and 
accelerators by multiple CPU cores and the accelerators.

The Queue Manager (QMan) is a hardware queue management block that allows 
software and accelerators on the datapath to enqueue and dequeue frames in 
order to communicate.

The Buffer Manager (BMan) is a hardware buffer pool management block that 
allows software and accelerators on the datapath to acquire and release buffers 
in order to build frames.

This patch set introduces the QBMan driver code that configures and initializes 
the QBMan hardware and provides APIs for software to use the frame queues and 
buffer pools the blocks provide. These drivers provide the base functionality 
for software to communicate with the other DPAA accelerators on Freescale QorIQ 
processors.

Changes from v1:
- Cleaned up Kconfig options
- Changed base QMan and BMan drivers to only be built in.
  Will add loadable support in a future patch
- Replaced panic() call with WARN_ON()
- Eliminated some unused APIs
- Replaced PowerPC specific IO accessors with platform independent versions



Emil Medve (1):
  powerpc: re-add devm_ioremap_prot()

Geoff Thorpe (7):
  soc/fsl: Introduce DPAA BMan device management driver
  soc/fsl: Introduce the DPAA BMan portal driver
  soc/fsl: Introduce drivers for the DPAA QMan
  soc/bman: Add self-tester for BMan driver
  soc/qman: Add self-tester for QMan driver
  soc/bman: Add debugfs support for the BMan driver
  soc/qman: Add debugfs support for the QMan driver

Hai-Ying Wang (2):
  soc/bman: Add HOTPLUG_CPU support to the BMan driver
  soc/qman: Add HOTPLUG_CPU support to the QMan driver

Madalin Bucur (1):
  soc/qman: add qman_delete_cgr_safe()

 arch/powerpc/include/asm/io.h |3 +
 arch/powerpc/lib/Makefile |1 +
 arch/powerpc/lib/devres.c |   43 +
 arch/powerpc/platforms/85xx/corenet_generic.c |   16 +
 arch/powerpc/platforms/85xx/p1023_rdb.c   |   14 +
 drivers/soc/Kconfig   |1 +
 drivers/soc/Makefile  |1 +
 drivers/soc/fsl/Kconfig   |5 +
 drivers/soc/fsl/Makefile  |3 +
 drivers/soc/fsl/qbman/Kconfig |  120 +
 drivers/soc/fsl/qbman/Makefile|   20 +
 drivers/soc/fsl/qbman/bman-debugfs.c  |  117 +
 drivers/soc/fsl/qbman/bman.c  |  553 +
 drivers/soc/fsl/qbman/bman.h  |  542 +
 drivers/soc/fsl/qbman/bman_api.c  | 1072 +
 drivers/soc/fsl/qbman/bman_portal.c   |  391 
 drivers/soc/fsl/qbman/bman_priv.h |  134 ++
 drivers/soc/fsl/qbman/bman_test.c |   56 +
 drivers/soc/fsl/qbman/bman_test.h |   34 +
 drivers/soc/fsl/qbman/bman_test_api.c |  184 ++
 drivers/soc/fsl/qbman/bman_test_thresh.c  |  198 ++
 drivers/soc/fsl/qbman/bman_utils.c|   72 +
 drivers/soc/fsl/qbman/dpaa_resource.c |  359 +++
 drivers/soc/fsl/qbman/dpaa_sys.h  |  271 +++
 drivers/soc/fsl/qbman/qman-debugfs.c  | 1313 +++
 drivers/soc/fsl/qbman/qman.c  | 1026 +
 drivers/soc/fsl/qbman/qman.h  | 1128 ++
 drivers/soc/fsl/qbman/qman_api.c  | 2921 +
 drivers/soc/fsl/qbman/qman_driver.c   |   83 +
 drivers/soc/fsl/qbman/qman_portal.c   |  672 ++
 drivers/soc/fsl/qbman/qman_priv.h |  287 +++
 drivers/soc/fsl/qbman/qman_test.c |   57 +
 drivers/soc/fsl/qbman/qman_test.h |   44 +
 drivers/soc/fsl/qbman/qman_test_api.c |  216 ++
 drivers/soc/fsl/qbman/qman_test_stash.c   |  502 +
 drivers/soc/fsl/qbman/qman_utils.c|  305 +++
 include/soc/fsl/bman.h|  518 +
 include/soc/fsl/qman.h| 1977 +
 38 files changed, 15259 insertions(+)
 create mode 100644 arch/powerpc/lib/devres.c
 create mode 100644 drivers/soc/fsl/Kconfig
 create mode 100644 drivers/soc/fsl/Makefile
 create mode 100644 drivers/soc/fsl/qbman/Kconfig
 create mode 100644 drivers/soc/fsl/qbman/Makefile
 create mode 100644 drivers/soc/fsl/qbman/bman-debugfs.c
 create mode 100644 drivers/soc/fsl/qbman/bman.c
 create mode 100644 drivers/soc/fsl/qbman/bman.h
 create mode 100644 drivers/soc/fsl/qbman/bman_api.c
 create mode 100644 drivers/soc/fsl/qbman/bman_portal.c
 create mode 100644 drivers/soc/fsl/qbman/bman_priv.h
 create mode 100644 drivers/soc/fsl/qbman/bman_test.c
 create mode 100644 drivers/soc/fsl/qbman/bman_test.h
 create mode 100644 drivers/soc/fsl/qbman/bman_test_api.c
 create mode 100644 drivers/soc/fsl/qbman/bman_test_thresh.c
 create 

[v2 08/11] soc/qman: Add debugfs support for the QMan driver

2015-08-12 Thread Roy Pledge
From: Geoff Thorpe geoff.tho...@freescale.com

Add debugfs support for querying the state of hardware based
queues managed by the DPAA 1.0 Queue Manager.

Signed-off-by: Geoff Thorpe geoff.tho...@freescale.com
Signed-off-by: Emil Medve emilian.me...@freescale.com
Signed-off-by: Madalin Bucur madalin.bu...@freescale.com
Signed-off-by: Roy Pledge roy.ple...@freescale.com
---
 drivers/soc/fsl/qbman/Makefile   |1 +
 drivers/soc/fsl/qbman/dpaa_sys.h |2 +
 drivers/soc/fsl/qbman/qman-debugfs.c | 1313 ++
 drivers/soc/fsl/qbman/qman_api.c |   60 +-
 drivers/soc/fsl/qbman/qman_priv.h|8 +
 5 files changed, 1382 insertions(+), 2 deletions(-)
 create mode 100644 drivers/soc/fsl/qbman/qman-debugfs.c

diff --git a/drivers/soc/fsl/qbman/Makefile b/drivers/soc/fsl/qbman/Makefile
index 2b53fbc..cce1f70 100644
--- a/drivers/soc/fsl/qbman/Makefile
+++ b/drivers/soc/fsl/qbman/Makefile
@@ -17,3 +17,4 @@ obj-$(CONFIG_FSL_QMAN_TEST)   += qman-test.o
 qman-test-y = qman_test.o
 qman-test-$(CONFIG_FSL_QMAN_TEST_API)  += qman_test_api.o
 qman-test-$(CONFIG_FSL_QMAN_TEST_STASH)+= qman_test_stash.o
+obj-$(CONFIG_FSL_QMAN_DEBUGFS) += qman-debugfs.o
diff --git a/drivers/soc/fsl/qbman/dpaa_sys.h b/drivers/soc/fsl/qbman/dpaa_sys.h
index 3cf446a..0dd341c 100644
--- a/drivers/soc/fsl/qbman/dpaa_sys.h
+++ b/drivers/soc/fsl/qbman/dpaa_sys.h
@@ -38,7 +38,9 @@
 #include <linux/of_irq.h>
 #include <linux/of_reserved_mem.h>
 #include <linux/kthread.h>
+#include <linux/uaccess.h>
 #include <linux/debugfs.h>
+#include <linux/vmalloc.h>
 #include <linux/platform_device.h>
 #include <linux/ctype.h>
 
diff --git a/drivers/soc/fsl/qbman/qman-debugfs.c b/drivers/soc/fsl/qbman/qman-debugfs.c
new file mode 100644
index 000..57585e8
--- /dev/null
+++ b/drivers/soc/fsl/qbman/qman-debugfs.c
@@ -0,0 +1,1313 @@
+/* Copyright 2010 - 2015 Freescale Semiconductor, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *  notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in the
+ *  documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the
+ *  names of its contributors may be used to endorse or promote products
+ *  derived from this software without specific prior written permission.
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License (GPL) as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include "qman_priv.h"
+
+#define MAX_FQID (0x00ff)
+#define QM_FQD_BLOCK_SIZE 64
+#define QM_FQD_AR	(0xC10)
+
+static u32 fqid_max;
+static u64 qman_ccsr_start;
+static u64 qman_ccsr_size;
+
+static const char * const state_txt[] = {
+	"Out of Service",
+	"Retired",
+	"Tentatively Scheduled",
+	"Truly Scheduled",
+	"Parked",
+	"Active, Active Held or Held Suspended",
+	"Unknown State 6",
+	"Unknown State 7",
+	NULL,
+};
+
+static const u8 fqd_states[] = {
+   QM_MCR_NP_STATE_OOS, QM_MCR_NP_STATE_RETIRED, QM_MCR_NP_STATE_TEN_SCHED,
+   QM_MCR_NP_STATE_TRU_SCHED, QM_MCR_NP_STATE_PARKED,
+   QM_MCR_NP_STATE_ACTIVE};
+
+struct mask_to_text {
+   u16 mask;
+   const char *txt;
+};
+
+struct mask_filter_s {
+   u16 mask;
+   u8 filter;
+};
+
+static const struct mask_filter_s mask_filter[] = {
+   {QM_FQCTRL_PREFERINCACHE, 0},
+   {QM_FQCTRL_PREFERINCACHE, 1},
+   {QM_FQCTRL_HOLDACTIVE, 0},
+   {QM_FQCTRL_HOLDACTIVE, 1},
+   {QM_FQCTRL_AVOIDBLOCK, 0},
+   {QM_FQCTRL_AVOIDBLOCK, 1},
+   {QM_FQCTRL_FORCESFDR, 0},
+   {QM_FQCTRL_FORCESFDR, 1},
+   {QM_FQCTRL_CPCSTASH, 0},
+   {QM_FQCTRL_CPCSTASH, 1},
+   

Re: [PATCH 31/31] dma-mapping-common: skip kmemleak checks for page-less SG entries

2015-08-12 Thread Catalin Marinas
Christoph,

On 12 August 2015 at 08:05, Christoph Hellwig h...@lst.de wrote:
 Signed-off-by: Christoph Hellwig h...@lst.de
 ---
  include/asm-generic/dma-mapping-common.h | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)

 diff --git a/include/asm-generic/dma-mapping-common.h b/include/asm-generic/dma-mapping-common.h
 index 940d5ec..afc3eaf 100644
 --- a/include/asm-generic/dma-mapping-common.h
 +++ b/include/asm-generic/dma-mapping-common.h
 @@ -51,8 +51,10 @@ static inline int dma_map_sg_attrs(struct device *dev, 
 struct scatterlist *sg,
 int i, ents;
 struct scatterlist *s;

 -   for_each_sg(sg, s, nents, i)
 -   kmemcheck_mark_initialized(sg_virt(s), s-length);
 +   for_each_sg(sg, s, nents, i) {
 +   if (sg_has_page(s))
 +   kmemcheck_mark_initialized(sg_virt(s), s-length);
 +   }

Just a nitpick for the subject, it should say kmemcheck rather than
kmemleak (different features ;)).

-- 
Catalin

Re: [PATCH v6 05/42] powerpc/powernv: Track IO/M32/M64 segments from PE

2015-08-12 Thread Gavin Shan
On Wed, Aug 12, 2015 at 10:57:33PM +1000, Alexey Kardashevskiy wrote:
On 08/12/2015 09:20 PM, Gavin Shan wrote:
On Wed, Aug 12, 2015 at 09:05:09PM +1000, Alexey Kardashevskiy wrote:
On 08/12/2015 08:45 PM, Gavin Shan wrote:
On Tue, Aug 11, 2015 at 12:23:42PM +1000, Alexey Kardashevskiy wrote:
On 08/11/2015 10:03 AM, Gavin Shan wrote:
On Mon, Aug 10, 2015 at 05:16:40PM +1000, Alexey Kardashevskiy wrote:
On 08/06/2015 02:11 PM, Gavin Shan wrote:
The patch is adding 6 bitmaps, three to PE and three to PHB, to track

The patch is also removing 2 arrays (io_segmap and m32_segmap); what is
that all about? Also, there was no m64_segmap and now there is, which
maybe needs an explanation.


Originally, the bitmaps (io_segmap and m32_segmap) were allocated
dynamically. Now they have a fixed size of 512 bits.

The subject powerpc/powernv: Track IO/M32/M64 segments from PE indicates
why m64_segmap is added.


But before this patch, you somehow managed to keep it working without a map
for M64 while at the same time you needed maps for IO and M32. It seems you
are making things consistent in this patch, but it also feels like you do
not have to: M64 did not need a map before and I cannot see why it needs
one now.


The M64 map is used by [PATCH v6 23/42] powerpc/powernv: Release PEs 
dynamically
where the M64 segments consumed by one particular PE will be released.


Then add it where it really starts being used. It is really hard to
review a patch whose changes are actually spread across other patches.
Do not count on reviewers just trusting you.


Ok. I'll try.



the segments consumed by one particular PE, which can be released once
the PE is destroyed at PCI unplug time. Also, we're using a fixed
quantity of bits to track the IO, M32 and M64 segments used by PEs
in one particular PHB.


Out of curiosity - have you considered having just 3 arrays, in the PHB,
storing PE numbers, and ditching the PE's arrays? Does the PE itself need
to know which segments it is using? Not sure about the master/slave PEs
though.


I don't follow your suggestion. Can you rephrase and explain it a bit 
more?


Please explain in what situations you need the same map in both PHB and PE
and how you are going to use them. For example, pe::m64_segmap and
phb::m64_segmap.

I believe you need to know which segment is used by which PE, and that's
it; having 2 bitmaps is overcomplicated and hard to follow. Is there
anything else that I am missing?


The situation is the same for all (IO, M32 and M64) segment maps. Taking
m64_segmap as an example: it is used when creating or destroying a PE that
consumes M64 segments. phb::m64_segmap records the M64 segment usage in the
PHB's domain; it is used to check that the same M64 segment won't be
assigned twice. pe::m64_segmap tracks the M64 segments consumed by that PE.


You could have a single map in PHB, key would be a segment number and value
would be PE number. No need to have a map in PE. At all. No need to
initialize bitmaps, etc.


So it would be arrays for the various segment maps, if I understood your
suggestion correctly, as below. Please confirm:

#define PNV_IODA_MAX_SEG_NUM  512

  int pnv_phb::io_segmap[PNV_IODA_MAX_SEG_NUM];
  int pnv_phb::m32_segmap[PNV_IODA_MAX_SEG_NUM];
  int pnv_phb::m64_segmap[PNV_IODA_MAX_SEG_NUM];
- Initially, they are initialized to IODA_INVALID_PE;
- When one segment is assigned to one PE, the corresponding entry
   of the array is set to the PE number;
- When one segment is released, the corresponding entry of the array
   is set to IODA_INVALID_PE.


No, not arrays, I meant DEFINE_HASHTABLE(), hash_add(), etc from
include/linux/hashtable.h.

http://kernelnewbies.org/FAQ/Hashtables is a good place to start :)


Are you sure a hashtable is needed to represent such a simple data
structure? I really don't understand the benefits; could you provide
more details about them?

With a hashtable, every bucket will hold multiple items with conflicting
hash keys, each represented by the data structure below. That structure
uses 24 bytes of memory, which is not efficient from that aspect. Each
time one more segment is consumed, an instance of struct pnv_ioda_segment
is allocated and put onto the collision list of the target bucket. At a
later point, the instance is removed from the list and released when the
segment is detached from the PE. It's more complex than it should be.

struct pnv_ioda_segment {
int   pe_number;
int   seg_number;
struct hlist_node node;
};
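
For comparison, a minimal sketch of the flat-array alternative discussed
above: the segment number indexes directly into an array whose value is
the owning PE number, so no per-segment allocation is needed. The helper
names here are made up for illustration, and -EBUSY assumes
<linux/errno.h>:

#define PNV_IODA_MAX_SEG_NUM	512
#define IODA_INVALID_PE		(-1)

/* One entry per segment; the value is the owning PE number */
static int m64_segmap[PNV_IODA_MAX_SEG_NUM] = {
	[0 ... PNV_IODA_MAX_SEG_NUM - 1] = IODA_INVALID_PE,
};

static int seg_assign(int seg, int pe_number)
{
	if (m64_segmap[seg] != IODA_INVALID_PE)
		return -EBUSY;	/* segment already consumed by some PE */
	m64_segmap[seg] = pe_number;
	return 0;
}

static void seg_release(int seg)
{
	m64_segmap[seg] = IODA_INVALID_PE;
}

Releasing every segment of a dying PE is then a linear scan for entries
matching its PE number, with no collision lists or extra allocations.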

Thanks,
Gavin


[v2 06/11] soc/qman: Add self-tester for QMan driver

2015-08-12 Thread Roy Pledge
From: Geoff Thorpe geoff.tho...@freescale.com

Add a self test for the DPAA 1.0 Queue Manager driver. The tests
ensure that the driver can properly enqueue and dequeue from frame
queues using the QMan portal infrastructure.

Signed-off-by: Geoff Thorpe geoff.tho...@freescale.com
Signed-off-by: Emil Medve emilian.me...@freescale.com
Signed-off-by: Roy Pledge roy.ple...@freescale.com
---
 drivers/soc/fsl/qbman/Makefile  |4 +
 drivers/soc/fsl/qbman/qman_test.c   |   57 
 drivers/soc/fsl/qbman/qman_test.h   |   44 +++
 drivers/soc/fsl/qbman/qman_test_api.c   |  216 +
 drivers/soc/fsl/qbman/qman_test_stash.c |  502 +++
 5 files changed, 823 insertions(+)
 create mode 100644 drivers/soc/fsl/qbman/qman_test.c
 create mode 100644 drivers/soc/fsl/qbman/qman_test.h
 create mode 100644 drivers/soc/fsl/qbman/qman_test_api.c
 create mode 100644 drivers/soc/fsl/qbman/qman_test_stash.c

diff --git a/drivers/soc/fsl/qbman/Makefile b/drivers/soc/fsl/qbman/Makefile
index 04509c3..82f5482 100644
--- a/drivers/soc/fsl/qbman/Makefile
+++ b/drivers/soc/fsl/qbman/Makefile
@@ -12,3 +12,7 @@ bman-test-$(CONFIG_FSL_BMAN_TEST_THRESH) += bman_test_thresh.o
 
 obj-$(CONFIG_FSL_QMAN)			+= qman_api.o qman_utils.o qman_driver.o
 obj-$(CONFIG_FSL_QMAN_CONFIG)  += qman.o qman_portal.o
+obj-$(CONFIG_FSL_QMAN_TEST)+= qman-test.o
+qman-test-y = qman_test.o
+qman-test-$(CONFIG_FSL_QMAN_TEST_API)  += qman_test_api.o
+qman-test-$(CONFIG_FSL_QMAN_TEST_STASH)+= qman_test_stash.o
diff --git a/drivers/soc/fsl/qbman/qman_test.c b/drivers/soc/fsl/qbman/qman_test.c
new file mode 100644
index 000..9ec49cb
--- /dev/null
+++ b/drivers/soc/fsl/qbman/qman_test.c
@@ -0,0 +1,57 @@
+/* Copyright 2008 - 2015 Freescale Semiconductor, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *  notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in the
+ *  documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the
+ *  names of its contributors may be used to endorse or promote products
+ *  derived from this software without specific prior written permission.
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License (GPL) as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "qman_test.h"
+
+MODULE_AUTHOR("Geoff Thorpe");
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_DESCRIPTION("QMan testing");
+
+static int test_init(void)
+{
+   int loop = 1;
+
+   while (loop--) {
+#ifdef CONFIG_FSL_QMAN_TEST_STASH
+   qman_test_stash();
+#endif
+#ifdef CONFIG_FSL_QMAN_TEST_API
+   qman_test_api();
+#endif
+   }
+   return 0;
+}
+
+static void test_exit(void)
+{
+}
+
+module_init(test_init);
+module_exit(test_exit);
diff --git a/drivers/soc/fsl/qbman/qman_test.h b/drivers/soc/fsl/qbman/qman_test.h
new file mode 100644
index 000..0b34a67
--- /dev/null
+++ b/drivers/soc/fsl/qbman/qman_test.h
@@ -0,0 +1,44 @@
+/* Copyright 2008 - 2015 Freescale Semiconductor, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *  notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in the
+ *  documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor 

Re: [PATCH v6 18/42] powerpc/powernv: Allocate PE# in descending order

2015-08-12 Thread Gavin Shan
On Tue, Aug 11, 2015 at 12:50:33PM +1000, Alexey Kardashevskiy wrote:
On 08/11/2015 10:43 AM, Gavin Shan wrote:
On Tue, Aug 11, 2015 at 12:39:02AM +1000, Alexey Kardashevskiy wrote:
On 08/06/2015 02:11 PM, Gavin Shan wrote:
The available PE#, represented by a bitmap in the PHB, is allocated
in ascending order.

Available PE# is available exactly because it is not allocated ;)


Yeah, will correct it.

It conflicts with the fact that M64 segments are
assigned in the same order. In order to avoid the conflict, the patch
allocates PE# in descending order.

What kind of conflict?


On PHB3, an M64 segment is assigned to one PE whose PE number is thereby
determined. M64 segments are allocated in ascending order. That's why
I would like to allocate PE# in descending order.


From previous lessons, I thought the M64 segment number was the PE# as
well :-/ Seems this is not the case, so what stores this seg#-to-PE#
mapping in the PHB?


Your understanding is somewhat correct. Let me explain a bit more, taking
PHB3 as an example: it has 16 M64 BARs. The last BAR (the 15th) runs in
shared mode. When one segment from this BAR is assigned to a PE, the PE
number is determined and is equal to the segment number. However, it's
still possible for one PE to have multiple segments; we have master and
slave PEs for the latter case.

If any of the remaining BARs (0 to 14) runs in single mode and is assigned
to one particular PE, the PE number can be configured.
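
To make the constraint concrete, a small illustrative sketch (not from
the patch; the helper name is made up) of the fixed mapping on the
shared-mode BAR, and why descending PE# allocation sidesteps it:

/* On PHB3's shared-mode BAR the mapping is fixed */
static int m64_shared_seg_to_pe(int seg)
{
	return seg;	/* PE number == M64 segment number */
}

/*
 * M64 segments are consumed 0, 1, 2, ... so the PE numbers they force
 * climb from the bottom. Handing out dynamically allocated PE numbers
 * from the top (total_pe_num - 1 downwards) means the two sequences
 * only collide when the PHB is genuinely out of PEs.
 */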



Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com
---
  arch/powerpc/platforms/powernv/pci-ioda.c | 11 ---
  1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 56b058c..1c950e8 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -161,13 +161,18 @@ static struct pnv_ioda_pe *pnv_ioda_reserve_pe(struct pnv_phb *phb, int pe_no)
  static struct pnv_ioda_pe *pnv_ioda_alloc_pe(struct pnv_phb *phb)
  {
 	unsigned long pe;
+	unsigned long limit = phb->ioda.total_pe_num - 1;
 
 	do {
 		pe = find_next_zero_bit(phb->ioda.pe_alloc,
-					phb->ioda.total_pe_num, 0);
-		if (pe >= phb->ioda.total_pe_num)
+					phb->ioda.total_pe_num, limit);
+		if (pe < phb->ioda.total_pe_num &&
+		    !test_and_set_bit(pe, phb->ioda.pe_alloc))
+			break;
+
+		if (--limit >= phb->ioda.total_pe_num)
 			return NULL;
-	} while(test_and_set_bit(pe, phb->ioda.pe_alloc));
+	} while (1);


Usually, if it is while(1), then it is while(1){} rather than
do{}while(1) :)

Agree, will change it.




return pnv_ioda_init_pe(phb, pe);
  }



Thanks,
Gavin


Re: [1/8] powerpc/slb: Remove a duplicate extern variable

2015-08-12 Thread Michael Ellerman
On Wed, 2015-29-07 at 07:09:58 UTC, Anshuman Khandual wrote:
 This patch just removes one redundant entry for one extern variable
 'slb_compare_rr_to_size' from the scope. This patch does not change
 any functionality.
 
 Signed-off-by: Anshuman Khandual khand...@linux.vnet.ibm.com

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/752b8adec4a776b4fdf0

cheers

[PATCH v4 00/11] CXL EEH Handling

2015-08-12 Thread Daniel Axtens
CXL accelerators are unfortunately not immune to failure. This patch
set enables them to participate in the Extended Error Handling process.

This series starts with a number of preparatory patches:

 - Patch 1 is cleanup: converting macros to static inlines.

 - Patch 2 makes sure we don't touch the hardware when it has failed.
 
 - Patches 3-5 make the 'unplug' functions idempotent, so that if we
   get part way through recovery and then fail, being completely
   unplugged as part of removal doesn't cause us to oops out.

 - Patches 6 and 7 refactor init and teardown paths for the adapter
   and AFUs, so that they can be configured and deconfigured
   separately from their allocation and release.

 - Patch 8 stops cxl_reset from breaking EEH.

Patches 9 and 10 are parts of EEH.

 - Firstly we have a kernel flag that allows us to confidently assert
   that the hardware will not change (be reflashed) when it is reset.
   We need this in order to be able to safely do EEH recovery.

 - We then have the EEH support itself.

Finally, we add a CONFIG_CXL_EEH symbol. This allows drivers to depend
on the API we provide to enable CXL EEH, or to be easily backportable
if EEH is optional.

Changes from v3 are minor:
 - Clarification of responsibility of CXL driver vs driver bound to
   vPHB with regards to preventing inappropriate access of hardware
   during recovery.
 - Clean up unused rc in cxl_alloc_adapter, thanks David Laight.
 - Break setting rc and testing rc into different lines, thanks mpe
   and Cyril.
 - If we fail to init an AFU, don't try to select the best mode.

Changes from v2 are mostly minor cleanups, reflecting some review and
further testing.
 - Use static inlines instead of macros.
 - Propagate PCI link state to devices on the vPHB.
 - Various cleanup, thanks Cyril Bur.
 - Use pci_channel_offline instead of a direct check.
 - Don't ifdef, just provide the symbol so that drivers know that the
   new API is available. Thanks to Cyril for patiently explaining this
   to me about 3 times before I understood.

Changes from v1:
 - More comprehensive link down checks, including vPHB.
 - Rebased to apply cleanly to 4.2-rc4.
 - cxl reset changes.
 - CONFIG_CXL_EEH symbol addition.
 - add better vPHB support to EEH.

Daniel Axtens (11):
  cxl: Convert MMIO read/write macros to inline functions
  cxl: Drop commands if the PCI channel is not in normal state
  cxl: Allocate and release the SPA with the AFU
  cxl: Make IRQ release idempotent
  cxl: Clean up adapter MMIO unmap path.
  cxl: Refactor adaptor init/teardown
  cxl: Refactor AFU init/teardown
  cxl: Don't remove AFUs/vPHBs in cxl_reset
  cxl: Allow the kernel to trust that an image won't change on PERST.
  cxl: EEH support
  cxl: Add CONFIG_CXL_EEH symbol

 Documentation/ABI/testing/sysfs-class-cxl |  10 +
 drivers/misc/cxl/Kconfig  |   6 +
 drivers/misc/cxl/api.c|   7 +
 drivers/misc/cxl/context.c|   6 +-
 drivers/misc/cxl/cxl.h|  84 -
 drivers/misc/cxl/file.c   |  19 +
 drivers/misc/cxl/irq.c|   9 +
 drivers/misc/cxl/native.c | 104 +-
 drivers/misc/cxl/pci.c| 591 +++---
 drivers/misc/cxl/sysfs.c  |  26 ++
 drivers/misc/cxl/vphb.c   |  34 ++
 include/misc/cxl.h|  10 +
 12 files changed, 752 insertions(+), 154 deletions(-)

-- 
2.1.4


Re: [1/4] cxl: Compile with -Werror

2015-08-12 Thread Michael Ellerman
On Fri, 2015-07-08 at 03:18:17 UTC, Daniel Axtens wrote:
 It's a good idea, and it brings us in line with the rest of arch/powerpc.
 
 Signed-off-by: Daniel Axtens d...@axtens.net
 Acked-by: Michael Neuling mi...@neuling.org

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/d3d73f4b38a8ece19846

cheers

Re: [6/8] powerpc/prom: Simplify the logic while fetching SLB size

2015-08-12 Thread Michael Ellerman
On Wed, 2015-29-07 at 07:10:03 UTC, Anshuman Khandual wrote:
 This patch just simplifies the existing code logic while fetching
 the SLB size property from the device tree. This also changes the
 function name from check_cpu_slb_size to init_mmu_slb_size as
 it just initializes the mmu_slb_size value.
 
 Signed-off-by: Anshuman Khandual khand...@linux.vnet.ibm.com

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/9c61f7a0ad6fdff85b0c

cheers

[PATCH v4 04/11] cxl: Make IRQ release idempotent

2015-08-12 Thread Daniel Axtens
Check if an IRQ is mapped before releasing it.

This will simplify future EEH code by allowing unconditional unmapping
of IRQs.

Acked-by: Cyril Bur cyril...@gmail.com
Signed-off-by: Daniel Axtens d...@axtens.net
---
 drivers/misc/cxl/irq.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/misc/cxl/irq.c b/drivers/misc/cxl/irq.c
index 77e5d0e7ebe1..9a1e5732c1af 100644
--- a/drivers/misc/cxl/irq.c
+++ b/drivers/misc/cxl/irq.c
@@ -341,6 +341,9 @@ int cxl_register_psl_err_irq(struct cxl *adapter)
 
 void cxl_release_psl_err_irq(struct cxl *adapter)
 {
+   if (adapter-err_virq != irq_find_mapping(NULL, adapter-err_hwirq))
+   return;
+
cxl_p1_write(adapter, CXL_PSL_ErrIVTE, 0x);
cxl_unmap_irq(adapter-err_virq, adapter);
cxl_release_one_irq(adapter, adapter-err_hwirq);
@@ -374,6 +377,9 @@ int cxl_register_serr_irq(struct cxl_afu *afu)
 
 void cxl_release_serr_irq(struct cxl_afu *afu)
 {
+   if (afu-serr_virq != irq_find_mapping(NULL, afu-serr_hwirq))
+   return;
+
cxl_p1n_write(afu, CXL_PSL_SERR_An, 0x);
cxl_unmap_irq(afu-serr_virq, afu);
cxl_release_one_irq(afu-adapter, afu-serr_hwirq);
@@ -400,6 +406,9 @@ int cxl_register_psl_irq(struct cxl_afu *afu)
 
 void cxl_release_psl_irq(struct cxl_afu *afu)
 {
+   if (afu-psl_virq != irq_find_mapping(NULL, afu-psl_hwirq))
+   return;
+
cxl_unmap_irq(afu-psl_virq, afu);
cxl_release_one_irq(afu-adapter, afu-psl_hwirq);
kfree(afu-psl_irq_name);
-- 
2.1.4


[PATCH v4 07/11] cxl: Refactor AFU init/teardown

2015-08-12 Thread Daniel Axtens
As with an adapter, some aspects of initialisation are done only once
in the lifetime of an AFU: for example, allocating memory, or setting
up sysfs/debugfs files.

However, we may want to be able to do some parts of the initialisation
multiple times: for example, in error recovery we want to be able to
tear down and then re-map IO memory and IRQs.

Therefore, refactor AFU init/teardown as follows.

 - Create two new functions: 'cxl_configure_afu', and its pair
   'cxl_deconfigure_afu'. As with the adapter functions,
   these (de)configure resources that do not need to last the entire
   lifetime of the AFU.

 - Allocating and releasing memory remain the task of 'cxl_alloc_afu'
   and 'cxl_release_afu'.

 - Once-only functions that do not involve allocating/releasing memory
   stay in the overarching 'cxl_init_afu'/'cxl_remove_afu' pair.
   However, the task of picking an AFU mode and activating it has been
   broken out.

Signed-off-by: Daniel Axtens d...@axtens.net
---
 drivers/misc/cxl/pci.c | 124 ++---
 1 file changed, 77 insertions(+), 47 deletions(-)

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index f3c5998f2f37..8e7b0f3ad254 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -752,45 +752,77 @@ ssize_t cxl_afu_read_err_buffer(struct cxl_afu *afu, char 
*buf,
return count;
 }
 
-static int cxl_init_afu(struct cxl *adapter, int slice, struct pci_dev *dev)
+static int cxl_configure_afu(struct cxl_afu *afu, struct cxl *adapter, struct 
pci_dev *dev)
 {
-   struct cxl_afu *afu;
-   bool free = true;
int rc;
 
-   if (!(afu = cxl_alloc_afu(adapter, slice)))
-   return -ENOMEM;
-
-   if ((rc = dev_set_name(afu-dev, afu%i.%i, adapter-adapter_num, 
slice)))
-   goto err1;
+   rc = cxl_map_slice_regs(afu, adapter, dev);
+   if (rc)
+   return rc;
 
-   if ((rc = cxl_map_slice_regs(afu, adapter, dev)))
+   rc = sanitise_afu_regs(afu);
+   if (rc)
goto err1;
 
-   if ((rc = sanitise_afu_regs(afu)))
-   goto err2;
-
/* We need to reset the AFU before we can read the AFU descriptor */
-   if ((rc = __cxl_afu_reset(afu)))
-   goto err2;
+   rc = __cxl_afu_reset(afu);
+   if (rc)
+   goto err1;
 
if (cxl_verbose)
dump_afu_descriptor(afu);
 
-   if ((rc = cxl_read_afu_descriptor(afu)))
-   goto err2;
+   rc = cxl_read_afu_descriptor(afu);
+   if (rc)
+   goto err1;
 
-   if ((rc = cxl_afu_descriptor_looks_ok(afu)))
-   goto err2;
+   rc = cxl_afu_descriptor_looks_ok(afu);
+   if (rc)
+   goto err1;
 
-   if ((rc = init_implementation_afu_regs(afu)))
-   goto err2;
+   rc = init_implementation_afu_regs(afu);
+   if (rc)
+   goto err1;
+
+   rc = cxl_register_serr_irq(afu);
+   if (rc)
+   goto err1;
 
-   if ((rc = cxl_register_serr_irq(afu)))
+   rc = cxl_register_psl_irq(afu);
+   if (rc)
goto err2;
 
-   if ((rc = cxl_register_psl_irq(afu)))
-   goto err3;
+   return 0;
+
+err2:
+   cxl_release_serr_irq(afu);
+err1:
+   cxl_unmap_slice_regs(afu);
+   return rc;
+}
+
+static void cxl_deconfigure_afu(struct cxl_afu *afu)
+{
+   cxl_release_psl_irq(afu);
+   cxl_release_serr_irq(afu);
+   cxl_unmap_slice_regs(afu);
+}
+
+static int cxl_init_afu(struct cxl *adapter, int slice, struct pci_dev *dev)
+{
+   struct cxl_afu *afu;
+   int rc;
+
+   if (!(afu = cxl_alloc_afu(adapter, slice)))
+   return -ENOMEM;
+
+   rc = dev_set_name(afu-dev, afu%i.%i, adapter-adapter_num, slice);
+   if (rc)
+   goto err_free;
+
+   rc = cxl_configure_afu(afu, adapter, dev);
+   if (rc)
+   goto err_free;
 
/* Don't care if this fails */
cxl_debugfs_afu_add(afu);
@@ -799,38 +831,32 @@ static int cxl_init_afu(struct cxl *adapter, int slice, 
struct pci_dev *dev)
 * After we call this function we must not free the afu directly, even
 * if it returns an error!
 */
-   if ((rc = cxl_register_afu(afu)))
+   rc = cxl_register_afu(afu);
+   if (rc)
goto err_put1;
 
-   if ((rc = cxl_sysfs_afu_add(afu)))
+   rc = cxl_sysfs_afu_add(afu);
+   if (rc)
goto err_put1;
 
-
-   if ((rc = cxl_afu_select_best_mode(afu)))
-   goto err_put2;
-
adapter-afu[afu-slice] = afu;
 
-   if ((rc = cxl_pci_vphb_add(afu)))
+   rc = cxl_pci_vphb_add(afu);
+   if (rc)
dev_info(afu-dev, Can't register vPHB\n);
 
return 0;
 
-err_put2:
-   cxl_sysfs_afu_remove(afu);
 err_put1:
-   device_unregister(afu-dev);
-   free = false;
+   cxl_deconfigure_afu(afu);

[PATCH v4 05/11] cxl: Clean up adapter MMIO unmap path.

2015-08-12 Thread Daniel Axtens
 - MMIO pointer unmapping is guarded by a null pointer check.
   However, iounmap doesn't null the pointer, just invalidates it.
   Therefore, explicitly null the pointer after unmapping (see the
   sketch below).

 - afu_desc_mmio also needs to be unmapped.

 - PCI regions are allocated in cxl_map_adapter_regs.
   Therefore they should be released in unmap, not elsewhere.
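
As a standalone sketch of the idiom being applied (hypothetical helper,
assuming <linux/io.h>; the patch open-codes the same pattern for each
pointer):

static void unmap_and_clear(void __iomem **mmio)
{
	if (*mmio) {
		iounmap(*mmio);		/* invalidates the mapping... */
		*mmio = NULL;		/* ...but doesn't clear our cached pointer */
	}
}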

Acked-by: Cyril Bur cyril...@gmail.com
Signed-off-by: Daniel Axtens d...@axtens.net
---
 drivers/misc/cxl/pci.c | 24 ++--
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 62a762d94de3..484d35a5aead 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -539,10 +539,18 @@ err:
 
 static void cxl_unmap_slice_regs(struct cxl_afu *afu)
 {
-   if (afu-p2n_mmio)
+   if (afu-p2n_mmio) {
iounmap(afu-p2n_mmio);
-   if (afu-p1n_mmio)
+   afu-p2n_mmio = NULL;
+   }
+   if (afu-p1n_mmio) {
iounmap(afu-p1n_mmio);
+   afu-p1n_mmio = NULL;
+   }
+   if (afu-afu_desc_mmio) {
+   iounmap(afu-afu_desc_mmio);
+   afu-afu_desc_mmio = NULL;
+   }
 }
 
 static void cxl_release_afu(struct device *dev)
@@ -919,10 +927,16 @@ err1:
 
 static void cxl_unmap_adapter_regs(struct cxl *adapter)
 {
-   if (adapter-p1_mmio)
+   if (adapter-p1_mmio) {
iounmap(adapter-p1_mmio);
-   if (adapter-p2_mmio)
+   adapter-p1_mmio = NULL;
+   pci_release_region(to_pci_dev(adapter-dev.parent), 2);
+   }
+   if (adapter-p2_mmio) {
iounmap(adapter-p2_mmio);
+   adapter-p2_mmio = NULL;
+   pci_release_region(to_pci_dev(adapter-dev.parent), 0);
+   }
 }
 
 static int cxl_read_vsec(struct cxl *adapter, struct pci_dev *dev)
@@ -1131,8 +1145,6 @@ static void cxl_remove_adapter(struct cxl *adapter)
 
device_unregister(adapter-dev);
 
-   pci_release_region(pdev, 0);
-   pci_release_region(pdev, 2);
pci_disable_device(pdev);
 }
 
-- 
2.1.4


[PATCH v4 11/11] cxl: Add CONFIG_CXL_EEH symbol

2015-08-12 Thread Daniel Axtens
CONFIG_CXL_EEH is for CXL's EEH-related code.

Other drivers can depend on or #ifdef on this symbol to configure
PERST behaviour, allowing CXL to participate in the EEH process.

Reviewed-by: Cyril Bur cyril...@gmail.com
Signed-off-by: Daniel Axtens d...@axtens.net
---
 drivers/misc/cxl/Kconfig | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/misc/cxl/Kconfig b/drivers/misc/cxl/Kconfig
index b6db9ebd52c2..c151fc1fe14c 100644
--- a/drivers/misc/cxl/Kconfig
+++ b/drivers/misc/cxl/Kconfig
@@ -11,11 +11,17 @@ config CXL_KERNEL_API
bool
default n
 
+config CXL_EEH
+   bool
+   default n
+   select EEH
+
 config CXL
tristate Support for IBM Coherent Accelerators (CXL)
depends on PPC_POWERNV  PCI_MSI
select CXL_BASE
select CXL_KERNEL_API
+   select CXL_EEH
default m
help
  Select this option to enable driver support for IBM Coherent
-- 
2.1.4


Re: [PATCH v6 23/42] powerpc/powernv: Release PEs dynamically

2015-08-12 Thread Gavin Shan
On Tue, Aug 11, 2015 at 11:03:40PM +1000, Alexey Kardashevskiy wrote:
On 08/06/2015 02:11 PM, Gavin Shan wrote:
This adds a refcount to the PE, representing the number of PCI
devices contained in the PE. When the last device leaves the
PE, the PE together with its consumed resources (IO, DMA, PELTM,
PELTV) is released, to support PCI hotplug.
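
A minimal sketch of the get/put discipline being described; the field
and release hook are hypothetical names for illustration, not the
patch's actual identifiers:

static void pnv_ioda_pe_get(struct pnv_ioda_pe *pe)
{
	pe->device_count++;		/* one reference per PCI device */
}

static void pnv_ioda_pe_put(struct pnv_ioda_pe *pe)
{
	/* Last device gone: release IO, DMA, PELTM and PELTV resources */
	if (--pe->device_count == 0)
		pnv_ioda_release_pe(pe);
}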

Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com
---
  arch/powerpc/platforms/powernv/pci-ioda.c | 233 
 +++---
  arch/powerpc/platforms/powernv/pci.h  |   3 +
  2 files changed, 217 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index d2697a3..13d8a5b 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -132,6 +132,53 @@ static inline bool pnv_pci_is_mem_pref_64(unsigned long 
flags)
  (IORESOURCE_MEM_64 | IORESOURCE_PREFETCH));
  }

+static void pnv_pci_ioda_release_pe_dma(struct pnv_ioda_pe *pe)

Is this ioda1 helper or common helper for both ioda1 and ioda2?


It's for IODA1 only.

+{
+ struct pnv_phb *phb = pe-phb;
+ struct iommu_table *tbl;
+ int seg;
+ int64_t rc;
+
+	/* No DMA32 segments allocated */
+	if (pe->dma32_seg == PNV_INVALID_SEGMENT ||
+	    pe->dma32_segcount <= 0) {


dma32_segcount is unsigned long, cannot be less than 0.


It's int dma32_segcount in pci.h:

+ pe-dma32_seg = PNV_INVALID_SEGMENT;
+ pe-dma32_segcount = 0;
+ return;
+ }
+
+ /* Unlink IOMMU table from group */
+ tbl = pe-table_group.tables[0];
+ pnv_pci_unlink_table_and_group(tbl, pe-table_group);
+ if (pe-table_group.group) {
+ iommu_group_put(pe-table_group.group);
+ BUG_ON(pe-table_group.group);
+ }
+
+ /* Release IOMMU table */
+ free_pages(tbl-it_base,
+ get_order(TCE32_TABLE_SIZE * pe-dma32_segcount));
+ iommu_free_table(tbl,
+ of_node_full_name(pci_bus_to_OF_node(pe-pbus)));

There is pnv_pci_ioda2_table_free_pages(), use it.


The function (pnv_pci_ioda_release_pe_dma()) is for IODA1 only.

+
+ /* Disable TVE */
+ for (seg = pe-dma32_seg;
+  seg  pe-dma32_seg + pe-dma32_segcount;
+  seg++) {
+ rc = opal_pci_map_pe_dma_window(phb-opal_id,
+ pe-pe_number, seg, 0, 0ul, 0ul, 0ul);
+ if (rc)
+ pe_warn(pe, Error %ld unmapping DMA32 seg#%d\n,
+ rc, seg);
+ }

May be implement iommu_table_group_ops::unset_window for IODA1 too?


Good point, but it's out of scope here. I'm putting it on my TODO
list and will cook up the patch when I have a chance.

+
+ /* Free the DMA32 segments */
+ bitmap_clear(phb-ioda.dma32_segmap,
+ pe-dma32_seg, pe-dma32_segcount);
+ pe-dma32_seg = PNV_INVALID_SEGMENT;
+ pe-dma32_segcount = 0;
+}
+
  static inline void pnv_pci_ioda2_tce_invalidate_entire(struct pnv_ioda_pe 
 *pe)
  {
  /* 01xb - invalidate TCEs that match the specified PE# */
@@ -199,13 +246,15 @@ static void pnv_pci_ioda2_set_bypass(struct pnv_ioda_pe 
*pe, bool enable)
  pe-tce_bypass_enabled = enable;
  }

-#ifdef CONFIG_PCI_IOV
-static void pnv_pci_ioda2_release_dma_pe(struct pci_dev *dev,
-  struct pnv_ioda_pe *pe)
+static void pnv_pci_ioda2_release_pe_dma(struct pnv_ioda_pe *pe)
  {
  struct iommu_table*tbl;
+ struct device_node*dn;
  int64_t   rc;

+ if (pe-dma32_seg == PNV_INVALID_SEGMENT)
+ return;
+
  tbl = pe-table_group.tables[0];
  rc = pnv_pci_ioda2_unset_window(pe-table_group, 0);
  if (rc)
@@ -216,10 +265,91 @@ static void pnv_pci_ioda2_release_dma_pe(struct pci_dev 
*dev,
  iommu_group_put(pe-table_group.group);
  BUG_ON(pe-table_group.group);
  }
+
+ if (pe-flags  (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL))
+ dn = pci_bus_to_OF_node(pe-pbus);
+ else if (pe-flags  PNV_IODA_PE_DEV)
+ dn = pci_device_to_OF_node(pe-pdev);
+#ifdef CONFIG_PCI_IOV
+ else if (pe-flags  PNV_IODA_PE_VF)
+ dn = pci_device_to_OF_node(pe-parent_dev);
+#endif
+ else
+ dn = NULL;
+
  pnv_pci_ioda2_table_free_pages(tbl);
- iommu_free_table(tbl, of_node_full_name(dev-dev.of_node));
+ iommu_free_table(tbl, of_node_full_name(dn));
+ pe-dma32_seg = PNV_INVALID_SEGMENT;
+}



I'd drop the chunk about calculating @dn above, nobody really cares what
iommu_free_table() prints. If you really need to print something, print PE#.


It makes sense. I'll drop the chunk of garbage and replace it with the
PE number.

+
+static void pnv_ioda_release_pe_dma(struct pnv_ioda_pe *pe)
+{
+ struct pnv_phb *phb = pe-phb;
+
+ switch (phb-type) {
+ case PNV_PHB_IODA1:
+ pnv_pci_ioda_release_pe_dma(pe);
+ break;
+  

Re: [2/4] cxl: sparse: Make declarations static

2015-08-12 Thread Michael Ellerman
On Fri, 2015-07-08 at 03:18:18 UTC, Daniel Axtens wrote:
 A few declarations were identified by sparse as needing to be static:
 
 /scratch/dja/linux-capi/drivers/misc/cxl/irq.c:408:6: warning: symbol 
 'afu_irq_name_free' was not declared. Should it be static?
 /scratch/dja/linux-capi/drivers/misc/cxl/irq.c:467:6: warning: symbol 
 'afu_register_hwirqs' was not declared. Should it be static?
 /scratch/dja/linux-capi/drivers/misc/cxl/file.c:254:6: warning: symbol 
 'afu_compat_ioctl' was not declared. Should it be static?
 /scratch/dja/linux-capi/drivers/misc/cxl/file.c:399:30: warning: symbol 
 'afu_master_fops' was not declared. Should it be static?
 
 Make them static.
 
 Signed-off-by: Daniel Axtens d...@axtens.net
 Acked-by: Michael Neuling mi...@neuling.org

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/3d6b040e73387570521d

cheers

Re: [5/8] powerpc/slb: Add documentation to runtime patching of SLB encoding

2015-08-12 Thread Michael Ellerman
On Wed, 2015-29-07 at 07:10:02 UTC, Anshuman Khandual wrote:
 This patch adds some documentation to the 'patch_slb_encoding'
 function explaining how it clears the existing immediate value in
 the given instruction and inserts a new one there.
 
 Signed-off-by: Anshuman Khandual khand...@linux.vnet.ibm.com

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/79d0be7407955a268bce

cheers

[PATCH v4 10/11] cxl: EEH support

2015-08-12 Thread Daniel Axtens
EEH (Enhanced Error Handling) allows a driver to recover from the
temporary failure of an attached PCI card. Enable basic CXL support
for EEH.

Signed-off-by: Daniel Axtens d...@axtens.net
---
 drivers/misc/cxl/cxl.h  |   1 +
 drivers/misc/cxl/pci.c  | 253 
 drivers/misc/cxl/vphb.c |   8 ++
 3 files changed, 262 insertions(+)

diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index cda02412b01e..6f5386653dae 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -726,6 +726,7 @@ int cxl_psl_purge(struct cxl_afu *afu);
 
 void cxl_stop_trace(struct cxl *cxl);
 int cxl_pci_vphb_add(struct cxl_afu *afu);
+void cxl_pci_vphb_reconfigure(struct cxl_afu *afu);
 void cxl_pci_vphb_remove(struct cxl_afu *afu);
 
 extern struct pci_driver cxl_pci_driver;
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 965524a6ae7c..5c2dc82da92f 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -24,6 +24,7 @@
 #include <asm/io.h>
 
 #include "cxl.h"
+#include <misc/cxl.h>
 
 
 #define CXL_PCI_VSEC_ID	0x1280
@@ -1275,10 +1276,262 @@ static void cxl_remove(struct pci_dev *dev)
cxl_remove_adapter(adapter);
 }
 
+static pci_ers_result_t cxl_vphb_error_detected(struct cxl_afu *afu,
+   pci_channel_state_t state)
+{
+   struct pci_dev *afu_dev;
+   pci_ers_result_t result = PCI_ERS_RESULT_NEED_RESET;
+   pci_ers_result_t afu_result = PCI_ERS_RESULT_NEED_RESET;
+
+   /* There should only be one entry, but go through the list
+* anyway
+*/
+   list_for_each_entry(afu_dev, afu-phb-bus-devices, bus_list) {
+   if (!afu_dev-driver)
+   continue;
+
+   afu_dev-error_state = state;
+
+   if (afu_dev-driver-err_handler)
+   afu_result = 
afu_dev-driver-err_handler-error_detected(afu_dev,
+   
  state);
+   /* Disconnect trumps all, NONE trumps NEED_RESET */
+   if (afu_result == PCI_ERS_RESULT_DISCONNECT)
+   result = PCI_ERS_RESULT_DISCONNECT;
+   else if ((afu_result == PCI_ERS_RESULT_NONE) 
+(result == PCI_ERS_RESULT_NEED_RESET))
+   result = PCI_ERS_RESULT_NONE;
+   }
+   return result;
+}
+
+static pci_ers_result_t cxl_pci_error_detected(struct pci_dev *pdev,
+  pci_channel_state_t state)
+{
+   struct cxl *adapter = pci_get_drvdata(pdev);
+   struct cxl_afu *afu;
+   pci_ers_result_t result = PCI_ERS_RESULT_NEED_RESET;
+   int i;
+
+   /* At this point, we could still have an interrupt pending.
+* Let's try to get them out of the way before they do
+* anything we don't like.
+*/
+   schedule();
+
+   /* If we're permanently dead, give up. */
+   if (state == pci_channel_io_perm_failure) {
+   /* Tell the AFU drivers; but we don't care what they
+* say, we're going away.
+*/
+   for (i = 0; i  adapter-slices; i++) {
+   afu = adapter-afu[i];
+   cxl_vphb_error_detected(afu, state);
+   }
+   return PCI_ERS_RESULT_DISCONNECT;
+   }
+
+   /* Are we reflashing?
+*
+* If we reflash, we could come back as something entirely
+* different, including a non-CAPI card. As such, by default
+* we don't participate in the process. We'll be unbound and
+* the slot re-probed. (TODO: check EEH doesn't blindly rebind
+* us!)
+*
+* However, this isn't the entire story: for reliability
+* reasons, we usually want to reflash the FPGA on PERST in
+* order to get back to a more reliable known-good state.
+*
+* This causes us a bit of a problem: if we reflash we can't
+* trust that we'll come back the same - we could have a new
+* image and been PERSTed in order to load that
+* image. However, most of the time we actually *will* come
+* back the same - for example a regular EEH event.
+*
+* Therefore, we allow the user to assert that the image is
+* indeed the same and that we should continue on into EEH
+* anyway.
+*/
+   if (adapter-perst_loads_image  !adapter-perst_same_image) {
+   /* TODO take the PHB out of CXL mode */
+   dev_info(pdev-dev, reflashing, so opting out of EEH!\n);
+   return PCI_ERS_RESULT_NONE;
+   }
+
+   /*
+* At this point, we want to try to recover.  We'll always
+* need a complete slot reset: we don't trust any other reset.
+*
+* Now, we go through each AFU:
+*  - We send the driver, if bound, an error_detected 

[PATCH v4 06/11] cxl: Refactor adaptor init/teardown

2015-08-12 Thread Daniel Axtens
Some aspects of initialisation are done only once in the lifetime of
an adapter: for example, allocating memory for the adapter,
allocating the adapter number, or setting up sysfs/debugfs files.

However, we may want to be able to do some parts of the
initialisation multiple times: for example, in error recovery we
want to be able to tear down and then re-map IO memory and IRQs.

Therefore, refactor CXL init/teardown as follows.

 - Keep the overarching functions 'cxl_init_adapter' and its pair,
   'cxl_remove_adapter'.

 - Move all 'once only' allocation/freeing steps to the existing
   'cxl_alloc_adapter' function, and its pair 'cxl_release_adapter'
   (This involves moving allocation of the adapter number out of
   cxl_init_adapter.)

 - Create two new functions: 'cxl_configure_adapter', and its pair
   'cxl_deconfigure_adapter'. These two functions 'wire up' the
   hardware --- they (de)configure resources that do not need to
   last the entire lifetime of the adapter

Signed-off-by: Daniel Axtens d...@axtens.net
---
 drivers/misc/cxl/pci.c | 176 +++--
 1 file changed, 111 insertions(+), 65 deletions(-)

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 484d35a5aead..f3c5998f2f37 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -965,7 +965,6 @@ static int cxl_read_vsec(struct cxl *adapter, struct 
pci_dev *dev)
CXL_READ_VSEC_BASE_IMAGE(dev, vsec, adapter-base_image);
CXL_READ_VSEC_IMAGE_STATE(dev, vsec, image_state);
adapter-user_image_loaded = !!(image_state  
CXL_VSEC_USER_IMAGE_LOADED);
-   adapter-perst_loads_image = true;
adapter-perst_select_user = !!(image_state  
CXL_VSEC_USER_IMAGE_LOADED);
 
CXL_READ_VSEC_NAFUS(dev, vsec, adapter-slices);
@@ -1025,22 +1024,33 @@ static void cxl_release_adapter(struct device *dev)
 
pr_devel(cxl_release_adapter\n);
 
+   cxl_remove_adapter_nr(adapter);
+
kfree(adapter);
 }
 
-static struct cxl *cxl_alloc_adapter(struct pci_dev *dev)
+static struct cxl *cxl_alloc_adapter(void)
 {
struct cxl *adapter;
 
if (!(adapter = kzalloc(sizeof(struct cxl), GFP_KERNEL)))
return NULL;
 
-   adapter-dev.parent = dev-dev;
-   adapter-dev.release = cxl_release_adapter;
-   pci_set_drvdata(dev, adapter);
spin_lock_init(adapter-afu_list_lock);
 
+   if (cxl_alloc_adapter_nr(adapter))
+   goto err1;
+
+   if (dev_set_name(adapter-dev, card%i, adapter-adapter_num))
+   goto err2;
+
return adapter;
+
+err2:
+   cxl_remove_adapter_nr(adapter);
+err1:
+   kfree(adapter);
+   return NULL;
 }
 
 static int sanitise_adapter_regs(struct cxl *adapter)
@@ -1049,57 +1059,107 @@ static int sanitise_adapter_regs(struct cxl *adapter)
return cxl_tlb_slb_invalidate(adapter);
 }
 
-static struct cxl *cxl_init_adapter(struct pci_dev *dev)
+/* This should contain *only* operations that can safely be done in
+ * both creation and recovery.
+ */
+static int cxl_configure_adapter(struct cxl *adapter, struct pci_dev *dev)
 {
-   struct cxl *adapter;
-   bool free = true;
int rc;
 
+   adapter-dev.parent = dev-dev;
+   adapter-dev.release = cxl_release_adapter;
+   pci_set_drvdata(dev, adapter);
 
-   if (!(adapter = cxl_alloc_adapter(dev)))
-   return ERR_PTR(-ENOMEM);
-
-   if ((rc = cxl_read_vsec(adapter, dev)))
-   goto err1;
-
-   if ((rc = cxl_vsec_looks_ok(adapter, dev)))
-   goto err1;
+   rc = pci_enable_device(dev);
+   if (rc) {
+   dev_err(dev-dev, pci_enable_device failed: %i\n, rc);
+   return rc;
+   }
 
-   if ((rc = setup_cxl_bars(dev)))
-   goto err1;
+   rc = cxl_read_vsec(adapter, dev);
+   if (rc)
+   return rc;
 
-   if ((rc = switch_card_to_cxl(dev)))
-   goto err1;
+   rc = cxl_vsec_looks_ok(adapter, dev);
+   if (rc)
+   return rc;
 
-   if ((rc = cxl_alloc_adapter_nr(adapter)))
-   goto err1;
+   rc = setup_cxl_bars(dev);
+   if (rc)
+   return rc;
 
-   if ((rc = dev_set_name(adapter-dev, card%i, adapter-adapter_num)))
-   goto err2;
+   rc = switch_card_to_cxl(dev);
+   if (rc)
+   return rc;
 
-   if ((rc = cxl_update_image_control(adapter)))
-   goto err2;
+   rc = cxl_update_image_control(adapter);
+   if (rc)
+   return rc;
 
-   if ((rc = cxl_map_adapter_regs(adapter, dev)))
-   goto err2;
+   rc = cxl_map_adapter_regs(adapter, dev);
+   if (rc)
+   return rc;
 
-   if ((rc = sanitise_adapter_regs(adapter)))
-   goto err2;
+   rc = sanitise_adapter_regs(adapter);
+   if (rc)
+   goto err;
 
-   if ((rc = init_implementation_adapter_regs(adapter, dev)))
- 

Re: [7/8] powerpc/xmon: Drop the 'valid' variable completely in 'dump_segments'

2015-08-12 Thread Michael Ellerman
On Wed, 2015-29-07 at 07:10:04 UTC, Anshuman Khandual wrote:
 Value of the 'valid' variable is zero when 'esid' is zero and it does
 not matter when 'esid' is non-zero. The variable 'valid' can be dropped
 from the function 'dump_segments' by checking for validity of 'esid'
 inside the nested code block. This patch does that change.
 
 Signed-off-by: Anshuman Khandual khand...@linux.vnet.ibm.com

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/8218a3031c204b20582b

cheers

[PATCH v4 02/11] cxl: Drop commands if the PCI channel is not in normal state

2015-08-12 Thread Daniel Axtens
If the PCI channel has gone down, don't attempt to poke the hardware.

We need to guard every time cxl_whatever_(read|write) is called. This
is because a call to those functions will dereference an offset into an
mmio register, and the mmio mappings get invalidated in the EEH
teardown.

Check in the read/write functions in the header.
We give them the same semantics as usual PCI operations:
 - a write to a channel that is down is ignored.
 - a read from a channel that is down returns all fs.

Also, we try to access the MMIO space of a vPHB device as part of the
PCI disable path. Because that's a read that bypasses most of our usual
checks, we handle it explicitly.

As far as user visible warnings go:
 - Check link state in file ops, return -EIO if down.
 - Be reasonably quiet if there's an error in a teardown path,
   or when we already know the hardware is going down.
 - Throw a big WARN if someone tries to start a CXL operation
   while the card is down. This gives a useful stacktrace for
   debugging whatever is doing that.
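
Concretely, the file-ops guard described above takes roughly this shape
(a sketch only; the real hunk lands in file.c, which the diffstat lists
but is truncated here):

	/* Refuse to start new work against a downed link */
	if (!cxl_adapter_link_ok(ctx->afu->adapter))
		return -EIO;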

Signed-off-by: Daniel Axtens d...@axtens.net
---
 drivers/misc/cxl/context.c |  6 +++-
 drivers/misc/cxl/cxl.h | 44 ++--
 drivers/misc/cxl/file.c| 19 +
 drivers/misc/cxl/native.c  | 71 --
 drivers/misc/cxl/vphb.c| 26 +
 5 files changed, 154 insertions(+), 12 deletions(-)

diff --git a/drivers/misc/cxl/context.c b/drivers/misc/cxl/context.c
index 1287148629c0..615842115848 100644
--- a/drivers/misc/cxl/context.c
+++ b/drivers/misc/cxl/context.c
@@ -193,7 +193,11 @@ int __detach_context(struct cxl_context *ctx)
if (status != STARTED)
return -EBUSY;
 
-   WARN_ON(cxl_detach_process(ctx));
+   /* Only warn if we detached while the link was OK.
+* If detach fails when hw is down, we don't care.
+*/
+   WARN_ON(cxl_detach_process(ctx) 
+   cxl_adapter_link_ok(ctx-afu-adapter));
flush_work(ctx-fault_work); /* Only needed for dedicated process */
put_pid(ctx-pid);
cxl_ctx_put();
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index 6a93bfbcd826..9b9e89fd02cc 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -531,6 +531,14 @@ struct cxl_process_element {
__be32 software_state;
 } __packed;
 
+static inline bool cxl_adapter_link_ok(struct cxl *cxl)
+{
+   struct pci_dev *pdev;
+
+   pdev = to_pci_dev(cxl-dev.parent);
+   return !pci_channel_offline(pdev);
+}
+
 static inline void __iomem *_cxl_p1_addr(struct cxl *cxl, cxl_p1_reg_t reg)
 {
WARN_ON(!cpu_has_feature(CPU_FTR_HVMODE));
@@ -539,12 +547,16 @@ static inline void __iomem *_cxl_p1_addr(struct cxl *cxl, 
cxl_p1_reg_t reg)
 
 static inline void cxl_p1_write(struct cxl *cxl, cxl_p1_reg_t reg, u64 val)
 {
-   out_be64(_cxl_p1_addr(cxl, reg), val);
+   if (likely(cxl_adapter_link_ok(cxl)))
+   out_be64(_cxl_p1_addr(cxl, reg), val);
 }
 
 static inline u64 cxl_p1_read(struct cxl *cxl, cxl_p1_reg_t reg)
 {
-   return in_be64(_cxl_p1_addr(cxl, reg));
+   if (likely(cxl_adapter_link_ok(cxl)))
+   return in_be64(_cxl_p1_addr(cxl, reg));
+   else
+   return ~0ULL;
 }
 
 static inline void __iomem *_cxl_p1n_addr(struct cxl_afu *afu, cxl_p1n_reg_t 
reg)
@@ -555,12 +567,16 @@ static inline void __iomem *_cxl_p1n_addr(struct cxl_afu 
*afu, cxl_p1n_reg_t reg
 
 static inline void cxl_p1n_write(struct cxl_afu *afu, cxl_p1n_reg_t reg, u64 
val)
 {
-   out_be64(_cxl_p1n_addr(afu, reg), val);
+   if (likely(cxl_adapter_link_ok(afu-adapter)))
+   out_be64(_cxl_p1n_addr(afu, reg), val);
 }
 
 static inline u64 cxl_p1n_read(struct cxl_afu *afu, cxl_p1n_reg_t reg)
 {
-   return in_be64(_cxl_p1n_addr(afu, reg));
+   if (likely(cxl_adapter_link_ok(afu-adapter)))
+   return in_be64(_cxl_p1n_addr(afu, reg));
+   else
+   return ~0ULL;
 }
 
 static inline void __iomem *_cxl_p2n_addr(struct cxl_afu *afu, cxl_p2n_reg_t 
reg)
@@ -570,22 +586,34 @@ static inline void __iomem *_cxl_p2n_addr(struct cxl_afu 
*afu, cxl_p2n_reg_t reg
 
 static inline void cxl_p2n_write(struct cxl_afu *afu, cxl_p2n_reg_t reg, u64 
val)
 {
-   out_be64(_cxl_p2n_addr(afu, reg), val);
+   if (likely(cxl_adapter_link_ok(afu-adapter)))
+   out_be64(_cxl_p2n_addr(afu, reg), val);
 }
 
 static inline u64 cxl_p2n_read(struct cxl_afu *afu, cxl_p2n_reg_t reg)
 {
-   return in_be64(_cxl_p2n_addr(afu, reg));
+   if (likely(cxl_adapter_link_ok(afu-adapter)))
+   return in_be64(_cxl_p2n_addr(afu, reg));
+   else
+   return ~0ULL;
 }
 
 static inline u64 cxl_afu_cr_read64(struct cxl_afu *afu, int cr, u64 off)
 {
-   return in_le64((afu)-afu_desc_mmio + (afu)-crs_offset + ((cr) * 
(afu)-crs_len) + (off));
+   if (likely(cxl_adapter_link_ok(afu-adapter)))
+   return 

[PATCH v4 03/11] cxl: Allocate and release the SPA with the AFU

2015-08-12 Thread Daniel Axtens
Previously the SPA was allocated and freed upon entering and leaving
AFU-directed mode. This causes some issues for error recovery - contexts
hold a pointer inside the SPA, and they may persist after the AFU has
been detached.

We would ideally like to allocate the SPA when the AFU is allocated, and
release it when the AFU is released. However, we don't know how big the
SPA needs to be until we read the AFU descriptor.

Therefore, restructure the code:

 - Allocate the SPA only once, on the first attach.

 - Release the SPA only when the entire AFU is being released (not
   detached). Guard the release with a NULL check, so we don't free
   if it was never allocated (e.g. dedicated mode)

Acked-by: Cyril Bur cyril...@gmail.com
Signed-off-by: Daniel Axtens d...@axtens.net
---
 drivers/misc/cxl/cxl.h|  3 +++
 drivers/misc/cxl/native.c | 33 ++---
 drivers/misc/cxl/pci.c|  2 ++
 3 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index 9b9e89fd02cc..d540542f9931 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -632,6 +632,9 @@ void unregister_cxl_calls(struct cxl_calls *calls);
 int cxl_alloc_adapter_nr(struct cxl *adapter);
 void cxl_remove_adapter_nr(struct cxl *adapter);
 
+int cxl_alloc_spa(struct cxl_afu *afu);
+void cxl_release_spa(struct cxl_afu *afu);
+
 int cxl_file_init(void);
 void cxl_file_exit(void);
 int cxl_register_adapter(struct cxl *adapter);
diff --git a/drivers/misc/cxl/native.c b/drivers/misc/cxl/native.c
index cd1dda5fcd3a..0af3a0d1c697 100644
--- a/drivers/misc/cxl/native.c
+++ b/drivers/misc/cxl/native.c
@@ -182,10 +182,8 @@ static int spa_max_procs(int spa_size)
return ((spa_size / 8) - 96) / 17;
 }
 
-static int alloc_spa(struct cxl_afu *afu)
+int cxl_alloc_spa(struct cxl_afu *afu)
 {
-   u64 spap;
-
/* Work out how many pages to allocate */
afu-spa_order = 0;
do {
@@ -204,6 +202,13 @@ static int alloc_spa(struct cxl_afu *afu)
pr_devel(spa pages: %i afu-spa_max_procs: %i   afu-num_procs: %i\n,
 1afu-spa_order, afu-spa_max_procs, afu-num_procs);
 
+   return 0;
+}
+
+static void attach_spa(struct cxl_afu *afu)
+{
+   u64 spap;
+
afu-sw_command_status = (__be64 *)((char *)afu-spa +
((afu-spa_max_procs + 3) * 128));
 
@@ -212,14 +217,19 @@ static int alloc_spa(struct cxl_afu *afu)
spap |= CXL_PSL_SPAP_V;
pr_devel(cxl: SPA allocated at 0x%p. Max processes: %i, 
sw_command_status: 0x%p CXL_PSL_SPAP_An=0x%016llx\n, afu-spa, 
afu-spa_max_procs, afu-sw_command_status, spap);
cxl_p1n_write(afu, CXL_PSL_SPAP_An, spap);
-
-   return 0;
 }
 
-static void release_spa(struct cxl_afu *afu)
+static inline void detach_spa(struct cxl_afu *afu)
 {
cxl_p1n_write(afu, CXL_PSL_SPAP_An, 0);
-   free_pages((unsigned long) afu-spa, afu-spa_order);
+}
+
+void cxl_release_spa(struct cxl_afu *afu)
+{
+   if (afu-spa) {
+   free_pages((unsigned long) afu-spa, afu-spa_order);
+   afu-spa = NULL;
+   }
 }
 
 int cxl_tlb_slb_invalidate(struct cxl *adapter)
@@ -446,8 +456,11 @@ static int activate_afu_directed(struct cxl_afu *afu)
 
dev_info(afu-dev, Activating AFU directed mode\n);
 
-   if (alloc_spa(afu))
-   return -ENOMEM;
+   if (afu-spa == NULL) {
+   if (cxl_alloc_spa(afu))
+   return -ENOMEM;
+   }
+   attach_spa(afu);
 
cxl_p1n_write(afu, CXL_PSL_SCNTL_An, CXL_PSL_SCNTL_An_PM_AFU);
cxl_p1n_write(afu, CXL_PSL_AMOR_An, 0xULL);
@@ -558,8 +571,6 @@ static int deactivate_afu_directed(struct cxl_afu *afu)
cxl_afu_disable(afu);
cxl_psl_purge(afu);
 
-   release_spa(afu);
-
return 0;
 }
 
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 32ad09705949..62a762d94de3 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -551,6 +551,8 @@ static void cxl_release_afu(struct device *dev)
 
pr_devel(cxl_release_afu\n);
 
+   cxl_release_spa(afu);
+
kfree(afu);
 }
 
-- 
2.1.4


Re: [2/8] powerpc/slb: Rename all the 'slot' occurrences to 'entry'

2015-08-12 Thread Michael Ellerman
On Wed, 2015-29-07 at 07:09:59 UTC, Anshuman Khandual wrote:
 These are essentially individual SLB slots with entries that we are
 dealing with in these functions. Using both the 'entry' and 'slot'
 synonyms gets really confusing sometimes. This patch makes it
 uniform across the file by replacing all those 'slot's with 'entry's.
 
 Signed-off-by: Anshuman Khandual khand...@linux.vnet.ibm.com

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/2be682af48e8236558da

cheers

Re: [4/4] cxl: sparse: Silence iomem warning in debugfs file creation

2015-08-12 Thread Michael Ellerman
On Fri, 2015-07-08 at 03:18:20 UTC, Daniel Axtens wrote:
 An IO address, tagged with __iomem, is passed to debugfs_create_file
 as private data. This requires that it be cast to void *. The cast
 creates a sparse warning:
 /scratch/dja/linux-capi/drivers/misc/cxl/debugfs.c:51:57: warning: cast 
 removes address space of expression
 
 The address space marker is added back in the file operations
 (fops_io_u64).
 
 Silence the warning with __force.
 
 Signed-off-by: Daniel Axtens d...@axtens.net
 Acked-by: Michael Neuling mi...@neuling.org

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/83c3fee7e78f5a937b73

cheers

Re: powerpc/prom: Use DRCONF flags while processing detected LMBs

2015-08-12 Thread Michael Ellerman
On Thu, 2015-06-08 at 13:05:07 UTC, Anshuman Khandual wrote:
 This patch just replaces hard-coded values with existing
 DRCONF flags while processing detected LMBs from the device
 tree. This does not change any functionality.
 
 Signed-off-by: Anshuman Khandual khand...@linux.vnet.ibm.com

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/9afac933433ca71e0f78

cheers

[PATCH v4 01/11] cxl: Convert MMIO read/write macros to inline functions

2015-08-12 Thread Daniel Axtens
We're about to make these more complex, so make them functions
first.
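
For background, a sketch of why the conversion helps (illustrative names,
not from the patch): a macro expands textually and cannot check its
arguments, while a static inline is type-checked and gives the body room
to grow -- the EEH patches later in this series add link-state checks
inside exactly these accessors.

/* Macro form: any 'cxl' and 'reg' are accepted at the call site. */
#define example_p1_read(cxl, reg) \
	in_be64(_cxl_p1_addr(cxl, reg))

/* Inline form: arguments are checked against the prototype, and the
 * body can later gain error handling without touching callers. */
static inline u64 example_p1_read_fn(struct cxl *cxl, cxl_p1_reg_t reg)
{
	return in_be64(_cxl_p1_addr(cxl, reg));
}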

Signed-off-by: Daniel Axtens d...@axtens.net
---
 drivers/misc/cxl/cxl.h | 51 ++
 1 file changed, 35 insertions(+), 16 deletions(-)

diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index 4fd66cabde1e..6a93bfbcd826 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -537,10 +537,15 @@ static inline void __iomem *_cxl_p1_addr(struct cxl *cxl, 
cxl_p1_reg_t reg)
	return cxl->p1_mmio + cxl_reg_off(reg);
 }
 
-#define cxl_p1_write(cxl, reg, val) \
-   out_be64(_cxl_p1_addr(cxl, reg), val)
-#define cxl_p1_read(cxl, reg) \
-   in_be64(_cxl_p1_addr(cxl, reg))
+static inline void cxl_p1_write(struct cxl *cxl, cxl_p1_reg_t reg, u64 val)
+{
+   out_be64(_cxl_p1_addr(cxl, reg), val);
+}
+
+static inline u64 cxl_p1_read(struct cxl *cxl, cxl_p1_reg_t reg)
+{
+   return in_be64(_cxl_p1_addr(cxl, reg));
+}
 
 static inline void __iomem *_cxl_p1n_addr(struct cxl_afu *afu, cxl_p1n_reg_t 
reg)
 {
@@ -548,26 +553,40 @@ static inline void __iomem *_cxl_p1n_addr(struct cxl_afu 
*afu, cxl_p1n_reg_t reg
	return afu->p1n_mmio + cxl_reg_off(reg);
 }
 
-#define cxl_p1n_write(afu, reg, val) \
-   out_be64(_cxl_p1n_addr(afu, reg), val)
-#define cxl_p1n_read(afu, reg) \
-   in_be64(_cxl_p1n_addr(afu, reg))
+static inline void cxl_p1n_write(struct cxl_afu *afu, cxl_p1n_reg_t reg, u64 
val)
+{
+   out_be64(_cxl_p1n_addr(afu, reg), val);
+}
+
+static inline u64 cxl_p1n_read(struct cxl_afu *afu, cxl_p1n_reg_t reg)
+{
+   return in_be64(_cxl_p1n_addr(afu, reg));
+}
 
 static inline void __iomem *_cxl_p2n_addr(struct cxl_afu *afu, cxl_p2n_reg_t 
reg)
 {
	return afu->p2n_mmio + cxl_reg_off(reg);
 }
 
-#define cxl_p2n_write(afu, reg, val) \
-   out_be64(_cxl_p2n_addr(afu, reg), val)
-#define cxl_p2n_read(afu, reg) \
-   in_be64(_cxl_p2n_addr(afu, reg))
+static inline void cxl_p2n_write(struct cxl_afu *afu, cxl_p2n_reg_t reg, u64 
val)
+{
+   out_be64(_cxl_p2n_addr(afu, reg), val);
+}
 
+static inline u64 cxl_p2n_read(struct cxl_afu *afu, cxl_p2n_reg_t reg)
+{
+   return in_be64(_cxl_p2n_addr(afu, reg));
+}
 
-#define cxl_afu_cr_read64(afu, cr, off) \
-	in_le64((afu)->afu_desc_mmio + (afu)->crs_offset + ((cr) * (afu)->crs_len) + (off))
-#define cxl_afu_cr_read32(afu, cr, off) \
-	in_le32((afu)->afu_desc_mmio + (afu)->crs_offset + ((cr) * (afu)->crs_len) + (off))
+static inline u64 cxl_afu_cr_read64(struct cxl_afu *afu, int cr, u64 off)
+{
+	return in_le64((afu)->afu_desc_mmio + (afu)->crs_offset + ((cr) * (afu)->crs_len) + (off));
+}
+
+static inline u32 cxl_afu_cr_read32(struct cxl_afu *afu, int cr, u64 off)
+{
+	return in_le32((afu)->afu_desc_mmio + (afu)->crs_offset + ((cr) * (afu)->crs_len) + (off));
+}
 u16 cxl_afu_cr_read16(struct cxl_afu *afu, int cr, u64 off);
 u8 cxl_afu_cr_read8(struct cxl_afu *afu, int cr, u64 off);
 
-- 
2.1.4


[PATCH v4 09/11] cxl: Allow the kernel to trust that an image won't change on PERST.

2015-08-12 Thread Daniel Axtens
Provide a kernel API and a sysfs entry which allow a user to specify
that when a card is PERSTed, its image will stay the same, allowing
it to participate in EEH.

cxl_reset is used to reflash the card. In that case, we cannot safely
assert that the image will not change. Therefore, disallow cxl_reset
if the flag is set.
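
To illustrate the intended use of the kernel API (a sketch, not part of
this patch: the driver and its probe function are hypothetical, and
cxl_pci_to_afu() is assumed from the existing kernel API in
include/misc/cxl.h):

#include <linux/pci.h>
#include <misc/cxl.h>

static int example_afu_probe(struct pci_dev *dev,
			     const struct pci_device_id *id)
{
	struct cxl_afu *afu = cxl_pci_to_afu(dev);

	if (!afu)
		return -ENODEV;

	/* This card's image is persistent across PERST, so tell the
	 * cxl core it is safe to let EEH use PERST for recovery. */
	cxl_perst_reloads_same_image(afu, true);
	return 0;
}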

Signed-off-by: Daniel Axtens d...@axtens.net
---
 Documentation/ABI/testing/sysfs-class-cxl | 10 ++
 drivers/misc/cxl/api.c|  7 +++
 drivers/misc/cxl/cxl.h|  1 +
 drivers/misc/cxl/pci.c|  7 +++
 drivers/misc/cxl/sysfs.c  | 26 ++
 include/misc/cxl.h| 10 ++
 6 files changed, 61 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-class-cxl 
b/Documentation/ABI/testing/sysfs-class-cxl
index acfe9df83139..b07e86d4597f 100644
--- a/Documentation/ABI/testing/sysfs-class-cxl
+++ b/Documentation/ABI/testing/sysfs-class-cxl
@@ -223,3 +223,13 @@ Description:write only
 Writing 1 will issue a PERST to card which may cause the card
 to reload the FPGA depending on load_image_on_perst.
 Users: https://github.com/ibm-capi/libcxl
+
+What:  /sys/class/cxl/card/perst_reloads_same_image
+Date:  July 2015
+Contact:   linuxppc-dev@lists.ozlabs.org
+Description:   read/write
+   Trust that when an image is reloaded via PERST, it will not
+   have changed.
+   0 = don't trust, the image may be different (default)
+   1 = trust that the image will not change.
+Users: https://github.com/ibm-capi/libcxl
diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index 729e0851167d..6a768a9ad22f 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -327,3 +327,10 @@ int cxl_afu_reset(struct cxl_context *ctx)
return cxl_afu_check_and_enable(afu);
 }
 EXPORT_SYMBOL_GPL(cxl_afu_reset);
+
+void cxl_perst_reloads_same_image(struct cxl_afu *afu,
+ bool perst_reloads_same_image)
+{
+	afu->adapter->perst_same_image = perst_reloads_same_image;
+}
+EXPORT_SYMBOL_GPL(cxl_perst_reloads_same_image);
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index d540542f9931..cda02412b01e 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -493,6 +493,7 @@ struct cxl {
bool user_image_loaded;
bool perst_loads_image;
bool perst_select_user;
+   bool perst_same_image;
 };
 
 int cxl_alloc_one_irq(struct cxl *adapter);
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index e7976deed1f8..965524a6ae7c 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -887,6 +887,12 @@ int cxl_reset(struct cxl *adapter)
int i;
u32 val;
 
+	if (adapter->perst_same_image) {
+		dev_warn(&dev->dev,
+			 "cxl: refusing to reset/reflash when perst_reloads_same_image is set.\n");
+		return -EINVAL;
+	}
+
	dev_info(&dev->dev, "CXL reset\n");
 
/* pcie_warm_reset requests a fundamental pci reset which includes a
@@ -1171,6 +1177,7 @@ static struct cxl *cxl_init_adapter(struct pci_dev *dev)
 * configure/reconfigure
 */
	adapter->perst_loads_image = true;
+	adapter->perst_same_image = false;
 
rc = cxl_configure_adapter(adapter, dev);
if (rc) {
diff --git a/drivers/misc/cxl/sysfs.c b/drivers/misc/cxl/sysfs.c
index 31f38bc71a3d..6619cf1f6e1f 100644
--- a/drivers/misc/cxl/sysfs.c
+++ b/drivers/misc/cxl/sysfs.c
@@ -112,12 +112,38 @@ static ssize_t load_image_on_perst_store(struct device 
*device,
return count;
 }
 
+static ssize_t perst_reloads_same_image_show(struct device *device,
+struct device_attribute *attr,
+char *buf)
+{
+   struct cxl *adapter = to_cxl_adapter(device);
+
+	return scnprintf(buf, PAGE_SIZE, "%i\n", adapter->perst_same_image);
+}
+
+static ssize_t perst_reloads_same_image_store(struct device *device,
+struct device_attribute *attr,
+const char *buf, size_t count)
+{
+   struct cxl *adapter = to_cxl_adapter(device);
+   int rc;
+   int val;
+
+	rc = sscanf(buf, "%i", &val);
+   if ((rc != 1) || !(val == 1 || val == 0))
+   return -EINVAL;
+
+	adapter->perst_same_image = (val == 1 ? true : false);
+   return count;
+}
+
 static struct device_attribute adapter_attrs[] = {
__ATTR_RO(caia_version),
__ATTR_RO(psl_revision),
__ATTR_RO(base_image),
__ATTR_RO(image_loaded),
__ATTR_RW(load_image_on_perst),
+   __ATTR_RW(perst_reloads_same_image),
__ATTR(reset, S_IWUSR, NULL, reset_adapter_store),
 };
 
diff --git a/include/misc/cxl.h b/include/misc/cxl.h
index 7a6c1d6cc173..f2ffe5bd720d 100644

[PATCH v4 08/11] cxl: Don't remove AFUs/vPHBs in cxl_reset

2015-08-12 Thread Daniel Axtens
If the driver doesn't participate in EEH, the AFUs will be removed
by cxl_remove, which will be invoked by EEH.

If the driver does participate in EEH, the vPHB needs to stick around
so that it can participate.

In both cases, we shouldn't remove the AFU/vPHB.

Reviewed-by: Cyril Bur cyril...@gmail.com
Signed-off-by: Daniel Axtens d...@axtens.net
---
 drivers/misc/cxl/pci.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 8e7b0f3ad254..e7976deed1f8 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -889,11 +889,6 @@ int cxl_reset(struct cxl *adapter)
 
	dev_info(&dev->dev, "CXL reset\n");
 
-	for (i = 0; i < adapter->slices; i++) {
-		cxl_pci_vphb_remove(adapter->afu[i]);
-		cxl_remove_afu(adapter->afu[i]);
-   }
-
/* pcie_warm_reset requests a fundamental pci reset which includes a
 * PERST assert/deassert.  PERST triggers a loading of the image
 * if user or factory is selected in sysfs */
-- 
2.1.4


Re: [PATCH] powerpc/prom: Use DRCONF flags while processing detected LMBs

2015-08-12 Thread Anshuman Khandual
On 08/11/2015 03:18 AM, Michael Ellerman wrote:
 On Fri, 2015-08-07 at 07:49 +0530, Madhavan Srinivasan wrote:
  
  On Thursday 06 August 2015 06:35 PM, Anshuman Khandual wrote:
   This patch just replaces hard coded values with existing
  
Please drop This patch just and start with Replace hard ...
  
https://www.kernel.org/doc/Documentation/SubmittingPatches 
 Yeah I rewrote it as:
 
 Replace hard coded values with existing DRCONF flags while procesing
 detected LMBs from the device tree. Does not change any functionality.

Thanks Michael.


[PATCH 05/31] x86/pci-calgary: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
For the iommu offset we just need an offset into the page.  Calculate
that using the physical address instead of the virtual address
so that we don't require a virtual mapping.
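
As a sketch of the idea (simplified from the patch below): the low bits
of the physical address carry the same in-page offset as the virtual
address did, so no kernel mapping of the buffer is needed:

	unsigned long paddr = sg_phys(s);	/* page_to_phys(page) + s->offset */
	unsigned int npages = iommu_num_pages(paddr, s->length, PAGE_SIZE);

	/* DMA address = allocated IOMMU entry, plus the in-page offset
	 * recovered from the low bits of paddr. */
	s->dma_address = (entry << PAGE_SHIFT) | (paddr & ~PAGE_MASK);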

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/x86/kernel/pci-calgary_64.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/pci-calgary_64.c b/arch/x86/kernel/pci-calgary_64.c
index 0497f71..8f1581d 100644
--- a/arch/x86/kernel/pci-calgary_64.c
+++ b/arch/x86/kernel/pci-calgary_64.c
@@ -368,16 +368,14 @@ static int calgary_map_sg(struct device *dev, struct 
scatterlist *sg,
 {
struct iommu_table *tbl = find_iommu_table(dev);
struct scatterlist *s;
-   unsigned long vaddr;
+   unsigned long paddr;
unsigned int npages;
unsigned long entry;
int i;
 
for_each_sg(sg, s, nelems, i) {
-   BUG_ON(!sg_page(s));
-
-		vaddr = (unsigned long) sg_virt(s);
-		npages = iommu_num_pages(vaddr, s->length, PAGE_SIZE);
+		paddr = sg_phys(s);
+		npages = iommu_num_pages(paddr, s->length, PAGE_SIZE);
 
		entry = iommu_range_alloc(dev, tbl, npages);
		if (entry == DMA_ERROR_CODE) {
@@ -389,7 +387,7 @@ static int calgary_map_sg(struct device *dev, struct scatterlist *sg,
		s->dma_address = (entry << PAGE_SHIFT) | s->offset;
 
		/* insert into HW table */
-		tce_build(tbl, entry, npages, vaddr & PAGE_MASK, dir);
+		tce_build(tbl, entry, npages, paddr & PAGE_MASK, dir);
 
		s->dma_length = s->length;
}
-- 
1.9.1


RFC: prepare for struct scatterlist entries without page backing

2015-08-12 Thread Christoph Hellwig
Dan Williams started to look into addressing I/O to and from
Persistent Memory in his series from June:

http://thread.gmane.org/gmane.linux.kernel.cross-arch/27944

I've started looking into DMA mapping of these SGLs specifically instead
of the map_pfn method in there.  In addition to supporting NVDIMM backed
I/O I also suspect this would be highly useful for media drivers that
go through nasty hoops to be able to DMA from/to their ioremapped regions,
with vb2_dc_get_userptr in drivers/media/v4l2-core/videobuf2-dma-contig.c
being a prime example for the unsafe hacks currently used.

It turns out most DMA mapping implementations can handle SGLs without
page structures with some fairly simple mechanical work.  Most of it
is just about consistently using sg_phys.  For implementations that
need to flush caches we need a new helper that skips these cache
flushes if an entry doesn't have a kernel virtual address.
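
A sketch of what such a helper could look like (arch_sync() is a
placeholder for each port's real primitive -- consistent_sync,
dma_cache_sync and friends -- not an existing interface):

static void dma_sync_sg(struct scatterlist *sgl, int nents,
			enum dma_data_direction dir)
{
	struct scatterlist *sg;
	int i;

	for_each_sg(sgl, sg, nents, i) {
		/* A page-less entry has no kernel virtual address, so
		 * there is nothing in the CPU caches to maintain. */
		if (sg_has_page(sg))
			arch_sync(sg_virt(sg), sg->length, dir);
	}
}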

However the ccio (parisc) and sba_iommu (parisc & ia64) IOMMUs seem
to operate mostly on virtual addresses.  It's a fairly odd concept
that I don't fully grasp, so I'll need some help with those if we want
to bring this forward.

Additionally this series skips ARM entirely for now.  The reason is
that most arm implementations of the .map_sg operation just iterate
over all entries and call ->map_page for it, which means we'd need
to convert those to a ->map_pfn similar to Dan's previous approach.


[PATCH 11/31] sparc/iommu: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Use sg_phys() instead of __pa(sg_virt(sg)) so that we don't
require a kernel virtual address.
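
For page-backed entries the two expressions name the same physical
address, which is why the substitution is safe -- a sketch of the
equivalence:

	/* sg_virt(sg)       == page_address(sg_page(sg)) + sg->offset
	 * __pa(sg_virt(sg)) == page_to_phys(sg_page(sg)) + sg->offset
	 *                   == sg_phys(sg)
	 *
	 * sg_phys() avoids the detour through the kernel direct mapping,
	 * so it keeps working once entries stop carrying one.
	 */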

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/sparc/kernel/iommu.c| 2 +-
 arch/sparc/kernel/iommu_common.h | 4 +---
 arch/sparc/kernel/pci_sun4v.c| 2 +-
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
index 5320689..2ad89d2 100644
--- a/arch/sparc/kernel/iommu.c
+++ b/arch/sparc/kernel/iommu.c
@@ -486,7 +486,7 @@ static int dma_4u_map_sg(struct device *dev, struct 
scatterlist *sglist,
continue;
}
/* Allocate iommu entries for that segment */
-   paddr = (unsigned long) SG_ENT_PHYS_ADDRESS(s);
+   paddr = sg_phys(s);
npages = iommu_num_pages(paddr, slen, IO_PAGE_SIZE);
		entry = iommu_tbl_range_alloc(dev, &iommu->tbl, npages,
					      &handle, (unsigned long)(-1), 0);
diff --git a/arch/sparc/kernel/iommu_common.h b/arch/sparc/kernel/iommu_common.h
index b40cec2..8e2c211 100644
--- a/arch/sparc/kernel/iommu_common.h
+++ b/arch/sparc/kernel/iommu_common.h
@@ -33,15 +33,13 @@
  */
 #define IOMMU_PAGE_SHIFT   13
 
-#define SG_ENT_PHYS_ADDRESS(SG)	(__pa(sg_virt((SG))))
-
 static inline int is_span_boundary(unsigned long entry,
   unsigned long shift,
   unsigned long boundary_size,
   struct scatterlist *outs,
   struct scatterlist *sg)
 {
-   unsigned long paddr = SG_ENT_PHYS_ADDRESS(outs);
+   unsigned long paddr = sg_phys(outs);
	int nr = iommu_num_pages(paddr, outs->dma_length + sg->length,
				 IO_PAGE_SIZE);
 
diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
index d2fe57d..a7a6e41 100644
--- a/arch/sparc/kernel/pci_sun4v.c
+++ b/arch/sparc/kernel/pci_sun4v.c
@@ -370,7 +370,7 @@ static int dma_4v_map_sg(struct device *dev, struct 
scatterlist *sglist,
continue;
}
/* Allocate iommu entries for that segment */
-   paddr = (unsigned long) SG_ENT_PHYS_ADDRESS(s);
+   paddr = sg_phys(s);
npages = iommu_num_pages(paddr, slen, IO_PAGE_SIZE);
		entry = iommu_tbl_range_alloc(dev, &iommu->tbl, npages,
					      &handle, (unsigned long)(-1), 0);
-- 
1.9.1


[PATCH 12/31] mn10300: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Just remove a BUG_ON, the code handles them just fine as-is.

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/mn10300/include/asm/dma-mapping.h | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/mn10300/include/asm/dma-mapping.h 
b/arch/mn10300/include/asm/dma-mapping.h
index a18abfc..b1b1050 100644
--- a/arch/mn10300/include/asm/dma-mapping.h
+++ b/arch/mn10300/include/asm/dma-mapping.h
@@ -57,11 +57,8 @@ int dma_map_sg(struct device *dev, struct scatterlist 
*sglist, int nents,
BUG_ON(!valid_dma_direction(direction));
WARN_ON(nents == 0 || sglist[0].length == 0);
 
-   for_each_sg(sglist, sg, nents, i) {
-   BUG_ON(!sg_page(sg));
-
+   for_each_sg(sglist, sg, nents, i)
		sg->dma_address = sg_phys(sg);
-   }
 
mn10300_dcache_flush_inv();
return nents;
-- 
1.9.1


[PATCH 17/31] ia64/sba_iommu: remove sba_sg_address

2015-08-12 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/ia64/hp/common/sba_iommu.c | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
index 344387a..9e5aa8e 100644
--- a/arch/ia64/hp/common/sba_iommu.c
+++ b/arch/ia64/hp/common/sba_iommu.c
@@ -248,8 +248,6 @@ static int reserve_sba_gart = 1;
 static SBA_INLINE void sba_mark_invalid(struct ioc *, dma_addr_t, size_t);
 static SBA_INLINE void sba_free_range(struct ioc *, dma_addr_t, size_t);
 
-#define sba_sg_address(sg) sg_virt((sg))
-
 #ifdef FULL_VALID_PDIR
 static u64 prefetch_spill_page;
 #endif
@@ -397,7 +395,7 @@ sba_dump_sg( struct ioc *ioc, struct scatterlist *startsg, 
int nents)
	while (nents-- > 0) {
		printk(KERN_DEBUG " %d : DMA %08lx/%05x CPU %p\n", nents,
		       startsg->dma_address, startsg->dma_length,
-  sba_sg_address(startsg));
+  sg_virt(startsg));
startsg = sg_next(startsg);
}
 }
@@ -409,7 +407,7 @@ sba_check_sg( struct ioc *ioc, struct scatterlist *startsg, 
int nents)
int the_nents = nents;
 
	while (the_nents-- > 0) {
-   if (sba_sg_address(the_sg) == 0x0UL)
+   if (sg_virt(the_sg) == 0x0UL)
sba_dump_sg(NULL, startsg, nents);
the_sg = sg_next(the_sg);
}
@@ -1243,11 +1241,11 @@ sba_fill_pdir(
if (dump_run_sg)
			printk(" %2d : %08lx/%05x %p\n",
				nents, startsg->dma_address, cnt,
-   sba_sg_address(startsg));
+   sg_virt(startsg));
 #else
		DBG_RUN_SG(" %d : %08lx/%05x %p\n",
			nents, startsg->dma_address, cnt,
-   sba_sg_address(startsg));
+   sg_virt(startsg));
 #endif
/*
** Look for the start of a new DMA stream
@@ -1267,7 +1265,7 @@ sba_fill_pdir(
** Look for a VCONTIG chunk
*/
if (cnt) {
-   unsigned long vaddr = (unsigned long) 
sba_sg_address(startsg);
+   unsigned long vaddr = (unsigned long) sg_virt(startsg);
ASSERT(pdirp);
 
/* Since multiple Vcontig blocks could make up
@@ -1335,7 +1333,7 @@ sba_coalesce_chunks(struct ioc *ioc, struct device *dev,
int idx;
 
	while (nents > 0) {
-   unsigned long vaddr = (unsigned long) sba_sg_address(startsg);
+   unsigned long vaddr = (unsigned long) sg_virt(startsg);
 
/*
** Prepare for first/next DMA stream
@@ -1380,7 +1378,7 @@ sba_coalesce_chunks(struct ioc *ioc, struct device *dev,
**
** append the next transaction?
*/
-   vaddr = (unsigned long) sba_sg_address(startsg);
+   vaddr = (unsigned long) sg_virt(startsg);
		if (vcontig_end == vaddr)
		{
			vcontig_len += startsg->length;
@@ -1479,7 +1477,7 @@ static int sba_map_sg_attrs(struct device *dev, struct 
scatterlist *sglist,
	if (likely((ioc->dma_mask & ~to_pci_dev(dev)->dma_mask) == 0)) {
		for_each_sg(sglist, sg, nents, filled) {
			sg->dma_length = sg->length;
-			sg->dma_address = virt_to_phys(sba_sg_address(sg));
+			sg->dma_address = virt_to_phys(sg_virt(sg));
}
return filled;
}
@@ -1487,7 +1485,7 @@ static int sba_map_sg_attrs(struct device *dev, struct 
scatterlist *sglist,
/* Fast path single entry scatterlists. */
if (nents == 1) {
		sglist->dma_length = sglist->length;
-		sglist->dma_address = sba_map_single_attrs(dev, sba_sg_address(sglist), sglist->length, dir, attrs);
+		sglist->dma_address = sba_map_single_attrs(dev, sg_virt(sglist), sglist->length, dir, attrs);
return 1;
}
 
@@ -1563,7 +1561,7 @@ static void sba_unmap_sg_attrs(struct device *dev, struct 
scatterlist *sglist,
 #endif
 
	DBG_RUN_SG("%s() START %d entries, %p,%x\n",
-		   __func__, nents, sba_sg_address(sglist), sglist->length);
+		   __func__, nents, sg_virt(sglist), sglist->length);
 
 #ifdef ASSERT_PDIR_SANITY
ioc = GET_IOC(dev);
-- 
1.9.1


[PATCH 24/31] xtensa: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Make all cache invalidation conditional on sg_has_page().

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/xtensa/include/asm/dma-mapping.h | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/arch/xtensa/include/asm/dma-mapping.h 
b/arch/xtensa/include/asm/dma-mapping.h
index 1f5f6dc..262a1d1 100644
--- a/arch/xtensa/include/asm/dma-mapping.h
+++ b/arch/xtensa/include/asm/dma-mapping.h
@@ -61,10 +61,9 @@ dma_map_sg(struct device *dev, struct scatterlist *sglist, 
int nents,
BUG_ON(direction == DMA_NONE);
 
for_each_sg(sglist, sg, nents, i) {
-   BUG_ON(!sg_page(sg));
-
		sg->dma_address = sg_phys(sg);
-		consistent_sync(sg_virt(sg), sg->length, direction);
+		if (sg_has_page(sg))
+			consistent_sync(sg_virt(sg), sg->length, direction);
}
 
return nents;
@@ -131,8 +130,10 @@ dma_sync_sg_for_cpu(struct device *dev, struct scatterlist 
*sglist, int nelems,
int i;
struct scatterlist *sg;
 
-	for_each_sg(sglist, sg, nelems, i)
-		consistent_sync(sg_virt(sg), sg->length, dir);
+	for_each_sg(sglist, sg, nelems, i) {
+		if (sg_has_page(sg))
+			consistent_sync(sg_virt(sg), sg->length, dir);
+	}
 }
 
 static inline void
@@ -142,8 +143,10 @@ dma_sync_sg_for_device(struct device *dev, struct 
scatterlist *sglist,
int i;
struct scatterlist *sg;
 
-	for_each_sg(sglist, sg, nelems, i)
-		consistent_sync(sg_virt(sg), sg->length, dir);
+	for_each_sg(sglist, sg, nelems, i) {
+		if (sg_has_page(sg))
+			consistent_sync(sg_virt(sg), sg->length, dir);
+	}
 }
 static inline int
 dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
-- 
1.9.1


[PATCH 23/31] sh: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Make all cache invalidation conditional on sg_has_page().

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/sh/kernel/dma-nommu.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/sh/kernel/dma-nommu.c b/arch/sh/kernel/dma-nommu.c
index 5b0bfcd..3b64dc7 100644
--- a/arch/sh/kernel/dma-nommu.c
+++ b/arch/sh/kernel/dma-nommu.c
@@ -33,9 +33,8 @@ static int nommu_map_sg(struct device *dev, struct 
scatterlist *sg,
WARN_ON(nents == 0 || sg[0].length == 0);
 
for_each_sg(sg, s, nents, i) {
-   BUG_ON(!sg_page(s));
-
-		dma_cache_sync(dev, sg_virt(s), s->length, dir);
+		if (sg_has_page(s))
+			dma_cache_sync(dev, sg_virt(s), s->length, dir);
 
		s->dma_address = sg_phys(s);
		s->dma_length = s->length;
@@ -57,8 +56,10 @@ static void nommu_sync_sg(struct device *dev, struct 
scatterlist *sg,
struct scatterlist *s;
int i;
 
-	for_each_sg(sg, s, nelems, i)
-		dma_cache_sync(dev, sg_virt(s), s->length, dir);
+	for_each_sg(sg, s, nelems, i) {
+		if (sg_has_page(s))
+			dma_cache_sync(dev, sg_virt(s), s->length, dir);
+	}
 }
 #endif
 
-- 
1.9.1


Re: [PATCH 8/8] powerpc/xmon: Add some more elements to the existing PACA dump list

2015-08-12 Thread Anshuman Khandual
On 08/12/2015 11:35 AM, Michael Ellerman wrote:
 On Wed, 2015-07-29 at 12:40 +0530, Anshuman Khandual wrote:
 This patch adds a set of new elements to the existing PACA dump list
 inside an xmon session, listed below, improving the overall xmon
 debug support.

 (1) hmi_event_available
 (2) dscr_default
 (3) vmalloc_sllp
 (4) slb_cache_ptr
 (5) sprg_vdso
 (6) tm_scratch
 (7) core_idle_state_ptr
 (8) thread_idle_state
 (9) thread_mask
 (10) slb_shadow
 (11) pgd
 (12) kernel_pgd
 (13) tcd_ptr
 (14) mc_kstack
 (15) crit_kstack
 (16) dbg_kstack
 (17) user_time
 (18) system_time
 (19) user_time_scaled
 (20) starttime
 (21) starttime_user
 (22) startspurr
 (23) utime_sspurr
 (24) stolen_time
 
 Adding these makes the paca display much longer than 24 lines. I know in
 general we don't worry too much about folks on 80x24 green screens, but it's
 nice if xmon works OK on those. Or on virtual consoles that don't scroll for
 whatever reason.
 
 So I'm going to hold off on this one until we have a way to display some of
 the paca. I have an idea for that and will send a patch if it works.
 

Sure, if you believe that is the best thing to do at the moment.


Re: [PATCH v3 09/11] cxl: Allow the kernel to trust that an image won't change on PERST.

2015-08-12 Thread Cyril Bur
On Wed, 12 Aug 2015 10:48:18 +1000
Daniel Axtens d...@axtens.net wrote:

 Provide a kernel API and a sysfs entry which allow a user to specify
 that when a card is PERSTed, its image will stay the same, allowing
 it to participate in EEH.
 
 cxl_reset is used to reflash the card. In that case, we cannot safely
 assert that the image will not change. Therefore, disallow cxl_reset
 if the flag is set.
 

Looks much better without all the #ifdefs!!

Reviewed-by: Cyril Bur cyril...@gmail.com

 Signed-off-by: Daniel Axtens d...@axtens.net
 ---
  Documentation/ABI/testing/sysfs-class-cxl | 10 ++
  drivers/misc/cxl/api.c|  7 +++
  drivers/misc/cxl/cxl.h|  1 +
  drivers/misc/cxl/pci.c|  7 +++
  drivers/misc/cxl/sysfs.c  | 26 ++
  include/misc/cxl.h| 10 ++
  6 files changed, 61 insertions(+)
 
 diff --git a/Documentation/ABI/testing/sysfs-class-cxl 
 b/Documentation/ABI/testing/sysfs-class-cxl
 index acfe9df83139..b07e86d4597f 100644
 --- a/Documentation/ABI/testing/sysfs-class-cxl
 +++ b/Documentation/ABI/testing/sysfs-class-cxl
 @@ -223,3 +223,13 @@ Description:write only
  Writing 1 will issue a PERST to card which may cause the card
  to reload the FPGA depending on load_image_on_perst.
  Users:   https://github.com/ibm-capi/libcxl
 +
 +What:/sys/class/cxl/card/perst_reloads_same_image
 +Date:July 2015
 +Contact: linuxppc-dev@lists.ozlabs.org
 +Description: read/write
 + Trust that when an image is reloaded via PERST, it will not
 + have changed.
 + 0 = don't trust, the image may be different (default)
 + 1 = trust that the image will not change.
 +Users:   https://github.com/ibm-capi/libcxl
 diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
 index 729e0851167d..6a768a9ad22f 100644
 --- a/drivers/misc/cxl/api.c
 +++ b/drivers/misc/cxl/api.c
 @@ -327,3 +327,10 @@ int cxl_afu_reset(struct cxl_context *ctx)
   return cxl_afu_check_and_enable(afu);
  }
  EXPORT_SYMBOL_GPL(cxl_afu_reset);
 +
 +void cxl_perst_reloads_same_image(struct cxl_afu *afu,
 +   bool perst_reloads_same_image)
 +{
 +	afu->adapter->perst_same_image = perst_reloads_same_image;
 +}
 +EXPORT_SYMBOL_GPL(cxl_perst_reloads_same_image);
 diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
 index d540542f9931..cda02412b01e 100644
 --- a/drivers/misc/cxl/cxl.h
 +++ b/drivers/misc/cxl/cxl.h
 @@ -493,6 +493,7 @@ struct cxl {
   bool user_image_loaded;
   bool perst_loads_image;
   bool perst_select_user;
 + bool perst_same_image;
  };
  
  int cxl_alloc_one_irq(struct cxl *adapter);
 diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
 index 023a2086830b..b4a68a896a33 100644
 --- a/drivers/misc/cxl/pci.c
 +++ b/drivers/misc/cxl/pci.c
 @@ -874,6 +874,12 @@ int cxl_reset(struct cxl *adapter)
   int i;
   u32 val;
  
 +	if (adapter->perst_same_image) {
 +		dev_warn(&dev->dev,
 +			 "cxl: refusing to reset/reflash when perst_reloads_same_image is set.\n");
 +		return -EINVAL;
 +	}
 +
 	dev_info(&dev->dev, "CXL reset\n");
  
   /* pcie_warm_reset requests a fundamental pci reset which includes a
 @@ -1148,6 +1154,7 @@ static struct cxl *cxl_init_adapter(struct pci_dev *dev)
* configure/reconfigure
*/
 	adapter->perst_loads_image = true;
 +	adapter->perst_same_image = false;
  
   rc = cxl_configure_adapter(adapter, dev);
   if (rc) {
 diff --git a/drivers/misc/cxl/sysfs.c b/drivers/misc/cxl/sysfs.c
 index 31f38bc71a3d..6619cf1f6e1f 100644
 --- a/drivers/misc/cxl/sysfs.c
 +++ b/drivers/misc/cxl/sysfs.c
 @@ -112,12 +112,38 @@ static ssize_t load_image_on_perst_store(struct device 
 *device,
   return count;
  }
  
 +static ssize_t perst_reloads_same_image_show(struct device *device,
 +  struct device_attribute *attr,
 +  char *buf)
 +{
 + struct cxl *adapter = to_cxl_adapter(device);
 +
 +	return scnprintf(buf, PAGE_SIZE, "%i\n", adapter->perst_same_image);
 +}
 +
 +static ssize_t perst_reloads_same_image_store(struct device *device,
 +  struct device_attribute *attr,
 +  const char *buf, size_t count)
 +{
 + struct cxl *adapter = to_cxl_adapter(device);
 + int rc;
 + int val;
 +
 +	rc = sscanf(buf, "%i", &val);
 + if ((rc != 1) || !(val == 1 || val == 0))
 + return -EINVAL;
 +
 +	adapter->perst_same_image = (val == 1 ? true : false);
 + return count;
 +}
 +
  static struct device_attribute adapter_attrs[] = {
   __ATTR_RO(caia_version),
   __ATTR_RO(psl_revision),
   __ATTR_RO(base_image),
   __ATTR_RO(image_loaded),
   

Re: [4/8] powerpc/slb: Add some helper functions to improve modularization

2015-08-12 Thread Anshuman Khandual
On 08/12/2015 09:41 AM, Michael Ellerman wrote:
 On Wed, 2015-29-07 at 07:10:01 UTC, Anshuman Khandual wrote:
  This patch adds the following six helper functions to help improve
  modularization and readability of the code.
  
  (1) slb_invalidate_all:     Invalidates the entire SLB
  (2) slb_invalidate:         Invalidates SLB entries present in PACA
  (3) mmu_linear_vsid_flags:  VSID flags for kernel linear mapping
  (4) mmu_virtual_vsid_flags: VSID flags for kernel virtual mapping
  (5) mmu_vmemmap_vsid_flags: VSID flags for kernel vmem mapping
  (6) mmu_io_vsid_flags:      VSID flags for kernel I/O mapping
 That's too many changes for one patch, it's certainly not a single logical 
 change.
 
 I'm happy with all the flag ones being done in a single patch, but please do
 the other two in separate patches.

Sure, will split this into three separate patches, also update the
in-code documentation as suggested on the [5/8] patch and then will
send out a new series.


[PATCH 01/31] scatterlist: add sg_pfn and sg_has_page helpers

2015-08-12 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig h...@lst.de
---
 include/linux/scatterlist.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 9b1ef0c..b1056bf 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -230,6 +230,16 @@ static inline dma_addr_t sg_phys(struct scatterlist *sg)
	return page_to_phys(sg_page(sg)) + sg->offset;
 }
 
+static inline unsigned long sg_pfn(struct scatterlist *sg)
+{
+   return page_to_pfn(sg_page(sg));
+}
+
+static inline bool sg_has_page(struct scatterlist *sg)
+{
+   return true;
+}
+
 /**
  * sg_virt - Return virtual address of an sg entry
  * @sg:  SG entry
-- 
1.9.1


[PATCH 14/31] sparc32/io-unit: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
For the iommu offset we just need an offset into the page.  Calculate
that using the physical address instead of the virtual address
so that we don't require a virtual mapping.

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/sparc/mm/io-unit.c | 23 ---
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/arch/sparc/mm/io-unit.c b/arch/sparc/mm/io-unit.c
index f311bf2..82f97ae 100644
--- a/arch/sparc/mm/io-unit.c
+++ b/arch/sparc/mm/io-unit.c
@@ -91,13 +91,14 @@ static int __init iounit_init(void)
 subsys_initcall(iounit_init);
 
 /* One has to hold iounit->lock to call this */
-static unsigned long iounit_get_area(struct iounit_struct *iounit, unsigned long vaddr, int size)
+static dma_addr_t iounit_get_area(struct iounit_struct *iounit,
+		unsigned long paddr, int size)
 {
	int i, j, k, npages;
-	unsigned long rotor, scan, limit;
+	unsigned long rotor, scan, limit, dma_addr;
	iopte_t iopte;
 
-	npages = ((vaddr & ~PAGE_MASK) + size + (PAGE_SIZE-1)) >> PAGE_SHIFT;
+	npages = ((paddr & ~PAGE_MASK) + size + (PAGE_SIZE-1)) >> PAGE_SHIFT;
 
	/* A tiny bit of magic ingredience :) */
	switch (npages) {
@@ -106,7 +107,7 @@ static unsigned long iounit_get_area(struct iounit_struct *iounit, unsigned long
	default: i = 0x0213; break;
	}
 
-	IOD(("iounit_get_area(%08lx,%d[%d])=", vaddr, size, npages));
+	IOD(("iounit_get_area(%08lx,%d[%d])=", paddr, size, npages));
 
next:	j = (i & 15);
	rotor = iounit->rotor[j - 1];
@@ -121,7 +122,7 @@ nexti:	scan = find_next_zero_bit(iounit->bmap, limit, scan);
		}
		i >>= 4;
		if (!(i & 15))
-			panic("iounit_get_area: Couldn't find free iopte slots for (%08lx,%d)\n", vaddr, size);
+			panic("iounit_get_area: Couldn't find free iopte slots for (%08lx,%d)\n", paddr, size);
		goto next;
	}
	for (k = 1, scan++; k < npages; k++)
@@ -129,14 +130,14 @@ nexti:	scan = find_next_zero_bit(iounit->bmap, limit, scan);
		goto nexti;
	iounit->rotor[j - 1] = (scan < limit) ? scan : iounit->limit[j - 1];
	scan -= npages;
-	iopte = MKIOPTE(__pa(vaddr & PAGE_MASK));
-	vaddr = IOUNIT_DMA_BASE + (scan << PAGE_SHIFT) + (vaddr & ~PAGE_MASK);
+	iopte = MKIOPTE(paddr & PAGE_MASK);
+	dma_addr = IOUNIT_DMA_BASE + (scan << PAGE_SHIFT) + (paddr & ~PAGE_MASK);
	for (k = 0; k < npages; k++, iopte = __iopte(iopte_val(iopte) + 0x100), scan++) {
		set_bit(scan, iounit->bmap);
		sbus_writel(iopte, &iounit->page_table[scan]);
	}
-	IOD(("%08lx\n", vaddr));
-	return vaddr;
+	IOD(("%08lx\n", dma_addr));
+	return dma_addr;
 }
 
 static __u32 iounit_get_scsi_one(struct device *dev, char *vaddr, unsigned 
long len)
@@ -145,7 +146,7 @@ static __u32 iounit_get_scsi_one(struct device *dev, char 
*vaddr, unsigned long
unsigned long ret, flags;

	spin_lock_irqsave(&iounit->lock, flags);
-   ret = iounit_get_area(iounit, (unsigned long)vaddr, len);
+   ret = iounit_get_area(iounit, virt_to_phys(vaddr), len);
	spin_unlock_irqrestore(&iounit->lock, flags);
return ret;
 }
@@ -159,7 +160,7 @@ static void iounit_get_scsi_sgl(struct device *dev, struct 
scatterlist *sg, int
	spin_lock_irqsave(&iounit->lock, flags);
	while (sz != 0) {
		--sz;
-		sg->dma_address = iounit_get_area(iounit, (unsigned long) sg_virt(sg), sg->length);
+		sg->dma_address = iounit_get_area(iounit, sg_phys(sg), sg->length);
		sg->dma_length = sg->length;
		sg = sg_next(sg);
sg = sg_next(sg);
}
-- 
1.9.1


[PATCH 21/31] blackfin: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Switch from sg_virt to sg_phys, as blackfin, like all nommu
architectures, has a 1:1 virtual-to-physical mapping.

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/blackfin/kernel/dma-mapping.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/blackfin/kernel/dma-mapping.c 
b/arch/blackfin/kernel/dma-mapping.c
index df437e5..e2c4d1a 100644
--- a/arch/blackfin/kernel/dma-mapping.c
+++ b/arch/blackfin/kernel/dma-mapping.c
@@ -120,7 +120,7 @@ dma_map_sg(struct device *dev, struct scatterlist *sg_list, 
int nents,
int i;
 
for_each_sg(sg_list, sg, nents, i) {
-		sg->dma_address = (dma_addr_t) sg_virt(sg);
+		sg->dma_address = sg_phys(sg);
__dma_sync(sg_dma_address(sg), sg_dma_len(sg), direction);
}
 
@@ -135,7 +135,7 @@ void dma_sync_sg_for_device(struct device *dev, struct 
scatterlist *sg_list,
int i;
 
for_each_sg(sg_list, sg, nelems, i) {
-		sg->dma_address = (dma_addr_t) sg_virt(sg);
+		sg->dma_address = sg_phys(sg);
__dma_sync(sg_dma_address(sg), sg_dma_len(sg), direction);
}
 }
-- 
1.9.1


[PATCH 27/31] mips: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Make all cache invalidation conditional on sg_has_page() and use
sg_phys to get the physical address directly.  To do this consolidate
the two platform callouts using pages and virtual addresses into a
single one using a physical address.

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/mips/bmips/dma.c  |  9 ++--
 arch/mips/include/asm/mach-ath25/dma-coherence.h   | 10 ++---
 arch/mips/include/asm/mach-bmips/dma-coherence.h   |  4 ++--
 .../include/asm/mach-cavium-octeon/dma-coherence.h | 11 ++
 arch/mips/include/asm/mach-generic/dma-coherence.h | 12 +++
 arch/mips/include/asm/mach-ip27/dma-coherence.h| 16 +++---
 arch/mips/include/asm/mach-ip32/dma-coherence.h| 19 +++-
 arch/mips/include/asm/mach-jazz/dma-coherence.h| 11 +++---
 .../include/asm/mach-loongson64/dma-coherence.h| 16 +++---
 arch/mips/mm/dma-default.c | 25 --
 10 files changed, 37 insertions(+), 96 deletions(-)

diff --git a/arch/mips/bmips/dma.c b/arch/mips/bmips/dma.c
index 04790f4..13fc891 100644
--- a/arch/mips/bmips/dma.c
+++ b/arch/mips/bmips/dma.c
@@ -52,14 +52,9 @@ static dma_addr_t bmips_phys_to_dma(struct device *dev, 
phys_addr_t pa)
return pa;
 }
 
-dma_addr_t plat_map_dma_mem(struct device *dev, void *addr, size_t size)
+dma_addr_t plat_map_dma_mem(struct device *dev, phys_addr_t phys, size_t size)
 {
-   return bmips_phys_to_dma(dev, virt_to_phys(addr));
-}
-
-dma_addr_t plat_map_dma_mem_page(struct device *dev, struct page *page)
-{
-   return bmips_phys_to_dma(dev, page_to_phys(page));
+   return bmips_phys_to_dma(dev, phys);
 }
 
 unsigned long plat_dma_addr_to_phys(struct device *dev, dma_addr_t dma_addr)
diff --git a/arch/mips/include/asm/mach-ath25/dma-coherence.h 
b/arch/mips/include/asm/mach-ath25/dma-coherence.h
index d5defdd..4330de6 100644
--- a/arch/mips/include/asm/mach-ath25/dma-coherence.h
+++ b/arch/mips/include/asm/mach-ath25/dma-coherence.h
@@ -31,15 +31,9 @@ static inline dma_addr_t ath25_dev_offset(struct device *dev)
 }
 
 static inline dma_addr_t
-plat_map_dma_mem(struct device *dev, void *addr, size_t size)
+plat_map_dma_mem(struct device *dev, phys_addr_t phys, size_t size)
 {
-   return virt_to_phys(addr) + ath25_dev_offset(dev);
-}
-
-static inline dma_addr_t
-plat_map_dma_mem_page(struct device *dev, struct page *page)
-{
-   return page_to_phys(page) + ath25_dev_offset(dev);
+   return phys + ath25_dev_offset(dev);
 }
 
 static inline unsigned long
diff --git a/arch/mips/include/asm/mach-bmips/dma-coherence.h 
b/arch/mips/include/asm/mach-bmips/dma-coherence.h
index d29781f..1b9a7f4 100644
--- a/arch/mips/include/asm/mach-bmips/dma-coherence.h
+++ b/arch/mips/include/asm/mach-bmips/dma-coherence.h
@@ -21,8 +21,8 @@
 
 struct device;
 
-extern dma_addr_t plat_map_dma_mem(struct device *dev, void *addr, size_t 
size);
-extern dma_addr_t plat_map_dma_mem_page(struct device *dev, struct page *page);
+extern dma_addr_t plat_map_dma_mem(struct device *dev, phys_addr_t phys,
+   size_t size);
 extern unsigned long plat_dma_addr_to_phys(struct device *dev,
dma_addr_t dma_addr);
 
diff --git a/arch/mips/include/asm/mach-cavium-octeon/dma-coherence.h 
b/arch/mips/include/asm/mach-cavium-octeon/dma-coherence.h
index 460042e..d0988c7 100644
--- a/arch/mips/include/asm/mach-cavium-octeon/dma-coherence.h
+++ b/arch/mips/include/asm/mach-cavium-octeon/dma-coherence.h
@@ -19,15 +19,8 @@ struct device;
 
 extern void octeon_pci_dma_init(void);
 
-static inline dma_addr_t plat_map_dma_mem(struct device *dev, void *addr,
-   size_t size)
-{
-   BUG();
-   return 0;
-}
-
-static inline dma_addr_t plat_map_dma_mem_page(struct device *dev,
-   struct page *page)
+static inline dma_addr_t plat_map_dma_mem(struct device *dev, phys_addr_t phys,
+   size_t size)
 {
BUG();
return 0;
diff --git a/arch/mips/include/asm/mach-generic/dma-coherence.h 
b/arch/mips/include/asm/mach-generic/dma-coherence.h
index 0f8a354..2dfb133 100644
--- a/arch/mips/include/asm/mach-generic/dma-coherence.h
+++ b/arch/mips/include/asm/mach-generic/dma-coherence.h
@@ -11,16 +11,10 @@
 
 struct device;
 
-static inline dma_addr_t plat_map_dma_mem(struct device *dev, void *addr,
-   size_t size)
+static inline dma_addr_t plat_map_dma_mem(struct device *dev, phys_addr_t phys,
+   size_t size)
 {
-   return virt_to_phys(addr);
-}
-
-static inline dma_addr_t plat_map_dma_mem_page(struct device *dev,
-   struct page *page)
-{
-   return page_to_phys(page);
+   return phys;
 }
 
 static inline unsigned long plat_dma_addr_to_phys(struct device *dev,
diff --git a/arch/mips/include/asm/mach-ip27/dma-coherence.h 
b/arch/mips/include/asm/mach-ip27/dma-coherence.h
index 1daa644..2578b9d 100644
--- a/arch/mips/include/asm/mach-ip27/dma-coherence.h
+++ 

[PATCH 28/31] powerpc: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Make all cache invalidation conditional on sg_has_page().

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/powerpc/kernel/dma.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index 35e4dcc..cece40b 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -135,7 +135,10 @@ static int dma_direct_map_sg(struct device *dev, struct 
scatterlist *sgl,
for_each_sg(sgl, sg, nents, i) {
		sg->dma_address = sg_phys(sg) + get_dma_offset(dev);
		sg->dma_length = sg->length;
-		__dma_sync_page(sg_page(sg), sg->offset, sg->length, direction);
+		if (sg_has_page(sg)) {
+			__dma_sync_page(sg_page(sg), sg->offset, sg->length,
+					direction);
+   }
}
 
return nents;
@@ -200,7 +203,10 @@ static inline void dma_direct_sync_sg(struct device *dev,
int i;
 
for_each_sg(sgl, sg, nents, i)
-		__dma_sync_page(sg_page(sg), sg->offset, sg->length, direction);
+		if (sg_has_page(sg)) {
+			__dma_sync_page(sg_page(sg), sg->offset, sg->length,
+					direction);
+   }
 }
 
 static inline void dma_direct_sync_single(struct device *dev,
-- 
1.9.1


Re: [V3] powerpc/irq: Enable some more exceptions in /proc/interrupts interface

2015-08-12 Thread Anshuman Khandual
On 08/09/2015 07:57 AM, Benjamin Herrenschmidt wrote:
 On Tue, 2015-08-04 at 19:57 +1000, Michael Ellerman wrote:
  On Mon, 2015-13-07 at 08:16:06 UTC, Anshuman Khandual wrote:
   This patch enables facility unavailable exceptions for generic facility,
   FPU, ALTIVEC and VSX in /proc/interrupts listing by incrementing their
   newly added IRQ statistical counters as and when these exceptions 
   happen.
   This also adds couple of helper functions which will be called from 
   within
   the interrupt handler context to update their statistics. Similarly this
   patch also enables alignment and program check exceptions as well.
  
  ...
  
   diff --git a/arch/powerpc/kernel/exceptions-64s.S 
   b/arch/powerpc/kernel/exceptions-64s.S
   index 0a0399c2..a86180c 100644
   --- a/arch/powerpc/kernel/exceptions-64s.S
   +++ b/arch/powerpc/kernel/exceptions-64s.S
   @@ -1158,6 +1158,7 @@ BEGIN_FTR_SECTION
END_FTR_SECTION_IFSET(CPU_FTR_TM)
#endif
   bl  load_up_fpu
   +   bl  fpu_unav_exceptions_count
  
  Is it safe to call C code here?
 Even if it was (at some stage it wasn't, I'd have to look very closely
 to see what's the situation now), we certainly don't want to add
 overhead to load_up_fpu.

As I had already mentioned in the V2 thread of this patch, the
FPU performance with this patch being applied is still very much
comparable to the kernel without this patch. Though I have not
verified whether this still holds true with the new changes being
proposed in exceptions-64s.S (earlier reply in this thread) to
make the C function call safer.

Average of 1000 iterations (context_switch2 --fp 0 0)

With the patch    : 322599.57  (average of 1000 results)
Without the patch : 320464.924 (average of 1000 results)

Standard deviation of the results:

6029.1407073288 (with patch), 5941.7684079774 (without patch)

If the results above still do not convince us that FPU performance
is not getting hit by this patch, let me know and we can run more
experiments.


Re: [PATCH v3 02/11] cxl: Drop commands if the PCI channel is not in normal state

2015-08-12 Thread Cyril Bur
On Wed, 12 Aug 2015 10:48:11 +1000
Daniel Axtens d...@axtens.net wrote:

 If the PCI channel has gone down, don't attempt to poke the hardware.
 
 We need to guard every time cxl_whatever_(read|write) is called. This
 is because a call to those functions will dereference an offset into an
 mmio register, and the mmio mappings get invalidated in the EEH
 teardown.
 
 Check in the read/write functions in the header.
 We give them the same semantics as usual PCI operations:
  - a write to a channel that is down is ignored.
  - a read from a channel that is down returns all fs.
 
 Also, we try to access the MMIO space of a vPHB device as part of the
 PCI disable path. Because that's a read that bypasses most of our usual
 checks, we handle it explicitly.
 
 As far as user visible warnings go:
  - Check link state in file ops, return -EIO if down.
  - Be reasonably quiet if there's an error in a teardown path,
or when we already know the hardware is going down.
  - Throw a big WARN if someone tries to start a CXL operation
while the card is down. This gives a useful stacktrace for
debugging whatever is doing that.
 

My previous comments appear to have been addressed; making functions from those
macros was a good move. I can't speak too much for the exact function of the
patch but the code looks good.

Reviewed-by: Cyril Bur cyril...@gmail.com

 Signed-off-by: Daniel Axtens d...@axtens.net
 ---
  drivers/misc/cxl/context.c |  6 +++-
  drivers/misc/cxl/cxl.h | 44 ++--
  drivers/misc/cxl/file.c| 19 +
  drivers/misc/cxl/native.c  | 71 
 --
  drivers/misc/cxl/vphb.c| 26 +
  5 files changed, 154 insertions(+), 12 deletions(-)
 
 diff --git a/drivers/misc/cxl/context.c b/drivers/misc/cxl/context.c
 index 1287148629c0..615842115848 100644
 --- a/drivers/misc/cxl/context.c
 +++ b/drivers/misc/cxl/context.c
 @@ -193,7 +193,11 @@ int __detach_context(struct cxl_context *ctx)
   if (status != STARTED)
   return -EBUSY;
  
 - WARN_ON(cxl_detach_process(ctx));
 + /* Only warn if we detached while the link was OK.
 +  * If detach fails when hw is down, we don't care.
 +  */
 +	WARN_ON(cxl_detach_process(ctx) &&
 +		cxl_adapter_link_ok(ctx->afu->adapter));
 	flush_work(&ctx->fault_work); /* Only needed for dedicated process */
 	put_pid(ctx->pid);
   cxl_ctx_put();
 diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
 index 6a93bfbcd826..9b9e89fd02cc 100644
 --- a/drivers/misc/cxl/cxl.h
 +++ b/drivers/misc/cxl/cxl.h
 @@ -531,6 +531,14 @@ struct cxl_process_element {
   __be32 software_state;
  } __packed;
  
 +static inline bool cxl_adapter_link_ok(struct cxl *cxl)
 +{
 + struct pci_dev *pdev;
 +
 +	pdev = to_pci_dev(cxl->dev.parent);
 + return !pci_channel_offline(pdev);
 +}
 +
  static inline void __iomem *_cxl_p1_addr(struct cxl *cxl, cxl_p1_reg_t reg)
  {
   WARN_ON(!cpu_has_feature(CPU_FTR_HVMODE));
 @@ -539,12 +547,16 @@ static inline void __iomem *_cxl_p1_addr(struct cxl 
 *cxl, cxl_p1_reg_t reg)
  
  static inline void cxl_p1_write(struct cxl *cxl, cxl_p1_reg_t reg, u64 val)
  {
 - out_be64(_cxl_p1_addr(cxl, reg), val);
 + if (likely(cxl_adapter_link_ok(cxl)))
 + out_be64(_cxl_p1_addr(cxl, reg), val);
  }
  
  static inline u64 cxl_p1_read(struct cxl *cxl, cxl_p1_reg_t reg)
  {
 - return in_be64(_cxl_p1_addr(cxl, reg));
 + if (likely(cxl_adapter_link_ok(cxl)))
 + return in_be64(_cxl_p1_addr(cxl, reg));
 + else
 + return ~0ULL;
  }
  
  static inline void __iomem *_cxl_p1n_addr(struct cxl_afu *afu, cxl_p1n_reg_t 
 reg)
 @@ -555,12 +567,16 @@ static inline void __iomem *_cxl_p1n_addr(struct 
 cxl_afu *afu, cxl_p1n_reg_t reg
  
  static inline void cxl_p1n_write(struct cxl_afu *afu, cxl_p1n_reg_t reg, u64 
 val)
  {
 - out_be64(_cxl_p1n_addr(afu, reg), val);
 +	if (likely(cxl_adapter_link_ok(afu->adapter)))
 + out_be64(_cxl_p1n_addr(afu, reg), val);
  }
  
  static inline u64 cxl_p1n_read(struct cxl_afu *afu, cxl_p1n_reg_t reg)
  {
 - return in_be64(_cxl_p1n_addr(afu, reg));
 +	if (likely(cxl_adapter_link_ok(afu->adapter)))
 + return in_be64(_cxl_p1n_addr(afu, reg));
 + else
 + return ~0ULL;
  }
  
  static inline void __iomem *_cxl_p2n_addr(struct cxl_afu *afu, cxl_p2n_reg_t 
 reg)
 @@ -570,22 +586,34 @@ static inline void __iomem *_cxl_p2n_addr(struct 
 cxl_afu *afu, cxl_p2n_reg_t reg
  
  static inline void cxl_p2n_write(struct cxl_afu *afu, cxl_p2n_reg_t reg, u64 
 val)
  {
 - out_be64(_cxl_p2n_addr(afu, reg), val);
 +	if (likely(cxl_adapter_link_ok(afu->adapter)))
 + out_be64(_cxl_p2n_addr(afu, reg), val);
  }
  
  static inline u64 cxl_p2n_read(struct cxl_afu *afu, cxl_p2n_reg_t reg)
  {
 - return in_be64(_cxl_p2n_addr(afu, reg));
 +	if (likely(cxl_adapter_link_ok(afu->adapter)))
 + 

[PATCH 07/31] alpha/pci_iommu: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Use sg_phys() instead of virt_to_phys(sg_virt(sg)) so that we don't
require a kernel virtual address, and switch a few debug printfs to
print physical instead of virtual addresses.

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/alpha/kernel/pci_iommu.c | 36 +++-
 1 file changed, 15 insertions(+), 21 deletions(-)

diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
index eddee77..5d46b49 100644
--- a/arch/alpha/kernel/pci_iommu.c
+++ b/arch/alpha/kernel/pci_iommu.c
@@ -248,20 +248,17 @@ static int pci_dac_dma_supported(struct pci_dev *dev, u64 
mask)
until either pci_unmap_single or pci_dma_sync_single is performed.  */
 
 static dma_addr_t
-pci_map_single_1(struct pci_dev *pdev, void *cpu_addr, size_t size,
+pci_map_single_1(struct pci_dev *pdev, unsigned long paddr, size_t size,
 int dac_allowed)
 {
	struct pci_controller *hose = pdev ? pdev->sysdata : pci_isa_hose;
	dma_addr_t max_dma = pdev ? pdev->dma_mask : ISA_DMA_MASK;
struct pci_iommu_arena *arena;
long npages, dma_ofs, i;
-   unsigned long paddr;
dma_addr_t ret;
unsigned int align = 0;
	struct device *dev = pdev ? &pdev->dev : NULL;
 
-   paddr = __pa(cpu_addr);
-
 #if !DEBUG_NODIRECT
/* First check to see if we can use the direct map window.  */
	if (paddr + size + __direct_map_base - 1 <= max_dma
@@ -269,7 +266,7 @@ pci_map_single_1(struct pci_dev *pdev, void *cpu_addr, 
size_t size,
ret = paddr + __direct_map_base;
 
		DBGA2("pci_map_single: [%p,%zx] -> direct %llx from %pf\n",
-		      cpu_addr, size, ret, __builtin_return_address(0));
+		      paddr, size, ret, __builtin_return_address(0));
 
return ret;
}
@@ -280,7 +277,7 @@ pci_map_single_1(struct pci_dev *pdev, void *cpu_addr, 
size_t size,
ret = paddr + alpha_mv.pci_dac_offset;
 
		DBGA2("pci_map_single: [%p,%zx] -> DAC %llx from %pf\n",
-		      cpu_addr, size, ret, __builtin_return_address(0));
+		      paddr, size, ret, __builtin_return_address(0));
 
return ret;
}
@@ -309,15 +306,15 @@ pci_map_single_1(struct pci_dev *pdev, void *cpu_addr, 
size_t size,
return 0;
}
 
+	offset = paddr & ~PAGE_MASK;
	paddr &= PAGE_MASK;
	for (i = 0; i < npages; ++i, paddr += PAGE_SIZE)
		arena->ptes[i + dma_ofs] = mk_iommu_pte(paddr);
 
-	ret = arena->dma_base + dma_ofs * PAGE_SIZE;
-	ret += (unsigned long)cpu_addr & ~PAGE_MASK;
+	ret = arena->dma_base + dma_ofs * PAGE_SIZE + offset;
 
	DBGA2("pci_map_single: [%p,%zx] np %ld -> sg %llx from %pf\n",
-	      cpu_addr, size, npages, ret, __builtin_return_address(0));
+	      paddr, size, npages, ret, __builtin_return_address(0));
 
return ret;
 }
@@ -357,7 +354,7 @@ static dma_addr_t alpha_pci_map_page(struct device *dev, 
struct page *page,
BUG_ON(dir == PCI_DMA_NONE);
 
	dac_allowed = pdev ? pci_dac_dma_supported(pdev, pdev->dma_mask) : 0;
-   return pci_map_single_1(pdev, (char *)page_address(page) + offset, 
+   return pci_map_single_1(pdev, page_to_phys(page) + offset,
size, dac_allowed);
 }
 
@@ -453,7 +450,7 @@ try_again:
}
memset(cpu_addr, 0, size);
 
-   *dma_addrp = pci_map_single_1(pdev, cpu_addr, size, 0);
+   *dma_addrp = pci_map_single_1(pdev, __pa(cpu_addr), size, 0);
if (*dma_addrp == 0) {
free_pages((unsigned long)cpu_addr, order);
		if (alpha_mv.mv_pci_tbi || (gfp & GFP_DMA))
@@ -497,9 +494,6 @@ static void alpha_pci_free_coherent(struct device *dev, 
size_t size,
Write dma_length of each leader with the combined lengths of
the mergable followers.  */
 
-#define SG_ENT_VIRT_ADDRESS(SG) (sg_virt((SG)))
-#define SG_ENT_PHYS_ADDRESS(SG) __pa(SG_ENT_VIRT_ADDRESS(SG))
-
 static void
 sg_classify(struct device *dev, struct scatterlist *sg, struct scatterlist 
*end,
int virt_ok)
@@ -512,13 +506,13 @@ sg_classify(struct device *dev, struct scatterlist *sg, 
struct scatterlist *end,
leader = sg;
leader_flag = 0;
		leader_length = leader->length;
-   next_paddr = SG_ENT_PHYS_ADDRESS(leader) + leader_length;
+   next_paddr = sg_phys(leader) + leader_length;
 
/* we will not marge sg without device. */
max_seg_size = dev ? dma_get_max_seg_size(dev) : 0;
	for (++sg; sg < end; ++sg) {
		unsigned long addr, len;
-		addr = SG_ENT_PHYS_ADDRESS(sg);
+		addr = sg_phys(sg);
		len = sg->length;
 
		if (leader_length + len > max_seg_size)
@@ -555,7 +549,7 @@ sg_fill(struct device *dev, struct scatterlist *leader, 
struct scatterlist *end,
struct scatterlist *out, struct pci_iommu_arena *arena,

[PATCH 08/31] c6x: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Use sg_phys() instead of virt_to_phys(sg_virt(sg)) so that we don't
require a kernel virtual address.

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/c6x/kernel/dma.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/c6x/kernel/dma.c b/arch/c6x/kernel/dma.c
index ab7b12d..79cae03 100644
--- a/arch/c6x/kernel/dma.c
+++ b/arch/c6x/kernel/dma.c
@@ -68,8 +68,7 @@ int dma_map_sg(struct device *dev, struct scatterlist *sglist,
int i;
 
for_each_sg(sglist, sg, nents, i)
-		sg->dma_address = dma_map_single(dev, sg_virt(sg), sg->length,
-						 dir);
+		sg->dma_address = sg_phys(sg);
 
debug_dma_map_sg(dev, sglist, nents, nents, dir);
 
-- 
1.9.1


[PATCH 15/31] sparc32/iommu: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Pass a PFN to iommu_get_one instead of calculating it locally from a
page structure so that we don't need pages for every address we can
DMA to or from.

Also further restrict the cache flushing, as we now have a non-highmem
way of encountering physical addresses without a kernel virtual mapping.

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/sparc/mm/iommu.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/sparc/mm/iommu.c b/arch/sparc/mm/iommu.c
index 491511d..3ed53d7 100644
--- a/arch/sparc/mm/iommu.c
+++ b/arch/sparc/mm/iommu.c
@@ -174,7 +174,7 @@ static void iommu_flush_iotlb(iopte_t *iopte, unsigned int 
niopte)
}
 }
 
-static u32 iommu_get_one(struct device *dev, struct page *page, int npages)
+static u32 iommu_get_one(struct device *dev, unsigned long pfn, int npages)
 {
	struct iommu_struct *iommu = dev->archdata.iommu;
int ioptex;
@@ -183,7 +183,7 @@ static u32 iommu_get_one(struct device *dev, struct page 
*page, int npages)
int i;
 
/* page color = pfn of page */
-	ioptex = bit_map_string_get(&iommu->usemap, npages, page_to_pfn(page));
+	ioptex = bit_map_string_get(&iommu->usemap, npages, pfn);
	if (ioptex < 0)
		panic("iommu out");
	busa0 = iommu->start + (ioptex << PAGE_SHIFT);
@@ -192,11 +192,11 @@ static u32 iommu_get_one(struct device *dev, struct page 
*page, int npages)
busa = busa0;
iopte = iopte0;
	for (i = 0; i < npages; i++) {
-		iopte_val(*iopte) = MKIOPTE(page_to_pfn(page), IOPERM);
+		iopte_val(*iopte) = MKIOPTE(pfn, IOPERM);
		iommu_invalidate_page(iommu->regs, busa);
busa += PAGE_SIZE;
iopte++;
-   page++;
+   pfn++;
}
 
iommu_flush_iotlb(iopte0, npages);
@@ -214,7 +214,7 @@ static u32 iommu_get_scsi_one(struct device *dev, char 
*vaddr, unsigned int len)
	off = (unsigned long)vaddr & ~PAGE_MASK;
	npages = (off + len + PAGE_SIZE-1) >> PAGE_SHIFT;
	page = virt_to_page((unsigned long)vaddr & PAGE_MASK);
-   busa = iommu_get_one(dev, page, npages);
+   busa = iommu_get_one(dev, page_to_pfn(page), npages);
return busa + off;
 }
 
@@ -243,7 +243,7 @@ static void iommu_get_scsi_sgl_gflush(struct device *dev, 
struct scatterlist *sg
while (sz != 0) {
--sz;
		n = (sg->length + sg->offset + PAGE_SIZE-1) >> PAGE_SHIFT;
-		sg->dma_address = iommu_get_one(dev, sg_page(sg), n) + sg->offset;
+		sg->dma_address = iommu_get_one(dev, sg_pfn(sg), n) + sg->offset;
		sg->dma_length = sg->length;
sg = sg_next(sg);
}
@@ -264,7 +264,8 @@ static void iommu_get_scsi_sgl_pflush(struct device *dev, 
struct scatterlist *sg
 * XXX Is this a good assumption?
 * XXX What if someone else unmaps it here and races us?
 */
-   if ((page = (unsigned long) page_address(sg_page(sg))) != 0) {
+		if (sg_has_page(sg) &&
+		    (page = (unsigned long) page_address(sg_page(sg))) != 0) {
			for (i = 0; i < n; i++) {
if (page != oldpage) {  /* Already flushed? */
flush_page_for_dma(page);
@@ -274,7 +275,7 @@ static void iommu_get_scsi_sgl_pflush(struct device *dev, 
struct scatterlist *sg
}
}
 
-   sg-dma_address = iommu_get_one(dev, sg_page(sg), n) + 
sg-offset;
+   sg-dma_address = iommu_get_one(dev, sg_pfn(sg), n) + 
sg-offset;
sg-dma_length = sg-length;
sg = sg_next(sg);
}
-- 
1.9.1


[PATCH 16/31] s390: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Use sg_phys() instead of page_to_phys(sg_page(sg)) so that we don't
require a page structure for all DMA memory.
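
For context, sg_phys() at this point essentially boils down to the
following (paraphrased from include/linux/scatterlist.h), which is why
the conversion is mechanical:

	static inline dma_addr_t sg_phys(struct scatterlist *sg)
	{
		return page_to_phys(sg_page(sg)) + sg->offset;
	}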

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/s390/pci/pci_dma.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index 6fd8d58..aae5a47 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -272,14 +272,13 @@ int dma_set_mask(struct device *dev, u64 mask)
 }
 EXPORT_SYMBOL_GPL(dma_set_mask);
 
-static dma_addr_t s390_dma_map_pages(struct device *dev, struct page *page,
-				     unsigned long offset, size_t size,
+static dma_addr_t s390_dma_map_phys(struct device *dev, unsigned long pa,
+				    size_t size,
				     enum dma_data_direction direction,
				     struct dma_attrs *attrs)
 {
	struct zpci_dev *zdev = get_zdev(to_pci_dev(dev));
	unsigned long nr_pages, iommu_page_index;
-	unsigned long pa = page_to_phys(page) + offset;
	int flags = ZPCI_PTE_VALID;
	dma_addr_t dma_addr;
 
@@ -301,7 +300,7 @@ static dma_addr_t s390_dma_map_pages(struct device *dev, struct page *page,
 
	if (!dma_update_trans(zdev, pa, dma_addr, size, flags)) {
		atomic64_add(nr_pages, &zdev->mapped_pages);
-		return dma_addr + (offset & ~PAGE_MASK);
+		return dma_addr + (pa & ~PAGE_MASK);
	}
 
 out_free:
@@ -312,6 +311,16 @@ out_err:
	return DMA_ERROR_CODE;
 }
 
+static dma_addr_t s390_dma_map_pages(struct device *dev, struct page *page,
+				     unsigned long offset, size_t size,
+				     enum dma_data_direction direction,
+				     struct dma_attrs *attrs)
+{
+	unsigned long pa = page_to_phys(page) + offset;
+
+	return s390_dma_map_phys(dev, pa, size, direction, attrs);
+}
+
 static void s390_dma_unmap_pages(struct device *dev, dma_addr_t dma_addr,
				 size_t size, enum dma_data_direction direction,
				 struct dma_attrs *attrs)
@@ -384,8 +393,7 @@ static int s390_dma_map_sg(struct device *dev, struct scatterlist *sg,
	int i;
 
	for_each_sg(sg, s, nr_elements, i) {
-		struct page *page = sg_page(s);
-		s->dma_address = s390_dma_map_pages(dev, page, s->offset,
+		s->dma_address = s390_dma_map_phys(dev, sg_phys(s),
						    s->length, dir, NULL);
		if (!dma_mapping_error(dev, s->dma_address)) {
			s->dma_length = s->length;
-- 
1.9.1


[PATCH 22/31] metag: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Make all cache invalidation conditional on sg_has_page().
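
The conversion follows the same idiom as the other arch patches in this
series; roughly (a sketch of the pattern, not the exact metag code):

	static int example_map_sg(struct device *dev, struct scatterlist *sglist,
				  int nents, enum dma_data_direction direction)
	{
		struct scatterlist *sg;
		int i;

		for_each_sg(sglist, sg, nents, i) {
			sg->dma_address = sg_phys(sg);	/* valid with or without a page */
			if (sg_has_page(sg))		/* cache ops need a kernel vaddr */
				dma_sync_for_device(sg_virt(sg), sg->length, direction);
		}
		return nents;
	}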

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/metag/include/asm/dma-mapping.h | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/metag/include/asm/dma-mapping.h b/arch/metag/include/asm/dma-mapping.h
index eb5cdec..2ae9057 100644
--- a/arch/metag/include/asm/dma-mapping.h
+++ b/arch/metag/include/asm/dma-mapping.h
@@ -55,10 +55,9 @@ dma_map_sg(struct device *dev, struct scatterlist *sglist, int nents,
	WARN_ON(nents == 0 || sglist[0].length == 0);
 
	for_each_sg(sglist, sg, nents, i) {
-		BUG_ON(!sg_page(sg));
-
		sg->dma_address = sg_phys(sg);
-		dma_sync_for_device(sg_virt(sg), sg->length, direction);
+		if (sg_has_page(sg))
+			dma_sync_for_device(sg_virt(sg), sg->length, direction);
	}
 
	return nents;
@@ -94,10 +93,9 @@ dma_unmap_sg(struct device *dev, struct scatterlist *sglist, int nhwentries,
	WARN_ON(nhwentries == 0 || sglist[0].length == 0);
 
	for_each_sg(sglist, sg, nhwentries, i) {
-		BUG_ON(!sg_page(sg));
-
		sg->dma_address = sg_phys(sg);
-		dma_sync_for_cpu(sg_virt(sg), sg->length, direction);
+		if (sg_has_page(sg))
+			dma_sync_for_cpu(sg_virt(sg), sg->length, direction);
	}
 }
 
@@ -140,8 +138,10 @@ dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sglist, int nelems,
	int i;
	struct scatterlist *sg;
 
-	for_each_sg(sglist, sg, nelems, i)
-		dma_sync_for_cpu(sg_virt(sg), sg->length, direction);
+	for_each_sg(sglist, sg, nelems, i) {
+		if (sg_has_page(sg))
+			dma_sync_for_cpu(sg_virt(sg), sg->length, direction);
+	}
 }
 
 static inline void
@@ -151,8 +151,10 @@ dma_sync_sg_for_device(struct device *dev, struct scatterlist *sglist,
	int i;
	struct scatterlist *sg;
 
-	for_each_sg(sglist, sg, nelems, i)
-		dma_sync_for_device(sg_virt(sg), sg->length, direction);
+	for_each_sg(sglist, sg, nelems, i) {
+		if (sg_has_page(sg))
+			dma_sync_for_device(sg_virt(sg), sg->length, direction);
+	}
 }
 
 static inline int
-- 
1.9.1


Re: [PATCH 09/10] Define PERF_PMU_TXN_READ interface

2015-08-12 Thread Peter Zijlstra
On Tue, Aug 11, 2015 at 09:14:00PM -0700, Sukadev Bhattiprolu wrote:
 | +static void __perf_read_group_add(struct perf_event *leader, u64 read_format, u64 *values)
 |  {
 | +   struct perf_event *sub;
 | +   int n = 1; /* skip @nr */
 
 This n = 1 is to skip over the values[0] = 1 + nr_siblings in the
 caller.
 
 Anyway, in __perf_read_group_add() we always start with n = 1, however
 ...
 | 
 | +   perf_event_read(leader, true);
 | +
 | +   /*
 | +* Since we co-schedule groups, {enabled,running} times of siblings
 | +* will be identical to those of the leader, so we only publish one
 | +* set.
 | +*/
 | +   if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED) {
 | +           values[n++] += leader->total_time_enabled +
 | +                   atomic64_read(&leader->child_total_time_enabled);

Note how this is an in-place addition,

 | +   }
 | 
 | +   if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING) {
 | +           values[n++] += leader->total_time_running +
 | +                   atomic64_read(&leader->child_total_time_running);

and here,

 | +   }
 | 
 | +   /*
 | +* Write {count,id} tuples for every sibling.
 | +*/
 | +   values[n++] += perf_event_count(leader);

and here,


 | 	if (read_format & PERF_FORMAT_ID)
 | 		values[n++] = primary_event_id(leader);

and this will always assign the same value.

 | +   list_for_each_entry(sub, &leader->sibling_list, group_entry) {
 | +           values[n++] += perf_event_count(sub);
 | +           if (read_format & PERF_FORMAT_ID)
 | +                   values[n++] = primary_event_id(sub);

Same for these, therefore,

 | +   }
 | +}
 | 
 | +static int perf_read_group(struct perf_event *event,
 | +  u64 read_format, char __user *buf)
 | +{
 | +   struct perf_event *leader = event->group_leader, *child;
 | +   struct perf_event_context *ctx = leader->ctx;
 | +   int ret = leader->read_size;
 | +   u64 *values;
 | 
 | +   lockdep_assert_held(&ctx->mutex);
 | 
 | +   values = kzalloc(event->read_size, GFP_KERNEL);
 | +   if (!values)
 | +   return -ENOMEM;
 | 
 | +   values[0] = 1 + leader->nr_siblings;
 | 
 | +   /*
 | +* By locking the child_mutex of the leader we effectively
 | +* lock the child list of all siblings.. XXX explain how.
 | +*/
 | +   mutex_lock(&leader->child_mutex);
 | 
 | +   __perf_read_group_add(leader, read_format, values);
 
 ... we don't copy_to_user() here,
 
 | +   list_for_each_entry(child, &leader->child_list, child_list)
 | +   __perf_read_group_add(child, read_format, values);
 
 so won't we overwrite the values[], if we always start at n = 1
 in __perf_read_group_add()?

Yes and no. We have to re-iterate the same values[] slots for each child,
as they all have the same group layout, but we add the time and count
fields in rather than overwrite them. The _add() suffix was supposed to be
a hint ;-)

 | +   mutex_unlock(&leader->child_mutex);
 | +
 | +   if (copy_to_user(buf, values, event->read_size))
 | +   ret = -EFAULT;
 | +
 | +   kfree(values);
 | 
 | return ret;
 |  }

Previously we would iterate the group and, for each member, sum all of its
child values before copying the value out. Now, because we need to read
groups together, we must instead iterate the child list and sum whole
groups at a time.
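
To illustrate the add-in-place semantics, a userspace toy model (not
kernel code; the slot layout assumes one sibling and PERF_FORMAT_ID):

	#include <stdio.h>

	/* Each pass adds counts into the same slots; ids are merely
	 * re-assigned, so visiting every child never corrupts values[].
	 */
	static void toy_read_group_add(long long *values, long long count,
				       long long id)
	{
		int n = 1;		/* skip values[0] = 1 + nr_siblings */
		values[n++] += count;	/* counts accumulate across leader + children */
		values[n++] = id;	/* same id every pass; assignment is idempotent */
	}

	int main(void)
	{
		long long values[3] = { 2, 0, 0 };	/* 1 + nr_siblings */

		toy_read_group_add(values, 10, 42);	/* the leader itself */
		toy_read_group_add(values, 5, 42);	/* one child: counts sum */
		printf("nr=%lld count=%lld id=%lld\n",
		       values[0], values[1], values[2]);	/* nr=2 count=15 id=42 */
		return 0;
	}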


[PATCH] powerpc/xmon: Allow limiting the size of the paca display

2015-08-12 Thread Michael Ellerman
The paca display is already more than 24 lines, which can be problematic
if you have an old school 80x24 terminal, or more likely you are on a
virtual terminal which does not scroll for whatever reason.

We'd like to expand the paca display even more, so add a way to limit
the number of lines that are displayed.

This adds a third form of 'dp' which is 'dp # #', where the first number
is the cpu, and the second is the number of lines to display.

Example output:

  5:mon dp 3 6
  paca for cpu 0x3 @ cfdc0d80:
   possible = yes
   present  = yes
   online   = yes
   lock_token   = 0x8000   (0xa)
   paca_index   = 0x3   (0x8)

Signed-off-by: Michael Ellerman m...@ellerman.id.au
---
 arch/powerpc/xmon/xmon.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index e599259d84fc..6f44e9c07f34 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -205,6 +205,7 @@ Commands:\n\
 #ifdef CONFIG_PPC64
   "\
   dp[#]	dump paca for current cpu, or cpu #\n\
+  dp##	dump paca for cpu #, only # lines\n\
   dpa	dump paca for all possible cpus\n"
 #endif
   "\
@@ -2070,9 +2071,10 @@ static void xmon_rawdump (unsigned long adrs, long ndump)
 }
 
 #ifdef CONFIG_PPC64
-static void dump_one_paca(int cpu)
+static void dump_one_paca(int cpu, int num_lines)
 {
	struct paca_struct *p;
+	int i;
 
	if (setjmp(bus_error_jmp) != 0) {
		printf("*** Error dumping paca for cpu 0x%x!\n", cpu);
@@ -2090,9 +2092,12 @@ static void dump_one_paca(int cpu)
	printf(" %-*s = %s\n", 16, "present", cpu_present(cpu) ? "yes" : "no");
	printf(" %-*s = %s\n", 16, "online", cpu_online(cpu) ? "yes" : "no");
 
+	i = 4;	/* We always print the first four lines */
+
 #define DUMP(paca, name, format) \
-	printf(" %-*s = %#-*format\t(0x%lx)\n", 16, #name, 18, paca->name, \
-		offsetof(struct paca_struct, name));
+	if (!num_lines || i++ < num_lines)	\
+		printf(" %-*s = %#-*format\t(0x%lx)\n", 16, #name, 18, \
+		       paca->name, offsetof(struct paca_struct, name));
 
	DUMP(p, lock_token, "x");
	DUMP(p, paca_index, "x");
@@ -2135,7 +2140,7 @@ static void dump_all_pacas(void)
	}
 
	for_each_possible_cpu(cpu)
-		dump_one_paca(cpu);
+		dump_one_paca(cpu, 0);
 }
 
 static void dump_pacas(void)
@@ -2151,10 +2156,12 @@ static void dump_pacas(void)
 
	termch = c;	/* Put c back, it wasn't 'a' */
 
-	if (scanhex(&num))
-		dump_one_paca(num);
-	else
-		dump_one_paca(xmon_owner);
+	if (scanhex(&num)) {
+		unsigned long lines = 0;
+		scanhex(&lines);
+		dump_one_paca(num, lines);
+	} else
+		dump_one_paca(xmon_owner, 0);
 }
 #endif
 
-- 
2.1.4


[PATCH 03/31] dma-debug: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Use sg_pfn() to get the PFN and skip checks that require a kernel
virtual address.

Signed-off-by: Christoph Hellwig h...@lst.de
---
 lib/dma-debug.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/dma-debug.c b/lib/dma-debug.c
index dace71f..a215a80 100644
--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -1368,7 +1368,7 @@ void debug_dma_map_sg(struct device *dev, struct scatterlist *sg,
 
		entry->type           = dma_debug_sg;
		entry->dev            = dev;
-		entry->pfn            = page_to_pfn(sg_page(s));
+		entry->pfn            = sg_pfn(s);
		entry->offset         = s->offset,
		entry->size           = sg_dma_len(s);
		entry->dev_addr       = sg_dma_address(s);
@@ -1376,7 +1376,7 @@ void debug_dma_map_sg(struct device *dev, struct scatterlist *sg,
		entry->sg_call_ents   = nents;
		entry->sg_mapped_ents = mapped_ents;
 
-		if (!PageHighMem(sg_page(s))) {
+		if (sg_has_page(s) && !PageHighMem(sg_page(s))) {
			check_for_stack(dev, sg_virt(s));
			check_for_illegal_area(dev, sg_virt(s), sg_dma_len(s));
		}
@@ -1419,7 +1419,7 @@ void debug_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
		struct dma_debug_entry ref = {
			.type           = dma_debug_sg,
			.dev            = dev,
-			.pfn            = page_to_pfn(sg_page(s)),
+			.pfn            = sg_pfn(s),
			.offset         = s->offset,
			.dev_addr       = sg_dma_address(s),
			.size           = sg_dma_len(s),
@@ -1580,7 +1580,7 @@ void debug_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
		struct dma_debug_entry ref = {
			.type           = dma_debug_sg,
			.dev            = dev,
-			.pfn            = page_to_pfn(sg_page(s)),
+			.pfn            = sg_pfn(s),
			.offset         = s->offset,
			.dev_addr       = sg_dma_address(s),
			.size           = sg_dma_len(s),
@@ -1613,7 +1613,7 @@ void debug_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
		struct dma_debug_entry ref = {
			.type           = dma_debug_sg,
			.dev            = dev,
-			.pfn            = page_to_pfn(sg_page(s)),
+			.pfn            = sg_pfn(s),
			.offset         = s->offset,
			.dev_addr       = sg_dma_address(s),
			.size           = sg_dma_len(s),
-- 
1.9.1


[PATCH 13/31] sparc/ldc: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Use

	sg_phys(sg) & PAGE_MASK

instead of

	page_to_pfn(sg_page(sg)) << PAGE_SHIFT

to get at the page-aligned physical address of a SG entry, so that
we don't require a page backing for SG entries.
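
The two forms agree because sg_phys(sg) is page_to_phys(sg_page(sg)) +
sg->offset, and the mask drops the in-page offset again (sketch,
assuming sg->offset < PAGE_SIZE):

	/* sg_phys(sg) & PAGE_MASK
	 *   == (page_to_phys(sg_page(sg)) + sg->offset) & PAGE_MASK
	 *   == page_to_phys(sg_page(sg))
	 *   == page_to_pfn(sg_page(sg)) << PAGE_SHIFT
	 */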

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/sparc/kernel/ldc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/sparc/kernel/ldc.c b/arch/sparc/kernel/ldc.c
index 1ae5eb1..0a29974 100644
--- a/arch/sparc/kernel/ldc.c
+++ b/arch/sparc/kernel/ldc.c
@@ -2051,7 +2051,7 @@ static void fill_cookies(struct cookie_state *sp, unsigned long pa,
 
 static int sg_count_one(struct scatterlist *sg)
 {
-	unsigned long base = page_to_pfn(sg_page(sg)) << PAGE_SHIFT;
+	unsigned long base = sg_phys(sg) & PAGE_MASK;
	long len = sg->length;
 
	if ((sg->offset | len) & (8UL - 1))
@@ -2114,7 +2114,7 @@ int ldc_map_sg(struct ldc_channel *lp,
	state.nc = 0;
 
	for_each_sg(sg, s, num_sg, i) {
-		fill_cookies(&state, page_to_pfn(sg_page(s)) << PAGE_SHIFT,
+		fill_cookies(&state, sg_phys(s) & PAGE_MASK,
			     s->offset, s->length);
}
 
-- 
1.9.1


[PATCH 19/31] arc: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Make all cache invalidation conditional on sg_has_page() and use
sg_phys to get the physical address directly.

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/arc/include/asm/dma-mapping.h | 26 +++---
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/arch/arc/include/asm/dma-mapping.h b/arch/arc/include/asm/dma-mapping.h
index 2d28ba9..42eb526 100644
--- a/arch/arc/include/asm/dma-mapping.h
+++ b/arch/arc/include/asm/dma-mapping.h
@@ -108,9 +108,13 @@ dma_map_sg(struct device *dev, struct scatterlist *sg,
	struct scatterlist *s;
	int i;
 
-	for_each_sg(sg, s, nents, i)
-		s->dma_address = dma_map_page(dev, sg_page(s), s->offset,
-					      s->length, dir);
+	for_each_sg(sg, s, nents, i) {
+		if (sg_has_page(s)) {
+			_dma_cache_sync((unsigned long)sg_virt(s), s->length,
+					dir);
+		}
+		s->dma_address = sg_phys(s);
+	}
 
	return nents;
 }
@@ -163,8 +167,12 @@ dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sglist, int nelems,
	int i;
	struct scatterlist *sg;
 
-	for_each_sg(sglist, sg, nelems, i)
-		_dma_cache_sync((unsigned int)sg_virt(sg), sg->length, dir);
+	for_each_sg(sglist, sg, nelems, i) {
+		if (sg_has_page(sg)) {
+			_dma_cache_sync((unsigned int)sg_virt(sg), sg->length,
+					dir);
+		}
+	}
 }
 
 static inline void
@@ -174,8 +182,12 @@ dma_sync_sg_for_device(struct device *dev, struct scatterlist *sglist,
	int i;
	struct scatterlist *sg;
 
-	for_each_sg(sglist, sg, nelems, i)
-		_dma_cache_sync((unsigned int)sg_virt(sg), sg->length, dir);
+	for_each_sg(sglist, sg, nelems, i) {
+		if (sg_has_page(sg)) {
+			_dma_cache_sync((unsigned int)sg_virt(sg), sg->length,
+					dir);
+		}
+	}
 }
 
 static inline int dma_supported(struct device *dev, u64 dma_mask)
-- 
1.9.1


[PATCH 20/31] avr32: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Make all cache invalidation conditional on sg_has_page() and use
sg_phys to get the physical address directly, bypassing the noop
page_to_bus.

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/avr32/include/asm/dma-mapping.h | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/avr32/include/asm/dma-mapping.h b/arch/avr32/include/asm/dma-mapping.h
index ae7ac92..a662ce2 100644
--- a/arch/avr32/include/asm/dma-mapping.h
+++ b/arch/avr32/include/asm/dma-mapping.h
@@ -216,11 +216,9 @@ dma_map_sg(struct device *dev, struct scatterlist *sglist, int nents,
	struct scatterlist *sg;
 
	for_each_sg(sglist, sg, nents, i) {
-		char *virt;
-
-		sg->dma_address = page_to_bus(sg_page(sg)) + sg->offset;
-		virt = sg_virt(sg);
-		dma_cache_sync(dev, virt, sg->length, direction);
+		sg->dma_address = sg_phys(sg);
+		if (sg_has_page(sg))
+			dma_cache_sync(dev, sg_virt(sg), sg->length, direction);
	}
 
	return nents;
@@ -328,8 +326,10 @@ dma_sync_sg_for_device(struct device *dev, struct scatterlist *sglist,
	int i;
	struct scatterlist *sg;
 
-	for_each_sg(sglist, sg, nents, i)
-		dma_cache_sync(dev, sg_virt(sg), sg->length, direction);
+	for_each_sg(sglist, sg, nents, i) {
+		if (sg_has_page(sg))
+			dma_cache_sync(dev, sg_virt(sg), sg->length, direction);
+	}
 }
 
 /* Now for the API extensions over the pci_ one */
-- 
1.9.1


Re: [PATCH] powerpc/xmon: Allow limiting the size of the paca display

2015-08-12 Thread Anshuman Khandual
On 08/12/2015 12:27 PM, Michael Ellerman wrote:
 The paca display is already more than 24 lines, which can be problematic
 if you have an old school 80x24 terminal, or more likely you are on a
 virtual terminal which does not scroll for whatever reason.
 
 We'd like to expand the paca display even more, so add a way to limit
 the number of lines that are displayed.
 
 This adds a third form of 'dp' which is 'dp # #', where the first number
 is the cpu, and the second is the number of lines to display.
 
 Example output:
 
   5:mon dp 3 6
   paca for cpu 0x3 @ cfdc0d80:
possible = yes
present  = yes
online   = yes
lock_token   = 0x8000  (0xa)
paca_index   = 0x3 (0x8)
 
 Signed-off-by: Michael Ellerman m...@ellerman.id.au
 ---
  arch/powerpc/xmon/xmon.c | 23 +++
  1 file changed, 15 insertions(+), 8 deletions(-)
 
 diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
 index e599259d84fc..6f44e9c07f34 100644
 --- a/arch/powerpc/xmon/xmon.c
 +++ b/arch/powerpc/xmon/xmon.c
 @@ -205,6 +205,7 @@ Commands:\n\
  #ifdef CONFIG_PPC64
\
dp[#]  dump paca for current cpu, or cpu #\n\
 +  dp##  dump paca for cpu #, only # lines\n\
dpadump paca for all possible cpus\n
  #endif
\
 @@ -2070,9 +2071,10 @@ static void xmon_rawdump (unsigned long adrs, long 
 ndump)
  }
  
  #ifdef CONFIG_PPC64
 -static void dump_one_paca(int cpu)
 +static void dump_one_paca(int cpu, int num_lines)
  {
   struct paca_struct *p;
 + int i;
  
   if (setjmp(bus_error_jmp) != 0) {
   printf(*** Error dumping paca for cpu 0x%x!\n, cpu);
 @@ -2090,9 +2092,12 @@ static void dump_one_paca(int cpu)
   printf( %-*s = %s\n, 16, present, cpu_present(cpu) ? yes : no);
   printf( %-*s = %s\n, 16, online, cpu_online(cpu) ? yes : no);
  
 + i = 4; /* We always print the first four lines */
 +
  #define DUMP(paca, name, format) \
 - printf( %-*s = %#-*format\t(0x%lx)\n, 16, #name, 18, paca-name, \
 - offsetof(struct paca_struct, name));
  +	if (!num_lines || i++ < num_lines)

All looks good except that we are using 0 to signify there is no limit on
the number of lines. Isn't that a bit confusing?


Re: [PATCH 8/8] powerpc/xmon: Add some more elements to the existing PACA dump list

2015-08-12 Thread Michael Ellerman
On Wed, 2015-07-29 at 12:40 +0530, Anshuman Khandual wrote:
This patch adds a set of new elements, listed below, to the existing PACA
dump list inside an xmon session, improving the overall xmon debug
support.
 
 (1) hmi_event_available
 (2) dscr_default
 (3) vmalloc_sllp
 (4) slb_cache_ptr
 (5) sprg_vdso
 (6) tm_scratch
 (7) core_idle_state_ptr
 (8) thread_idle_state
 (9) thread_mask
 (10) slb_shadow
 (11) pgd
 (12) kernel_pgd
 (13) tcd_ptr
 (14) mc_kstack
 (15) crit_kstack
 (16) dbg_kstack
 (17) user_time
 (18) system_time
 (19) user_time_scaled
 (20) starttime
 (21) starttime_user
 (22) startspurr
 (23) utime_sspurr
 (24) stolen_time

Adding these makes the paca display much longer than 24 lines. I know in
general we don't worry too much about folks on 80x24 green screens, but it's
nice if xmon works OK on those. Or on virtual consoles that don't scroll for
whatever reason.

So I'm going to hold off on this one until we have a way to display some of the
paca. I have an idea for that and will send a patch if it works.

cheers



Re: [PATCH v3 01/11] cxl: Convert MMIO read/write macros to inline functions

2015-08-12 Thread Cyril Bur
On Wed, 12 Aug 2015 10:48:10 +1000
Daniel Axtens d...@axtens.net wrote:

 We're about to make these more complex, so make them functions
 first.
 

Reviewed-by: Cyril Bur cyril...@gmail.com

 Signed-off-by: Daniel Axtens d...@axtens.net
 ---
  drivers/misc/cxl/cxl.h | 51 
 ++
  1 file changed, 35 insertions(+), 16 deletions(-)
 
 diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
 index 4fd66cabde1e..6a93bfbcd826 100644
 --- a/drivers/misc/cxl/cxl.h
 +++ b/drivers/misc/cxl/cxl.h
 @@ -537,10 +537,15 @@ static inline void __iomem *_cxl_p1_addr(struct cxl *cxl, cxl_p1_reg_t reg)
 	return cxl->p1_mmio + cxl_reg_off(reg);
  }
  
 -#define cxl_p1_write(cxl, reg, val) \
 -	out_be64(_cxl_p1_addr(cxl, reg), val)
 -#define cxl_p1_read(cxl, reg) \
 -	in_be64(_cxl_p1_addr(cxl, reg))
 +static inline void cxl_p1_write(struct cxl *cxl, cxl_p1_reg_t reg, u64 val)
 +{
 +	out_be64(_cxl_p1_addr(cxl, reg), val);
 +}
 +
 +static inline u64 cxl_p1_read(struct cxl *cxl, cxl_p1_reg_t reg)
 +{
 +	return in_be64(_cxl_p1_addr(cxl, reg));
 +}
  
  static inline void __iomem *_cxl_p1n_addr(struct cxl_afu *afu, cxl_p1n_reg_t reg)
  {
 @@ -548,26 +553,40 @@ static inline void __iomem *_cxl_p1n_addr(struct cxl_afu *afu, cxl_p1n_reg_t reg
 	return afu->p1n_mmio + cxl_reg_off(reg);
  }
  
 -#define cxl_p1n_write(afu, reg, val) \
 -	out_be64(_cxl_p1n_addr(afu, reg), val)
 -#define cxl_p1n_read(afu, reg) \
 -	in_be64(_cxl_p1n_addr(afu, reg))
 +static inline void cxl_p1n_write(struct cxl_afu *afu, cxl_p1n_reg_t reg, u64 val)
 +{
 +	out_be64(_cxl_p1n_addr(afu, reg), val);
 +}
 +
 +static inline u64 cxl_p1n_read(struct cxl_afu *afu, cxl_p1n_reg_t reg)
 +{
 +	return in_be64(_cxl_p1n_addr(afu, reg));
 +}
  
  static inline void __iomem *_cxl_p2n_addr(struct cxl_afu *afu, cxl_p2n_reg_t reg)
  {
 	return afu->p2n_mmio + cxl_reg_off(reg);
  }
  
 -#define cxl_p2n_write(afu, reg, val) \
 -	out_be64(_cxl_p2n_addr(afu, reg), val)
 -#define cxl_p2n_read(afu, reg) \
 -	in_be64(_cxl_p2n_addr(afu, reg))
 +static inline void cxl_p2n_write(struct cxl_afu *afu, cxl_p2n_reg_t reg, u64 val)
 +{
 +	out_be64(_cxl_p2n_addr(afu, reg), val);
 +}
  
 +static inline u64 cxl_p2n_read(struct cxl_afu *afu, cxl_p2n_reg_t reg)
 +{
 +	return in_be64(_cxl_p2n_addr(afu, reg));
 +}
  
 -#define cxl_afu_cr_read64(afu, cr, off) \
 -	in_le64((afu)->afu_desc_mmio + (afu)->crs_offset + ((cr) * (afu)->crs_len) + (off))
 -#define cxl_afu_cr_read32(afu, cr, off) \
 -	in_le32((afu)->afu_desc_mmio + (afu)->crs_offset + ((cr) * (afu)->crs_len) + (off))
 +static inline u64 cxl_afu_cr_read64(struct cxl_afu *afu, int cr, u64 off)
 +{
 +	return in_le64((afu)->afu_desc_mmio + (afu)->crs_offset + ((cr) * (afu)->crs_len) + (off));
 +}
 +
 +static inline u32 cxl_afu_cr_read32(struct cxl_afu *afu, int cr, u64 off)
 +{
 +	return in_le32((afu)->afu_desc_mmio + (afu)->crs_offset + ((cr) * (afu)->crs_len) + (off));
 +}
  u16 cxl_afu_cr_read16(struct cxl_afu *afu, int cr, u64 off);
  u8 cxl_afu_cr_read8(struct cxl_afu *afu, int cr, u64 off);
  


Re: [PATCH v3 10/11] cxl: EEH support

2015-08-12 Thread Cyril Bur
On Wed, 12 Aug 2015 10:48:19 +1000
Daniel Axtens d...@axtens.net wrote:

 EEH (Enhanced Error Handling) allows a driver to recover from the
 temporary failure of an attached PCI card. Enable basic CXL support
 for EEH.
 

Looks like the only change since the last version was the removal of the
#ifdef, if that is correct.

Reviewed-by: Cyril Bur cyril...@gmail.com

 Signed-off-by: Daniel Axtens d...@axtens.net
 ---
  drivers/misc/cxl/cxl.h  |   1 +
  drivers/misc/cxl/pci.c  | 252 
 
  drivers/misc/cxl/vphb.c |   8 ++
  3 files changed, 261 insertions(+)
 
 diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
 index cda02412b01e..6f5386653dae 100644
 --- a/drivers/misc/cxl/cxl.h
 +++ b/drivers/misc/cxl/cxl.h
 @@ -726,6 +726,7 @@ int cxl_psl_purge(struct cxl_afu *afu);
  
  void cxl_stop_trace(struct cxl *cxl);
  int cxl_pci_vphb_add(struct cxl_afu *afu);
 +void cxl_pci_vphb_reconfigure(struct cxl_afu *afu);
  void cxl_pci_vphb_remove(struct cxl_afu *afu);
  
  extern struct pci_driver cxl_pci_driver;
 diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
 index b4a68a896a33..1eb26a357ce0 100644
 --- a/drivers/misc/cxl/pci.c
 +++ b/drivers/misc/cxl/pci.c
 @@ -24,6 +24,7 @@
  #include asm/io.h
  
  #include cxl.h
 +#include misc/cxl.h
  
  
  #define CXL_PCI_VSEC_ID  0x1280
 @@ -1246,10 +1247,261 @@ static void cxl_remove(struct pci_dev *dev)
   cxl_remove_adapter(adapter);
  }
  
 +static pci_ers_result_t cxl_vphb_error_detected(struct cxl_afu *afu,
 + pci_channel_state_t state)
 +{
 + struct pci_dev *afu_dev;
 + pci_ers_result_t result = PCI_ERS_RESULT_NEED_RESET;
 + pci_ers_result_t afu_result = PCI_ERS_RESULT_NEED_RESET;
 +
 + /* There should only be one entry, but go through the list
 +  * anyway
 +  */
 +	list_for_each_entry(afu_dev, &afu->phb->bus->devices, bus_list) {
 +		if (!afu_dev->driver)
 +			continue;
 +
 +		afu_dev->error_state = state;
 +
 +		if (afu_dev->driver->err_handler)
 +			afu_result = afu_dev->driver->err_handler->error_detected(afu_dev,
 +										   state);
 +		/* Disconnect trumps all, NONE trumps NEED_RESET */
 +		if (afu_result == PCI_ERS_RESULT_DISCONNECT)
 +			result = PCI_ERS_RESULT_DISCONNECT;
 +		else if ((afu_result == PCI_ERS_RESULT_NONE) &&
 +			 (result == PCI_ERS_RESULT_NEED_RESET))
 +			result = PCI_ERS_RESULT_NONE;
 + }
 + return result;
 +}
 +
 +static pci_ers_result_t cxl_pci_error_detected(struct pci_dev *pdev,
 +pci_channel_state_t state)
 +{
 + struct cxl *adapter = pci_get_drvdata(pdev);
 + struct cxl_afu *afu;
 + pci_ers_result_t result = PCI_ERS_RESULT_NEED_RESET;
 + int i;
 +
 + /* At this point, we could still have an interrupt pending.
 +  * Let's try to get them out of the way before they do
 +  * anything we don't like.
 +  */
 + schedule();
 +
 + /* If we're permanently dead, give up. */
 + if (state == pci_channel_io_perm_failure) {
 + /* Tell the AFU drivers; but we don't care what they
 +  * say, we're going away.
 +  */
 +		for (i = 0; i < adapter->slices; i++) {
 +			afu = adapter->afu[i];
 +			cxl_vphb_error_detected(afu, state);
 +		}
 + return PCI_ERS_RESULT_DISCONNECT;
 + }
 +
 + /* Are we reflashing?
 +  *
 +  * If we reflash, we could come back as something entirely
 +  * different, including a non-CAPI card. As such, by default
 +  * we don't participate in the process. We'll be unbound and
 +  * the slot re-probed. (TODO: check EEH doesn't blindly rebind
 +  * us!)
 +  *
 +  * However, this isn't the entire story: for reliablity
 +  * reasons, we usually want to reflash the FPGA on PERST in
 +  * order to get back to a more reliable known-good state.
 +  *
 +  * This causes us a bit of a problem: if we reflash we can't
 +  * trust that we'll come back the same - we could have a new
 +  * image and been PERSTed in order to load that
 +  * image. However, most of the time we actually *will* come
 +  * back the same - for example a regular EEH event.
 +  *
 +  * Therefore, we allow the user to assert that the image is
 +  * indeed the same and that we should continue on into EEH
 +  * anyway.
 +  */
 +	if (adapter->perst_loads_image && !adapter->perst_same_image) {
 +		/* TODO take the PHB out of CXL mode */
 +		dev_info(&pdev->dev, "reflashing, so opting out of EEH!\n");
 +		return PCI_ERS_RESULT_NONE;
 + }
 +
 + /*
 +  * At this point, we want to try to recover.  We'll always
 + 

[PATCH 04/31] x86/pci-nommu: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Just remove a BUG_ON, the code handles them just fine as-is.

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/x86/kernel/pci-nommu.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/kernel/pci-nommu.c b/arch/x86/kernel/pci-nommu.c
index da15918..a218059 100644
--- a/arch/x86/kernel/pci-nommu.c
+++ b/arch/x86/kernel/pci-nommu.c
@@ -63,7 +63,6 @@ static int nommu_map_sg(struct device *hwdev, struct scatterlist *sg,
	WARN_ON(nents == 0 || sg[0].length == 0);
 
	for_each_sg(sg, s, nents, i) {
-		BUG_ON(!sg_page(s));
		s->dma_address = sg_phys(s);
		if (!check_addr("map_sg", hwdev, s->dma_address, s->length))
			return 0;
-- 
1.9.1


[PATCH 09/31] ia64/pci_dma: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Use sg_phys() instead of virt_to_phys(sg_virt(sg)) so that we don't
require a kernel virtual address.

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/ia64/sn/pci/pci_dma.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/ia64/sn/pci/pci_dma.c b/arch/ia64/sn/pci/pci_dma.c
index d0853e8..8f713c8 100644
--- a/arch/ia64/sn/pci/pci_dma.c
+++ b/arch/ia64/sn/pci/pci_dma.c
@@ -18,9 +18,6 @@
 #include <asm/sn/pcidev.h>
 #include <asm/sn/sn_sal.h>
 
-#define SG_ENT_VIRT_ADDRESS(sg)	(sg_virt((sg)))
-#define SG_ENT_PHYS_ADDRESS(SG)	virt_to_phys(SG_ENT_VIRT_ADDRESS(SG))
-
 /**
  * sn_dma_supported - test a DMA mask
  * @dev: device to test
@@ -291,7 +288,7 @@ static int sn_dma_map_sg(struct device *dev, struct scatterlist *sgl,
	 */
	for_each_sg(sgl, sg, nhwentries, i) {
		dma_addr_t dma_addr;
-		phys_addr = SG_ENT_PHYS_ADDRESS(sg);
+		phys_addr = sg_phys(sg);
		if (dmabarr)
			dma_addr = provider->dma_map_consistent(pdev,
								phys_addr,
-- 
1.9.1


[PATCH 26/31] openrisc: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Make all cache invalidation conditional on sg_has_page() and use
sg_phys to get the physical address directly.

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/openrisc/kernel/dma.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/openrisc/kernel/dma.c b/arch/openrisc/kernel/dma.c
index 0b77ddb..94ed052 100644
--- a/arch/openrisc/kernel/dma.c
+++ b/arch/openrisc/kernel/dma.c
@@ -184,8 +184,13 @@ or1k_map_sg(struct device *dev, struct scatterlist *sg,
	int i;
 
	for_each_sg(sg, s, nents, i) {
-		s->dma_address = or1k_map_page(dev, sg_page(s), s->offset,
-					       s->length, dir, NULL);
+		if (sg_has_page(s)) {
+			s->dma_address = or1k_map_page(dev, sg_page(s),
+					s->offset, s->length, dir,
+					NULL);
+		} else {
+			s->dma_address = sg_phys(s);
+		}
	}
 
	return nents;
-- 
1.9.1


[PATCH 31/31] dma-mapping-common: skip kmemleak checks for page-less SG entries

2015-08-12 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig h...@lst.de
---
 include/asm-generic/dma-mapping-common.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/dma-mapping-common.h b/include/asm-generic/dma-mapping-common.h
index 940d5ec..afc3eaf 100644
--- a/include/asm-generic/dma-mapping-common.h
+++ b/include/asm-generic/dma-mapping-common.h
@@ -51,8 +51,10 @@ static inline int dma_map_sg_attrs(struct device *dev, struct scatterlist *sg,
	int i, ents;
	struct scatterlist *s;
 
-	for_each_sg(sg, s, nents, i)
-		kmemcheck_mark_initialized(sg_virt(s), s->length);
+	for_each_sg(sg, s, nents, i) {
+		if (sg_has_page(s))
+			kmemcheck_mark_initialized(sg_virt(s), s->length);
+	}
	BUG_ON(!valid_dma_direction(dir));
	ents = ops->map_sg(dev, sg, nents, dir, attrs);
	BUG_ON(ents < 0);
-- 
1.9.1


Re: [PATCH v3 06/11] cxl: Refactor adaptor init/teardown

2015-08-12 Thread Cyril Bur
On Wed, 12 Aug 2015 10:48:15 +1000
Daniel Axtens d...@axtens.net wrote:

 Some aspects of initialisation are done only once in the lifetime of
 an adapter: for example, allocating memory for the adapter,
 allocating the adapter number, or setting up sysfs/debugfs files.
 
 However, we may want to be able to do some parts of the
 initialisation multiple times: for example, in error recovery we
 want to be able to tear down and then re-map IO memory and IRQs.
 
 Therefore, refactor CXL init/teardown as follows.
 
  - Keep the overarching functions 'cxl_init_adapter' and its pair,
'cxl_remove_adapter'.
 
  - Move all 'once only' allocation/freeing steps to the existing
'cxl_alloc_adapter' function, and its pair 'cxl_release_adapter'
(This involves moving allocation of the adapter number out of
cxl_init_adapter.)
 
  - Create two new functions: 'cxl_configure_adapter', and its pair
'cxl_deconfigure_adapter'. These two functions 'wire up' the
hardware --- they (de)configure resources that do not need to
last the entire lifetime of the adapter
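 
 The practical payoff is that error recovery can re-run just the wiring
 half. A sketch of the intended recovery shape (illustrative only; the
 helper name and the deconfigure signature are assumptions, not code
 from this patch):
 
 	static int cxl_reconfigure_after_reset(struct cxl *adapter,
 					       struct pci_dev *dev)
 	{
 		cxl_deconfigure_adapter(adapter);	/* unmap MMIO, release IRQs */
 		return cxl_configure_adapter(adapter, dev); /* re-wire the hardware */
 	}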
 

Reviewed-by: Cyril Bur cyril...@gmail.com

 Signed-off-by: Daniel Axtens d...@axtens.net
 ---
  drivers/misc/cxl/pci.c | 140 
 ++---
  1 file changed, 87 insertions(+), 53 deletions(-)
 
 diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
 index 484d35a5aead..f6cb089ff981 100644
 --- a/drivers/misc/cxl/pci.c
 +++ b/drivers/misc/cxl/pci.c
 @@ -965,7 +965,6 @@ static int cxl_read_vsec(struct cxl *adapter, struct 
 pci_dev *dev)
   CXL_READ_VSEC_BASE_IMAGE(dev, vsec, adapter-base_image);
   CXL_READ_VSEC_IMAGE_STATE(dev, vsec, image_state);
   adapter-user_image_loaded = !!(image_state  
 CXL_VSEC_USER_IMAGE_LOADED);
 - adapter-perst_loads_image = true;
   adapter-perst_select_user = !!(image_state  
 CXL_VSEC_USER_IMAGE_LOADED);
  
   CXL_READ_VSEC_NAFUS(dev, vsec, adapter-slices);
 @@ -1025,22 +1024,34 @@ static void cxl_release_adapter(struct device *dev)
  
   pr_devel(cxl_release_adapter\n);
  
 + cxl_remove_adapter_nr(adapter);
 +
   kfree(adapter);
  }
  
 -static struct cxl *cxl_alloc_adapter(struct pci_dev *dev)
 +static struct cxl *cxl_alloc_adapter(void)
  {
 	struct cxl *adapter;
 +	int rc;
  
 	if (!(adapter = kzalloc(sizeof(struct cxl), GFP_KERNEL)))
 		return NULL;
  
 -	adapter->dev.parent = &dev->dev;
 -	adapter->dev.release = cxl_release_adapter;
 -	pci_set_drvdata(dev, adapter);
 	spin_lock_init(&adapter->afu_list_lock);
  
 +	if ((rc = cxl_alloc_adapter_nr(adapter)))
 +		goto err1;
 +
 +	if ((rc = dev_set_name(&adapter->dev, "card%i", adapter->adapter_num)))
 +		goto err2;
 +
 	return adapter;
 +
 +err2:
 +	cxl_remove_adapter_nr(adapter);
 +err1:
 +	kfree(adapter);
 +	return NULL;
  }
  
  static int sanitise_adapter_regs(struct cxl *adapter)
 @@ -1049,57 +1060,96 @@ static int sanitise_adapter_regs(struct cxl *adapter)
   return cxl_tlb_slb_invalidate(adapter);
  }
  
 -static struct cxl *cxl_init_adapter(struct pci_dev *dev)
 +/* This should contain *only* operations that can safely be done in
 + * both creation and recovery.
 + */
 +static int cxl_configure_adapter(struct cxl *adapter, struct pci_dev *dev)
  {
 - struct cxl *adapter;
 - bool free = true;
   int rc;
  
 +	adapter->dev.parent = &dev->dev;
 +	adapter->dev.release = cxl_release_adapter;
 +	pci_set_drvdata(dev, adapter);
  
 - if (!(adapter = cxl_alloc_adapter(dev)))
 - return ERR_PTR(-ENOMEM);
 +	rc = pci_enable_device(dev);
 +	if (rc) {
 +		dev_err(&dev->dev, "pci_enable_device failed: %i\n", rc);
 +		return rc;
 +	}
  
   if ((rc = cxl_read_vsec(adapter, dev)))
 - goto err1;
 + return rc;
  
   if ((rc = cxl_vsec_looks_ok(adapter, dev)))
 - goto err1;
 + return rc;
  
   if ((rc = setup_cxl_bars(dev)))
 - goto err1;
 + return rc;
  
   if ((rc = switch_card_to_cxl(dev)))
 - goto err1;
 -
 - if ((rc = cxl_alloc_adapter_nr(adapter)))
 - goto err1;
 -
 - if ((rc = dev_set_name(adapter-dev, card%i, adapter-adapter_num)))
 - goto err2;
 + return rc;
  
   if ((rc = cxl_update_image_control(adapter)))
 - goto err2;
 + return rc;
  
   if ((rc = cxl_map_adapter_regs(adapter, dev)))
 - goto err2;
 + return rc;
  
   if ((rc = sanitise_adapter_regs(adapter)))
 - goto err2;
 + goto err;
  
   if ((rc = init_implementation_adapter_regs(adapter, dev)))
 - goto err3;
 + goto err;
  
   if ((rc = pnv_phb_to_cxl_mode(dev, OPAL_PHB_CAPI_MODE_CAPI)))
 - goto err3;
 + goto err;
  
   /* If recovery happened, the last step is to turn on snooping.
* In 

[PATCH 02/31] scatterlist: use sg_phys()

2015-08-12 Thread Christoph Hellwig
From: Dan Williams dan.j.willi...@intel.com

Coccinelle cleanup to replace open coded sg to physical address
translations.  This is in preparation for introducing scatterlists that
reference __pfn_t.

// sg_phys.cocci: convert usage page_to_phys(sg_page(sg)) to sg_phys(sg)
// usage: make coccicheck COCCI=sg_phys.cocci MODE=patch

virtual patch

@@
struct scatterlist *sg;
@@

- page_to_phys(sg_page(sg)) + sg->offset
+ sg_phys(sg)

@@
struct scatterlist *sg;
@@

- page_to_phys(sg_page(sg))
+ sg_phys(sg) & PAGE_MASK

Signed-off-by: Dan Williams dan.j.willi...@intel.com
---
 arch/arm/mm/dma-mapping.c| 2 +-
 arch/microblaze/kernel/dma.c | 3 +--
 drivers/iommu/intel-iommu.c  | 4 ++--
 drivers/iommu/iommu.c| 2 +-
 drivers/staging/android/ion/ion_chunk_heap.c | 4 ++--
 5 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index cba12f3..3d3d6aa 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1520,7 +1520,7 @@ static int __map_sg_chunk(struct device *dev, struct 
scatterlist *sg,
return -ENOMEM;
 
	for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s)) {
-		phys_addr_t phys = page_to_phys(sg_page(s));
+		phys_addr_t phys = sg_phys(s) & PAGE_MASK;
		unsigned int len = PAGE_ALIGN(s->offset + s->length);
 
		if (!is_coherent &&
diff --git a/arch/microblaze/kernel/dma.c b/arch/microblaze/kernel/dma.c
index bf4dec2..c89da63 100644
--- a/arch/microblaze/kernel/dma.c
+++ b/arch/microblaze/kernel/dma.c
@@ -61,8 +61,7 @@ static int dma_direct_map_sg(struct device *dev, struct 
scatterlist *sgl,
	/* FIXME this part of code is untested */
	for_each_sg(sgl, sg, nents, i) {
		sg->dma_address = sg_phys(sg);
-		__dma_sync(page_to_phys(sg_page(sg)) + sg->offset,
-				sg->length, direction);
+		__dma_sync(sg_phys(sg), sg->length, direction);
}
 
return nents;
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0649b94..3541d65 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2097,7 +2097,7 @@ static int __domain_mapping(struct dmar_domain *domain, 
unsigned long iov_pfn,
			sg_res = aligned_nrpages(sg->offset, sg->length);
			sg->dma_address = ((dma_addr_t)iov_pfn << VTD_PAGE_SHIFT) + sg->offset;
			sg->dma_length = sg->length;
-			pteval = page_to_phys(sg_page(sg)) | prot;
+			pteval = (sg_phys(sg) & PAGE_MASK) | prot;
			phys_pfn = pteval >> VTD_PAGE_SHIFT;
}
 
@@ -3623,7 +3623,7 @@ static int intel_nontranslate_map_sg(struct device *hddev,
 
	for_each_sg(sglist, sg, nelems, i) {
		BUG_ON(!sg_page(sg));
-		sg->dma_address = page_to_phys(sg_page(sg)) + sg->offset;
+		sg->dma_address = sg_phys(sg);
		sg->dma_length = sg->length;
	}
	return nelems;
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index f286090..049df49 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1408,7 +1408,7 @@ size_t default_iommu_map_sg(struct iommu_domain *domain, 
unsigned long iova,
	min_pagesz = 1 << __ffs(domain->ops->pgsize_bitmap);
 
	for_each_sg(sg, s, nents, i) {
-		phys_addr_t phys = page_to_phys(sg_page(s)) + s->offset;
+		phys_addr_t phys = sg_phys(s);
 
/*
 * We are mapping on IOMMU page boundaries, so offset within
diff --git a/drivers/staging/android/ion/ion_chunk_heap.c 
b/drivers/staging/android/ion/ion_chunk_heap.c
index 5474615..f7b6ef9 100644
--- a/drivers/staging/android/ion/ion_chunk_heap.c
+++ b/drivers/staging/android/ion/ion_chunk_heap.c
@@ -81,7 +81,7 @@ static int ion_chunk_heap_allocate(struct ion_heap *heap,
 err:
	sg = table->sgl;
	for (i -= 1; i >= 0; i--) {
-		gen_pool_free(chunk_heap->pool, page_to_phys(sg_page(sg)),
+		gen_pool_free(chunk_heap->pool, sg_phys(sg) & PAGE_MASK,
			      sg->length);
		sg = sg_next(sg);
}
@@ -109,7 +109,7 @@ static void ion_chunk_heap_free(struct ion_buffer *buffer)
DMA_BIDIRECTIONAL);
 
	for_each_sg(table->sgl, sg, table->nents, i) {
-		gen_pool_free(chunk_heap->pool, page_to_phys(sg_page(sg)),
+		gen_pool_free(chunk_heap->pool, sg_phys(sg) & PAGE_MASK,
			      sg->length);
	}
	chunk_heap->allocated -= allocated_size;
-- 
1.9.1


[PATCH 06/31] alpha/pci-noop: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Use sg_phys() instead of virt_to_phys(sg_virt(sg)) so that we don't
require a kernel virtual address.

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/alpha/kernel/pci-noop.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/alpha/kernel/pci-noop.c b/arch/alpha/kernel/pci-noop.c
index df24b76..7319151 100644
--- a/arch/alpha/kernel/pci-noop.c
+++ b/arch/alpha/kernel/pci-noop.c
@@ -145,11 +145,7 @@ static int alpha_noop_map_sg(struct device *dev, struct scatterlist *sgl, int ne
	struct scatterlist *sg;
 
	for_each_sg(sgl, sg, nents, i) {
-		void *va;
-
-		BUG_ON(!sg_page(sg));
-		va = sg_virt(sg);
-		sg_dma_address(sg) = (dma_addr_t)virt_to_phys(va);
+		sg_dma_address(sg) = (dma_addr_t)sg_phys(sg);
		sg_dma_len(sg) = sg->length;
}
 
-- 
1.9.1


[PATCH 10/31] powerpc/iommu: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
For the iommu offset we just need an offset into the page.  Calculate
it using the physical address instead of the virtual address so that
we don't require a virtual mapping.
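
The key observation, as a sketch: only the low-order bits matter for
the offset and alignment math, and those bits are identical for the
virtual and physical address of the same linear mapping:

	/* For a kernel linear-map address:
	 *   vaddr & ~IOMMU_PAGE_MASK(tbl) == paddr & ~IOMMU_PAGE_MASK(tbl)
	 * so iommu_num_pages() and the alignment test are unchanged when
	 * fed sg_phys(s) instead of sg_virt(s).
	 */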

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/powerpc/kernel/iommu.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index a8e3490..0f52e40 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -457,7 +457,7 @@ int ppc_iommu_map_sg(struct device *dev, struct iommu_table *tbl,
 
	max_seg_size = dma_get_max_seg_size(dev);
	for_each_sg(sglist, s, nelems, i) {
-		unsigned long vaddr, npages, entry, slen;
+		unsigned long paddr, npages, entry, slen;
 
		slen = s->length;
		/* Sanity check */
@@ -466,22 +466,22 @@ int ppc_iommu_map_sg(struct device *dev, struct iommu_table *tbl,
			continue;
		}
		/* Allocate iommu entries for that segment */
-		vaddr = (unsigned long) sg_virt(s);
-		npages = iommu_num_pages(vaddr, slen, IOMMU_PAGE_SIZE(tbl));
+		paddr = sg_phys(s);
+		npages = iommu_num_pages(paddr, slen, IOMMU_PAGE_SIZE(tbl));
		align = 0;
		if (tbl->it_page_shift < PAGE_SHIFT && slen >= PAGE_SIZE &&
-		    (vaddr & ~PAGE_MASK) == 0)
+		    (paddr & ~PAGE_MASK) == 0)
			align = PAGE_SHIFT - tbl->it_page_shift;
		entry = iommu_range_alloc(dev, tbl, npages, &handle,
					  mask >> tbl->it_page_shift, align);
 
-		DBG("  - vaddr: %lx, size: %lx\n", vaddr, slen);
+		DBG("  - paddr: %lx, size: %lx\n", paddr, slen);
 
		/* Handle failure */
		if (unlikely(entry == DMA_ERROR_CODE)) {
			if (printk_ratelimit())
				dev_info(dev, "iommu_alloc failed, tbl %p "
-					 "vaddr %lx npages %lu\n", tbl, vaddr,
+					 "paddr %lx npages %lu\n", tbl, paddr,
					 npages);
			goto failure;
		}
@@ -496,7 +496,7 @@ int ppc_iommu_map_sg(struct device *dev, struct iommu_table *tbl,
 
		/* Insert into HW table */
		build_fail = tbl->it_ops->set(tbl, entry, npages,
-					      vaddr & IOMMU_PAGE_MASK(tbl),
+					      paddr & IOMMU_PAGE_MASK(tbl),
					      direction, attrs);
		if(unlikely(build_fail))
			goto failure;
-- 
1.9.1


[PATCH 18/31] nios2: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Make all cache invalidation conditional on sg_has_page() and use
sg_phys to get the physical address directly.

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/nios2/mm/dma-mapping.c | 29 +++--
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/arch/nios2/mm/dma-mapping.c b/arch/nios2/mm/dma-mapping.c
index ac5da75..1a0a68d 100644
--- a/arch/nios2/mm/dma-mapping.c
+++ b/arch/nios2/mm/dma-mapping.c
@@ -64,13 +64,11 @@ int dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
	BUG_ON(!valid_dma_direction(direction));
 
	for_each_sg(sg, sg, nents, i) {
-		void *addr;
-
-		addr = sg_virt(sg);
-		if (addr) {
-			__dma_sync_for_device(addr, sg->length, direction);
-			sg->dma_address = sg_phys(sg);
+		if (sg_has_page(sg)) {
+			__dma_sync_for_device(sg_virt(sg), sg->length,
+					direction);
		}
+		sg->dma_address = sg_phys(sg);
	}
 
	return nents;
@@ -113,9 +111,8 @@ void dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nhwentries,
		return;
 
	for_each_sg(sg, sg, nhwentries, i) {
-		addr = sg_virt(sg);
-		if (addr)
-			__dma_sync_for_cpu(addr, sg->length, direction);
+		if (sg_has_page(sg))
+			__dma_sync_for_cpu(sg_virt(sg), sg->length, direction);
	}
 }
 EXPORT_SYMBOL(dma_unmap_sg);
@@ -166,8 +163,10 @@ void dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg, int nelems,
	BUG_ON(!valid_dma_direction(direction));
 
	/* Make sure that gcc doesn't leave the empty loop body.  */
-	for_each_sg(sg, sg, nelems, i)
-		__dma_sync_for_cpu(sg_virt(sg), sg->length, direction);
+	for_each_sg(sg, sg, nelems, i) {
+		if (sg_has_page(sg))
+			__dma_sync_for_cpu(sg_virt(sg), sg->length, direction);
+	}
 }
 EXPORT_SYMBOL(dma_sync_sg_for_cpu);
 
@@ -179,8 +178,10 @@ void dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
	BUG_ON(!valid_dma_direction(direction));
 
	/* Make sure that gcc doesn't leave the empty loop body.  */
-	for_each_sg(sg, sg, nelems, i)
-		__dma_sync_for_device(sg_virt(sg), sg->length, direction);
-
+	for_each_sg(sg, sg, nelems, i) {
+		if (sg_has_page(sg))
+			__dma_sync_for_device(sg_virt(sg), sg->length,
+					direction);
+	}
 }
 EXPORT_SYMBOL(dma_sync_sg_for_device);
-- 
1.9.1


[PATCH 25/31] frv: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Only call kmap_atomic_primary when the SG entry is mapped into
kernel virtual space.

XXX: the code already looks odd due to the lack of pairing between
kmap_atomic_primary and kunmap_atomic_primary.  Does it work either
before or after this patch?

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/frv/mb93090-mb00/pci-dma.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/frv/mb93090-mb00/pci-dma.c b/arch/frv/mb93090-mb00/pci-dma.c
index 4d1f01d..77b3a1c 100644
--- a/arch/frv/mb93090-mb00/pci-dma.c
+++ b/arch/frv/mb93090-mb00/pci-dma.c
@@ -63,6 +63,9 @@ int dma_map_sg(struct device *dev, struct scatterlist *sglist, int nents,
	dampr2 = __get_DAMPR(2);
 
	for_each_sg(sglist, sg, nents, i) {
+		if (!sg_has_page(sg))
+			continue;
+
		vaddr = kmap_atomic_primary(sg_page(sg));
 
		frv_dcache_writeback((unsigned long) vaddr,
-- 
1.9.1


[PATCH 30/31] intel-iommu: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Just remove a BUG_ON, the code handles them just fine as-is.

Signed-off-by: Christoph Hellwig h...@lst.de
---
 drivers/iommu/intel-iommu.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 3541d65..ae10573 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3622,7 +3622,6 @@ static int intel_nontranslate_map_sg(struct device *hddev,
	struct scatterlist *sg;
 
	for_each_sg(sglist, sg, nelems, i) {
-		BUG_ON(!sg_page(sg));
		sg->dma_address = sg_phys(sg);
		sg->dma_length = sg->length;
	}
-- 
1.9.1


[PATCH 29/31] parisc: handle page-less SG entries

2015-08-12 Thread Christoph Hellwig
Make all cache invalidation conditional on sg_has_page() and use
sg_phys to get the physical address directly.

Signed-off-by: Christoph Hellwig h...@lst.de
---
 arch/parisc/kernel/pci-dma.c | 29 ++---
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c
index b9402c9..6cad0e0 100644
--- a/arch/parisc/kernel/pci-dma.c
+++ b/arch/parisc/kernel/pci-dma.c
@@ -483,11 +483,13 @@ static int pa11_dma_map_sg(struct device *dev, struct scatterlist *sglist, int n
	BUG_ON(direction == DMA_NONE);
 
	for_each_sg(sglist, sg, nents, i) {
-		unsigned long vaddr = (unsigned long)sg_virt(sg);
-
-		sg_dma_address(sg) = (dma_addr_t) virt_to_phys(vaddr);
+		sg_dma_address(sg) = sg_phys(sg);
		sg_dma_len(sg) = sg->length;
-		flush_kernel_dcache_range(vaddr, sg->length);
+
+		if (sg_has_page(sg)) {
+			flush_kernel_dcache_range((unsigned long)sg_virt(sg),
+						  sg->length);
+		}
	}
	return nents;
 }
@@ -504,9 +506,10 @@ static void pa11_dma_unmap_sg(struct device *dev, struct scatterlist *sglist, in
 
	/* once we do combining we'll need to use phys_to_virt(sg_dma_address(sglist)) */
 
-	for_each_sg(sglist, sg, nents, i)
-		flush_kernel_vmap_range(sg_virt(sg), sg->length);
-	return;
+	for_each_sg(sglist, sg, nents, i) {
+		if (sg_has_page(sg))
+			flush_kernel_vmap_range(sg_virt(sg), sg->length);
+	}
 }
 
 static void pa11_dma_sync_single_for_cpu(struct device *dev, dma_addr_t dma_handle, unsigned long offset, size_t size, enum dma_data_direction direction)
@@ -530,8 +533,10 @@ static void pa11_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sgl
 
	/* once we do combining we'll need to use phys_to_virt(sg_dma_address(sglist)) */
 
-	for_each_sg(sglist, sg, nents, i)
-		flush_kernel_vmap_range(sg_virt(sg), sg->length);
+	for_each_sg(sglist, sg, nents, i) {
+		if (sg_has_page(sg))
+			flush_kernel_vmap_range(sg_virt(sg), sg->length);
+	}
 }
 
 static void pa11_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sglist, int nents, enum dma_data_direction direction)
@@ -541,8 +546,10 @@ static void pa11_dma_sync_sg_for_device(struct device *dev, struct scatterlist *
 
	/* once we do combining we'll need to use phys_to_virt(sg_dma_address(sglist)) */
 
-	for_each_sg(sglist, sg, nents, i)
-		flush_kernel_vmap_range(sg_virt(sg), sg->length);
+	for_each_sg(sglist, sg, nents, i) {
+		if (sg_has_page(sg))
+			flush_kernel_vmap_range(sg_virt(sg), sg->length);
+	}
 }
 
 struct hppa_dma_ops pcxl_dma_ops = {
-- 
1.9.1


Re: [PATCH 19/31] arc: handle page-less SG entries

2015-08-12 Thread Vineet Gupta
On Wednesday 12 August 2015 12:39 PM, Christoph Hellwig wrote:
 Make all cache invalidation conditional on sg_has_page() and use
 sg_phys to get the physical address directly.

 Signed-off-by: Christoph Hellwig h...@lst.de

With a minor nit below.

Acked-by: Vineet Gupta vgu...@synopsys.com

 ---
  arch/arc/include/asm/dma-mapping.h | 26 +++---
  1 file changed, 19 insertions(+), 7 deletions(-)

 diff --git a/arch/arc/include/asm/dma-mapping.h 
 b/arch/arc/include/asm/dma-mapping.h
 index 2d28ba9..42eb526 100644
 --- a/arch/arc/include/asm/dma-mapping.h
 +++ b/arch/arc/include/asm/dma-mapping.h
 @@ -108,9 +108,13 @@ dma_map_sg(struct device *dev, struct scatterlist *sg,
   struct scatterlist *s;
   int i;
  
 - for_each_sg(sg, s, nents, i)
 - s-dma_address = dma_map_page(dev, sg_page(s), s-offset,
 -s-length, dir);
 + for_each_sg(sg, s, nents, i) {
 + if (sg_has_page(s)) {
 + _dma_cache_sync((unsigned long)sg_virt(s), s-length,
 + dir);
 + }
 + s-dma_address = sg_phys(s);
 + }
  
   return nents;
  }
 @@ -163,8 +167,12 @@ dma_sync_sg_for_cpu(struct device *dev, struct 
 scatterlist *sglist, int nelems,
   int i;
   struct scatterlist *sg;
  
 - for_each_sg(sglist, sg, nelems, i)
 - _dma_cache_sync((unsigned int)sg_virt(sg), sg-length, dir);
 + for_each_sg(sglist, sg, nelems, i) {
 + if (sg_has_page(sg)) {
 + _dma_cache_sync((unsigned int)sg_virt(sg), sg-length,
 + dir);
 + }
 + }
  }
  
  static inline void
 @@ -174,8 +182,12 @@ dma_sync_sg_for_device(struct device *dev, struct 
 scatterlist *sglist,
   int i;
   struct scatterlist *sg;
  
 - for_each_sg(sglist, sg, nelems, i)
 - _dma_cache_sync((unsigned int)sg_virt(sg), sg-length, dir);
 + for_each_sg(sglist, sg, nelems, i) {
 + if (sg_has_page(sg)) {
 + _dma_cache_sync((unsigned int)sg_virt(sg), sg-length,
 + dir);

For consistency, could you please fix the left alignment of @dir above?
Another tab, perhaps?

 + }
 + }
  }
  
  static inline int dma_supported(struct device *dev, u64 dma_mask)


Re: [PATCH v6 03/42] powerpc/powernv: Enable M64 on P7IOC

2015-08-12 Thread Gavin Shan
On Tue, Aug 11, 2015 at 12:06:26PM +1000, Alexey Kardashevskiy wrote:
On 08/11/2015 09:45 AM, Gavin Shan wrote:
On Mon, Aug 10, 2015 at 04:30:09PM +1000, Alexey Kardashevskiy wrote:
On 08/06/2015 02:11 PM, Gavin Shan wrote:
The patch enables the M64 window on P7IOC, which has already been enabled
on PHB3. Different from PHB3, where 16 M64 BARs are supported and each of
them can either be owned by one particular PE# exclusively or divided
evenly into 256 segments, each P7IOC PHB has 16 M64 BARs and each of them
is divided into 8 segments.

Is this a limitation of POWER7 chip or it is from IODA1?


 From IODA1.

So each P7IOC PHB can support
128 M64 segments only. Also, P7IOC has an M64DT, which helps map one
particular M64 segment# to an arbitrary PE#. PHB3 doesn't have an M64DT,
meaning that an M64 segment can only be pinned to a fixed PE#. In order
to have similar logic to support M64 on both PHB3 and P7IOC, we just
provide 128 M64 segments (16 BARs) and a fixed mapping between PE# and
M64 segment# on P7IOC. In turn, we just need different phb->init_m64()
hooks for P7IOC and PHB3 to support M64.
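
In other words, the IODA1 geometry described above works out as
(sketch; the macro names are illustrative only, not from the patch):

	#define P7IOC_M64_BARS		16	/* M64 BARs per P7IOC PHB */
	#define IODA1_M64_SEGS_PER_BAR	8	/* fixed split on IODA1 */

	/* 16 BARs * 8 segments = 128 M64 segments, matching the 128 PEs
	 * per PHB and giving the fixed PE# <-> segment# mapping that lets
	 * P7IOC share the PHB3 code paths.
	 */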

Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com
---
  arch/powerpc/platforms/powernv/pci-ioda.c | 116 
 ++
  1 file changed, 104 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 38b5405..e4ac703 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -172,6 +172,69 @@ static void pnv_ioda_free_pe(struct pnv_phb *phb, int pe)
	clear_bit(pe, phb->ioda.pe_alloc);
 }

+static int pnv_ioda1_init_m64(struct pnv_phb *phb)
+{
+   struct resource *r;
+   int seg;
+
+   /* There are as many M64 segments as the maximum number
+* of PEs, which is 128.
+*/
+	for (seg = 0; seg < phb->ioda.total_pe; seg += 8) {


This 8 is used a lot across the patch, please make it a macro
(PNV_PHB_P7IOC_SEGNUM or PNV_PHB_IODA1_SEGNUM or whatever you think it is)
with a short comment why it is 8. Or a pnv_phb member.


I would like to use 8. When having a macro, you have to check
the definition of the macro to get the real value of that.

Give it a good name then.


However,
it makes sense to add more comments explaining why it's 8 here.

You cannot comment it everywhere and everywhere is exact place when you'll
have to comment it as I believe sometime it is segments-per-M64 and sometime
it is number of bits in a byte (or not? anyway, this is will always distract
unless you use macro for segments-per-M64).


Ok. I will use PNV_PHB_IODA1_SEGNUM then.



+		unsigned long base;
+		int64_t rc;
+
+		base = phb->ioda.m64_base + seg * phb->ioda.m64_segsize;
+		rc = opal_pci_set_phb_mem_window(phb->opal_id,
+						 OPAL_M64_WINDOW_TYPE,
+						 seg / 8,
+						 base,
+						 0, /* unused */
+						 8 * phb->ioda.m64_segsize);
+		if (rc != OPAL_SUCCESS) {
+			pr_warn("  Error %lld setting M64 PHB#%d-BAR#%d\n",
+				rc, phb->hose->global_number, seg / 8);
+			goto fail;
+		}
+
+		rc = opal_pci_phb_mmio_enable(phb->opal_id,
+					      OPAL_M64_WINDOW_TYPE,
+					      seg / 8,
+					      OPAL_ENABLE_M64_SPLIT);
+		if (rc != OPAL_SUCCESS) {
+			pr_warn("  Error %lld enabling M64 PHB#%d-BAR#%d\n",
+				rc, phb->hose->global_number, seg / 8);
+			goto fail;
+		}
+	}
+
+   /* Strip off the segment used by the reserved PE, which

What is this reserved PE on P7IOC? Strip off means exclude here?


127 that was exported from skiboot. Strip off means exclude.

I like exclude a lot better.


Ok. Will use it.



+* is expected to be 0 or last supported PE#. The PHB's
+* first memory window traces the 32-bit MMIO range

s/traces/filters/ ? Or I did not understand this comment...


It seems you didn't understand it: there are two memory windows
in every PHB. The first one is tracing the M32 resource and the
second one is tracing the M64 resource.


Tracing means logging, pretty much. Is this what you mean here?


No, it means recording, not logging. So would it be appropriate
to replace it with track?



+* while the second one traces the 64-bit prefetchable
+* MMIO range that the PHB supports.

32/64 ranges comment seems irrelevant here.


Maybe it's not so relevant, but still.

Not relevant - remove it. Put this text in the commit log.


Ok.

We're stripping off the
M64 segment from the 2nd resource (as above), not the first one.


2nd window (not _resource_), you mean?


I mean struct pci_controller::mem_resources[1].





+*/
+   r = 
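(The assignment above is truncated in the archive and is left as-is. As a
hedged illustration only - not the patch's actual code - excluding the
reserved PE's segment from the second memory window could look roughly
like the following, given that the reserved PE is expected to be segment
0 or the last one; the field names are assumptions:)

	struct resource *r = &hose->mem_resources[1];

	if (phb->ioda.reserved_pe == 0)
		r->start += phb->ioda.m64_segsize;	/* reserved PE owns segment 0 */
	else
		r->end -= phb->ioda.m64_segsize;	/* reserved PE owns the last segment */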

Re: [PATCH v6 05/42] powerpc/powernv: Track IO/M32/M64 segments from PE

2015-08-12 Thread Gavin Shan
On Wed, Aug 12, 2015 at 09:05:09PM +1000, Alexey Kardashevskiy wrote:
On 08/12/2015 08:45 PM, Gavin Shan wrote:
On Tue, Aug 11, 2015 at 12:23:42PM +1000, Alexey Kardashevskiy wrote:
On 08/11/2015 10:03 AM, Gavin Shan wrote:
On Mon, Aug 10, 2015 at 05:16:40PM +1000, Alexey Kardashevskiy wrote:
On 08/06/2015 02:11 PM, Gavin Shan wrote:
The patch is adding 6 bitmaps, three to PE and three to PHB, to track

The patch is also removing 2 arrays (io_segmap and m32_segmap); what is
that all about? Also, there was no m64_segmap, and now there is - that
needs an explanation maybe.


Originally, the bitmaps (io_segmap and m32_segmap) were allocated dynamically.
Now, they have fixed sizes - 512 bits.

The subject "powerpc/powernv: Track IO/M32/M64 segments from PE" indicates
why m64_segmap is added.


But before this patch, you somehow managed to keep it working without a map
for M64, while at the same time you needed maps for IO and M32. It seems you
are making things consistent in this patch, but it also feels like you do not
have to, as M64 did not need a map before and I cannot see why it needs one
now.


The M64 map is used by [PATCH v6 23/42] powerpc/powernv: Release PEs dynamically,
where the M64 segments consumed by one particular PE will be released.


Then add it where it really starts being used. It is really hard to
review a patch whose changes are actually spread across patches. Do not
count on reviewers just trusting you.


Ok. I'll try. 



the segments consumed by one particular PE, which can be released once the PE
is destroyed at PCI unplugging time. Also, we're using a fixed
quantity of bits to trace the IO, M32 and M64 segments used by PEs
in one particular PHB.


Out of curiosity - have you considered having just 3 arrays, in the PHB,
storing PE numbers, and ditching the PE's arrays? Does a PE itself need to
know what segments it is using? Not sure about the master/slave PEs though.


I don't follow your suggestion. Can you rephrase and explain it a bit more?


Please explain in what situations you need the same map in both PHB and PE
and how you are going to use them. For example, pe::m64_segmap and
phb::m64_segmap.

I believe you need to know what segment is used by what PE, and that's it;
having 2 bitmaps is overcomplicated and hard to follow. Is there anything
else that I am missing?


The situation is the same for all (IO, M32 and M64) segment maps. Taking
m64_segmap as an example, it will be used when creating or destroying a PE
that consumes M64 segments. phb::m64_segmap records the M64 segment usage
in the PHB's domain; it's used to check that the same M64 segment won't be
used twice. pe::m64_segmap tracks the M64 segments consumed by the PE.


You could have a single map in PHB, key would be a segment number and value
would be PE number. No need to have a map in PE. At all. No need to
initialize bitmaps, etc.


So it would be arrays for the various segment maps, if I understood your
suggestion, as below. Please confirm:

#define PNV_IODA_MAX_SEG_NUM	512

int pnv_phb::io_segmap[PNV_IODA_MAX_SEG_NUM];
int pnv_phb::m32_segmap[PNV_IODA_MAX_SEG_NUM];
int pnv_phb::m64_segmap[PNV_IODA_MAX_SEG_NUM];

- Initially, they are initialized to IODA_INVALID_PE;
- When one segment is assigned to one PE, the corresponding entry
  of the array is set to the PE number.
- When one segment is released, the corresponding entry of the array
  is set to IODA_INVALID_PE.
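(To make the shape of this concrete, a self-contained sketch of the
suggested single per-PHB array, with made-up helper names and an assumed
sentinel value - just to confirm this is what is being proposed:)

#include <errno.h>

#define PNV_IODA_MAX_SEG_NUM	512
#define IODA_INVALID_PE		(-1)	/* assumed sentinel, per the discussion */

/* Index = segment number, value = owning PE number. */
static int m64_segmap[PNV_IODA_MAX_SEG_NUM] = {
	[0 ... PNV_IODA_MAX_SEG_NUM - 1] = IODA_INVALID_PE,
};

/* Refuse to hand out the same segment twice. */
static int seg_assign(int *segmap, int seg, int pe_number)
{
	if (segmap[seg] != IODA_INVALID_PE)
		return -EBUSY;
	segmap[seg] = pe_number;
	return 0;
}

/* Releasing a PE walks the map and frees every segment it owns. */
static void seg_release_pe(int *segmap, int nr_segs, int pe_number)
{
	int seg;

	for (seg = 0; seg < nr_segs; seg++)
		if (segmap[seg] == pe_number)
			segmap[seg] = IODA_INVALID_PE;
}

The release helper also shows why the per-PE bitmap becomes unnecessary:
a PE's segments can be recovered from the PHB map alone.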
 

It would be easier to read patches if this one was right before
[PATCH v6 23/42] powerpc/powernv: Release PEs dynamically


I'll try to reorder the patch, but don't expect too much...



Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com
---
  arch/powerpc/platforms/powernv/pci-ioda.c | 29 +++--
  arch/powerpc/platforms/powernv/pci.h  | 18 ++
  2 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index e4ac703..78b49a1 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -388,6 +388,12 @@ static int pnv_ioda_pick_m64_pe(struct pci_bus *bus, bool all)
  list_add_tail(&pe->list, &master_pe->slaves);
  }

+ /* M64 segments consumed by slave PEs are tracked
+  * by master PE
+  */
+ set_bit(pe->pe_number, master_pe->m64_segmap);
+ set_bit(pe->pe_number, phb->ioda.m64_segmap);
+
  /* P7IOC supports M64DT, which helps mapping M64 segment
   * to one particular PE#. However, PHB3 has fixed mapping
   * between M64 segment and PE#. In order to have same logic
@@ -2871,9 +2877,11 @@ static void pnv_ioda_setup_pe_seg(struct pci_controller *hose,

   while (index < phb->ioda.total_pe &&
          region.start <= region.end) {
 - phb->ioda.io_segmap[index] = pe->pe_number;
+   

RE: [PATCH v2 05/10] cxl: Refactor adaptor init/teardown

2015-08-12 Thread David Laight
From: Cyril Bur
 Sent: 11 August 2015 07:01
...
 You have a dilemma with the use of the ugly if (rc = foo()). I don't like it,
 but the file is littered with it.
 
 It looks like in the majority of uses in this file the conditional block is
 only one line, so it makes sense (or at least in terms of numbers of lines...
 fair enough); however, if you have a conditional block spanning multiple
 lines, I don't like it.
...
  kfree(adapter);
   }
 
  -static struct cxl *cxl_alloc_adapter(struct pci_dev *dev)
  +static struct cxl *cxl_alloc_adapter(void)
   {
  struct cxl *adapter;
  +   int rc;
 
  if (!(adapter = kzalloc(sizeof(struct cxl), GFP_KERNEL)))
  return NULL;
 
  -   adapter->dev.parent = &dev->dev;
  -   adapter->dev.release = cxl_release_adapter;
  -   pci_set_drvdata(dev, adapter);
  spin_lock_init(&adapter->afu_list_lock);
 
  +   if ((rc = cxl_alloc_adapter_nr(adapter)))
 
 Humf
 
  +   goto err1;
  +
  +   if ((rc = dev_set_name(&adapter->dev, "card%i", adapter->adapter_num)))
 
 Humf
  +   goto err2;
  +
  return adapter;
  +
  +err2:
  +   cxl_remove_adapter_nr(adapter);
  +err1:
  +   kfree(adapter);
  +   return NULL;
   }
...

The function above doesn't even use the 'rc' value.

David
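(For what it's worth, a sketch of the same error paths with the assignment
separated from the test, which sidesteps both complaints - and since
cxl_alloc_adapter() only returns NULL on failure, rc could then be dropped
entirely:)

	rc = cxl_alloc_adapter_nr(adapter);
	if (rc)
		goto err1;

	rc = dev_set_name(&adapter->dev, "card%i", adapter->adapter_num);
	if (rc)
		goto err2;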

Re: [PATCH 20/31] avr32: handle page-less SG entries

2015-08-12 Thread Hans-Christian Egtvedt
Around Wed 12 Aug 2015 09:05:39 +0200 or thereabout, Christoph Hellwig wrote:
 Make all cache invalidation conditional on sg_has_page() and use
 sg_phys to get the physical address directly, bypassing the noop
 page_to_bus.
 
 Signed-off-by: Christoph Hellwig h...@lst.de

Acked-by: Hans-Christian Egtvedt egtv...@samfundet.no

 ---
  arch/avr32/include/asm/dma-mapping.h | 14 +++---
  1 file changed, 7 insertions(+), 7 deletions(-)
 
 diff --git a/arch/avr32/include/asm/dma-mapping.h 
 b/arch/avr32/include/asm/dma-mapping.h
 index ae7ac92..a662ce2 100644
 --- a/arch/avr32/include/asm/dma-mapping.h
 +++ b/arch/avr32/include/asm/dma-mapping.h
 @@ -216,11 +216,9 @@ dma_map_sg(struct device *dev, struct scatterlist *sglist, int nents,
   struct scatterlist *sg;
  
   for_each_sg(sglist, sg, nents, i) {
 - char *virt;
 -
 - sg->dma_address = page_to_bus(sg_page(sg)) + sg->offset;
 - virt = sg_virt(sg);
 - dma_cache_sync(dev, virt, sg->length, direction);
 + sg->dma_address = sg_phys(sg);
 + if (sg_has_page(sg))
 + dma_cache_sync(dev, sg_virt(sg), sg->length, direction);
   }
  
   return nents;
 @@ -328,8 +326,10 @@ dma_sync_sg_for_device(struct device *dev, struct scatterlist *sglist,
   int i;
   struct scatterlist *sg;
  
 - for_each_sg(sglist, sg, nents, i)
 - dma_cache_sync(dev, sg_virt(sg), sg->length, direction);
 + for_each_sg(sglist, sg, nents, i) {
 + if (sg_has_page(sg))
 + dma_cache_sync(dev, sg_virt(sg), sg->length, direction);
 + }
  }
  
  /* Now for the API extensions over the pci_ one */
-- 
mvh
Hans-Christian Egtvedt

Re: [PATCH v6 05/42] powerpc/powernv: Track IO/M32/M64 segments from PE

2015-08-12 Thread Gavin Shan
On Tue, Aug 11, 2015 at 12:23:42PM +1000, Alexey Kardashevskiy wrote:
On 08/11/2015 10:03 AM, Gavin Shan wrote:
On Mon, Aug 10, 2015 at 05:16:40PM +1000, Alexey Kardashevskiy wrote:
On 08/06/2015 02:11 PM, Gavin Shan wrote:
The patch is adding 6 bitmaps, three to PE and three to PHB, to track

The patch is also removing 2 arrays (io_segmap and m32_segmap); what is that
all about? Also, there was no m64_segmap, and now there is - that needs an
explanation maybe.


Originally, the bitmaps (io_segmap and m32_segmap) were allocated dynamically.
Now, they have fixed sizes - 512 bits.

The subject "powerpc/powernv: Track IO/M32/M64 segments from PE" indicates
why m64_segmap is added.


But before this patch, you somehow managed to keep it working without a map
for M64, while at the same time you needed maps for IO and M32. It seems you
are making things consistent in this patch, but it also feels like you do not
have to, as M64 did not need a map before and I cannot see why it needs one
now.


The M64 map is used by [PATCH v6 23/42] powerpc/powernv: Release PEs dynamically
where the M64 segments consumed by one particular PE will be released.


the segments consumed by one particular PE, which can be released once the PE
is destroyed at PCI unplugging time. Also, we're using a fixed
quantity of bits to trace the IO, M32 and M64 segments used by PEs
in one particular PHB.


Out of curiosity - have you considered having just 3 arrays, in the PHB,
storing PE numbers, and ditching the PE's arrays? Does a PE itself need to
know what segments it is using? Not sure about the master/slave PEs though.


I don't follow your suggestion. Can you rephrase and explain it a bit more?


Please explain in what situations you need the same map in both PHB and PE
and how you are going to use them. For example, pe::m64_segmap and
phb::m64_segmap.

I believe you need to know what segment is used by what PE, and that's it;
having 2 bitmaps is overcomplicated and hard to follow. Is there anything
else that I am missing?


The situation is the same for all (IO, M32 and M64) segment maps. Taking
m64_segmap as an example, it will be used when creating or destroying a PE
that consumes M64 segments. phb::m64_segmap records the M64 segment usage
in the PHB's domain; it's used to check that the same M64 segment won't be
used twice. pe::m64_segmap tracks the M64 segments consumed by the PE.

It would be easier to read patches if this one was right before
[PATCH v6 23/42] powerpc/powernv: Release PEs dynamically


I'll try to reorder the patch, but don't expect too much...



Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com
---
  arch/powerpc/platforms/powernv/pci-ioda.c | 29 +++--
  arch/powerpc/platforms/powernv/pci.h  | 18 ++
  2 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index e4ac703..78b49a1 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -388,6 +388,12 @@ static int pnv_ioda_pick_m64_pe(struct pci_bus *bus, bool all)
	list_add_tail(&pe->list, &master_pe->slaves);
	}

+   /* M64 segments consumed by slave PEs are tracked
+* by master PE
+*/
+   set_bit(pe->pe_number, master_pe->m64_segmap);
+   set_bit(pe->pe_number, phb->ioda.m64_segmap);
+
/* P7IOC supports M64DT, which helps mapping M64 segment
 * to one particular PE#. However, PHB3 has fixed mapping
 * between M64 segment and PE#. In order to have same logic
@@ -2871,9 +2877,11 @@ static void pnv_ioda_setup_pe_seg(struct pci_controller *hose,

	while (index < phb->ioda.total_pe &&
	       region.start <= region.end) {
-		phb->ioda.io_segmap[index] = pe->pe_number;
+		set_bit(index, pe->io_segmap);
+		set_bit(index, phb->ioda.io_segmap);
		rc = opal_pci_map_pe_mmio_window(phb->opal_id,
-			pe->pe_number, OPAL_IO_WINDOW_TYPE, 0, index);
+			pe->pe_number, OPAL_IO_WINDOW_TYPE,
+			0, index);

Unrelated change.


True, will drop. However, checkpatch.pl will complain with: line
exceeding 80 characters.

It will not, as you are not changing these lines; it only complains about changed lines.




			if (rc != OPAL_SUCCESS) {
				pr_err("%s: OPAL error %d when mapping IO "
				       "segment #%d to PE#%d\n",
@@ -2896,9 +2904,11 @@ static void pnv_ioda_setup_pe_seg(struct pci_controller *hose,

	while (index < phb->ioda.total_pe &&
	       region.start <= region.end) {
-		phb->ioda.m32_segmap[index] = pe->pe_number;
+		set_bit(index, pe->m32_segmap);
+ 

Re: [PATCH v6 05/42] powerpc/powernv: Track IO/M32/M64 segments from PE

2015-08-12 Thread Alexey Kardashevskiy

On 08/12/2015 08:45 PM, Gavin Shan wrote:

On Tue, Aug 11, 2015 at 12:23:42PM +1000, Alexey Kardashevskiy wrote:

On 08/11/2015 10:03 AM, Gavin Shan wrote:

On Mon, Aug 10, 2015 at 05:16:40PM +1000, Alexey Kardashevskiy wrote:

On 08/06/2015 02:11 PM, Gavin Shan wrote:

The patch is adding 6 bitmaps, three to PE and three to PHB, to track


The patch is also removing 2 arrays (io_segmap and m32_segmap); what is that
all about? Also, there was no m64_segmap, and now there is - that needs an
explanation maybe.



Originally, the bitmaps (io_segmap and m32_segmap) were allocated dynamically.
Now, they have fixed sizes - 512 bits.

The subject "powerpc/powernv: Track IO/M32/M64 segments from PE" indicates
why m64_segmap is added.



But before this patch, you somehow managed to keep it working without a map
for M64, while at the same time you needed maps for IO and M32. It seems you
are making things consistent in this patch, but it also feels like you do not
have to, as M64 did not need a map before and I cannot see why it needs one
now.



The M64 map is used by [PATCH v6 23/42] powerpc/powernv: Release PEs dynamically
where the M64 segments consumed by one particular PE will be released.



Then add it where it really starts being used. It is really hard to
review a patch whose changes are actually spread across patches. Do not
count on reviewers just trusting you.






the segments consumed by one particular PE, which can be released once the PE
is destroyed at PCI unplugging time. Also, we're using a fixed
quantity of bits to trace the IO, M32 and M64 segments used by PEs
in one particular PHB.



Out of curiosity - have you considered having just 3 arrays, in the PHB,
storing PE numbers, and ditching the PE's arrays? Does a PE itself need to
know what segments it is using? Not sure about the master/slave PEs though.



I don't follow your suggestion. Can you rephrase and explain it a bit more?



Please explain in what situations you need the same map in both PHB and PE
and how you are going to use them. For example, pe::m64_segmap and
phb::m64_segmap.

I believe you need to know what segment is used by what PE, and that's it;
having 2 bitmaps is overcomplicated and hard to follow. Is there anything
else that I am missing?



The situation is the same for all (IO, M32 and M64) segment maps. Taking
m64_segmap as an example, it will be used when creating or destroying a PE
that consumes M64 segments. phb::m64_segmap records the M64 segment usage
in the PHB's domain; it's used to check that the same M64 segment won't be
used twice. pe::m64_segmap tracks the M64 segments consumed by the PE.



You could have a single map in PHB, key would be a segment number and value 
would be PE number. No need to have a map in PE. At all. No need to 
initialize bitmaps, etc.





It would be easier to read patches if this one was right before
[PATCH v6 23/42] powerpc/powernv: Release PEs dynamically



I'll try to reorder the patch, but don't expect too much...





Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com
---
  arch/powerpc/platforms/powernv/pci-ioda.c | 29 +++--
  arch/powerpc/platforms/powernv/pci.h  | 18 ++
  2 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index e4ac703..78b49a1 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -388,6 +388,12 @@ static int pnv_ioda_pick_m64_pe(struct pci_bus *bus, bool all)
	list_add_tail(&pe->list, &master_pe->slaves);
	}

+   /* M64 segments consumed by slave PEs are tracked
+* by master PE
+*/
+   set_bit(pe->pe_number, master_pe->m64_segmap);
+   set_bit(pe->pe_number, phb->ioda.m64_segmap);
+
/* P7IOC supports M64DT, which helps mapping M64 segment
 * to one particular PE#. However, PHB3 has fixed mapping
 * between M64 segment and PE#. In order to have same logic
@@ -2871,9 +2877,11 @@ static void pnv_ioda_setup_pe_seg(struct pci_controller *hose,

	while (index < phb->ioda.total_pe &&
	       region.start <= region.end) {
-		phb->ioda.io_segmap[index] = pe->pe_number;
+		set_bit(index, pe->io_segmap);
+		set_bit(index, phb->ioda.io_segmap);
		rc = opal_pci_map_pe_mmio_window(phb->opal_id,
-			pe->pe_number, OPAL_IO_WINDOW_TYPE, 0, index);
+			pe->pe_number, OPAL_IO_WINDOW_TYPE,
+			0, index);


Unrelated change.



True, will drop. However, checkpatch.pl will complain with: line
exceeding 80 characters.


It will not, as you are not changing these lines; it only complains about changed lines.






   

Re: [PATCH v7 2/6] mm: mlock: Add new mlock system call

2015-08-12 Thread Michal Hocko
On Sun 09-08-15 01:22:52, Eric B Munson wrote:
 With the refactored mlock code, introduce a new system call for mlock.
 The new call will allow the user to specify what lock states are being
 added.  mlock2 is trivial at the moment, but a follow-on patch will add
 a new mlock state, making it useful.

Looks good to me

Acked-by: Michal Hocko mho...@suse.com

 Signed-off-by: Eric B Munson emun...@akamai.com
 Acked-by: Vlastimil Babka vba...@suse.cz
 Cc: Michal Hocko mho...@suse.cz
 Cc: Vlastimil Babka vba...@suse.cz
 Cc: Heiko Carstens heiko.carst...@de.ibm.com
 Cc: Geert Uytterhoeven ge...@linux-m68k.org
 Cc: Catalin Marinas catalin.mari...@arm.com
 Cc: Stephen Rothwell s...@canb.auug.org.au
 Cc: Guenter Roeck li...@roeck-us.net
 Cc: Andrea Arcangeli aarca...@redhat.com
 Cc: linux-al...@vger.kernel.org
 Cc: linux-ker...@vger.kernel.org
 Cc: linux-arm-ker...@lists.infradead.org
 Cc: adi-buildroot-de...@lists.sourceforge.net
 Cc: linux-cris-ker...@axis.com
 Cc: linux-i...@vger.kernel.org
 Cc: linux-m...@lists.linux-m68k.org
 Cc: linux-am33-l...@redhat.com
 Cc: linux-par...@vger.kernel.org
 Cc: linuxppc-dev@lists.ozlabs.org
 Cc: linux-s...@vger.kernel.org
 Cc: linux...@vger.kernel.org
 Cc: sparcli...@vger.kernel.org
 Cc: linux-xte...@linux-xtensa.org
 Cc: linux-...@vger.kernel.org
 Cc: linux-a...@vger.kernel.org
 Cc: linux...@kvack.org
 ---
  arch/x86/entry/syscalls/syscall_32.tbl | 1 +
  arch/x86/entry/syscalls/syscall_64.tbl | 1 +
  include/linux/syscalls.h   | 2 ++
  include/uapi/asm-generic/unistd.h  | 4 +++-
  kernel/sys_ni.c| 1 +
  mm/mlock.c | 8 
  6 files changed, 16 insertions(+), 1 deletion(-)
 
 diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
 index ef8187f..8e06da6 100644
 --- a/arch/x86/entry/syscalls/syscall_32.tbl
 +++ b/arch/x86/entry/syscalls/syscall_32.tbl
 @@ -365,3 +365,4 @@
  356	i386	memfd_create		sys_memfd_create
  357	i386	bpf			sys_bpf
  358	i386	execveat		sys_execveat		stub32_execveat
 +360	i386	mlock2			sys_mlock2
 diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
 index 9ef32d5..67601e7 100644
 --- a/arch/x86/entry/syscalls/syscall_64.tbl
 +++ b/arch/x86/entry/syscalls/syscall_64.tbl
 @@ -329,6 +329,7 @@
  320	common	kexec_file_load		sys_kexec_file_load
  321	common	bpf			sys_bpf
  322	64	execveat		stub_execveat
 +324	common	mlock2			sys_mlock2
  
  #
  # x32-specific system call numbers start at 512 to avoid cache impact
 diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
 index b45c45b..56a3d59 100644
 --- a/include/linux/syscalls.h
 +++ b/include/linux/syscalls.h
 @@ -884,4 +884,6 @@ asmlinkage long sys_execveat(int dfd, const char __user *filename,
   const char __user *const __user *argv,
   const char __user *const __user *envp, int flags);
  
 +asmlinkage long sys_mlock2(unsigned long start, size_t len, int flags);
 +
  #endif
 diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
 index e016bd9..14a6013 100644
 --- a/include/uapi/asm-generic/unistd.h
 +++ b/include/uapi/asm-generic/unistd.h
 @@ -709,9 +709,11 @@ __SYSCALL(__NR_memfd_create, sys_memfd_create)
  __SYSCALL(__NR_bpf, sys_bpf)
  #define __NR_execveat 281
  __SC_COMP(__NR_execveat, sys_execveat, compat_sys_execveat)
 +#define __NR_mlock2 282
 +__SYSCALL(__NR_mlock2, sys_mlock2)
  
  #undef __NR_syscalls
 -#define __NR_syscalls 282
 +#define __NR_syscalls 283
  
  /*
   * All syscalls below here should go away really,
 diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
 index 7995ef5..4818b71 100644
 --- a/kernel/sys_ni.c
 +++ b/kernel/sys_ni.c
 @@ -193,6 +193,7 @@ cond_syscall(sys_mlock);
  cond_syscall(sys_munlock);
  cond_syscall(sys_mlockall);
  cond_syscall(sys_munlockall);
 +cond_syscall(sys_mlock2);
  cond_syscall(sys_mincore);
  cond_syscall(sys_madvise);
  cond_syscall(sys_mremap);
 diff --git a/mm/mlock.c b/mm/mlock.c
 index 5692ee5..3094f27 100644
 --- a/mm/mlock.c
 +++ b/mm/mlock.c
 @@ -643,6 +643,14 @@ SYSCALL_DEFINE2(mlock, unsigned long, start, size_t, len)
   return do_mlock(start, len, VM_LOCKED);
  }
  
 +SYSCALL_DEFINE3(mlock2, unsigned long, start, size_t, len, int, flags)
 +{
 + if (flags)
 + return -EINVAL;
 +
 + return do_mlock(start, len, VM_LOCKED);
 +}
 +
  SYSCALL_DEFINE2(munlock, unsigned long, start, size_t, len)
  {
   int ret;
 -- 
 1.9.1

-- 
Michal Hocko
SUSE Labs
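(As a quick usage illustration - not part of the patch - the new call can
be exercised from userspace via syscall(2), since there is no glibc wrapper
yet; the syscall number below is the x86_64 one from the quoted table, and
flags must be 0 until the follow-on patch adds a lock state:)

#include <stdio.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_mlock2
#define __NR_mlock2 324		/* x86_64 number from the quoted table */
#endif

int main(void)
{
	size_t len = 4096;
	void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (buf == MAP_FAILED)
		return 1;

	/* flags must be 0 for now; anything else returns -EINVAL. */
	if (syscall(__NR_mlock2, buf, len, 0) != 0) {
		perror("mlock2");
		return 1;
	}
	return 0;
}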

Re: [PATCH] powerpc/xmon: Allow limiting the size of the paca display

2015-08-12 Thread Michael Ellerman
On Wed, 2015-08-12 at 13:24 +0530, Anshuman Khandual wrote:
 On 08/12/2015 12:27 PM, Michael Ellerman wrote:
@@ -2090,9 +2092,12 @@ static void dump_one_paca(int cpu)
	printf(" %-*s = %s\n", 16, "present", cpu_present(cpu) ? "yes" : "no");
	printf(" %-*s = %s\n", 16, "online", cpu_online(cpu) ? "yes" : "no");
 
+	i = 4; /* We always print the first four lines */
+
 #define DUMP(paca, name, format) \
-	printf(" %-*s = %#-*format\t(0x%lx)\n", 16, #name, 18, paca->name, \
-		offsetof(struct paca_struct, name));
+	if (!num_lines || i++ < num_lines)
 
All looks good except for the fact that we are using 0 to signify that
there is no limit to the number of lines. Isn't that a bit confusing?

Maybe a bit, but it's the simplest option, and it's not uncommon for zero to
mean no limit. chage(1) springs to mind as an example.

And in terms of xmon it's hardly the most confusing part :)

cheers



Re: [PATCH 30/31] intel-iommu: handle page-less SG entries

2015-08-12 Thread David Woodhouse
On Wed, 2015-08-12 at 09:05 +0200, Christoph Hellwig wrote:
 Just remove a BUG_ON, the code handles them just fine as-is.
 
 Signed-off-by: Christoph Hellwig h...@lst.de

Acked-by: David Woodhouse david.woodho...@intel.com

-- 
David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation


smime.p7s
Description: S/MIME cryptographic signature

Re: [PATCH 31/31] dma-mapping-common: skip kmemleak checks for page-less SG entries

2015-08-12 Thread Linus Torvalds
On Wed, Aug 12, 2015 at 12:05 AM, Christoph Hellwig h...@lst.de wrote:
 +   for_each_sg(sg, s, nents, i) {
 +   if (sg_has_page(s))
 +   kmemcheck_mark_initialized(sg_virt(s), s->length);
 +   }

[ Again, I'm responding to one random patch - this pattern was in
other patches too.  ]

A question: do we actually expect to mix page-less and pageful SG
entries in the same SG list?

How does that happen?

(I'm not saying it can't, I'm just wondering where people expect this
to happen).

IOW, maybe it would be valid to have a rule saying "an SG list is
either all pageful or pageless, never mixed", and then have the if
statement outside the loop rather than inside.

  Linus
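(Concretely, with such a rule the predicate could be checked once per list
instead of per entry - a sketch against the quoted hunk, assuming the
first entry is representative of the whole list:)

	if (sg_has_page(sg)) {
		for_each_sg(sg, s, nents, i)
			kmemcheck_mark_initialized(sg_virt(s), s->length);
	}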

Re: [PATCH 29/31] parisc: handle page-less SG entries

2015-08-12 Thread Linus Torvalds
On Wed, Aug 12, 2015 at 12:05 AM, Christoph Hellwig h...@lst.de wrote:
 Make all cache invalidation conditional on sg_has_page() and use
 sg_phys to get the physical address directly.

So this worries me a bit (I'm just reacting to one random patch in the series).

The reason?

I think this wants a big honking comment somewhere saying "non-sg_page
accesses are not necessarily cache coherent".

Now, I don't think that's _wrong_, but it's an important distinction:
if you look up pages in the page tables directly, there's a very
subtle difference between then saving just the pfn and saving the
struct page of the result.

On sane architectures, this whole cache flushing thing doesn't matter.
Which just means that it's going to be even more subtle on the odd
broken ones..

I'm assuming that anybody who wants to use the page-less
scatter-gather lists always does so on memory that isn't actually
virtually mapped at all, or only does so on sane architectures that
are cache coherent at a physical level, but I'd like that assumption
*documented* somewhere.

(And maybe it is, and I just didn't get to that patch yet)

   Linus
