Re: pci and pcie device-tree binding - range No cells

2012-12-12 Thread Michal Simek

On 12/10/2012 10:41 PM, Grant Likely wrote:

On Mon, 10 Dec 2012 09:21:51 -0600, Rob Herring robherri...@gmail.com wrote:

On 12/10/2012 09:05 AM, Michal Simek wrote:

On 12/10/2012 03:26 PM, Rob Herring wrote:

On 12/10/2012 06:20 AM, Michal Simek wrote:

Hi Grant and others,

I have a question regarding the number of cells in the ranges property
for pci and pcie nodes.

Linux pci/pcie powerpc DTSes use a 7-cell format (xpedite5370.dts,
sequoia.dts, etc.)
but also a 6-cell format (mpc832x_mds.dts).

The 6-cell ranges format is shown and described here:
http://devicetree.org/Device_Tree_Usage#PCI_Host_Bridge

And also in the Linux documentation:
Documentation/devicetree/bindings/pci/83xx-512x-pci.txt

Both formats use:
#size-cells = <2>;
#address-cells = <3>;

Which format is valid?


Both. 7 cells are valid when the host (parent) bus is 64-bit and 6 cells
are valid when the host bus is 32-bit. The ranges property is
<child address> <parent address> <size>. The number of cells in the
parent address (#address-cells) is taken from the parent node.
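
A minimal sketch of the arithmetic behind the 6-cell and 7-cell formats. The helper names are hypothetical and not taken from the kernel; the only inputs are the #address-cells and #size-cells values discussed above.

```c
#include <stddef.h>

/*
 * Number of 32-bit cells in one "ranges" entry of a bus node:
 *   child address cells + parent address cells + child size cells.
 */
static size_t ranges_entry_cells(size_t child_addr_cells,
                                 size_t parent_addr_cells,
                                 size_t child_size_cells)
{
    return child_addr_cells + parent_addr_cells + child_size_cells;
}

/*
 * Hypothetical convenience wrapper for a PCI host bridge node, where the
 * child side is fixed at #address-cells = <3> and #size-cells = <2>.
 */
static size_t pci_ranges_entry_cells(size_t parent_addr_cells)
{
    return ranges_entry_cells(3, parent_addr_cells, 2);
}
```

With a 32-bit parent bus (#address-cells = <1>) this gives 6 cells per entry; with a 64-bit parent bus (#address-cells = <2>) it gives 7.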


Ok. Got it.

Here is what we use on zynq and microblaze - both 32bit which should be
fine.

 ps7_axi_interconnect_0: axi@0 {
     #address-cells = <1>;
     #size-cells = <1>;
     axi_pcie_0: axi-pcie@5000 {
         #address-cells = <3>;
         #size-cells = <2>;
         compatible = "xlnx,axi-pcie-1.05.a";
         ranges = < 0x0200 0 0x6000 0x6000 0 0x1000 >;
         ...
     }
 }

What I am wondering about is pci_process_bridge_OF_ranges() in
arch/powerpc/kernel/pci-common.c,
which uses some hardcoded values that should probably be
loaded from the device tree.

For example:
683 int np = pna + 5;
...
702 pci_addr = of_read_number(ranges + 1, 2);
703 cpu_addr = of_translate_address(dev, ranges + 3);
704 size = of_read_number(ranges + pna + 3, 2);


These would always be correct whether you have 6 or 7 cells. pna is the
parent bus address cells size. The pci address is fixed at 3 cells.
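
A minimal user-space sketch of why the fixed offsets work for both formats. read_number() below is a stand-in that mimics what of_read_number() does on cells already in host byte order, and struct pci_range is a hypothetical container. Unlike the kernel code, the sketch reads the raw parent address and does not call of_translate_address().

```c
#include <stdint.h>

/* Stand-in for of_read_number(): fold `size` 32-bit cells into one value. */
static uint64_t read_number(const uint32_t *cell, int size)
{
    uint64_t r = 0;

    while (size--)
        r = (r << 32) | *cell++;
    return r;
}

/* Hypothetical decoded form of one PCI "ranges" entry. */
struct pci_range {
    uint32_t flags;     /* phys.hi: address space type and flags */
    uint64_t pci_addr;  /* phys.mid:phys.lo */
    uint64_t cpu_addr;  /* parent bus address, not translated here */
    uint64_t size;
};

/*
 * Layout of one entry: 3 PCI address cells, then `pna` parent address
 * cells, then 2 size cells. That is pna + 5 cells in total.
 */
static void parse_pci_range(const uint32_t *ranges, int pna,
                            struct pci_range *out)
{
    out->flags    = ranges[0];
    out->pci_addr = read_number(ranges + 1, 2);
    out->cpu_addr = read_number(ranges + 3, pna);
    out->size     = read_number(ranges + 3 + pna, 2);
}
```

Only the parent address width (pna) varies, so the PCI address is always at offset 1 and the size always starts at offset pna + 3.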




Unfortunately we have copied it to microblaze.


I look at the PCI DT code in powerpc and see a whole bunch of code that
seems like it should be common. The different per arch pci structs
complicates that. No one has really gotten to looking at PCI DT on ARM
yet except you and Thierry for Tegra. We definitely don't want to create
a 3rd copy. Starting the process of moving it to something like
drivers/pci/pci-of.c would be great.


A lot of it should be common. The microblaze code is a copy of the
powerpc version. I'll strongly nack any attempt to add a third!  :-)


Yes, it is. There are some things which we had to fix because the powerpc
port is big endian only and we support PCIe on little endian too.
But the changes are really cosmetic.



drivers/pci/pci-of.c would be good. I'd also accept drivers/of/pci.c
which might actually be a good idea in the short term so that it gets
appropriate supervision while being generalized before being moved into
the pci directory.


Ben: Are you willing to move that ppc code to this location?
It is probably not a good idea for me to do it when I don't even have
hardware available for testing (asking someone else).

Thanks,
Michal

--
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel 2.6 Microblaze Linux - http://www.monstr.eu/fdt/
Microblaze U-BOOT custodian
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: pci and pcie device-tree binding - range No cells

2012-12-12 Thread Grant Likely
On Wed, Dec 12, 2012 at 10:37 AM, Michal Simek mon...@monstr.eu wrote:
> On 12/10/2012 10:41 PM, Grant Likely wrote:
>> drivers/pci/pci-of.c would be good. I'd also accept drivers/of/pci.c
>> which might actually be a good idea in the short term so that it gets
>> appropriate supervision while being generalized before being moved into
>> the pci directory.
>
> Ben: Are you willing to move that ppc code to this location?
> It is probably not a good idea for me to do it when I don't even have
> hardware available for testing (asking someone else).

You're a clever guy, you are more than capable of crafting the patch,
even if you can't test on hardware. :-)

I refactored most of the OF support code without having access to most
of the affected hardware. Once I got the changes out there for review
I also asked for spot testing before getting it into linux-next for
even more testing.

g.


[PATCH] vfio powerpc: enabled on powernv platform

2012-12-12 Thread Alexey Kardashevskiy
This patch initializes IOMMU groups based on the IOMMU
configuration discovered during the PCI scan on POWERNV
(POWER non virtualized) platform. The IOMMU groups are
to be used later by VFIO driver (PCI pass through).

It also implements an API for mapping/unmapping pages for
guest PCI drivers and providing DMA window properties.
This API is going to be used later by QEMU-VFIO to handle
h_put_tce hypercalls from the KVM guest.

Although this driver has been tested only on the POWERNV
platform, it should work on any platform which supports
TCE tables.

To enable VFIO on POWER, enable SPAPR_TCE_IOMMU config
option and configure VFIO as required.

Cc: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 arch/powerpc/include/asm/iommu.h     |   10 ++
 arch/powerpc/kernel/iommu.c          |  329 ++
 arch/powerpc/platforms/powernv/pci.c |  134 ++
 drivers/iommu/Kconfig                |    8 +
 4 files changed, 481 insertions(+)

diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index cbfe678..3c861ae 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -76,6 +76,9 @@ struct iommu_table {
    struct iommu_pool large_pool;
    struct iommu_pool pools[IOMMU_NR_POOLS];
    unsigned long *it_map;   /* A simple allocation bitmap for now */
+#ifdef CONFIG_IOMMU_API
+   struct iommu_group *it_group;
+#endif
 };
 
 struct scatterlist;
@@ -147,5 +150,12 @@ static inline void iommu_restore(void)
 }
 #endif
 
+extern void iommu_reset_table(struct iommu_table *tbl, bool restore);
+extern long iommu_clear_tces(struct iommu_table *tbl, unsigned long ioba,
+   unsigned long size);
+extern long iommu_put_tces(struct iommu_table *tbl, unsigned long ioba,
+   uint64_t tce, enum dma_data_direction direction,
+   unsigned long size);
+
 #endif /* __KERNEL__ */
 #endif /* _ASM_IOMMU_H */
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index ff5a6ce..f3bb2e7 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -36,6 +36,7 @@
 #include <linux/hash.h>
 #include <linux/fault-inject.h>
 #include <linux/pci.h>
+#include <linux/uaccess.h>
 #include <asm/io.h>
 #include <asm/prom.h>
 #include <asm/iommu.h>
@@ -44,6 +45,7 @@
 #include <asm/kdump.h>
 #include <asm/fadump.h>
 #include <asm/vio.h>
+#include <asm/tce.h>
 
 #define DBG(...)
 
@@ -856,3 +858,330 @@ void iommu_free_coherent(struct iommu_table *tbl, size_t size,
        free_pages((unsigned long)vaddr, get_order(size));
    }
 }
+
+#ifdef CONFIG_IOMMU_API
+/*
+ * SPAPR TCE API
+ */
+
+struct vwork {
+   struct mm_struct    *mm;
+   long                npage;
+   struct work_struct  work;
+};
+
+/* delayed decrement/increment for locked_vm */
+static void lock_acct_bg(struct work_struct *work)
+{
+   struct vwork *vwork = container_of(work, struct vwork, work);
+   struct mm_struct *mm;
+
+   mm = vwork->mm;
+   down_write(&mm->mmap_sem);
+   mm->locked_vm += vwork->npage;
+   up_write(&mm->mmap_sem);
+   mmput(mm);
+   kfree(vwork);
+}
+
+static void lock_acct(long npage)
+{
+   struct vwork *vwork;
+   struct mm_struct *mm;
+
+   if (!current->mm)
+       return; /* process exited */
+
+   if (down_write_trylock(&current->mm->mmap_sem)) {
+       current->mm->locked_vm += npage;
+       up_write(&current->mm->mmap_sem);
+       return;
+   }
+
+   /*
+    * Couldn't get mmap_sem lock, so must setup to update
+    * mm->locked_vm later. If locked_vm were atomic, we
+    * wouldn't need this silliness
+    */
+   vwork = kmalloc(sizeof(struct vwork), GFP_KERNEL);
+   if (!vwork)
+       return;
+   mm = get_task_mm(current);
+   if (!mm) {
+       kfree(vwork);
+       return;
+   }
+   INIT_WORK(&vwork->work, lock_acct_bg);
+   vwork->mm = mm;
+   vwork->npage = npage;
+   schedule_work(&vwork->work);
+}
+
+/*
+ * iommu_reset_table is called when it started/stopped being used.
+ *
+ * restore==true says to bring the iommu_table into the state as it was
+ * before being used by VFIO.
+ */
+void iommu_reset_table(struct iommu_table *tbl, bool restore)
+{
+   /* Page#0 is marked as used in iommu_init_table, so we clear it... */
+   if (!restore && (tbl->it_offset == 0))
+       clear_bit(0, tbl->it_map);
+
+   iommu_clear_tces(tbl, tbl->it_offset, tbl->it_size);
+
+   /* ... or restore  */
+   if (restore && (tbl->it_offset == 0))
+       set_bit(0, tbl->it_map);
+}
+EXPORT_SYMBOL_GPL(iommu_reset_table);
+
+/*
+ * Returns the number of used IOMMU pages (4K) within
+ * the same system page (4K or 64K).
+ *
+ * syspage_weight_zero is optimized for expected case == 0
+ * syspage_weight_one is optimized for expected case  1
+ * Other case are not used in 

[PATCH] vfio powerpc: implemented IOMMU driver for VFIO

2012-12-12 Thread Alexey Kardashevskiy
VFIO implements platform independent stuff such as
a PCI driver, BAR access (via read/write on a file descriptor
or direct mapping when possible) and IRQ signaling.

The platform dependent part includes IOMMU initialization
and handling. This patch implements an IOMMU driver for VFIO
which does mapping/unmapping pages for the guest IO and
provides information about DMA window (required by a POWERPC
guest).

The counterpart in QEMU is required to support this functionality.

Cc: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 drivers/vfio/Kconfig                |    6 +
 drivers/vfio/Makefile               |    1 +
 drivers/vfio/vfio_iommu_spapr_tce.c |  249 +++
 include/linux/vfio.h                |   31 +
 4 files changed, 287 insertions(+)
 create mode 100644 drivers/vfio/vfio_iommu_spapr_tce.c

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index 7cd5dec..b464687 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -3,10 +3,16 @@ config VFIO_IOMMU_TYPE1
    depends on VFIO
    default n
 
+config VFIO_IOMMU_SPAPR_TCE
+   tristate
+   depends on VFIO && SPAPR_TCE_IOMMU
+   default n
+
 menuconfig VFIO
    tristate "VFIO Non-Privileged userspace driver framework"
    depends on IOMMU_API
    select VFIO_IOMMU_TYPE1 if X86
+   select VFIO_IOMMU_SPAPR_TCE if PPC_POWERNV
    help
      VFIO provides a framework for secure userspace device drivers.
      See Documentation/vfio.txt for more details.
diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index 2398d4a..72bfabc 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -1,3 +1,4 @@
 obj-$(CONFIG_VFIO) += vfio.o
 obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
+obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
 obj-$(CONFIG_VFIO_PCI) += pci/
diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
new file mode 100644
index 000..714bf57
--- /dev/null
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -0,0 +1,249 @@
+/*
+ * VFIO: IOMMU DMA mapping support for TCE on POWER
+ *
+ * Copyright (C) 2012 IBM Corp.  All rights reserved.
+ * Author: Alexey Kardashevskiy a...@ozlabs.ru
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Derived from original vfio_iommu_type1.c:
+ * Copyright (C) 2012 Red Hat, Inc.  All rights reserved.
+ * Author: Alex Williamson alex.william...@redhat.com
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <linux/err.h>
+#include <linux/vfio.h>
+#include <asm/iommu.h>
+
+#define DRIVER_VERSION  "0.1"
+#define DRIVER_AUTHOR   "a...@ozlabs.ru"
+#define DRIVER_DESC     "VFIO IOMMU SPAPR TCE"
+
+static void tce_iommu_detach_group(void *iommu_data,
+   struct iommu_group *iommu_group);
+
+/*
+ * VFIO IOMMU fd for SPAPR_TCE IOMMU implementation
+ *
+ * This code handles mapping and unmapping of user data buffers
+ * into DMA'ble space using the IOMMU
+ */
+
+/*
+ * The container descriptor supports only a single group per container.
+ * Required by the API as the container is not supplied with the IOMMU group
+ * at the moment of initialization.
+ */
+struct tce_container {
+   struct mutex lock;
+   struct iommu_table *tbl;
+};
+
+static void *tce_iommu_open(unsigned long arg)
+{
+   struct tce_container *container;
+
+   if (arg != VFIO_SPAPR_TCE_IOMMU) {
+       pr_err("tce_vfio: Wrong IOMMU type\n");
+       return ERR_PTR(-EINVAL);
+   }
+
+   container = kzalloc(sizeof(*container), GFP_KERNEL);
+   if (!container)
+       return ERR_PTR(-ENOMEM);
+
+   mutex_init(&container->lock);
+
+   return container;
+}
+
+static void tce_iommu_release(void *iommu_data)
+{
+   struct tce_container *container = iommu_data;
+
+   WARN_ON(container->tbl && !container->tbl->it_group);
+   if (container->tbl && container->tbl->it_group)
+       tce_iommu_detach_group(iommu_data, container->tbl->it_group);
+
+   mutex_destroy(&container->lock);
+
+   kfree(container);
+}
+
+static long tce_iommu_ioctl(void *iommu_data,
+unsigned int cmd, unsigned long arg)
+{
+   struct tce_container *container = iommu_data;
+   unsigned long minsz;
+   long ret;
+
+   switch (cmd) {
+   case VFIO_CHECK_EXTENSION:
+       return (arg == VFIO_SPAPR_TCE_IOMMU) ? 1 : 0;
+
+   case VFIO_IOMMU_SPAPR_TCE_GET_INFO: {
+       struct vfio_iommu_spapr_tce_info info;
+       struct iommu_table *tbl = container->tbl;
+
+       if (WARN_ON(!tbl))
+           return -ENXIO;
+
+       minsz = offsetofend(struct vfio_iommu_spapr_tce_info,
+   

Re: [PATCH] vfio powerpc: enabled on powernv platform

2012-12-12 Thread Alexey Kardashevskiy

Hi Alex,

I posted another pair of patches. While debugging and testing my stuff I
implemented a rough hack to support IOMMU mappings without passing those
hypercalls to QEMU; this is why I moved pieces of code around - I want to
support both QEMU-VFIO and a kernel-optimized H_PUT_TCE handler.




On 12/12/12 23:34, Alexey Kardashevskiy wrote:

This patch initializes IOMMU groups based on the IOMMU
configuration discovered during the PCI scan on POWERNV
(POWER non virtualized) platform. The IOMMU groups are
to be used later by VFIO driver (PCI pass through).

It also implements an API for mapping/unmapping pages for
guest PCI drivers and providing DMA window properties.
This API is going to be used later by QEMU-VFIO to handle
h_put_tce hypercalls from the KVM guest.

Although this driver has been tested only on the POWERNV
platform, it should work on any platform which supports
TCE tables.

To enable VFIO on POWER, enable SPAPR_TCE_IOMMU config
option and configure VFIO as required.

Cc: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---



--
Alexey


Re: [PATCH] vfio powerpc: enabled on powernv platform

2012-12-12 Thread Alex Williamson
On Wed, 2012-12-12 at 17:14 +1100, Alexey Kardashevskiy wrote:
> On 08/12/12 04:38, Alex Williamson wrote:
> > > +static int __init tce_iommu_init(void)
> > > +{
> > > +   struct pci_dev *pdev = NULL;
> > > +   struct iommu_table *tbl;
> > > +   struct iommu_group *grp;
> > > +
> > > +   /* Allocate and initialize IOMMU groups */
> > > +   for_each_pci_dev(pdev) {
> > > +       tbl = get_iommu_table_base(&pdev->dev);
> > > +       if (!tbl)
> > > +           continue;
> > > +
> > > +       /* Skip already initialized */
> > > +       if (tbl->it_group)
> > > +           continue;
> > > +
> > > +       grp = iommu_group_alloc();
> > > +       if (IS_ERR(grp)) {
> > > +           pr_info("tce_vfio: cannot create new IOMMU group, ret=%ld\n",
> > > +                   PTR_ERR(grp));
> > > +           return PTR_ERR(grp);
> > > +       }
> > > +       tbl->it_group = grp;
> > > +       iommu_group_set_iommudata(grp, tbl, group_release);
> >
> > BTW, groups have a name property that shows up in sysfs that can be set
> > with iommu_group_set_name().  IIRC, this was a feature David requested
> > for PEs.  It'd be nice if it was used for PEs...  Thanks,
>
> But what would I put there?... IOMMU ID is more than enough at the moment
> and struct iommu_table does not have anything what would have made sense to
> show in the sysfs...

I believe David mentioned that PEs had user visible names.  Perhaps they
match an enclosure location or something.  Group numbers are rather
arbitrary and really have no guarantee of persistence.  Thanks,

Alex




Re: [PATCH] vfio powerpc: implemented IOMMU driver for VFIO

2012-12-12 Thread Alex Williamson
On Wed, 2012-12-12 at 17:59 +1100, Alexey Kardashevskiy wrote:
> On 08/12/12 04:01, Alex Williamson wrote:
> > > +   case VFIO_IOMMU_MAP_DMA: {
> > > +       vfio_iommu_spapr_tce_dma_map param;
> > > +       struct iommu_table *tbl = container->tbl;
> > > +       enum dma_data_direction direction;
> > > +       unsigned long locked, lock_limit;
> > > +
> > > +       if (WARN_ON(!tbl))
> > > +           return -ENXIO;
> > > +
> > > +       minsz = offsetofend(vfio_iommu_spapr_tce_dma_map, size);
> > > +
> > > +       if (copy_from_user(&param, (void __user *)arg, minsz))
> > > +           return -EFAULT;
> > > +
> > > +       if (param.argsz < minsz)
> > > +           return -EINVAL;
> > > +
> > > +       if ((param.flags & VFIO_DMA_MAP_FLAG_READ) &&
> > > +               (param.flags & VFIO_DMA_MAP_FLAG_WRITE))
> > > +           direction = DMA_BIDIRECTIONAL;
> > > +       else if (param.flags & VFIO_DMA_MAP_FLAG_READ)
> > > +           direction = DMA_TO_DEVICE;
> > > +       else if (param.flags & VFIO_DMA_MAP_FLAG_WRITE)
> > > +           direction = DMA_FROM_DEVICE;
> > > +       else
> > > +           return -EINVAL;
> >
> > flags needs to be sanitized too.  Return EINVAL if any unknown bit is
> > set or else sloppy users may make it very difficult to make use of those
> > flag bits later.
>
> It already returns -EINVAL on any bit set except READ/WRITE, no?

No.  I could pass flags ~0 through there to get a read/write mapping and
cause you problems if you later want to define another bit.  Thanks,
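
A minimal sketch of the sanitization being asked for, written as plain user-space C. The MAP_FLAG_* constants and enum map_direction are local stand-ins chosen for illustration, not the definitions from include/linux/vfio.h. The point is the first check: any bit outside the known mask is rejected before the direction is derived.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the READ/WRITE map flags. */
#define MAP_FLAG_READ   (1u << 0)
#define MAP_FLAG_WRITE  (1u << 1)
#define MAP_FLAGS_KNOWN (MAP_FLAG_READ | MAP_FLAG_WRITE)

/* Hypothetical stand-in for enum dma_data_direction. */
enum map_direction { MAP_TO_DEVICE, MAP_FROM_DEVICE, MAP_BIDIRECTIONAL };

static int flags_to_direction(uint32_t flags, enum map_direction *dir)
{
    /* Reject unknown bits so they stay free for future use. */
    if (flags & ~MAP_FLAGS_KNOWN)
        return -EINVAL;

    switch (flags) {
    case MAP_FLAG_READ | MAP_FLAG_WRITE:
        *dir = MAP_BIDIRECTIONAL;
        return 0;
    case MAP_FLAG_READ:
        *dir = MAP_TO_DEVICE;
        return 0;
    case MAP_FLAG_WRITE:
        *dir = MAP_FROM_DEVICE;
        return 0;
    default:
        return -EINVAL; /* neither READ nor WRITE requested */
    }
}
```

Without the mask check, a caller passing ~0 would be treated as READ|WRITE, which is the case described above.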

Alex



Re: pci and pcie device-tree binding - range No cells

2012-12-12 Thread Thomas Petazzoni
Dear Rob Herring,

On Mon, 10 Dec 2012 17:24:44 -0600, Rob Herring wrote:

> > Marvell SoCs have up to 20 configurable address windows, which allow
> > you, at run time, to say I would like the range from physical
> > address 0x to 0x to correspond to the PCIe device
> > in port 1, lane 2, or to the NAND, or to this or that device.
> > Therefore, in the PCIe driver I proposed for the Armada 370/XP SoCs
> > [1], there is no need to encode all those ranges statically in the
> > DT.
>
> That's not a unique feature. I'm not sure if any powerpc systems do
> that though.

Yes, probably not a unique feature.

> > The only ranges property I'm using is to allow the DT sub-nodes
> > describing each PCIe port/lane to access the CPU registers that
> > allow to see if the PCIe link is up or down, access the PCI
> > configuration space and so on. So all ranges in my ranges
> > property correspond to normal CPU registers, like the one you would
> > put in the reg property for any device. The fact that those
> > devices are PCIe is really orthogonal here.
>
> That doesn't really sound right.

Very likely, but I still don't get what is the right way.

> I don't think deviating from the normal binding is the right approach.
> Perhaps the host driver should fill in the ranges property with the
> addresses it uses. Then any child devices will get the right address
> translation.

I don't really understand what you mean here. If you look at the host
driver code (arch/arm/mach-mvebu/pcie.c), for each PCIe interface
it simply does:

 * Create an address decoding window for the memory BAR
 * Create an address decoding window for the I/O BAR
 * Associate the memory BAR window address and the I/O bar window
   address with the PCIe interface

And that's it. See
https://github.com/MISL-EBU-System-SW/mainline-public/blob/marvell-pcie-v1/arch/arm/mach-mvebu/pcie.c#L107.

So this driver both decides the physical addresses for each PCIe
interface and associates them with the PCIe interfaces. How is
it useful to feed some addresses back into the Device Tree?

> Also, while the h/w may support practically any config, there are
> practical constraints of what Linux will use like there's no reason to
> support more than 64K i/o space. PCI memory addresses generally start
> at 0x10. You probably don't need more than 1 memory window per
> root complex (although prefetchable memory may also be needed).

I allocate one 64K I/O window and one memory window per PCIe interface
whose link is up (i.e a PCIe device is connected).
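
A minimal sketch of that allocation policy, not the actual mach-mvebu driver code. struct window, struct pcie_port, assign_windows() and the pool base addresses passed by the caller are all hypothetical; the only facts taken from the description above are the 64K I/O window and the rule that only interfaces with the link up get windows.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IO_WINDOW_SIZE 0x10000u /* 64K of I/O space per interface */

struct window {
    uint64_t base;
    uint64_t size;
};

struct pcie_port {
    bool link_up;
    struct window io;
    struct window mem;
};

/*
 * Give one I/O window and one memory window to every port whose link is
 * up, carving them sequentially from two caller-supplied pools. Ports
 * with the link down consume no physical address space.
 * Returns the number of ports that received windows.
 */
static size_t assign_windows(struct pcie_port *ports, size_t nports,
                             uint64_t io_pool_base, uint64_t mem_pool_base,
                             uint64_t mem_window_size)
{
    size_t configured = 0;

    for (size_t i = 0; i < nports; i++) {
        if (!ports[i].link_up)
            continue;

        ports[i].io.base  = io_pool_base + configured * IO_WINDOW_SIZE;
        ports[i].io.size  = IO_WINDOW_SIZE;
        ports[i].mem.base = mem_pool_base + configured * mem_window_size;
        ports[i].mem.size = mem_window_size;
        configured++;
    }
    return configured;
}
```

Because the windows are packed in discovery order, the address given to a port depends on which other links are up, which is why such addresses cannot be written into the DT ahead of time.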

> You could let the DT settings drive the address window configuration.

No, because I don't want to have absolute addresses for the windows: I
have 10 PCIe interfaces, but often, only a few of them are used. So I
don't want to over-allocate hundreds of MB of physical address space in
the Device Tree if it's not useful.

PCIe is dynamic, address window configuration is dynamic. And we should
hardcode all this configuration statically in the DT? Doesn't seem like
the right solution.

Best regards,

Thomas
-- 
Thomas Petazzoni, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com


Re: pci and pcie device-tree binding - range No cells

2012-12-12 Thread Michal Simek

On 12/12/2012 11:49 AM, Grant Likely wrote:

On Wed, Dec 12, 2012 at 10:37 AM, Michal Simek mon...@monstr.eu wrote:

On 12/10/2012 10:41 PM, Grant Likely wrote:

drivers/pci/pci-of.c would be good. I'd also accept drivers/of/pci.c
which might actually be a good idea in the short term so that it gets
appropriate supervision while being generalized before being moved into
the pci directory.


Ben: Are you willing to move that ppc code to this location?
It is probably not a good idea for me to do it when I don't even have
hardware available for testing (asking someone else).


You're a clever guy, you are more than capable of crafting the patch,
even if you can't test on hardware. :-)

I refactored most of the OF support code without having access to most
of the affected hardware. Once I got the changes out there for review
I also asked for spot testing before getting it into linux-next for
even more testing.


Fair enough. :-)

Good time to start to look for how to work with board farm.

Thanks,
Michal


--
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel 2.6 Microblaze Linux - http://www.monstr.eu/fdt/
Microblaze U-BOOT custodian


Re: pci and pcie device-tree binding - range No cells

2012-12-12 Thread Grant Likely
On Wed, Dec 12, 2012 at 4:16 PM, Thomas Petazzoni
thomas.petazz...@free-electrons.com wrote:
> Dear Rob Herring,
>
> On Mon, 10 Dec 2012 17:24:44 -0600, Rob Herring wrote:
>
>>> Marvell SoCs have up to 20 configurable address windows, which allow
>>> you, at run time, to say I would like the range from physical
>>> address 0x to 0x to correspond to the PCIe device
>>> in port 1, lane 2, or to the NAND, or to this or that device.
>>> Therefore, in the PCIe driver I proposed for the Armada 370/XP SoCs
>>> [1], there is no need to encode all those ranges statically in the
>>> DT.
>>
>> That's not a unique feature. I'm not sure if any powerpc systems do
>> that though.
>
> Yes, probably not a unique feature.
>
>>> The only ranges property I'm using is to allow the DT sub-nodes
>>> describing each PCIe port/lane to access the CPU registers that
>>> allow to see if the PCIe link is up or down, access the PCI
>>> configuration space and so on. So all ranges in my ranges
>>> property correspond to normal CPU registers, like the one you would
>>> put in the reg property for any device. The fact that those
>>> devices are PCIe is really orthogonal here.
>>
>> That doesn't really sound right.
>
> Very likely, but I still don't get what is the right way.

Hi Thomas,

I just went and looked at your binding. Here's the snippet I found interesting:

pcie-controller {
+       compatible = "marvell,armada-370-xp-pcie";
+       status = "disabled";
+       #address-cells = <1>;
+       #size-cells = <1>;
+       ranges = <0       0xd004 0x2000 /* port0x1_port0 */
+                 0x2000  0xd0042000 0x2000 /* port2x1_port0 */
+                 0x4000  0xd0044000 0x2000 /* port0x1_port1 */
+                 0x8000  0xd0048000 0x2000 /* port0x1_port2 */
+                 0xC000  0xd004C000 0x2000 /* port0x1_port3 */
+                 0x1 0xd008 0x2000 /* port1x1_port0 */
+                 0x12000 0xd0082000 0x2000 /* port3x1_port0 */
+                 0x14000 0xd0084000 0x2000 /* port1x1_port1 */
+                 0x18000 0xd0088000 0x2000 /* port1x1_port2 */
+                 0x1C000 0xd008C000 0x2000 /* port1x1_port3 */>;
+
+       pcie0.0@0xd004 {
+               reg = <0x0 0x2000>;
+               interrupts = <58>;
+               clocks = <&gateclk 5>;
+               marvell,pcie-port = <0>;
+               marvell,pcie-lane = <0>;
+               status = "disabled";
+       };
+
+       pcie0.1@0xd0044000 {
+               reg = <0x4000 0x2000>;
+               interrupts = <59>;
+               clocks = <&gateclk 5>;
+               marvell,pcie-port = <0>;
+               marvell,pcie-lane = <1>;
+               status = "disabled";
+       };
[... rest trimmed for brevity]

You're right, if you're doing dynamic allocation of windows, then you
really don't need to have a ranges property. However, the way the PCI
node is set up definitely looks incorrect.

PCI already has a very specific binding for pci host controller nodes.
First, #address-cells=<3>; #size-cells=<2>; and device_type="pcie"
must be there. You don't want to break this. You can find details on
the pci and pci-express binding here:
http://www.openfirmware.org/1275/bindings/pci/pci2_1.pdf
http://www.openfirmware.org/1275/bindings/pci/pci-express.txt

For the child nodes, PCI is a discoverable bus, so normally I wouldn't
expect to see child nodes at all when using a dtb. The only time nodes
should be populated is when a device has non-discoverable
characteristics. In your example above you do have some additional
data, but I don't know enough about pci-express to say how best to
encode them or whether they are needed at all. Ben might have some
thoughts on this.

When the PCI child nodes are present, it is important to stick with
the established PCI addressing scheme which uses 3 cells for
addressing. The first entry in the reg property must represent the
configuration space so that DT nodes can be matched up with discovered
devices. There is no requirement to include mappings for the memory
and io regions if the host controller can assign them dynamically.
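
A minimal sketch of how the first cell (phys.hi) of a 3-cell PCI address encodes a configuration-space location, following the bit layout in the PCI bus binding referenced above (npt000ss bbbbbbbb dddddfff rrrrrrrr). The function names are hypothetical and chosen for illustration.

```c
#include <stdint.h>

/*
 * Build phys.hi for a configuration-space address. The space code "ss"
 * in bits 25:24 is 00 for config space, so only bus, device, function
 * and register number are filled in.
 */
static uint32_t pci_config_phys_hi(uint8_t bus, uint8_t dev, uint8_t fn,
                                   uint8_t reg)
{
    return ((uint32_t)bus << 16) |
           ((uint32_t)(dev & 0x1f) << 11) |
           ((uint32_t)(fn & 0x07) << 8) |
           reg;
}

/*
 * Extract the space code: 0 = config, 1 = I/O, 2 = 32-bit memory,
 * 3 = 64-bit memory.
 */
static unsigned int pci_phys_hi_space(uint32_t phys_hi)
{
    return (phys_hi >> 24) & 0x3;
}
```

For example, device 1, function 0 on bus 0 gives a phys.hi of 0x0800, so the first reg entry of such a child node would start with that cell followed by zeros for phys.mid, phys.lo and the size.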

I don't think you should need a ranges property at all for what you're
doing. Access to config space is generally managed by the PCI host
controller drivers and subsystem, and PCI device drivers don't
typically use of_ calls directly.

g.


Re: pci and pcie device-tree binding - range No cells

2012-12-12 Thread Rob Herring
On 12/12/2012 10:16 AM, Thomas Petazzoni wrote:
> Dear Rob Herring,
>
> On Mon, 10 Dec 2012 17:24:44 -0600, Rob Herring wrote:
>
>>> Marvell SoCs have up to 20 configurable address windows, which allow
>>> you, at run time, to say I would like the range from physical
>>> address 0x to 0x to correspond to the PCIe device
>>> in port 1, lane 2, or to the NAND, or to this or that device.
>>> Therefore, in the PCIe driver I proposed for the Armada 370/XP SoCs
>>> [1], there is no need to encode all those ranges statically in the
>>> DT.
>>
>> That's not a unique feature. I'm not sure if any powerpc systems do
>> that though.
>
> Yes, probably not a unique feature.
>
>>> The only ranges property I'm using is to allow the DT sub-nodes
>>> describing each PCIe port/lane to access the CPU registers that
>>> allow to see if the PCIe link is up or down, access the PCI
>>> configuration space and so on. So all ranges in my ranges
>>> property correspond to normal CPU registers, like the one you would
>>> put in the reg property for any device. The fact that those
>>> devices are PCIe is really orthogonal here.
>>
>> That doesn't really sound right.
>
> Very likely, but I still don't get what is the right way.
>
>> I don't think deviating from the normal binding is the right approach.
>> Perhaps the host driver should fill in the ranges property with the
>> addresses it uses. Then any child devices will get the right address
>> translation.
>
> I don't really understand what you mean here. If you look at the host
> driver code (arch/arm/mach-mvebu/pcie.c), for each PCIe interface
> it simply does:
>
>  * Create an address decoding window for the memory BAR
>  * Create an address decoding window for the I/O BAR
>  * Associate the memory BAR window address and the I/O bar window
>    address with the PCIe interface
>
> And that's it. See
> https://github.com/MISL-EBU-System-SW/mainline-public/blob/marvell-pcie-v1/arch/arm/mach-mvebu/pcie.c#L107.
>
> So this driver both decides the physical addresses for each PCIe
> interface and associates them with the PCIe interfaces. How is
> it useful to feed some addresses back into the Device Tree?

I'm not completely sure for PCI, but the ranges is necessary to
translate addresses of child nodes.

If you don't need ranges then you could omit it. If you need ranges,
then you should follow the PCI binding whether it is put in the DTS or
you dynamically fill it in. This could be filled in by the bootloader as
well if you have PCI devices you need to boot from.

>> Also, while the h/w may support practically any config, there are
>> practical constraints of what Linux will use like there's no reason to
>> support more than 64K i/o space. PCI memory addresses generally start
>> at 0x10. You probably don't need more than 1 memory window per
>> root complex (although prefetchable memory may also be needed).
>
> I allocate one 64K I/O window and one memory window per PCIe interface
> whose link is up (i.e a PCIe device is connected).
>
>> You could let the DT settings drive the address window configuration.
>
> No, because I don't want to have absolute addresses for the windows: I
> have 10 PCIe interfaces, but often, only a few of them are used. So I
> don't want to over-allocate hundreds of MB of physical address space in
> the Device Tree if it's not useful.

How many you have is probably board dependent and not probe-able, right?
So you would at least know the subset of root complexes that you are
using. I know you want to find the size of all the cards up front and
size windows based on that, but I don't think that is going to be possible.

 
> PCIe is dynamic, address window configuration is dynamic. And we should
> hardcode all this configuration statically in the DT? Doesn't seem like
> the right solution.

I'm just throwing out ideas. There are many cases of flexibility in h/w
designs which are never used. H/w is often designed in a vacuum without
s/w input. Not saying that is the case here, but you do have to consider
that.

Rob

 
> Best regards,
>
> Thomas
 



[TRIVIAL PATCH 00/26] treewide: Add and use vsprintf extension %pSR

2012-12-12 Thread Joe Perches
Remove the somewhat awkward uses of print_symbol and convert all the
existing uses to a new vsprintf pointer type of %pSR.

print_symbol can be interleaved when it is used in a sequence like:

printk("something: ...");
print_symbol("%s", addr);
printk("\n");

Instead use:

printk("something: %pSR\n", (void *)addr);

Add a new %p[SsFf]R vsprintf extension that can perform the same
symbol function/address/offset formatting as print_symbol to
reduce the number and styles of message logging functions.

print_symbol used __builtin_extract_return_addr for those architectures
like S/390 and SPARC that have offset or masked addressing.
%p[FfSs]R uses the same gcc __builtin.

Joe Perches (26):
  vsprintf: Add extension %pSR - print_symbol replacement
  alpha: Convert print_symbol to %pSR
  arm: Convert print_symbol to %pSR
  arm64: Convert print_symbol to %pSR
  avr32: Convert print_symbol to %pSR
  c6x: Convert print_symbol to %pSR
  ia64: Convert print_symbol to %pSR
  m32r: Convert print_symbol to %pSR
  mn10300: Convert print_symbol to %pSR
  openrisc: Convert print_symbol to %pSR
  powerpc: Convert print_symbol to %pSR
  s390: Convert print_symbol to %pSR
  sh: Convert print_symbol to %pSR
  um: Convert print_symbol to %pSR
  unicore32: Convert print_symbol to %pSR
  x86: Convert print_symbol to %pSR
  xtensa: Convert print_symbol to %pSR
  drivers: base: Convert print_symbol to %pSR
  gfs2: Convert print_symbol to %pSR
  sysfs: Convert print_symbol to %pSR
  irq: Convert print_symbol to %pSR
  smp_processor_id: Convert print_symbol to %pSR
  mm: Convert print_symbol to %pSR
  xtensa: Convert print_symbol to %pSR
  x86: head_64.S: Use vsprintf extension %pSR not print_symbol
  kallsyms: Remove print_symbol

 Documentation/filesystems/sysfs.txt |4 +-
 Documentation/printk-formats.txt|2 +
 Documentation/zh_CN/filesystems/sysfs.txt   |4 +-
 arch/alpha/kernel/traps.c   |8 ++
 arch/arm/kernel/process.c   |4 +-
 arch/arm64/kernel/process.c |4 +-
 arch/avr32/kernel/process.c |   25 ++-
 arch/c6x/kernel/traps.c |3 +-
 arch/ia64/kernel/process.c  |   13 ---
 arch/m32r/kernel/traps.c|6 +---
 arch/mn10300/kernel/traps.c |8 +++---
 arch/openrisc/kernel/traps.c|7 +
 arch/powerpc/platforms/cell/spu_callbacks.c |   12 --
 arch/s390/kernel/traps.c|   28 +++---
 arch/sh/kernel/process_32.c |4 +-
 arch/um/kernel/sysrq.c  |6 +---
 arch/unicore32/kernel/process.c |5 ++-
 arch/x86/kernel/cpu/mcheck/mce.c|   13 ++-
 arch/x86/kernel/dumpstack.c |5 +--
 arch/x86/kernel/head_64.S   |4 +-
 arch/x86/kernel/process_32.c|2 +-
 arch/x86/mm/mmio-mod.c  |4 +-
 arch/x86/um/sysrq_32.c  |9 ++-
 arch/xtensa/kernel/traps.c  |6 +---
 drivers/base/core.c |4 +-
 fs/gfs2/glock.c |4 +-
 fs/gfs2/trans.c |3 +-
 fs/sysfs/file.c |4 +-
 include/linux/kallsyms.h|   18 -
 kernel/irq/debug.h  |   15 ++---
 kernel/kallsyms.c   |   11 --
 lib/smp_processor_id.c  |2 +-
 lib/vsprintf.c  |   18 
 mm/memory.c |8 +++---
 mm/slab.c   |8 ++
 35 files changed, 117 insertions(+), 164 deletions(-)

-- 
1.7.8.112.g3fd21



[TRIVIAL PATCH 11/26] powerpc: Convert print_symbol to %pSR

2012-12-12 Thread Joe Perches
Use the new vsprintf extension to avoid any possible
message interleaving.

Convert the #ifdef DEBUG block to a single pr_debug.

Signed-off-by: Joe Perches j...@perches.com
---
 arch/powerpc/platforms/cell/spu_callbacks.c |   12 +---
 1 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spu_callbacks.c 
b/arch/powerpc/platforms/cell/spu_callbacks.c
index 75d6133..c5fe6d2 100644
--- a/arch/powerpc/platforms/cell/spu_callbacks.c
+++ b/arch/powerpc/platforms/cell/spu_callbacks.c
@@ -60,13 +60,11 @@ long spu_sys_callback(struct spu_syscall_block *s)
 
 	syscall = spu_syscall_table[s->nr_ret];
 
-#ifdef DEBUG
-	print_symbol(KERN_DEBUG "SPU-syscall %s:", (unsigned long)syscall);
-	printk("syscall%ld(%lx, %lx, %lx, %lx, %lx, %lx)\n",
-			s->nr_ret,
-			s->parm[0], s->parm[1], s->parm[2],
-			s->parm[3], s->parm[4], s->parm[5]);
-#endif
+	pr_debug("SPU-syscall %pSR:syscall%ld(%lx, %lx, %lx, %lx, %lx, %lx)\n",
+		 syscall,
+		 s->nr_ret,
+		 s->parm[0], s->parm[1], s->parm[2],
+		 s->parm[3], s->parm[4], s->parm[5]);
 
 	return syscall(s->parm[0], s->parm[1], s->parm[2],
 		       s->parm[3], s->parm[4], s->parm[5]);
-- 
1.7.8.112.g3fd21



Re: pci and pcie device-tree binding - range No cells

2012-12-12 Thread Andrew Murray
On Wed, Dec 12, 2012 at 10:49 AM, Grant Likely wrote:
 On Wed, Dec 12, 2012 at 10:37 AM, Michal Simek mon...@monstr.eu wrote:
  On 12/10/2012 10:41 PM, Grant Likely wrote:
  drivers/pci/pci-of.c would be good. I'd also accept drivers/of/pci.c
  which might actually be a good idea in the short term so that it gets
  appropriate supervision while being generalized before being moved into
  the pci directory.
 
  Ben: Are you willing to move that ppc code to this location?
  It is probably not good idea that I should do it when I even don't have
  hardware available for testing (Asking someone else).
 
 You're a clever guy, you are more than capable of crafting the patch,
 even if you can't test on hardware. :-)
 
 I refactored most of the OF support code without having access to most
 of the affected hardware. Once I got the changes out there for review
 I also asked for spot testing before getting it into linux-next for
 even more testing.

I've been working on a relatively architecture agnostic PCI host bridge driver
and also wanted to avoid duplicating more generic DT parsing code for PCI
bindings.

I've ended up with a patch which provides an iterator for returning resources
based on the typical 'ranges' binding. This has ended up living in
drivers/of/address.c. I originally started out in drivers/of/pci.c and
drivers/pci/pci-of.c but found there were good (and static) implementations in
drivers/of/address.c which can be reused (e.g. of_bus_pci_get_flags,
bus->count_cells).

I'm not quite ready to post it yet, but can do so before early next week if you
can wait.

Andrew Murray



Re: pci and pcie device-tree binding - range No cells

2012-12-12 Thread Thierry Reding
On Wed, Dec 12, 2012 at 12:19:12PM +, Andrew Murray wrote:
 On Wed, Dec 12, 2012 at 10:49 AM, Grant Likely wrote:
  On Wed, Dec 12, 2012 at 10:37 AM, Michal Simek mon...@monstr.eu wrote:
   On 12/10/2012 10:41 PM, Grant Likely wrote:
   drivers/pci/pci-of.c would be good. I'd also accept drivers/of/pci.c
   which might actually be a good idea in the short term so that it gets
   appropriate supervision while being generalized before being moved into
   the pci directory.
  
   Ben: Are you willing to move that ppc code to this location?
   It is probably not good idea that I should do it when I even don't have
   hardware available for testing (Asking someone else).
  
  You're a clever guy, you are more than capable of crafting the patch,
  even if you can't test on hardware. :-)
  
  I refactored most of the OF support code without having access to most
  of the affected hardware. Once I got the changes out there for review
  I also asked for spot testing before getting it into linux-next for
  even more testing.
 
 I've been working on a relatively architecture agnostic PCI host bridge driver
 and also wanted to avoid duplicating more generic DT parsing code for PCI
 bindings.
 
 I've ended up with a patch which provides an iterator for returning resources
 based on the typical 'ranges' binding. This has ended up living in
 drivers/of/address.c. I originally started out in drivers/of/pci.c and
 drivers/pci/pci-of.c but found there were good (and static) implementations in
 drivers/of/address.c which can be reused (e.g. of_bus_pci_get_flags,
 bus->count_cells).
 
 I'm not quite ready to post it yet, but can do so before early next week if you
 can wait.

I already posted a similar patch[0] as part of a larger series to bring
DT support to Tegra PCIe back in July. I suppose what you have must be
something pretty close to that. Most of the stuff that had me occupied
since then should be done soon and I was planning on resurrecting the
series one of these days.

Thierry

[0]: https://patchwork.kernel.org/patch/1244451/



[PATCH] pci: Provide support for parsing PCI DT ranges property

2012-12-12 Thread Andrew Murray
DT bindings for PCI host bridges often use the ranges property to describe
memory and IO ranges - this binding tends to be the same across architectures
yet several parsing implementations exist, e.g. arch/mips/pci/pci.c,
arch/powerpc/kernel/pci-common.c, arch/sparc/kernel/pci.c and
arch/microblaze/pci/pci-common.c (clone of PPC). Some of these duplicate
functionality provided by drivers/of/address.c.

This patch provides a common iterator-based parser for the ranges property. It
is hoped that this will reduce DT representation differences between
architectures and that architectures will migrate in part to this new parser.

It is also hoped (and this is the motivation for the patch) that it will
reduce duplication of code when writing host bridge drivers that are supported
by multiple architectures.

This patch provides struct resources from a device tree node, e.g.:

	const __be32 *last = NULL;
	struct resource res;

	while ((last = of_pci_process_ranges(np, &res, last))) {
		/* do something with res */
	}

Platforms with quirks can then do what they like with the resource or migrate
common quirk handling to the parser. In an ideal world drivers can just request
the obtained resources and pass them on (e.g. pci_add_resource_offset).

Signed-off-by: Andrew Murray andrew.mur...@arm.com
Signed-off-by: Liviu Dudau liviu.du...@arm.com
---
 drivers/of/address.c   |   53 +++-
 include/linux/of_address.h |7 +
 2 files changed, 59 insertions(+), 1 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 7e262a6..03bfe61 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -219,6 +219,57 @@ int of_pci_address_to_resource(struct device_node *dev, int bar,
 	return __of_address_to_resource(dev, addrp, size, flags, NULL, r);
 }
 EXPORT_SYMBOL_GPL(of_pci_address_to_resource);
+
+const __be32 *of_pci_process_ranges(struct device_node *node,
+	struct resource *res, const __be32 *from)
+{
+	const __be32 *start, *end;
+	int na, ns, np, pna;
+	int rlen;
+	struct of_bus *bus;
+	WARN_ON(!res);
+
+	bus = of_match_bus(node);
+	bus->count_cells(node, &na, &ns);
+	if (!OF_CHECK_COUNTS(na, ns)) {
+		pr_err("Bad cell count for %s\n", node->full_name);
+		return NULL;
+	}
+
+	pna = of_n_addr_cells(node);
+	np = pna + na + ns;
+
+	start = of_get_property(node, "ranges", &rlen);
+	if (start == NULL)
+		return NULL;
+
+	end = start + rlen;
+
+	if (!from)
+		from = start;
+
+	while (from + np <= end) {
+		u64 cpu_addr, size;
+
+		cpu_addr = of_translate_address(node, from + na);
+		size = of_read_number(from + na + pna, ns);
+		res->flags = bus->get_flags(from);
+		from += np;
+
+		if (cpu_addr == OF_BAD_ADDR || size == 0)
+			continue;
+
+		res->name = node->full_name;
+		res->start = cpu_addr;
+		res->end = res->start + size - 1;
+		res->parent = res->child = res->sibling = NULL;
+		return from;
+	}
+
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(of_pci_process_ranges);
+
 #endif /* CONFIG_PCI */
 
 /*
@@ -421,7 +472,7 @@ u64 __of_translate_address(struct device_node *dev, const __be32 *in_addr,
 		goto bail;
 	bus = of_match_bus(parent);
 
-	/* Cound address cells & copy address locally */
+	/* Count address cells & copy address locally */
 	bus->count_cells(dev, &na, &ns);
 	if (!OF_CHECK_COUNTS(na, ns)) {
 		printk(KERN_ERR "prom_parse: Bad cell count for %s\n",
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index 01b925a..4582b20 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -26,6 +26,8 @@ static inline unsigned long pci_address_to_pio(phys_addr_t addr) { return -1; }
 #define pci_address_to_pio pci_address_to_pio
 #endif
 
+const __be32 *of_pci_process_ranges(struct device_node *node,
+   struct resource *res, const __be32 *from);
 #else /* CONFIG_OF_ADDRESS */
 static inline int of_address_to_resource(struct device_node *dev, int index,
 struct resource *r)
@@ -48,6 +50,11 @@ static inline const u32 *of_get_address(struct device_node *dev, int index,
 {
 	return NULL;
 }
+const __be32 *of_pci_process_ranges(struct device_node *node,
+	struct resource *res, const __be32 *from)
+{
+	return NULL;
+}
 #endif /* CONFIG_OF_ADDRESS */
 
 
-- 
1.7.0.4




Re: pci and pcie device-tree binding - range No cells

2012-12-12 Thread Andrew Murray
On Wed, Dec 12, 2012 at 01:34:24PM +, Thierry Reding wrote:
 On Wed, Dec 12, 2012 at 12:19:12PM +, Andrew Murray wrote:
  I've been working on a relatively architecture agnostic PCI host bridge 
  driver
  and also wanted to avoid duplicating more generic DT parsing code for PCI
  bindings.
  
  I've ended up with a patch which provides an iterator for returning
  resources based on the typical 'ranges' binding. This has ended up living in
  drivers/of/address.c. I originally started out in drivers/of/pci.c and
  drivers/pci/pci-of.c but found there were good (and static) implementations
  in drivers/of/address.c which can be reused (e.g. of_bus_pci_get_flags,
  bus->count_cells).
  
  I'm not quite ready to post it yet, but can do so before early next week if
  you can wait.
 
 I already posted a similar patch[0] as part of a larger series to bring
 DT support to Tegra PCIe back in July. I suppose what you have must be
 something pretty close to that. Most of the stuff that had me occupied
 since then should be done soon and I was planning on resurrecting the
 series one of these days.

Thanks for the reference. I've submitted my patch, it's along the lines of your
existing patch.

I'm happy to take the best bits from both, drop mine, etc.

Andrew Murray



Re: [PATCH] Revert crypto: caam - Updated SEC-4.0 device tree binding for ERA information.

2012-12-12 Thread Kumar Gala

On Dec 7, 2012, at 2:57 AM, Vakul Garg wrote:

 This reverts commit a2c0911c09190125f52c9941b9d187f601c2f7be.
 
 Signed-off-by: Vakul Garg va...@freescale.com
 ---
 Instead of adding SEC era information in crypto node's compatible, a new
 property 'fsl,sec-era' is being introduced into crypto node.
 
 .../devicetree/bindings/crypto/fsl-sec4.txt|5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

What tree do you think this has been applied to?

- k


[PATCH v3] powerpc: fix wii_memory_fixups() compile error on 3.0.y tree

2012-12-12 Thread Shuah Khan
Fix the following wii_memory_fixups() compile error on the 3.0.y tree with
wii_defconfig.

  CC  arch/powerpc/platforms/embedded6xx/wii.o
arch/powerpc/platforms/embedded6xx/wii.c: In function ‘wii_memory_fixups’:
arch/powerpc/platforms/embedded6xx/wii.c:88:2: error: format ‘%llx’ expects 
argument of type ‘long long unsigned int’, but argument 2 has type 
‘phys_addr_t’ [-Werror=format]
arch/powerpc/platforms/embedded6xx/wii.c:88:2: error: format ‘%llx’ expects 
argument of type ‘long long unsigned int’, but argument 3 has type 
‘phys_addr_t’ [-Werror=format]
arch/powerpc/platforms/embedded6xx/wii.c:90:2: error: format ‘%llx’ expects 
argument of type ‘long long unsigned int’, but argument 2 has type 
‘phys_addr_t’ [-Werror=format]
arch/powerpc/platforms/embedded6xx/wii.c:90:2: error: format ‘%llx’ expects 
argument of type ‘long long unsigned int’, but argument 3 has type 
‘phys_addr_t’ [-Werror=format]
cc1: all warnings being treated as errors
make[2]: *** [arch/powerpc/platforms/embedded6xx/wii.o] Error 1
make[1]: *** [arch/powerpc/platforms/embedded6xx] Error 2
make: *** [arch/powerpc/platforms] Error 2

Signed-off-by: Shuah Khan shuah.k...@hp.com
CC: sta...@vger.kernel.org 3.0.y
---
 arch/powerpc/platforms/embedded6xx/wii.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/embedded6xx/wii.c 
b/arch/powerpc/platforms/embedded6xx/wii.c
index 1b5dc1a..daf793b 100644
--- a/arch/powerpc/platforms/embedded6xx/wii.c
+++ b/arch/powerpc/platforms/embedded6xx/wii.c
@@ -85,9 +85,11 @@ void __init wii_memory_fixups(void)
 	wii_hole_start = p[0].base + p[0].size;
 	wii_hole_size = p[1].base - wii_hole_start;
 
-	pr_info("MEM1: %08llx %08llx\n", p[0].base, p[0].size);
+	pr_info("MEM1: %08llx %08llx\n",
+		(unsigned long long) p[0].base, (unsigned long long) p[0].size);
 	pr_info("HOLE: %08lx %08lx\n", wii_hole_start, wii_hole_size);
-	pr_info("MEM2: %08llx %08llx\n", p[1].base, p[1].size);
+	pr_info("MEM2: %08llx %08llx\n",
+		(unsigned long long) p[1].base, (unsigned long long) p[1].size);
 
 	p[0].size += wii_hole_size + p[1].size;
 
-- 
1.7.9.5
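
The fix relies on a general C rule: a variadic argument must have exactly the type its conversion specifier expects, so a typedef whose width depends on the configuration needs an explicit cast. A minimal userspace sketch of the same pattern follows. The phys_addr_t typedef here is a stand-in assumed to be 32 bits wide, and format_mem_range is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Stand-in for the kernel's phys_addr_t, whose width depends on the
 * configuration. Assumption for this sketch: a 32-bit type.
 */
typedef uint32_t phys_addr_t;

/* Format "tag: base size" with both values as zero-padded hex. */
static int format_mem_range(char *buf, size_t len, const char *tag,
			    phys_addr_t base, phys_addr_t size)
{
	assert(buf && tag);

	/* Cast explicitly so the arguments always match %llx. */
	return snprintf(buf, len, "%s: %08llx %08llx\n", tag,
			(unsigned long long)base, (unsigned long long)size);
}
```

Because of the casts, the format string stays valid whether the typedef is later changed to a 32-bit or a 64-bit type.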




Re: [PATCH] vfio powerpc: enabled on powernv platform

2012-12-12 Thread Alex Williamson
On Wed, 2012-12-12 at 23:34 +1100, Alexey Kardashevskiy wrote:
 This patch initializes IOMMU groups based on the IOMMU
 configuration discovered during the PCI scan on POWERNV
 (POWER non virtualized) platform. The IOMMU groups are
 to be used later by VFIO driver (PCI pass through).
 
 It also implements an API for mapping/unmapping pages for
 guest PCI drivers and providing DMA window properties.
 This API is going to be used later by QEMU-VFIO to handle
 h_put_tce hypercalls from the KVM guest.
 
 Although this driver has been tested only on the POWERNV
 platform, it should work on any platform which supports
 TCE tables.
 
 To enable VFIO on POWER, enable SPAPR_TCE_IOMMU config
 option and configure VFIO as required.
 
 Cc: David Gibson da...@gibson.dropbear.id.au
 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---
  arch/powerpc/include/asm/iommu.h |   10 ++
  arch/powerpc/kernel/iommu.c  |  329 
 ++
  arch/powerpc/platforms/powernv/pci.c |  134 ++
  drivers/iommu/Kconfig|8 +
  4 files changed, 481 insertions(+)
 
 diff --git a/arch/powerpc/include/asm/iommu.h 
 b/arch/powerpc/include/asm/iommu.h
 index cbfe678..3c861ae 100644
 --- a/arch/powerpc/include/asm/iommu.h
 +++ b/arch/powerpc/include/asm/iommu.h
 @@ -76,6 +76,9 @@ struct iommu_table {
   struct iommu_pool large_pool;
   struct iommu_pool pools[IOMMU_NR_POOLS];
   unsigned long *it_map;   /* A simple allocation bitmap for now */
 +#ifdef CONFIG_IOMMU_API
 + struct iommu_group *it_group;
 +#endif
  };
  
  struct scatterlist;
 @@ -147,5 +150,12 @@ static inline void iommu_restore(void)
  }
  #endif
  
 +extern void iommu_reset_table(struct iommu_table *tbl, bool restore);
 +extern long iommu_clear_tces(struct iommu_table *tbl, unsigned long ioba,
 + unsigned long size);
 +extern long iommu_put_tces(struct iommu_table *tbl, unsigned long ioba,
 + uint64_t tce, enum dma_data_direction direction,
 + unsigned long size);
 +
  #endif /* __KERNEL__ */
  #endif /* _ASM_IOMMU_H */
 diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
 index ff5a6ce..f3bb2e7 100644
 --- a/arch/powerpc/kernel/iommu.c
 +++ b/arch/powerpc/kernel/iommu.c
 @@ -36,6 +36,7 @@
  #include <linux/hash.h>
  #include <linux/fault-inject.h>
  #include <linux/pci.h>
 +#include <linux/uaccess.h>
  #include <asm/io.h>
  #include <asm/prom.h>
  #include <asm/iommu.h>
 @@ -44,6 +45,7 @@
  #include <asm/kdump.h>
  #include <asm/fadump.h>
  #include <asm/vio.h>
 +#include <asm/tce.h>
  
  #define DBG(...)
  
 @@ -856,3 +858,330 @@ void iommu_free_coherent(struct iommu_table *tbl, 
 size_t size,
   free_pages((unsigned long)vaddr, get_order(size));
   }
  }
 +
 +#ifdef CONFIG_IOMMU_API
 +/*
 + * SPAPR TCE API
 + */
 +
 +struct vwork {
 +	struct mm_struct	*mm;
 +	long			npage;
 +	struct work_struct	work;
 +};
 +
 +/* delayed decrement/increment for locked_vm */
 +static void lock_acct_bg(struct work_struct *work)
 +{
 +	struct vwork *vwork = container_of(work, struct vwork, work);
 +	struct mm_struct *mm;
 +
 +	mm = vwork->mm;
 +	down_write(&mm->mmap_sem);
 +	mm->locked_vm += vwork->npage;
 +	up_write(&mm->mmap_sem);
 +	mmput(mm);
 +	kfree(vwork);
 +}
 +
 +static void lock_acct(long npage)
 +{
 +	struct vwork *vwork;
 +	struct mm_struct *mm;
 +
 +	if (!current->mm)
 +		return; /* process exited */
 +
 +	if (down_write_trylock(&current->mm->mmap_sem)) {
 +		current->mm->locked_vm += npage;
 +		up_write(&current->mm->mmap_sem);
 +		return;
 +	}
 +
 +	/*
 +	 * Couldn't get mmap_sem lock, so must setup to update
 +	 * mm->locked_vm later. If locked_vm were atomic, we
 +	 * wouldn't need this silliness
 +	 */
 +	vwork = kmalloc(sizeof(struct vwork), GFP_KERNEL);
 +	if (!vwork)
 +		return;
 +	mm = get_task_mm(current);
 +	if (!mm) {
 +		kfree(vwork);
 +		return;
 +	}
 +	INIT_WORK(&vwork->work, lock_acct_bg);
 +	vwork->mm = mm;
 +	vwork->npage = npage;
 +	schedule_work(&vwork->work);
 +}

Locked page accounting in this version is very, very broken.  How do
powerpc folks feel about seemingly generic kernel iommu interfaces
messing with the current task mm?  Besides that, more problems below...

 +
 +/*
 + * iommu_reset_table is called when it started/stopped being used.
 + *
 + * restore==true says to bring the iommu_table into the state as it was
 + * before being used by VFIO.
 + */
 +void iommu_reset_table(struct iommu_table *tbl, bool restore)
 +{
 +	/* Page#0 is marked as used in iommu_init_table, so we clear it... */
 +	if (!restore && (tbl->it_offset == 0))
 +		clear_bit(0, tbl->it_map);
 +
 +	iommu_clear_tces(tbl, tbl->it_offset, tbl->it_size);

This does locked page accounting and unpins pages, even on startup when
the pages 

[PATCH] powerpc+of: Rename and fix OF reconfig notifier error inject module

2012-12-12 Thread Benjamin Herrenschmidt
This module used to inject errors in the pSeries specific dynamic
reconfiguration notifiers. Those are gone however, replaced by
generic notifiers for changes to the device-tree. So let's update
the module to deal with these instead and rename it along the way.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---
 lib/Kconfig.debug|   10 ++---
 lib/Makefile |4 +-
 lib/of-reconfig-notifier-error-inject.c  |   51 ++
 lib/pSeries-reconfig-notifier-error-inject.c |   51 --
 4 files changed, 58 insertions(+), 58 deletions(-)
 create mode 100644 lib/of-reconfig-notifier-error-inject.c
 delete mode 100644 lib/pSeries-reconfig-notifier-error-inject.c

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 28e9d6c9..c2d89f3 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1192,14 +1192,14 @@ config MEMORY_NOTIFIER_ERROR_INJECT
 
  If unsure, say N.
 
-config PSERIES_RECONFIG_NOTIFIER_ERROR_INJECT
-	tristate "pSeries reconfig notifier error injection module"
-	depends on PPC_PSERIES && NOTIFIER_ERROR_INJECTION
+config OF_RECONFIG_NOTIFIER_ERROR_INJECT
+	tristate "OF reconfig notifier error injection module"
+	depends on OF_DYNAMIC && NOTIFIER_ERROR_INJECTION
 	help
 	  This option provides the ability to inject artifical errors to
-	  pSeries reconfig notifier chain callbacks.  It is controlled
+	  OF reconfig notifier chain callbacks.  It is controlled
 	  through debugfs interface under
-	  /sys/kernel/debug/notifier-error-inject/pSeries-reconfig/
+	  /sys/kernel/debug/notifier-error-inject/OF-reconfig/
 
 	  If the notifier call chain should be failed with some events
 	  notified, write the error code to "actions/<notifier event>/error".
diff --git a/lib/Makefile b/lib/Makefile
index 821a162..7c00908 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -94,8 +94,8 @@ obj-$(CONFIG_NOTIFIER_ERROR_INJECTION) += 
notifier-error-inject.o
 obj-$(CONFIG_CPU_NOTIFIER_ERROR_INJECT) += cpu-notifier-error-inject.o
 obj-$(CONFIG_PM_NOTIFIER_ERROR_INJECT) += pm-notifier-error-inject.o
 obj-$(CONFIG_MEMORY_NOTIFIER_ERROR_INJECT) += memory-notifier-error-inject.o
-obj-$(CONFIG_PSERIES_RECONFIG_NOTIFIER_ERROR_INJECT) += \
-   pSeries-reconfig-notifier-error-inject.o
+obj-$(CONFIG_OF_RECONFIG_NOTIFIER_ERROR_INJECT) += \
+   of-reconfig-notifier-error-inject.o
 
 lib-$(CONFIG_GENERIC_BUG) += bug.o
 
diff --git a/lib/of-reconfig-notifier-error-inject.c 
b/lib/of-reconfig-notifier-error-inject.c
new file mode 100644
index 000..8dc7986
--- /dev/null
+++ b/lib/of-reconfig-notifier-error-inject.c
@@ -0,0 +1,51 @@
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of.h>
+
+#include "notifier-error-inject.h"
+
+static int priority;
+module_param(priority, int, 0);
+MODULE_PARM_DESC(priority, "specify OF reconfig notifier priority");
+
+static struct notifier_err_inject reconfig_err_inject = {
+	.actions = {
+		{ NOTIFIER_ERR_INJECT_ACTION(OF_RECONFIG_ATTACH_NODE) },
+		{ NOTIFIER_ERR_INJECT_ACTION(OF_RECONFIG_DETACH_NODE) },
+		{ NOTIFIER_ERR_INJECT_ACTION(OF_RECONFIG_ADD_PROPERTY) },
+		{ NOTIFIER_ERR_INJECT_ACTION(OF_RECONFIG_REMOVE_PROPERTY) },
+		{ NOTIFIER_ERR_INJECT_ACTION(OF_RECONFIG_UPDATE_PROPERTY) },
+		{}
+	}
+};
+
+static struct dentry *dir;
+
+static int err_inject_init(void)
+{
+	int err;
+
+	dir = notifier_err_inject_init("OF-reconfig",
+		notifier_err_inject_dir, &reconfig_err_inject, priority);
+	if (IS_ERR(dir))
+		return PTR_ERR(dir);
+
+	err = of_reconfig_notifier_register(&reconfig_err_inject.nb);
+	if (err)
+		debugfs_remove_recursive(dir);
+
+	return err;
+}
+
+static void err_inject_exit(void)
+{
+	of_reconfig_notifier_unregister(&reconfig_err_inject.nb);
+	debugfs_remove_recursive(dir);
+}
+
+module_init(err_inject_init);
+module_exit(err_inject_exit);
+
+MODULE_DESCRIPTION("OF reconfig notifier error injection module");
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Akinobu Mita <akinobu.m...@gmail.com>");
diff --git a/lib/pSeries-reconfig-notifier-error-inject.c 
b/lib/pSeries-reconfig-notifier-error-inject.c
deleted file mode 100644
index 7f7c98d..000
--- a/lib/pSeries-reconfig-notifier-error-inject.c
+++ /dev/null
@@ -1,51 +0,0 @@
-#include <linux/kernel.h>
-#include <linux/module.h>
-
-#include <asm/pSeries_reconfig.h>
-
-#include "notifier-error-inject.h"
-
-static int priority;
-module_param(priority, int, 0);
-MODULE_PARM_DESC(priority, "specify pSeries reconfig notifier priority");
-
-static struct notifier_err_inject reconfig_err_inject = {
-	.actions = {
-		{ NOTIFIER_ERR_INJECT_ACTION(PSERIES_RECONFIG_ADD) },
-		{ NOTIFIER_ERR_INJECT_ACTION(PSERIES_RECONFIG_REMOVE) },
-   { 

[PATCH 1/3] powerpc: Run savedefconfig over pseries, ppc64 and ppc64e defconfig

2012-12-12 Thread Anton Blanchard

No changes, just update the configs with savedefconfig.

Signed-off-by: Anton Blanchard an...@samba.org
--- 

Index: b/arch/powerpc/configs/ppc64_defconfig
===
--- a/arch/powerpc/configs/ppc64_defconfig
+++ b/arch/powerpc/configs/ppc64_defconfig
@@ -5,6 +5,9 @@ CONFIG_SMP=y
 CONFIG_EXPERIMENTAL=y
 CONFIG_SYSVIPC=y
 CONFIG_POSIX_MQUEUE=y
+CONFIG_IRQ_DOMAIN_DEBUG=y
+CONFIG_NO_HZ=y
+CONFIG_HIGH_RES_TIMERS=y
 CONFIG_TASKSTATS=y
 CONFIG_TASK_DELAY_ACCT=y
 CONFIG_IKCONFIG=y
@@ -21,6 +24,7 @@ CONFIG_MODULES=y
 CONFIG_MODULE_UNLOAD=y
 CONFIG_MODVERSIONS=y
 CONFIG_MODULE_SRCVERSION_ALL=y
+CONFIG_PARTITION_ADVANCED=y
 CONFIG_PPC_SPLPAR=y
 CONFIG_SCANLOG=m
 CONFIG_PPC_SMLPAR=y
@@ -42,11 +46,8 @@ CONFIG_CPU_FREQ=y
 CONFIG_CPU_FREQ_GOV_POWERSAVE=y
 CONFIG_CPU_FREQ_GOV_USERSPACE=y
 CONFIG_CPU_FREQ_PMAC64=y
-CONFIG_NO_HZ=y
-CONFIG_HIGH_RES_TIMERS=y
 CONFIG_HZ_100=y
 CONFIG_BINFMT_MISC=m
-CONFIG_HOTPLUG_CPU=y
 CONFIG_KEXEC=y
 CONFIG_IRQ_ALL_CPUS=y
 CONFIG_MEMORY_HOTREMOVE=y
@@ -73,7 +74,6 @@ CONFIG_INET_ESP=m
 CONFIG_INET_IPCOMP=m
 # CONFIG_IPV6 is not set
 CONFIG_NETFILTER=y
-CONFIG_NETFILTER_NETLINK_QUEUE=m
 CONFIG_NF_CONNTRACK=m
 CONFIG_NF_CONNTRACK_EVENTS=y
 CONFIG_NF_CT_PROTO_SCTP=m
@@ -130,19 +130,12 @@ CONFIG_NETFILTER_XT_MATCH_U32=m
 CONFIG_NF_CONNTRACK_IPV4=m
 CONFIG_IP_NF_QUEUE=m
 CONFIG_IP_NF_IPTABLES=m
-CONFIG_IP_NF_MATCH_ADDRTYPE=m
 CONFIG_IP_NF_MATCH_AH=m
 CONFIG_IP_NF_MATCH_ECN=m
 CONFIG_IP_NF_MATCH_TTL=m
 CONFIG_IP_NF_FILTER=m
 CONFIG_IP_NF_TARGET_REJECT=m
-CONFIG_IP_NF_TARGET_LOG=m
 CONFIG_IP_NF_TARGET_ULOG=m
-CONFIG_NF_NAT=m
-CONFIG_IP_NF_TARGET_MASQUERADE=m
-CONFIG_IP_NF_TARGET_NETMAP=m
-CONFIG_IP_NF_TARGET_REDIRECT=m
-CONFIG_NF_NAT_SNMP_BASIC=m
 CONFIG_IP_NF_MANGLE=m
 CONFIG_IP_NF_TARGET_CLUSTERIP=m
 CONFIG_IP_NF_TARGET_ECN=m
@@ -151,6 +144,7 @@ CONFIG_IP_NF_RAW=m
 CONFIG_IP_NF_ARPTABLES=m
 CONFIG_IP_NF_ARPFILTER=m
 CONFIG_IP_NF_ARP_MANGLE=m
+CONFIG_BPF_JIT=y
 CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
 CONFIG_PROC_DEVICETREE=y
 CONFIG_BLK_DEV_FD=y
@@ -173,7 +167,6 @@ CONFIG_CHR_DEV_SG=y
 CONFIG_SCSI_MULTI_LUN=y
 CONFIG_SCSI_CONSTANTS=y
 CONFIG_SCSI_FC_ATTRS=y
-CONFIG_SCSI_SAS_ATTRS=m
 CONFIG_SCSI_CXGB3_ISCSI=m
 CONFIG_SCSI_CXGB4_ISCSI=m
 CONFIG_SCSI_BNX2_ISCSI=m
@@ -205,13 +198,6 @@ CONFIG_DM_SNAPSHOT=m
 CONFIG_DM_MIRROR=m
 CONFIG_DM_ZERO=m
 CONFIG_DM_MULTIPATH=m
-CONFIG_IEEE1394=y
-CONFIG_IEEE1394_OHCI1394=y
-CONFIG_IEEE1394_SBP2=m
-CONFIG_IEEE1394_ETH1394=m
-CONFIG_IEEE1394_RAWIO=y
-CONFIG_IEEE1394_VIDEO1394=m
-CONFIG_IEEE1394_DV1394=m
 CONFIG_ADB_PMU=y
 CONFIG_PMAC_SMU=y
 CONFIG_THERM_PM72=y
@@ -220,50 +206,43 @@ CONFIG_WINDFARM_PM81=y
 CONFIG_WINDFARM_PM91=y
 CONFIG_WINDFARM_PM112=y
 CONFIG_WINDFARM_PM121=y
-CONFIG_NETDEVICES=y
-CONFIG_DUMMY=m
 CONFIG_BONDING=m
+CONFIG_DUMMY=m
+CONFIG_NETCONSOLE=y
+CONFIG_NETPOLL_TRAP=y
 CONFIG_TUN=m
-CONFIG_MARVELL_PHY=y
-CONFIG_BROADCOM_PHY=m
-CONFIG_NET_ETHERNET=y
-CONFIG_SUNGEM=y
-CONFIG_NET_VENDOR_3COM=y
 CONFIG_VORTEX=y
-CONFIG_IBMVETH=m
-CONFIG_NET_PCI=y
-CONFIG_PCNET32=y
-CONFIG_E100=y
 CONFIG_ACENIC=m
 CONFIG_ACENIC_OMIT_TIGON_I=y
-CONFIG_E1000=y
-CONFIG_E1000E=y
+CONFIG_PCNET32=y
 CONFIG_TIGON3=y
-CONFIG_BNX2=m
-CONFIG_SPIDER_NET=m
-CONFIG_GELIC_NET=m
-CONFIG_GELIC_WIRELESS=y
 CONFIG_CHELSIO_T1=m
-CONFIG_CHELSIO_T3=m
-CONFIG_CHELSIO_T4=m
+CONFIG_BE2NET=m
+CONFIG_S2IO=m
+CONFIG_IBMVETH=m
 CONFIG_EHEA=m
-CONFIG_IXGBE=m
+CONFIG_E100=y
+CONFIG_E1000=y
+CONFIG_E1000E=y
 CONFIG_IXGB=m
-CONFIG_S2IO=m
+CONFIG_IXGBE=m
+CONFIG_MLX4_EN=m
 CONFIG_MYRI10GE=m
-CONFIG_NETXEN_NIC=m
 CONFIG_PASEMI_MAC=y
-CONFIG_MLX4_EN=m
 CONFIG_QLGE=m
-CONFIG_BE2NET=m
+CONFIG_NETXEN_NIC=m
+CONFIG_SUNGEM=y
+CONFIG_GELIC_NET=m
+CONFIG_GELIC_WIRELESS=y
+CONFIG_SPIDER_NET=m
+CONFIG_MARVELL_PHY=y
+CONFIG_BROADCOM_PHY=m
 CONFIG_PPP=m
-CONFIG_PPP_ASYNC=m
-CONFIG_PPP_SYNC_TTY=m
-CONFIG_PPP_DEFLATE=m
 CONFIG_PPP_BSDCOMP=m
+CONFIG_PPP_DEFLATE=m
 CONFIG_PPPOE=m
-CONFIG_NETCONSOLE=y
-CONFIG_NETPOLL_TRAP=y
+CONFIG_PPP_ASYNC=m
+CONFIG_PPP_SYNC_TTY=m
 # CONFIG_INPUT_MOUSEDEV_PSAUX is not set
 CONFIG_INPUT_EVDEV=m
 CONFIG_INPUT_MISC=y
@@ -279,13 +258,10 @@ CONFIG_HVC_RTAS=y
 CONFIG_HVC_BEAT=y
 CONFIG_HVCS=m
 CONFIG_IBM_BSR=m
-CONFIG_HW_RANDOM=m
-CONFIG_HW_RANDOM_PSERIES=m
 CONFIG_RAW_DRIVER=y
 CONFIG_I2C_CHARDEV=y
 CONFIG_I2C_AMD8111=y
 CONFIG_I2C_PASEMI=y
-# CONFIG_HWMON is not set
 CONFIG_VIDEO_OUTPUT_CONTROL=m
 CONFIG_FB=y
 CONFIG_FIRMWARE_EDID=y
@@ -300,7 +276,6 @@ CONFIG_FB_RADEON=y
 CONFIG_FB_IBM_GXT4500=y
 CONFIG_FB_PS3=m
 CONFIG_LCD_CLASS_DEVICE=y
-CONFIG_DISPLAY_SUPPORT=y
 # CONFIG_VGA_CONSOLE is not set
 CONFIG_FRAMEBUFFER_CONSOLE=y
 CONFIG_LOGO=y
@@ -317,18 +292,16 @@ CONFIG_SND_AOA_FABRIC_LAYOUT=m
 CONFIG_SND_AOA_ONYX=m
 CONFIG_SND_AOA_TAS=m
 CONFIG_SND_AOA_TOONIE=m
-CONFIG_USB_HIDDEV=y
 CONFIG_HID_GYRATION=y
 CONFIG_HID_PANTHERLORD=y
 CONFIG_HID_PETALYNX=y
 CONFIG_HID_SAMSUNG=y
 CONFIG_HID_SONY=y
 CONFIG_HID_SUNPLUS=y
+CONFIG_USB_HIDDEV=y
 CONFIG_USB=y
-CONFIG_USB_DEVICEFS=y
 CONFIG_USB_MON=m
 CONFIG_USB_EHCI_HCD=y

[PATCH 2/3] powerpc: Cleanup NLS config options on pseries, ppc64 and ppc64e defconfig

2012-12-12 Thread Anton Blanchard

Set CONFIG_NLS_DEFAULT to utf8. The distros do this (eg ppc64 FC17
and RHEL6) as well as the x86 defconfigs. Userspace these days is
most likely to expect utf8 anyway.

Signed-off-by: Anton Blanchard an...@samba.org
--- 

Index: b/arch/powerpc/configs/ppc64_defconfig
===
--- a/arch/powerpc/configs/ppc64_defconfig
+++ b/arch/powerpc/configs/ppc64_defconfig
@@ -372,43 +372,11 @@ CONFIG_NFSD_V4=y
 CONFIG_CIFS=m
 CONFIG_CIFS_XATTR=y
 CONFIG_CIFS_POSIX=y
+CONFIG_NLS_DEFAULT="utf8"
 CONFIG_NLS_CODEPAGE_437=y
-CONFIG_NLS_CODEPAGE_737=m
-CONFIG_NLS_CODEPAGE_775=m
-CONFIG_NLS_CODEPAGE_850=m
-CONFIG_NLS_CODEPAGE_852=m
-CONFIG_NLS_CODEPAGE_855=m
-CONFIG_NLS_CODEPAGE_857=m
-CONFIG_NLS_CODEPAGE_860=m
-CONFIG_NLS_CODEPAGE_861=m
-CONFIG_NLS_CODEPAGE_862=m
-CONFIG_NLS_CODEPAGE_863=m
-CONFIG_NLS_CODEPAGE_864=m
-CONFIG_NLS_CODEPAGE_865=m
-CONFIG_NLS_CODEPAGE_866=m
-CONFIG_NLS_CODEPAGE_869=m
-CONFIG_NLS_CODEPAGE_936=m
-CONFIG_NLS_CODEPAGE_950=m
-CONFIG_NLS_CODEPAGE_932=m
-CONFIG_NLS_CODEPAGE_949=m
-CONFIG_NLS_CODEPAGE_874=m
-CONFIG_NLS_ISO8859_8=m
-CONFIG_NLS_CODEPAGE_1250=m
-CONFIG_NLS_CODEPAGE_1251=m
-CONFIG_NLS_ASCII=m
+CONFIG_NLS_ASCII=y
 CONFIG_NLS_ISO8859_1=y
-CONFIG_NLS_ISO8859_2=m
-CONFIG_NLS_ISO8859_3=m
-CONFIG_NLS_ISO8859_4=m
-CONFIG_NLS_ISO8859_5=m
-CONFIG_NLS_ISO8859_6=m
-CONFIG_NLS_ISO8859_7=m
-CONFIG_NLS_ISO8859_9=m
-CONFIG_NLS_ISO8859_13=m
-CONFIG_NLS_ISO8859_14=m
-CONFIG_NLS_ISO8859_15=m
-CONFIG_NLS_KOI8_R=m
-CONFIG_NLS_KOI8_U=m
+CONFIG_NLS_UTF8=y
 CONFIG_CRC_T10DIF=y
 CONFIG_MAGIC_SYSRQ=y
 CONFIG_DEBUG_KERNEL=y
Index: b/arch/powerpc/configs/ppc64e_defconfig
===
--- a/arch/powerpc/configs/ppc64e_defconfig
+++ b/arch/powerpc/configs/ppc64e_defconfig
@@ -290,43 +290,11 @@ CONFIG_NFSD_V4=y
 CONFIG_CIFS=m
 CONFIG_CIFS_XATTR=y
 CONFIG_CIFS_POSIX=y
+CONFIG_NLS_DEFAULT=utf8
 CONFIG_NLS_CODEPAGE_437=y
-CONFIG_NLS_CODEPAGE_737=m
-CONFIG_NLS_CODEPAGE_775=m
-CONFIG_NLS_CODEPAGE_850=m
-CONFIG_NLS_CODEPAGE_852=m
-CONFIG_NLS_CODEPAGE_855=m
-CONFIG_NLS_CODEPAGE_857=m
-CONFIG_NLS_CODEPAGE_860=m
-CONFIG_NLS_CODEPAGE_861=m
-CONFIG_NLS_CODEPAGE_862=m
-CONFIG_NLS_CODEPAGE_863=m
-CONFIG_NLS_CODEPAGE_864=m
-CONFIG_NLS_CODEPAGE_865=m
-CONFIG_NLS_CODEPAGE_866=m
-CONFIG_NLS_CODEPAGE_869=m
-CONFIG_NLS_CODEPAGE_936=m
-CONFIG_NLS_CODEPAGE_950=m
-CONFIG_NLS_CODEPAGE_932=m
-CONFIG_NLS_CODEPAGE_949=m
-CONFIG_NLS_CODEPAGE_874=m
-CONFIG_NLS_ISO8859_8=m
-CONFIG_NLS_CODEPAGE_1250=m
-CONFIG_NLS_CODEPAGE_1251=m
-CONFIG_NLS_ASCII=m
+CONFIG_NLS_ASCII=y
 CONFIG_NLS_ISO8859_1=y
-CONFIG_NLS_ISO8859_2=m
-CONFIG_NLS_ISO8859_3=m
-CONFIG_NLS_ISO8859_4=m
-CONFIG_NLS_ISO8859_5=m
-CONFIG_NLS_ISO8859_6=m
-CONFIG_NLS_ISO8859_7=m
-CONFIG_NLS_ISO8859_9=m
-CONFIG_NLS_ISO8859_13=m
-CONFIG_NLS_ISO8859_14=m
-CONFIG_NLS_ISO8859_15=m
-CONFIG_NLS_KOI8_R=m
-CONFIG_NLS_KOI8_U=m
+CONFIG_NLS_UTF8=y
 CONFIG_CRC_T10DIF=y
 CONFIG_MAGIC_SYSRQ=y
 CONFIG_DEBUG_KERNEL=y
Index: b/arch/powerpc/configs/pseries_defconfig
===
--- a/arch/powerpc/configs/pseries_defconfig
+++ b/arch/powerpc/configs/pseries_defconfig
@@ -298,9 +298,11 @@ CONFIG_NFSD_V4=y
 CONFIG_CIFS=m
 CONFIG_CIFS_XATTR=y
 CONFIG_CIFS_POSIX=y
+CONFIG_NLS_DEFAULT=utf8
 CONFIG_NLS_CODEPAGE_437=y
 CONFIG_NLS_ASCII=y
 CONFIG_NLS_ISO8859_1=y
+CONFIG_NLS_UTF8=y
 CONFIG_CRC_T10DIF=y
 CONFIG_MAGIC_SYSRQ=y
 CONFIG_DEBUG_KERNEL=y
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 3/3] powerpc: Enable devtmpfs, EFI partition support and tmpfs ACLs on pseries, ppc64 and ppc64e defconfig

2012-12-12 Thread Anton Blanchard

We need devtmpfs enabled to boot on recent versions of Fedora. EFI
partitions will be useful for large block devices. tmpfs ACL support
is used by some distros for managing access to devices.

Signed-off-by: Anton Blanchard an...@samba.org
--- 

Index: b/arch/powerpc/configs/pseries_defconfig
===
--- a/arch/powerpc/configs/pseries_defconfig
+++ b/arch/powerpc/configs/pseries_defconfig
@@ -32,6 +32,8 @@ CONFIG_MODULES=y
 CONFIG_MODULE_UNLOAD=y
 CONFIG_MODVERSIONS=y
 CONFIG_MODULE_SRCVERSION_ALL=y
+CONFIG_PARTITION_ADVANCED=y
+CONFIG_EFI_PARTITION=y
 CONFIG_PPC_SPLPAR=y
 CONFIG_SCANLOG=m
 CONFIG_PPC_SMLPAR=y
@@ -118,6 +120,8 @@ CONFIG_IP_NF_FILTER=m
 CONFIG_IP_NF_TARGET_REJECT=m
 CONFIG_IP_NF_TARGET_ULOG=m
 CONFIG_UEVENT_HELPER_PATH=/sbin/hotplug
+CONFIG_DEVTMPFS=y
+CONFIG_DEVTMPFS_MOUNT=y
 CONFIG_PROC_DEVICETREE=y
 CONFIG_PARPORT=m
 CONFIG_PARPORT_PC=m
@@ -283,6 +287,7 @@ CONFIG_MSDOS_FS=y
 CONFIG_VFAT_FS=y
 CONFIG_PROC_KCORE=y
 CONFIG_TMPFS=y
+CONFIG_TMPFS_POSIX_ACL=y
 CONFIG_HUGETLBFS=y
 CONFIG_CRAMFS=m
 CONFIG_SQUASHFS=m
Index: b/arch/powerpc/configs/ppc64_defconfig
===
--- a/arch/powerpc/configs/ppc64_defconfig
+++ b/arch/powerpc/configs/ppc64_defconfig
@@ -25,6 +25,7 @@ CONFIG_MODULE_UNLOAD=y
 CONFIG_MODVERSIONS=y
 CONFIG_MODULE_SRCVERSION_ALL=y
 CONFIG_PARTITION_ADVANCED=y
+CONFIG_EFI_PARTITION=y
 CONFIG_PPC_SPLPAR=y
 CONFIG_SCANLOG=m
 CONFIG_PPC_SMLPAR=y
@@ -146,6 +147,8 @@ CONFIG_IP_NF_ARPFILTER=m
 CONFIG_IP_NF_ARP_MANGLE=m
 CONFIG_BPF_JIT=y
 CONFIG_UEVENT_HELPER_PATH=/sbin/hotplug
+CONFIG_DEVTMPFS=y
+CONFIG_DEVTMPFS_MOUNT=y
 CONFIG_PROC_DEVICETREE=y
 CONFIG_BLK_DEV_FD=y
 CONFIG_BLK_DEV_LOOP=y
@@ -354,6 +357,7 @@ CONFIG_MSDOS_FS=y
 CONFIG_VFAT_FS=y
 CONFIG_PROC_KCORE=y
 CONFIG_TMPFS=y
+CONFIG_TMPFS_POSIX_ACL=y
 CONFIG_HUGETLBFS=y
 CONFIG_HFS_FS=m
 CONFIG_HFSPLUS_FS=m
Index: b/arch/powerpc/configs/ppc64e_defconfig
===
--- a/arch/powerpc/configs/ppc64e_defconfig
+++ b/arch/powerpc/configs/ppc64e_defconfig
@@ -22,6 +22,7 @@ CONFIG_MODVERSIONS=y
 CONFIG_MODULE_SRCVERSION_ALL=y
 CONFIG_PARTITION_ADVANCED=y
 CONFIG_MAC_PARTITION=y
+CONFIG_EFI_PARTITION=y
 CONFIG_P5020_DS=y
 CONFIG_CPU_FREQ=y
 CONFIG_CPU_FREQ_GOV_POWERSAVE=y
@@ -119,6 +120,8 @@ CONFIG_IP_NF_ARPTABLES=m
 CONFIG_IP_NF_ARPFILTER=m
 CONFIG_IP_NF_ARP_MANGLE=m
 CONFIG_UEVENT_HELPER_PATH=/sbin/hotplug
+CONFIG_DEVTMPFS=y
+CONFIG_DEVTMPFS_MOUNT=y
 CONFIG_PROC_DEVICETREE=y
 CONFIG_BLK_DEV_FD=y
 CONFIG_BLK_DEV_LOOP=y
@@ -277,6 +280,7 @@ CONFIG_MSDOS_FS=y
 CONFIG_VFAT_FS=y
 CONFIG_PROC_KCORE=y
 CONFIG_TMPFS=y
+CONFIG_TMPFS_POSIX_ACL=y
 CONFIG_HFS_FS=m
 CONFIG_HFSPLUS_FS=m
 CONFIG_CRAMFS=y



[PATCH] powerpc: Avoid load of static chain register when calling nested functions through a pointer on 64bit

2012-12-12 Thread Anton Blanchard

The ppc64 ABI has a static chain register (r11) which is only used
when calling nested functions through a pointer. Considering that
we take a dim view of nested functions in the kernel, we have a lot
of unnecessary overhead here.

gcc 4.7 has an option to disable loading of r11, so let's use it.

If hell freezes over and hipsters manage to litter the kernel
with nested functions, gcc will give us an error message and
won't simply compile bad code:

You cannot take the address of a nested function if you use
the -mno-pointers-to-nested-functions option.

Furthermore, our kernel module trampolines don't set up the static
chain register, so adding this option and forcing gcc to error out
makes even more sense.

Signed-off-by: Anton Blanchard an...@samba.org
--- 

Index: b/arch/powerpc/Makefile
===
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -85,6 +85,7 @@ endif
 
 CFLAGS-$(CONFIG_PPC64) := -mtraceback=no -mcall-aixdesc
 CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mcmodel=medium,-mminimal-toc)
+CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mno-pointers-to-nested-functions)
 CFLAGS-$(CONFIG_PPC32) := -ffixed-r2 -mmultiple
 
 CFLAGS-$(CONFIG_GENERIC_CPU) += $(call cc-option,-mtune=power7,-mtune=power4)
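
For context, here is a minimal sketch of the pattern that makes the static chain register unnecessary. Instead of a GCC nested function capturing locals from its parent (the case that needs r11 when called through a pointer), the callback receives its state through an explicit context pointer. All names here (`struct sum_ctx`, `for_each_value`, `add_value`, `sum_values`) are hypothetical and are not taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context, passed explicitly instead of being captured. */
struct sum_ctx {
	long total;
};

typedef void (*visit_fn)(int value, void *data);

/* Calls fn through a plain function pointer; no static chain is needed. */
static void for_each_value(const int *values, size_t n, visit_fn fn, void *data)
{
	size_t i;

	assert(fn != NULL);
	for (i = 0; i < n; i++)
		fn(values[i], data);
}

/* File-scope callback: all state arrives via the data pointer. */
static void add_value(int value, void *data)
{
	struct sum_ctx *ctx = data;

	ctx->total += value;
}

/* Sums n values by handing add_value and its context to the iterator. */
static long sum_values(const int *values, size_t n)
{
	struct sum_ctx ctx = { .total = 0 };

	for_each_value(values, n, add_value, &ctx);
	return ctx.total;
}
```

Code written this way compiles unchanged with -mno-pointers-to-nested-functions, because no function address taken here belongs to a nested function.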


Re: [PATCH v3] powerpc: fix wii_memory_fixups() compile error on 3.0.y tree

2012-12-12 Thread Ben Hutchings
On Wed, 2012-12-12 at 16:27 -0700, Shuah Khan wrote:
 Fix the following wii_memory_fixups() compile error on the 3.0.y tree with
 wii_defconfig.
 
   CC  arch/powerpc/platforms/embedded6xx/wii.o
 arch/powerpc/platforms/embedded6xx/wii.c: In function ‘wii_memory_fixups’:
 arch/powerpc/platforms/embedded6xx/wii.c:88:2: error: format ‘%llx’ expects 
 argument of type ‘long long unsigned int’, but argument 2 has type 
 ‘phys_addr_t’ [-Werror=format]
 arch/powerpc/platforms/embedded6xx/wii.c:88:2: error: format ‘%llx’ expects 
 argument of type ‘long long unsigned int’, but argument 3 has type 
 ‘phys_addr_t’ [-Werror=format]
 arch/powerpc/platforms/embedded6xx/wii.c:90:2: error: format ‘%llx’ expects 
 argument of type ‘long long unsigned int’, but argument 2 has type 
 ‘phys_addr_t’ [-Werror=format]
 arch/powerpc/platforms/embedded6xx/wii.c:90:2: error: format ‘%llx’ expects 
 argument of type ‘long long unsigned int’, but argument 3 has type 
 ‘phys_addr_t’ [-Werror=format]
 cc1: all warnings being treated as errors
 make[2]: *** [arch/powerpc/platforms/embedded6xx/wii.o] Error 1
 make[1]: *** [arch/powerpc/platforms/embedded6xx] Error 2
 make: *** [arch/powerpc/platforms] Error 2
 
 Signed-off-by: Shuah Khan shuah.k...@hp.com
 CC: sta...@vger.kernel.org 3.0.y
[...]

This looks fine, and I've queued it up for 3.2 since the errant code
wasn't removed until 3.3.
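
The patch body is elided above, so the actual fix is not shown here. As a hedged illustration only, the usual remedy for this class of warning is to cast the argument to the type the format expects. In the sketch below `phys_addr_t` is a local stand-in typedef (assumed to be 32 bits wide, which is what the error implies) and `format_range` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the kernel type; assumed 32-bit, as the error implies. */
typedef uint32_t phys_addr_t;

/*
 * Formats "start-end" in hex. The casts make the arguments match %llx
 * regardless of how wide phys_addr_t happens to be.
 */
static int format_range(char *buf, size_t len, phys_addr_t start,
			phys_addr_t end)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%08llx-%08llx",
			(unsigned long long)start,
			(unsigned long long)end);
}
```

The cast costs nothing at run time on a 64-bit build and widens the value on a 32-bit one, so the same format string works in both configurations.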

Ben.

-- 
Ben Hutchings
Theory and practice are closer in theory than in practice.
- John Levine, moderator of comp.compilers



Re: [PATCH] vfio powerpc: enabled on powernv platform

2012-12-12 Thread Alexey Kardashevskiy

On 13/12/12 10:30, Alex Williamson wrote:

On Wed, 2012-12-12 at 23:34 +1100, Alexey Kardashevskiy wrote:

This patch initializes IOMMU groups based on the IOMMU
configuration discovered during the PCI scan on POWERNV
(POWER non virtualized) platform. The IOMMU groups are
to be used later by VFIO driver (PCI pass through).

It also implements an API for mapping/unmapping pages for
guest PCI drivers and providing DMA window properties.
This API is going to be used later by QEMU-VFIO to handle
h_put_tce hypercalls from the KVM guest.

Although this driver has been tested only on the POWERNV
platform, it should work on any platform which supports
TCE tables.

To enable VFIO on POWER, enable SPAPR_TCE_IOMMU config
option and configure VFIO as required.

Cc: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
  arch/powerpc/include/asm/iommu.h |   10 ++
  arch/powerpc/kernel/iommu.c  |  329 ++
  arch/powerpc/platforms/powernv/pci.c |  134 ++
  drivers/iommu/Kconfig|8 +
  4 files changed, 481 insertions(+)

diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index cbfe678..3c861ae 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -76,6 +76,9 @@ struct iommu_table {
struct iommu_pool large_pool;
struct iommu_pool pools[IOMMU_NR_POOLS];
unsigned long *it_map;   /* A simple allocation bitmap for now */
+#ifdef CONFIG_IOMMU_API
+   struct iommu_group *it_group;
+#endif
  };

  struct scatterlist;
@@ -147,5 +150,12 @@ static inline void iommu_restore(void)
  }
  #endif

+extern void iommu_reset_table(struct iommu_table *tbl, bool restore);
+extern long iommu_clear_tces(struct iommu_table *tbl, unsigned long ioba,
+   unsigned long size);
+extern long iommu_put_tces(struct iommu_table *tbl, unsigned long ioba,
+   uint64_t tce, enum dma_data_direction direction,
+   unsigned long size);
+
  #endif /* __KERNEL__ */
  #endif /* _ASM_IOMMU_H */
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index ff5a6ce..f3bb2e7 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -36,6 +36,7 @@
  #include <linux/hash.h>
  #include <linux/fault-inject.h>
  #include <linux/pci.h>
+#include <linux/uaccess.h>
  #include <asm/io.h>
  #include <asm/prom.h>
  #include <asm/iommu.h>
@@ -44,6 +45,7 @@
  #include <asm/kdump.h>
  #include <asm/fadump.h>
  #include <asm/vio.h>
+#include <asm/tce.h>

  #define DBG(...)

@@ -856,3 +858,330 @@ void iommu_free_coherent(struct iommu_table *tbl, size_t size,
free_pages((unsigned long)vaddr, get_order(size));
}
  }
+
+#ifdef CONFIG_IOMMU_API
+/*
+ * SPAPR TCE API
+ */
+
+struct vwork {
+	struct mm_struct	*mm;
+	long			npage;
+	struct work_struct	work;
+};
+
+/* delayed decrement/increment for locked_vm */
+static void lock_acct_bg(struct work_struct *work)
+{
+	struct vwork *vwork = container_of(work, struct vwork, work);
+	struct mm_struct *mm;
+
+	mm = vwork->mm;
+	down_write(&mm->mmap_sem);
+	mm->locked_vm += vwork->npage;
+	up_write(&mm->mmap_sem);
+	mmput(mm);
+	kfree(vwork);
+}
+
+static void lock_acct(long npage)
+{
+	struct vwork *vwork;
+	struct mm_struct *mm;
+
+	if (!current->mm)
+		return; /* process exited */
+
+	if (down_write_trylock(&current->mm->mmap_sem)) {
+		current->mm->locked_vm += npage;
+		up_write(&current->mm->mmap_sem);
+		return;
+	}
+
+	/*
+	 * Couldn't get mmap_sem lock, so must setup to update
+	 * mm->locked_vm later. If locked_vm were atomic, we
+	 * wouldn't need this silliness
+	 */
+	vwork = kmalloc(sizeof(struct vwork), GFP_KERNEL);
+	if (!vwork)
+		return;
+	mm = get_task_mm(current);
+	if (!mm) {
+		kfree(vwork);
+		return;
+	}
+	INIT_WORK(&vwork->work, lock_acct_bg);
+	vwork->mm = mm;
+	vwork->npage = npage;
+	schedule_work(&vwork->work);
+}


Locked page accounting in this version is very, very broken.  How do
powerpc folks feel about seemingly generic kernel iommu interfaces
messing with the current task mm?  Besides that, more problems below...


+
+/*
+ * iommu_reset_table is called when it started/stopped being used.
+ *
+ * restore==true says to bring the iommu_table into the state as it was
+ * before being used by VFIO.
+ */
+void iommu_reset_table(struct iommu_table *tbl, bool restore)
+{
+	/* Page#0 is marked as used in iommu_init_table, so we clear it... */
+	if (!restore && (tbl->it_offset == 0))
+		clear_bit(0, tbl->it_map);
+
+	iommu_clear_tces(tbl, tbl->it_offset, tbl->it_size);


This does locked page accounting and unpins pages, even on startup when

Re: [PATCH] vfio powerpc: enabled on powernv platform

2012-12-12 Thread Benjamin Herrenschmidt
On Wed, 2012-12-12 at 07:34 -0700, Alex Williamson wrote:
  But what would I put there?... IOMMU ID is more than enough at the moment 
  and struct iommu_table does not have anything what would have made sense to 
  show in the sysfs...
 
 I believe David mentioned that PEs had user visible names.  Perhaps they
 match an enclosure location or something.  Group numbers are rather
 arbitrary and really have no guarantee of persistence.  Thanks, 

I agree. Make up something, for example domain[PE] or something like
that.

Cheers,
Ben.




Re: [PATCH] vfio powerpc: enabled on powernv platform

2012-12-12 Thread Benjamin Herrenschmidt
On Wed, 2012-12-12 at 16:30 -0700, Alex Williamson wrote:

 Locked page accounting in this version is very, very broken.  How do
 powerpc folks feel about seemingly generic kernel iommu interfaces
 messing with the current task mm?  Besides that, more problems below...

Not good at all :-)

I don't understand tho ... H_PUT_TCE calls should be in the qemu context
(or the guest) as current at the point of the call, so everything should
be accounted fine on the *current* task when those calls occur, what's
the point of the work queue Alexey ?

This code looks horribly complicated ... where does it come from ?

  +/*
  + * iommu_reset_table is called when it started/stopped being used.
  + *
  + * restore==true says to bring the iommu_table into the state as it was
  + * before being used by VFIO.
  + */
  +void iommu_reset_table(struct iommu_table *tbl, bool restore)
  +{
  +	/* Page#0 is marked as used in iommu_init_table, so we clear it... */
  +	if (!restore && (tbl->it_offset == 0))
  +		clear_bit(0, tbl->it_map);
  +
  +	iommu_clear_tces(tbl, tbl->it_offset, tbl->it_size);
 
 This does locked page accounting and unpins pages, even on startup when
 the pages aren't necessarily pinned or accounted against the current
 process.

Not sure what you mean, Alex, and I'm not sure either what Alexey's
implementation actually does, but indeed, pages inside an iommu table
that was used by the host don't have their refcount elevated by the fact
that they are there.

So when taking ownership of an iommu for vfio, you probably need to FAIL
if any page is already mapped. Only once you know the iommu is clear for
use, then you can start populating it and account for anything you put
in it (and de-account anything you remove from it when cleaning things
up).
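
The ownership rule described above can be sketched as follows: refuse to hand a table over if its allocation bitmap shows any entry in use. This is a userspace model only; `struct toy_table`, `toy_take_ownership` and the bitmap layout are hypothetical and are not the kernel's `struct iommu_table` API. It also assumes that unused bits in the last bitmap word are kept at zero.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical model of a table: one bit per entry, set means mapped. */
struct toy_table {
	unsigned long *map;
	size_t nr_entries;
	int owned_by_user;
};

/* Returns 0 and marks the table owned, or -EBUSY if any entry is mapped. */
static int toy_take_ownership(struct toy_table *tbl)
{
	size_t nr_words = (tbl->nr_entries + TOY_BITS_PER_LONG - 1) /
			  TOY_BITS_PER_LONG;
	size_t i;

	assert(tbl->map != NULL);
	for (i = 0; i < nr_words; i++)
		if (tbl->map[i])
			return -EBUSY;	/* host still has mappings here */

	tbl->owned_by_user = 1;
	return 0;
}
```

Failing early like this means everything mapped afterwards was put there by the new owner, so it can be accounted on the way in and de-accounted on the way out.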

  +
  +	/* ... or restore  */
  +	if (restore && (tbl->it_offset == 0))
  +		set_bit(0, tbl->it_map);
  +}
  +EXPORT_SYMBOL_GPL(iommu_reset_table);
  +
  +/*
  + * Returns the number of used IOMMU pages (4K) within
  + * the same system page (4K or 64K).
  + *
  + * syspage_weight_zero is optimized for expected case == 0
  + * syspage_weight_one is optimized for expected case > 1
  + * Other case are not used in this file.
  + */
  +#if PAGE_SIZE == IOMMU_PAGE_SIZE
  +
  +#define syspage_weight_zero(map, offset)	test_bit((map), (offset))
  +#define syspage_weight_one(map, offset)	test_bit((map), (offset))
  +
  +#elif PAGE_SIZE/IOMMU_PAGE_SIZE == 16
  +
  +static int syspage_weight_zero(unsigned long *map, unsigned long offset)
  +{
  +	offset &= PAGE_MASK >> IOMMU_PAGE_SHIFT;
  +	return 0xffffUL & (map[BIT_WORD(offset)] >>
  +			(offset & (BITS_PER_LONG-1)));
  +}
 
 I would have expected these to be bools and return true if the weight
 matches the value.

What is that business anyway ? It's very obscure.

 If you replaced 0xffff above w/ this, would you need the #error below?
 
 ((1UL << (PAGE_SIZE/IOMMU_PAGE_SIZE)) - 1)
 
  +
  +static int syspage_weight_one(unsigned long *map, unsigned long offset)
  +{
  +	int ret = 0, nbits = PAGE_SIZE/IOMMU_PAGE_SIZE;
  +
  +	/* Aligns TCE entry number to system page boundary */
  +	offset &= PAGE_MASK >> IOMMU_PAGE_SHIFT;
  +
  +	/* Count used 4K pages */
  +	while (nbits && (ret < 2)) {
 
 Don't you have a ffs()?  Could also be used for _zero.  Surely there are
 some bitops helpers that could help here even on big endian.  hweight
 really doesn't work?
 
  +   if (test_bit(offset, map))
  +   ++ret;
  +
  +   --nbits;
  +   ++offset;
  +   }
  +
  +   return ret;
  +}
  +#else
  +#error TODO: support other page size
  +#endif

What combinations do you support ?
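
The review comments above suggest deriving the mask from the page-size ratio and using a population count instead of an open-coded loop. A minimal userspace sketch of that idea follows, assuming the ratio is a power of two no larger than the word size. `syspage_weight` and its parameters are hypothetical, and `__builtin_popcountl` (a GCC/Clang builtin) stands in for the kernel's hweight helpers.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Counts the used IOMMU pages that share a system page with entry
 * `offset`. `ratio` is the system page size divided by the IOMMU page size.
 */
static int syspage_weight(const unsigned long *map, unsigned long offset,
			  unsigned int ratio)
{
	unsigned long mask, word;

	/* ratio must be a power of two that fits in one word */
	assert(ratio && !(ratio & (ratio - 1)) && ratio <= BITS_PER_WORD);

	/* Align the entry number down to the start of its system page. */
	offset &= ~((unsigned long)ratio - 1);

	/* Avoid shifting by the full word width, which is undefined. */
	mask = ratio == BITS_PER_WORD ? ~0UL : (1UL << ratio) - 1;
	word = map[offset / BITS_PER_WORD] >> (offset % BITS_PER_WORD);

	return __builtin_popcountl(word & mask);
}
```

Because the group is aligned to a power-of-two ratio, it never straddles two bitmap words, so a single shift and mask is enough and no per-ratio `#error` branch is required.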

  +static void tce_flush(struct iommu_table *tbl)
  +{
  +   /* Flush/invalidate TLB caches if necessary */
  +   if (ppc_md.tce_flush)
  +   ppc_md.tce_flush(tbl);
  +
  +   /* Make sure updates are seen by hardware */
  +   mb();
  +}
 +
  +/*
  + * iommu_clear_tces clears tces and returned the number of system pages
  + * which it called put_page() on
  + */
  +static long clear_tces_nolock(struct iommu_table *tbl, unsigned long entry,
  +		unsigned long pages)
  +{
  +	int i, retpages = 0, clr;
  +	unsigned long oldtce, oldweight;
  +	struct page *page;
  +
  +	for (i = 0; i < pages; ++i, ++entry) {
  +		if (!test_bit(entry - tbl->it_offset, tbl->it_map))
  +			continue;
  +
  +		oldtce = ppc_md.tce_get(tbl, entry);
  +		ppc_md.tce_free(tbl, entry, 1);
  +
  +		oldweight = syspage_weight_one(tbl->it_map,
  +				entry - tbl->it_offset);
  +		clr = __test_and_clear_bit(entry - tbl->it_offset,
  +				tbl->it_map);
  +
  +		if (WARN_ON(!(oldtce & (TCE_PCI_WRITE | TCE_PCI_READ))))
  +			continue;
  +
  +		page = pfn_to_page(oldtce >> PAGE_SHIFT);
  +
  +		if (WARN_ON(!page))
  +			continue;
  +
  +   if 

Re: [PATCH] vfio powerpc: enabled on powernv platform

2012-12-12 Thread Benjamin Herrenschmidt
On Wed, 2012-12-12 at 16:30 -0700, Alex Williamson wrote:
 Locked page accounting in this version is very, very broken.  How do
 powerpc folks feel about seemingly generic kernel iommu interfaces
 messing with the current task mm?  Besides that, more problems
 below...

After a second look & thought...

This whole accounting business is fucked. First, we simply can't just
randomly return errors from H_PUT_TCE because the process reached some
rlimit. This is not a proper failure mode. That means that the guest
will probably panic() ... possibly right in the middle of some disk
writeback or god knows what. Not good.

Also the overhead of doing all that crap on every TCE map/unmap is
ridiculous.

Finally, it's just not going to work for real mode which we really want,
since we can't take the mmap-sem in real mode anyway, so unless we
convert that counter to an atomic, we can't do it.

I'd suggest just not bothering, or if you want to bother, check once
when creating a TCE table that the rlimit is enough to bolt as many
pages as can be populated in that table and fail to create *that*. The
failure mode is much better, ie, qemu failing to create a PCI bus due to
insufficient rlimits.
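
The upfront check suggested above can be sketched in userspace terms: compute how many bytes the whole window could pin and compare once against the locked-memory limit, rather than on every map/unmap. `window_fits_limit` and `can_create_window` are hypothetical names, and a kernel version would consult the task's RLIMIT_MEMLOCK directly rather than call `getrlimit()`.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/resource.h>

/*
 * Returns 1 if a window of `entries` IOMMU pages of `page_size` bytes,
 * added to `already_locked` bytes, stays within `limit` bytes.
 */
static int window_fits_limit(uint64_t entries, uint64_t page_size,
			     uint64_t already_locked, uint64_t limit)
{
	uint64_t need;

	assert(page_size != 0);
	if (entries > UINT64_MAX / page_size)
		return 0;	/* size computation would overflow */
	need = entries * page_size;
	if (need > UINT64_MAX - already_locked)
		return 0;	/* sum would overflow */
	return need + already_locked <= limit;
}

/* Checks the calling process's RLIMIT_MEMLOCK once, at creation time. */
static int can_create_window(uint64_t entries, uint64_t page_size)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return 0;
	if (rl.rlim_cur == RLIM_INFINITY)
		return 1;
	return window_fits_limit(entries, page_size, 0, (uint64_t)rl.rlim_cur);
}
```

With a check of this shape the failure surfaces when the window is created, which is the point where the caller can still report it cleanly.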

Cheers,
Ben.




Re: [PATCH] vfio powerpc: enabled on powernv platform

2012-12-12 Thread Alex Williamson
On Thu, 2012-12-13 at 13:57 +1100, Benjamin Herrenschmidt wrote:
 On Wed, 2012-12-12 at 16:30 -0700, Alex Williamson wrote:
  Locked page accounting in this version is very, very broken.  How do
  powerpc folks feel about seemingly generic kernel iommu interfaces
  messing with the current task mm?  Besides that, more problems
  below...
 
 After a second look & thought...
 
 This whole accounting business is fucked. First, we simply can't just
 randomly return errors from H_PUT_TCE because the process reached some
 rlimit. This is not a proper failure mode. That means that the guest
 will probably panic() ... possibly right in the middle of some disk
 writeback or god knows what. Not good.
 
 Also the overhead of doing all that crap on every TCE map/unmap is
 ridiculous.
 
 Finally, it's just not going to work for real mode which we really want,
 since we can't take the mmap-sem in real mode anyway, so unless we
 convert that counter to an atomic, we can't do it.
 
 I'd suggest just not bothering, or if you want to bother, check once
 when creating a TCE table that the rlimit is enough to bolt as many
 pages as can be populated in that table and fail to create *that*. The
 failure mode is much better, ie, qemu failing to create a PCI bus due to
 insufficient rlimits.

I agree, we don't seem to be headed in the right direction.  x86 needs
to track rlimits or else a user can exploit the interface to pin all the
memory in the system.  On power, only the iova window can be pinned, so
it's a fixed amount.  I could see it as granting access to a group
implicitly grants access to pinning the iova window.  We can still make
it more explicit by handling the rlimit accounting upfront.  Thanks,

Alex



[PATCH] powerpc: added DSCR support to ptrace

2012-12-12 Thread Alexey Kardashevskiy
The DSCR (Data Stream Control Register) is supported on some
server PowerPC chips and allows some control over the prefetching
of data streams.

The kernel already supports a per-thread DSCR value, but there is also
a need to be able to change it from an external process for
a specific pid.

The patch adds a new register index PT_DSCR (index=44) which can be
set/read with:
  ptrace(PTRACE_POKEUSER, traced_process, PT_DSCR << 3, dscr);
  dscr = ptrace(PTRACE_PEEKUSER, traced_process, PT_DSCR << 3, NULL);

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 arch/powerpc/include/asm/ptrace.h |1 +
 arch/powerpc/kernel/ptrace.c  |   16 
 2 files changed, 17 insertions(+)

diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
index 9c21ed4..340fe36 100644
--- a/arch/powerpc/include/asm/ptrace.h
+++ b/arch/powerpc/include/asm/ptrace.h
@@ -276,6 +276,7 @@ static inline unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs,
 #define PT_DAR 41
 #define PT_DSISR 42
 #define PT_RESULT 43
+#define PT_DSCR 44
 #define PT_REGS_COUNT 44
 
 #define PT_FPR0	48	/* each FP reg occupies 2 slots in this space */
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index c10fc28..d3ba67b 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -179,6 +179,17 @@ static int set_user_msr(struct task_struct *task, unsigned long msr)
 	return 0;
 }
 
+static unsigned long get_user_dscr(struct task_struct *task)
+{
+	return task->thread.dscr;
+}
+
+static int set_user_dscr(struct task_struct *task, unsigned long dscr)
+{
+	task->thread.dscr = dscr;
+	return 0;
+}
+
 /*
  * We prevent mucking around with the reserved area of trap
  * which are used internally by the kernel.
@@ -200,6 +211,9 @@ unsigned long ptrace_get_reg(struct task_struct *task, int regno)
 	if (regno == PT_MSR)
 		return get_user_msr(task);
 
+	if (regno == PT_DSCR)
+		return get_user_dscr(task);
+
 	if (regno < (sizeof(struct pt_regs) / sizeof(unsigned long)))
 		return ((unsigned long *)task->thread.regs)[regno];
 
@@ -218,6 +232,8 @@ int ptrace_put_reg(struct task_struct *task, int regno, unsigned long data)
 		return set_user_msr(task, data);
 	if (regno == PT_TRAP)
 		return set_user_trap(task, data);
+	if (regno == PT_DSCR)
+		return set_user_dscr(task, data);
 
 	if (regno <= PT_MAX_PUT_REG) {
 		((unsigned long *)task->thread.regs)[regno] = data;
-- 
1.7.10.4
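
The ptrace calls in the commit message address the register by byte offset into the user area: index 44 shifted left by 3, that is, multiplied by the 8-byte register size on 64-bit. A minimal userspace sketch follows, assuming a 64-bit PowerPC kernel with this patch applied and an already attached, stopped tracee. `PT_DSCR` is defined locally because installed headers may not carry it, and `dscr_offset`, `peek_dscr` and `poke_dscr` are hypothetical helper names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>

#ifndef PT_DSCR
#define PT_DSCR 44	/* register index added by the patch */
#endif

/* Byte offset of the DSCR slot: index times 8 bytes per register. */
static long dscr_offset(void)
{
	return (long)PT_DSCR << 3;
}

/* Reads the tracee's DSCR; returns 0 on success, -1 with errno set. */
static int peek_dscr(pid_t pid, unsigned long *dscr)
{
	long val;

	assert(dscr != NULL);
	errno = 0;	/* -1 can be a valid register value, so check errno */
	val = ptrace(PTRACE_PEEKUSER, pid, (void *)dscr_offset(), NULL);
	if (val == -1 && errno != 0)
		return -1;
	*dscr = (unsigned long)val;
	return 0;
}

/* Writes a new DSCR value into the tracee; returns 0 or -1. */
static int poke_dscr(pid_t pid, unsigned long dscr)
{
	return ptrace(PTRACE_POKEUSER, pid, (void *)dscr_offset(),
		      (void *)dscr) == -1 ? -1 : 0;
}
```

On a kernel without the patch, or on another architecture, both helpers are expected to fail with an error from ptrace rather than touch any register.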



RE: [PATCH] Revert crypto: caam - Updated SEC-4.0 device tree binding for ERA information.

2012-12-12 Thread Garg Vakul-B16394
Hello Kumar

This has been applied to: 

git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git.

Regards

Vakul

 -Original Message-
 From: Kumar Gala [mailto:ga...@kernel.crashing.org]
 Sent: Thursday, December 13, 2012 3:00 AM
 To: Garg Vakul-B16394
 Cc: linux-cry...@vger.kernel.org; linuxppc-...@ozlabs.org; devicetree-
 disc...@lists.ozlabs.org
 Subject: Re: [PATCH] Revert crypto: caam - Updated SEC-4.0 device tree
 binding for ERA information.
 
 
 On Dec 7, 2012, at 2:57 AM, Vakul Garg wrote:
 
  This reverts commit a2c0911c09190125f52c9941b9d187f601c2f7be.
 
  Signed-off-by: Vakul Garg va...@freescale.com
  ---
  Instead of adding SEC era information in crypto node's compatible, a
  new property 'fsl,sec-era' is being introduced into crypto node.
 
  .../devicetree/bindings/crypto/fsl-sec4.txt|5 ++---
  1 files changed, 2 insertions(+), 3 deletions(-)
 
 What tree do you think this has been applied to?
 
 - k




[PATCH] powerpc: added DSCR support to ptrace

2012-12-12 Thread Alexey Kardashevskiy
The DSCR (Data Stream Control Register) is supported on some
server PowerPC chips and allows some control over the prefetching
of data streams.

The kernel already supports a per-thread DSCR value, but there is also
a need to be able to change it from an external process for
a specific pid.

The patch adds a new register index PT_DSCR (index=44) which can be
set/read with:
  ptrace(PTRACE_POKEUSER, traced_process, PT_DSCR << 3, dscr);
  dscr = ptrace(PTRACE_PEEKUSER, traced_process, PT_DSCR << 3, NULL);

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 arch/powerpc/include/asm/ptrace.h |1 +
 arch/powerpc/kernel/ptrace.c  |   17 +
 2 files changed, 18 insertions(+)

diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
index 9c21ed4..340fe36 100644
--- a/arch/powerpc/include/asm/ptrace.h
+++ b/arch/powerpc/include/asm/ptrace.h
@@ -276,6 +276,7 @@ static inline unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs,
 #define PT_DAR 41
 #define PT_DSISR 42
 #define PT_RESULT 43
+#define PT_DSCR 44
 #define PT_REGS_COUNT 44
 
 #define PT_FPR0	48	/* each FP reg occupies 2 slots in this space */
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index c10fc28..aa19389 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -179,6 +179,18 @@ static int set_user_msr(struct task_struct *task, unsigned long msr)
 	return 0;
 }
 
+static unsigned long get_user_dscr(struct task_struct *task)
+{
+	return task->thread.dscr;
+}
+
+static int set_user_dscr(struct task_struct *task, unsigned long dscr)
+{
+	task->thread.dscr = dscr;
+	task->thread.dscr_inherit = 1;
+	return 0;
+}
+
 /*
  * We prevent mucking around with the reserved area of trap
  * which are used internally by the kernel.
@@ -200,6 +212,9 @@ unsigned long ptrace_get_reg(struct task_struct *task, int regno)
 	if (regno == PT_MSR)
 		return get_user_msr(task);
 
+	if (regno == PT_DSCR)
+		return get_user_dscr(task);
+
 	if (regno < (sizeof(struct pt_regs) / sizeof(unsigned long)))
 		return ((unsigned long *)task->thread.regs)[regno];
 
@@ -218,6 +233,8 @@ int ptrace_put_reg(struct task_struct *task, int regno, unsigned long data)
 		return set_user_msr(task, data);
 	if (regno == PT_TRAP)
 		return set_user_trap(task, data);
+	if (regno == PT_DSCR)
+		return set_user_dscr(task, data);
 
 	if (regno <= PT_MAX_PUT_REG) {
 		((unsigned long *)task->thread.regs)[regno] = data;
-- 
1.7.10.4



Re: [PATCH] vfio powerpc: enabled on powernv platform

2012-12-12 Thread Alexey Kardashevskiy

On 13/12/12 13:29, Benjamin Herrenschmidt wrote:

On Wed, 2012-12-12 at 07:34 -0700, Alex Williamson wrote:

But what would I put there?... IOMMU ID is more than enough at the moment
and struct iommu_table does not have anything what would have made sense to
show in the sysfs...


I believe David mentioned that PEs had user visible names.  Perhaps they
match an enclosure location or something.  Group numbers are rather
arbitrary and really have no guarantee of persistence.  Thanks,


I agree. Make up something, for example domain[PE] or something like
that.


To be able to add a PE number, I need to call iommu_group_alloc() in the 
correct place where I know this number OR I have to carry it in iommu_table 
till the moment the iommu_group_alloc() is called (acceptable but not cool).


I will post a patch which would help as a response to this mail.


--
Alexey