linux-next: manual merge of the driver-core tree with the powerpc tree

2009-06-16 Thread Stephen Rothwell
Hi Greg,

Today's linux-next merge of the driver-core tree got a conflict in
drivers/block/ps3disk.c between commits
6dee2c87ebbe5d7ce8c4c163966a0bd9c02c75ef (block/ps3: remove driver_data
direct access of struct device) and
03fa68c245cccbcb99035cbabaa13b408ba91ab5 (ps3: shorten ps3_system_bus_
[gs]et_driver_data to ps3_system_bus_[gs]et_drvdata) from the powerpc
tree and commit db7afa200c4ef6823a2a40e4ea1dd747775be01a (block/ps3:
remove driver_data direct access of struct device) from the driver-core
tree.

I fixed it up (I used the version from the powerpc tree).  Greg, I think
the driver-core patch is no longer relevant.
-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au


pgp7DK8NPJvkN.pgp
Description: PGP signature
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: linux-next: manual merge of the driver-core tree with the powerpc tree

2009-06-16 Thread Greg KH
On Tue, Jun 16, 2009 at 03:53:17PM +1000, Stephen Rothwell wrote:
 Hi Greg,
 
 Today's linux-next merge of the driver-core tree got a conflict in
 drivers/block/ps3disk.c between commits
 6dee2c87ebbe5d7ce8c4c163966a0bd9c02c75ef (block/ps3: remove driver_data
 direct access of struct device) and
 03fa68c245cccbcb99035cbabaa13b408ba91ab5 (ps3: shorten ps3_system_bus_
 [gs]et_driver_data to ps3_system_bus_[gs]et_drvdata) from the powerpc
 tree and commit db7afa200c4ef6823a2a40e4ea1dd747775be01a (block/ps3:
 remove driver_data direct access of struct device) from the driver-core
 tree.
 
 I fixed it up (I used the version from the powerpc tree).  Greg, I think
 the driver-core patch is no longer relevant.

I pushed out an update a number of hours ago (like 6+), so you should
have gotten it with this update.

When did you pull from my tree?  I also just sent a merge request to
Linus, so everything should be fixed up now.

thanks,

greg k-h
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: linux-next: manual merge of the driver-core tree with the powerpc tree

2009-06-16 Thread Stephen Rothwell
Hi Greg,

On Mon, 15 Jun 2009 23:18:49 -0700 Greg KH g...@kroah.com wrote:

 I pushed out an update a number of hours ago (like 6+), so you should
 have gotten it with this update.
 
 When did you pull from my tree?  I also just sent a merge request to
 Linus, so everything should be fixed up now.

OK, thanks.  I fetch the trees usually between 8:30am and 10:00am UTC
+1000 (or +1100 in summer), so about 7 hours ago :-)

-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/


pgpdq8U6asvT8.pgp
Description: PGP signature
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: linux-next: Linus/powerpc tree build warning

2009-06-16 Thread Benjamin Herrenschmidt
On Tue, 2009-06-16 at 16:24 +1000, Stephen Rothwell wrote:
 Hi Ben,
 
 Today's linux-next build (powerpc ppc44x_defconfig) produced this warning:
 
 drivers/video/xilinxfb.c: In function 'xilinxfb_assign':
 drivers/video/xilinxfb.c:328: warning: cast to pointer from integer of 
 different size
 
 Introduced by commit dac4ccfb64bcdd5b4c248ccc22903d67486573cd (fbdev:
 Add PLB support and cleanup DCR in xilinxfb driver).

Known problem, a fix is currently being polished by Grant and John.

Cheers,
Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 2.6.31] ehca: Tolerate dynamic memory operations and huge pages

2009-06-16 Thread Alexander Schmidt
Hi Roland,

thank you for taking a look at the code!

On Fri, 12 Jun 2009 21:50:58 -0700
Roland Dreier rdre...@cisco.com wrote:

 OK, one major issue with this patch and a few minor nits.
 
 First, the major issue is that I don't see anything in the patch that
 changes the code in ehca_mem_notifier() in ehca_main.c:
 
   case MEM_GOING_ONLINE:
   case MEM_GOING_OFFLINE:
   /* only ok if no hca is attached to the lpar */
   spin_lock_irqsave(shca_list_lock, flags);
   if (list_empty(shca_list)) {
   spin_unlock_irqrestore(shca_list_lock, flags);
   return NOTIFY_OK;
   } else {
   spin_unlock_irqrestore(shca_list_lock, flags);
   if (printk_timed_ratelimit(ehca_dmem_warn_time,
  30 * 1000))
   ehca_gen_err(DMEM operations are not allowed
in conjunction with eHCA);
   return NOTIFY_BAD;
   }
 
 But your patch description says:
 
   This patch implements toleration of dynamic memory operations
 
 But it seems you're still going to hit the same NOTIFY_BAD case above
 after your patch.  So something doesn't compute for me.  Could you
 explain more?

Yeah, the notifier code remains untouched as we still do not allow dynamic
memory operations _while_ our module is loaded. The patch allows the driver to
cope with DMEM operations that happened before the module was loaded, which
might result in a non-contiguous memory layout. When the driver registers
its global memory region in the system, the memory layout must be considered.

We chose the term toleration instead of support to illustrate this.

I'll put some more details into the changelog, incorporate the other comments
and send out a second version of the patch.

Thanks,
Alex

 
 Second, a nit:
 
   +#define EHCA_REG_MR 0
   +#define EHCA_REG_BUSMAP_MR (~0)
 
 and you pass these as the reg_busmap parm in:
 
int ehca_reg_mr(struct ehca_shca *shca,
  struct ehca_mr *e_mr,
  u64 *iova_start,
   @@ -991,7 +1031,8 @@
  struct ehca_pd *e_pd,
  struct ehca_mr_pginfo *pginfo,
  u32 *lkey, /*OUT*/
   -  u32 *rkey) /*OUT*/
   +  u32 *rkey, /*OUT*/
   +  int reg_busmap)
 
 and test it as:
 
   +  if (reg_busmap)
   +  ret = ehca_reg_bmap_mr_rpages(shca, e_mr, pginfo);
   +  else
   +  ret = ehca_reg_mr_rpages(shca, e_mr, pginfo);
 
 So the ~0 for true looks a bit odd.  One option would be to make
 reg_busmap a bool, since that's how you're using it, but then you lose
 the nice self-documenting macro where you call things.
 
 So I think it would be cleaner to do something like
 
 enum ehca_reg_type {
   EHCA_REG_MR,
   EHCA_REG_BUSMAP_MR
 };
 
 and make the int reg_busmap parameter into enum ehca_reg_type reg_type
 and have the code become
 
 + if (reg_type == EHCA_REG_BUSMAP_MR)
 + ret = ehca_reg_bmap_mr_rpages(shca, e_mr, pginfo);
 + else if (reg_type == EHCA_REG_MR)
 + ret = ehca_reg_mr_rpages(shca, e_mr, pginfo);
 + else
 + ret = -EINVAL
 
 or something like that.
 
   +struct ib_dma_mapping_ops ehca_dma_mapping_ops = {
   +  .mapping_error = ehca_dma_mapping_error,
   +  .map_single = ehca_dma_map_single,
   +  .unmap_single = ehca_dma_unmap_single,
   +  .map_page = ehca_dma_map_page,
   +  .unmap_page = ehca_dma_unmap_page,
   +  .map_sg = ehca_dma_map_sg,
   +  .unmap_sg = ehca_dma_unmap_sg,
   +  .dma_address = ehca_dma_address,
   +  .dma_len = ehca_dma_len,
   +  .sync_single_for_cpu = ehca_dma_sync_single_for_cpu,
   +  .sync_single_for_device = ehca_dma_sync_single_for_device,
   +  .alloc_coherent = ehca_dma_alloc_coherent,
   +  .free_coherent = ehca_dma_free_coherent,
   +};
 
 I always think structures like this are easier to read if you align the
 '=' signs.  But no big deal.
 
   +  ret = ehca_create_busmap();
   +  if (ret) {
   +  ehca_gen_err(Cannot create busmap.);
   +  goto module_init2;
   +  }
   +
  ret = ibmebus_register_driver(ehca_driver);
  if (ret) {
  ehca_gen_err(Cannot register eHCA device driver);
  ret = -EINVAL;
   -  goto module_init2;
   +  goto module_init3;
  }

  ret = register_memory_notifier(ehca_mem_nb);
  if (ret) {
  ehca_gen_err(Failed registering memory add/remove notifier);
   -  goto module_init3;
   +  goto module_init4;
 
 Having to renumber unrelated things is when something changes is why I
 don't like this style of error path labels.  But I think it's well and
 truly too late to fix that in ehca.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 2.6.31 try 2] ehca: Tolerate dynamic memory operations and huge pages

2009-06-16 Thread Alexander Schmidt
From: Hannes Hering heri...@de.ibm.com

This patch implements toleration of dynamic memory operations and 16 GB
gigantic pages. Toleration means that the driver can cope with dynamic
memory operations that happened before the driver was loaded. While using the
ehca driver, dynamic memory operations are still prohibited. On module load the
driver walks through available system memory, checks for available memory ranges
and then registers the kernel internal memory region accordingly. The
translation of address ranges is implemented via a 3-level busmap.

Signed-off-by: Hannes Hering heri...@de.ibm.com

---
This patch is built and tested against infiniband.git. Please apply for 2.6.31.

 drivers/infiniband/hw/ehca/ehca_main.c |   20 +
 drivers/infiniband/hw/ehca/ehca_mrmw.c |  508 -
 drivers/infiniband/hw/ehca/ehca_mrmw.h |   13 
 3 files changed, 523 insertions(+), 18 deletions(-)

--- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_main.c
+++ infiniband.git/drivers/infiniband/hw/ehca/ehca_main.c
@@ -52,7 +52,7 @@
 #include ehca_tools.h
 #include hcp_if.h
 
-#define HCAD_VERSION 0026
+#define HCAD_VERSION 0027
 
 MODULE_LICENSE(Dual BSD/GPL);
 MODULE_AUTHOR(Christoph Raisch rai...@de.ibm.com);
@@ -506,6 +506,7 @@ static int ehca_init_device(struct ehca_
shca-ib_device.detach_mcast= ehca_detach_mcast;
shca-ib_device.process_mad = ehca_process_mad;
shca-ib_device.mmap= ehca_mmap;
+   shca-ib_device.dma_ops = ehca_dma_mapping_ops;
 
if (EHCA_BMASK_GET(HCA_CAP_SRQ, shca-hca_cap)) {
shca-ib_device.uverbs_cmd_mask |=
@@ -1028,17 +1029,23 @@ static int __init ehca_module_init(void)
goto module_init1;
}
 
+   ret = ehca_create_busmap();
+   if (ret) {
+   ehca_gen_err(Cannot create busmap.);
+   goto module_init2;
+   }
+
ret = ibmebus_register_driver(ehca_driver);
if (ret) {
ehca_gen_err(Cannot register eHCA device driver);
ret = -EINVAL;
-   goto module_init2;
+   goto module_init3;
}
 
ret = register_memory_notifier(ehca_mem_nb);
if (ret) {
ehca_gen_err(Failed registering memory add/remove notifier);
-   goto module_init3;
+   goto module_init4;
}
 
if (ehca_poll_all_eqs != 1) {
@@ -1053,9 +1060,12 @@ static int __init ehca_module_init(void)
 
return 0;
 
-module_init3:
+module_init4:
ibmebus_unregister_driver(ehca_driver);
 
+module_init3:
+   ehca_destroy_busmap();
+
 module_init2:
ehca_destroy_slab_caches();
 
@@ -1073,6 +1083,8 @@ static void __exit ehca_module_exit(void
 
unregister_memory_notifier(ehca_mem_nb);
 
+   ehca_destroy_busmap();
+
ehca_destroy_slab_caches();
 
ehca_destroy_comp_pool();
--- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_mrmw.c
+++ infiniband.git/drivers/infiniband/hw/ehca/ehca_mrmw.c
@@ -53,6 +53,38 @@
 /* max number of rpages (per hcall register_rpages) */
 #define MAX_RPAGES 512
 
+/* DMEM toleration management */
+#define EHCA_SECTSHIFTSECTION_SIZE_BITS
+#define EHCA_SECTSIZE  (1UL  EHCA_SECTSHIFT)
+#define EHCA_HUGEPAGESHIFT 34
+#define EHCA_HUGEPAGE_SIZE (1UL  EHCA_HUGEPAGESHIFT)
+#define EHCA_HUGEPAGE_PFN_MASK ((EHCA_HUGEPAGE_SIZE - 1)  PAGE_SHIFT)
+#define EHCA_INVAL_ADDR0xULL
+#define EHCA_DIR_INDEX_SHIFT 13   /* 8k Entries in 64k block */
+#define EHCA_TOP_INDEX_SHIFT (EHCA_DIR_INDEX_SHIFT * 2)
+#define EHCA_MAP_ENTRIES (1  EHCA_DIR_INDEX_SHIFT)
+#define EHCA_TOP_MAP_SIZE (0x1)   /* currently fixed map size 
*/
+#define EHCA_DIR_MAP_SIZE (0x1)
+#define EHCA_ENT_MAP_SIZE (0x1)
+#define EHCA_INDEX_MASK (EHCA_MAP_ENTRIES - 1)
+
+static unsigned long ehca_mr_len;
+
+/*
+ * Memory map data structures
+ */
+struct ehca_dir_bmap {
+   u64 ent[EHCA_MAP_ENTRIES];
+};
+struct ehca_top_bmap {
+   struct ehca_dir_bmap *dir[EHCA_MAP_ENTRIES];
+};
+struct ehca_bmap {
+   struct ehca_top_bmap *top[EHCA_MAP_ENTRIES];
+};
+
+static struct ehca_bmap *ehca_bmap;
+
 static struct kmem_cache *mr_cache;
 static struct kmem_cache *mw_cache;
 
@@ -68,6 +100,8 @@ enum ehca_mr_pgsize {
 #define EHCA_MR_PGSHIFT1M  20
 #define EHCA_MR_PGSHIFT16M 24
 
+static u64 ehca_map_vaddr(void *caddr);
+
 static u32 ehca_encode_hwpage_size(u32 pgsize)
 {
int log = ilog2(pgsize);
@@ -135,7 +169,8 @@ struct ib_mr *ehca_get_dma_mr(struct ib_
goto get_dma_mr_exit0;
}
 
-   ret = ehca_reg_maxmr(shca, e_maxmr, (u64 *)KERNELBASE,
+   ret = ehca_reg_maxmr(shca, e_maxmr,
+(void *)ehca_map_vaddr((void *)KERNELBASE),
 mr_access_flags, e_pd,
 

Re: [PATCH 2/2] uio: add an of_genirq driver

2009-06-16 Thread Wolfram Sang

  diff --git a/Documentation/powerpc/dts-bindings/uio-generic.txt 
  b/Documentation/powerpc/dts-bindings/uio-generic.txt
  new file mode 100644
  index 000..8ad9861
  --- /dev/null
  +++ b/Documentation/powerpc/dts-bindings/uio-generic.txt
  @@ -0,0 +1,16 @@
  +UIO for custom devices
  +
  +A device which will be mapped using the UIO subsystem.
  +
  +Properties:
  + - compatible : should contain the specific model used, followed by
  +                generic-uio.
  + - reg : address range(s) of the device (up to MAX_UIO_MAPS)
  + - interrupts : interrupt of the device
  +
  +Example:
  +        c64f...@0 {
  +                compatible = ptx,c64fpga001, generic-uio;
  +                reg = 0x0 0x1;
  +                interrupts = 0 0 3;
  +        };
 
 Hmmm, I'm not happy about this.  The device tree describes the
 hardware, not the way Linux uses the hardware.  UIO definitely falls
 into the category of Linux implementation detail.

Yes, I am aware of that. I just started with the mechanisms which are available
today and hoped we could find some compatible-value which will suit all needs.

 This should be approached from the other way around.  Either the
 generic-uio of_platform driver should contain an explicit list of
 devices to be handled by UIO,

Well, that could lead to a quite huge match_table over time.

 or the OF infrastructure should be modified to allow things like force
 binding of_devices to of_drivers at runtime.

That is an interesting idea. I could imagine something like a 'new_compatible
entry in the sysfs-section of the driver similar to 'new_id' for PCI. After
writing a new compatible-string into it, matching will triggered again with the
new entry added. That could (should?) also be placed at the of-core-level. Or
did you have something else in mind?

Regards,

   Wolfram

-- 
Pengutronix e.K.   | Wolfram Sang|
Industrial Linux Solutions | http://www.pengutronix.de/  |


signature.asc
Description: Digital signature
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH V2 2/2] mtd/maps/mtd-ram: add an of-platform driver

2009-06-16 Thread Wolfram Sang
On Sat, Jun 06, 2009 at 09:14:08AM +0100, David Woodhouse wrote:
 On Fri, 2009-06-05 at 14:05 +0200, Wolfram Sang wrote:
  Create an of-aware driver using the now exported generic functions from
  plat-ram.c. Also add the documentation for the binding. Partitions are
  not yet supported. Tested on a phyCORE-MPC5200B-IO.
 
 Do we have an ack for the device-tree bindings? 

Well, Grant acked so far and he surely has a voice in the device-tree-world :)
Or do you want a specific ack for that?

Regards,

   Wolfram

-- 
Pengutronix e.K.   | Wolfram Sang|
Industrial Linux Solutions | http://www.pengutronix.de/  |


signature.asc
Description: Digital signature
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH V2 2/2] mtd/maps/mtd-ram: add an of-platform driver

2009-06-16 Thread Wolfram Sang
On Mon, Jun 15, 2009 at 07:43:20PM +0200, Albrecht Dreß wrote:
 Am 05.06.09 14:05 schrieb(en) Wolfram Sang:
 Create an of-aware driver using the now exported generic functions  
 from plat-ram.c. Also add the documentation for the binding.  
 Partitions are not yet supported. Tested on a phyCORE-MPC5200B-IO.

 Dumb question: what is the current status of this patch?  I ask because  

I don't know.

David wondered if 'of_physmap' could be used, so I wrote what I experienced
during development. I can't tell if this helped the case.

Grant wondered if we need a bankwidth. IMHO it is needed for now, but I don't
know if this is a common agreement.

Regards,

   Wolfram

-- 
Pengutronix e.K.   | Wolfram Sang|
Industrial Linux Solutions | http://www.pengutronix.de/  |


signature.asc
Description: Digital signature
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: Chipselect in SPI binding with mpc5200-psc-spi

2009-06-16 Thread Kári Davíðsson

Now my driver gets probed.

This modalias error was due to wrong compatible attribute of
my of node.

But I still need to have the call to of_register_spi_devices().

The attached patch shows what I mean.

rg
kd

Grant Likely wrote:

On Mon, Jun 15, 2009 at 10:36 AM, Kári
Davíðssonkari.davids...@marel.com wrote:

Is this true?

Grant Likely wrote:

Yes, this is right.  The psc_spi driver automatically registers all
spi children that it finds in the device tree onto the SPI bus.
Therefore registering an spi_driver() is the right thing to do.

I am writing an SPI protocol driver and I find that my driver
is never probed.

I tried to add and call to of_register_spi_devices() in the
drivers/spi/mpc52xx_psc_spi.c::mpc52xx_psc_spi_of_probe() function,
without much effect besided that the DTS node is parsed but the driver probe
is not called, actually it complains about a modalias for my node is
missing.


What do you see when you look in /sys/bus/spi/devices?  You should see
a directory for your device.  What do you see in /sys/bus/spi/drivers?
 In here you should see your driver.  If they are both there, then you
just have a problem matching your driver name to the device name.

g.

Index: drivers/spi/mpc52xx_psc_spi.c
===
--- drivers/spi/mpc52xx_psc_spi.c   (revision 548)
+++ drivers/spi/mpc52xx_psc_spi.c   (working copy)
@@ -22,6 +22,7 @@
 #include linux/delay.h
 #include linux/spi/spi.h
 #include linux/fsl_devices.h
+#include linux/of_spi.h
 
 #include asm/mpc52xx.h
 #include asm/mpc52xx_psc.h
@@ -371,7 +372,7 @@
 
 /* bus_num is used only for the case dev-platform_data == NULL */
 static int __init mpc52xx_psc_spi_do_probe(struct device *dev, u32 regaddr,
-   u32 size, unsigned int irq, s16 bus_num)
+   u32 size, unsigned int irq, s16 bus_num, struct 
spi_master ** pmaster)
 {
struct fsl_spi_platform_data *pdata = dev-platform_data;
struct mpc52xx_psc_spi *mps;
@@ -439,6 +440,10 @@
if (ret  0)
goto unreg_master;
 
+dev_info(dev, Activated\n);
+
+*pmaster = master;
+
return ret;
 
 unreg_master:
@@ -474,6 +479,8 @@
const u32 *regaddr_p;
u64 regaddr64, size64;
s16 id = -1;
+int res;
+struct spi_master * master = NULL;;
 
regaddr_p = of_get_address(op-node, 0, size64, NULL);
if (!regaddr_p) {
@@ -495,8 +502,16 @@
id = *psc_nump + 1;
}
 
-   return mpc52xx_psc_spi_do_probe(op-dev, (u32)regaddr64, (u32)size64,
-   irq_of_parse_and_map(op-node, 0), id);
+   res = mpc52xx_psc_spi_do_probe(op-dev, (u32)regaddr64, (u32)size64,
+   irq_of_parse_and_map(op-node, 0), id, 
master);
+
+if( master != NULL )
+{
+/* Add any subnodes on the SPI bus */
+of_register_spi_devices( master, op-node);
+}
+
+return res;
 }
 
 static int __exit mpc52xx_psc_spi_of_remove(struct of_device *op)
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 2/2] uio: add an of_genirq driver

2009-06-16 Thread Grant Likely
On Tue, Jun 16, 2009 at 3:04 AM, Wolfram Sangw.s...@pengutronix.de wrote:

  diff --git a/Documentation/powerpc/dts-bindings/uio-generic.txt 
  b/Documentation/powerpc/dts-bindings/uio-generic.txt
  new file mode 100644
  index 000..8ad9861
  --- /dev/null
  +++ b/Documentation/powerpc/dts-bindings/uio-generic.txt
  @@ -0,0 +1,16 @@
  +UIO for custom devices
  +
  +A device which will be mapped using the UIO subsystem.
  +
  +Properties:
  + - compatible : should contain the specific model used, followed by
  +                generic-uio.
  + - reg : address range(s) of the device (up to MAX_UIO_MAPS)
  + - interrupts : interrupt of the device
  +
  +Example:
  +        c64f...@0 {
  +                compatible = ptx,c64fpga001, generic-uio;
  +                reg = 0x0 0x1;
  +                interrupts = 0 0 3;
  +        };

 Hmmm, I'm not happy about this.  The device tree describes the
 hardware, not the way Linux uses the hardware.  UIO definitely falls
 into the category of Linux implementation detail.

 Yes, I am aware of that. I just started with the mechanisms which are 
 available
 today and hoped we could find some compatible-value which will suit all needs.

Trouble is a value that suits all needs today probably won't a year
from now.  :-)

 This should be approached from the other way around.  Either the
 generic-uio of_platform driver should contain an explicit list of
 devices to be handled by UIO,

 Well, that could lead to a quite huge match_table over time.

 or the OF infrastructure should be modified to allow things like force
 binding of_devices to of_drivers at runtime.

 That is an interesting idea. I could imagine something like a 'new_compatible
 entry in the sysfs-section of the driver similar to 'new_id' for PCI. After
 writing a new compatible-string into it, matching will triggered again with 
 the
 new entry added. That could (should?) also be placed at the of-core-level. Or
 did you have something else in mind?

Yeah, that sounds appropriate.

g.

-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH V2 2/2] mtd/maps/mtd-ram: add an of-platform driver

2009-06-16 Thread Grant Likely
On Tue, Jun 16, 2009 at 3:18 AM, Wolfram Sangw.s...@pengutronix.de wrote:
 On Mon, Jun 15, 2009 at 07:43:20PM +0200, Albrecht Dreß wrote:
 Am 05.06.09 14:05 schrieb(en) Wolfram Sang:
 Create an of-aware driver using the now exported generic functions
 from plat-ram.c. Also add the documentation for the binding.
 Partitions are not yet supported. Tested on a phyCORE-MPC5200B-IO.

 Dumb question: what is the current status of this patch?  I ask because

 I don't know.

 David wondered if 'of_physmap' could be used, so I wrote what I experienced
 during development. I can't tell if this helped the case.

 Grant wondered if we need a bankwidth. IMHO it is needed for now, but I don't
 know if this is a common agreement.

I'm not happy about the use case though.  It probably shouldn't appear
in this binding, or if it does it should be tagged as an optional
property.  It is only in the 5200 localplus case that bank-width is
needed to figure out how to apply the workaround.

g.

-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH V2 2/2] mtd/maps/mtd-ram: add an of-platform driver

2009-06-16 Thread Wolfram Sang
  Grant wondered if we need a bankwidth. IMHO it is needed for now, but I 
  don't
  know if this is a common agreement.
 
 I'm not happy about the use case though.  It probably shouldn't appear
 in this binding, or if it does it should be tagged as an optional
 property.  It is only in the 5200 localplus case that bank-width is
 needed to figure out how to apply the workaround.

Maybe there is a misunderstanding here. I am not talking about Albrecht's case.
What I replied to your concern is that bankwidth is used(!) in the underlying
map-ram-driver in mapram_erase() at the moment. Whether this is really needed
could be discussed perhaps, but is beyond the scope of this patch series IMHO.
I'd think this can be addressed in a later series, if needed, although this
could mean that the binding will change (bank-width becoming optional).

Regards,

   Wolfram

-- 
Pengutronix e.K.   | Wolfram Sang|
Industrial Linux Solutions | http://www.pengutronix.de/  |


signature.asc
Description: Digital signature
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v2] powerpc: 85xx: Add PHY fixup to socrates board code

2009-06-16 Thread Kumar Gala


On Apr 21, 2009, at 6:24 PM, Anatolij Gustschin wrote:


Kumar Gala wrote:


On Apr 21, 2009, at 4:54 PM, Kumar Gala wrote:



On Apr 21, 2009, at 12:19 PM, Anatolij Gustschin wrote:


If the firmware missed to initialize the PHY correctly,
Linux may hang up on socrates while eth0/eth1 interface
startup (caused by continuous unacknowledged PHY interrupt).

This patch adds PHY fixup to socrates platform code to
ensure the PHY is pre-initialized correctly. It is needed
to be compatible with older firmware.

Signed-off-by: Anatolij Gustschin ag...@denx.de
---
Changes since first version:
   use macros instead of register numbers as
   suggested by Anton

Kumar, could you please consider this patch for
inclusion into 2.6.30? Thanks!


Sorry.  I dont think this is board specific and should at a  
minimum be

done in m88e1011_config_init in drivers/net/phy/marvell.c.  Not sure
how 88E1011 differs from 88E, but I'm wondering if you really  
want

to set config_init for m88e1011 to m88e_config_init

- k


I got confused by the #'s.. I think we should have a struct in  
marvell.c

for m88e1121 which I'm guessing is the PHY you are using.


yes, m88e1121 is correct. In 2.6.30-rc2 there is already a m88e1121
struct in marvell_drivers[] in drivers/net/phy/marvell.c.
The reason I'm not doing the m88e1121 pre-init stuff in config_init
is as follows:

m88e1121 is a multi-PHY device with two PHY ports and each port
could signal an interrupt. This PHY device can be pin-strapped to use
shared interrupt pin for both PHY ports (as used on socrates board).
PHY specific config_init will be called e.g. while eth0 startup in
phy_attach() which is called from phy_connect() in drivers/net/phy/ 
phy_device.c.

phy_connect() then calls phy_start_interrupts() which registers the
interrupt handler and enables the interrupts for the PHY-port  
config_init

is called for. Now interrupts can be cleared/acknowledged for this
PHY-port (eth0 interface), but interrupts from the another PHY-port
can not be acknowledged as eth1 was not brought up yet.

Placing this fixup in drivers/net/phy/marvell.c as in config_init  
callback
could be done, but this will add more overhead as the fixup routine  
have

to do more work:

acquire 'struct mii_bus' pointer and walk through all registered PHYs
searching for the PHY which use the same interrupt, then getting
the address of this PHY on the bus and disable and clear PHY irqs
by writing/reading to/from this PHY, (but only in the case it was
not already brought up and has interrupts enabled!) e.g.:

struct mii_bus *bus = phydev-bus;
int addr;

for (addr = 0; addr  PHY_MAX_ADDR; addr++) {
struct phy_device *phy = bus-phy_map[addr];

if (addr != phydev-addr  bus-irq[addr] == phydev-irq 
(phy-phy_id  0x0ff0) == 0x01410cb0 
!(phy-interrupts  PHY_INTERRUPT_ENABLED)) {

int imask = phy_read(phy, MII_M1011_IMASK);

if (imask) {
phy_write(phy, 0x12, 0); /* disable */
phy_read(phy, 0x13); /* clear */
}
}
}

All this to allow support for multiple m88e1121 devices.
Otherwise, after registering first phy interrupt handler
and enabling interrupt pending irq on other PHY port or
other PHY device will lock up the board.

The fixup in this patch will only be done while mdio bus scan
before registering a PHY device.


Did we ever resolve this?

- k
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


PowerPC PCI DMA issues (prefetch/coherency?)

2009-06-16 Thread Chris Pringle

Hello All,

We're developing on a Freescale MPC8272 and are having some nasty 
problems with PCI bus mastering and data corruption.


We have some custom hardware that is bus mastering, reading data from 
the CPUs memory for it's own use. Most of the time, the data is correct, 
however occasionally we are seeing data that appears to be from 
somewhere else in memory (usually memory that has already been read by 
the PCI device). The problem looks like stale data on the PCI bridge 
prefetch buffers or a cache coherency problem, but we've been unable to 
come up with a solution to our problem. It is my understanding that it 
shouldn't be a cache coherency problem as the CPU cache should be 
snooped as the data is read from memory. Even if it were an issue, the 
pci_map_sg* functions should have sorted out any cache coherency issues 
before the DMA operation started.


I've not been able to find anything on the Freescale data sheet that 
provides any way of flushing the prefetch cache on the PCI bridge. We've 
done a bit of experimenting, and found that turning off prefetch appears 
to solve (or possibly mask?) the problem (at the expensive of massive 
performance problems). I've also tried DMA'ing two adjacent userspace 
buffers in memory (from the same page), and see corruption on the second 
buffer. If I populate both buffers, then DMA them both, the data is 
fine. If I populate the first, DMA the first, then populate the second 
and DMA the second, corruption occurs at the start of the second buffer. 
If I add 8-32 bytes of padding between the buffers, the problem goes away.


The PCI spec says that the PCI bridge is supposed to flush any data from 
it's prefetch buffers that are not read by the bus master, so 
technically, this isn't supposed to happen.


I've tried making sure that buffers are cache line (and page) aligned, 
and are multiples of cache lines, but it's made no difference. PIO mode 
works fine, and I've checked the data with the CPU just before, and 
immediately after the DMA and the driver sees no data integrity issues. 
There are memory write barriers just before the DMA start, so all the 
registers should be correct before the DMA starts.


For background info, the device doing the bus mastering is a Xilinx 
Virtex 5 FPGA. We've monitored the data as it comes off the PCI bus 
using ChipScope - so the firmware should not be manipulating the data in 
any way.


We have some hardware/firmware/drivers that has a lot of common code 
that runs on an x86 platform (as opposed to powerpc), and that works 
without any issues whatsoever.


Has anyone got any ideas what this might be? Does anyone of know issues 
with PCI bridges on the PowerPC platform? Is there extra things that 
need to be done from the driver when DMAing on PowerPC (I've looked at 
other drivers and there's nothing obvious). The chip errata doesn't have 
anything on it that looks like it could cause this.


I'm really hoping this is something that we're doing wrong in the driver 
or the firmware, but we've been through both the firmware and drivers 
countless times and are unable to see anything wrong.


Any thoughts/ideas would be much appreciated!

Regards,
Chris

--

__
Chris Pringle
Software Engineer

Miranda Technologies Ltd.
Hithercroft Road
Wallingford
Oxfordshire OX10 9DG
UK

Tel. +44 1491 820206
Fax. +44 1491 820001
www.miranda.com



Miranda Technologies Limited
Registered in England and Wales CN 02017053
Registered Office: James House, Mere Park, Dedmere Road, Marlow, Bucks, SL7 1FJ
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH] RFC: powerpc: expose the multi-bit ops that underlie single-bit ops.

2009-06-16 Thread Geoff Thorpe
Thanks for taking the time to look at this Ben, comments inline.

Benjamin Herrenschmidt wrote:
 On Tue, 2009-05-26 at 14:19 -0400, Geoff Thorpe wrote:
 NOT FOR COMMIT, THIS IS A REQUEST FOR FEEDBACK.

 The bitops.h functions that operate on a single bit in a bitfield are
 implemented by operating on the corresponding word location. In all cases
 the inner logic appears to be valid if the mask being applied has more
 than one bit set, so this patch exposes those inner operations. Indeed,
 set_bits() was already available, but it duplicated code from set_bit()
 (rather than making the latter a wrapper) - it was also missing the
 PPC405_ERR77() workaround and the volatile address qualifier present in
 other APIs. This corrects that, and exposes the other multi-bit
 equivalents.

 One advantage of these multi-bit forms is that they allow word-sized
 variables to essentially be their own spinlocks.
 
 Hi ! Sorry for the delay, that was on my have a look one of these days,
 low priority list for a bit too long :-)

NP, optimal throughput often requires a compromise in latency :-)

 
 NB, the same factoring is possible in asm-generic/bitops/[non-]atomic.h.
 I would be happy to provide corresponding patches if this approach is
 deemed appropriate. Feedback would be most welcome.

 Signed-off-by: Geoff Thorpe geoff.tho...@freescale.com
 ---
  arch/powerpc/include/asm/bitops.h |  111 
 +++--
  1 files changed, 69 insertions(+), 42 deletions(-)

 diff --git a/arch/powerpc/include/asm/bitops.h 
 b/arch/powerpc/include/asm/bitops.h
 index 897eade..72de28c 100644
 --- a/arch/powerpc/include/asm/bitops.h
 +++ b/arch/powerpc/include/asm/bitops.h
 @@ -56,11 +56,10 @@
  #define BITOP_WORD(nr)  ((nr) / BITS_PER_LONG)
  #define BITOP_LE_SWIZZLE((BITS_PER_LONG-1)  ~0x7)
  
 -static __inline__ void set_bit(int nr, volatile unsigned long *addr)
 +static __inline__ void set_bits(unsigned long mask, volatile unsigned long 
 *_p)
  {
  unsigned long old;
 -unsigned long mask = BITOP_MASK(nr);
 -unsigned long *p = ((unsigned long *)addr) + BITOP_WORD(nr);
 +unsigned long *p = (unsigned long *)_p;
  
  __asm__ __volatile__(
  1:PPC_LLARX %0,0,%3  # set_bit\n
 @@ -73,11 +72,16 @@ static __inline__ void set_bit(int nr, volatile unsigned 
 long *addr)
  : cc );
  }
  
 -static __inline__ void clear_bit(int nr, volatile unsigned long *addr)
 +static __inline__ void set_bit(int nr, volatile unsigned long *addr)
 +{
 +set_bits(BITOP_MASK(nr), addr + BITOP_WORD(nr));
 +}
 
 No objection with the above.
 
 +static __inline__ void clear_bits(unsigned long mask,
 +volatile unsigned long *_p)
  {
  unsigned long old;
 -unsigned long mask = BITOP_MASK(nr);
 -unsigned long *p = ((unsigned long *)addr) + BITOP_WORD(nr);
 +unsigned long *p = (unsigned long *)_p;
  
  __asm__ __volatile__(
  1:PPC_LLARX %0,0,%3  # clear_bit\n
 @@ -90,11 +94,16 @@ static __inline__ void clear_bit(int nr, volatile 
 unsigned long *addr)
  : cc );
  }
  
 -static __inline__ void clear_bit_unlock(int nr, volatile unsigned long 
 *addr)
 +static __inline__ void clear_bit(int nr, volatile unsigned long *addr)
 +{
 +clear_bits(BITOP_MASK(nr), addr + BITOP_WORD(nr));
 +}
 
 Looks good too.
 
 +static __inline__ void clear_bits_unlock(unsigned long mask,
 +volatile unsigned long *_p)
  {
  unsigned long old;
 -unsigned long mask = BITOP_MASK(nr);
 -unsigned long *p = ((unsigned long *)addr) + BITOP_WORD(nr);
 +unsigned long *p = (unsigned long *)_p;
  
  __asm__ __volatile__(
  LWSYNC_ON_SMP
 @@ -108,11 +117,16 @@ static __inline__ void clear_bit_unlock(int nr, 
 volatile unsigned long *addr)
  : cc, memory);
  }
  
 -static __inline__ void change_bit(int nr, volatile unsigned long *addr)
 +static __inline__ void clear_bit_unlock(int nr, volatile unsigned long 
 *addr)
 +{
 +clear_bits_unlock(BITOP_MASK(nr), addr + BITOP_WORD(nr));
 +}
 
 I'm not sure it's useful to provide a multi-bit variant of the
 lock and unlock primitives. Do other archs do ?

For clear_bit_unlock(), no they don't appear to. There is a fallback in
include/asm-generic though, and I notice that it's used in a few places,
eg. drivers/rtc/rtc-dev. OTOH some other archs appear to provide their
own test_and_set_bit_lock(), and there's a fallback in
includes/asm-generic for that too.

Do you see a reason to isolate either of these and not factor out the
inner word-based logic?

 
 +static __inline__ void change_bits(unsigned long mask,
 +volatile unsigned long *_p)
  {
  unsigned long old;
 -unsigned long mask = BITOP_MASK(nr);
 -unsigned long *p = ((unsigned long *)addr) + BITOP_WORD(nr);
 +unsigned long *p = (unsigned long *)_p;
  
  __asm__ __volatile__(
  1:PPC_LLARX %0,0,%3  # change_bit\n
 @@ -125,12 +139,16 @@ static 

[PATCH] powerpc: Update Warp defconfig

2009-06-16 Thread Sean MacLennan
I forgot to include the defconfig in the last set of patches. So you
don't get to use the shiny new LEDS driver on the warp unless you turn
it on.

Enabling hotplug is also very important since we have moved to udev on
the warp.

Like most defconfig patches, most of the changes are just the normal
kernel changes, not warp specific.

If it is too early, and we want to hold off until say rc4 with
defconfigs I have no problem with that. I have tried to outline all the
changes we specifically made. If anybody sees any bad choices with
options, let me know!

Cheers,
   Sean

* Enable GPIO LEDS
* Enable LED triggers
* Move to slub
* Enable hotplug
* Enable timestamps on printks
* Enable UBIFS

Signed-off-by: Sean MacLennan smaclen...@pikatech.com
---
diff --git a/arch/powerpc/configs/44x/warp_defconfig 
b/arch/powerpc/configs/44x/warp_defconfig
index 3b77f09..787635f 100644
--- a/arch/powerpc/configs/44x/warp_defconfig
+++ b/arch/powerpc/configs/44x/warp_defconfig
@@ -1,7 +1,7 @@
 #
 # Automatically generated make config: don't edit
-# Linux kernel version: 2.6.29-rc2
-# Fri Jan 23 07:57:16 2009
+# Linux kernel version: 2.6.30
+# Tue Jun  9 23:35:36 2009
 #
 # CONFIG_PPC64 is not set
 
@@ -41,6 +41,7 @@ CONFIG_ARCH_HAS_ILOG2_U32=y
 CONFIG_GENERIC_HWEIGHT=y
 CONFIG_GENERIC_CALIBRATE_DELAY=y
 CONFIG_GENERIC_FIND_NEXT_BIT=y
+CONFIG_GENERIC_GPIO=y
 # CONFIG_ARCH_NO_VIRT_TO_BUS is not set
 CONFIG_PPC=y
 CONFIG_EARLY_PRINTK=y
@@ -53,10 +54,12 @@ CONFIG_PPC_UDBG_16550=y
 # CONFIG_GENERIC_TBSYNC is not set
 CONFIG_AUDIT_ARCH=y
 CONFIG_GENERIC_BUG=y
+CONFIG_DTC=y
 # CONFIG_DEFAULT_UIMAGE is not set
 CONFIG_PPC_DCR_NATIVE=y
 # CONFIG_PPC_DCR_MMIO is not set
 CONFIG_PPC_DCR=y
+CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
 CONFIG_DEFCONFIG_LIST=/lib/modules/$UNAME_RELEASE/.config
 
 #
@@ -74,7 +77,17 @@ CONFIG_SYSVIPC_SYSCTL=y
 # CONFIG_BSD_PROCESS_ACCT is not set
 # CONFIG_TASKSTATS is not set
 # CONFIG_AUDIT is not set
-# CONFIG_IKCONFIG is not set
+
+#
+# RCU Subsystem
+#
+CONFIG_CLASSIC_RCU=y
+# CONFIG_TREE_RCU is not set
+# CONFIG_PREEMPT_RCU is not set
+# CONFIG_TREE_RCU_TRACE is not set
+# CONFIG_PREEMPT_RCU_TRACE is not set
+CONFIG_IKCONFIG=y
+CONFIG_IKCONFIG_PROC=y
 CONFIG_LOG_BUF_SHIFT=14
 CONFIG_GROUP_SCHED=y
 CONFIG_FAIR_GROUP_SCHED=y
@@ -82,27 +95,29 @@ CONFIG_FAIR_GROUP_SCHED=y
 CONFIG_USER_SCHED=y
 # CONFIG_CGROUP_SCHED is not set
 # CONFIG_CGROUPS is not set
-CONFIG_SYSFS_DEPRECATED=y
-CONFIG_SYSFS_DEPRECATED_V2=y
+# CONFIG_SYSFS_DEPRECATED_V2 is not set
 # CONFIG_RELAY is not set
 # CONFIG_NAMESPACES is not set
 CONFIG_BLK_DEV_INITRD=y
 CONFIG_INITRAMFS_SOURCE=
+CONFIG_RD_GZIP=y
+# CONFIG_RD_BZIP2 is not set
+# CONFIG_RD_LZMA is not set
 # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
 CONFIG_SYSCTL=y
+CONFIG_ANON_INODES=y
 CONFIG_EMBEDDED=y
 CONFIG_SYSCTL_SYSCALL=y
 CONFIG_KALLSYMS=y
 # CONFIG_KALLSYMS_ALL is not set
 # CONFIG_KALLSYMS_EXTRA_PASS is not set
-# CONFIG_HOTPLUG is not set
+# CONFIG_STRIP_ASM_SYMS is not set
+CONFIG_HOTPLUG=y
 CONFIG_PRINTK=y
 CONFIG_BUG=y
 CONFIG_ELF_CORE=y
-CONFIG_COMPAT_BRK=y
 CONFIG_BASE_FULL=y
 CONFIG_FUTEX=y
-CONFIG_ANON_INODES=y
 CONFIG_EPOLL=y
 CONFIG_SIGNALFD=y
 CONFIG_TIMERFD=y
@@ -110,10 +125,13 @@ CONFIG_EVENTFD=y
 CONFIG_SHMEM=y
 CONFIG_AIO=y
 CONFIG_VM_EVENT_COUNTERS=y
-CONFIG_SLAB=y
-# CONFIG_SLUB is not set
+CONFIG_SLUB_DEBUG=y
+CONFIG_COMPAT_BRK=y
+# CONFIG_SLAB is not set
+CONFIG_SLUB=y
 # CONFIG_SLOB is not set
 # CONFIG_PROFILING is not set
+# CONFIG_MARKERS is not set
 CONFIG_HAVE_OPROFILE=y
 # CONFIG_KPROBES is not set
 CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
@@ -121,6 +139,7 @@ CONFIG_HAVE_IOREMAP_PROT=y
 CONFIG_HAVE_KPROBES=y
 CONFIG_HAVE_KRETPROBES=y
 CONFIG_HAVE_ARCH_TRACEHOOK=y
+# CONFIG_SLOW_WORK is not set
 # CONFIG_HAVE_GENERIC_DMA_COHERENT is not set
 CONFIG_SLABINFO=y
 CONFIG_RT_MUTEXES=y
@@ -133,7 +152,6 @@ CONFIG_MODULE_UNLOAD=y
 # CONFIG_MODULE_SRCVERSION_ALL is not set
 CONFIG_BLOCK=y
 # CONFIG_LBD is not set
-# CONFIG_BLK_DEV_IO_TRACE is not set
 # CONFIG_BLK_DEV_BSG is not set
 # CONFIG_BLK_DEV_INTEGRITY is not set
 
@@ -149,11 +167,6 @@ CONFIG_DEFAULT_AS=y
 # CONFIG_DEFAULT_CFQ is not set
 # CONFIG_DEFAULT_NOOP is not set
 CONFIG_DEFAULT_IOSCHED=anticipatory
-CONFIG_CLASSIC_RCU=y
-# CONFIG_TREE_RCU is not set
-# CONFIG_PREEMPT_RCU is not set
-# CONFIG_TREE_RCU_TRACE is not set
-# CONFIG_PREEMPT_RCU_TRACE is not set
 # CONFIG_FREEZER is not set
 
 #
@@ -173,10 +186,11 @@ CONFIG_WARP=y
 # CONFIG_ARCHES is not set
 # CONFIG_CANYONLANDS is not set
 # CONFIG_GLACIER is not set
+# CONFIG_REDWOOD is not set
 # CONFIG_YOSEMITE is not set
 # CONFIG_XILINX_VIRTEX440_GENERIC_BOARD is not set
 # CONFIG_PPC44x_SIMPLE is not set
-# CONFIG_PPC4xx_GPIO is not set
+CONFIG_PPC4xx_GPIO=y
 CONFIG_440EP=y
 CONFIG_IBM440EP_ERR42=y
 # CONFIG_IPIC is not set
@@ -235,9 +249,13 @@ CONFIG_ZONE_DMA_FLAG=1
 CONFIG_BOUNCE=y
 CONFIG_VIRT_TO_BUS=y
 CONFIG_UNEVICTABLE_LRU=y
+CONFIG_HAVE_MLOCK=y
+CONFIG_HAVE_MLOCKED_PAGE_BIT=y
+CONFIG_STDBINUTILS=y
 CONFIG_PPC_4K_PAGES=y
 # 

Re: PowerPC PCI DMA issues (prefetch/coherency?)

2009-06-16 Thread Roderick Colenbrander
Hi Chris,

The last couple of months I had the 'pleasure' to work with a Xilinx
ML510 ATX board which contains a Virtex-5 FXT. In my case I had to set
up the plbv46 pci soft-core to function properly with PCI add-on
boards and onboard pci devices. I have had a lot of issues including
ones with DMA but in the end it worked properly. I'm not sure how in
your case the Virtex-5 is wired to the PCI bus but in my case it was
connected to the crossbar switch of the PowerPC 440 which is inside
the Virtex-5 FXT. Not all Xilinx reference designs were interfaced
properly to this (sometimes the MPLB-SPLB connection from the
soft-core to the crossbar wasn't connected) and even then the PCI
soft-core wasn't configured properly by Xilinx and this MIGHT be
similar to your issue.

In the explanation of the issue I will use the terminology used by
Xilinx. The CPU side is the 'IPIF' side and the and the PCI side is
the 'PCI' side. Both domains have their own address domains and hence
translation from IPIFtoPCI and PCItoIPIF is needed. From what I
understood is that the 'incoming window' (ipif-pci) and outgoing
window (pciipif) should use different address ranges else the bridge
gets confused. I don't have access to my design right now but by head
Xilinx doesn't set C_IPIFBAR2PCIBAR_0 properly in a lot of cases and
just sets it to 0. Which means that cpu - ipif (lets say the soft
core is accessible from 0xa000 from the cpu) 0xa0001234 translates
to 0x1234 on the PCI side but PCI cards expect that when they
write low memory addresses that they write to system memory. What
should be done is to set C_IPIFBAR2PCIBAR_0 to 0xa000 in this
case, so that the incoming window is 0xa000-0x
and the outgoing window 0x-0x9fff or just set it half way
at 0x8000.

I'm not saying this is your issue but it could be it.

Regards,
Roderick Colenbrander


On Tue, Jun 16, 2009 at 4:08 PM, Chris Pringlechris.prin...@oxtel.com wrote:
 Hello All

 We're developing on a Freescale MPC8272 and are having some nasty problems
 with PCI bus mastering and data corruption.

 We have some custom hardware that is bus mastering, reading data from the
 CPUs memory for it's own use. Most of the time, the data is correct, however
 occasionally we are seeing data that appears to be from somewhere else in
 memory (usually memory that has already been read by the PCI device). The
 problem looks like stale data on the PCI bridge prefetch buffers or a cache
 coherency problem, but we've been unable to come up with a solution to our
 problem. It is my understanding that it shouldn't be a cache coherency
 problem as the CPU cache should be snooped as the data is read from memory.
 Even if it were an issue, the pci_map_sg* functions should have sorted out
 any cache coherency issues before the DMA operation started.

 I've not been able to find anything on the Freescale data sheet that
 provides any way of flushing the prefetch cache on the PCI bridge. We've
 done a bit of experimenting, and found that turning off prefetch appears to
 solve (or possibly mask?) the problem (at the expensive of massive
 performance problems). I've also tried DMA'ing two adjacent userspace
 buffers in memory (from the same page), and see corruption on the second
 buffer. If I populate both buffers, then DMA them both, the data is fine. If
 I populate the first, DMA the first, then populate the second and DMA the
 second, corruption occurs at the start of the second buffer. If I add 8-32
 bytes of padding between the buffers, the problem goes away.

 The PCI spec says that the PCI bridge is supposed to flush any data from
 it's prefetch buffers that are not read by the bus master, so technically,
 this isn't supposed to happen.

 I've tried making sure that buffers are cache line (and page) aligned, and
 are multiples of cache lines, but it's made no difference. PIO mode works
 fine, and I've checked the data with the CPU just before, and immediately
 after the DMA and the driver sees no data integrity issues. There are memory
 write barriers just before the DMA start, so all the registers should be
 correct before the DMA starts.

 For background info, the device doing the bus mastering is a Xilinx Virtex 5
 FPGA. We've monitored the data as it comes off the PCI bus using ChipScope -
 so the firmware should not be manipulating the data in any way.

 We have some hardware/firmware/drivers that has a lot of common code that
 runs on an x86 platform (as opposed to powerpc), and that works without any
 issues whatsoever.

 Has anyone got any ideas what this might be? Does anyone of know issues with
 PCI bridges on the PowerPC platform? Is there extra things that need to be
 done from the driver when DMAing on PowerPC (I've looked at other drivers
 and there's nothing obvious). The chip errata doesn't have anything on it
 that looks like it could cause this.

 I'm really hoping this is something that we're doing wrong in the driver or
 the firmware, but 

Re: [ewg] Re: [PATCH 2.6.31] ehca: Tolerate dynamic memory operations and huge pages

2009-06-16 Thread Roland Dreier

  Yeah, the notifier code remains untouched as we still do not allow dynamic
  memory operations _while_ our module is loaded. The patch allows the driver 
  to
  cope with DMEM operations that happened before the module was loaded, which
  might result in a non-contiguous memory layout. When the driver registers
  its global memory region in the system, the memory layout must be considered.
  
  We chose the term toleration instead of support to illustrate this.

I see.  So things just silently broke in some cases when the driver was
loaded after operations you didn't tolerate?

Anyway, thanks for the explanation.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


MPC83xx watchdog reset board dead lock

2009-06-16 Thread Leon Woestenberg
Hello,

this is a hardware, even board issue, but I hope to find the right
target audience here.


In our MPC83xx design I would like to prevent dead lock in case where
a field upgrade is performed, i.e. NOR Flash is erased or written, and
the MPC83xx built-in hardware watchdog triggers.

In u-boot the scenario can be easily reproduced by running this
command (WARNING, erases some sectors!) on an MPC8313E-RDB:

erase_wdg=mw.l 0xe204 0x1007 1;mw.w 0xe20e 0x556c 1;mw.w
0xe20e 0xaa39 1;erase 1:10-30

This sets up the watchdog to reset soonish, then starts erasing NOR
sectors. Watchdog triggers and resets - Dead lock.


Most MPC8xxx board designs I have seen suffer from this possible dead lock:
- NOR Flash is put in erase mode or write mode
- Hardware watchdog triggers
- HRESET# is asserted by the processor, during which the configuration
words are read from NOR Flash.

Either
HRESET# is not attached to NOR, NOR stays in erase/write mode and
invalid words will be read - dead lock

or either:
HRESET# is attached to NOR reset, NOR is reset, but stays in reset as
HRESET# stays asserted.



We have been looking at several solutions hardware wise that reset the
NOR flash on HRESET# going low, but the processors are stubborn,
read the config words only once, than dead lock.

I wonder if there are known-working designs for this.

Regards,
-- 
Leon
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: MPC83xx watchdog reset board dead lock

2009-06-16 Thread David Hawkins

Hi Leon,


Most MPC8xxx board designs I have seen suffer from this possible dead lock:
- NOR Flash is put in erase mode or write mode
- Hardware watchdog triggers
- HRESET# is asserted by the processor, during which the configuration
words are read from NOR Flash.

Either
HRESET# is not attached to NOR, NOR stays in erase/write mode and
invalid words will be read - dead lock

or either:
HRESET# is attached to NOR reset, NOR is reset, but stays in reset as
HRESET# stays asserted.

We have been looking at several solutions hardware wise that reset the
NOR flash on HRESET# going low, but the processors are stubborn,
read the config words only once, than dead lock.

I wonder if there are known-working designs for this.


What do you do in the case of blank flash on a board?

I'm not sure that this will work for you, but it might ...
on my boards, the PowerPC connects to an FPGA on the local
bus, and the flash is connected to that FPGA (it made
routing easier to go through the FPGA). When the board
powers up, a small FSM in the FPGA reads from the first
word in Flash, and checks it is not blank. If it is
blank, the configuration pins are asserted such that
the processor uses a hard-coded reset configuration word,
and the processor reset is deasserted. If the flash is
not blank, the processor is allowed to boot from Flash.

An alternative method for firmware updates would be
to use the boot from high-mem versus low-mem flag
in the RCWs. On a number of the Freescale EVMs, you
can have two images in the Flash; one at low mem,
and one at high mem. Toggling a dip switch selects
the value of the BMS bit in the RCWs, and lets you boot
from one image or the other. This is a good trick that
avoids having to have a debugger to recover back
to a sane image - note that a CPLD delivers the RCWs,
not the flash.

So in either of these examples, the RCWs source is
sometimes not the Flash image ... which requires
that feature designed into the original hardware.

Those two ideas should get you thinking :)

Cheers,
Dave

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v3] powerpc: add ioremap_early() for mapping IO regions before MMU_init()

2009-06-16 Thread Grant Likely
On Mon, Jun 15, 2009 at 12:57 AM, Benjamin
Herrenschmidtb...@kernel.crashing.org wrote:
 On Wed, 2009-05-27 at 12:55 -0600, Grant Likely wrote:
 From: Grant Likely grant.lik...@secretlab.ca

 ioremap_early() is useful for things like mapping SoC internally registers
 and early debug output because it allows mappings to devices to be setup
 early in the boot process where they are needed.  It also give a
 performance boost since BAT mapped registers don't get flushed out of
 the TLB.

 Without ioremap_early(), early mappings are set up in an ad-hoc manner
 and they get lost when the MMU is set up.  Drivers then have to perform
 hacky fixups to transition over to new mappings.

 Signed-off-by: Grant Likely grant.lik...@secretlab.ca
 ---

 My 40x config gives me:

 /home/benh/linux-powerpc-test/drivers/video/xilinxfb.c:409: warning:
 ‘dcr_host.base’ may be used uninitialized in this function

 (warning, I think, was already there, so the patch is going into -next
 but we may want another one, provided we find a way to shut the idiot up
 without horrible hacks since that's just gcc being stupid I believe).

I'll have the final fix out to you today.

g.

-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: PowerPC PCI DMA issues (prefetch/coherency?)

2009-06-16 Thread Scott Wood

Chris Pringle wrote:

The kernel version is 2.6.26. Firmware is custom on a custom board.

Cache coherency on PCI DMA requires that the memory be mapped with the M
attribute on this chip, but that should be happening based on 
detection of

the core.
  

I'm not sure where to look to verify this?


Check asm/cputable.h for CPU_FTR_NEED_COHERENT.  Make sure that 
CONFIG_8260 is one of the #ifdefs that turns that on.  It looks like 
that was in place by 2.6.26 in arch/powerpc.  I'm not sure what to look 
for in arch/ppc.



Also make sure that you park the bus on PCI and raise its arbitration
priority, as done at the end of fixup_pci in 
arch/powerpc/boot/cuboot-pq2.c.
  

Since this is a reasonably recent kernel,


Not really, there was a fair amount of 82xx work in the mid-2.6.20s. 
The addition of CPU_FTR_NEED_COHERENT to 82xx was somewhere in that time.


Can you try 2.6.30?

I'd guess that both of these 
things are correct. I've had a quick look in that file and there is code 
in there raising arbitartion priority and parking the bus.


Just because the code is there doesn't mean you're using it -- are you 
using cuImage?  Are you using arch/ppc or arch/powerpc?


Typically this would be done by firmware; it's only in cuboot because 
u-boot wasn't doing it.



BTW, you may want to post to linuxppc-dev@lists.ozlabs.org for
powerpc-specific issues, especially this kind of hardware issue.

  

I've just posted there :-)


Doh, sorry. :-)

Interestingly, I've just turned off cache snooping and the problem has 
got much worse. This has surprised me as I thought that part of the job 
done by pci_map_sg was to flush the CPU cache


It only flushes the cache on hardware that doesn't do coherent DMA.

-Scott

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: PowerPC PCI DMA issues (prefetch/coherency?)

2009-06-16 Thread Chris Pringle

Scott Wood wrote:


Check asm/cputable.h for CPU_FTR_NEED_COHERENT.  Make sure that 
CONFIG_8260 is one of the #ifdefs that turns that on.  It looks like 
that was in place by 2.6.26 in arch/powerpc.  I'm not sure what to 
look for in arch/ppc.
I've just checked that and it's definitely switched on in CPU_FTR_COMMON 
(CONFIG_8260 is also being used).



Also make sure that you park the bus on PCI and raise its arbitration
priority, as done at the end of fixup_pci in 
arch/powerpc/boot/cuboot-pq2.c.
  

Since this is a reasonably recent kernel,


Not really, there was a fair amount of 82xx work in the mid-2.6.20s. 
The addition of CPU_FTR_NEED_COHERENT to 82xx was somewhere in that time.


Can you try 2.6.30?
I'll give it a try, but that won't be a quick thing to do - will 
hopefully manage to get that done tomorrow if it patches without too 
many issues. I should point out that we've got the low latency patches 
on this kernel too; I guess it'd be worth trying it without them before 
I move kernels.


I'd guess that both of these things are correct. I've had a quick 
look in that file and there is code in there raising arbitartion 
priority and parking the bus.


Just because the code is there doesn't mean you're using it -- are you 
using cuImage?  Are you using arch/ppc or arch/powerpc?


Typically this would be done by firmware; it's only in cuboot because 
u-boot wasn't doing it.

Just checked this is being called and it is. We're using arch/powerpc.



Interestingly, I've just turned off cache snooping and the problem 
has got much worse. This has surprised me as I thought that part of 
the job done by pci_map_sg was to flush the CPU cache


It only flushes the cache on hardware that doesn't do coherent DMA.

Ah right - that would explain what we're seeing then... Doh. Thought I 
might have been onto something then. Is there any way to force a cache 
flush? That'd at least prove it was a caching issue if it resolved the 
problem.


Thanks,
Chris

--

__
Chris Pringle
Software Engineer

Miranda Technologies Ltd.
Hithercroft Road
Wallingford
Oxfordshire OX10 9DG
UK

Tel. +44 1491 820206
Fax. +44 1491 820001
www.miranda.com



Miranda Technologies Limited
Registered in England and Wales CN 02017053
Registered Office: James House, Mere Park, Dedmere Road, Marlow, Bucks, SL7 1FJ
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: MPC83xx watchdog reset board dead lock

2009-06-16 Thread Leon Woestenberg
Hello,

On Tue, Jun 16, 2009 at 6:30 PM, David Hawkinsd...@ovro.caltech.edu wrote:
 Most MPC8xxx board designs I have seen suffer from this possible dead
 lock:
 - NOR Flash is put in erase mode or write mode
 - Hardware watchdog triggers
 - HRESET# is asserted by the processor, during which the configuration
 words are read from NOR Flash.

 Either
 HRESET# is not attached to NOR, NOR stays in erase/write mode and
 invalid words will be read - dead lock

 or either:
 HRESET# is attached to NOR reset, NOR is reset, but stays in reset as
 HRESET# stays asserted.

 We have been looking at several solutions hardware wise that reset the
 NOR flash on HRESET# going low, but the processors are stubborn,
 read the config words only once, than dead lock.

 I wonder if there are known-working designs for this.

 What do you do in the case of blank flash on a board?

The problem is not with blank flash or firmware upgrades, we know how
to handle that.

Your solution is (a solution) to a different problem.

The problem lies in the fact that board dead lock can occur if the
watchdog triggers, for all reference designs I have seen.

Thanks for thinking along.  I would like to solve the original problem though.
BTW, we use CPLD/FPGAs on most of our boards, this one we do not for
cost reasons.


Regards,
-- 
Leon
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: PowerPC PCI DMA issues (prefetch/coherency?)

2009-06-16 Thread Scott Wood

Chris Pringle wrote:
Ah right - that would explain what we're seeing then... Doh. Thought I 
might have been onto something then. Is there any way to force a cache 
flush? That'd at least prove it was a caching issue if it resolved the 
problem.


You could enable CONFIG_NOT_COHERENT_CACHE.

-Scott
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH V2 2/2] mtd/maps/mtd-ram: add an of-platform driver

2009-06-16 Thread Albrecht Dreß

Am 16.06.09 14:53 schrieb(en) Grant Likely:
I'm not happy about the use case though.  It probably shouldn't  
appear in this binding, or if it does it should be tagged as an  
optional property.


I agree with you that the naming is really misleading - other devices  
which are not mtd's may suffer from the same problem of the lpb (in my  
case, I have an extra memory-mapped Ethernet chip which I didn't try to  
access yet...).


As it is actually a chip select property, what about defining an  
optional property like cs-width = (8|16|32) which defaults to 8 and  
may be added to each 5200 lpb child?


It is only in the 5200 localplus case that bank-width is needed to  
figure out how to apply the workaround.


Just out of curiosity: what about the localbus of other Freescale  
chips (82xx? 83xx? Maybe others?)?


Thanks, Albrecht.


pgpLbC13BupnH.pgp
Description: PGP signature
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: PowerPC PCI DMA issues (prefetch/coherency?)

2009-06-16 Thread Scott Wood

Arnd Bergmann wrote:

On Tuesday 16 June 2009, Scott Wood wrote:

Chris Pringle wrote:
Ah right - that would explain what we're seeing then... Doh. Thought I 
might have been onto something then. Is there any way to force a cache 
flush? That'd at least prove it was a caching issue if it resolved the 
problem.

You could enable CONFIG_NOT_COHERENT_CACHE.


If the whole system is noncoherent, that is the right solution.


I meant it more as a test than a permanent solution...


If the
device is the only one, you can also use dma_alloc_noncoherent() and
flush explicitly with dma_cache_sync().


I don't see how that would help -- aren't those also controlled by 
CONFIG_NOT_COHERENT_CACHE?


-Scott
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: PowerPC PCI DMA issues (prefetch/coherency?)

2009-06-16 Thread Arnd Bergmann
On Tuesday 16 June 2009, Scott Wood wrote:
  If the
  device is the only one, you can also use dma_alloc_noncoherent() and
  flush explicitly with dma_cache_sync().
 
 I don't see how that would help -- aren't those also controlled by 
 CONFIG_NOT_COHERENT_CACHE?

Ah, yes you are right. PowerPC implements dma_alloc_noncoherent as
dma_alloc_coherent, so dma_cache_sync() is actually a NOP (or should be).

Actually there seems to be a bug in here: Since dma_alloc_noncoherent
gives you a coherent mapping (or NULL) on noncoherent machines,
dma_cache_sync() is redundant and should not actually flush the
cache, or we should change dma_alloc_noncoherent to do a simple
alloc_pages on CONFIG_NON_COHERENT_CACHE and leave dma_cache_sync()
as it is.

Arnd 
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH] fsldma: Add DMA_SLAVE support

2009-06-16 Thread Dan Williams

Hi Ira,

Ira Snyder wrote:

Use the DMA_SLAVE capability of the DMAEngine API to copy/from a
scatterlist into an arbitrary list of hardware address/length pairs.

This allows a single DMA transaction to copy data from several different
devices into a scatterlist at the same time.

This also adds support to enable some controller-specific features such as
external start and external pause of a DMA transaction.

Signed-off-by: Ira W. Snyder i...@ovro.caltech.edu
---

This is a request for comments on this patch. I hunch it is not quite
ready for inclusion, though it is certainly ready for review. Correct
functioning of this patch depends on the patches submitted earlier.

As suggested by Dan Williams, I implemented DMA_SLAVE support for the
fsldma controller to allow me to use the hardware to transfer to/from a
scatterlist to a list of hardware address/length pairs.

I implemented support for the extra features available in the DMA
controller, such as external pause and external start. I have not tested
the features yet. I am willing to drop the support if everything else
looks good.

I have implemented helper functions for creating the list of hardware
address/length pairs as static inline functions in the linux/fsldma.h
header. Should I incorporate these into the driver itself and use
EXPORT_SYMBOL()? I've never done this before :)


Using EXPORT_SYMBOL would defeat the purpose of conforming to the 
dmaengine api which should allow other subsystems to generically 
discover an fsldma resource.



diff --git a/include/linux/fsldma.h b/include/linux/fsldma.h
new file mode 100644
index 000..a42dcdd
--- /dev/null
+++ b/include/linux/fsldma.h
@@ -0,0 +1,105 @@
+/*
+ * Freescale MPC83XX / MPC85XX DMA Controller
+ *
+ * Copyright (c) 2009 Ira W. Snyder i...@ovro.caltech.edu
+ *
+ * This file is licensed under the terms of the GNU General Public License
+ * version 2. This program is licensed as is without any warranty of any
+ * kind, whether express or implied.
+ */
+
+#ifndef __LINUX_FSLDMA_H__
+#define __LINUX_FSLDMA_H__
+
+#include linux/dmaengine.h
+
+/*
+ * physical hardware address / length pair for use with the
+ * DMAEngine DMA_SLAVE API
+ */
+struct fsl_dma_hw_addr {
+   struct list_head entry;
+
+   dma_addr_t address;
+   size_t length;
+};


Can you explain a bit more why you need the new dma address list, would 
a struct scatterlist suffice?


In general it is difficult to merge new functionality without an in-tree 
user.  Can you share the client of this new api?


I suspect you can get away without needing these new helper routines. 
Have a look at how Haavard implemented his DMA_SLAVE client in 
drivers/mmc/host/atmel-mci.c.


Haavard ended up needing to add some public structure definitions to 
include/linux, but my preference is to keep this in an 
architecture/platform specific header file location if possible.


Regards,
Dan

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: MPC83xx watchdog reset board dead lock

2009-06-16 Thread David Hawkins

Hi Leon,


Most MPC8xxx board designs I have seen suffer from this possible dead
lock:
- NOR Flash is put in erase mode or write mode
- Hardware watchdog triggers
- HRESET# is asserted by the processor, during which the configuration
words are read from NOR Flash.

Either
HRESET# is not attached to NOR, NOR stays in erase/write mode and
invalid words will be read - dead lock

or either:
HRESET# is attached to NOR reset, NOR is reset, but stays in reset as
HRESET# stays asserted.

We have been looking at several solutions hardware wise that reset the
NOR flash on HRESET# going low, but the processors are stubborn,
read the config words only once, than dead lock.

I wonder if there are known-working designs for this.

What do you do in the case of blank flash on a board?


The problem is not with blank flash or firmware upgrades, we know how
to handle that.

Your solution is (a solution) to a different problem.

The problem lies in the fact that board dead lock can occur if the
watchdog triggers, for all reference designs I have seen.

Thanks for thinking along.  I would like to solve the original problem though.
BTW, we use CPLD/FPGAs on most of our boards, this one we do not for
cost reasons.


So we're talking about the sequence where HRESET# asserts as
in say the logic analyzer trace on p34:

http://www.ovro.caltech.edu/~dwh/carma_board/powerpc_mpc8349e.pdf

the LALE pulse where the processor reads the RCWs occurs very
soon after the falling edge of HRESET#.

So the Flash needs to be reset to ensure that it is in
read-array mode, so that the processor doesn't choke.
Since HRESET# is still low, thats no good. A pulse
generator that is based on HRESET# might work, but
the pulse would have to be long enough to meet any
reset requirement of the flash, yet short enough so
that the read of the first RCW would be valid. Since
the local bus is running really slow at this point,
I think that could be done ok.

How about a set-reset flip flop that is set on the
falling edge of HRESET# and cleared on the rising
edge of LALE. That'll produce a decent reset
pulse to the flash, and then there is plenty of
time for the first access to produce valid data
on the bus.

Cheers,
Dave

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH V2 2/2] mtd/maps/mtd-ram: add an of-platform driver

2009-06-16 Thread Grant Likely
Okay, fair enough.  I wasn't paying very close attention when I replied.  It
still seems awkward to me, but not enough to object (ie. It's not
dangerous).

g.

On Jun 16, 2009 7:20 AM, Wolfram Sang w.s...@pengutronix.de wrote:

  Grant wondered if we need a bankwidth. IMHO it is needed for now, but I
don't   know if this i...
Maybe there is a misunderstanding here. I am not talking about Albrecht's
case.
What I replied to your concern is that bankwidth is used(!) in the
underlying
map-ram-driver in mapram_erase() at the moment. Whether this is really
needed
could be discussed perhaps, but is beyond the scope of this patch series
IMHO.
I'd think this can be addressed in a later series, if needed, although this
could mean that the binding will change (bank-width becoming optional).

Regards, Wolfram -- Pengutronix e.K. | Wolfram Sang ...

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEARECAAYFAko3nAUACgkQD27XaX1/VRtTkACfW0aUMJHrU3m4DCel0pm5fA6J
WaQAnjGo5fn6JvMHt3Ke/xFTGB1uYT6p
=V9t5
-END PGP SIGNATURE-
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

2.6.30 kernel panic with any network driver

2009-06-16 Thread John Linn
I just merged in 2.6.30 into our tree and I'm seeing this only when I
have a network driver built into the kernel, and it doesn't seem to
matter which network driver.

 

Maybe I missed something I need to update in our network drivers to
match the kernel.

 

Thanks for any help,

John

 

console [ttyS0] enabled

brd: module loaded

loop: module loaded

Unable to handle kernel paging request for data at address 0x75686369

Faulting instruction address: 0xc01c3e34

Oops: Kernel access of bad area, sig: 11 [#1]

PREEMPT Xilinx Virtex440

Modules linked in:

NIP: c01c3e34 LR: c01c2ab4 CTR: 

REGS: cf831dc0 TRAP: 0300   Not tainted  (2.6.30)

MSR: 00029000 EE,ME,CE  CR: 84284324  XER: 4020

DEAR: 75686369, ESR: 

TASK = cf82f8a0[1] 'swapper' THREAD: cf83

GPR00: c01c2ab4 cf831e70 cf82f8a0 cf87b3a8 c034e5a4  c01c2a80
0001

GPR08: cfbe69d8 c034dd98 0f2b cf821778 84284342 fd9781f4 f6f6dff7
6ade

GPR16: d7f40173 bbd35ff7 3ef7fbff fff5d3f7  0059dac0 006f


GPR24:  0001 c034dd98 cf0416e0 cf87b3a0 cf87b3dc c034e5a4
75686369

NIP [c01c3e34] platform_match+0x20/0xa8

LR [c01c2ab4] __driver_attach+0x34/0xa8

Call Trace:

[cf831e70] [c027a394] klist_next+0x10c/0x120 (unreliable)

[cf831e90] [c01c2ab4] __driver_attach+0x34/0xa8

[cf831eb0] [c01c123c] bus_for_each_dev+0x5c/0x98

[cf831ee0] [c01c243c] driver_attach+0x24/0x34

[cf831ef0] [c01c2050] bus_add_driver+0x18c/0x264

[cf831f10] [c01c2fa8] driver_register+0x6c/0x170

[cf831f30] [c031f8cc] xemaclite_init+0x1c/0x4c

[cf831f50] [c00011ac] do_one_initcall+0x34/0x1a0

[cf831fd0] [c0307848] kernel_init+0xa0/0x104

[cf831ff0] [c000e188] kernel_thread+0x4c/0x68

Instruction dump:



This email and any attachments are intended for the sole use of the named 
recipient(s) and contain(s) confidential information that may be proprietary, 
privileged or copyrighted under applicable law. If you are not the intended 
recipient, do not read, copy, or forward this email message or any attachments. 
Delete this email message and any attachments immediately.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC PATCH] fsldma: Add DMA_SLAVE support

2009-06-16 Thread Ira Snyder
On Tue, Jun 16, 2009 at 12:01:40PM -0700, Dan Williams wrote:
 Hi Ira,

 Ira Snyder wrote:
 Use the DMA_SLAVE capability of the DMAEngine API to copy/from a
 scatterlist into an arbitrary list of hardware address/length pairs.

 This allows a single DMA transaction to copy data from several different
 devices into a scatterlist at the same time.

 This also adds support to enable some controller-specific features such as
 external start and external pause of a DMA transaction.

 Signed-off-by: Ira W. Snyder i...@ovro.caltech.edu
 ---

 This is a request for comments on this patch. I hunch it is not quite
 ready for inclusion, though it is certainly ready for review. Correct
 functioning of this patch depends on the patches submitted earlier.

 As suggested by Dan Williams, I implemented DMA_SLAVE support for the
 fsldma controller to allow me to use the hardware to transfer to/from a
 scatterlist to a list of hardware address/length pairs.

 I implemented support for the extra features available in the DMA
 controller, such as external pause and external start. I have not tested
 the features yet. I am willing to drop the support if everything else
 looks good.

 I have implemented helper functions for creating the list of hardware
 address/length pairs as static inline functions in the linux/fsldma.h
 header. Should I incorporate these into the driver itself and use
 EXPORT_SYMBOL()? I've never done this before :)

 Using EXPORT_SYMBOL would defeat the purpose of conforming to the  
 dmaengine api which should allow other subsystems to generically  
 discover an fsldma resource.


Any driver would still use dma_request_channel(), etc. to get access to
a DMA channel. AFAICT, DMA_SLAVE is intended for doing something
completely hardware-specific with the DMA controller.

 diff --git a/include/linux/fsldma.h b/include/linux/fsldma.h
 new file mode 100644
 index 000..a42dcdd
 --- /dev/null
 +++ b/include/linux/fsldma.h
 @@ -0,0 +1,105 @@
 +/*
 + * Freescale MPC83XX / MPC85XX DMA Controller
 + *
 + * Copyright (c) 2009 Ira W. Snyder i...@ovro.caltech.edu
 + *
 + * This file is licensed under the terms of the GNU General Public License
 + * version 2. This program is licensed as is without any warranty of any
 + * kind, whether express or implied.
 + */
 +
 +#ifndef __LINUX_FSLDMA_H__
 +#define __LINUX_FSLDMA_H__
 +
 +#include linux/dmaengine.h
 +
 +/*
 + * physical hardware address / length pair for use with the
 + * DMAEngine DMA_SLAVE API
 + */
 +struct fsl_dma_hw_addr {
 +   struct list_head entry;
 +
 +   dma_addr_t address;
 +   size_t length;
 +};

 Can you explain a bit more why you need the new dma address list, would  
 a struct scatterlist suffice?


I don't believe so. A scatterlist only holds page/length pairs. How
would you pass an arbitrary dma_addr_t/length pair in a scatterlist. I
/could/ abuse sg_dma_address() and do something like the following, but
I think you'd be even less inclined to take the patch:

struct scatterlist sg[10];

sg_dma_address(sg) = addr1;
sg_dma_len(sg) = len1;
sg++;
sg_dma_address(sg) = addr2;
sg_dma_len(sg) = len2;

/* and so on */

This would mean that there is a scatterlist with the struct page
pointers set to NULL, which has not had dma_map_sg() run on it. Seems
like abuse to me.

What I've implemented is this: (sorry about the poor drawing)

scatterlist   fsl_dma_hw_addr
+++---+
|  DATA  |   | DEST1 |
|  DATA  | +  +---+
|  DATA  | |
|  DATA  | |  +---+
|  DATA  | +---  | DEST2 |
|  DATA  |+---+
++
  .
  .
  .

Of course, the reverse works as well. You can copy from a list of
hardware address/length pairs to a scatterlist.

So, using my implementation of the DMA_SLAVE feature, you can take a big
chunk of data (which is organized into a scatterlist) and DMA it
directly to a set of hardware addresses, all in a single, unbroken
transaction.

I've got an FPGA programmer which needs a ~12MB image dumped to a FIFO
at 0xf0003000 in 4K chunks (all writes must be in the 0xf0003000 to
0xf0004000 range). The programmer is actually in control of the DMA
controller at that time. Internally, the FPGA programmer does some
toggling of pins, etc. which is needed to actually push the image into
the FPGA's themselves.

 In general it is difficult to merge new functionality without an in-tree  
 user.  Can you share the client of this new api?


I've inlined the driver for the FPGA programmer below. I don't think it
is appropriate to push into mainline, since it will only work for our
board, and nothing else.

It is pretty simple, but I'm totally open to suggestions for changes. I
used a char device to fill in a scatterlist, then set up the DMA to
0xf0003000 in 4K chunks.

I've got another driver that uses the interface, but this one is a bit
simpler. I can post the other one if you'd like, as well.

 I 

[PATCH RFC 1/2] Makefile: Never use -fno-omit-frame-pointer

2009-06-16 Thread Anton Vorontsov
According to Segher Boessenkool and GCC manual, -fomit-frame-pointer
is only the default when optimising on archs/ABIs where it doesn't
hinder debugging and -pg. So, we do not get it by default on x86,
not at any optimisation level.

On the other hand, *using* -fno-omit-frame-pointer causes gcc to
produce buggy code on PowerPC targets.

If Segher and GCC manual are right, this patch should be a no-op
for all arches except PowerPC, where the patch fixes gcc issues.

Signed-off-by: Anton Vorontsov avoront...@ru.mvista.com
---

See this thread for more discussion:
http://osdir.com/ml/linux-kernel/2009-05/msg01754.html

p.s.
Obviously, I didn't test this patch on anything else but PPC32. ;-)

Segher, do you know if all GCC versions that we support for
building Linux are behaving the way that GCC manual describe?

Thanks,

 Makefile |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Makefile b/Makefile
index ea63667..70ad1ff 100644
--- a/Makefile
+++ b/Makefile
@@ -535,7 +535,7 @@ KBUILD_CFLAGS += $(call cc-option, -fno-stack-protector)
 endif
 
 ifdef CONFIG_FRAME_POINTER
-KBUILD_CFLAGS  += -fno-omit-frame-pointer -fno-optimize-sibling-calls
+KBUILD_CFLAGS  += -fno-optimize-sibling-calls
 else
 KBUILD_CFLAGS  += -fomit-frame-pointer
 endif
-- 
1.6.3.1

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH RFC 2/2] powerpc: Remove -fno-omit-frame-pointer workarounds

2009-06-16 Thread Anton Vorontsov
The workarounds aren't needed any longer since the top level Makefile
doesn't pass -fno-omit-frame-pointer cflag for PowerPC builds.

Signed-off-by: Anton Vorontsov avoront...@ru.mvista.com
---
 arch/powerpc/Makefile|5 -
 arch/powerpc/kernel/Makefile |   12 ++--
 arch/powerpc/platforms/powermac/Makefile |2 +-
 lib/Kconfig.debug|6 +++---
 4 files changed, 10 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index bc35f4e..dd3c63b 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -120,11 +120,6 @@ ifeq ($(CONFIG_6xx),y)
 KBUILD_CFLAGS  += -mcpu=powerpc
 endif
 
-# Work around a gcc code-gen bug with -fno-omit-frame-pointer.
-ifeq ($(CONFIG_FUNCTION_TRACER),y)
-KBUILD_CFLAGS  += -mno-sched-epilog
-endif
-
 cpu-as-$(CONFIG_4xx)   += -Wa,-m405
 cpu-as-$(CONFIG_6xx)   += -Wa,-maltivec
 cpu-as-$(CONFIG_POWER4)+= -Wa,-maltivec
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 612b0c4..19ebe1b 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -14,14 +14,14 @@ endif
 
 ifdef CONFIG_FUNCTION_TRACER
 # Do not trace early boot code
-CFLAGS_REMOVE_cputable.o = -pg -mno-sched-epilog
-CFLAGS_REMOVE_prom_init.o = -pg -mno-sched-epilog
-CFLAGS_REMOVE_btext.o = -pg -mno-sched-epilog
-CFLAGS_REMOVE_prom.o = -pg -mno-sched-epilog
+CFLAGS_REMOVE_cputable.o = -pg
+CFLAGS_REMOVE_prom_init.o = -pg
+CFLAGS_REMOVE_btext.o = -pg
+CFLAGS_REMOVE_prom.o = -pg
 # do not trace tracer code
-CFLAGS_REMOVE_ftrace.o = -pg -mno-sched-epilog
+CFLAGS_REMOVE_ftrace.o = -pg
 # timers used by tracing
-CFLAGS_REMOVE_time.o = -pg -mno-sched-epilog
+CFLAGS_REMOVE_time.o = -pg
 endif
 
 obj-y  := cputable.o ptrace.o syscalls.o \
diff --git a/arch/powerpc/platforms/powermac/Makefile 
b/arch/powerpc/platforms/powermac/Makefile
index 50f1693..0eb8781 100644
--- a/arch/powerpc/platforms/powermac/Makefile
+++ b/arch/powerpc/platforms/powermac/Makefile
@@ -2,7 +2,7 @@ CFLAGS_bootx_init.o += -fPIC
 
 ifdef CONFIG_FUNCTION_TRACER
 # Do not trace early boot code
-CFLAGS_REMOVE_bootx_init.o = -pg -mno-sched-epilog
+CFLAGS_REMOVE_bootx_init.o = -pg
 endif
 
 obj-y  += pic.o setup.o time.o feature.o pci.o \
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 39f113a..1479e54 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -492,7 +492,7 @@ config LOCKDEP
bool
depends on DEBUG_KERNEL  TRACE_IRQFLAGS_SUPPORT  STACKTRACE_SUPPORT 
 LOCKDEP_SUPPORT
select STACKTRACE
-   select FRAME_POINTER if !MIPS  !PPC  !ARM_UNWIND  !S390
+   select FRAME_POINTER if !MIPS  !ARM_UNWIND  !S390
select KALLSYMS
select KALLSYMS_ALL
 
@@ -861,13 +861,13 @@ config FAULT_INJECTION_STACKTRACE_FILTER
depends on FAULT_INJECTION_DEBUG_FS  STACKTRACE_SUPPORT
depends on !X86_64
select STACKTRACE
-   select FRAME_POINTER if !PPC  !S390
+   select FRAME_POINTER if !S390
help
  Provide stacktrace filter for fault-injection capabilities
 
 config LATENCYTOP
bool Latency measuring infrastructure
-   select FRAME_POINTER if !MIPS  !PPC  !S390
+   select FRAME_POINTER if !MIPS  !S390
select KALLSYMS
select KALLSYMS_ALL
select STACKTRACE
-- 
1.6.3.1
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: 2.6.30 kernel panic with any network driver

2009-06-16 Thread Scott Wood

John Linn wrote:

console [ttyS0] enabled

brd: module loaded

loop: module loaded

Unable to handle kernel paging request for data at address 0x75686369


That address spells uhci, FWIW.  Maybe a bad pointer cast somewhere in 
the USB platform data?


-Scott
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH] RFC: powerpc: expose the multi-bit ops that underlie single-bit ops.

2009-06-16 Thread Benjamin Herrenschmidt
On Tue, 2009-06-16 at 10:28 -0400, Geoff Thorpe wrote:

  Hi ! Sorry for the delay, that was on my have a look one of these days,
  low priority list for a bit too long :-)
 
 NP, optimal throughput often requires a compromise in latency :-)

Hehehe, so true :-)
 
  I'm not sure it's useful to provide a multi-bit variant of the
  lock and unlock primitives. Do other archs do ?
 
 For clear_bit_unlock(), no they don't appear to. There is a fallback in
 include/asm-generic though, and I notice that it's used in a few places,
 eg. drivers/rtc/rtc-dev. OTOH some other archs appear to provide their
 own test_and_set_bit_lock(), and there's a fallback in
 includes/asm-generic for that too.
 
 Do you see a reason to isolate either of these and not factor out the
 inner word-based logic?

Don't bother, especially if we are using a macro to generate them, we
may as well make them all look the same.

 .../...

 Yup, believe it or not, I saw this coming but didn't have the guts to
 try proposing something like it up-front (in particular, I was wary of
 botching some subtleties in the assembly).

Fair enough :-)

  Maybe we can shrink that file significantly (and avoid the risk for
  typos etc...) by generating them all from a macro.
  
  Something like (typed directly into the mailer :-)
  
  #define DEFINE_BITOP(op, prefix, postfix) \
  asm volatile (\
  prefix\
  1:PPC_LLARX %0,0,%3\n \
  __stringify(op) %1,%0,%2\n  \
  PPC405_ERR77(0,%3)\
  PPC_STLCX %1,0,%3\n \
  bne- 1b\n   \
  postfix   \
   : =r (old), =r (t)
   : r (mask), r (p)
   : cc, memory)
  
  and so:
  
  static inline void set_bits(unsigned long mask, volatile unsigned long 
  *addr)
  {
  unsigned long old, t;
  
  DEFINE_BITOP(or, , );
  }
  
  static inline void test_and_set_bits(unsigned long mask, volatile unsigned 
  long *addr)
  {
  unsigned long old, t;
  
  DEFINE_BITOP(or, LWSYNC_ON_SMP, ISYNC_ON_SMP);
  
  return (old  mask) != 0;
  }
  
  etc...
 
 
 Sounds good, I'll try working this up and I'll send a new patch shortly.

You can also go totally mad and generate the whole function (both -s and
non -s variants) from one macro but I wouldn't go that far :-)

 So can I assume implicitly that changing the set_bits() function to add
 the 'volatile' qualifier to the prototype (and the missing
 PPC405_ERR77() workaround) was OK?

The PPC405_ERR77 workaround is definitely needed. The volatile, well, I
suspect it's useless, but it will remove warnings when callers call
these on something that is declared as volatile in the first place.

Do x86 use volatile there ? If not, then don't do it on powerpc neither,
it could well be an historical remain. It's not functionally useful, the
memory clobber in the asm takes care of telling the compiler not to
mess around I believe.

 Also - any opinion on whether the same re-factoring of the asm-generic
 versions should be undertaken? I'm not looking to bite off more than I
 can chew, but I don't know if it's frowned upon to make powerpc-only
 extensions to the API. And if you think an asm-generic patch makes
 sense, could that be taken via linuxppc-dev too or does it need to go
 elsewhere?

I'm not people care -that- much :-) You can always try and post it to
lkml (maybe linux-arch too) and see what comes back... but let's finish
the powerpc variant first :-)

Cheers,
Ben.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 1/2] lib: Provide generic atomic64_t implementation

2009-06-16 Thread Gabriel Paubert
On Sun, Jun 14, 2009 at 04:04:36PM +0300, Avi Kivity wrote:
 Paul Mackerras wrote:
 Avi Kivity writes:

   
 An alternative implementation using 64-bit cmpxchg will recover most 
 of the costs of hashed spinlocks.  I assume most serious 32-bit  
 architectures have them?
 

 Have a 64-bit cmpxchg, you mean?  x86 is the only one I know of, and
 it already has an atomic64_t implementation using cmpxchg8b (or
 whatever it's called).
   

 Yes (and it is cmpxchg8b).  I'm surprised powerpc doesn't have DCAS support.

Well, s390 and m68k have the equivalent (although I don't think Linux
suppiorts SMP m68k, although some dual 68040/68060 boards have existed).

But 32 bit PPC will never have it. It just does not fit in the architecture
since integer loads and stores are limited to 32 bit (or split into 32 bit
chunks). Besides that there is no instruction that performs a read-modify-write
of memory. This would make the LSU much more complex for a corner case.

Hey, Intel also botched the first implementation of cmpxchg8b on the Pentium:
the (in)famous f00f bug is actually lock cmpxchg8b with a register operand.

Now for these counters, other solutions could be considered, like using
the most significant bit as a lock and having only 63 usable bits (when 
counting ns, this overflows at 292 years). 


 My thinking is that the 32-bit non-x86 architectures will be mostly
 UP, so the overhead is just an interrupt enable/restore.  Those that
 are SMP I would expect to be small SMP -- mostly just 2 cpus and maybe
 a few 4-way systems.
   

 The new Nehalems provide 8 logical threads in a single socket.  All  
 those threads share a cache, and they have cmpxchg8b anyway, so this  
 won't matter.


The problem is not Nehalem (who wants to run 32 bit kernels on a Nehalem
anyway) or x86.

The problem is that the assumption that the largest PPC32 SMP are 4 way
may be outdated:

http://www.freescale.com/webapp/sps/site/prod_summary.jsp?fastpreview=1code=P4080

and some products including that processor have been announced (I don't know
whether they are shipping or not) and (apparently) run Linux.

Gabriel
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 2/2] [PATCH 2/2 v2] qe: add polling timeout to qe_issue_cmd()

2009-06-16 Thread Kumar Gala


On May 26, 2009, at 10:21 AM, Timur Tabi wrote:

The qe_issue_cmd() function (Freescale PowerPC QUICC Engine library)  
polls on
a register until a status bit changes, but does not include a  
timeout to
handle the situation if the bit never changes.  Change the code to  
use the new
spin_event_timeout() macro, which simplifies polling on a register  
without

a timeout.

Signed-off-by: Timur Tabi ti...@freescale.com
---

This patch depends on my previous patch, powerpc: introduce macro
spin_event_timeout().

arch/powerpc/sysdev/qe_lib/qe.c |9 ++---
1 files changed, 6 insertions(+), 3 deletions(-)


applied to next

- k
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: PowerPC PCI DMA issues (prefetch/coherency?)

2009-06-16 Thread Benjamin Herrenschmidt
On Tue, 2009-06-16 at 20:02 +0200, Arnd Bergmann wrote:
 On Tuesday 16 June 2009, Scott Wood wrote:
   If the
   device is the only one, you can also use dma_alloc_noncoherent() and
   flush explicitly with dma_cache_sync().
  
  I don't see how that would help -- aren't those also controlled by 
  CONFIG_NOT_COHERENT_CACHE?
 
 Ah, yes you are right. PowerPC implements dma_alloc_noncoherent as
 dma_alloc_coherent, so dma_cache_sync() is actually a NOP (or should be).

But we still need to sync the result of dma_map_* when used multiple
times for a single mapping.

Cheers,
Ben.

 Actually there seems to be a bug in here: Since dma_alloc_noncoherent
 gives you a coherent mapping (or NULL) on noncoherent machines,
 dma_cache_sync() is redundant and should not actually flush the
 cache, or we should change dma_alloc_noncoherent to do a simple
 alloc_pages on CONFIG_NON_COHERENT_CACHE and leave dma_cache_sync()
 as it is.
 
   Arnd 
 ___
 Linuxppc-dev mailing list
 Linuxppc-dev@lists.ozlabs.org
 https://lists.ozlabs.org/listinfo/linuxppc-dev

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: PowerPC PCI DMA issues (prefetch/coherency?)

2009-06-16 Thread FUJITA Tomonori
On Wed, 17 Jun 2009 10:18:45 +1000
Benjamin Herrenschmidt b...@kernel.crashing.org wrote:

 On Tue, 2009-06-16 at 20:02 +0200, Arnd Bergmann wrote:
  On Tuesday 16 June 2009, Scott Wood wrote:
If the
device is the only one, you can also use dma_alloc_noncoherent() and
flush explicitly with dma_cache_sync().
   
   I don't see how that would help -- aren't those also controlled by 
   CONFIG_NOT_COHERENT_CACHE?
  
  Ah, yes you are right. PowerPC implements dma_alloc_noncoherent as
  dma_alloc_coherent, so dma_cache_sync() is actually a NOP (or should be).
 
 But we still need to sync the result of dma_map_* when used multiple
 times for a single mapping.

We have dma_sync_{single|sg}_for_{cpu|device} API for the above
purpose.

dma_cache_sync is supposed to be used only with the buffers that
dma_alloc_noncoherent() returns. On architecutures that maps
dma_alloc_noncoherent to dma_alloc_coherent, dma_cache_sync() is
supposed to be NOP.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: PowerPC PCI DMA issues (prefetch/coherency?)

2009-06-16 Thread Leon Woestenberg
Hello all,

On Wed, Jun 17, 2009 at 2:37 AM, FUJITA
Tomonorifujita.tomon...@lab.ntt.co.jp wrote:
 On Wed, 17 Jun 2009 10:18:45 +1000
 Benjamin Herrenschmidt b...@kernel.crashing.org wrote:

 On Tue, 2009-06-16 at 20:02 +0200, Arnd Bergmann wrote:
  On Tuesday 16 June 2009, Scott Wood wrote:
If the
device is the only one, you can also use dma_alloc_noncoherent() and
flush explicitly with dma_cache_sync().
  
   I don't see how that would help -- aren't those also controlled by
   CONFIG_NOT_COHERENT_CACHE?
 
  Ah, yes you are right. PowerPC implements dma_alloc_noncoherent as
  dma_alloc_coherent, so dma_cache_sync() is actually a NOP (or should be).

 But we still need to sync the result of dma_map_* when used multiple
 times for a single mapping.

 We have dma_sync_{single|sg}_for_{cpu|device} API for the above
 purpose.

 dma_cache_sync is supposed to be used only with the buffers that
 dma_alloc_noncoherent() returns. On architecutures that maps
 dma_alloc_noncoherent to dma_alloc_coherent, dma_cache_sync() is
 supposed to be NOP.

This discussion raised some doubt with me about my use case:

I my case (note I am not the poster) I am using (what LDD3 calls)
streaming mappings:

I use pci_map_sg(), have the device perform either DMA master reads or
writes to the bus address using PCIe.
After that, I use pci_unmap_sg().

My assumption is that pci_unmap_sg() either makes the cache coherent
or invalidated and thus I do not need to take further actions.
This is on a MPC83xx or 85xx system.

Is this assumption correct?

Regards,

Leon/

-- 
Leon
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH] RFC: powerpc: expose the multi-bit ops that underlie single-bit ops.

2009-06-16 Thread Stephen Rothwell
On Wed, 17 Jun 2009 07:33:46 +1000 Benjamin Herrenschmidt 
b...@kernel.crashing.org wrote:

 You can also go totally mad and generate the whole function (both -s and
 non -s variants) from one macro but I wouldn't go that far :-)

Please don't (unless you keep the function names intact), it makes
finding the function with grep almost impossible ...

  Also - any opinion on whether the same re-factoring of the asm-generic
  versions should be undertaken? I'm not looking to bite off more than I
  can chew, but I don't know if it's frowned upon to make powerpc-only
  extensions to the API. And if you think an asm-generic patch makes
  sense, could that be taken via linuxppc-dev too or does it need to go
  elsewhere?
 
 I'm not people care -that- much :-) You can always try and post it to
 lkml (maybe linux-arch too) and see what comes back... but let's finish
 the powerpc variant first :-)

Also, we have an asm-generic maintainer now ...

-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/


pgpNr9zFd9saM.pgp
Description: PGP signature
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: PowerPC PCI DMA issues (prefetch/coherency?)

2009-06-16 Thread Benjamin Herrenschmidt
On Wed, 2009-06-17 at 09:37 +0900, FUJITA Tomonori wrote:
 
 dma_cache_sync is supposed to be used only with the buffers that
 dma_alloc_noncoherent() returns. On architecutures that maps
 dma_alloc_noncoherent to dma_alloc_coherent, dma_cache_sync() is
 supposed to be NOP.
 
Or at least a sync() on powerpc but yeah, I see. We should probably do
that.

Cheers,
Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Warning in block code

2009-06-16 Thread Benjamin Herrenschmidt
Hoy !

I see that:

block/blk-settings.c: In function ‘blk_set_default_limits’:
block/blk-settings.c:115: warning: large integer implicitly truncated to 
unsigned type

Comes from

lim-bounce_pfn = BLK_BOUNCE_ANY;

With BLK_BOUNCE_ANY being a

include/linux/blkdev.h:#define BLK_BOUNCE_ANY   (-1ULL)

And struct queue_limit.bounce_pfn is:
 
unsigned long   bounce_pfn;

(The warning is fishy as both quantities are unsigned, but there -is- truncation
happening here).

Cheers,
Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH 1/2] powerpc/of: Fix usage of dev_set_name() in of_device_alloc()

2009-06-16 Thread Benjamin Herrenschmidt
dev_set_name() takes a format string, so use it properly and avoid
a warning with recent gcc's

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---

 arch/powerpc/kernel/of_device.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-work.orig/arch/powerpc/kernel/of_device.c 2009-06-17 
11:34:54.0 +1000
+++ linux-work/arch/powerpc/kernel/of_device.c  2009-06-17 11:35:04.0 
+1000
@@ -76,7 +76,7 @@ struct of_device *of_device_alloc(struct
dev-dev.archdata.of_node = np;
 
if (bus_id)
-   dev_set_name(dev-dev, bus_id);
+   dev_set_name(dev-dev, %s, bus_id);
else
of_device_make_bus_id(dev);
 
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 2/2] powerpc/440: Fix warning early debug code

2009-06-16 Thread Benjamin Herrenschmidt
The function udbg_44x_as1_flush() has the wrong prototype causing
a warning when enabling 440 early debug.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---

 arch/powerpc/kernel/udbg_16550.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-work.orig/arch/powerpc/kernel/udbg_16550.c2009-06-17 
11:52:05.0 +1000
+++ linux-work/arch/powerpc/kernel/udbg_16550.c 2009-06-17 11:52:08.0 
+1000
@@ -219,7 +219,7 @@ void udbg_init_pas_realmode(void)
 #ifdef CONFIG_PPC_EARLY_DEBUG_44x
 #include platforms/44x/44x.h
 
-static int udbg_44x_as1_flush(void)
+static void udbg_44x_as1_flush(void)
 {
if (udbg_comport) {
while ((as1_readb(udbg_comport-lsr)  LSR_THRE) == 0)
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH] powerpc: Update Warp defconfig

2009-06-16 Thread Sean MacLennan
On Tue, 16 Jun 2009 11:24:35 -0400
Sean MacLennan smaclen...@pikatech.com wrote:

 I forgot to include the defconfig in the last set of patches. So you
 don't get to use the shiny new LEDS driver on the warp unless you turn
 it on.
 
 Enabling hotplug is also very important since we have moved to udev on
 the warp.
 
 Like most defconfig patches, most of the changes are just the normal
 kernel changes, not warp specific.
 
 If it is too early, and we want to hold off until say rc4 with
 defconfigs I have no problem with that. I have tried to outline all
 the changes we specifically made. If anybody sees any bad choices with
 options, let me know!
 
 Cheers,
Sean
 
 * Enable GPIO LEDS
 * Enable LED triggers
 * Move to slub
 * Enable hotplug
 * Enable timestamps on printks
 * Enable UBIFS

Oops, missed an important option in the list:

* Enable in kernel config

If you use Pika's development environment (PADS), you need this
enabled. We use the config to enable certain features, mainly udev.
Without this config, we will assume you have an old Pika kernel without
new features.

Cheers,
   Sean
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 1/3] powerpc/rtas: Turn rtas lock into a raw spinlock

2009-06-16 Thread Benjamin Herrenschmidt
RTAS currently uses a normal spinlock. However it can be called from
contexts where this is not necessarily a good idea. For example, it
can be called while syncing timebases, with the core timebase being
frozen. Unfortunately, that will deadlock in case of lock contention
when spinlock debugging is enabled as the spin lock debugging code
will try to use __delay() which ... relies on the timebase being
enabled.

Also RTAS can be used in some low level IRQ handling code path so it
may as well be a raw spinlock for -rt sake.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---

 arch/powerpc/include/asm/rtas.h |2 +-
 arch/powerpc/kernel/rtas.c  |   38 +-
 2 files changed, 30 insertions(+), 10 deletions(-)

--- linux-work.orig/arch/powerpc/include/asm/rtas.h 2009-06-10 
16:09:36.0 +1000
+++ linux-work/arch/powerpc/include/asm/rtas.h  2009-06-10 16:40:50.0 
+1000
@@ -58,7 +58,7 @@ struct rtas_t {
unsigned long entry;/* physical address pointer */
unsigned long base; /* physical address pointer */
unsigned long size;
-   spinlock_t lock;
+   raw_spinlock_t lock;
struct rtas_args args;
struct device_node *dev;/* virtual address pointer */
 };
Index: linux-work/arch/powerpc/kernel/rtas.c
===
--- linux-work.orig/arch/powerpc/kernel/rtas.c  2009-06-10 16:09:36.0 
+1000
+++ linux-work/arch/powerpc/kernel/rtas.c   2009-06-10 16:40:50.0 
+1000
@@ -40,7 +40,7 @@
 #include asm/atomic.h
 
 struct rtas_t rtas = {
-   .lock = SPIN_LOCK_UNLOCKED
+   .lock = __RAW_SPIN_LOCK_UNLOCKED
 };
 EXPORT_SYMBOL(rtas);
 
@@ -67,6 +67,28 @@ unsigned long rtas_rmo_buf;
 void (*rtas_flash_term_hook)(int);
 EXPORT_SYMBOL(rtas_flash_term_hook);
 
+/* RTAS use home made raw locking instead of spin_lock_irqsave
+ * because those can be called from within really nasty contexts
+ * such as having the timebase stopped which would lockup with
+ * normal locks and spinlock debugging enabled
+ */
+static unsigned long lock_rtas(void)
+{
+   unsigned long flags;
+
+   local_irq_save(flags);
+   preempt_disable();
+   __raw_spin_lock_flags(rtas.lock, flags);
+   return flags;
+}
+
+static void unlock_rtas(unsigned long flags)
+{
+   __raw_spin_unlock(rtas.lock);
+   local_irq_restore(flags);
+   preempt_enable();
+}
+
 /*
  * call_rtas_display_status and call_rtas_display_status_delay
  * are designed only for very early low-level debugging, which
@@ -79,7 +101,7 @@ static void call_rtas_display_status(cha
 
if (!rtas.base)
return;
-   spin_lock_irqsave(rtas.lock, s);
+   s = lock_rtas();
 
args-token = 10;
args-nargs = 1;
@@ -89,7 +111,7 @@ static void call_rtas_display_status(cha
 
enter_rtas(__pa(args));
 
-   spin_unlock_irqrestore(rtas.lock, s);
+   unlock_rtas(s);
 }
 
 static void call_rtas_display_status_delay(char c)
@@ -411,8 +433,7 @@ int rtas_call(int token, int nargs, int 
if (!rtas.entry || token == RTAS_UNKNOWN_SERVICE)
return -1;
 
-   /* Gotta do something different here, use global lock for now... */
-   spin_lock_irqsave(rtas.lock, s);
+   s = lock_rtas();
rtas_args = rtas.args;
 
rtas_args-token = token;
@@ -439,8 +460,7 @@ int rtas_call(int token, int nargs, int 
outputs[i] = rtas_args-rets[i+1];
ret = (nret  0)? rtas_args-rets[0]: 0;
 
-   /* Gotta do something different here, use global lock for now... */
-   spin_unlock_irqrestore(rtas.lock, s);
+   unlock_rtas(s);
 
if (buff_copy) {
log_error(buff_copy, ERR_TYPE_RTAS_LOG, 0);
@@ -837,7 +857,7 @@ asmlinkage int ppc_rtas(struct rtas_args
 
buff_copy = get_errorlog_buffer();
 
-   spin_lock_irqsave(rtas.lock, flags);
+   flags = lock_rtas();
 
rtas.args = args;
enter_rtas(__pa(rtas.args));
@@ -848,7 +868,7 @@ asmlinkage int ppc_rtas(struct rtas_args
if (args.rets[0] == -1)
errbuf = __fetch_rtas_last_error(buff_copy);
 
-   spin_unlock_irqrestore(rtas.lock, flags);
+   unlock_rtas(flags);
 
if (buff_copy) {
if (errbuf)
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 2/3] powerpc: Use one common impl. of RTAS timebase sync and use raw spinlock

2009-06-16 Thread Benjamin Herrenschmidt
Several platforms use their own copy of what is essentially the same code,
using RTAS to synchronize the timebases when bringing up new CPUs. This
moves it all into a single common implementation and additionally
turns the spinlock into a raw spinlock since the former can rely on
the timebase not being frozen when spinlock debugging is enabled, and finally
masks interrupts while the timebase is disabled.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---

 arch/powerpc/include/asm/rtas.h  |3 +++
 arch/powerpc/kernel/rtas.c   |   31 +++
 arch/powerpc/platforms/cell/smp.c|   30 ++
 arch/powerpc/platforms/chrp/smp.c|   30 ++
 arch/powerpc/platforms/pseries/smp.c |   30 ++
 5 files changed, 40 insertions(+), 84 deletions(-)

--- linux-work.orig/arch/powerpc/platforms/cell/smp.c   2009-06-17 
11:56:31.0 +1000
+++ linux-work/arch/powerpc/platforms/cell/smp.c2009-06-17 
11:56:42.0 +1000
@@ -36,7 +36,6 @@
 #include asm/prom.h
 #include asm/smp.h
 #include asm/paca.h
-#include asm/time.h
 #include asm/machdep.h
 #include asm/cputable.h
 #include asm/firmware.h
@@ -140,31 +139,6 @@ static void __devinit smp_cell_setup_cpu
mtspr(SPRN_DABRX, DABRX_KERNEL | DABRX_USER);
 }
 
-static DEFINE_SPINLOCK(timebase_lock);
-static unsigned long timebase = 0;
-
-static void __devinit cell_give_timebase(void)
-{
-   spin_lock(timebase_lock);
-   rtas_call(rtas_token(freeze-time-base), 0, 1, NULL);
-   timebase = get_tb();
-   spin_unlock(timebase_lock);
-
-   while (timebase)
-   barrier();
-   rtas_call(rtas_token(thaw-time-base), 0, 1, NULL);
-}
-
-static void __devinit cell_take_timebase(void)
-{
-   while (!timebase)
-   barrier();
-   spin_lock(timebase_lock);
-   set_tb(timebase  32, timebase  0x);
-   timebase = 0;
-   spin_unlock(timebase_lock);
-}
-
 static void __devinit smp_cell_kick_cpu(int nr)
 {
BUG_ON(nr  0 || nr = NR_CPUS);
@@ -224,8 +198,8 @@ void __init smp_init_cell(void)
 
/* Non-lpar has additional take/give timebase */
if (rtas_token(freeze-time-base) != RTAS_UNKNOWN_SERVICE) {
-   smp_ops-give_timebase = cell_give_timebase;
-   smp_ops-take_timebase = cell_take_timebase;
+   smp_ops-give_timebase = rtas_give_timebase;
+   smp_ops-take_timebase = rtas_take_timebase;
}
 
DBG( - smp_init_cell()\n);
Index: linux-work/arch/powerpc/platforms/chrp/smp.c
===
--- linux-work.orig/arch/powerpc/platforms/chrp/smp.c   2009-06-17 
11:56:31.0 +1000
+++ linux-work/arch/powerpc/platforms/chrp/smp.c2009-06-17 
11:56:42.0 +1000
@@ -26,7 +26,6 @@
 #include asm/io.h
 #include asm/prom.h
 #include asm/smp.h
-#include asm/time.h
 #include asm/machdep.h
 #include asm/mpic.h
 #include asm/rtas.h
@@ -45,37 +44,12 @@ static void __devinit smp_chrp_setup_cpu
 static DEFINE_SPINLOCK(timebase_lock);
 static unsigned int timebase_upper = 0, timebase_lower = 0;
 
-void __devinit smp_chrp_give_timebase(void)
-{
-   spin_lock(timebase_lock);
-   rtas_call(rtas_token(freeze-time-base), 0, 1, NULL);
-   timebase_upper = get_tbu();
-   timebase_lower = get_tbl();
-   spin_unlock(timebase_lock);
-
-   while (timebase_upper || timebase_lower)
-   barrier();
-   rtas_call(rtas_token(thaw-time-base), 0, 1, NULL);
-}
-
-void __devinit smp_chrp_take_timebase(void)
-{
-   while (!(timebase_upper || timebase_lower))
-   barrier();
-   spin_lock(timebase_lock);
-   set_tb(timebase_upper, timebase_lower);
-   timebase_upper = 0;
-   timebase_lower = 0;
-   spin_unlock(timebase_lock);
-   printk(CPU %i taken timebase\n, smp_processor_id());
-}
-
 /* CHRP with openpic */
 struct smp_ops_t chrp_smp_ops = {
.message_pass = smp_mpic_message_pass,
.probe = smp_mpic_probe,
.kick_cpu = smp_chrp_kick_cpu,
.setup_cpu = smp_chrp_setup_cpu,
-   .give_timebase = smp_chrp_give_timebase,
-   .take_timebase = smp_chrp_take_timebase,
+   .give_timebase = rtas_give_timebase,
+   .take_timebase = rtas_take_timebase,
 };
Index: linux-work/arch/powerpc/platforms/pseries/smp.c
===
--- linux-work.orig/arch/powerpc/platforms/pseries/smp.c2009-06-17 
11:56:31.0 +1000
+++ linux-work/arch/powerpc/platforms/pseries/smp.c 2009-06-17 
11:56:42.0 +1000
@@ -35,7 +35,6 @@
 #include asm/prom.h
 #include asm/smp.h
 #include asm/paca.h
-#include asm/time.h
 #include asm/machdep.h
 #include asm/cputable.h
 #include asm/firmware.h
@@ -118,31 +117,6 @@ static void __devinit smp_xics_setup_cpu
 }
 #endif /* CONFIG_XICS */
 
-static 

[PATCH 3/3] powerpc/pasemi: Use raw spinlock in SMP TB sync

2009-06-16 Thread Benjamin Herrenschmidt
spin_lock() can hang if called while the timebase is frozen,
so use a raw lock instead, also disable interrupts while
at it.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---

 arch/powerpc/platforms/pasemi/setup.c |   15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

--- linux-work.orig/arch/powerpc/platforms/pasemi/setup.c   2009-06-17 
12:13:33.0 +1000
+++ linux-work/arch/powerpc/platforms/pasemi/setup.c2009-06-17 
12:15:01.0 +1000
@@ -71,20 +71,25 @@ static void pas_restart(char *cmd)
 }
 
 #ifdef CONFIG_SMP
-static DEFINE_SPINLOCK(timebase_lock);
+static raw_spinlock_t timebase_lock;
 static unsigned long timebase;
 
 static void __devinit pas_give_timebase(void)
 {
-   spin_lock(timebase_lock);
+   unsigned long flags;
+
+   local_irq_save(flags);
+   hard_irq_disable();
+   __raw_spin_lock(timebase_lock);
mtspr(SPRN_TBCTL, TBCTL_FREEZE);
isync();
timebase = get_tb();
-   spin_unlock(timebase_lock);
+   __raw_spin_unlock(timebase_lock);
 
while (timebase)
barrier();
mtspr(SPRN_TBCTL, TBCTL_RESTART);
+   local_irq_restore(flags);
 }
 
 static void __devinit pas_take_timebase(void)
@@ -92,10 +97,10 @@ static void __devinit pas_take_timebase(
while (!timebase)
smp_rmb();
 
-   spin_lock(timebase_lock);
+   __raw_spin_lock(timebase_lock);
set_tb(timebase  32, timebase  0x);
timebase = 0;
-   spin_unlock(timebase_lock);
+   __raw_spin_unlock(timebase_lock);
 }
 
 struct smp_ops_t pas_smp_ops = {
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


mm: Move pgtable_cache_init() earlier

2009-06-16 Thread Benjamin Herrenschmidt
Some architectures need to initialize SLAB caches to be able
to allocate page tables. They do that from pgtable_cache_init()
so the later should be called earlier now, best is before
vmalloc_init().

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---

Note: Only powerpc, sparc and xtensa use this and only to
call kmem_cache_create() so with a bit of luck it should
just work... 

Index: linux-work/init/main.c
===
--- linux-work.orig/init/main.c 2009-06-17 13:41:33.0 +1000
+++ linux-work/init/main.c  2009-06-17 13:41:45.0 +1000
@@ -546,6 +546,7 @@ static void __init mm_init(void)
page_cgroup_init_flatmem();
mem_init();
kmem_cache_init();
+   pgtable_cache_init();
vmalloc_init();
 }
 
@@ -684,7 +685,6 @@ asmlinkage void __init start_kernel(void
late_time_init();
calibrate_delay();
pidmap_init();
-   pgtable_cache_init();
anon_vma_init();
 #ifdef CONFIG_X86
if (efi_enabled)


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: mm: Move pgtable_cache_init() earlier

2009-06-16 Thread Pekka Enberg
Hi Ben,

On Wed, Jun 17, 2009 at 6:48 AM, Benjamin
Herrenschmidtb...@kernel.crashing.org wrote:
 Some architectures need to initialize SLAB caches to be able
 to allocate page tables. They do that from pgtable_cache_init()
 so the later should be called earlier now, best is before
 vmalloc_init().

 Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org

Looks good to me!

Acked-by: Pekka Enberg penb...@cs.helsinki.fi
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: Warning in block code

2009-06-16 Thread Jens Axboe
On Wed, Jun 17 2009, Benjamin Herrenschmidt wrote:
 Hoy !
 
 I see that:
 
 block/blk-settings.c: In function ???blk_set_default_limits???:
 block/blk-settings.c:115: warning: large integer implicitly truncated to 
 unsigned type
 
 Comes from
 
   lim-bounce_pfn = BLK_BOUNCE_ANY;
 
 With BLK_BOUNCE_ANY being a
 
 include/linux/blkdev.h:#define BLK_BOUNCE_ANY (-1ULL)
 
 And struct queue_limit.bounce_pfn is:
 
   unsigned long   bounce_pfn;
 
 (The warning is fishy as both quantities are unsigned, but there -is-
 truncation happening here).

Should be safe to just make it -1UL now.

-- 
Jens Axboe

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


linux-next: kvm/powerpc tree build failure

2009-06-16 Thread Stephen Rothwell
Hi Avi, Ben,

Today's linux-next build (powerpc ppc44x_defconfig) failed like this:

cc1: warnings being treated as errors
arch/powerpc/kvm/../../../virt/kvm/kvm_main.c: In function 
'__kvm_set_memory_region':
arch/powerpc/kvm/../../../virt/kvm/kvm_main.c:1178: error: integer overflow in 
expression

Probably caused by commit ac04527f7947020c5890090b2ac87af4e98d977e (KVM:
Disable large pages on misaligned memory slots). The build fails because
arch/powerpc is now being built (mostly) with -Werror.

arch/powerpc/include/asm/kvm_host.h:#define KVM_PAGES_PER_HPAGE (131)

this needs to be (1UL  31) or ((unsigned int)1  31).

I applied the following patch for today.

From: Stephen Rothwell s...@canb.auug.org.au
Date: Wed, 17 Jun 2009 14:57:29 +1000
Subject: [PATCH] kvm/powerpc: make 32 bit constant unsigned long

KVM_PAGES_PER_HPAGE needs to be unsigned long since its value is 2^31.

Eliminates this compiler warning:

arch/powerpc/kvm/../../../virt/kvm/kvm_main.c:1178: error: integer overflow in 
expression

Signed-off-by: Stephen Rothwell s...@canb.auug.org.au
---
 arch/powerpc/include/asm/kvm_host.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 3625424..d4caa61 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -34,7 +34,7 @@
 #define KVM_COALESCED_MMIO_PAGE_OFFSET 1
 
 /* We don't currently support large pages. */
-#define KVM_PAGES_PER_HPAGE (131)
+#define KVM_PAGES_PER_HPAGE (1UL  31)
 
 struct kvm;
 struct kvm_run;
-- 
1.6.3.1

-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [Patch 2/6] Introduce PPC64 specific Hardware Breakpoint interfaces

2009-06-16 Thread David Gibson
On Wed, Jun 10, 2009 at 02:38:06PM +0530, K.Prasad wrote:
 Introduce PPC64 implementation for the generic hardware breakpoint interfaces
 defined in kernel/hw_breakpoint.c. Enable the HAVE_HW_BREAKPOINT flag and the
 Makefile.

[snip]
 +void arch_update_kernel_hw_breakpoint(void *unused)
 +{
 + struct hw_breakpoint *bp;
 +
 + /* Check if there is nothing to update */
 + if (hbp_kernel_pos == HBP_NUM)
 + return;
 +
 + per_cpu(this_hbp_kernel[hbp_kernel_pos], get_cpu()) = bp =
 + hbp_kernel[hbp_kernel_pos];
 + if (bp == NULL)
 + kdabr = 0;
 + else
 + kdabr = (bp-info.address  ~HW_BREAKPOINT_ALIGN) |
 + bp-info.type | DABR_TRANSLATION;
 + set_dabr(kdabr);
 + put_cpu_no_resched();
 +}
 +
 +/*
 + * Install the thread breakpoints in their debug registers.
 + */
 +void arch_install_thread_hw_breakpoint(struct task_struct *tsk)
 +{
 + set_dabr(tsk-thread.dabr);
 +}
 +
 +/*
 + * Clear the DABR which contains the thread-specific breakpoint address
 + */
 +void arch_uninstall_thread_hw_breakpoint()
 +{
 + set_dabr(0);
 +}
 +
 +/*
 + * Store a breakpoint's encoded address, length, and type.
 + */
 +int arch_store_info(struct hw_breakpoint *bp, struct task_struct *tsk)
 +{
 + /* Symbol names from user-space are rejected */
 + if (tsk) {
 + if (bp-info.name)
 + return -EINVAL;
 + else
 + return 0;
 + }
 + /*
 +  * User-space requests will always have the address field populated
 +  * For kernel-addresses, either the address or symbol name can be
 +  * specified.
 +  */
 + if (bp-info.name)
 + bp-info.address = (unsigned long)
 + kallsyms_lookup_name(bp-info.name);
 + if (bp-info.address)
 + if (kallsyms_lookup_size_offset(bp-info.address,
 + (bp-info.symbolsize), NULL))
 + return 0;
 + return -EINVAL;
 +}
 +
 +/*
 + * Validate the arch-specific HW Breakpoint register settings
 + */
 +int arch_validate_hwbkpt_settings(struct hw_breakpoint *bp,
 + struct task_struct *tsk)
 +{
 + int is_kernel, ret = -EINVAL;
 +
 + if (!bp)
 + return ret;
 +
 + switch (bp-info.type) {
 + case HW_BREAKPOINT_READ:
 + case HW_BREAKPOINT_WRITE:
 + case HW_BREAKPOINT_RW:
 + break;
 + default:
 + return ret;
 + }
 +
 + if (bp-triggered)
 + ret = arch_store_info(bp, tsk);

Under what circumstances would triggered be NULL?  It's not clear to
me that you wouldn't still need arch_store_info() in this case.

 +
 + is_kernel = is_kernel_addr(bp-info.address);
 + if ((tsk  is_kernel) || (!tsk  !is_kernel))
 + return -EINVAL;
 +
 + return ret;
 +}
 +
 +void arch_update_user_hw_breakpoint(int pos, struct task_struct *tsk)
 +{
 + struct thread_struct *thread = (tsk-thread);
 + struct hw_breakpoint *bp = thread-hbp[0];
 +
 + if (bp)
 + thread-dabr = (bp-info.address  ~HW_BREAKPOINT_ALIGN) |
 + bp-info.type | DABR_TRANSLATION;
 + else
 + thread-dabr = 0;
 +}
 +
 +void arch_flush_thread_hw_breakpoint(struct task_struct *tsk)
 +{
 + struct thread_struct *thread = (tsk-thread);
 +
 + thread-dabr = 0;
 +}
 +
 +/*
 + * Handle debug exception notifications.
 + */
 +int __kprobes hw_breakpoint_handler(struct die_args *args)
 +{
 + int rc = NOTIFY_STOP;
 + struct hw_breakpoint *bp;
 + struct pt_regs *regs = args-regs;
 + unsigned long dar = regs-dar;
 + int cpu, is_one_shot, stepped = 1;
 +
 + /* Disable breakpoints during exception handling */
 + set_dabr(0);
 +
 + cpu = get_cpu();
 + /* Determine whether kernel- or user-space address is the trigger */
 + bp = (hbp_kernel_pos == HBP_NUM) ? current-thread.hbp[0] :
 + per_cpu(this_hbp_kernel[0], cpu);
 + /*
 +  * bp can be NULL due to lazy debug register switching
 +  * or due to the delay between updates of hbp_kernel_pos
 +  * and this_hbp_kernel.
 +  */
 + if (!bp)
 + goto out;
 +
 + is_one_shot = (bp-triggered == ptrace_triggered) ? 1 : 0;
 + per_cpu(dabr_data, cpu) = (hbp_kernel_pos == HBP_NUM) ?
 + current-thread.dabr : kdabr;
 +
 + /* Verify if dar lies within the address range occupied by the symbol
 +  * being watched. Since we cannot get the symbol size for
 +  * user-space requests we skip this check in that case
 +  */
 + if ((hbp_kernel_pos == 0) 
 + !((bp-info.address = dar) 
 +  (dar = (bp-info.address + bp-info.symbolsize
 + /*
 +  * This exception is triggered not because of a memory 

Re: [Patch 5/6] Modify Data storage exception code to recognise DABR match first

2009-06-16 Thread David Gibson
On Wed, Jun 10, 2009 at 02:38:24PM +0530, K.Prasad wrote:
 Modify Data storage exception code to first lookout for a DABR match before
 recognising a kprobe or xmon exception.
 
 Signed-off-by: K.Prasad pra...@linux.vnet.ibm.com
 ---
  arch/powerpc/mm/fault.c |   14 ++
  1 file changed, 6 insertions(+), 8 deletions(-)
 
 Index: linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c
 ===
 --- linux-2.6-tip.hbkpt.orig/arch/powerpc/mm/fault.c
 +++ linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c
 @@ -136,6 +136,12 @@ int __kprobes do_page_fault(struct pt_re
   error_code = 0x4820;
   else
   is_write = error_code  DSISR_ISSTORE;
 +
 + if (error_code  DSISR_DABRMATCH) {
 + /* DABR match */
 + do_dabr(regs, address, error_code);
 + return 0;
 + }

Again, given the amount of work you're doing on all the breakpoint
paths, I really think you should be rewriting do_dabr(), not just
hooking in around it.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [Patch 4/6] Modify process and processor handling code to recognise hardware debug registers

2009-06-16 Thread David Gibson
On Wed, Jun 10, 2009 at 02:38:18PM +0530, K.Prasad wrote:
 Modify process handling code to recognise hardware debug registers during copy
 and flush operations. Introduce a new TIF_DEBUG task flag to indicate a
 process's use of debug register. Load the debug register values into a
 new CPU during initialisation.
 
 Signed-off-by: K.Prasad pra...@linux.vnet.ibm.com
 ---
  arch/powerpc/kernel/process.c |   15 +++
  arch/powerpc/kernel/smp.c |2 ++
  2 files changed, 17 insertions(+)
 
 Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/process.c
 ===
 --- linux-2.6-tip.hbkpt.orig/arch/powerpc/kernel/process.c
 +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/process.c
 @@ -50,6 +50,7 @@
  #include asm/syscalls.h
  #ifdef CONFIG_PPC64
  #include asm/firmware.h
 +#include asm/hw_breakpoint.h
  #endif
  #include linux/kprobes.h
  #include linux/kdebug.h
 @@ -254,8 +255,10 @@ void do_dabr(struct pt_regs *regs, unsig
   11, SIGSEGV) == NOTIFY_STOP)
   return;
  
 +#ifndef CONFIG_PPC64
   if (debugger_dabr_match(regs))
   return;
 +#endif

Won't this disable the check for breakpoints set by xmon - but I don't
see anything in this patch series to convert xmon to use the new
breakpoint interface instead.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [Patch 2/6] Introduce PPC64 specific Hardware Breakpoint interfaces

2009-06-16 Thread David Gibson
On Mon, Jun 15, 2009 at 12:48:28PM +0530, K.Prasad wrote:
 On Mon, Jun 15, 2009 at 04:40:45PM +1000, David Gibson wrote:
  On Wed, Jun 10, 2009 at 12:13:49PM +0530, K.Prasad wrote:
   On Fri, Jun 05, 2009 at 03:11:58PM +1000, David Gibson wrote:
On Wed, Jun 03, 2009 at 10:05:11PM +0530, K.Prasad wrote:
   
 + else {
 + /*
 +  * This exception is triggered not because of a memory 
 access on
 +  * the monitored variable but in the double-word 
 address range
 +  * in which it is contained. We will consume this 
 exception,
 +  * considering it as 'noise'.
 +  */
 + rc = NOTIFY_STOP;
 + goto out;
 + }
 + is_one_shot = (bp-triggered == ptrace_triggered) ? 1 : 0;

Ouch, explicitly special-casing ptrace_triggered is pretty nasty.
Since the bp_info is already arch specific, maybe it should include a
flag to indicate whether the breakpoint is one-shot or not.

   
   The reason to check for ptrace_triggered is to contain the one-shot
   behaviour only to ptrace (thus retaining the semantics) and not to extend
   them to all user-space requests through
   register_user_hw_breakpoint().
  
  Right, but couldn't you implement that withing ptrace_triggered
  itself, without a special test here, by having it cancel the
  breakpoint.
 
 A special check (either using the callback routine as above, or using a
 special flag) will be required in hw_breakpoint_handler() to enable
 early return (without single-stepping). I'm not sure if I got your
 suggestion right, and let me know if you think so.

Well.. you could also recheck after calling triggered whether the
breakpoint is still in existence.  So cancelling the breakpoint in
triggered would.

Or you could change the signature for the triggered function, so it
returns whether to leave the breakpoint active or not.  That would
have some advantages you could easily implement either one-shot,
continuous or N-shot, or keep firing until some specific event
behaviour from within the triggered function.


But all that's a bit moot, because of the fact that you try to make
the TRAP timing consistent (before or after execution of triggering
instruction) but you don't do so for a general triggered hook.

   A one-shot behaviour for all user-space requests would create more work
   for the user-space programs (such as re-registration) and will leave open
   a small window of opportunity for debug register grabbing by kernel-space
   requests.
   
   So, in effect a request through register_user_hw_breakpoint() interface
   will behave as under:
   - Single-step over the causative instruction that triggered the
 breakpoint exception handler.
   - Deliver the SIGTRAP signal to user-space after executing the causative
 instruction.
   
   This behaviour is in consonance with that of kernel-space requests and
   those on x86 processors, and helps define a consistent behaviour across
   architectures for user-space.
   
   Let me know what you think on the same.
  
  I certainly see the value in consistent semantics across archs.
  However, I can also see uses for the powerpc trap-before-execute
  behaviour.  That's why I'm suggesting it might be worth having an
  arch-specific flag.
  
  [snip]
 
 So, you suggest that the 'one-shot' behaviour should be driven by
 user-request and not just confined to ptrace? (The default behaviour for
 all breakpoints-minus-ptrace will remain 'continuous' though).

Not one-shot behaviour as such, but whether the trigger fires before
or after execution of the triggering instruction.  The complication is
that combining fire-before semantics with continuous firing becomes
tricky.

 It can be implemented through an additional flag in 'struct
 arch_hw_breakpoint'. I can send a new version 7 of the patchset with this
 change (with the hope that the version 6 of the patchset looks fine in
 its present form!). Meanwhile, we'd like to know what uses you see in
 addition to the present one if the one-shot behaviour is made
 user-defined. Are those uses beyond what can be achieved through the
 present ptrace interface?
 
 +int __kprobes single_step_dabr_instruction(struct die_args *args)
 +{
 + struct pt_regs *regs = args-regs;
 + int cpu = get_cpu();
 + int ret = NOTIFY_DONE;
 + siginfo_t info;
 + unsigned long this_dabr_data = per_cpu(dabr_data, cpu);
 +
 + /*
 +  * Check if we are single-stepping as a result of a
 +  * previous HW Breakpoint exception
 +  */
 + if (this_dabr_data == 0)
 + goto out;
 +
 + regs-msr = ~MSR_SE;
 + /* Deliver signal to user-space */
 + if (this_dabr_data  TASK_SIZE) {
 + info.si_signo = SIGTRAP;
 + info.si_errno = 0;
 + info.si_code = TRAP_HWBKPT;
 +