[PATCH V2 0/2] cpufreq/powernv: Set core pstate to a minimum just before hotplugging it out

2014-09-05 Thread Preeti U Murthy
Today cpus go to winkle when they are offlined. Since it is the deepest
idle state that we have, it is expected to save a good amount of power compared
to the online state, where cores can enter only nap/fastsleep, which are
shallower idle states.
However, we observed no power savings with winkle compared to nap/fastsleep
and traced the problem to the pstate of the core being kept high even
when the core is offline. This can keep the socket pstate high, thus burning
power unnecessarily. This patchset fixes the issue.

Changes in V2: Changed smp_call_function_any() to smp_call_function_single() in Patch[2/2]

---

Preeti U Murthy (2):
  cpufreq: Allow stop CPU callback to be used by all cpufreq drivers
  powernv/cpufreq: Set the pstate of the last hotplugged out cpu in policy->cpus to minimum


 drivers/cpufreq/cpufreq.c         |    2 +-
 drivers/cpufreq/powernv-cpufreq.c |    9 +++++++++
 2 files changed, 10 insertions(+), 1 deletion(-)

--

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH V2 1/2] cpufreq: Allow stop CPU callback to be used by all cpufreq drivers

2014-09-05 Thread Preeti U Murthy
Commit 367dc4aa932bfb3 (cpufreq: Add stop CPU callback to
cpufreq_driver interface) introduced the stop CPU callback for
the intel_pstate driver. During the CPU_DOWN_PREPARE stage, this
callback is invoked so that drivers can take some action on the
pstate of the cpu before it is taken offline. This callback was
assumed to be useful only for drivers that implement the
setpolicy callback, because they have no other way to act on
the cpufreq of a CPU that is being hotplugged out, except in the
exit callback, which is called very late in the offline process.

The drivers which implement the target/target_index callbacks were
expected to take care of requirements like the ones that commit
367dc4aa addresses in the GOV_STOP notification event. But there
are disadvantages to restricting the usage of the stop CPU callback
to cpufreq drivers that implement the setpolicy callback and that
want to take explicit action on setting the cpufreq during a
hotplug operation.

1. GOV_STOP gets called for every CPU offline, whereas drivers would usually
want to take action only when the last cpu in the policy->cpus mask
is taken offline. As long as there is more than one cpu in the
policy->cpus mask, the cpufreq core itself makes sure that the freq
for the other cpus in this mask is set according to the maximum load.
This is sensible and drivers which implement the target_index callback
would mostly not want to modify that. However, the cpufreq core leaves a
loose end when the cpu going offline is the last one in the policy->cpus mask;
it does nothing explicit to the frequency of the core. Drivers may need
a way to take some action here, and the stop CPU callback mechanism is the
best way to do it today.

2. We cannot implement driver specific actions in the GOV_STOP mechanism.
So we would need another driver callback invoked from there, which is
unnecessary.

Therefore this patch extends the usage of the stop CPU callback to
all cpufreq drivers, as long as they have this callback implemented
and irrespective of whether they are setpolicy or target_index drivers.
The assumption is that if drivers find the GOV_STOP path to be a suitable
way of implementing what they want to do with the freq of the cpu
going offline, they will not implement the stop CPU callback at all.

Signed-off-by: Preeti U Murthy pre...@linux.vnet.ibm.com
---

 drivers/cpufreq/cpufreq.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index d9fdedd..6463f35 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1380,7 +1380,7 @@ static int __cpufreq_remove_dev_prepare(struct device *dev,
            if (!cpufreq_suspended)
                pr_debug("%s: policy Kobject moved to cpu: %d from: %d\n",
                     __func__, new_cpu, cpu);
-   } else if (cpufreq_driver->stop_cpu && cpufreq_driver->setpolicy) {
+   } else if (cpufreq_driver->stop_cpu) {
        cpufreq_driver->stop_cpu(policy);
    }
 

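The effect of the one-line change above can be sketched outside the kernel.
This is only an illustration under stated assumptions: struct driver and
struct policy below are hypothetical stand-ins, not the kernel definitions,
and old_dispatch()/new_dispatch() are made-up names for the condition before
and after the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, for illustration only. */
struct policy {
    int cpu;
    int stop_calls;     /* counts how often stop_cpu ran */
};

struct driver {
    void (*setpolicy)(struct policy *p);     /* set by setpolicy-style drivers */
    void (*target_index)(struct policy *p);  /* set by target_index-style drivers */
    void (*stop_cpu)(struct policy *p);      /* optional callback */
};

static void count_stop(struct policy *p) { p->stop_calls++; }
static void noop(struct policy *p) { (void)p; }

/* Condition before the patch: stop_cpu runs only for setpolicy drivers. */
static bool old_dispatch(const struct driver *d, struct policy *p)
{
    if (d->stop_cpu && d->setpolicy) {
        d->stop_cpu(p);
        return true;
    }
    return false;
}

/* Condition after the patch: stop_cpu runs whenever it is implemented. */
static bool new_dispatch(const struct driver *d, struct policy *p)
{
    if (d->stop_cpu) {
        d->stop_cpu(p);
        return true;
    }
    return false;
}
```

With a target_index-style driver that sets stop_cpu, old_dispatch() skips the
callback and new_dispatch() invokes it. A driver that leaves stop_cpu NULL is
unaffected either way, which matches the assumption stated in the changelog.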

[PATCH V2 2/2] powernv/cpufreq: Set the pstate of the last hotplugged out cpu in policy->cpus to minimum

2014-09-05 Thread Preeti U Murthy
It's possible today that the pstate of a core is held high even after the
entire core is hotplugged out, if a load had just run on the hotplugged cpu.
This is fair, since it is assumed that the pstate does not matter to a cpu in
a deep idle state, which is the expected state of a hotplugged core on powerpc.
However, on powerpc the pstate at the socket level is held at the maximum of
the pstates of each core. Even if the pstates of the active cores on that
socket are low, the socket pstate is held high due to the pstate of the
hotplugged core in the above mentioned scenario. This can waste a significant
amount of power for no benefit.

Besides, since it is a non-active core, nothing can be done from the kernel's
end to correct the frequency of the core once it is offline. Hence make use of
the stop_cpu callback to explicitly set the pstate of the core to a minimum
when the last cpu of the core gets hotplugged out.

Signed-off-by: Preeti U Murthy pre...@linux.vnet.ibm.com
---

 drivers/cpufreq/powernv-cpufreq.c |    9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 379c083..5a628f1 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -317,6 +317,14 @@ static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
    return cpufreq_table_validate_and_show(policy, powernv_freqs);
 }
 
+static void powernv_cpufreq_stop_cpu(struct cpufreq_policy *policy)
+{
+   struct powernv_smp_call_data freq_data;
+
+   freq_data.pstate_id = powernv_pstate_info.min;
+   smp_call_function_single(policy->cpu, set_pstate, &freq_data, 1);
+}
+
 static struct cpufreq_driver powernv_cpufreq_driver = {
    .name           = "powernv-cpufreq",
    .flags          = CPUFREQ_CONST_LOOPS,
@@ -324,6 +332,7 @@ static struct cpufreq_driver powernv_cpufreq_driver = {
    .verify         = cpufreq_generic_frequency_table_verify,
    .target_index   = powernv_cpufreq_target_index,
    .get            = powernv_cpufreq_get,
+   .stop_cpu       = powernv_cpufreq_stop_cpu,
    .attr           = powernv_cpu_freq_attr,
 };
 

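The reasoning in the changelog (socket pstate is the maximum over the cores,
so one stale offline core keeps the whole socket high) can be sketched as a
small model. This is an illustration only: the pstate numbering here is an
assumption (a larger number means a higher frequency, PSTATE_MIN is the
lowest), and socket_pstate()/stop_core() are hypothetical helpers rather than
anything from the powernv driver.

```c
#include <stddef.h>

#define PSTATE_MIN 0    /* assumed lowest level in this toy model */

/* Socket level is the maximum over all cores, whether online or offline. */
static int socket_pstate(const int *core_pstate, size_t ncores)
{
    int max = PSTATE_MIN;

    for (size_t i = 0; i < ncores; i++)
        if (core_pstate[i] > max)
            max = core_pstate[i];
    return max;
}

/* Model of the new stop_cpu step: drop the departing core to the minimum. */
static void stop_core(int *core_pstate, size_t core)
{
    core_pstate[core] = PSTATE_MIN;
}
```

With cores at levels {1, 7, 2}, the socket sits at 7 for as long as the
second core keeps its stale value, even if that core is offline. After
stop_core() on it, the socket level falls to 2, which is what the active
cores actually asked for.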

Re: [PATCH V2 0/2] cpufreq/powernv: Set core pstate to a minimum just before hotplugging it out

2014-09-05 Thread Viresh Kumar
On 5 September 2014 13:09, Preeti U Murthy <pre...@linux.vnet.ibm.com> wrote:
> Today cpus go to winkle when they are offlined. Since it is the deepest
> idle state that we have, it is expected to save good amount of power as compared
> to online state, where cores can enter nap/fastsleep only which are
> shallower idle states.
> However we observed no powersavings with winkle as compared to nap/fastsleep
> and traced the problem to the pstate of the core being kept at a high even
> when the core is offline. This can keep the socket pstate high, thus burning
> power unnecessarily. This patchset fixes this issue.
>
> Changes in V2: Changed smp_call_function_any() to smp_call_function_single() in Patch[2/2]

Acked-by: Viresh Kumar <viresh.ku...@linaro.org>

Re: deb-pkg: Add support for powerpc little endian

2014-09-05 Thread Gabriel Paubert
On Fri, Sep 05, 2014 at 03:28:47PM +1000, Michael Neuling wrote:
> The Debian powerpc little endian architecture is called ppc64le.  This

Huh? ppc64le or ppc64el?

> is the default architecture used by Ubuntu for powerpc.
>
> The below checks the kernel config to see if we are compiling little
> endian and sets the Debian arch appropriately.
>
> Signed-off-by: Michael Neuling <mi...@neuling.org>
>
> diff --git a/scripts/package/builddeb b/scripts/package/builddeb
> index 35d5a58..6f4a1af 100644
> --- a/scripts/package/builddeb
> +++ b/scripts/package/builddeb
> @@ -37,7 +37,7 @@ create_package() {
>     s390*)
>         debarch=s390$(grep -q CONFIG_64BIT=y $KCONFIG_CONFIG && echo x || true) ;;
>     ppc*)
> -       debarch=powerpc ;;
> +       debarch=$(grep -q CPU_LITTLE_ENDIAN=y $KCONFIG_CONFIG && echo ppc64el || echo powerpc) ;;
>     parisc*)
>         debarch=hppa ;;
>     mips*)


Gabriel

[PATCH v2] deb-pkg: Add support for powerpc little endian

2014-09-05 Thread Michael Neuling
On Fri, 2014-09-05 at 09:13 +0200, Gabriel Paubert wrote:
> On Fri, Sep 05, 2014 at 03:28:47PM +1000, Michael Neuling wrote:
> > The Debian powerpc little endian architecture is called ppc64le.  This
>
> Huh? ppc64le or ppc64el?

ppc64el.  Commit message is wrong.  Fixed below.

Mikey


From: Michael Neuling mi...@neuling.org

deb-pkg: Add support for powerpc little endian

The Debian powerpc little endian architecture is called ppc64el.  This
is the default architecture used by Ubuntu for powerpc.

The below checks the kernel config to see if we are compiling little
endian and sets the Debian arch appropriately.

Signed-off-by: Michael Neuling mi...@neuling.org

diff --git a/scripts/package/builddeb b/scripts/package/builddeb
index 35d5a58..6f4a1af 100644
--- a/scripts/package/builddeb
+++ b/scripts/package/builddeb
@@ -37,7 +37,7 @@ create_package() {
    s390*)
        debarch=s390$(grep -q CONFIG_64BIT=y $KCONFIG_CONFIG && echo x || true) ;;
    ppc*)
-       debarch=powerpc ;;
+       debarch=$(grep -q CPU_LITTLE_ENDIAN=y $KCONFIG_CONFIG && echo ppc64el || echo powerpc) ;;
    parisc*)
        debarch=hppa ;;
    mips*)



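The selection logic in the hunk above is a plain substring test on the kernel
config. A rough C equivalent, given only as a sketch, may make the matching
behaviour easier to see. ppc_debarch() is a hypothetical helper name, and the
config text is passed in as a string instead of being read from
$KCONFIG_CONFIG.

```c
#include <string.h>

/*
 * Mirrors `grep -q CPU_LITTLE_ENDIAN=y`: an unanchored substring match,
 * so "CONFIG_CPU_LITTLE_ENDIAN=y" matches while the commented-out
 * "# CONFIG_CPU_LITTLE_ENDIAN is not set" form does not.
 */
static const char *ppc_debarch(const char *kconfig_text)
{
    return strstr(kconfig_text, "CPU_LITTLE_ENDIAN=y") ? "ppc64el" : "powerpc";
}
```

Because the pattern is unanchored, any config symbol whose name ends in
CPU_LITTLE_ENDIAN and is set to y would also select ppc64el. The shell
one-liner in the patch behaves the same way.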

RE: bit fields data tearing

2014-09-05 Thread David Laight
From: Paul E. McKenney
> On Thu, Sep 04, 2014 at 10:47:24PM -0400, Peter Hurley wrote:
> > Hi James,
> >
> > On 09/04/2014 10:11 PM, James Bottomley wrote:
> > > On Thu, 2014-09-04 at 17:17 -0700, Paul E. McKenney wrote:
> > >> +And there are anti-guarantees:
> > >> +
> > >> + (*) These guarantees do not apply to bitfields, because compilers often
> > >> +     generate code to modify these using non-atomic read-modify-write
> > >> +     sequences.  Do not attempt to use bitfields to synchronize parallel
> > >> +     algorithms.
> > >> +
> > >> + (*) Even in cases where bitfields are protected by locks, all fields
> > >> +     in a given bitfield must be protected by one lock.  If two fields
> > >> +     in a given bitfield are protected by different locks, the compiler's
> > >> +     non-atomic read-modify-write sequences can cause an update to one
> > >> +     field to corrupt the value of an adjacent field.
> > >> +
> > >> + (*) These guarantees apply only to properly aligned and sized scalar
> > >> +     variables.  "Properly sized" currently means "int" and "long",
> > >> +     because some CPU families do not support loads and stores of
> > >> +     other sizes.  ("Some CPU families" is currently believed to
> > >> +     be only Alpha 21064.  If this is actually the case, a different
> > >> +     non-guarantee is likely to be formulated.)
> > >
> > > This is a bit unclear.  Presumably you're talking about definiteness of
> > > the outcome (as in what's seen after multiple stores to the same
> > > variable).
> >
> > No, the last conditions refers to adjacent byte stores from different
> > cpu contexts (either interrupt or SMP).
> >
> > > The guarantees are only for natural width on Parisc as well,
> > > so you would get a mess if you did byte stores to adjacent memory
> > > locations.
> >
> > For a simple test like:
> >
> > struct x {
> >     long a;
> >     char b;
> >     char c;
> >     char d;
> >     char e;
> > };
> >
> > void store_bc(struct x *p) {
> >     p->b = 1;
> >     p->c = 2;
> > }
> >
> > on parisc, gcc generates separate byte stores
> >
> > void store_bc(struct x *p) {
> >    0:   34 1c 00 02     ldi 1,ret0
> >    4:   0f 5c 12 08     stb ret0,4(r26)
> >    8:   34 1c 00 04     ldi 2,ret0
> >    c:   e8 40 c0 00     bv r0(rp)
> >   10:   0f 5c 12 0a     stb ret0,5(r26)
> >
> > which appears to confirm that on parisc adjacent byte data
> > is safe from corruption by concurrent cpu updates; that is,
> >
> > CPU 0        | CPU 1
> >              |
> > p->b = 1     | p->c = 2
> >              |
> >
> > will result in p->b == 1 && p->c == 2 (assume both values
> > were 0 before the call to store_bc()).
>
> What Peter said.  I would ask for suggestions for better wording, but
> I would much rather be able to say that single-byte reads and writes
> are atomic and that aligned-short reads and writes are also atomic.
>
> Thus far, it looks like we lose only very old Alpha systems, so unless
> I hear otherwise, I update my patch to outlaw these very old systems.

People with old Alphas can run NetBSD instead, along with those who have real VAXen :-)

I've seen gcc generate 32bit accesses for 16bit structure members on arm.
It does this because of the more limited range of the offsets for the 16bit access.
OTOH I don't know if it ever did this for writes - so it may be moot.

David
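
The hazard discussed in this thread can be replayed deterministically without
threads. The sketch below is an illustration, not code from any of the
patches: two 4-bit fields packed into one byte stand in for adjacent
bitfields, and the "two CPUs" are modelled by splitting each
read-modify-write into an explicit load and store. All names are
hypothetical. The byte-store case assumes hardware with byte store
instructions, which is exactly what Alphas without BWX lack.

```c
#include <stdint.h>

/* Two 4-bit fields packed into one byte, standing in for adjacent bitfields. */
#define LO_MASK 0x0Fu
#define HI_MASK 0xF0u

/* Store half of a read-modify-write: merge a new field into a loaded copy. */
static uint8_t rmw_lo(uint8_t loaded, uint8_t v)
{
    return (uint8_t)((loaded & HI_MASK) | (v & LO_MASK));
}

static uint8_t rmw_hi(uint8_t loaded, uint8_t v)
{
    return (uint8_t)((loaded & LO_MASK) | ((v << 4) & HI_MASK));
}

/*
 * Both writers load the shared byte before either stores.  The second
 * store carries a stale copy of the other field, so the first update
 * is lost.
 */
static uint8_t racy_bitfield_update(uint8_t shared, uint8_t lo, uint8_t hi)
{
    uint8_t seen_by_a = shared;         /* writer A loads */
    uint8_t seen_by_b = shared;         /* writer B loads */

    shared = rmw_lo(seen_by_a, lo);     /* A stores its field */
    shared = rmw_hi(seen_by_b, hi);     /* B stores, clobbering A's field */
    return shared;
}

struct bytes {
    uint8_t b;
    uint8_t c;
};

/* Separate byte members: each writer touches only its own byte. */
static struct bytes byte_store_update(struct bytes s, uint8_t b, uint8_t c)
{
    s.b = b;
    s.c = c;
    return s;
}
```

Starting from zero, the racy bitfield update with lo = 1 and hi = 2 ends with
only the high field set, while the byte-member version keeps both values, in
line with the parisc observation quoted above.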


Re: bit fields data tearing

2014-09-05 Thread Michael Cree
On Thu, Sep 04, 2014 at 07:08:48PM -0700, H. Peter Anvin wrote:
> On 09/04/2014 05:59 PM, Peter Hurley wrote:
> > I have no idea how prevalent the ev56 is compared to the ev5.
> > Still we're talking about a chip that came out in 1996.
>
> Ah yes, I stand corrected.  According to Wikipedia, the affected CPUs
> were all the 2106x CPUs (EV4, EV45, LCA4, LCA45) plus the 21164 with no
> suffix (EV5).  However, we're still talking about museum pieces here.

Yes, that is correct, EV56 is the first Alpha CPU to have the byte-word
extension (BWX) CPU instructions.

It would not worry me if the kernel decided to assume atomic aligned
scalar accesses for all arches, thus terminating support for Alphas
without BWX.

The X server, ever since the libpciaccess change, does not work on
Alphas without BWX.

Debian Alpha (pretty much up to date at Debian-Ports) is still compiled
for all Alphas, i.e., without BWX.  The last attempt to start compiling
Debian Alpha with BWX, about three years ago when Alpha was kicked out
to Debian-Ports, resulted in a couple or so complaints and so got nowhere.
It's frustrating supporting the lowest common denominator, as many of
the bugs specific to Alpha can be resolved by recompiling with BWX.
The kernel no longer supporting Alphas without BWX might just be the
incentive we need to switch Debian Alpha to compiling with BWX.

Cheers
Michael.

[PATCH] pseries: Make CPU hotplug path endian safe

2014-09-05 Thread bharata . rao
From: Bharata B Rao bhar...@linux.vnet.ibm.com

- ibm,rtas-configure-connector should treat the RTAS data as big endian.
- Treat ibm,ppc-interrupt-server#s as big-endian when setting
  smp_processor_id during hotplug.

Signed-off-by: Bharata B Rao bhar...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/pseries/dlpar.c       | 10 +++++-----
 arch/powerpc/platforms/pseries/hotplug-cpu.c |  4 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index 2d0b4d6..dc55f9c 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -48,11 +48,11 @@ static struct property *dlpar_parse_cc_property(struct cc_workarea *ccwa)
    if (!prop)
        return NULL;
 
-   name = (char *)ccwa + ccwa->name_offset;
+   name = (char *)ccwa + be32_to_cpu(ccwa->name_offset);
    prop->name = kstrdup(name, GFP_KERNEL);
 
-   prop->length = ccwa->prop_length;
-   value = (char *)ccwa + ccwa->prop_offset;
+   prop->length = be32_to_cpu(ccwa->prop_length);
+   value = (char *)ccwa + be32_to_cpu(ccwa->prop_offset);
    prop->value = kmemdup(value, prop->length, GFP_KERNEL);
    if (!prop->value) {
        dlpar_free_cc_property(prop);
@@ -78,7 +78,7 @@ static struct device_node *dlpar_parse_cc_node(struct cc_workarea *ccwa,
    if (!dn)
        return NULL;
 
-   name = (char *)ccwa + ccwa->name_offset;
+   name = (char *)ccwa + be32_to_cpu(ccwa->name_offset);
    dn->full_name = kasprintf(GFP_KERNEL, "%s/%s", path, name);
    if (!dn->full_name) {
        kfree(dn);
@@ -148,7 +148,7 @@ struct device_node *dlpar_configure_connector(u32 drc_index,
        return NULL;
 
    ccwa = (struct cc_workarea *)&data_buf[0];
-   ccwa->drc_index = drc_index;
+   ccwa->drc_index = cpu_to_be32(drc_index);
    ccwa->zero = 0;
 
    do {
diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index 20d6297..447f8c6 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -247,7 +247,7 @@ static int pseries_add_processor(struct device_node *np)
    unsigned int cpu;
    cpumask_var_t candidate_mask, tmp;
    int err = -ENOSPC, len, nthreads, i;
-   const u32 *intserv;
+   const __be32 *intserv;
 
    intserv = of_get_property(np, "ibm,ppc-interrupt-server#s", &len);
    if (!intserv)
@@ -293,7 +293,7 @@ static int pseries_add_processor(struct device_node *np)
    for_each_cpu(cpu, tmp) {
        BUG_ON(cpu_present(cpu));
        set_cpu_present(cpu, true);
-       set_hard_smp_processor_id(cpu, *intserv++);
+       set_hard_smp_processor_id(cpu, be32_to_cpu(*intserv++));
    }
    err = 0;
 out_unlock:
-- 
1.7.11.7
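
For readers outside the kernel, the conversions used in this patch can be
sketched with portable byte arithmetic. This is an illustration only:
be32_decode() and be32_encode() are hypothetical helpers that play the role
of be32_to_cpu() and cpu_to_be32() on a raw byte buffer, and they give the
same answer on little-endian and big-endian hosts.

```c
#include <stdint.h>

/* Decode a big-endian 32-bit value from raw bytes, whatever the host order. */
static uint32_t be32_decode(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Encode a host value as big-endian bytes; the inverse of be32_decode(). */
static void be32_encode(uint32_t v, uint8_t *p)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}
```

Reading the four bytes 00 00 01 02 through a plain u32 pointer on a
little-endian host would yield 0x02010000 rather than 0x102. That is the kind
of misread offset or server number the be32_to_cpu() calls in the patch
avoid.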


[PATCH v1 05/21] PCI/MSI: Introduce weak arch_find_msi_chip() to find MSI chip

2014-09-05 Thread Yijing Wang
Introduce weak arch_find_msi_chip() to find the matching msi_chip.
Currently, the MSI chip framework associates a pci bus with an msi_chip,
because on ARM platforms there may be more than one MSI controller in the
system. Associating the pci bus with an msi_chip helps a pci device find the
matching msi_chip and set up its MSI/MSI-X irq correctly. But on other
platforms, like x86, we only need one MSI chip, because all devices use
the same MSI address/data and irq etc. So there is no need to associate
the pci bus with an MSI chip; just use an arch function, arch_find_msi_chip(),
to return the MSI chip for simplicity. The default weak
arch_find_msi_chip(), used on ARM platforms, finds the MSI chip
by pci bus.

Signed-off-by: Yijing Wang wangyij...@huawei.com
---
 drivers/pci/msi.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index a77e7f7..539c11d 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -29,9 +29,14 @@ static int pci_msi_enable = 1;
 
 /* Arch hooks */
 
+struct msi_chip * __weak arch_find_msi_chip(struct pci_dev *dev)
+{
+   return dev->bus->msi;
+}
+
 int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
 {
-   struct msi_chip *chip = dev->bus->msi;
+   struct msi_chip *chip = arch_find_msi_chip(dev);
    int err;
 
    if (!chip || !chip->setup_irq)
-- 
1.7.1
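
The weak-default pattern used by this patch can be tried in user space. This
is a sketch under stated assumptions: it relies on the GCC/Clang
__attribute__((weak)) extension (the kernel's __weak expands to it), the
structs are hypothetical stand-ins for the kernel types, and -1 is a
placeholder for the kernel's error code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types. */
struct msi_chip { const char *name; };
struct pci_bus  { struct msi_chip *msi; };
struct pci_dev  { struct pci_bus *bus; };

/*
 * Weak default: look the chip up through the device's bus.  A strong
 * definition of the same symbol in another object file would replace
 * this one at link time, with no change needed in the caller.
 */
__attribute__((weak)) struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
{
    return dev->bus->msi;
}

/* The caller does not care which definition was linked in. */
static int setup_msi_irq(struct pci_dev *dev)
{
    struct msi_chip *chip = arch_find_msi_chip(dev);

    return chip ? 0 : -1;   /* placeholder for the kernel's error return */
}
```

An architecture with a single MSI chip would supply its own non-weak
arch_find_msi_chip() returning that chip, and the bus pointer would then no
longer be consulted.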


[PATCH v1 03/21] MSI: Remove the redundant irq_set_chip_data()

2014-09-05 Thread Yijing Wang
Currently, pcie-designware, pcie-rcar, pci-tegra drivers
use irq chip_data to save the msi_chip pointer. They
already call irq_set_chip_data() in their own MSI irq map
functions. So irq_set_chip_data() in arch_setup_msi_irq()
is useless.

Signed-off-by: Yijing Wang wangyij...@huawei.com
---
 drivers/pci/msi.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index f6cb317..d547f7f 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -41,8 +41,6 @@ int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
    if (err < 0)
        return err;
 
-   irq_set_chip_data(desc->irq, chip);
-
    return 0;
 }
 
-- 
1.7.1


[PATCH v1 04/21] x86/xen/MSI: Eliminate arch_msix_mask_irq() and arch_msi_mask_irq()

2014-09-05 Thread Yijing Wang
Commit 0e4ccb150 added two __weak arch functions, arch_msix_mask_irq()
and arch_msi_mask_irq(), to fix a bug found when running xen on x86.
Introducing these two functions made the MSI code more complex. Moreover,
mask/unmask are irq actions that belong to the interrupt controller, so weak
arch functions should not be used to override the mask/unmask interfaces.
This patch reverts commit 0e4ccb150, exports struct irq_chip msi_chip, and
modifies msi_chip->irq_mask/irq_unmask() in the xen init functions to fix
this bug more simply. This is also preparation for using struct
msi_chip instead of weak arch MSI functions on all platforms.

Signed-off-by: Yijing Wang wangyij...@huawei.com
CC: Konrad Rzeszutek Wilk konrad.w...@oracle.com
---
 arch/x86/include/asm/apic.h |4 
 arch/x86/include/asm/x86_init.h |3 ---
 arch/x86/kernel/apic/io_apic.c  |2 +-
 arch/x86/kernel/x86_init.c  |   10 --
 arch/x86/pci/xen.c  |   16 ++--
 drivers/pci/msi.c   |   22 ++
 include/linux/msi.h |4 ++--
 7 files changed, 19 insertions(+), 42 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 465b309..47a5f94 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -43,6 +43,10 @@ static inline void generic_apic_probe(void)
 }
 #endif
 
+#ifdef CONFIG_PCI_MSI
+extern struct irq_chip msi_chip;
+#endif
+
 #ifdef CONFIG_X86_LOCAL_APIC
 
 extern unsigned int apic_verbosity;
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index e45e4da..f58a9c7 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -172,7 +172,6 @@ struct x86_platform_ops {
 
 struct pci_dev;
 struct msi_msg;
-struct msi_desc;
 
 struct x86_msi_ops {
int (*setup_msi_irqs)(struct pci_dev *dev, int nvec, int type);
@@ -183,8 +182,6 @@ struct x86_msi_ops {
void (*teardown_msi_irqs)(struct pci_dev *dev);
void (*restore_msi_irqs)(struct pci_dev *dev);
int  (*setup_hpet_msi)(unsigned int irq, unsigned int id);
-   u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag);
-   u32 (*msix_mask_irq)(struct msi_desc *desc, u32 flag);
 };
 
 struct IO_APIC_route_entry;
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index e877cfb..2a2ec28 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3161,7 +3161,7 @@ msi_set_affinity(struct irq_data *data, const struct cpumask *mask, bool force)
  * IRQ Chip for MSI PCI/PCI-X/PCI-Express Devices,
  * which implement the MSI or MSI-X Capability Structure.
  */
-static struct irq_chip msi_chip = {
+struct irq_chip msi_chip = {
    .name           = "PCI-MSI",
    .irq_unmask     = unmask_msi_irq,
    .irq_mask       = mask_msi_irq,
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index e48b674..234b072 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -116,8 +116,6 @@ struct x86_msi_ops x86_msi = {
.teardown_msi_irqs  = default_teardown_msi_irqs,
.restore_msi_irqs   = default_restore_msi_irqs,
.setup_hpet_msi = default_setup_hpet_msi,
-   .msi_mask_irq   = default_msi_mask_irq,
-   .msix_mask_irq  = default_msix_mask_irq,
 };
 
 /* MSI arch specific hooks */
@@ -140,14 +138,6 @@ void arch_restore_msi_irqs(struct pci_dev *dev)
 {
x86_msi.restore_msi_irqs(dev);
 }
-u32 arch_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
-{
-   return x86_msi.msi_mask_irq(desc, mask, flag);
-}
-u32 arch_msix_mask_irq(struct msi_desc *desc, u32 flag)
-{
-   return x86_msi.msix_mask_irq(desc, flag);
-}
 #endif
 
 struct x86_io_apic_ops x86_io_apic_ops = {
diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index ad03739..84c2fce 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -394,13 +394,9 @@ static void xen_teardown_msi_irq(unsigned int irq)
 {
xen_destroy_irq(irq);
 }
-static u32 xen_nop_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
-{
-   return 0;
-}
-static u32 xen_nop_msix_mask_irq(struct msi_desc *desc, u32 flag)
+
+void xen_nop_msi_mask(struct irq_data *data)
 {
-   return 0;
 }
 #endif
 
@@ -425,8 +421,8 @@ int __init pci_xen_init(void)
x86_msi.setup_msi_irqs = xen_setup_msi_irqs;
x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
-   x86_msi.msi_mask_irq = xen_nop_msi_mask_irq;
-   x86_msi.msix_mask_irq = xen_nop_msix_mask_irq;
+   msi_chip.irq_mask = xen_nop_msi_mask;
+   msi_chip.irq_unmask = xen_nop_msi_mask;
 #endif
return 0;
 }
@@ -506,8 +502,8 @@ int __init pci_xen_initial_domain(void)
x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs;
x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
x86_msi.restore_msi_irqs = 

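The approach taken by the patch above (export the irq chip and let platform
init code swap its mask/unmask callbacks for no-ops) can be modelled in a few
lines. This is a sketch with hypothetical names: demo_msi_chip,
native_mask(), nop_mask() and demo_platform_init() are stand-ins for
illustration, not the x86 or xen symbols.

```c
#include <stddef.h>

struct irq_data { int masked; };

struct irq_chip {
    const char *name;
    void (*irq_mask)(struct irq_data *d);
    void (*irq_unmask)(struct irq_data *d);
};

static void native_mask(struct irq_data *d)   { d->masked = 1; }
static void native_unmask(struct irq_data *d) { d->masked = 0; }

/* Non-static, so platform init code elsewhere can patch the callbacks. */
struct irq_chip demo_msi_chip = { "PCI-MSI", native_mask, native_unmask };

static void nop_mask(struct irq_data *d) { (void)d; }

/* Platform init in the style of pci_xen_init(): make both callbacks no-ops. */
static void demo_platform_init(void)
{
    demo_msi_chip.irq_mask = nop_mask;
    demo_msi_chip.irq_unmask = nop_mask;
}
```

Callers keep going through the chip's function pointers, so no extra weak
arch hook is needed to change the behaviour. The cost is that the chip
structure becomes writable global state shared across the platform code.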
[PATCH v1 00/21] Use MSI chip to configure MSI/MSI-X in all platforms

2014-09-05 Thread Yijing Wang
This series is based on Bjorn's pci-next branch plus Alexander Gordeev's two patches
"Remove arch_msi_check_device()", link: https://lkml.org/lkml/2014/7/12/41

Currently, there are a lot of weak arch functions in the MSI code.
Thierry Reding introduced the MSI chip framework to configure MSI/MSI-X on arm.
This series uses the MSI chip framework to refactor the MSI code across all
platforms and eliminate the weak arch functions. It has been tested on x86
(with and without irq remapping).


RFC->v1: Updated [patch 4/21] "x86/xen/MSI: Eliminate...", export msi_chip instead
         of #ifdef to fix MSI bug in xen running in x86.
         Rename arch_get_match_msi_chip() to arch_find_msi_chip().
         Drop use struct device as the msi_chip argument, we will do that
         later in another patchset.

Yijing Wang (21):
  PCI/MSI: Clean up struct msi_chip argument
  PCI/MSI: Remove useless bus-msi assignment
  MSI: Remove the redundant irq_set_chip_data()
  x86/xen/MSI: Eliminate arch_msix_mask_irq() and arch_msi_mask_irq()
  PCI/MSI: Introduce weak arch_find_msi_chip() to find MSI chip
  PCI/MSI: Refactor struct msi_chip to make it become more common
  x86/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  x86/xen/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  Irq_remapping/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  x86/MSI: Remove unused MSI weak arch functions
  MIPS/Octeon/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  MIPS/Xlp: Remove the dead function destroy_irq() to fix build error
  MIPS/Xlp/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  MIPS/Xlr/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  Powerpc/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  s390/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  arm/iop13xx/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  IA64/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  Sparc/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  tile/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  PCI/MSI: Clean up unused MSI arch functions

 arch/arm/mach-iop13xx/include/mach/pci.h |2 +
 arch/arm/mach-iop13xx/iq81340mc.c|1 +
 arch/arm/mach-iop13xx/iq81340sc.c|1 +
 arch/arm/mach-iop13xx/msi.c  |9 ++-
 arch/arm/mach-iop13xx/pci.c  |6 ++
 arch/ia64/kernel/msi_ia64.c  |   18 -
 arch/mips/pci/msi-octeon.c   |   35 +---
 arch/mips/pci/msi-xlp.c  |   18 +++-
 arch/mips/pci/pci-xlr.c  |   15 +++-
 arch/powerpc/kernel/msi.c|   14 +++-
 arch/s390/pci/pci.c  |   18 -
 arch/sparc/kernel/pci.c  |   14 +++-
 arch/tile/kernel/pci_gx.c|   14 +++-
 arch/x86/include/asm/apic.h  |4 +
 arch/x86/include/asm/pci.h   |4 +-
 arch/x86/include/asm/x86_init.h  |7 --
 arch/x86/kernel/apic/io_apic.c   |   16 -
 arch/x86/kernel/x86_init.c   |   34 
 arch/x86/pci/xen.c   |   60 +--
 drivers/iommu/irq_remapping.c|9 ++-
 drivers/irqchip/irq-armada-370-xp.c  |   12 +--
 drivers/pci/host/pci-tegra.c |8 +-
 drivers/pci/host/pcie-designware.c   |4 +-
 drivers/pci/host/pcie-rcar.c |8 +-
 drivers/pci/msi.c|  123 +-
 drivers/pci/probe.c  |1 -
 include/linux/msi.h  |   26 ++-
 27 files changed, 268 insertions(+), 213 deletions(-)


[PATCH v1 10/21] x86/MSI: Remove unused MSI weak arch functions

2014-09-05 Thread Yijing Wang
Now we can clean up MSI weak arch functions in x86.

Signed-off-by: Yijing Wang wangyij...@huawei.com
---
 arch/x86/include/asm/pci.h  |3 ---
 arch/x86/include/asm/x86_init.h |4 
 arch/x86/kernel/apic/io_apic.c  |2 +-
 arch/x86/kernel/x86_init.c  |   24 
 drivers/iommu/irq_remapping.c   |1 -
 5 files changed, 1 insertions(+), 33 deletions(-)

diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index 878a06d..34f9676 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -96,14 +96,11 @@ extern void pci_iommu_alloc(void);
 #ifdef CONFIG_PCI_MSI
 /* implemented in arch/x86/kernel/apic/io_apic. */
 struct msi_desc;
-int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
 void native_teardown_msi_irq(unsigned int irq);
-void native_restore_msi_irqs(struct pci_dev *dev);
 int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
  unsigned int irq_base, unsigned int irq_offset);
 extern struct msi_chip *x86_msi_chip;
 #else
-#define native_setup_msi_irqs  NULL
 #define native_teardown_msi_irq NULL
 #endif
 
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index f58a9c7..2514f67 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -174,13 +174,9 @@ struct pci_dev;
 struct msi_msg;
 
 struct x86_msi_ops {
-   int (*setup_msi_irqs)(struct pci_dev *dev, int nvec, int type);
void (*compose_msi_msg)(struct pci_dev *dev, unsigned int irq,
unsigned int dest, struct msi_msg *msg,
   u8 hpet_id);
-   void (*teardown_msi_irq)(unsigned int irq);
-   void (*teardown_msi_irqs)(struct pci_dev *dev);
-   void (*restore_msi_irqs)(struct pci_dev *dev);
int  (*setup_hpet_msi)(unsigned int irq, unsigned int id);
 };
 
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 882b95e..f998192 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3200,7 +3200,7 @@ int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
return 0;
 }
 
-int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
+static int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
struct msi_desc *msidesc;
unsigned int irq;
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index 234b072..cc32568 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -110,34 +110,10 @@ EXPORT_SYMBOL_GPL(x86_platform);
 
 #if defined(CONFIG_PCI_MSI)
 struct x86_msi_ops x86_msi = {
-   .setup_msi_irqs = native_setup_msi_irqs,
.compose_msi_msg= native_compose_msi_msg,
-   .teardown_msi_irq   = native_teardown_msi_irq,
-   .teardown_msi_irqs  = default_teardown_msi_irqs,
-   .restore_msi_irqs   = default_restore_msi_irqs,
.setup_hpet_msi = default_setup_hpet_msi,
 };
 
-/* MSI arch specific hooks */
-int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
-{
-   return x86_msi.setup_msi_irqs(dev, nvec, type);
-}
-
-void arch_teardown_msi_irqs(struct pci_dev *dev)
-{
-   x86_msi.teardown_msi_irqs(dev);
-}
-
-void arch_teardown_msi_irq(unsigned int irq)
-{
-   x86_msi.teardown_msi_irq(irq);
-}
-
-void arch_restore_msi_irqs(struct pci_dev *dev)
-{
-   x86_msi.restore_msi_irqs(dev);
-}
 #endif
 
 struct x86_io_apic_ops x86_io_apic_ops = {
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index e75026e..99b1c0f 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -170,7 +170,6 @@ static void __init irq_remapping_modify_x86_ops(void)
x86_io_apic_ops.set_affinity= set_remapped_irq_affinity;
x86_io_apic_ops.setup_entry = setup_ioapic_remapped_entry;
x86_io_apic_ops.eoi_ioapic_pin  = eoi_ioapic_pin_remapped;
-   x86_msi.setup_msi_irqs  = irq_remapping_setup_msi_irqs;
x86_msi.setup_hpet_msi  = setup_hpet_msi_remapped;
x86_msi.compose_msi_msg = compose_remapped_msi_msg;
x86_msi_chip = remap_msi_chip;
-- 
1.7.1


[PATCH v1 11/21] MIPS/Octeon/MSI: Use MSI chip framework to configure MSI/MSI-X irq

2014-09-05 Thread Yijing Wang
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang wangyij...@huawei.com
---
 arch/mips/pci/msi-octeon.c |   35 ++-
 1 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/arch/mips/pci/msi-octeon.c b/arch/mips/pci/msi-octeon.c
index ab0c5d1..0335d75 100644
--- a/arch/mips/pci/msi-octeon.c
+++ b/arch/mips/pci/msi-octeon.c
@@ -57,7 +57,7 @@ static int msi_irq_size;
  *
  * Returns 0 on success.
  */
-int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
+static int octeon_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
 {
struct msi_msg msg;
u16 control;
@@ -133,12 +133,12 @@ msi_irq_allocated:
/* Make sure the search for available interrupts didn't fail */
if (irq >= 64) {
if (request_private_bits) {
-   pr_err("arch_setup_msi_irq: Unable to find %d free interrupts, trying just one",
+   pr_err("octeon_setup_msi_irq: Unable to find %d free interrupts, trying just one",
   1 << request_private_bits);
request_private_bits = 0;
goto try_only_one;
} else
-   panic("arch_setup_msi_irq: Unable to find a free MSI interrupt");
+   panic("octeon_setup_msi_irq: Unable to find a free MSI interrupt");
}
 
/* MSI interrupts start at logical IRQ OCTEON_IRQ_MSI_BIT0 */
@@ -169,7 +169,7 @@ msi_irq_allocated:
msg.address_hi = (0 + CVMX_SLI_PCIE_MSI_RCV) >> 32;
break;
default:
-   panic("arch_setup_msi_irq: Invalid octeon_dma_bar_type");
+   panic("octeon_setup_msi_irq: Invalid octeon_dma_bar_type");
}
msg.data = irq - OCTEON_IRQ_MSI_BIT0;
 
@@ -184,7 +184,7 @@ msi_irq_allocated:
return 0;
 }
 
-int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
+static int octeon_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
struct msi_desc *entry;
int ret;
@@ -203,7 +203,7 @@ int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
return 1;
 
list_for_each_entry(entry, &dev->msi_list, list) {
-   ret = arch_setup_msi_irq(dev, entry);
+   ret = octeon_setup_msi_irq(dev, entry);
if (ret < 0)
return ret;
if (ret > 0)
@@ -212,14 +212,13 @@ int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 
return 0;
 }
-
 /**
  * Called when a device no longer needs its MSI interrupts. All
  * MSI interrupts for the device are freed.
  *
  * @irq:The devices first irq number. There may be multple in sequence.
  */
-void arch_teardown_msi_irq(unsigned int irq)
+static void octeon_teardown_msi_irq(unsigned int irq)
 {
int number_irqs;
u64 bitmask;
@@ -228,8 +227,8 @@ void arch_teardown_msi_irq(unsigned int irq)
 
if ((irq < OCTEON_IRQ_MSI_BIT0)
|| (irq > msi_irq_size + OCTEON_IRQ_MSI_BIT0))
-   panic("arch_teardown_msi_irq: Attempted to teardown illegal "
- "MSI interrupt (%d)", irq);
+   panic("octeon_teardown_msi_irq: Attempted to teardown illegal "
+   "MSI interrupt (%d)", irq);
 
irq -= OCTEON_IRQ_MSI_BIT0;
index = irq / 64;
@@ -242,7 +241,7 @@ void arch_teardown_msi_irq(unsigned int irq)
 */
number_irqs = 0;
while ((irq0 + number_irqs < 64) &&
-  (msi_multiple_irq_bitmask[index]
+   (msi_multiple_irq_bitmask[index]
 & (1ull << (irq0 + number_irqs))))
number_irqs++;
number_irqs++;
@@ -251,8 +250,8 @@ void arch_teardown_msi_irq(unsigned int irq)
/* Shift the mask to the correct bit location */
bitmask <<= irq0;
if ((msi_free_irq_bitmask[index] & bitmask) != bitmask)
-   panic("arch_teardown_msi_irq: Attempted to teardown MSI "
- "interrupt (%d) not in use", irq);
+   panic("octeon_teardown_msi_irq: Attempted to teardown MSI "
+   "interrupt (%d) not in use", irq);
 
/* Checks are done, update the in use bitmask */
spin_lock(&msi_free_irq_bitmask_lock);
@@ -261,6 +260,16 @@ void arch_teardown_msi_irq(unsigned int irq)
spin_unlock(&msi_free_irq_bitmask_lock);
 }
 
+static struct msi_chip octeon_msi_chip = {
+   .setup_irqs = octeon_setup_msi_irqs,
+   .teardown_irq = octeon_teardown_msi_irq,
+};
+
+struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
+{
+   return &octeon_msi_chip;
+}
+
 static DEFINE_RAW_SPINLOCK(octeon_irq_msi_lock);
 
 static u64 msi_rcv_reg[4];
-- 
1.7.1

[PATCH v1 09/21] Irq_remapping/MSI: Use MSI chip framework to configure MSI/MSI-X irq

2014-09-05 Thread Yijing Wang
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang wangyij...@huawei.com
---
 drivers/iommu/irq_remapping.c |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 33c4395..e75026e 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -148,6 +148,11 @@ static int irq_remapping_setup_msi_irqs(struct pci_dev *dev,
return do_setup_msix_irqs(dev, nvec);
 }
 
+static struct msi_chip remap_msi_chip = {
+   .setup_irqs = irq_remapping_setup_msi_irqs,
+   .teardown_irq = native_teardown_msi_irq,
+};
+
 static void eoi_ioapic_pin_remapped(int apic, int pin, int vector)
 {
/*
@@ -165,9 +170,10 @@ static void __init irq_remapping_modify_x86_ops(void)
x86_io_apic_ops.set_affinity= set_remapped_irq_affinity;
x86_io_apic_ops.setup_entry = setup_ioapic_remapped_entry;
x86_io_apic_ops.eoi_ioapic_pin  = eoi_ioapic_pin_remapped;
-   x86_msi.setup_msi_irqs  = irq_remapping_setup_msi_irqs;
+   x86_msi.setup_msi_irqs  = irq_remapping_setup_msi_irqs;
x86_msi.setup_hpet_msi  = setup_hpet_msi_remapped;
x86_msi.compose_msi_msg = compose_remapped_msi_msg;
+   x86_msi_chip = &remap_msi_chip;
 }
 
 static __init int setup_nointremap(char *str)
-- 
1.7.1

[PATCH v1 07/21] x86/MSI: Use MSI chip framework to configure MSI/MSI-X irq

2014-09-05 Thread Yijing Wang
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang wangyij...@huawei.com
---
 arch/x86/include/asm/pci.h |1 +
 arch/x86/kernel/apic/io_apic.c |   12 
 2 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index 0892ea0..878a06d 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -101,6 +101,7 @@ void native_teardown_msi_irq(unsigned int irq);
 void native_restore_msi_irqs(struct pci_dev *dev);
 int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
  unsigned int irq_base, unsigned int irq_offset);
+extern struct msi_chip *x86_msi_chip;
 #else
 #define native_setup_msi_irqs  NULL
 #define native_teardown_msi_irq  NULL
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 2a2ec28..882b95e 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3337,6 +3337,18 @@ int default_setup_hpet_msi(unsigned int irq, unsigned int id)
 }
 #endif
 
+struct msi_chip apic_msi_chip = {
+   .setup_irqs = native_setup_msi_irqs,
+   .teardown_irq = native_teardown_msi_irq,
+};
+
+struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
+{
+   return x86_msi_chip;
+}
+
+struct msi_chip *x86_msi_chip = &apic_msi_chip;
+
 #endif /* CONFIG_PCI_MSI */
 /*
  * Hypertransport interrupt support
-- 
1.7.1

[PATCH v1 06/21] PCI/MSI: Refactor struct msi_chip to make it become more common

2014-09-05 Thread Yijing Wang
The MSI code currently contains many __weak arch functions, which
make the MSI driver complex. Thierry Reding introduced a new MSI
chip framework to configure MSI/MSI-X irqs on ARM. Use that
framework to refactor the MSI arch code of all other platforms and
eliminate the weak arch MSI functions. This patch adds .setup_irqs(),
.teardown_irqs() and .restore_irqs() to struct msi_chip to make it
more generic.
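
The dispatch order this refactoring aims for can be sketched in user
space as below. This is only an illustration under stated assumptions:
every type and name in it (msi_chip_model, pci_dev_model,
model_setup_msi_irqs, the MODEL_* error values and the counting hooks)
is a hypothetical stand-in, not the kernel's definition, and the
multiple-MSI check of the real function is left out for brevity.

```c
/*
 * User-space model of "prefer the whole-device hook, otherwise fall
 * back to one call per descriptor". All names are hypothetical.
 */
#include <assert.h>
#include <stddef.h>

#define MODEL_EINVAL 22
#define MODEL_ENOSPC 28

struct pci_dev_model {
	int nr_desc;	/* number of MSI descriptors on the device */
};

struct msi_chip_model {
	/* per-descriptor hook, as in the existing chip framework */
	int (*setup_irq)(struct pci_dev_model *dev, int desc_index);
	/* optional whole-device hook, modelled on .setup_irqs() */
	int (*setup_irqs)(struct pci_dev_model *dev, int nvec, int type);
};

static int model_setup_msi_irqs(struct msi_chip_model *chip,
				struct pci_dev_model *dev, int nvec, int type)
{
	int i, ret;

	assert(dev != NULL);

	/* A chip that can set up all vectors at once takes precedence. */
	if (chip && chip->setup_irqs)
		return chip->setup_irqs(dev, nvec, type);

	/* Without any usable hook there is nothing to fall back to. */
	if (!chip || !chip->setup_irq)
		return -MODEL_EINVAL;

	/* Fallback: one call per descriptor, stopping on the first error. */
	for (i = 0; i < dev->nr_desc; i++) {
		ret = chip->setup_irq(dev, i);
		if (ret < 0)
			return ret;
		if (ret > 0)
			return -MODEL_ENOSPC;
	}
	return 0;
}

/* Counting hooks, only there to make the chosen path observable. */
static int calls_single, calls_bulk;

static int count_single(struct pci_dev_model *dev, int desc_index)
{
	(void)dev;
	(void)desc_index;
	calls_single++;
	return 0;
}

static int count_bulk(struct pci_dev_model *dev, int nvec, int type)
{
	(void)dev;
	(void)nvec;
	(void)type;
	calls_bulk++;
	return 0;
}
```

A chip that only fills in the per-descriptor hook is called once per
descriptor, while a chip that also provides the whole-device hook is
called exactly once for the device.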

Signed-off-by: Yijing Wang wangyij...@huawei.com
---
 drivers/pci/msi.c   |   15 +++
 include/linux/msi.h |3 +++
 2 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 539c11d..d78d637 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -63,6 +63,11 @@ int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
struct msi_desc *entry;
int ret;
+   struct msi_chip *chip;
+
+   chip = arch_find_msi_chip(dev);
+   if (chip && chip->setup_irqs)
+   return chip->setup_irqs(dev, nvec, type);
 
/*
 * If an architecture wants to support multiple MSI, it needs to
@@ -105,6 +110,11 @@ void default_teardown_msi_irqs(struct pci_dev *dev)
 
 void __weak arch_teardown_msi_irqs(struct pci_dev *dev)
 {
+   struct msi_chip *chip = arch_find_msi_chip(dev);
+
+   if (chip && chip->teardown_irqs)
+   return chip->teardown_irqs(dev);
+
return default_teardown_msi_irqs(dev);
 }
 
@@ -128,6 +138,11 @@ static void default_restore_msi_irq(struct pci_dev *dev, int irq)
 
 void __weak arch_restore_msi_irqs(struct pci_dev *dev)
 {
+   struct msi_chip *chip = arch_find_msi_chip(dev);
+
+   if (chip && chip->restore_irqs)
+   return chip->restore_irqs(dev);
+
return default_restore_msi_irqs(dev);
 }
 
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 5650848..92a51e7 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -72,7 +72,10 @@ struct msi_chip {
struct list_head list;
 
int (*setup_irq)(struct pci_dev *dev, struct msi_desc *desc);
+   int (*setup_irqs)(struct pci_dev *dev, int nvec, int type);
void (*teardown_irq)(unsigned int irq);
+   void (*teardown_irqs)(struct pci_dev *dev);
+   void (*restore_irqs)(struct pci_dev *dev);
 };
 
 #endif /* LINUX_MSI_H */
-- 
1.7.1

[PATCH v1 08/21] x86/xen/MSI: Use MSI chip framework to configure MSI/MSI-X irq

2014-09-05 Thread Yijing Wang
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang wangyij...@huawei.com
CC: Konrad Rzeszutek Wilk konrad.w...@oracle.com
---
 arch/x86/pci/xen.c |   46 ++
 1 files changed, 30 insertions(+), 16 deletions(-)

diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index 84c2fce..e669ee4 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -376,6 +376,11 @@ static void xen_initdom_restore_msi_irqs(struct pci_dev *dev)
 }
 #endif
 
+static void xen_teardown_msi_irq(unsigned int irq)
+{
+   xen_destroy_irq(irq);
+}
+
 static void xen_teardown_msi_irqs(struct pci_dev *dev)
 {
struct msi_desc *msidesc;
@@ -385,19 +390,26 @@ static void xen_teardown_msi_irqs(struct pci_dev *dev)
xen_pci_frontend_disable_msix(dev);
else
xen_pci_frontend_disable_msi(dev);
-
-   /* Free the IRQ's and the msidesc using the generic code. */
-   default_teardown_msi_irqs(dev);
-}
-
-static void xen_teardown_msi_irq(unsigned int irq)
-{
-   xen_destroy_irq(irq);
+
+   list_for_each_entry(msidesc, &dev->msi_list, list) {
+   int i, nvec;
+   if (msidesc->irq == 0)
+   continue;
+   if (msidesc->nvec_used)
+   nvec = msidesc->nvec_used;
+   else
+   nvec = 1 << msidesc->msi_attrib.multiple;
+   for (i = 0; i < nvec; i++)
+   xen_teardown_msi_irq(msidesc->irq + i);
+   }
 }
 
 void xen_nop_msi_mask(struct irq_data *data)
 {
 }
+
+struct msi_chip xen_msi_chip;
+
 #endif
 
 int __init pci_xen_init(void)
@@ -418,9 +430,9 @@ int __init pci_xen_init(void)
 #endif
 
 #ifdef CONFIG_PCI_MSI
-   x86_msi.setup_msi_irqs = xen_setup_msi_irqs;
-   x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
-   x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
+   xen_msi_chip.setup_irqs = xen_setup_msi_irqs;
+   xen_msi_chip.teardown_irqs = xen_teardown_msi_irqs;
+   x86_msi_chip = &xen_msi_chip;
msi_chip.irq_mask = xen_nop_msi_mask;
msi_chip.irq_unmask = xen_nop_msi_mask;
 #endif
@@ -441,8 +453,9 @@ int __init pci_xen_hvm_init(void)
 #endif
 
 #ifdef CONFIG_PCI_MSI
-   x86_msi.setup_msi_irqs = xen_hvm_setup_msi_irqs;
-   x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
+   xen_msi_chip.setup_irqs = xen_hvm_setup_msi_irqs;
+   xen_msi_chip.teardown_irq = xen_teardown_msi_irq;
+   x86_msi_chip = &xen_msi_chip;
 #endif
return 0;
 }
@@ -499,9 +512,10 @@ int __init pci_xen_initial_domain(void)
int irq;
 
 #ifdef CONFIG_PCI_MSI
-   x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs;
-   x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
-   x86_msi.restore_msi_irqs = xen_initdom_restore_msi_irqs;
+   xen_msi_chip.setup_irqs = xen_initdom_setup_msi_irqs;
+   xen_msi_chip.teardown_irq = xen_teardown_msi_irq;
+   xen_msi_chip.restore_irqs = xen_initdom_restore_msi_irqs;
+   x86_msi_chip = &xen_msi_chip;
msi_chip.irq_mask = xen_nop_msi_mask;
msi_chip.irq_unmask = xen_nop_msi_mask;
 #endif
-- 
1.7.1

[PATCH v1 16/21] s390/MSI: Use MSI chip framework to configure MSI/MSI-X irq

2014-09-05 Thread Yijing Wang
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang wangyij...@huawei.com
---
 arch/s390/pci/pci.c |   18 ++
 1 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index 2fa7b14..da5316e 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -358,7 +358,7 @@ static void zpci_irq_handler(struct airq_struct *airq)
}
 }
 
-int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
+int zpci_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
 {
struct zpci_dev *zdev = get_zdev(pdev);
unsigned int hwirq, msi_vecs;
@@ -434,7 +434,7 @@ out:
return rc;
 }
 
-void arch_teardown_msi_irqs(struct pci_dev *pdev)
+static void zpci_teardown_msi_irqs(struct pci_dev *pdev)
 {
struct zpci_dev *zdev = get_zdev(pdev);
struct msi_desc *msi;
@@ -448,9 +448,9 @@ void arch_teardown_msi_irqs(struct pci_dev *pdev)
/* Release MSI interrupts */
list_for_each_entry(msi, &pdev->msi_list, list) {
if (msi->msi_attrib.is_msix)
-   default_msix_mask_irq(msi, 1);
+   __msix_mask_irq(msi, 1);
else
-   default_msi_mask_irq(msi, 1, 1);
+   __msi_mask_irq(msi, 1, 1);
irq_set_msi_desc(msi->irq, NULL);
irq_free_desc(msi->irq);
msi->msg.address_lo = 0;
@@ -464,6 +464,16 @@ void arch_teardown_msi_irqs(struct pci_dev *pdev)
airq_iv_free_bit(zpci_aisb_iv, zdev->aisb);
 }
 
+static struct msi_chip zpci_msi_chip = {
+   .setup_irqs = zpci_setup_msi_irqs,
+   .teardown_irqs = zpci_teardown_msi_irqs,
+};
+
+struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
+{
+   return &zpci_msi_chip;
+}
+
 static void zpci_map_resources(struct zpci_dev *zdev)
 {
struct pci_dev *pdev = zdev->pdev;
-- 
1.7.1

[PATCH v1 15/21] Powerpc/MSI: Use MSI chip framework to configure MSI/MSI-X irq

2014-09-05 Thread Yijing Wang
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang wangyij...@huawei.com
---
 arch/powerpc/kernel/msi.c |   14 --
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/msi.c b/arch/powerpc/kernel/msi.c
index 71bd161..01781a4 100644
--- a/arch/powerpc/kernel/msi.c
+++ b/arch/powerpc/kernel/msi.c
@@ -13,7 +13,7 @@
 
 #include <asm/machdep.h>
 
-int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
+static int ppc_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
if (!ppc_md.setup_msi_irqs || !ppc_md.teardown_msi_irqs) {
pr_debug("msi: Platform doesn't provide MSI callbacks.\n");
@@ -27,7 +27,17 @@ int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
return ppc_md.setup_msi_irqs(dev, nvec, type);
 }
 
-void arch_teardown_msi_irqs(struct pci_dev *dev)
+static void ppc_teardown_msi_irqs(struct pci_dev *dev)
 {
ppc_md.teardown_msi_irqs(dev);
 }
+
+static struct msi_chip ppc_msi_chip = {
+   .setup_irqs = ppc_setup_msi_irqs,
+   .teardown_irqs = ppc_teardown_msi_irqs,
+};
+
+struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
+{
+   return &ppc_msi_chip;
+}
-- 
1.7.1

[PATCH v1 02/21] PCI/MSI: Remove useless bus->msi assignment

2014-09-05 Thread Yijing Wang
Currently, PCI drivers initialize bus->msi in
pcibios_add_bus(), which is called during every
PCI bus initialization. So the bus->msi
assignment in pci_alloc_child_bus() is useless.

Signed-off-by: Yijing Wang wangyij...@huawei.com
CC: Thierry Reding thierry.red...@avionic-design.de
CC: Thomas Petazzoni thomas.petazz...@free-electrons.com
---
 drivers/pci/probe.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index e3cf8a2..8296576 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -677,7 +677,6 @@ static struct pci_bus *pci_alloc_child_bus(struct pci_bus *parent,
 
child->parent = parent;
child->ops = parent->ops;
-   child->msi = parent->msi;
child->sysdata = parent->sysdata;
child->bus_flags = parent->bus_flags;
 
-- 
1.7.1

[PATCH v1 17/21] arm/iop13xx/MSI: Use MSI chip framework to configure MSI/MSI-X irq

2014-09-05 Thread Yijing Wang
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang wangyij...@huawei.com
---
 arch/arm/mach-iop13xx/include/mach/pci.h |2 ++
 arch/arm/mach-iop13xx/iq81340mc.c|1 +
 arch/arm/mach-iop13xx/iq81340sc.c|1 +
 arch/arm/mach-iop13xx/msi.c  |9 +++--
 arch/arm/mach-iop13xx/pci.c  |6 ++
 5 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mach-iop13xx/include/mach/pci.h b/arch/arm/mach-iop13xx/include/mach/pci.h
index 59f42b5..7a073cb 100644
--- a/arch/arm/mach-iop13xx/include/mach/pci.h
+++ b/arch/arm/mach-iop13xx/include/mach/pci.h
@@ -10,6 +10,8 @@ struct pci_bus *iop13xx_scan_bus(int nr, struct pci_sys_data *);
 void iop13xx_atu_select(struct hw_pci *plat_pci);
 void iop13xx_pci_init(void);
 void iop13xx_map_pci_memory(void);
+void iop13xx_add_bus(struct pci_bus *bus);
+extern struct msi_chip iop13xx_msi_chip;
 
 #define IOP_PCI_STATUS_ERROR (PCI_STATUS_PARITY |   \
   PCI_STATUS_SIG_TARGET_ABORT | \
diff --git a/arch/arm/mach-iop13xx/iq81340mc.c b/arch/arm/mach-iop13xx/iq81340mc.c
index 9cd07d3..19d47cb 100644
--- a/arch/arm/mach-iop13xx/iq81340mc.c
+++ b/arch/arm/mach-iop13xx/iq81340mc.c
@@ -59,6 +59,7 @@ static struct hw_pci iq81340mc_pci __initdata = {
.map_irq= iq81340mc_pcix_map_irq,
.scan   = iop13xx_scan_bus,
.preinit= iop13xx_pci_init,
+   .add_bus= iop13xx_add_bus;
 };
 
 static int __init iq81340mc_pci_init(void)
diff --git a/arch/arm/mach-iop13xx/iq81340sc.c b/arch/arm/mach-iop13xx/iq81340sc.c
index b3ec11c..4d56993 100644
--- a/arch/arm/mach-iop13xx/iq81340sc.c
+++ b/arch/arm/mach-iop13xx/iq81340sc.c
@@ -61,6 +61,7 @@ static struct hw_pci iq81340sc_pci __initdata = {
.scan   = iop13xx_scan_bus,
.map_irq= iq81340sc_atux_map_irq,
.preinit= iop13xx_pci_init
+   .add_bus= iop13xx_add_bus;
 };
 
 static int __init iq81340sc_pci_init(void)
diff --git a/arch/arm/mach-iop13xx/msi.c b/arch/arm/mach-iop13xx/msi.c
index e7730cf..1a8cb2f 100644
--- a/arch/arm/mach-iop13xx/msi.c
+++ b/arch/arm/mach-iop13xx/msi.c
@@ -132,7 +132,7 @@ static struct irq_chip iop13xx_msi_chip = {
.irq_unmask = unmask_msi_irq,
 };
 
-int arch_setup_msi_irq(struct pci_dev *pdev, struct msi_desc *desc)
+static int iop13xx_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
 {
int id, irq = irq_alloc_desc_from(IRQ_IOP13XX_MSI_0, -1);
struct msi_msg msg;
@@ -159,7 +159,12 @@ int arch_setup_msi_irq(struct pci_dev *pdev, struct msi_desc *desc)
return 0;
 }
 
-void arch_teardown_msi_irq(unsigned int irq)
+static void iop13xx_teardown_msi_irq(unsigned int irq)
 {
irq_free_desc(irq);
 }
+
+struct msi_chip iop13xx_chip = {
+   .setup_irq = iop13xx_setup_msi_irq,
+   .teardown_irq = iop13xx_teardown_msi_irq,
+};
diff --git a/arch/arm/mach-iop13xx/pci.c b/arch/arm/mach-iop13xx/pci.c
index 9082b84..f498800 100644
--- a/arch/arm/mach-iop13xx/pci.c
+++ b/arch/arm/mach-iop13xx/pci.c
@@ -962,6 +962,12 @@ void __init iop13xx_atu_select(struct hw_pci *plat_pci)
}
 }
 
+void iop13xx_add_bus(struct pci_bus *bus)
+{
+   if (IS_ENABLED(CONFIG_PCI_MSI))
+   bus->msi = &iop13xx_msi_chip;
+}
+
 void __init iop13xx_pci_init(void)
 {
/* clear pre-existing south bridge errors */
-- 
1.7.1

[PATCH v1 19/21] Sparc/MSI: Use MSI chip framework to configure MSI/MSI-X irq

2014-09-05 Thread Yijing Wang
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang wangyij...@huawei.com
---
 arch/sparc/kernel/pci.c |   14 --
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/sparc/kernel/pci.c b/arch/sparc/kernel/pci.c
index b36365f..2a89ee2 100644
--- a/arch/sparc/kernel/pci.c
+++ b/arch/sparc/kernel/pci.c
@@ -905,7 +905,7 @@ int pci_domain_nr(struct pci_bus *pbus)
 EXPORT_SYMBOL(pci_domain_nr);
 
 #ifdef CONFIG_PCI_MSI
-int arch_setup_msi_irq(struct pci_dev *pdev, struct msi_desc *desc)
+int sparc_setup_msi_irq(struct pci_dev *pdev, struct msi_desc *desc)
 {
struct pci_pbm_info *pbm = pdev->dev.archdata.host_controller;
unsigned int irq;
@@ -916,7 +916,7 @@ int arch_setup_msi_irq(struct pci_dev *pdev, struct msi_desc *desc)
return pbm->setup_msi_irq(&irq, pdev, desc);
 }
 
-void arch_teardown_msi_irq(unsigned int irq)
+void sparc_teardown_msi_irq(unsigned int irq)
 {
struct msi_desc *entry = irq_get_msi_desc(irq);
struct pci_dev *pdev = entry->dev;
@@ -925,6 +925,16 @@ void arch_teardown_msi_irq(unsigned int irq)
if (pbm->teardown_msi_irq)
pbm->teardown_msi_irq(irq, pdev);
 }
+
+static struct msi_chip sparc_msi_chip = {
+   .setup_irq = sparc_setup_msi_irq,
+   .teardown_irq = sparc_teardown_msi_irq,
+};
+
+struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
+{
+   return &sparc_msi_chip;
+}
 #endif /* !(CONFIG_PCI_MSI) */
 
 static void ali_sound_dma_hack(struct pci_dev *pdev, int set_bit)
-- 
1.7.1

[PATCH v1 13/21] MIPS/Xlp/MSI: Use MSI chip framework to configure MSI/MSI-X irq

2014-09-05 Thread Yijing Wang
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang wangyij...@huawei.com
---
 arch/mips/pci/msi-xlp.c |   14 --
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/mips/pci/msi-xlp.c b/arch/mips/pci/msi-xlp.c
index e469dc7..6b791ef 100644
--- a/arch/mips/pci/msi-xlp.c
+++ b/arch/mips/pci/msi-xlp.c
@@ -245,7 +245,7 @@ static struct irq_chip xlp_msix_chip = {
.irq_unmask = unmask_msi_irq,
 };
 
-void arch_teardown_msi_irq(unsigned int irq)
+void xlp_teardown_msi_irq(unsigned int irq)
 {
 }
 
@@ -450,7 +450,7 @@ static int xlp_setup_msix(uint64_t lnkbase, int node, int link,
return 0;
 }
 
-int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
+static int xlp_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
 {
struct pci_dev *lnkdev;
uint64_t lnkbase;
@@ -472,6 +472,16 @@ int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
return xlp_setup_msi(lnkbase, node, link, desc);
 }
 
+static struct msi_chip xlp_chip = {
+   .setup_irq = xlp_setup_msi_irq,
+   .teardown_irq = xlp_teardown_msi_irq,
+};
+
+struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
+{
+   return &xlp_chip;
+}
+
 void __init xlp_init_node_msi_irqs(int node, int link)
 {
struct nlm_soc_info *nodep;
-- 
1.7.1

[PATCH v1 18/21] IA64/MSI: Use MSI chip framework to configure MSI/MSI-X irq

2014-09-05 Thread Yijing Wang
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang wangyij...@huawei.com
---
 arch/ia64/kernel/msi_ia64.c |   18 ++
 1 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/arch/ia64/kernel/msi_ia64.c b/arch/ia64/kernel/msi_ia64.c
index 4efe748..55ac859 100644
--- a/arch/ia64/kernel/msi_ia64.c
+++ b/arch/ia64/kernel/msi_ia64.c
@@ -112,15 +112,15 @@ static struct irq_chip ia64_msi_chip = {
 };
 
 
-int arch_setup_msi_irq(struct pci_dev *pdev, struct msi_desc *desc)
+static int arch_ia64_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
 {
if (platform_setup_msi_irq)
-   return platform_setup_msi_irq(pdev, desc);
+   return platform_setup_msi_irq(dev, desc);
 
-   return ia64_setup_msi_irq(pdev, desc);
+   return ia64_setup_msi_irq(dev, desc);
 }
 
-void arch_teardown_msi_irq(unsigned int irq)
+static void arch_ia64_teardown_msi_irq(unsigned int irq)
 {
if (platform_teardown_msi_irq)
return platform_teardown_msi_irq(irq);
@@ -128,6 +128,16 @@ void arch_teardown_msi_irq(unsigned int irq)
return ia64_teardown_msi_irq(irq);
 }
 
+static struct msi_chip chip = {
+   .setup_irq = arch_ia64_setup_msi_irq,
+   .teardown_irq = arch_ia64_teardown_msi_irq,
+};
+
+struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
+{
+   return &chip;
+}
+
 #ifdef CONFIG_INTEL_IOMMU
 #ifdef CONFIG_SMP
 static int dmar_msi_set_affinity(struct irq_data *data,
-- 
1.7.1

[PATCH v1 12/21] MIPS/Xlp: Remove the dead function destroy_irq() to fix build error

2014-09-05 Thread Yijing Wang
Commit 465665f78a7 (mips: Kill pointless destroy_irq()) removed
destroy_irq(). Remove the leftover call in xlp_setup_msix()
to fix the build error.

arch/mips/pci/msi-xlp.c: In function 'xlp_setup_msix':
arch/mips/pci/msi-xlp.c:447:3: error: implicit declaration of function 'destroy_irq'..
cc1: some warnings being treated as errors
make[1]: *** [arch/mips/pci/msi-xlp.o] Error 1
make: *** [arch/mips/pci/] Error 2

Signed-off-by: Yijing Wang wangyii...@huawei.com
Cc: Thomas Gleixner t...@linutronix.de
---
 arch/mips/pci/msi-xlp.c |4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/arch/mips/pci/msi-xlp.c b/arch/mips/pci/msi-xlp.c
index fa374fe..e469dc7 100644
--- a/arch/mips/pci/msi-xlp.c
+++ b/arch/mips/pci/msi-xlp.c
@@ -443,10 +443,8 @@ static int xlp_setup_msix(uint64_t lnkbase, int node, int link,
msg.data = 0xc00 | msixvec;
 
ret = irq_set_msi_desc(xirq, desc);
-   if (ret < 0) {
-   destroy_irq(xirq);
+   if (ret < 0)
return ret;
-   }
 
write_msi_msg(xirq, &msg);
return 0;
-- 
1.7.1

[PATCH v1 14/21] MIPS/Xlr/MSI: Use MSI chip framework to configure MSI/MSI-X irq

2014-09-05 Thread Yijing Wang
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang wangyij...@huawei.com
---
 arch/mips/pci/pci-xlr.c |   15 +--
 1 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/mips/pci/pci-xlr.c b/arch/mips/pci/pci-xlr.c
index 0dde803..7bd91cc 100644
--- a/arch/mips/pci/pci-xlr.c
+++ b/arch/mips/pci/pci-xlr.c
@@ -214,11 +214,11 @@ static int get_irq_vector(const struct pci_dev *dev)
 }
 
 #ifdef CONFIG_PCI_MSI
-void arch_teardown_msi_irq(unsigned int irq)
+void xlr_teardown_msi_irq(unsigned int irq)
 {
 }
 
-int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
+int xlr_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
 {
struct msi_msg msg;
struct pci_dev *lnk;
@@ -263,6 +263,17 @@ int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
write_msi_msg(irq, &msg);
return 0;
 }
+
+static struct msi_chip xlr_msi_chip = {
+   .setup_irq = xlr_setup_msi_irq,
+   .teardown_irq = xlr_teardown_msi_irq,
+};
+
+struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
+{
+   return &xlr_msi_chip;
+}
+
 #endif
 
 /* Extra ACK needed for XLR on chip PCI controller */
-- 
1.7.1

[PATCH v1 21/21] PCI/MSI: Clean up unused MSI arch functions

2014-09-05 Thread Yijing Wang
Now we use struct msi_chip in all platforms to configure
MSI/MSI-X. We can clean up the unused arch functions.
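
Once the weak functions are gone, the generic teardown path reduces to
a chip lookup followed by a per-vector fallback. The sketch below
models that fallback in user space. It is an illustration only: the
td_* types, fields and the counting hook are hypothetical stand-ins
chosen for this example, not kernel definitions.

```c
/*
 * User-space model of the teardown fallback: use the whole-device
 * hook if the chip has one, otherwise free each vector individually.
 * All names are hypothetical.
 */
#include <assert.h>
#include <stddef.h>

struct td_desc {
	unsigned int irq;	/* first irq number, 0 if never set up */
	unsigned int nvec_used;	/* vectors in use, 0 if not recorded */
	unsigned int multiple;	/* log2 of the vectors allocated */
};

struct td_dev {
	struct td_desc *descs;
	size_t nr_desc;
};

struct td_chip {
	void (*teardown_irq)(unsigned int irq);
	void (*teardown_irqs)(struct td_dev *dev);
};

static void td_teardown_msi_irqs(struct td_chip *chip, struct td_dev *dev)
{
	size_t d;

	assert(dev != NULL);

	if (!chip)
		return;

	/* A whole-device hook replaces the generic loop entirely. */
	if (chip->teardown_irqs) {
		chip->teardown_irqs(dev);
		return;
	}

	if (!chip->teardown_irq)
		return;

	for (d = 0; d < dev->nr_desc; d++) {
		struct td_desc *entry = &dev->descs[d];
		unsigned int i, nvec;

		/* Descriptors without an irq were never set up. */
		if (entry->irq == 0)
			continue;

		/* Prefer the recorded count, else 2^multiple vectors. */
		nvec = entry->nvec_used ? entry->nvec_used
					: 1u << entry->multiple;
		for (i = 0; i < nvec; i++)
			chip->teardown_irq(entry->irq + i);
	}
}

/* Counting hook, only there to make the loop observable. */
static unsigned int td_freed;
static unsigned int td_last_irq;

static void td_count(unsigned int irq)
{
	td_freed++;
	td_last_irq = irq;
}
```

The vector count comes from nvec_used when it is set and from the
multiple field otherwise, which mirrors the loop kept in the generic
code by this patch.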

Signed-off-by: Yijing Wang wangyij...@huawei.com
---
 drivers/iommu/irq_remapping.c |2 +-
 drivers/pci/msi.c |   99 -
 include/linux/msi.h   |   14 --
 3 files changed, 39 insertions(+), 76 deletions(-)

diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 99b1c0f..6e645f0 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -92,7 +92,7 @@ error:
 
/*
 * Restore altered MSI descriptor fields and prevent just destroyed
-* IRQs from tearing down again in default_teardown_msi_irqs()
+* IRQs from tearing down again in teardown_msi_irqs()
 */
msidesc->irq = 0;
msidesc->nvec_used = 0;
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index d78d637..e3e7f4f 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -34,50 +34,31 @@ struct msi_chip * __weak arch_find_msi_chip(struct pci_dev *dev)
return dev->bus->msi;
 }
 
-int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
-{
-   struct msi_chip *chip = arch_find_msi_chip(dev);
-   int err;
-
-   if (!chip || !chip->setup_irq)
-   return -EINVAL;
-
-   err = chip->setup_irq(dev, desc);
-   if (err < 0)
-   return err;
-
-   return 0;
-}
-
-void __weak arch_teardown_msi_irq(unsigned int irq)
-{
-   struct msi_chip *chip = irq_get_chip_data(irq);
-
-   if (!chip || !chip->teardown_irq)
-   return;
-
-   chip->teardown_irq(irq);
-}
-
-int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
+int setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
struct msi_desc *entry;
int ret;
struct msi_chip *chip;
 
chip = arch_find_msi_chip(dev);
-   if (chip && chip->setup_irqs)
+   if (!chip)
+   return -EINVAL;
+
+   if (chip->setup_irqs)
return chip->setup_irqs(dev, nvec, type);
 
/*
  * If an architecture wants to support multiple MSI, it needs to
-* override arch_setup_msi_irqs()
+* implement chip->setup_irqs().
  */
if (type == PCI_CAP_ID_MSI && nvec > 1)
return 1;
 
+   if (!chip->setup_irq)
+   return -EINVAL;
+
list_for_each_entry(entry, &dev->msi_list, list) {
-   ret = arch_setup_msi_irq(dev, entry);
+   ret = chip->setup_irq(dev, entry);
if (ret < 0)
return ret;
if (ret > 0)
@@ -87,13 +68,20 @@ int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
return 0;
 }
 
-/*
- * We have a default implementation available as a separate non-weak
- * function, as it is used by the Xen x86 PCI code
- */
-void default_teardown_msi_irqs(struct pci_dev *dev)
+static void teardown_msi_irqs(struct pci_dev *dev)
 {
struct msi_desc *entry;
+   struct msi_chip *chip;
+
+   chip = arch_find_msi_chip(dev);
+   if (!chip)
+   return;
+
+   if (chip->teardown_irqs)
+   return chip->teardown_irqs(dev);
+
+   if (!chip->teardown_irq)
+   return;
 
list_for_each_entry(entry, &dev->msi_list, list) {
int i, nvec;
@@ -104,20 +92,10 @@ void default_teardown_msi_irqs(struct pci_dev *dev)
else
nvec = 1 << entry->msi_attrib.multiple;
for (i = 0; i < nvec; i++)
-   arch_teardown_msi_irq(entry->irq + i);
+   chip->teardown_irq(entry->irq + i);
}
 }
 
-void __weak arch_teardown_msi_irqs(struct pci_dev *dev)
-{
-   struct msi_chip *chip = arch_find_msi_chip(dev);
-
-   if (chip && chip->teardown_irqs)
-   return chip->teardown_irqs(dev);
-
-   return default_teardown_msi_irqs(dev);
-}
-
 static void default_restore_msi_irq(struct pci_dev *dev, int irq)
 {
struct msi_desc *entry;
@@ -136,10 +114,18 @@ static void default_restore_msi_irq(struct pci_dev *dev, int irq)
write_msi_msg(irq, &entry->msg);
 }
 
-void __weak arch_restore_msi_irqs(struct pci_dev *dev)
+static void default_restore_msi_irqs(struct pci_dev *dev)
 {
-   struct msi_chip *chip = arch_find_msi_chip(dev);
+   struct msi_desc *entry = NULL;
+
+   list_for_each_entry(entry, &dev->msi_list, list) {
+   default_restore_msi_irq(dev, entry->irq);
+   }
+}
 
+static void restore_msi_irqs(struct pci_dev *dev)
+{
+   struct msi_chip *chip = arch_find_msi_chip(dev);
if (chip && chip->restore_irqs)
return chip->restore_irqs(dev);
 
@@ -248,15 +234,6 @@ void unmask_msi_irq(struct irq_data *data)
msi_set_mask_bit(data, 0);
 }
 
-void default_restore_msi_irqs(struct pci_dev *dev)
-{
-   struct msi_desc *entry;
-
-   
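As a rough illustration of the dispatch order that the setup_msi_irqs() hunk above introduces, here is a minimal user-space sketch. Everything prefixed mock_ (and the CAP_* constants) is a hypothetical stand-in rather than kernel API, the descriptor list is reduced to a counter, and the -ENOSPC return for a positive per-vector result is an assumption about the part of the function the diff does not show.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define CAP_MSI  0x05   /* stands in for PCI_CAP_ID_MSI */
#define CAP_MSIX 0x11   /* stands in for PCI_CAP_ID_MSIX */

struct mock_dev;

/* Hypothetical stand-in for struct msi_chip: only the two setup hooks. */
struct mock_chip {
	int (*setup_irq)(struct mock_dev *dev, int entry);
	int (*setup_irqs)(struct mock_dev *dev, int nvec, int type);
};

struct mock_dev {
	struct mock_chip *chip;   /* what arch_find_msi_chip() would return */
	int nr_entries;           /* length of the device's MSI descriptor list */
	int calls;                /* bookkeeping for this sketch only */
};

static int mock_setup_msi_irqs(struct mock_dev *dev, int nvec, int type)
{
	struct mock_chip *chip;
	int i, ret;

	assert(dev != NULL);
	chip = dev->chip;
	if (!chip)
		return -EINVAL;

	/* A chip-wide hook, when present, handles the whole request. */
	if (chip->setup_irqs)
		return chip->setup_irqs(dev, nvec, type);

	/* Without setup_irqs(), multiple MSI is capped at one vector. */
	if (type == CAP_MSI && nvec > 1)
		return 1;

	if (!chip->setup_irq)
		return -EINVAL;

	for (i = 0; i < dev->nr_entries; i++) {
		ret = chip->setup_irq(dev, i);
		if (ret < 0)
			return ret;
		if (ret > 0)
			return -ENOSPC;   /* assumption, not shown in the diff */
	}
	return 0;
}

/* Example per-vector hook: counts invocations and always succeeds. */
static int count_setup_irq(struct mock_dev *dev, int entry)
{
	(void)entry;
	dev->calls++;
	return 0;
}
```

With only count_setup_irq() installed, a three-entry MSI-X request walks all three descriptors, while a multi-vector MSI request is answered with 1 before any hook runs.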

[PATCH v1 20/21] tile/MSI: Use MSI chip framework to configure MSI/MSI-X irq

2014-09-05 Thread Yijing Wang
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang wangyij...@huawei.com
---
 arch/tile/kernel/pci_gx.c |   14 --
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/tile/kernel/pci_gx.c b/arch/tile/kernel/pci_gx.c
index e39f9c5..4912b75 100644
--- a/arch/tile/kernel/pci_gx.c
+++ b/arch/tile/kernel/pci_gx.c
@@ -1485,7 +1485,7 @@ static struct irq_chip tilegx_msi_chip = {
/* TBD: support set_affinity. */
 };
 
-int arch_setup_msi_irq(struct pci_dev *pdev, struct msi_desc *desc)
+static int tile_setup_msi_irq(struct pci_dev *pdev, struct msi_desc *desc)
 {
struct pci_controller *controller;
gxio_trio_context_t *trio_context;
@@ -1604,7 +1604,17 @@ is_64_failure:
return ret;
 }
 
-void arch_teardown_msi_irq(unsigned int irq)
+void tile_teardown_msi_irq(unsigned int irq)
 {
irq_free_hwirq(irq);
 }
+
+static struct msi_chip tile_msi_chip = {
+   .setup_irq = tile_setup_msi_irq,
+   .teardown_irq = tile_teardown_msi_irq,
+};
+
+struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
+{
+   return &tile_msi_chip;
+}
-- 
1.7.1


[PATCH v1 01/21] PCI/MSI: Clean up struct msi_chip argument

2014-09-05 Thread Yijing Wang
Msi_chip functions setup_irq/teardown_irq rarely use msi_chip
argument. We can look up msi_chip pointer by the device pointer
or irq number, so clean up msi_chip argument.

Signed-off-by: Yijing Wang wangyij...@huawei.com
CC: Thierry Reding thierry.red...@gmail.com
CC: Thomas Petazzoni thomas.petazz...@free-electrons.com
---
 drivers/irqchip/irq-armada-370-xp.c |   12 +---
 drivers/pci/host/pci-tegra.c|8 +---
 drivers/pci/host/pcie-designware.c  |4 ++--
 drivers/pci/host/pcie-rcar.c|8 +---
 drivers/pci/msi.c   |4 ++--
 include/linux/msi.h |5 ++---
 6 files changed, 21 insertions(+), 20 deletions(-)

diff --git a/drivers/irqchip/irq-armada-370-xp.c 
b/drivers/irqchip/irq-armada-370-xp.c
index 574aba0..658990c 100644
--- a/drivers/irqchip/irq-armada-370-xp.c
+++ b/drivers/irqchip/irq-armada-370-xp.c
@@ -129,9 +129,8 @@ static void armada_370_xp_free_msi(int hwirq)
mutex_unlock(&msi_used_lock);
 }
 
-static int armada_370_xp_setup_msi_irq(struct msi_chip *chip,
-  struct pci_dev *pdev,
-  struct msi_desc *desc)
+static int armada_370_xp_setup_msi_irq(struct pci_dev *pdev,
+   struct msi_desc *desc)
 {
struct msi_msg msg;
int virq, hwirq;
@@ -156,8 +155,7 @@ static int armada_370_xp_setup_msi_irq(struct msi_chip *chip,
return 0;
 }
 
-static void armada_370_xp_teardown_msi_irq(struct msi_chip *chip,
-  unsigned int irq)
+static void armada_370_xp_teardown_msi_irq(unsigned int irq)
 {
struct irq_data *d = irq_get_irq_data(irq);
unsigned long hwirq = d->hwirq;
@@ -166,8 +164,8 @@ static void armada_370_xp_teardown_msi_irq(struct msi_chip *chip,
armada_370_xp_free_msi(hwirq);
 }
 
-static int armada_370_xp_check_msi_device(struct msi_chip *chip, struct pci_dev *dev,
- int nvec, int type)
+static int armada_370_xp_check_msi_device(struct pci_dev *dev,
+   int nvec, int type)
 {
/* We support MSI, but not MSI-X */
if (type == PCI_CAP_ID_MSI)
diff --git a/drivers/pci/host/pci-tegra.c b/drivers/pci/host/pci-tegra.c
index 0fb0fdb..edd4040 100644
--- a/drivers/pci/host/pci-tegra.c
+++ b/drivers/pci/host/pci-tegra.c
@@ -1157,9 +1157,10 @@ static irqreturn_t tegra_pcie_msi_irq(int irq, void *data)
return processed > 0 ? IRQ_HANDLED : IRQ_NONE;
 }
 
-static int tegra_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
+static int tegra_msi_setup_irq(struct pci_dev *pdev,
   struct msi_desc *desc)
 {
+   struct msi_chip *chip = pdev->bus->msi;
struct tegra_msi *msi = to_tegra_msi(chip);
struct msi_msg msg;
unsigned int irq;
@@ -1185,10 +1186,11 @@ static int tegra_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
return 0;
 }
 
-static void tegra_msi_teardown_irq(struct msi_chip *chip, unsigned int irq)
+static void tegra_msi_teardown_irq(unsigned int irq)
 {
-   struct tegra_msi *msi = to_tegra_msi(chip);
struct irq_data *d = irq_get_irq_data(irq);
+   struct msi_chip *chip = irq_get_chip_data(irq);
+   struct tegra_msi *msi = to_tegra_msi(chip);
 
tegra_msi_free(msi, d->hwirq);
 }
diff --git a/drivers/pci/host/pcie-designware.c 
b/drivers/pci/host/pcie-designware.c
index 52bd3a1..2204456 100644
--- a/drivers/pci/host/pcie-designware.c
+++ b/drivers/pci/host/pcie-designware.c
@@ -342,7 +342,7 @@ static void clear_irq(unsigned int irq)
msi->msi_attrib.multiple = 0;
 }
 
-static int dw_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
+static int dw_msi_setup_irq(struct pci_dev *pdev,
struct msi_desc *desc)
 {
int irq, pos, msgvec;
@@ -384,7 +384,7 @@ static int dw_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
return 0;
 }
 
-static void dw_msi_teardown_irq(struct msi_chip *chip, unsigned int irq)
+static void dw_msi_teardown_irq(unsigned int irq)
 {
clear_irq(irq);
 }
diff --git a/drivers/pci/host/pcie-rcar.c b/drivers/pci/host/pcie-rcar.c
index 4884ee5..647bc9f 100644
--- a/drivers/pci/host/pcie-rcar.c
+++ b/drivers/pci/host/pcie-rcar.c
@@ -615,9 +615,10 @@ static irqreturn_t rcar_pcie_msi_irq(int irq, void *data)
return IRQ_HANDLED;
 }
 
-static int rcar_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
+static int rcar_msi_setup_irq(struct pci_dev *pdev,
  struct msi_desc *desc)
 {
+   struct msi_chip *chip = pdev->bus->msi;
struct rcar_msi *msi = to_rcar_msi(chip);
struct rcar_pcie *pcie = container_of(chip, struct rcar_pcie, msi.chip);
struct msi_msg msg;
@@ -645,10 +646,11 @@ static int rcar_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
return 0;
 }
 
-static void rcar_msi_teardown_irq(struct msi_chip *chip, 
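The conversions in this patch rely on a callback being able to recover its chip, and from that its driver state, once the explicit msi_chip argument is gone. A minimal sketch of that lookup follows, assuming the chip is reachable through the device's bus as in the tegra and rcar hunks. All mock_ names are hypothetical stand-ins; mock_container_of() only mimics the kernel's container_of().

```c
#include <assert.h>
#include <stddef.h>

/* Same idea as the kernel's container_of(): member pointer -> enclosing struct. */
#define mock_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct mock_msi_chip { int unused; };             /* stand-in for struct msi_chip */
struct mock_bus { struct mock_msi_chip *msi; };   /* stand-in for struct pci_bus */
struct mock_pci_dev { struct mock_bus *bus; };    /* stand-in for struct pci_dev */

/* Hypothetical driver state that embeds the chip, like tegra_msi or rcar_msi. */
struct mock_drv_msi {
	int id;
	struct mock_msi_chip chip;
};

static struct mock_drv_msi *to_mock_drv_msi(struct mock_msi_chip *chip)
{
	return mock_container_of(chip, struct mock_drv_msi, chip);
}

/* New-style callback: no chip argument, so look it up through the bus. */
static int mock_setup_irq(struct mock_pci_dev *pdev)
{
	struct mock_msi_chip *chip;

	assert(pdev != NULL && pdev->bus != NULL);
	chip = pdev->bus->msi;
	return to_mock_drv_msi(chip)->id;
}
```

The teardown side has no device pointer, which is why the tegra hunk fetches the chip with irq_get_chip_data(irq) instead; that path is not modelled here.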

Re: [PATCH v1 09/21] Irq_remapping/MSI: Use MSI chip framework to configure MSI/MSI-X irq

2014-09-05 Thread Sergei Shtylyov

Hello.

On 9/5/2014 2:09 PM, Yijing Wang wrote:


> Use MSI chip framework instead of arch MSI functions to configure
> MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

> Signed-off-by: Yijing Wang wangyij...@huawei.com
> ---
>   drivers/iommu/irq_remapping.c |8 +++-
>   1 files changed, 7 insertions(+), 1 deletions(-)

> diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
> index 33c4395..e75026e 100644
> --- a/drivers/iommu/irq_remapping.c
> +++ b/drivers/iommu/irq_remapping.c

[...]

> @@ -165,9 +170,10 @@ static void __init irq_remapping_modify_x86_ops(void)
> x86_io_apic_ops.set_affinity= set_remapped_irq_affinity;
> x86_io_apic_ops.setup_entry = setup_ioapic_remapped_entry;
> x86_io_apic_ops.eoi_ioapic_pin  = eoi_ioapic_pin_remapped;
> -   x86_msi.setup_msi_irqs  = irq_remapping_setup_msi_irqs;
> +   x86_msi.setup_msi_irqs  = irq_remapping_setup_msi_irqs;

   AFAICS, this change only converts tabs to spaces, so not needed at all.

> x86_msi.setup_hpet_msi  = setup_hpet_msi_remapped;
> x86_msi.compose_msi_msg = compose_remapped_msi_msg;
> +   x86_msi_chip = remap_msi_chip;


   Please align = with the rest of assignments.

WBR, Sergei


Re: [PATCH v1 15/21] Powerpc/MSI: Use MSI chip framework to configure MSI/MSI-X irq

2014-09-05 Thread Sergei Shtylyov

Hello.

On 9/5/2014 2:10 PM, Yijing Wang wrote:


> Use MSI chip framework instead of arch MSI functions to configure
> MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

> Signed-off-by: Yijing Wang wangyij...@huawei.com
> ---
>   arch/powerpc/kernel/msi.c |   14 --
>   1 files changed, 12 insertions(+), 2 deletions(-)

> diff --git a/arch/powerpc/kernel/msi.c b/arch/powerpc/kernel/msi.c
> index 71bd161..01781a4 100644
> --- a/arch/powerpc/kernel/msi.c
> +++ b/arch/powerpc/kernel/msi.c

[...]

> @@ -27,7 +27,17 @@ int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
> return ppc_md.setup_msi_irqs(dev, nvec, type);
>   }
>
> -void arch_teardown_msi_irqs(struct pci_dev *dev)
> +static void ppc_teardown_msi_irqs(struct pci_dev *dev)

   Shouldn't this function take IRQ # instead?

>   {
> ppc_md.teardown_msi_irqs(dev);
>   }
> +
> +static struct msi_chip ppc_msi_chip = {
> +   .setup_irqs = ppc_setup_msi_irqs,
> +   .teardown_irqs = ppc_teardown_msi_irqs,
> +};
> +
> +struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
> +{
> +   return &ppc_msi_chip;
> +}


WBR, Sergei


[PATCH 1/2][v4] powerpc/fsl-booke: Add initial T1040/T1042 RDB board support

2014-09-05 Thread Priyanka Jain
T1040/T1042RDB is a Freescale Reference Design Board.
The board supports both the T1040 and T1042 QorIQ Power Architecture™ processors.

T1040/T1042RDB board Overview
-----------------------------
- SERDES Connections, 8 lanes supporting:
      - PCI
      - SGMII
      - QSGMII
      - SATA 2.0
- DDR Controller
      - Supports rates of up to 1600 MHz data-rate
      - Supports one DDR3LP UDIMM
- IFC/Local Bus
      - NAND flash: 1GB 8-bit NAND flash
      - NOR: 128MB 16-bit NOR Flash
- Ethernet
      - Two on-board RGMII 10/100/1G ethernet ports.
      - PHY #0 remains powered up during deep-sleep
- CPLD
- Clocks
      - System and DDR clock (SYSCLK, “DDRCLK”)
      - SERDES clocks
- Power Supplies
- USB
      - Supports two USB 2.0 ports with integrated PHYs
      - Two type A ports with 5V@1.5A per port.
- SDHC
      - SDHC/SDXC connector
- SPI
      - On-board 64MB SPI flash
- I2C
      - Devices connected: EEPROM, thermal monitor, VID controller
- Other IO
      - Two Serial ports
      - ProfiBus port

Add support for T1040/T1042 RDB board:
- Add device tree
- Add entry in Kconfig to build
- Add entry in corenet_generic.c, as it is similar to other corenet platforms

Signed-off-by: Priyanka Jain priyanka.j...@freescale.com
Signed-off-by: Poonam Aggrwal poonam.aggr...@freescale.com
Signed-off-by: Prabhakar Kushwaha prabha...@freescale.com
---
changes for v4: Updated cpld compatible string field

changes for v3: Incorporated Scott comments on moving cpld compatible
 field to board specific file as cpld binaries are different

changes for v2: Incorporated Scott comments on using common name
 for compatible string for cpld as register set is same

 arch/powerpc/boot/dts/t1040rdb.dts|   48 
 arch/powerpc/boot/dts/t1042rdb.dts|   48 
 arch/powerpc/boot/dts/t104xrdb.dtsi   |  156 +
 arch/powerpc/platforms/85xx/Kconfig   |2 +-
 arch/powerpc/platforms/85xx/corenet_generic.c |2 +
 5 files changed, 255 insertions(+), 1 deletions(-)
 create mode 100644 arch/powerpc/boot/dts/t1040rdb.dts
 create mode 100644 arch/powerpc/boot/dts/t1042rdb.dts
 create mode 100644 arch/powerpc/boot/dts/t104xrdb.dtsi

diff --git a/arch/powerpc/boot/dts/t1040rdb.dts 
b/arch/powerpc/boot/dts/t1040rdb.dts
new file mode 100644
index 000..79a0bed
--- /dev/null
+++ b/arch/powerpc/boot/dts/t1040rdb.dts
@@ -0,0 +1,48 @@
+/*
+ * T1040RDB Device Tree Source
+ *
+ * Copyright 2014 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *  notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in the
+ *  documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the
+ *  names of its contributors may be used to endorse or promote products
+ *  derived from this software without specific prior written permission.
+ *
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License (GPL) as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor AS IS AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/include/ "fsl/t104xsi-pre.dtsi"
+/include/ "t104xrdb.dtsi"
+
+/ {
+	model = "fsl,T1040RDB";
+	compatible = "fsl,T1040RDB";
+	ifc: localbus@ffe124000 {
+		cpld@3,0 {
+			compatible = "fsl,t1040rdb-cpld";
+		};
+	};
+};
+
+/include/ "fsl/t1040si-post.dtsi"
diff --git a/arch/powerpc/boot/dts/t1042rdb.dts 
b/arch/powerpc/boot/dts/t1042rdb.dts
new file mode 100644
index 000..228a635
--- /dev/null
+++ b/arch/powerpc/boot/dts/t1042rdb.dts
@@ -0,0 +1,48 @@
+/*
+ * T1042RDB Device Tree Source
+ *
+ * Copyright 2014 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * 

[PATCH 2/2][v4] powerpc/fsl-booke: Add initial T1042RDB_PI board support

2014-09-05 Thread Priyanka Jain
T1042RDB_PI is a Freescale Reference Design Board supporting the T1042
QorIQ Power Architecture™ processor. T1042 is a reduced personality
of the T1040 SoC without the integrated 8-port Gigabit Ethernet switch.
The board is designed with low-power features and targets the printing
and imaging market.

T1042RDB_PI is similar to the T1040RDB board, with a few differences:
it has a video interface and supports the T1042 personality only.

T1042RDB_PI board Overview
--------------------------
- SERDES Connections, 8 lanes supporting:
      - PCI
      - SATA 2.0
- DDR Controller
      - Supports rates of up to 1600 MHz data-rate
      - Supports one DDR3LP UDIMM
- IFC/Local Bus
      - NAND flash: 1GB 8-bit NAND flash
      - NOR: 128MB 16-bit NOR Flash
- Ethernet
      - Two on-board RGMII 10/100/1G ethernet ports.
      - PHY #0 remains powered up during deep-sleep
- CPLD
- Clocks
      - System and DDR clock (SYSCLK, “DDRCLK”)
      - SERDES clocks
- Power Supplies
- USB
      - Supports two USB 2.0 ports with integrated PHYs
      - Two type A ports with 5V@1.5A per port.
- SDHC
      - SDHC/SDXC connector
- SPI
      - On-board 64MB SPI flash
- I2C
      - Devices connected: EEPROM, thermal monitor, VID controller, RTC
- Other IO
      - Two Serial ports
      - ProfiBus port

Add support for T1042RDB_PI board:
- Add device tree
- Add entry in corenet_generic.c, as it is similar to other corenet platforms

Signed-off-by: Poonam Aggrwal poonam.aggr...@freescale.com
Signed-off-by: Prabhakar Kushwaha prabha...@freescale.com
Signed-off-by: Priyanka Jain priyanka.j...@freescale.com
---
changes for v4: Updated cpld compatible string field

changes for v3: Incorporated Scott comments on moving cpld compatible
 field to board specific file as cpld binaries are different

changes for v2: Incorporated Scott comments on using common name
 for compatible string for cpld as register set is same

 arch/powerpc/boot/dts/t1042rdb_pi.dts |   57 +
 arch/powerpc/platforms/85xx/corenet_generic.c |1 +
 2 files changed, 58 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/boot/dts/t1042rdb_pi.dts

diff --git a/arch/powerpc/boot/dts/t1042rdb_pi.dts 
b/arch/powerpc/boot/dts/t1042rdb_pi.dts
new file mode 100644
index 000..b9d0877
--- /dev/null
+++ b/arch/powerpc/boot/dts/t1042rdb_pi.dts
@@ -0,0 +1,57 @@
+/*
+ * T1042RDB_PI Device Tree Source
+ *
+ * Copyright 2014 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *  notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in the
+ *  documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the
+ *  names of its contributors may be used to endorse or promote products
+ *  derived from this software without specific prior written permission.
+ *
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License (GPL) as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor AS IS AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/include/ "fsl/t104xsi-pre.dtsi"
+/include/ "t104xrdb.dtsi"
+
+/ {
+	model = "fsl,T1042RDB_PI";
+	compatible = "fsl,T1042RDB_PI";
+	ifc: localbus@ffe124000 {
+		cpld@3,0 {
+			compatible = "fsl,t1042rdb_pi-cpld";
+		};
+	};
+	soc: soc@ffe00 {
+		i2c@118000 {
+			rtc@68 {
+				compatible = "dallas,ds1337";
+				reg = <0x68>;
+				interrupts = <0x2 0x1 0 0>;
+			};
+		};
+	};
+};
+
+/include/ "fsl/t1042si-post.dtsi"
diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c 
b/arch/powerpc/platforms/85xx/corenet_generic.c
index c268f89..100e80d 100644

Re: [PATCH v1 15/21] Powerpc/MSI: Use MSI chip framework to configure MSI/MSI-X irq

2014-09-05 Thread Sergei Shtylyov

On 9/5/2014 3:33 PM, wangyijing wrote:


>>> Use MSI chip framework instead of arch MSI functions to configure
>>> MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

>>> Signed-off-by: Yijing Wang wangyij...@huawei.com
>>> ---
>>>   arch/powerpc/kernel/msi.c |   14 --
>>>   1 files changed, 12 insertions(+), 2 deletions(-)

>>> diff --git a/arch/powerpc/kernel/msi.c b/arch/powerpc/kernel/msi.c
>>> index 71bd161..01781a4 100644
>>> --- a/arch/powerpc/kernel/msi.c
>>> +++ b/arch/powerpc/kernel/msi.c

>> [...]

>>> @@ -27,7 +27,17 @@ int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
>>>   return ppc_md.setup_msi_irqs(dev, nvec, type);
>>>   }
>>>
>>> -void arch_teardown_msi_irqs(struct pci_dev *dev)
>>> +static void ppc_teardown_msi_irqs(struct pci_dev *dev)

>>    Shouldn't this function take IRQ # instead?

> This function need to teardown all msi irqs of the pci dev, we should pass the
> pci dev as argument.

   Ah, I've mixed up the teardown_irqs() method with teardown_irq()! Too
similar. :-)

> Thanks!
> Yijing.


WBR, Sergei
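Since the two method names are easy to confuse, here is a small user-space sketch of how they relate: teardown_irq() releases a single IRQ number, while teardown_irqs() takes the device and releases every vector it owns. The mock_ types are hypothetical stand-ins, and the loop is a simplification of the generic fallback in drivers/pci/msi.c (it ignores unassigned entries and nvec_used).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ENTRIES 4

/* Stand-in for one MSI descriptor: base irq and log2 of the vector count. */
struct mock_entry {
	unsigned int irq;
	unsigned int multiple;
};

/* Stand-in for struct pci_dev with its MSI descriptor list as an array. */
struct mock_dev {
	struct mock_entry entries[MAX_ENTRIES];
	int nr_entries;
};

static unsigned int torn_down;   /* counts per-irq teardown calls */

/* teardown_irq(): takes one IRQ number. */
static void mock_teardown_irq(unsigned int irq)
{
	(void)irq;
	torn_down++;
}

/* teardown_irqs(): takes the device and releases every vector it owns. */
static void mock_teardown_irqs(struct mock_dev *dev)
{
	int e;
	unsigned int i, nvec;

	assert(dev != NULL);
	for (e = 0; e < dev->nr_entries; e++) {
		nvec = 1u << dev->entries[e].multiple;
		for (i = 0; i < nvec; i++)
			mock_teardown_irq(dev->entries[e].irq + i);
	}
}
```

A device with one four-vector entry and one single-vector entry therefore results in five teardown_irq() calls from a single teardown_irqs() call.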


Re: [PATCH v2] deb-pkg: Add support for powerpc little endian

2014-09-05 Thread Thadeu Lima de Souza Cascardo
On Fri, Sep 05, 2014 at 05:55:18PM +1000, Michael Neuling wrote:
> On Fri, 2014-09-05 at 09:13 +0200, Gabriel Paubert wrote:
> > On Fri, Sep 05, 2014 at 03:28:47PM +1000, Michael Neuling wrote:
> > > The Debian powerpc little endian architecture is called ppc64le.  This
> >
> > Huh? ppc64le or ppc64el?
>
> ppc64el.  Commit message is wrong.  Fixed below.
>
> Mikey

What about ppc64?

Also, I sent that already a month ago. Both linuxppc-dev and Michal
Marek were on cc.

http://marc.info/?l=linux-kernel&m=140744360328562&w=2

Cascardo.

> From: Michael Neuling mi...@neuling.org
>
> deb-pkg: Add support for powerpc little endian
>
> The Debian powerpc little endian architecture is called ppc64el.  This
> is the default architecture used by Ubuntu for powerpc.
>
> The below checks the kernel config to see if we are compiling little
> endian and sets the Debian arch appropriately.
>
> Signed-off-by: Michael Neuling mi...@neuling.org
>
> diff --git a/scripts/package/builddeb b/scripts/package/builddeb
> index 35d5a58..6f4a1af 100644
> --- a/scripts/package/builddeb
> +++ b/scripts/package/builddeb
> @@ -37,7 +37,7 @@ create_package() {
>   s390*)
>   debarch=s390$(grep -q CONFIG_64BIT=y $KCONFIG_CONFIG && echo x || true) ;;
>   ppc*)
> - debarch=powerpc ;;
> + debarch=$(grep -q CPU_LITTLE_ENDIAN=y $KCONFIG_CONFIG && echo ppc64el || echo powerpc) ;;
>   parisc*)
>   debarch=hppa ;;
>   mips*)


Re: [PATCH v2] deb-pkg: Add support for powerpc little endian

2014-09-05 Thread Josh Boyer
On Fri, Sep 5, 2014 at 3:55 AM, Michael Neuling mi...@neuling.org wrote:
> On Fri, 2014-09-05 at 09:13 +0200, Gabriel Paubert wrote:
> > On Fri, Sep 05, 2014 at 03:28:47PM +1000, Michael Neuling wrote:
> > > The Debian powerpc little endian architecture is called ppc64le.  This
> >
> > Huh? ppc64le or ppc64el?
>
> ppc64el.  Commit message is wrong.  Fixed below.

Yay!  Just like every other architecture, we continue to have the deb
based distros call it one thing, and the RPM based distros call it
another.  At least we're consistent in our inconsistency.

josh

Re: bit fields data tearing

2014-09-05 Thread Peter Hurley
[ +cc linux-arm ]

Hi David,

On 09/05/2014 04:30 AM, David Laight wrote:
> I've seen gcc generate 32bit accesses for 16bit structure members on arm.
> It does this because of the more limited range of the offsets for the 16bit access.
> OTOH I don't know if it ever did this for writes - so it may be moot.

Can you recall the particulars, like what ARM config or what code?

I tried an overly-simple test to see if gcc would bump up to the word load for
the 12-bit offset mode, but it stuck with register offset rather than immediate
offset. [I used the compiler options for allmodconfig and a 4.8 cross-compiler.]

Maybe the test doesn't generate enough register pressure on the compiler?

Regards,
Peter Hurley

#define ARRAY_SIZE(x)  (sizeof(x)/sizeof((x)[0]))

struct x {
	long unused[64];
	short b[12];
	int unused2[10];
	short c;
};

void store_c(struct x *p, short a[]) {
	int i;

	for (i = 0; i < ARRAY_SIZE(p->b); i++)
		p->b[i] = a[i];
	p->c = 2;
}


void store_c(struct x *p, short a[]) {
   0:   e1a0c00d        mov     ip, sp
   4:   e3a03000        mov     r3, #0
   8:   e92dd800        push    {fp, ip, lr, pc}
   c:   e24cb004        sub     fp, ip, #4
	int i;

	for (i = 0; i < ARRAY_SIZE(p->b); i++)
		p->b[i] = a[i];
  10:   e191c0b3        ldrh    ip, [r1, r3]
  14:   e0802003        add     r2, r0, r3
  18:   e2822c01        add     r2, r2, #256    ; 0x100
  1c:   e2833002        add     r3, r3, #2
  20:   e3530018        cmp     r3, #24
  24:   e1c2c0b0        strh    ip, [r2]
  28:   1afffff8        bne     10 <store_c+0x10>
	p->c = 2;
  2c:   e3a03d05        mov     r3, #320        ; 0x140
  30:   e3a02002        mov     r2, #2
  34:   e18020b3        strh    r2, [r0, r3]
  38:   e89da800        ldm     sp, {fp, sp, pc}
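The constants 256 and 320 in the listing are simply the member offsets of b and c. The sketch below reproduces them on any host by mirroring the struct with fixed-width types, assuming the 32-bit ARM ABI where long and int are 4 bytes and short is 2 bytes. The two range helpers encode the classic ARM (A32) immediate limits as I understand them: 8 bits for halfword accesses and 12 bits for word accesses. The names x32, fits_halfword_imm() and fits_word_imm() are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Mirror of the test struct with fixed-width types, assuming the 32-bit
 * ARM ABI (long and int are 4 bytes, short is 2 bytes).
 */
struct x32 {
	int32_t unused[64];    /* 64 * 4 = 256 bytes */
	int16_t b[12];         /* starts at 256, 24 bytes long */
	int32_t unused2[10];   /* starts at 280, 40 bytes long */
	int16_t c;             /* starts at 320 */
};

static size_t offset_of_b(void) { return offsetof(struct x32, b); }
static size_t offset_of_c(void) { return offsetof(struct x32, c); }

/* A32 ldrh/strh carry an 8-bit immediate offset (assumed range 0..255). */
static int fits_halfword_imm(size_t off) { return off <= 255; }

/* A32 ldr/str carry a 12-bit immediate offset (assumed range 0..4095). */
static int fits_word_imm(size_t off) { return off <= 4095; }

/* Sanity check that the mirrored layout matches the listing's constants. */
static void check_layout(void)
{
	assert(offset_of_b() == 256);   /* add r2, r2, #256 */
	assert(offset_of_c() == 320);   /* mov r3, #320 */
}
```

Under these assumptions the offset of c is out of reach for a halfword immediate but within reach of a word immediate, which is the situation the test was meant to provoke; the compiler chose a register offset for the strh rather than widening the access.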


RE: bit fields data tearing

2014-09-05 Thread David Laight
From: Peter Hurley
> [ +cc linux-arm ]
>
> Hi David,
>
> On 09/05/2014 04:30 AM, David Laight wrote:
> > I've seen gcc generate 32bit accesses for 16bit structure members on arm.
> > It does this because of the more limited range of the offsets for the 16bit access.
> > OTOH I don't know if it ever did this for writes - so it may be moot.
>
> Can you recall the particulars, like what ARM config or what code?
>
> I tried an overly-simple test to see if gcc would bump up to the word load for
> the 12-bit offset mode, but it stuck with register offset rather than immediate
> offset. [I used the compiler options for allmodconfig and a 4.8 cross-compiler.]
>
> Maybe the test doesn't generate enough register pressure on the compiler?

Dunno, I would have been using a much older version of the compiler.
It is possible that it doesn't do it any more.
It might only have done it for loads.

The compiler used to use misaligned 32bit loads for structure
members on large 4n+2 byte boundaries as well.
I'm pretty sure it doesn't do that either.

There have been a lot of compiler versions since I was compiling
anything for arm.

David

> Regards,
> Peter Hurley
>
> #define ARRAY_SIZE(x)  (sizeof(x)/sizeof((x)[0]))
>
> struct x {
> 	long unused[64];
> 	short b[12];
> 	int unused2[10];
> 	short c;
> };
>
> void store_c(struct x *p, short a[]) {
> 	int i;
>
> 	for (i = 0; i < ARRAY_SIZE(p->b); i++)
> 		p->b[i] = a[i];
> 	p->c = 2;
> }
>
>
> void store_c(struct x *p, short a[]) {
>    0:   e1a0c00d        mov     ip, sp
>    4:   e3a03000        mov     r3, #0
>    8:   e92dd800        push    {fp, ip, lr, pc}
>    c:   e24cb004        sub     fp, ip, #4
> 	int i;
>
> 	for (i = 0; i < ARRAY_SIZE(p->b); i++)
> 		p->b[i] = a[i];
>   10:   e191c0b3        ldrh    ip, [r1, r3]
>   14:   e0802003        add     r2, r0, r3
>   18:   e2822c01        add     r2, r2, #256    ; 0x100
>   1c:   e2833002        add     r3, r3, #2
>   20:   e3530018        cmp     r3, #24
>   24:   e1c2c0b0        strh    ip, [r2]
>   28:   1afffff8        bne     10 <store_c+0x10>
> 	p->c = 2;
>   2c:   e3a03d05        mov     r3, #320        ; 0x140
>   30:   e3a02002        mov     r2, #2
>   34:   e18020b3        strh    r2, [r0, r3]
>   38:   e89da800        ldm     sp, {fp, sp, pc}


Re: [PATCH v2] iommu/fsl: Fix warning resulting from adding PCI device twice

2014-09-05 Thread Joerg Roedel
On Thu, Sep 04, 2014 at 05:08:45PM +0530, Varun Sethi wrote:
> iommu_group_get_for_dev determines the iommu group for the PCI device and adds
> the device to the group.
>
> In the PAMU driver we were again adding the device to the same group without
> checking if the device already had an iommu group. This resulted in the
> following warning.
>
> [...]
>
> Signed-off-by: Varun Sethi varun.se...@freescale.com
> ---
> v2 changes
> - directly check for the device iommu_group
>
>  drivers/iommu/fsl_pamu_domain.c |   10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)

Applied to iommu/fixes and added stable tag, thanks.
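The patch body is elided above, so the following is only a guess at the shape of the fix the commit message describes: skip the second add when the device already has a group. The mock_ types and mock_add_device_once() are hypothetical and are not the PAMU driver's code or the IOMMU core API.

```c
#include <assert.h>
#include <stddef.h>

struct mock_group { int nr_devices; };            /* stand-in for an iommu group */
struct mock_device { struct mock_group *group; }; /* stand-in for struct device */

/* Adds dev to group unless it already belongs to one; returns 1 if added. */
static int mock_add_device_once(struct mock_device *dev,
				struct mock_group *group)
{
	assert(dev != NULL && group != NULL);
	if (dev->group)
		return 0;   /* already grouped: a second add would trigger the warning */
	dev->group = group;
	group->nr_devices++;
	return 1;
}
```

Calling the helper twice for the same device leaves the group with a single member, which is the behaviour the v2 note ("directly check for the device iommu_group") points at.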


Re: [PATCH] pseries: Make CPU hotplug path endian safe

2014-09-05 Thread Nathan Fontenot
On 09/05/2014 04:16 AM, bharata@gmail.com wrote:
> From: Bharata B Rao bhar...@linux.vnet.ibm.com
>
> - ibm,rtas-configure-connector should treat the RTAS data as big endian.
> - Treat ibm,ppc-interrupt-server#s as big-endian when setting
>   smp_processor_id during hotplug.
>
> Signed-off-by: Bharata B Rao bhar...@linux.vnet.ibm.com
> ---
>  arch/powerpc/platforms/pseries/dlpar.c   | 10 +-
>  arch/powerpc/platforms/pseries/hotplug-cpu.c |  4 ++--
>  2 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
> index 2d0b4d6..dc55f9c 100644
> --- a/arch/powerpc/platforms/pseries/dlpar.c
> +++ b/arch/powerpc/platforms/pseries/dlpar.c
> @@ -48,11 +48,11 @@ static struct property *dlpar_parse_cc_property(struct cc_workarea *ccwa)
>  	if (!prop)
>  		return NULL;
>
> -	name = (char *)ccwa + ccwa->name_offset;
> +	name = (char *)ccwa + be32_to_cpu(ccwa->name_offset);
>  	prop->name = kstrdup(name, GFP_KERNEL);
>
> -	prop->length = ccwa->prop_length;
> -	value = (char *)ccwa + ccwa->prop_offset;
> +	prop->length = be32_to_cpu(ccwa->prop_length);
> +	value = (char *)ccwa + be32_to_cpu(ccwa->prop_offset);
>  	prop->value = kmemdup(value, prop->length, GFP_KERNEL);
>  	if (!prop->value) {
>  		dlpar_free_cc_property(prop);
> @@ -78,7 +78,7 @@ static struct device_node *dlpar_parse_cc_node(struct cc_workarea *ccwa,
>  	if (!dn)
>  		return NULL;
>
> -	name = (char *)ccwa + ccwa->name_offset;
> +	name = (char *)ccwa + be32_to_cpu(ccwa->name_offset);
>  	dn->full_name = kasprintf(GFP_KERNEL, "%s/%s", path, name);
>  	if (!dn->full_name) {
>  		kfree(dn);
> @@ -148,7 +148,7 @@ struct device_node *dlpar_configure_connector(u32 drc_index,
>  		return NULL;
>
>  	ccwa = (struct cc_workarea *)&data_buf[0];
> -	ccwa->drc_index = drc_index;
> +	ccwa->drc_index = cpu_to_be32(drc_index);
I need to look at this some more but I think this may cause an issue for
partition migration. If I am following the code correctly, starting in
pseries_devicetree_update(), the drc_index value passed to 
dlpar_configure_connector is pulled directly out of a buffer we get from
firmware. This would mean the drc_index value is already in BE format.

Whereas for cpu hotplug the drc_index value is passed in from userspace
via the cpu probe interface in sysfs. I assume that you are seeing the
drc_index value getting passed in in LE format.

-Nathan
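To make the concern concrete, here is a small host-independent sketch of the two conversions. mock_cpu_to_be32() and mock_be32_to_cpu() are portable stand-ins written for this note, not the kernel macros, and check_round_trip() is just a sanity helper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Portable stand-in for cpu_to_be32(): CPU value -> big-endian byte image. */
static uint32_t mock_cpu_to_be32(uint32_t v)
{
	unsigned char b[4] = {
		(unsigned char)(v >> 24), (unsigned char)(v >> 16),
		(unsigned char)(v >> 8),  (unsigned char)v
	};
	uint32_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* Portable stand-in for be32_to_cpu(): big-endian byte image -> CPU value. */
static uint32_t mock_be32_to_cpu(uint32_t v)
{
	unsigned char b[4];

	memcpy(b, &v, sizeof(b));
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Sanity helper: a CPU-order value must survive one round trip on any host. */
static void check_round_trip(uint32_t v)
{
	assert(mock_be32_to_cpu(mock_cpu_to_be32(v)) == v);
}
```

On a little-endian host, passing a value that is already a big-endian image through mock_cpu_to_be32() swaps the bytes a second time and hands firmware a little-endian image. That is the case raised above for a drc_index that comes straight out of a firmware buffer. On a big-endian host both helpers are the identity, so the problem never shows up there.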

>  	ccwa->zero = 0;
>
>  	do {
> diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> index 20d6297..447f8c6 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> @@ -247,7 +247,7 @@ static int pseries_add_processor(struct device_node *np)
>  	unsigned int cpu;
>  	cpumask_var_t candidate_mask, tmp;
>  	int err = -ENOSPC, len, nthreads, i;
> -	const u32 *intserv;
> +	const __be32 *intserv;
>
>  	intserv = of_get_property(np, "ibm,ppc-interrupt-server#s", &len);
>  	if (!intserv)
> @@ -293,7 +293,7 @@ static int pseries_add_processor(struct device_node *np)
>  	for_each_cpu(cpu, tmp) {
>  		BUG_ON(cpu_present(cpu));
>  		set_cpu_present(cpu, true);
> -		set_hard_smp_processor_id(cpu, *intserv++);
> +		set_hard_smp_processor_id(cpu, be32_to_cpu(*intserv++));
>  	}
>  	err = 0;
>  out_unlock:


Re: [Xen-devel] [PATCH v1 08/21] x86/xen/MSI: Use MSI chip framework to configure MSI/MSI-X irq

2014-09-05 Thread David Vrabel
On 05/09/14 11:09, Yijing Wang wrote:
> Use MSI chip framework instead of arch MSI functions to configure
> MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.
[...]
> --- a/arch/x86/pci/xen.c
> +++ b/arch/x86/pci/xen.c
[...]
> @@ -418,9 +430,9 @@ int __init pci_xen_init(void)
>  #endif
>
>  #ifdef CONFIG_PCI_MSI
> -	x86_msi.setup_msi_irqs = xen_setup_msi_irqs;
> -	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
> -	x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
> +	xen_msi_chip.setup_irqs = xen_setup_msi_irqs;
> +	xen_msi_chip.teardown_irqs = xen_teardown_msi_irqs;
> +	x86_msi_chip = xen_msi_chip;
>  	msi_chip.irq_mask = xen_nop_msi_mask;
>  	msi_chip.irq_unmask = xen_nop_msi_mask;

Why have these not been changed to set the x86_msi_chip.mask/unmask
fields instead?

David

Re: bit fields data tearing

2014-09-05 Thread Peter Hurley
On 09/04/2014 10:08 PM, H. Peter Anvin wrote:
> On 09/04/2014 05:59 PM, Peter Hurley wrote:
>> I have no idea how prevalent the ev56 is compared to the ev5.
>> Still we're talking about a chip that came out in 1996.
>
> Ah yes, I stand corrected.  According to Wikipedia, the affected CPUs
> were all the 2106x CPUs (EV4, EV45, LCA4, LCA45) plus the 21164 with no
> suffix (EV5).  However, we're still talking about museum pieces here.
>
> I wonder what the one I have in my garage is... I'm sure I could emulate
> it faster, though.

Which is a bit ironic because I remember when Digital had a team
working on emulating native x86 apps on Alpha/NT.


Re: [PATCH] pseries: Make CPU hotplug path endian safe

2014-09-05 Thread Bharata B Rao
On Fri, Sep 5, 2014 at 7:38 PM, Nathan Fontenot
nf...@linux.vnet.ibm.com wrote:
 On 09/05/2014 04:16 AM, bharata@gmail.com wrote:
 From: Bharata B Rao bhar...@linux.vnet.ibm.com

 - ibm,rtas-configure-connector should treat the RTAS data as big endian.
 - Treat ibm,ppc-interrupt-server#s as big-endian when setting
   smp_processor_id during hotplug.

 Signed-off-by: Bharata B Rao bhar...@linux.vnet.ibm.com
 ---
  arch/powerpc/platforms/pseries/dlpar.c   | 10 +-
  arch/powerpc/platforms/pseries/hotplug-cpu.c |  4 ++--
  2 files changed, 7 insertions(+), 7 deletions(-)

 diff --git a/arch/powerpc/platforms/pseries/dlpar.c 
 b/arch/powerpc/platforms/pseries/dlpar.c
 index 2d0b4d6..dc55f9c 100644
 --- a/arch/powerpc/platforms/pseries/dlpar.c
 +++ b/arch/powerpc/platforms/pseries/dlpar.c
 @@ -48,11 +48,11 @@ static struct property *dlpar_parse_cc_property(struct 
 cc_workarea *ccwa)
   if (!prop)
   return NULL;

 - name = (char *)ccwa + ccwa->name_offset;
 + name = (char *)ccwa + be32_to_cpu(ccwa->name_offset);
   prop->name = kstrdup(name, GFP_KERNEL);

 - prop->length = ccwa->prop_length;
 - value = (char *)ccwa + ccwa->prop_offset;
 + prop->length = be32_to_cpu(ccwa->prop_length);
 + value = (char *)ccwa + be32_to_cpu(ccwa->prop_offset);
   prop->value = kmemdup(value, prop->length, GFP_KERNEL);
   if (!prop->value) {
   dlpar_free_cc_property(prop);
 @@ -78,7 +78,7 @@ static struct device_node *dlpar_parse_cc_node(struct 
 cc_workarea *ccwa,
   if (!dn)
   return NULL;

 - name = (char *)ccwa + ccwa->name_offset;
 + name = (char *)ccwa + be32_to_cpu(ccwa->name_offset);
   dn->full_name = kasprintf(GFP_KERNEL, "%s/%s", path, name);
   if (!dn->full_name) {
   kfree(dn);
 @@ -148,7 +148,7 @@ struct device_node *dlpar_configure_connector(u32 
 drc_index,
   return NULL;

   ccwa = (struct cc_workarea *)&data_buf[0];
 - ccwa->drc_index = drc_index;
 + ccwa->drc_index = cpu_to_be32(drc_index);

 I need to look at this some more but I think this may cause an issue for
 partition migration. If I am following the code correctly, starting in
 pseries_devicetree_update(), the drc_index value passed to
 dlpar_configure_connector is pulled directly out of a buffer we get from
 firmware. This would mean the drc_index value is already in BE format.

Yes I see that now.


 Whereas for cpu hotplug the drc_index value is passed in from userspace
 via the cpu probe interface in sysfs. I assume that you are seeing the
 drc_index value getting passed in in LE format.

Yes I am seeing drc_index in LE format for an LE guest during CPU
hotplug operation.

Regards,
Bharata.
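
The concern raised above is that one caller may hand dlpar_configure_connector() a drc_index already in big-endian form while another hands it a CPU-order value, so a single unconditional cpu_to_be32() cannot be right for both. A minimal sketch of that arithmetic, modelling a little-endian host where both conversions are a byte swap (all function names and the sample index are hypothetical, and the "converted" variant is only one possible remedy, not necessarily what was merged):

```c
#include <assert.h>
#include <stdint.h>

/* Unconditional 32-bit byte swap. */
uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Model of a little-endian host: both conversions are a byte swap. */
uint32_t le_cpu_to_be32(uint32_t v) { return swap32(v); }
uint32_t le_be32_to_cpu(uint32_t v) { return swap32(v); }

/* Hotplug path: the index arrives from userspace in CPU byte order. */
uint32_t workarea_from_sysfs(uint32_t cpu_order)
{
	return le_cpu_to_be32(cpu_order);
}

/* Migration path as feared: the value is already big-endian and gets
 * swapped a second time, so the work area ends up in CPU order. */
uint32_t workarea_from_fw_raw(uint32_t be_raw)
{
	return le_cpu_to_be32(be_raw);
}

/* Possible remedy: convert to CPU order where the firmware buffer is
 * read, so the callee converts exactly once. */
uint32_t workarea_from_fw_converted(uint32_t be_raw)
{
	return le_cpu_to_be32(le_be32_to_cpu(be_raw));
}
```

With a CPU-order index of 0x10000002, the sysfs path and the converted firmware path both place 0x02000010 (the LE-host view of the big-endian bytes) in the work area, while the raw firmware path places 0x10000002 there.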

Re: bit fields data tearing

2014-09-05 Thread H. Peter Anvin
On 09/05/2014 08:31 AM, Peter Hurley wrote:
 
 Which is a bit ironic because I remember when Digital had a team
 working on emulating native x86 apps on Alpha/NT.
 

Right, because the x86 architecture was obsolete and would never scale...

-hpa


Re: bit fields data tearing

2014-09-05 Thread Peter Hurley
On 09/05/2014 08:37 AM, David Laight wrote:
 From: Peter Hurley
 On 09/05/2014 04:30 AM, David Laight wrote:
 I've seen gcc generate 32bit accesses for 16bit structure members on arm.
 It does this because of the more limited range of the offsets for the 16bit 
 access.
 OTOH I don't know if it ever did this for writes - so it may be moot.

 Can you recall the particulars, like what ARM config or what code?

 I tried an overly-simple test to see if gcc would bump up to the word load 
 for
 the 12-bit offset mode, but it stuck with register offset rather than 
 immediate
 offset. [I used the compiler options for allmodconfig and a 4.8 
 cross-compiler.]

 Maybe the test doesn't generate enough register pressure on the compiler?
 
 Dunno, I would have been using a much older version of the compiler.
 It is possible that it doesn't do it any more.
 It might only have done it for loads.
 
 The compiler used to use misaligned 32bit loads for structure
 members on large 4n+2 byte boundaries as well.
 I'm pretty sure it doesn't do that either.
 
 There have been a lot of compiler versions since I was compiling
 anything for arm.

Yeah, it seems gcc for ARM no longer uses the larger operand size as a
substitute for 12-bit immediate offset addressing mode, even for reads.

While this test:

struct x {
	short b[12];
};

short load_b(struct x *p) {
	return p->b[8];
}

generates the 8-bit immediate offset form,

short load_b(struct x *p) {
   0:   e1d001f0        ldrsh   r0, [r0, #16]
   4:   e12fff1e        bx      lr


pushing the offset out past 256:

struct x {
	long unused[64];
	short b[12];
};

short load_b(struct x *p) {
	return p->b[8];
}

generates the register offset addressing mode instead of 12-bit immediate:

short load_b(struct x *p) {
   0:   e3a03e11        mov     r3, #272        ; 0x110
   4:   e19000f3        ldrsh   r0, [r0, r3]
   8:   e12fff1e        bx      lr

Regards,
Peter Hurley

[Note: I compiled without the frame pointer to simplify the code generation]
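
The two offsets in the listings above (#16 and #272) can be reproduced with offsetof(). This is a small sketch under two assumptions: int32_t stands in for the 4-byte long of a 32-bit ARM target, and the halfword load's immediate field is taken to be 8 bits wide as described above. The names x_near, x_far and fits_imm8 are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First test case: b[] sits at the start of the structure. */
struct x_near {
	int16_t b[12];
};

/* Second test case: 64 four-byte words push b[] out to offset 256. */
struct x_far {
	int32_t unused[64];
	int16_t b[12];
};

/* Byte offset of b[8] in each layout. */
size_t off_near(void)
{
	return offsetof(struct x_near, b) + 8 * sizeof(int16_t);
}

size_t off_far(void)
{
	return offsetof(struct x_far, b) + 8 * sizeof(int16_t);
}

/* Does the offset fit an 8-bit unsigned immediate (0..255)? */
int fits_imm8(size_t off)
{
	return off <= 255;
}
```

off_near() is 16 and fits the immediate form; off_far() is 272 and does not, which matches the compiler materialising the offset in r3.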

Re: bit fields data tearing

2014-09-05 Thread Paul E. McKenney
On Fri, Sep 05, 2014 at 08:16:48PM +1200, Michael Cree wrote:
 On Thu, Sep 04, 2014 at 07:08:48PM -0700, H. Peter Anvin wrote:
  On 09/04/2014 05:59 PM, Peter Hurley wrote:
   I have no idea how prevalent the ev56 is compared to the ev5.
   Still we're talking about a chip that came out in 1996.
  
  Ah yes, I stand corrected.  According to Wikipedia, the affected CPUs
  were all the 2106x CPUs (EV4, EV45, LCA4, LCA45) plus the 21164 with no
  suffix (EV5).  However, we're still talking about museum pieces here.
 
 Yes, that is correct, EV56 is the first Alpha CPU to have the byte-word
 extension (BWX) CPU instructions.
 
 It would not worry me if the kernel decided to assume atomic aligned
 scalar accesses for all arches, thus terminating support for Alphas
 without BWX.
 
 The X server, ever since the libpciaccess change, does not work on
 Alphas without BWX.
 
 Debian Alpha (pretty much up to date at Debian-Ports) is still compiled
 for all Alphas, i.e., without BWX.  The last attempt to start compiling
 Debian Alpha with BWX, about three years ago when Alpha was kicked out
 to Debian-Ports resulted in a couple or so complaints so got nowhere.
 It's frustrating supporting the lowest common denominator as many of
 the bugs specific to Alpha can be resolved by recompiling with the BWX.
 The kernel no longer supporting Alphas without BWX might just be the
 incentive we need to switch Debian Alpha to compiling with BWX.

Very good, then I update my patch as follows.  Thoughts?

Thanx, Paul



documentation: Record limitations of bitfields and small variables

This commit documents the fact that it is not safe to use bitfields as
shared variables in synchronization algorithms.  It also documents that
CPUs must provide one-byte and two-byte load and store instructions
in order to be supported by the Linux kernel.  (Michael Cree
has agreed to the resulting non-support of pre-EV56 Alpha CPUs:
https://lkml.org/lkml/2014/9/5/143.)

Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com

diff --git a/Documentation/memory-barriers.txt 
b/Documentation/memory-barriers.txt
index 87be0a8a78de..455df6b298f7 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -269,6 +269,30 @@ And there are a number of things that _must_ or _must_not_ 
be assumed:
STORE *(A + 4) = Y; STORE *A = X;
STORE {*A, *(A + 4) } = {X, Y};
 
+And there are anti-guarantees:
+
+ (*) These guarantees do not apply to bitfields, because compilers often
+ generate code to modify these using non-atomic read-modify-write
+ sequences.  Do not attempt to use bitfields to synchronize parallel
+ algorithms.
+
+ (*) Even in cases where bitfields are protected by locks, all fields
+ in a given bitfield must be protected by one lock.  If two fields
+ in a given bitfield are protected by different locks, the compiler's
+ non-atomic read-modify-write sequences can cause an update to one
+ field to corrupt the value of an adjacent field.
+
+ (*) These guarantees apply only to properly aligned and sized scalar
+ variables.  Properly sized currently means variables that are the
+ same size as char, short, int and long.  Properly aligned
+ means the natural alignment, thus no constraints for char,
+ two-byte alignment for short, four-byte alignment for int,
+ and either four-byte or eight-byte alignment for long, on 32-bit
+ and 64-bit systems, respectively.  Note that this means that the
+ Linux kernel does not support pre-EV56 Alpha CPUs, because these
+ older CPUs do not provide one-byte and two-byte loads and stores.
+ Alpha EV56 and later Alpha CPUs are still supported.
+
 
 =
 WHAT ARE MEMORY BARRIERS?
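
The second anti-guarantee in the patch above (adjacent bitfields under different locks) can be illustrated without real threads by replaying one bad interleaving by hand. This is a deterministic sketch, not kernel code: the 4-bit field layout and every function name are assumptions chosen for illustration, and the "CPUs" are just local variables.

```c
#include <assert.h>
#include <stdint.h>

/* Two 4-bit fields sharing one storage unit, roughly what a compiler
 * might produce for: struct { unsigned lo:4; unsigned hi:4; };  */
#define LO_MASK 0x0fu
#define HI_MASK 0xf0u

/* Each field update is a read-modify-write of the whole unit. */
uint8_t set_lo(uint8_t unit, uint8_t v)
{
	return (uint8_t)((unit & ~LO_MASK) | (v & 0x0fu));
}

uint8_t set_hi(uint8_t unit, uint8_t v)
{
	return (uint8_t)((unit & ~HI_MASK) | ((v & 0x0fu) << 4));
}

/* Bad interleaving: both CPUs load before either stores. */
uint8_t interleaved_update(uint8_t shared, uint8_t lo, uint8_t hi)
{
	uint8_t cpu0 = shared;      /* CPU0 (holds lock A): load        */
	uint8_t cpu1 = shared;      /* CPU1 (holds lock B): load        */

	shared = set_lo(cpu0, lo);  /* CPU0: modify and store           */
	shared = set_hi(cpu1, hi);  /* CPU1: stores a stale copy of lo  */
	return shared;
}

/* Same updates under one common lock: nothing is lost. */
uint8_t serialized_update(uint8_t shared, uint8_t lo, uint8_t hi)
{
	shared = set_lo(shared, lo);
	shared = set_hi(shared, hi);
	return shared;
}
```

Starting from 0 with lo = 0x5 and hi = 0xA, the interleaved version ends at 0xA0 (the update to lo is lost) while the serialized version ends at 0xA5.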


Re: bit fields data tearing

2014-09-05 Thread Paul E. McKenney
On Fri, Sep 05, 2014 at 11:09:50AM -0700, Paul E. McKenney wrote:
 On Fri, Sep 05, 2014 at 08:16:48PM +1200, Michael Cree wrote:
  On Thu, Sep 04, 2014 at 07:08:48PM -0700, H. Peter Anvin wrote:
   On 09/04/2014 05:59 PM, Peter Hurley wrote:
I have no idea how prevalent the ev56 is compared to the ev5.
Still we're talking about a chip that came out in 1996.
   
   Ah yes, I stand corrected.  According to Wikipedia, the affected CPUs
   were all the 2106x CPUs (EV4, EV45, LCA4, LCA45) plus the 21164 with no
   suffix (EV5).  However, we're still talking about museum pieces here.
  
  Yes, that is correct, EV56 is the first Alpha CPU to have the byte-word
  extension (BWX) CPU instructions.
  
  It would not worry me if the kernel decided to assume atomic aligned
  scalar accesses for all arches, thus terminating support for Alphas
  without BWX.
  
  The X server, ever since the libpciaccess change, does not work on
  Alphas without BWX.
  
  Debian Alpha (pretty much up to date at Debian-Ports) is still compiled
  for all Alphas, i.e., without BWX.  The last attempt to start compiling
  Debian Alpha with BWX, about three years ago when Alpha was kicked out
  to Debian-Ports resulted in a couple or so complaints so got nowhere.
  It's frustrating supporting the lowest common denominator as many of
  the bugs specific to Alpha can be resolved by recompiling with the BWX.
  The kernel no longer supporting Alphas without BWX might just be the
  incentive we need to switch Debian Alpha to compiling with BWX.
 
 Very good, then I update my patch as follows.  Thoughts?

And, while I am at it, fix smp_load_acquire() and smp_store_release()
to allow single-byte and double-byte accesses.  (Adding Peter Zijlstra
on CC.)

Thanx, Paul



compiler: Allow 1- and 2-byte smp_load_acquire() and smp_store_release()

CPUs without single-byte and double-byte loads and stores place some
interesting requirements on concurrent code.  For example (adapted
from Peter Hurley's test code), suppose we have the following structure:

	struct foo {
		spinlock_t lock1;
		spinlock_t lock2;
		char a; /* Protected by lock1. */
		char b; /* Protected by lock2. */
	};
	struct foo *foop;

Of course, it is common (and good) practice to place data protected
by different locks in separate cache lines.  However, if the locks are
rarely acquired (for example, only in rare error cases), and there are
a great many instances of the data structure, then memory footprint can
trump false-sharing concerns, so that it can be better to place them in
the same cache line as above.

But if the CPU does not support single-byte loads and stores, a store
to foop->a will do a non-atomic read-modify-write operation on foop->b,
which will come as a nasty surprise to someone holding foop->lock2.  So we
now require CPUs to support single-byte and double-byte loads and stores.
Therefore, this commit adjusts the definition of __native_word() to allow
these sizes to be used by smp_load_acquire() and smp_store_release().

Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com
Cc: Peter Zijlstra pet...@infradead.org

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index d5ad7b1118fc..934a834ab9f9 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -311,7 +311,7 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int 
val, int expect);
 
 /* Is this type a native word size -- useful for atomic operations */
 #ifndef __native_word
-# define __native_word(t) (sizeof(t) == sizeof(int) || sizeof(t) == 
sizeof(long))
+# define __native_word(t) (sizeof(t) == sizeof(char) || sizeof(t) == 
sizeof(short) || sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
 #endif
 
 /* Compile time object size, -1 for unknown */
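
The effect of the one-line change above can be checked in isolation. The sketch below copies the two macro bodies under the hypothetical names native_word_old and native_word_new (the kernel macro is __native_word in both cases) and assumes a typical host where char and short are narrower than int; struct three_bytes is an invented example of a size that neither version accepts.

```c
#include <assert.h>

/* Definition before the patch: only int- and long-sized types pass. */
#define native_word_old(t) \
	(sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))

/* Definition after the patch: char- and short-sized types pass too. */
#define native_word_new(t) \
	(sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \
	 sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))

/* A 3-byte object matches none of the four scalar sizes. */
struct three_bytes {
	char c[3];
};
```

With these definitions native_word_new(char) and native_word_new(short) are true where the old form rejected them, and a 3-byte structure is still rejected.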


Re: bit fields data tearing

2014-09-05 Thread Peter Hurley
On 09/05/2014 02:09 PM, Paul E. McKenney wrote:
 On Fri, Sep 05, 2014 at 08:16:48PM +1200, Michael Cree wrote:
 On Thu, Sep 04, 2014 at 07:08:48PM -0700, H. Peter Anvin wrote:
 On 09/04/2014 05:59 PM, Peter Hurley wrote:
 I have no idea how prevalent the ev56 is compared to the ev5.
 Still we're talking about a chip that came out in 1996.

 Ah yes, I stand corrected.  According to Wikipedia, the affected CPUs
 were all the 2106x CPUs (EV4, EV45, LCA4, LCA45) plus the 21164 with no
 suffix (EV5).  However, we're still talking about museum pieces here.

 Yes, that is correct, EV56 is the first Alpha CPU to have the byte-word
 extension (BWX) CPU instructions.

 It would not worry me if the kernel decided to assume atomic aligned
 scalar accesses for all arches, thus terminating support for Alphas
 without BWX.

 The X server, ever since the libpciaccess change, does not work on
 Alphas without BWX.

 Debian Alpha (pretty much up to date at Debian-Ports) is still compiled
 for all Alphas, i.e., without BWX.  The last attempt to start compiling
 Debian Alpha with BWX, about three years ago when Alpha was kicked out
 to Debian-Ports resulted in a couple or so complaints so got nowhere.
 It's frustrating supporting the lowest common denominator as many of
 the bugs specific to Alpha can be resolved by recompiling with the BWX.
 The kernel no longer supporting Alphas without BWX might just be the
 incentive we need to switch Debian Alpha to compiling with BWX.
 
 Very good, then I update my patch as follows.  Thoughts?
 
   Thanx, Paul

Minor [optional] edits.

Thanks,
Peter Hurley

 
 
 documentation: Record limitations of bitfields and small variables
 
 This commit documents the fact that it is not safe to use bitfields as
 shared variables in synchronization algorithms.  It also documents that
 CPUs must provide one-byte and two-byte load and store instructions
   ^
atomic
 in order to be supported by the Linux kernel.  (Michael Cree
 has agreed to the resulting non-support of pre-EV56 Alpha CPUs:
 https://lkml.org/lkml/2014/9/5/143.
 
 Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com
 
 diff --git a/Documentation/memory-barriers.txt 
 b/Documentation/memory-barriers.txt
 index 87be0a8a78de..455df6b298f7 100644
 --- a/Documentation/memory-barriers.txt
 +++ b/Documentation/memory-barriers.txt
 @@ -269,6 +269,30 @@ And there are a number of things that _must_ or 
 _must_not_ be assumed:
   STORE *(A + 4) = Y; STORE *A = X;
   STORE {*A, *(A + 4) } = {X, Y};
  
 +And there are anti-guarantees:
 +
 + (*) These guarantees do not apply to bitfields, because compilers often
 + generate code to modify these using non-atomic read-modify-write
 + sequences.  Do not attempt to use bitfields to synchronize parallel
 + algorithms.
 +
 + (*) Even in cases where bitfields are protected by locks, all fields
 + in a given bitfield must be protected by one lock.  If two fields
 + in a given bitfield are protected by different locks, the compiler's
 + non-atomic read-modify-write sequences can cause an update to one
 + field to corrupt the value of an adjacent field.
 +
 + (*) These guarantees apply only to properly aligned and sized scalar
 + variables.  Properly sized currently means variables that are the
 + same size as char, short, int and long.  Properly aligned
 + means the natural alignment, thus no constraints for char,
 + two-byte alignment for short, four-byte alignment for int,
 + and either four-byte or eight-byte alignment for long, on 32-bit
 + and 64-bit systems, respectively.  Note that this means that the
 + Linux kernel does not support pre-EV56 Alpha CPUs, because these
 + older CPUs do not provide one-byte and two-byte loads and stores.
 ^
non-atomic
 + Alpha EV56 and later Alpha CPUs are still supported.
 +
  
  =
  WHAT ARE MEMORY BARRIERS?
 


Re: bit fields data tearing

2014-09-05 Thread Paul E. McKenney
On Fri, Sep 05, 2014 at 02:50:31PM -0400, Peter Hurley wrote:
 On 09/05/2014 02:09 PM, Paul E. McKenney wrote:
  On Fri, Sep 05, 2014 at 08:16:48PM +1200, Michael Cree wrote:
  On Thu, Sep 04, 2014 at 07:08:48PM -0700, H. Peter Anvin wrote:
  On 09/04/2014 05:59 PM, Peter Hurley wrote:
  I have no idea how prevalent the ev56 is compared to the ev5.
  Still we're talking about a chip that came out in 1996.
 
  Ah yes, I stand corrected.  According to Wikipedia, the affected CPUs
  were all the 2106x CPUs (EV4, EV45, LCA4, LCA45) plus the 21164 with no
  suffix (EV5).  However, we're still talking about museum pieces here.
 
  Yes, that is correct, EV56 is the first Alpha CPU to have the byte-word
  extension (BWX) CPU instructions.
 
  It would not worry me if the kernel decided to assume atomic aligned
  scalar accesses for all arches, thus terminating support for Alphas
  without BWX.
 
  The X server, ever since the libpciaccess change, does not work on
  Alphas without BWX.
 
  Debian Alpha (pretty much up to date at Debian-Ports) is still compiled
  for all Alphas, i.e., without BWX.  The last attempt to start compiling
  Debian Alpha with BWX, about three years ago when Alpha was kicked out
  to Debian-Ports resulted in a couple or so complaints so got nowhere.
  It's frustrating supporting the lowest common denominator as many of
  the bugs specific to Alpha can be resolved by recompiling with the BWX.
  The kernel no longer supporting Alphas without BWX might just be the
  incentive we need to switch Debian Alpha to compiling with BWX.
  
  Very good, then I update my patch as follows.  Thoughts?
  
  Thanx, Paul
 
 Minor [optional] edits.
 
 Thanks,
 Peter Hurley
 
  
  
  documentation: Record limitations of bitfields and small variables
  
  This commit documents the fact that it is not safe to use bitfields as
  shared variables in synchronization algorithms.  It also documents that
  CPUs must provide one-byte and two-byte load and store instructions
^
 atomic

Here you meant non-atomic?  My guess is that you are referring to the
fact that you could emulate a one-byte store on pre-EV56 Alpha CPUs
using the ll and sc atomic-read-modify-write instructions, correct?

  in order to be supported by the Linux kernel.  (Michael Cree
  has agreed to the resulting non-support of pre-EV56 Alpha CPUs:
  https://lkml.org/lkml/2014/9/5/143.
  
  Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com
  
  diff --git a/Documentation/memory-barriers.txt 
  b/Documentation/memory-barriers.txt
  index 87be0a8a78de..455df6b298f7 100644
  --- a/Documentation/memory-barriers.txt
  +++ b/Documentation/memory-barriers.txt
  @@ -269,6 +269,30 @@ And there are a number of things that _must_ or 
  _must_not_ be assumed:
  STORE *(A + 4) = Y; STORE *A = X;
  STORE {*A, *(A + 4) } = {X, Y};
   
  +And there are anti-guarantees:
  +
  + (*) These guarantees do not apply to bitfields, because compilers often
  + generate code to modify these using non-atomic read-modify-write
  + sequences.  Do not attempt to use bitfields to synchronize parallel
  + algorithms.
  +
  + (*) Even in cases where bitfields are protected by locks, all fields
  + in a given bitfield must be protected by one lock.  If two fields
  + in a given bitfield are protected by different locks, the compiler's
  + non-atomic read-modify-write sequences can cause an update to one
  + field to corrupt the value of an adjacent field.
  +
  + (*) These guarantees apply only to properly aligned and sized scalar
  + variables.  Properly sized currently means variables that are the
  + same size as char, short, int and long.  Properly aligned
  + means the natural alignment, thus no constraints for char,
  + two-byte alignment for short, four-byte alignment for int,
  + and either four-byte or eight-byte alignment for long, on 32-bit
  + and 64-bit systems, respectively.  Note that this means that the
  + Linux kernel does not support pre-EV56 Alpha CPUs, because these
  + older CPUs do not provide one-byte and two-byte loads and stores.
  ^
 non-atomic

I took this, thank you!

Thanx, Paul

  + Alpha EV56 and later Alpha CPUs are still supported.
  +
   
   =
   WHAT ARE MEMORY BARRIERS?
  
 


[PATCH] pseries: Fix endianness in cpu hotplug and hotremove

2014-09-05 Thread Thomas Falcon
This patch attempts to ensure that all values are in the proper
endianness format when both hotadding and hotremoving cpus.

Signed-off-by: Thomas Falcon tlfal...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/pseries/dlpar.c   | 56 ++--
 arch/powerpc/platforms/pseries/hotplug-cpu.c | 20 +-
 2 files changed, 38 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/dlpar.c 
b/arch/powerpc/platforms/pseries/dlpar.c
index a2450b8..c1d7e40 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -24,11 +24,11 @@
 #include <asm/rtas.h>
 
 struct cc_workarea {
-   u32 drc_index;
-   u32 zero;
-   u32 name_offset;
-   u32 prop_length;
-   u32 prop_offset;
+   __be32  drc_index;
+   __be32  zero;
+   __be32  name_offset;
+   __be32  prop_length;
+   __be32  prop_offset;
 };
 
 void dlpar_free_cc_property(struct property *prop)
@@ -48,11 +48,11 @@ static struct property *dlpar_parse_cc_property(struct 
cc_workarea *ccwa)
if (!prop)
return NULL;
 
-   name = (char *)ccwa + ccwa->name_offset;
+   name = (char *)ccwa + be32_to_cpu(ccwa->name_offset);
    prop->name = kstrdup(name, GFP_KERNEL);
 
-   prop->length = ccwa->prop_length;
-   value = (char *)ccwa + ccwa->prop_offset;
+   prop->length = be32_to_cpu(ccwa->prop_length);
+   value = (char *)ccwa + be32_to_cpu(ccwa->prop_offset);
    prop->value = kmemdup(value, prop->length, GFP_KERNEL);
    if (!prop->value) {
dlpar_free_cc_property(prop);
@@ -78,7 +78,7 @@ static struct device_node *dlpar_parse_cc_node(struct 
cc_workarea *ccwa,
if (!dn)
return NULL;
 
-   name = (char *)ccwa + ccwa->name_offset;
+   name = (char *)ccwa + be32_to_cpu(ccwa->name_offset);
    dn->full_name = kasprintf(GFP_KERNEL, "%s/%s", path, name);
    if (!dn->full_name) {
kfree(dn);
@@ -148,7 +148,7 @@ struct device_node *dlpar_configure_connector(u32 drc_index,
return NULL;
 
    ccwa = (struct cc_workarea *)&data_buf[0];
-   ccwa->drc_index = drc_index;
+   ccwa->drc_index = cpu_to_be32(drc_index);
    ccwa->zero = 0;
 
do {
@@ -363,10 +363,10 @@ static int dlpar_online_cpu(struct device_node *dn)
int rc = 0;
unsigned int cpu;
int len, nthreads, i;
-   const u32 *intserv;
+   const __be32 *intserv_be;
 
-   intserv = of_get_property(dn, "ibm,ppc-interrupt-server#s", &len);
-   if (!intserv)
+   intserv_be = of_get_property(dn, "ibm,ppc-interrupt-server#s", &len);
+   if (!intserv_be)
return -EINVAL;
 
nthreads = len / sizeof(u32);
@@ -374,7 +374,7 @@ static int dlpar_online_cpu(struct device_node *dn)
cpu_maps_update_begin();
    for (i = 0; i < nthreads; i++) {
for_each_present_cpu(cpu) {
-   if (get_hard_smp_processor_id(cpu) != intserv[i])
+   if (get_hard_smp_processor_id(cpu) != 
be32_to_cpu(intserv_be[i]))
continue;
BUG_ON(get_cpu_current_state(cpu)
!= CPU_STATE_OFFLINE);
@@ -388,7 +388,7 @@ static int dlpar_online_cpu(struct device_node *dn)
}
if (cpu == num_possible_cpus())
    printk(KERN_WARNING "Could not find cpu to online "
-  "with physical id 0x%x\n", intserv[i]);
+  "with physical id 0x%x\n", 
be32_to_cpu(intserv_be[i]));
}
cpu_maps_update_done();
 
@@ -442,18 +442,17 @@ static int dlpar_offline_cpu(struct device_node *dn)
int rc = 0;
unsigned int cpu;
int len, nthreads, i;
-   const u32 *intserv;
+   const __be32 *intserv_be;
 
-   intserv = of_get_property(dn, "ibm,ppc-interrupt-server#s", &len);
-   if (!intserv)
+   intserv_be = of_get_property(dn, "ibm,ppc-interrupt-server#s", &len);
+   if (!intserv_be)
return -EINVAL;
 
nthreads = len / sizeof(u32);
-
cpu_maps_update_begin();
    for (i = 0; i < nthreads; i++) {
for_each_present_cpu(cpu) {
-   if (get_hard_smp_processor_id(cpu) != intserv[i])
+   if (get_hard_smp_processor_id(cpu) != 
be32_to_cpu(intserv_be[i]))
continue;
 
if (get_cpu_current_state(cpu) == CPU_STATE_OFFLINE)
@@ -469,20 +468,19 @@ static int dlpar_offline_cpu(struct device_node *dn)
break;
 
}
-
/*
 * The cpu is in CPU_STATE_INACTIVE.
 * Upgrade it's state to CPU_STATE_OFFLINE.
 */
set_preferred_offline_state(cpu, CPU_STATE_OFFLINE);
-

Re: bit fields data tearing

2014-09-05 Thread Peter Hurley
On 09/05/2014 03:05 PM, Paul E. McKenney wrote:
 On Fri, Sep 05, 2014 at 02:50:31PM -0400, Peter Hurley wrote:
 On 09/05/2014 02:09 PM, Paul E. McKenney wrote:

[cut]

 

 documentation: Record limitations of bitfields and small variables

 This commit documents the fact that it is not safe to use bitfields as
 shared variables in synchronization algorithms.  It also documents that
 CPUs must provide one-byte and two-byte load and store instructions
^
 atomic
 
 Here you meant non-atomic?  My guess is that you are referring to the
 fact that you could emulate a one-byte store on pre-EV56 Alpha CPUs
 using the ll and sc atomic-read-modify-write instructions, correct?

Yes, that's what I meant. I must be tired and am misreading the commit
message, or misinterpreting its meaning.


Re: bit fields data tearing

2014-09-05 Thread Peter Zijlstra
On Fri, Sep 05, 2014 at 11:31:09AM -0700, Paul E. McKenney wrote:
 compiler: Allow 1- and 2-byte smp_load_acquire() and smp_store_release()
 
 CPUs without single-byte and double-byte loads and stores place some
 interesting requirements on concurrent code.  For example (adapted
 from Peter Hurley's test code), suppose we have the following structure:
 
   struct foo {
   spinlock_t lock1;
   spinlock_t lock2;
   char a; /* Protected by lock1. */
   char b; /* Protected by lock2. */
   };
   struct foo *foop;
 
 Of course, it is common (and good) practice to place data protected
 by different locks in separate cache lines.  However, if the locks are
 rarely acquired (for example, only in rare error cases), and there are
 a great many instances of the data structure, then memory footprint can
 trump false-sharing concerns, so that it can be better to place them in
 the same cache line as above.
 
 But if the CPU does not support single-byte loads and stores, a store
 to foop->a will do a non-atomic read-modify-write operation on foop->b,
 which will come as a nasty surprise to someone holding foop->lock2.  So we
 now require CPUs to support single-byte and double-byte loads and stores.
 Therefore, this commit adjusts the definition of __native_word() to allow
 these sizes to be used by smp_load_acquire() and smp_store_release().

So does this patch depend on a patch that removes pre-EV56 Alpha
support? I'm all for removing that, but I need to see the patch merged
before we can do this.

Re: bit fields data tearing

2014-09-05 Thread Peter Hurley
On 09/05/2014 03:52 PM, Peter Zijlstra wrote:
 On Fri, Sep 05, 2014 at 11:31:09AM -0700, Paul E. McKenney wrote:
 compiler: Allow 1- and 2-byte smp_load_acquire() and smp_store_release()

 CPUs without single-byte and double-byte loads and stores place some
 interesting requirements on concurrent code.  For example (adapted
 from Peter Hurley's test code), suppose we have the following structure:
 
  struct foo {
  spinlock_t lock1;
  spinlock_t lock2;
  char a; /* Protected by lock1. */
  char b; /* Protected by lock2. */
  };
  struct foo *foop;
 
 Of course, it is common (and good) practice to place data protected
 by different locks in separate cache lines.  However, if the locks are
 rarely acquired (for example, only in rare error cases), and there are
 a great many instances of the data structure, then memory footprint can
 trump false-sharing concerns, so that it can be better to place them in
 the same cache line as above.

 But if the CPU does not support single-byte loads and stores, a store
 to foop->a will do a non-atomic read-modify-write operation on foop->b,
 which will come as a nasty surprise to someone holding foop->lock2.  So we
 now require CPUs to support single-byte and double-byte loads and stores.
 Therefore, this commit adjusts the definition of __native_word() to allow
 these sizes to be used by smp_load_acquire() and smp_store_release().
 
 So does this patch depend on a patch that removes pre-EV56 Alpha
 support? I'm all for removing that, but I need to see the patch merged
 before we can do this.

I'm working on that but Alpha's Kconfig is not quite straightforward.


... and I'm wondering if I should _remove_ pre-EV56 configurations or
move the default choice and produce a warning about unsupported Alpha
CPUs instead?

Regards,
Peter Hurley

[ How does one do a red popup in kbuild?
  The 'comment' approach is too subtle.
]




Re: bit fields data tearing

2014-09-05 Thread Paul E. McKenney
On Fri, Sep 05, 2014 at 03:24:35PM -0400, Peter Hurley wrote:
 On 09/05/2014 03:05 PM, Paul E. McKenney wrote:
  On Fri, Sep 05, 2014 at 02:50:31PM -0400, Peter Hurley wrote:
  On 09/05/2014 02:09 PM, Paul E. McKenney wrote:
 
 [cut]
 
  
 
  documentation: Record limitations of bitfields and small variables
 
  This commit documents the fact that it is not safe to use bitfields as
  shared variables in synchronization algorithms.  It also documents that
  CPUs must provide one-byte and two-byte load and store instructions
 ^
  atomic
  
  Here you meant non-atomic?  My guess is that you are referring to the
  fact that you could emulate a one-byte store on pre-EV56 Alpha CPUs
  using the ll and sc atomic-read-modify-write instructions, correct?
 
 Yes, that's what I meant. I must be tired and am misreading the commit
 message, or misinterpreting its meaning.

Very good, got it!

Thanx, Paul


Re: bit fields data tearing

2014-09-05 Thread Peter Zijlstra
On Fri, Sep 05, 2014 at 04:01:35PM -0400, Peter Hurley wrote:
  So does this patch depend on a patch that removes pre-EV56 Alpha
  support? I'm all for removing that, but I need to see the patch merged
  before we can do this.
 
 I'm working on that but Alpha's Kconfig is not quite straightforward.
 
 
 ... and I'm wondering if I should _remove_ pre-EV56 configurations or
 move the default choice and produce a warning about unsupported Alpha
 CPUs instead?
 

 depends BROKEN 

or is that deprecated?

Re: bit fields data tearing

2014-09-05 Thread Peter Hurley
On 09/05/2014 03:38 PM, Marc Gauthier wrote:
 Paul E. McKenney wrote:
 On Fri, Sep 05, 2014 at 02:50:31PM -0400, Peter Hurley wrote:
 On 09/05/2014 02:09 PM, Paul E. McKenney wrote:
 This commit documents the fact that it is not safe to use bitfields as
 shared variables in synchronization algorithms.  It also documents that
 CPUs must provide one-byte and two-byte load and store instructions
^
 atomic

 Here you meant non-atomic?  My guess is that you are referring to the
 fact that you could emulate a one-byte store on pre-EV56 Alpha CPUs
 using the ll and sc atomic-read-modify-write instructions, correct?

 in order to be supported by the Linux kernel.  (Michael Cree
 has agreed to the resulting non-support of pre-EV56 Alpha CPUs:
 https://lkml.org/lkml/2014/9/5/143.
 [...]
 
 + and 64-bit systems, respectively.  Note that this means that the
 + Linux kernel does not support pre-EV56 Alpha CPUs, because these
 + older CPUs do not provide one-byte and two-byte loads and stores.
  ^
 non-atomic

 I took this, thank you!
 
 Eum, am I totally lost, or aren't both of these supposed to say atomic ?
 
 Can't imagine requiring a CPU to provide non-atomic loads and stores
 (i.e. requiring old Alpha behavior?).

Here's how I read the two statements.

First, the commit message:

It [this commit] documents that CPUs [supported by the Linux kernel]
_must provide_ atomic one-byte and two-byte naturally aligned loads and stores.

Second, in the body of the document:

The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these
older CPUs _do not provide_ atomic one-byte and two-byte loads and stores.

Regards,
Peter Hurley


Re: bit fields data tearing

2014-09-05 Thread H. Peter Anvin
On 09/05/2014 01:12 PM, Peter Zijlstra wrote:

 ... and I'm wondering if I should _remove_ pre-EV56 configurations or
 move the default choice and produce a warning about unsupported Alpha
 CPUs instead?

 
  depends BROKEN 
 
 or is that deprecated?
 

Just rip it out, like I did for the i386.

-hpa


Re: bit fields data tearing

2014-09-05 Thread Paul E. McKenney
On Fri, Sep 05, 2014 at 04:01:35PM -0400, Peter Hurley wrote:
 On 09/05/2014 03:52 PM, Peter Zijlstra wrote:
  On Fri, Sep 05, 2014 at 11:31:09AM -0700, Paul E. McKenney wrote:
  compiler: Allow 1- and 2-byte smp_load_acquire() and smp_store_release()
 
  CPUs without single-byte and double-byte loads and stores place some
  interesting requirements on concurrent code.  For example (adapted
  from Peter Hurley's test code), suppose we have the following structure:
  
 struct foo {
 spinlock_t lock1;
 spinlock_t lock2;
 char a; /* Protected by lock1. */
 char b; /* Protected by lock2. */
 };
 struct foo *foop;
  
  Of course, it is common (and good) practice to place data protected
  by different locks in separate cache lines.  However, if the locks are
  rarely acquired (for example, only in rare error cases), and there are
  a great many instances of the data structure, then memory footprint can
  trump false-sharing concerns, so that it can be better to place them in
  the same cache line as above.
 
  But if the CPU does not support single-byte loads and stores, a store
  to foop->a will do a non-atomic read-modify-write operation on foop->b,
  which will come as a nasty surprise to someone holding foop->lock2.  So we
  now require CPUs to support single-byte and double-byte loads and stores.
  Therefore, this commit adjusts the definition of __native_word() to allow
  these sizes to be used by smp_load_acquire() and smp_store_release().
  
  So does this patch depend on a patch that removes pre-EV56 Alpha
  support? I'm all for removing that, but I need to see the patch merged
  before we can do this.
 
 I'm working on that but Alpha's Kconfig is not quite straightforward.
 
 
 ... and I'm wondering if I should _remove_ pre-EV56 configurations or
 move the default choice and produce a warning about unsupported Alpha
 CPUs instead?

I suspect that either would work, given that the Alpha community is
pretty close-knit.  Just setting the appropriate flag to make the
compiler generate one-byte and two-byte loads and stores would
probably suffice.  ;-)

Thanx, Paul

 Regards,
 Peter Hurley
 
 [ How does one do a red popup in kbuild?
   The 'comment' approach is too subtle.
 ]
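
Editorial sketch of the one-byte release/acquire pattern that the commit message above is about. It uses C11 atomics as a userspace stand-in for smp_store_release() and smp_load_acquire(); it is not the kernel implementation, and struct mailbox, mailbox_publish() and mailbox_try_consume() are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical example type: plain data guarded by a one-byte flag. */
struct mailbox {
	int payload;			/* ordinary, non-atomic data */
	_Atomic unsigned char ready;	/* one-byte synchronization flag */
};

/* Write the payload, then publish it with a one-byte release store. */
static void mailbox_publish(struct mailbox *m, int value)
{
	m->payload = value;
	atomic_store_explicit(&m->ready, 1, memory_order_release);
}

/*
 * One-byte acquire load of the flag.  Returns 1 and fills *out if the
 * payload has been published, otherwise returns 0 and leaves *out alone.
 */
static int mailbox_try_consume(struct mailbox *m, int *out)
{
	if (!atomic_load_explicit(&m->ready, memory_order_acquire))
		return 0;
	*out = m->payload;
	return 1;
}
```

The flag is a single byte, so this pattern only works on CPUs whose byte stores leave neighbouring bytes untouched, which is the requirement under discussion.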
 
 
 


Re: bit fields data tearing

2014-09-05 Thread H. Peter Anvin
On 09/05/2014 01:14 PM, Peter Hurley wrote:
 
 Here's how I read the two statements.
 
 First, the commit message:
 
 It [this commit] documents that CPUs [supported by the Linux kernel]
 _must provide_ atomic one-byte and two-byte naturally aligned loads and 
 stores.
 
 Second, in the body of the document:
 
 The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these
 older CPUs _do not provide_ atomic one-byte and two-byte loads and stores.
 

Does this apply in general or only to SMP configurations?  I guess
non-SMP configurations would still have problems if interrupted in the
wrong place...

-hpa


Re: bit fields data tearing

2014-09-05 Thread Michael Cree
On Fri, Sep 05, 2014 at 04:14:48PM -0400, Peter Hurley wrote:
 Second, in the body of the document:
 
 The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these
 older CPUs _do not provide_ atomic one-byte and two-byte loads and stores.

Let's be clear here, the pre-EV56 Alpha CPUs do provide an atomic
one-byte and two-byte load and store; it's just that one must use
locked load and store sequences to achieve atomicity.  The point,
I think, is that the pre-EV56 Alpha CPUs provide non-atomic one-byte
and two-byte loads and stores as the norm, and that is the problem.

Cheers
Michael.
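
Editorial sketch of why a non-atomic byte store is the problem. It replays by hand, in one thread, the interleaving in which a byte store that is emulated as load-word/merge/store-word overwrites a neighbour's update. The layout and the function names (struct shared_word, load_word(), merge_and_store(), replay_lost_update()) are hypothetical; this models the hazard in portable C and is not Alpha code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: two one-byte fields sharing one 32-bit word. */
struct shared_word {
	uint8_t a;	/* imagine: protected by lock1 */
	uint8_t b;	/* imagine: protected by lock2 */
	uint8_t pad[2];
};

/* Step 1 of an emulated byte store: load the whole containing word. */
static uint32_t load_word(const struct shared_word *s)
{
	uint32_t w;

	memcpy(&w, s, sizeof(w));
	return w;
}

/* Steps 2 and 3: merge the new byte into the stale copy, store it back. */
static void merge_and_store(struct shared_word *s, uint32_t stale,
			    size_t offset, uint8_t val)
{
	memcpy((uint8_t *)&stale + offset, &val, 1);
	memcpy(s, &stale, sizeof(stale));
}

/*
 * Replay one bad interleaving by hand and return the final value of b,
 * which the "other CPU" had set to 2.
 */
static uint8_t replay_lost_update(void)
{
	struct shared_word s = { .a = 0, .b = 0 };
	uint32_t stale;

	stale = load_word(&s);			/* CPU 0 begins storing to s.a */
	s.b = 2;				/* CPU 1 updates s.b under lock2 */
	merge_and_store(&s, stale, 0, 1);	/* CPU 0 finishes: s.a = 1 */

	return s.b;				/* stale word overwrote b */
}
```

In this replay the final value of b is the stale 0 rather than 2, even though the two "CPUs" never touched the same field.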

Re: bit fields data tearing

2014-09-05 Thread Paul E. McKenney
On Fri, Sep 05, 2014 at 04:14:48PM -0400, Peter Hurley wrote:
 On 09/05/2014 03:38 PM, Marc Gauthier wrote:
  Paul E. McKenney wrote:
  On Fri, Sep 05, 2014 at 02:50:31PM -0400, Peter Hurley wrote:
  On 09/05/2014 02:09 PM, Paul E. McKenney wrote:
  This commit documents the fact that it is not safe to use bitfields as
  shared variables in synchronization algorithms.  It also documents that
  CPUs must provide one-byte and two-byte load and store instructions
 ^
  atomic
 
  Here you meant non-atomic?  My guess is that you are referring to the
  fact that you could emulate a one-byte store on pre-EV56 Alpha CPUs
  using the ll and sc atomic-read-modify-write instructions, correct?
 
  in order to be supported by the Linux kernel.  (Michael Cree
  has agreed to the resulting non-support of pre-EV56 Alpha CPUs:
  https://lkml.org/lkml/2014/9/5/143.
  [...]
  
  + and 64-bit systems, respectively.  Note that this means that the
  + Linux kernel does not support pre-EV56 Alpha CPUs, because these
  + older CPUs do not provide one-byte and two-byte loads and stores.
   ^
  non-atomic
 
  I took this, thank you!
  
  Eum, am I totally lost, or aren't both of these supposed to say atomic ?
  
  Can't imagine requiring a CPU to provide non-atomic loads and stores
  (i.e. requiring old Alpha behavior?).
 
 Here's how I read the two statements.
 
 First, the commit message:
 
 It [this commit] documents that CPUs [supported by the Linux kernel]
 _must provide_ atomic one-byte and two-byte naturally aligned loads and 
 stores.
 
 Second, in the body of the document:
 
 The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these
 older CPUs _do not provide_ atomic one-byte and two-byte loads and stores.

Hmmm...  It is a bit ambiguous.  How about the following?

Thanx, Paul



documentation: Record limitations of bitfields and small variables

This commit documents the fact that it is not safe to use bitfields
as shared variables in synchronization algorithms.  It also documents
that CPUs must provide one-byte and two-byte normal load and store
instructions in order to be supported by the Linux kernel.  (Michael
Cree has agreed to the resulting non-support of pre-EV56 Alpha CPUs:
https://lkml.org/lkml/2014/9/5/143.)

Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 87be0a8a78de..fe4d51b704c5 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -269,6 +269,37 @@ And there are a number of things that _must_ or _must_not_ be assumed:
 	STORE *(A + 4) = Y; STORE *A = X;
 	STORE {*A, *(A + 4) } = {X, Y};
 
+And there are anti-guarantees:
+
+ (*) These guarantees do not apply to bitfields, because compilers often
+     generate code to modify these using non-atomic read-modify-write
+     sequences.  Do not attempt to use bitfields to synchronize parallel
+     algorithms.
+
+ (*) Even in cases where bitfields are protected by locks, all fields
+     in a given bitfield must be protected by one lock.  If two fields
+     in a given bitfield are protected by different locks, the compiler's
+     non-atomic read-modify-write sequences can cause an update to one
+     field to corrupt the value of an adjacent field.
+
+ (*) These guarantees apply only to properly aligned and sized scalar
+     variables.  "Properly sized" currently means variables that are
+     the same size as "char", "short", "int" and "long".  "Properly
+     aligned" means the natural alignment, thus no constraints for
+     "char", two-byte alignment for "short", four-byte alignment for
+     "int", and either four-byte or eight-byte alignment for "long",
+     on 32-bit and 64-bit systems, respectively.  Note that this means
+     that the Linux kernel does not support pre-EV56 Alpha CPUs,
+     because these older CPUs do not provide one-byte and two-byte
+     load and store instructions.  (In theory, the pre-EV56 Alpha CPUs
+     can emulate these instructions using load-linked/store-conditional
+     instructions, but in practice this approach has excessive overhead.
+     Keep in mind that this emulation would be required on -all- single-
+     and double-byte loads and stores in order to handle adjacent bytes
+     protected by different locks.)
+
+     Alpha EV56 and later Alpha CPUs are still supported.
+
 
 =========================
 WHAT ARE MEMORY BARRIERS?
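
Editorial sketch of the bitfield anti-guarantee in the patch above. It checks, in plain userspace C, that two adjacent bitfields land in the same byte while two char members do not. The struct and function names are hypothetical, and bitfield layout is implementation-defined, so treat the result as typical of GCC and Clang rather than something the C standard promises.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flags packed as bitfields: both live in one storage unit. */
struct packed_flags {
	unsigned int ready : 1;
	unsigned int dirty : 1;
};

/* Alternative: one char per flag, so each flag is its own byte. */
struct split_flags {
	unsigned char ready;
	unsigned char dirty;
};

/* Index of the first byte that differs between two images, or -1. */
static int first_changed_byte(const void *before, const void *after, size_t n)
{
	const unsigned char *b = before, *a = after;

	for (size_t i = 0; i < n; i++)
		if (b[i] != a[i])
			return (int)i;
	return -1;
}

/* Returns 1 if setting ready and setting dirty modify the same byte. */
static int packed_flags_share_byte(void)
{
	struct packed_flags zero, r, d;

	memset(&zero, 0, sizeof(zero));
	memset(&r, 0, sizeof(r));
	memset(&d, 0, sizeof(d));
	r.ready = 1;
	d.dirty = 1;
	return first_changed_byte(&zero, &r, sizeof(zero)) ==
	       first_changed_byte(&zero, &d, sizeof(zero));
}

/* Same check for the char-per-flag layout. */
static int split_flags_share_byte(void)
{
	struct split_flags zero, r, d;

	memset(&zero, 0, sizeof(zero));
	memset(&r, 0, sizeof(r));
	memset(&d, 0, sizeof(d));
	r.ready = 1;
	d.dirty = 1;
	return first_changed_byte(&zero, &r, sizeof(zero)) ==
	       first_changed_byte(&zero, &d, sizeof(zero));
}
```

When the two flags share a byte, any update to one is a read-modify-write of the other, which is why the patch requires a single lock for all fields of a given bitfield.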


Re: bit fields data tearing

2014-09-05 Thread Paul E. McKenney
On Fri, Sep 05, 2014 at 01:34:52PM -0700, H. Peter Anvin wrote:
 On 09/05/2014 01:14 PM, Peter Hurley wrote:
  
  Here's how I read the two statements.
  
  First, the commit message:
  
  It [this commit] documents that CPUs [supported by the Linux kernel]
  _must provide_ atomic one-byte and two-byte naturally aligned loads and 
  stores.
  
  Second, in the body of the document:
  
  The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these
  older CPUs _do not provide_ atomic one-byte and two-byte loads and stores.
  
 
 Does this apply in general or only to SMP configurations?  I guess
 non-SMP configurations would still have problems if interrupted in the
 wrong place...

And preemption could cause problems, too.  So I believe that it needs
to be universal.

Thanx, Paul


Re: bit fields data tearing

2014-09-05 Thread Michael Cree
On Fri, Sep 05, 2014 at 01:34:52PM -0700, H. Peter Anvin wrote:
 On 09/05/2014 01:14 PM, Peter Hurley wrote:
  
  Here's how I read the two statements.
  
  First, the commit message:
  
  It [this commit] documents that CPUs [supported by the Linux kernel]
  _must provide_ atomic one-byte and two-byte naturally aligned loads and 
  stores.
  
  Second, in the body of the document:
  
  The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these
  older CPUs _do not provide_ atomic one-byte and two-byte loads and stores.
  
 
 Does this apply in general or only to SMP configurations?  I guess
 non-SMP configurations would still have problems if interrupted in the
 wrong place...

Yes.

Cheers
Michael.

Re: bit fields data tearing

2014-09-05 Thread Thomas Gleixner
On Fri, 5 Sep 2014, Paul E. McKenney wrote:
 On Fri, Sep 05, 2014 at 01:34:52PM -0700, H. Peter Anvin wrote:
  On 09/05/2014 01:14 PM, Peter Hurley wrote:
   
   Here's how I read the two statements.
   
   First, the commit message:
   
   It [this commit] documents that CPUs [supported by the Linux kernel]
   _must provide_ atomic one-byte and two-byte naturally aligned loads and 
   stores.
   
   Second, in the body of the document:
   
   The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these
   older CPUs _do not provide_ atomic one-byte and two-byte loads and 
   stores.
   
  
  Does this apply in general or only to SMP configurations?  I guess
  non-SMP configurations would still have problems if interrupted in the
  wrong place...
 
 And preemption could cause problems, too.  So I believe that it needs
 to be universal.

Well, preemption is usually caused by an interrupt, unless you have a
combined load and preempt instruction :)

Thanks,

tglx

Re: bit fields data tearing

2014-09-05 Thread Paul E. McKenney
On Fri, Sep 05, 2014 at 10:48:34PM +0200, Thomas Gleixner wrote:
 On Fri, 5 Sep 2014, Paul E. McKenney wrote:
  On Fri, Sep 05, 2014 at 01:34:52PM -0700, H. Peter Anvin wrote:
   On 09/05/2014 01:14 PM, Peter Hurley wrote:

Here's how I read the two statements.

First, the commit message:

It [this commit] documents that CPUs [supported by the Linux kernel]
_must provide_ atomic one-byte and two-byte naturally aligned loads and 
stores.

Second, in the body of the document:

The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these
older CPUs _do not provide_ atomic one-byte and two-byte loads and 
stores.

   
   Does this apply in general or only to SMP configurations?  I guess
   non-SMP configurations would still have problems if interrupted in the
   wrong place...
  
  And preemption could cause problems, too.  So I believe that it needs
  to be universal.
 
 Well, preemption is usually caused by an interrupt, unless you have a
 combined load and preempt instruction :)

Fair point!  ;-)

Thanx, Paul


Re: bit fields data tearing

2014-09-05 Thread Peter Hurley
On 09/05/2014 04:39 PM, Michael Cree wrote:
 On Fri, Sep 05, 2014 at 04:14:48PM -0400, Peter Hurley wrote:
 Second, in the body of the document:

 The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these
 older CPUs _do not provide_ atomic one-byte and two-byte loads and stores.
 
 Let's be clear here, the pre-EV56 Alpha CPUs do provide an atomic
 one-byte and two-byte load and store; it's just that one must use
 locked load and store sequences to achieve atomicity.  The point,
 I think, is that the pre-EV56 Alpha CPUs provide non-atomic one-byte
 and two-byte loads and stores as the norm, and that is the problem.

I'm all for an Alpha expert jumping in here and meeting the criterion,
which is that byte stores cannot corrupt adjacent storage (nor can
aligned short stores).

To my mind, a quick look at Documentation/circular-buffers.txt will
pretty much convince anyone that trying to differentiate by execution
context is undoable.

If someone wants to make Alphas do cmpxchg loops for every byte store,
then ok. Or any other solution that doesn't require subsystem code
changes.

Regards,
Peter Hurley
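
Editorial sketch of the "cmpxchg loop for every byte store" idea mentioned above, written with C11 atomics rather than Alpha's load-locked/store-conditional instructions. cas_byte_store() and read_byte() are hypothetical helpers, and the byte_idx convention (0 is the least significant byte) is an assumption of this sketch, not an Alpha ABI detail.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Store one byte into a 32-bit word without disturbing its neighbours,
 * using a compare-and-swap loop on the whole word.  byte_idx must be in
 * the range 0..3 and counts from the least significant byte.
 */
static void cas_byte_store(_Atomic uint32_t *word, unsigned int byte_idx,
			   uint8_t val)
{
	unsigned int shift = byte_idx * 8;
	uint32_t mask = (uint32_t)0xff << shift;
	uint32_t old = atomic_load_explicit(word, memory_order_relaxed);
	uint32_t updated;

	do {
		updated = (old & ~mask) | ((uint32_t)val << shift);
		/* On failure, old is refreshed with the current contents. */
	} while (!atomic_compare_exchange_weak_explicit(word, &old, updated,
							memory_order_relaxed,
							memory_order_relaxed));
}

/* Read back one byte using the same byte_idx convention. */
static uint8_t read_byte(_Atomic uint32_t *word, unsigned int byte_idx)
{
	return (uint8_t)(atomic_load_explicit(word, memory_order_relaxed) >>
			 (byte_idx * 8));
}
```

Every byte store becomes a retry loop on the containing word, which gives a feel for the overhead that applying this to all byte stores would add.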

Re: [PATCH 0/2] PCI/MSI: Remove arch_msi_check_device()

2014-09-05 Thread Bjorn Helgaas
On Sat, Jul 12, 2014 at 01:21:06PM +0200, Alexander Gordeev wrote:
 Hello,
 
 This is a cleanup effort to get rid of useless arch_msi_check_device().
 I am not sure what the reasons were for its existence in the first place,
 but at the moment it appears totally unnecessary.
 
 Thanks!
 
 Cc: linuxppc-dev@lists.ozlabs.org
 Cc: linux-...@vger.kernel.org
 
 Alexander Gordeev (2):
   PCI/MSI/PPC: Remove arch_msi_check_device()
   PCI/MSI: Remove arch_msi_check_device()

I applied these (with Michael's ack on the first, and v2 of the second) to
pci/msi for v3.18, thanks!
 
  arch/powerpc/include/asm/machdep.h     |    2 -
  arch/powerpc/kernel/msi.c              |   12 +---
  arch/powerpc/platforms/cell/axon_msi.c |    9 --
  arch/powerpc/platforms/powernv/pci.c   |   19 +++-
  arch/powerpc/platforms/pseries/msi.c   |   42 ++-
  arch/powerpc/sysdev/fsl_msi.c          |   12 ++--
  arch/powerpc/sysdev/mpic_pasemi_msi.c  |   11 +--
  arch/powerpc/sysdev/mpic_u3msi.c       |   28 +++---
  arch/powerpc/sysdev/ppc4xx_hsta_msi.c  |   18
  arch/powerpc/sysdev/ppc4xx_msi.c       |   19
  drivers/pci/msi.c                      |   49 ---
  include/linux/msi.h                    |    3 --
  12 files changed, 63 insertions(+), 161 deletions(-)
 
 -- 1.7.7.6
 

Re: [PATCH 0/2] PCI/MSI: Remove arch_msi_check_device()

2014-09-05 Thread Bjorn Helgaas
On Fri, Sep 5, 2014 at 3:25 PM, Bjorn Helgaas bhelg...@google.com wrote:
 On Sat, Jul 12, 2014 at 01:21:06PM +0200, Alexander Gordeev wrote:
 Hello,

 This is a cleanup effort to get rid of useless arch_msi_check_device().
 I am not sure what the reasons were for its existence in the first place,
 but at the moment it appears totally unnecessary.

 Thanks!

 Cc: linuxppc-dev@lists.ozlabs.org
 Cc: linux-...@vger.kernel.org

 Alexander Gordeev (2):
   PCI/MSI/PPC: Remove arch_msi_check_device()
   PCI/MSI: Remove arch_msi_check_device()

 I applied these (with Michael's ack on the first, and v2 of the second) to
 pci/msi for v3.18, thanks!

  arch/powerpc/include/asm/machdep.h     |    2 -
  arch/powerpc/kernel/msi.c              |   12 +---
  arch/powerpc/platforms/cell/axon_msi.c |    9 --
  arch/powerpc/platforms/powernv/pci.c   |   19 +++-
  arch/powerpc/platforms/pseries/msi.c   |   42 ++-
  arch/powerpc/sysdev/fsl_msi.c          |   12 ++--
  arch/powerpc/sysdev/mpic_pasemi_msi.c  |   11 +--
  arch/powerpc/sysdev/mpic_u3msi.c       |   28 +++---
  arch/powerpc/sysdev/ppc4xx_hsta_msi.c  |   18
  arch/powerpc/sysdev/ppc4xx_msi.c       |   19
  drivers/pci/msi.c                      |   49 ---
  include/linux/msi.h                    |    3 --
  12 files changed, 63 insertions(+), 161 deletions(-)

Oh, I forgot -- if you'd rather take the first one through the PPC
tree, you can do that and I can merge the second one later.  Let me
know if you want to do that.

Bjorn

Re: bit fields data tearing

2014-09-05 Thread Michael Cree
On Fri, Sep 05, 2014 at 05:12:28PM -0400, Peter Hurley wrote:
 On 09/05/2014 04:39 PM, Michael Cree wrote:
  On Fri, Sep 05, 2014 at 04:14:48PM -0400, Peter Hurley wrote:
  Second, in the body of the document:
 
  The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these
  older CPUs _do not provide_ atomic one-byte and two-byte loads and stores.
  
  Let's be clear here, the pre-EV56 Alpha CPUs do provide an atomic
  one-byte and two-byte load and store; it's just that one must use
  locked load and store sequences to achieve atomicity.  The point,
  I think, is that the pre-EV56 Alpha CPUs provide non-atomic one-byte
  and two-byte loads and stores as the norm, and that is the problem.
 
 I'm all for an Alpha expert jumping in here and meeting the criterion,
 which is that byte stores cannot corrupt adjacent storage (nor can
 aligned short stores).
 
 To my mind, a quick look at Documentation/circular-buffers.txt will
 pretty much convince anyone that trying to differentiate by execution
 context is undoable.
 
 If someone wants to make Alphas do cmpxchg loops for every byte store,
 then ok. Or any other solution that doesn't require subsystem code
 changes.

I am not suggesting that anyone do that work.  I'm certainly not going
to do it.

All I was pointing out is that the claim of _do not provide_, made above
with emphasis, is not true when strictly interpreted, and thus should not
be committed to the documentation without further clarification.

Cheers
Michael.

Re: [PATCH] PCI: Increase BAR size quirk for IBM ipr SAS Crocodile adapters

2014-09-05 Thread Bjorn Helgaas
On Thu, Aug 21, 2014 at 09:26:52AM +1000, Anton Blanchard wrote:
 From: Douglas Lehr dll...@us.ibm.com
 
 The Crocodile chip occasionally comes up with 4k and 8k BAR sizes.
 Due to an errata, setting the SR-IOV page size causes the physical
 function BARs to expand to the system page size.  Since ppc64 uses
 64k pages, when Linux tries to assign the smaller resource sizes
 to the now 64k BARs the address will be truncated and the BARs will
 overlap.
 
 This quirk will force Linux to allocate the resource as a full page,
 which will avoid the overlap.
 
 Cc: sta...@vger.kernel.org 
 Signed-off-by: Douglas Lehr dll...@us.ibm.com
 Signed-off-by: Anton Blanchard an...@samba.org
 Acked-by: Milton Miller milt...@us.ibm.com

Applied to pci/misc for v3.18, thanks!

I tweaked it to print the expanded resource, see below.

 ---
  drivers/pci/quirks.c |   19 +++
  1 file changed, 19 insertions(+)
 
 diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
 index 80c2d01..45b946d 100644
 --- a/drivers/pci/quirks.c
 +++ b/drivers/pci/quirks.c
 @@ -24,6 +24,7 @@
  #include <linux/ioport.h>
  #include <linux/sched.h>
  #include <linux/ktime.h>
 +#include <linux/mm.h>
  #include <asm/dma.h> /* isa_dma_bridge_buggy */
  #include "pci.h"
  
 @@ -287,6 +288,24 @@ static void quirk_citrine(struct pci_dev *dev)
  }
  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_IBM,  PCI_DEVICE_ID_IBM_CITRINE,  quirk_citrine);
  
 +/*  On IBM Crocodile ipr SAS adapters, expand bar size to system page size. */
 +static void quirk_extend_bar_to_page(struct pci_dev *dev)
 +{
 +	int i;
 +
 +	for (i = 0; i < PCI_STD_RESOURCE_END; i++) {
 +		struct resource *r = &dev->resource[i];
 +
 +		if (r->flags & IORESOURCE_MEM && resource_size(r) < PAGE_SIZE) {
 +			dev_info(&dev->dev, "Setting Bar size to Page size");
 +			r->end = PAGE_SIZE-1;
 +			r->start = 0;
 +			r->flags |= IORESOURCE_UNSET;
 +		}
 +	}
 +}
 +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_IBM, 0x034a, quirk_extend_bar_to_page);
 +
  /*
    *  S3 868 and 968 chips report region size equal to 32M, but they decode 64M.
    *  If it's needed, re-allocate the region.
 -- 



commit 86b6431a306ab5a5204c436a45a3337fb17efa21
Author: Douglas Lehr dll...@us.ibm.com
Date:   Thu Aug 21 09:26:52 2014 +1000

PCI: Increase IBM ipr SAS Crocodile BARs to at least system page size

The Crocodile chip occasionally comes up with 4k and 8k BAR sizes.  Due to
an erratum, setting the SR-IOV page size causes the physical function BARs
to expand to the system page size.  Since ppc64 uses 64k pages, when Linux
tries to assign the smaller resource sizes to the now 64k BARs the address
will be truncated and the BARs will overlap.

Force Linux to allocate the resource as a full page, which avoids the
overlap.

[bhelgaas: print expanded resource, too]
Signed-off-by: Douglas Lehr dll...@us.ibm.com
Signed-off-by: Anton Blanchard an...@samba.org
Signed-off-by: Bjorn Helgaas bhelg...@google.com
Acked-by: Milton Miller milt...@us.ibm.com
CC: sta...@vger.kernel.org

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 80c2d014283d..e73960311fb4 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -24,6 +24,7 @@
 #include <linux/ioport.h>
 #include <linux/sched.h>
 #include <linux/ktime.h>
+#include <linux/mm.h>
 #include <asm/dma.h>	/* isa_dma_bridge_buggy */
 #include "pci.h"
 
@@ -287,6 +288,25 @@ static void quirk_citrine(struct pci_dev *dev)
 }
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_IBM,	PCI_DEVICE_ID_IBM_CITRINE,	quirk_citrine);
 
+/*  On IBM Crocodile ipr SAS adapters, expand BAR to system page size */
+static void quirk_extend_bar_to_page(struct pci_dev *dev)
+{
+	int i;
+
+	for (i = 0; i < PCI_STD_RESOURCE_END; i++) {
+		struct resource *r = &dev->resource[i];
+
+		if (r->flags & IORESOURCE_MEM && resource_size(r) < PAGE_SIZE) {
+			r->end = PAGE_SIZE - 1;
+			r->start = 0;
+			r->flags |= IORESOURCE_UNSET;
+			dev_info(&dev->dev, "expanded BAR %d to page size: %pR\n",
+				 i, r);
+		}
+	}
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_IBM, 0x034a, quirk_extend_bar_to_page);
+
+
 /*
  *  S3 868 and 968 chips report region size equal to 32M, but they decode 64M.
  *  If it's needed, re-allocate the region.
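
Editorial sketch of what the quirk above does to a single resource, modelled in userspace C so it can be run outside the kernel. struct model_resource, the MODEL_* constants and model_extend_bar_to_page() are hypothetical stand-ins for struct resource, PAGE_SIZE, the IORESOURCE_* flags and the quirk itself; the 64k page size is taken from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE	0x10000u	/* 64k pages, as on ppc64 */
#define MODEL_RES_MEM	0x1u		/* stand-in for IORESOURCE_MEM */
#define MODEL_RES_UNSET	0x2u		/* stand-in for IORESOURCE_UNSET */

/* Hypothetical model of a BAR resource; end is inclusive. */
struct model_resource {
	uint64_t start;
	uint64_t end;
	unsigned int flags;
};

/* Size of an inclusive [start, end] range. */
static uint64_t model_resource_size(const struct model_resource *r)
{
	return r->end - r->start + 1;
}

/*
 * Grow a memory resource smaller than one page to exactly one page and
 * mark it unassigned so that it is allocated afresh.  Returns 1 if the
 * resource was expanded, 0 if it was left alone.
 */
static int model_extend_bar_to_page(struct model_resource *r)
{
	if (!(r->flags & MODEL_RES_MEM) ||
	    model_resource_size(r) >= MODEL_PAGE_SIZE)
		return 0;

	r->start = 0;
	r->end = MODEL_PAGE_SIZE - 1;
	r->flags |= MODEL_RES_UNSET;
	return 1;
}
```

A 4k BAR comes out of this as an unassigned 64k resource starting at 0, which matches the intent stated in the commit message of having Linux allocate a full page for it.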