[PATCH V2 0/2] cpufreq/powernv: Set core pstate to a minimum just before hotplugging it out
Today cpus go to winkle when they are offlined. Since it is the deepest
idle state that we have, it is expected to save a good amount of power
compared to the online state, where cores can enter only nap/fastsleep,
which are shallower idle states. However, we observed no power savings with
winkle as compared to nap/fastsleep and traced the problem to the pstate of
the core being kept high even when the core is offline. This can keep the
socket pstate high, thus burning power unnecessarily. This patchset fixes
this issue.

Changes in V2: Changed smp_call_function_any() to
smp_call_function_single() in Patch[2/2]

---

Preeti U Murthy (2):
      cpufreq: Allow stop CPU callback to be used by all cpufreq drivers
      powernv/cpufreq: Set the pstate of the last hotplugged out cpu in policy->cpus to minimum

 drivers/cpufreq/cpufreq.c         |    2 +-
 drivers/cpufreq/powernv-cpufreq.c |    9 +++++++++
 2 files changed, 10 insertions(+), 1 deletion(-)

--
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH V2 1/2] cpufreq: Allow stop CPU callback to be used by all cpufreq drivers
Commit 367dc4aa932bfb3 ("cpufreq: Add stop CPU callback to cpufreq_driver
interface") introduced the stop CPU callback for intel_pstate drivers. During
the CPU_DOWN_PREPARE stage, this callback is invoked so that drivers can take
some action on the pstate of the cpu before it is taken offline. This callback
was assumed to be useful only for those drivers which have implemented the
set_policy CPU callback, because they have no other way to take action about
the cpufreq of a CPU which is being hotplugged out except in the exit callback,
which is called very late in the offline process.

The drivers which implement the target/target_index callbacks were expected to
take care of requirements like the ones that commit 367dc4aa addresses in the
GOV_STOP notification event. But there are disadvantages to restricting the
usage of the stop CPU callback to cpufreq drivers that implement the set_policy
callbacks and want to take explicit action on setting the cpufreq during a
hotplug operation.

1. GOV_STOP gets called for every CPU offline and drivers would usually want to
take action when the last cpu in the policy->cpus mask is taken offline. As
long as there is more than one cpu in the policy->cpus mask, the cpufreq core
itself makes sure that the freq for the other cpus in this mask is set
according to the maximum load. This is sensible and drivers which implement
the target_index callback would mostly not want to modify that. However, the
cpufreq core leaves a loose end when the cpu in the policy->cpus mask is the
last one to go offline; it does nothing explicit to the frequency of the core.
Drivers may need a way to take some action here and the stop CPU callback
mechanism is the best way to do it today.

2. We cannot implement driver specific actions in the GOV_STOP mechanism. So we
would need another driver callback invoked from there, which is unnecessary.

Therefore this patch extends the usage of the stop CPU callback to all cpufreq
drivers, as long as they have this callback implemented and irrespective of
whether they are set_policy/target_index drivers. The assumption is that if the
drivers find the GOV_STOP path to be a suitable way of implementing what they
want to do with the freq of the cpu going offline, they will not implement the
stop CPU callback at all.

Signed-off-by: Preeti U Murthy <pre...@linux.vnet.ibm.com>
---

 drivers/cpufreq/cpufreq.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index d9fdedd..6463f35 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1380,7 +1380,7 @@ static int __cpufreq_remove_dev_prepare(struct device *dev,
 			if (!cpufreq_suspended)
 				pr_debug("%s: policy Kobject moved to cpu: %d from: %d\n",
 					 __func__, new_cpu, cpu);
-	} else if (cpufreq_driver->stop_cpu && cpufreq_driver->setpolicy) {
+	} else if (cpufreq_driver->stop_cpu) {
 		cpufreq_driver->stop_cpu(policy);
 	}
 
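For readers following along, the effect of relaxing the check can be sketched in user space. This is only an illustration under stated assumptions: struct policy, struct driver, remove_prepare_old(), remove_prepare_new() and target_index_driver are hypothetical stand-ins, not the kernel's types or functions.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct policy {
	int cpu;
	int stopped;	/* set to 1 once the stop callback has run */
};

struct driver {
	int  (*setpolicy)(struct policy *p);	/* only set_policy-style drivers */
	void (*stop_cpu)(struct policy *p);	/* optional callback */
};

/* Old check: stop_cpu is honoured only if setpolicy is implemented too. */
static void remove_prepare_old(const struct driver *d, struct policy *p)
{
	if (d->stop_cpu && d->setpolicy)
		d->stop_cpu(p);
}

/* Relaxed check: any driver that provides stop_cpu gets the call. */
static void remove_prepare_new(const struct driver *d, struct policy *p)
{
	if (d->stop_cpu)
		d->stop_cpu(p);
}

static void demo_stop_cpu(struct policy *p)
{
	p->stopped = 1;
}

/* A target_index-style driver: stop_cpu is set, setpolicy stays NULL. */
static const struct driver target_index_driver = {
	.setpolicy = NULL,
	.stop_cpu  = demo_stop_cpu,
};
```

With target_index_driver, remove_prepare_old() leaves the policy untouched, while remove_prepare_new() runs the callback; that is the behaviour a target_index driver such as powernv-cpufreq relies on in the next patch.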
[PATCH V2 2/2] powernv/cpufreq: Set the pstate of the last hotplugged out cpu in policy->cpus to minimum
It's possible today that the pstate of a core is held high even after the
entire core is hotplugged out, if a load had just run on the hotplugged cpu.
This is fair, since it is assumed that the pstate does not matter to a cpu in a
deep idle state, which is the expected state of a hotplugged core on powerpc.

However, on powerpc the pstate at a socket level is held at the maximum of the
pstates of each core. Even if the pstates of the active cores on that socket
are low, the socket pstate is held high due to the pstate of the hotplugged
core in the above mentioned scenario. This can waste a significant amount of
power for no benefit. Besides, since it is a non-active core, nothing can be
done from the kernel's end to set the frequency of the core right. Hence make
use of the stop_cpu callback to explicitly set the pstate of the core to a
minimum when the last cpu of the core gets hotplugged out.

Signed-off-by: Preeti U Murthy <pre...@linux.vnet.ibm.com>
---

 drivers/cpufreq/powernv-cpufreq.c |    9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 379c083..5a628f1 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -317,6 +317,14 @@ static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
 	return cpufreq_table_validate_and_show(policy, powernv_freqs);
 }
 
+static void powernv_cpufreq_stop_cpu(struct cpufreq_policy *policy)
+{
+	struct powernv_smp_call_data freq_data;
+
+	freq_data.pstate_id = powernv_pstate_info.min;
+	smp_call_function_single(policy->cpu, set_pstate, &freq_data, 1);
+}
+
 static struct cpufreq_driver powernv_cpufreq_driver = {
 	.name		= "powernv-cpufreq",
 	.flags		= CPUFREQ_CONST_LOOPS,
@@ -324,6 +332,7 @@ static struct cpufreq_driver powernv_cpufreq_driver = {
 	.verify		= cpufreq_generic_frequency_table_verify,
 	.target_index	= powernv_cpufreq_target_index,
 	.get		= powernv_cpufreq_get,
+	.stop_cpu	= powernv_cpufreq_stop_cpu,
 	.attr		= powernv_cpu_freq_attr,
 };
 
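The reasoning in the changelog (socket pstate = maximum over the core pstates) can be modelled with a few lines of user-space C. This is a rough sketch, not the hardware or firmware behaviour: socket_pstate() and offline_core() are hypothetical helpers, and the pstate values are made-up numbers where a larger value is assumed to mean a higher frequency.

```c
#include <stddef.h>

/*
 * Model: the socket-level pstate is the maximum of the per-core pstates.
 * Assumption for this sketch: a numerically larger pstate means a higher
 * frequency, and ncores is at least 1.
 */
static int socket_pstate(const int *core_pstate, size_t ncores)
{
	int max = core_pstate[0];
	size_t i;

	for (i = 1; i < ncores; i++)
		if (core_pstate[i] > max)
			max = core_pstate[i];
	return max;
}

/* What the stop_cpu callback achieves for a core that goes offline. */
static void offline_core(int *core_pstate, size_t core, int pstate_min)
{
	core_pstate[core] = pstate_min;
}
```

If core 2 was last left at the highest pstate and is then offlined without being lowered, socket_pstate() keeps returning that high value; after offline_core() drops it to the minimum, the socket value falls to whatever the busiest remaining core requests.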
Re: [PATCH V2 0/2] cpufreq/powernv: Set core pstate to a minimum just before hotplugging it out
On 5 September 2014 13:09, Preeti U Murthy <pre...@linux.vnet.ibm.com> wrote:
> Today cpus go to winkle when they are offlined. Since it is the deepest
> idle state that we have, it is expected to save good amount of power as
> compared to online state, where cores can enter nap/fastsleep only which
> are shallower idle states. However we observed no powersavings with
> winkle as compared to nap/fastsleep and traced the problem to the pstate
> of the core being kept at a high even when the core is offline. This can
> keep the socket pstate high, thus burning power unnecessarily. This
> patchset fixes this issue.
>
> Changes in V2: Changed smp_call_function_any() to
> smp_call_function_single() in Patch[2/2]

Acked-by: Viresh Kumar <viresh.ku...@linaro.org>
Re: deb-pkg: Add support for powerpc little endian
On Fri, Sep 05, 2014 at 03:28:47PM +1000, Michael Neuling wrote:
> The Debian powerpc little endian architecture is called ppc64le.  This

Huh? ppc64le or ppc64el?

> is the default architecture used by Ubuntu for powerpc.
>
> The below checks the kernel config to see if we are compiling little
> endian and sets the Debian arch appropriately.
>
> Signed-off-by: Michael Neuling <mi...@neuling.org>
>
> diff --git a/scripts/package/builddeb b/scripts/package/builddeb
> index 35d5a58..6f4a1af 100644
> --- a/scripts/package/builddeb
> +++ b/scripts/package/builddeb
> @@ -37,7 +37,7 @@ create_package() {
>  	s390*)
>  		debarch=s390$(grep -q CONFIG_64BIT=y $KCONFIG_CONFIG && echo x || true) ;;
>  	ppc*)
> -		debarch=powerpc ;;
> +		debarch=$(grep -q CPU_LITTLE_ENDIAN=y $KCONFIG_CONFIG && echo ppc64el || echo powerpc) ;;
>  	parisc*)
>  		debarch=hppa ;;
>  	mips*)

	Gabriel
[PATCH v2] deb-pkg: Add support for powerpc little endian
On Fri, 2014-09-05 at 09:13 +0200, Gabriel Paubert wrote:
> On Fri, Sep 05, 2014 at 03:28:47PM +1000, Michael Neuling wrote:
> > The Debian powerpc little endian architecture is called ppc64le.  This
>
> Huh? ppc64le or ppc64el?

ppc64el.  Commit message is wrong.  Fixed below.

Mikey

From: Michael Neuling <mi...@neuling.org>

deb-pkg: Add support for powerpc little endian

The Debian powerpc little endian architecture is called ppc64el.  This
is the default architecture used by Ubuntu for powerpc.

The below checks the kernel config to see if we are compiling little
endian and sets the Debian arch appropriately.

Signed-off-by: Michael Neuling <mi...@neuling.org>

diff --git a/scripts/package/builddeb b/scripts/package/builddeb
index 35d5a58..6f4a1af 100644
--- a/scripts/package/builddeb
+++ b/scripts/package/builddeb
@@ -37,7 +37,7 @@ create_package() {
 	s390*)
 		debarch=s390$(grep -q CONFIG_64BIT=y $KCONFIG_CONFIG && echo x || true) ;;
 	ppc*)
-		debarch=powerpc ;;
+		debarch=$(grep -q CPU_LITTLE_ENDIAN=y $KCONFIG_CONFIG && echo ppc64el || echo powerpc) ;;
 	parisc*)
 		debarch=hppa ;;
 	mips*)
RE: bit fields data tearing
From: Paul E. McKenney
> On Thu, Sep 04, 2014 at 10:47:24PM -0400, Peter Hurley wrote:
> > Hi James,
> >
> > On 09/04/2014 10:11 PM, James Bottomley wrote:
> > > On Thu, 2014-09-04 at 17:17 -0700, Paul E. McKenney wrote:
> > > > +And there are anti-guarantees:
> > > > +
> > > > + (*) These guarantees do not apply to bitfields, because compilers often
> > > > +     generate code to modify these using non-atomic read-modify-write
> > > > +     sequences.  Do not attempt to use bitfields to synchronize parallel
> > > > +     algorithms.
> > > > +
> > > > + (*) Even in cases where bitfields are protected by locks, all fields
> > > > +     in a given bitfield must be protected by one lock.  If two fields
> > > > +     in a given bitfield are protected by different locks, the compiler's
> > > > +     non-atomic read-modify-write sequences can cause an update to one
> > > > +     field to corrupt the value of an adjacent field.
> > > > +
> > > > + (*) These guarantees apply only to properly aligned and sized scalar
> > > > +     variables.  "Properly sized" currently means "int" and "long",
> > > > +     because some CPU families do not support loads and stores of
> > > > +     other sizes.  ("Some CPU families" is currently believed to
> > > > +     be only Alpha 21064.  If this is actually the case, a different
> > > > +     non-guarantee is likely to be formulated.)
> > >
> > > This is a bit unclear.  Presumably you're talking about definiteness of
> > > the outcome (as in what's seen after multiple stores to the same
> > > variable).
> >
> > No, the last conditions refers to adjacent byte stores from different
> > cpu contexts (either interrupt or SMP).
> >
> > > The guarantees are only for natural width on Parisc as well,
> > > so you would get a mess if you did byte stores to adjacent memory
> > > locations.
> >
> > For a simple test like:
> >
> > struct x {
> > 	long a;
> > 	char b;
> > 	char c;
> > 	char d;
> > 	char e;
> > };
> >
> > void store_bc(struct x *p) {
> > 	p->b = 1;
> > 	p->c = 2;
> > }
> >
> > on parisc, gcc generates separate byte stores
> >
> > void store_bc(struct x *p) {
> >    0:	34 1c 00 02 	ldi 1,ret0
> >    4:	0f 5c 12 08 	stb ret0,4(r26)
> >    8:	34 1c 00 04 	ldi 2,ret0
> >    c:	e8 40 c0 00 	bv r0(rp)
> >   10:	0f 5c 12 0a 	stb ret0,5(r26)
> >
> > which appears to confirm that on parisc adjacent byte data
> > is safe from corruption by concurrent cpu updates; that is,
> >
> > CPU 0                | CPU 1
> >                      |
> > p->b = 1             | p->c = 2
> >                      |
> >
> > will result in p->b == 1 && p->c == 2 (assume both values
> > were 0 before the call to store_bc()).
>
> What Peter said.  I would ask for suggestions for better wording, but
> I would much rather be able to say that single-byte reads and writes
> are atomic and that aligned-short reads and writes are also atomic.
>
> Thus far, it looks like we lose only very old Alpha systems, so unless
> I hear otherwise, I update my patch to outlaw these very old systems.

People with old Alphas can run NetBSD instead, along with those who have real VAXen :-)

I've seen gcc generate 32bit accesses for 16bit structure members on arm.
It does this because of the more limited range of the offsets for the 16bit access.
OTOH I don't know if it ever did this for writes - so it may be moot.

	David
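To make the bitfield anti-guarantee quoted above concrete, here is a deterministic, single-threaded sketch that replays one unlucky interleaving by hand. It is an illustration only: the 4-bit field layout, the masks and all function names are assumptions of this sketch, and real compilers and CPUs may lay out and access bitfields differently.

```c
#include <stdint.h>

/*
 * Two 4-bit fields sharing one byte, roughly what a compiler might do for
 * struct { unsigned a:4; unsigned b:4; }.  The layout is an assumption.
 */
#define FIELD_A_MASK 0x0fu
#define FIELD_B_MASK 0xf0u

/* Read-modify-write of field a: the neighbouring field b is rewritten too. */
static uint8_t set_a(uint8_t word, uint8_t v)
{
	return (uint8_t)((word & ~FIELD_A_MASK) | (v & 0x0fu));
}

/* Read-modify-write of field b: the neighbouring field a is rewritten too. */
static uint8_t set_b(uint8_t word, uint8_t v)
{
	return (uint8_t)((word & ~FIELD_B_MASK) | ((unsigned)(v & 0x0fu) << 4));
}

/* Replay: both "CPUs" load the old byte before either stores its result. */
static uint8_t interleaved_bitfield_update(void)
{
	uint8_t shared = 0;
	uint8_t cpu0 = shared;		/* CPU 0 loads */
	uint8_t cpu1 = shared;		/* CPU 1 loads */

	cpu0 = set_a(cpu0, 1);		/* CPU 0 modifies its private copy */
	cpu1 = set_b(cpu1, 2);		/* CPU 1 modifies its private copy */
	shared = cpu0;			/* CPU 0 stores a = 1 */
	shared = cpu1;			/* CPU 1 stores, wiping out a = 1 */
	return shared;
}

/* Separate byte members: each update is a single independent byte store. */
struct bytes {
	uint8_t b;
	uint8_t c;
};

static struct bytes interleaved_byte_update(void)
{
	struct bytes s = { 0, 0 };

	s.b = 1;			/* CPU 0: byte store to b only */
	s.c = 2;			/* CPU 1: byte store to c only */
	return s;
}
```

In the replay, the update to field a is lost because CPU 1 writes back a stale copy of the whole byte. The byte-member case has no shared read-modify-write, which matches what the parisc disassembly above shows for store_bc(); on a CPU without byte stores the compiler would have to fall back to a wider read-modify-write and the same loss could occur.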
Re: bit fields data tearing
On Thu, Sep 04, 2014 at 07:08:48PM -0700, H. Peter Anvin wrote:
> On 09/04/2014 05:59 PM, Peter Hurley wrote:
> > I have no idea how prevalent the ev56 is compared to the ev5.
> > Still we're talking about a chip that came out in 1996.
>
> Ah yes, I stand corrected.  According to Wikipedia, the affected CPUs
> were all the 2106x CPUs (EV4, EV45, LCA4, LCA45) plus the 21164 with no
> suffix (EV5).  However, we're still talking about museum pieces here.

Yes, that is correct, EV56 is the first Alpha CPU to have the byte-word
extension (BWX) CPU instructions.

It would not worry me if the kernel decided to assume atomic aligned
scalar accesses for all arches, thus terminating support for Alphas
without BWX.

The X server, ever since the libpciaccess change, does not work on
Alphas without BWX.

Debian Alpha (pretty much up to date at Debian-Ports) is still compiled
for all Alphas, i.e., without BWX.  The last attempt to start compiling
Debian Alpha with BWX, about three years ago when Alpha was kicked out
to Debian-Ports, resulted in a couple or so complaints so got nowhere.
It's frustrating supporting the lowest common denominator, as many of
the bugs specific to Alpha can be resolved by recompiling with the BWX.
The kernel no longer supporting Alphas without BWX might just be the
incentive we need to switch Debian Alpha to compiling with BWX.

Cheers
Michael.
[PATCH] pseries: Make CPU hotplug path endian safe
From: Bharata B Rao <bhar...@linux.vnet.ibm.com>

- ibm,rtas-configure-connector should treat the RTAS data as big endian.
- Treat ibm,ppc-interrupt-server#s as big-endian when setting
  smp_processor_id during hotplug.

Signed-off-by: Bharata B Rao <bhar...@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/pseries/dlpar.c       | 10 +++++-----
 arch/powerpc/platforms/pseries/hotplug-cpu.c |  4 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index 2d0b4d6..dc55f9c 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -48,11 +48,11 @@ static struct property *dlpar_parse_cc_property(struct cc_workarea *ccwa)
 	if (!prop)
 		return NULL;
 
-	name = (char *)ccwa + ccwa->name_offset;
+	name = (char *)ccwa + be32_to_cpu(ccwa->name_offset);
 	prop->name = kstrdup(name, GFP_KERNEL);
 
-	prop->length = ccwa->prop_length;
-	value = (char *)ccwa + ccwa->prop_offset;
+	prop->length = be32_to_cpu(ccwa->prop_length);
+	value = (char *)ccwa + be32_to_cpu(ccwa->prop_offset);
 	prop->value = kmemdup(value, prop->length, GFP_KERNEL);
 	if (!prop->value) {
 		dlpar_free_cc_property(prop);
@@ -78,7 +78,7 @@ static struct device_node *dlpar_parse_cc_node(struct cc_workarea *ccwa,
 	if (!dn)
 		return NULL;
 
-	name = (char *)ccwa + ccwa->name_offset;
+	name = (char *)ccwa + be32_to_cpu(ccwa->name_offset);
 	dn->full_name = kasprintf(GFP_KERNEL, "%s/%s", path, name);
 	if (!dn->full_name) {
 		kfree(dn);
@@ -148,7 +148,7 @@ struct device_node *dlpar_configure_connector(u32 drc_index,
 		return NULL;
 
 	ccwa = (struct cc_workarea *)&data_buf[0];
-	ccwa->drc_index = drc_index;
+	ccwa->drc_index = cpu_to_be32(drc_index);
 	ccwa->zero = 0;
 
 	do {
diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index 20d6297..447f8c6 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -247,7 +247,7 @@ static int pseries_add_processor(struct device_node *np)
 	unsigned int cpu;
 	cpumask_var_t candidate_mask, tmp;
 	int err = -ENOSPC, len, nthreads, i;
-	const u32 *intserv;
+	const __be32 *intserv;
 
 	intserv = of_get_property(np, "ibm,ppc-interrupt-server#s", &len);
 	if (!intserv)
@@ -293,7 +293,7 @@ static int pseries_add_processor(struct device_node *np)
 	for_each_cpu(cpu, tmp) {
 		BUG_ON(cpu_present(cpu));
 		set_cpu_present(cpu, true);
-		set_hard_smp_processor_id(cpu, be32_to_cpu(*intserv++));
+		set_hard_smp_processor_id(cpu, be32_to_cpu(*intserv++));
 	}
 	err = 0;
 out_unlock:
-- 
1.7.11.7
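What be32_to_cpu()/cpu_to_be32() buy in this patch can be shown outside the kernel with a byte-wise conversion that gives the same answer on big and little endian hosts. This is a user-space sketch; be32_decode() and be32_encode() are hypothetical helper names, not the kernel macros.

```c
#include <stdint.h>

/*
 * Decode a 32-bit big-endian value byte by byte.  The result does not
 * depend on the endianness of the machine running the code.
 */
static uint32_t be32_decode(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) |
	       ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |
	        (uint32_t)p[3];
}

/* Encode a host value as 32-bit big-endian, most significant byte first. */
static void be32_encode(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}
```

A little endian kernel that reads a field such as name_offset straight from the RTAS work area, without a conversion of this kind, would see the bytes reversed; for example the bytes 00 00 01 02 would be read as 0x02010000 instead of 0x102.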
[PATCH v1 05/21] PCI/MSI: Introduce weak arch_find_msi_chip() to find MSI chip
Introduce weak arch_find_msi_chip() to find the matching msi_chip.
Currently, the MSI chip is associated with a pci bus, because on ARM
platforms there may be more than one MSI controller in the system.
Associating the pci bus with an msi_chip helps a pci device find the
matching msi_chip and set up its MSI/MSI-X irqs correctly. But on other
platforms, like x86, we only need one MSI chip, because all devices use
the same MSI address/data and irq etc. So there is no need to associate
the pci bus with an MSI chip; just use an arch function,
arch_find_msi_chip(), to return the MSI chip for simplicity. The default
weak arch_find_msi_chip(), used on ARM platforms, finds the MSI chip by
pci bus.

Signed-off-by: Yijing Wang <wangyij...@huawei.com>
---
 drivers/pci/msi.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index a77e7f7..539c11d 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -29,9 +29,14 @@ static int pci_msi_enable = 1;
 
 /* Arch hooks */
 
+struct msi_chip * __weak arch_find_msi_chip(struct pci_dev *dev)
+{
+	return dev->bus->msi;
+}
+
 int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
 {
-	struct msi_chip *chip = dev->bus->msi;
+	struct msi_chip *chip = arch_find_msi_chip(dev);
 	int err;
 
 	if (!chip || !chip->setup_irq)
-- 
1.7.1
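The weak-default pattern used here can be tried in user space with the GCC/Clang weak attribute. The sketch below is an assumption-laden illustration: all *_demo types and demo_* functions are hypothetical, and only the default (bus lookup) path is exercised because an override would have to live in a second translation unit.

```c
#include <stddef.h>

/* Hypothetical stand-ins for msi_chip, pci_bus and pci_dev. */
struct msi_chip_demo { const char *name; };
struct pci_bus_demo  { struct msi_chip_demo *msi; };
struct pci_dev_demo  { struct pci_bus_demo *bus; };

/*
 * Weak default: find the chip through the device's bus.  A platform with a
 * single MSI chip could provide a normal (strong) definition of the same
 * function in another file, and the linker would pick that one instead.
 */
__attribute__((weak))
struct msi_chip_demo *demo_find_msi_chip(struct pci_dev_demo *dev)
{
	return dev->bus->msi;
}

/* Caller only goes through the hook, never through dev->bus->msi directly. */
static int demo_setup_msi_irq(struct pci_dev_demo *dev)
{
	struct msi_chip_demo *chip = demo_find_msi_chip(dev);

	if (!chip)
		return -1;	/* no chip found for this device */
	return 0;
}
```

The design point is that callers such as the setup path stay identical on every platform; only the lookup hook differs, which is what lets later patches in the series return a single per-arch chip without touching the bus association.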
[PATCH v1 03/21] MSI: Remove the redundant irq_set_chip_data()
Currently, pcie-designware, pcie-rcar, pci-tegra drivers use irq chip_data
to save the msi_chip pointer. They already call irq_set_chip_data() in
their own MSI irq map functions. So irq_set_chip_data() in
arch_setup_msi_irq() is useless.

Signed-off-by: Yijing Wang <wangyij...@huawei.com>
---
 drivers/pci/msi.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index f6cb317..d547f7f 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -41,8 +41,6 @@ int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
 	if (err < 0)
 		return err;
 
-	irq_set_chip_data(desc->irq, chip);
-
 	return 0;
 }
 
-- 
1.7.1
[PATCH v1 04/21] x86/xen/MSI: Eliminate arch_msix_mask_irq() and arch_msi_mask_irq()
Commit 0e4ccb150 added two __weak arch functions, arch_msix_mask_irq()
and arch_msi_mask_irq(), to fix a bug found when running xen on x86.
Introducing these two functions made the MSI code more complex. Also,
mask/unmask are irq actions that belong to the interrupt controller, so
weak arch functions should not be used to override the mask/unmask
interfaces. This patch reverts commit 0e4ccb150, exports struct irq_chip
msi_chip, and modifies msi_chip->irq_mask/irq_unmask() in the xen init
functions to fix this bug more simply. This is also preparation for
using struct msi_chip instead of weak arch MSI functions on all
platforms.

Signed-off-by: Yijing Wang <wangyij...@huawei.com>
CC: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
---
 arch/x86/include/asm/apic.h     |    4 ++++
 arch/x86/include/asm/x86_init.h |    3 ---
 arch/x86/kernel/apic/io_apic.c  |    2 +-
 arch/x86/kernel/x86_init.c      |   10 ----------
 arch/x86/pci/xen.c              |   16 ++--
 drivers/pci/msi.c               |   22 ++
 include/linux/msi.h             |    4 ++--
 7 files changed, 19 insertions(+), 42 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 465b309..47a5f94 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -43,6 +43,10 @@ static inline void generic_apic_probe(void)
 }
 #endif
 
+#ifdef CONFIG_PCI_MSI
+extern struct irq_chip msi_chip;
+#endif
+
 #ifdef CONFIG_X86_LOCAL_APIC
 
 extern unsigned int apic_verbosity;
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index e45e4da..f58a9c7 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -172,7 +172,6 @@ struct x86_platform_ops {
 
 struct pci_dev;
 struct msi_msg;
-struct msi_desc;
 
 struct x86_msi_ops {
 	int (*setup_msi_irqs)(struct pci_dev *dev, int nvec, int type);
@@ -183,8 +182,6 @@ struct x86_msi_ops {
 	void (*teardown_msi_irqs)(struct pci_dev *dev);
 	void (*restore_msi_irqs)(struct pci_dev *dev);
 	int  (*setup_hpet_msi)(unsigned int irq, unsigned int id);
-	u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag);
-	u32 (*msix_mask_irq)(struct msi_desc *desc, u32 flag);
 };
 
 struct IO_APIC_route_entry;
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index e877cfb..2a2ec28 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3161,7 +3161,7 @@ msi_set_affinity(struct irq_data *data, const struct cpumask *mask, bool force)
  * IRQ Chip for MSI PCI/PCI-X/PCI-Express Devices,
  * which implement the MSI or MSI-X Capability Structure.
  */
-static struct irq_chip msi_chip = {
+struct irq_chip msi_chip = {
 	.name			= "PCI-MSI",
 	.irq_unmask		= unmask_msi_irq,
 	.irq_mask		= mask_msi_irq,
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index e48b674..234b072 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -116,8 +116,6 @@ struct x86_msi_ops x86_msi = {
 	.teardown_msi_irqs	= default_teardown_msi_irqs,
 	.restore_msi_irqs	= default_restore_msi_irqs,
 	.setup_hpet_msi		= default_setup_hpet_msi,
-	.msi_mask_irq		= default_msi_mask_irq,
-	.msix_mask_irq		= default_msix_mask_irq,
 };
 
 /* MSI arch specific hooks */
@@ -140,14 +138,6 @@ void arch_restore_msi_irqs(struct pci_dev *dev)
 {
 	x86_msi.restore_msi_irqs(dev);
 }
-u32 arch_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
-{
-	return x86_msi.msi_mask_irq(desc, mask, flag);
-}
-u32 arch_msix_mask_irq(struct msi_desc *desc, u32 flag)
-{
-	return x86_msi.msix_mask_irq(desc, flag);
-}
 #endif
 
 struct x86_io_apic_ops x86_io_apic_ops = {
diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index ad03739..84c2fce 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -394,13 +394,9 @@ static void xen_teardown_msi_irq(unsigned int irq)
 {
 	xen_destroy_irq(irq);
 }
-static u32 xen_nop_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
-{
-	return 0;
-}
-static u32 xen_nop_msix_mask_irq(struct msi_desc *desc, u32 flag)
+
+void xen_nop_msi_mask(struct irq_data *data)
 {
-	return 0;
 }
 #endif
 
@@ -425,8 +421,8 @@ int __init pci_xen_init(void)
 	x86_msi.setup_msi_irqs = xen_setup_msi_irqs;
 	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
 	x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
-	x86_msi.msi_mask_irq = xen_nop_msi_mask_irq;
-	x86_msi.msix_mask_irq = xen_nop_msix_mask_irq;
+	msi_chip.irq_mask = xen_nop_msi_mask;
+	msi_chip.irq_unmask = xen_nop_msi_mask;
 #endif
 	return 0;
 }
@@ -506,8 +502,8 @@ int __init pci_xen_initial_domain(void)
 	x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs;
 	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
 	x86_msi.restore_msi_irqs =
[PATCH v1 00/21] Use MSI chip to configure MSI/MSI-X in all platforms
This series is based on Bjorn's pci-next branch + Alexander Gordeev's two
patches "Remove arch_msi_check_device()"
link: https://lkml.org/lkml/2014/7/12/41

Currently, there are a lot of weak arch functions in the MSI code.
Thierry Reding introduced the MSI chip framework to configure MSI/MSI-X
on arm. This series uses the MSI chip framework to refactor the MSI code
across all platforms and eliminate the weak arch functions. It has been
tested on x86 (with and without irq remap).

RFC->v1:
 - Updated "[patch 4/21] x86/xen/MSI: Eliminate...": export msi_chip
   instead of #ifdef to fix the MSI bug in xen running on x86.
 - Renamed arch_get_match_msi_chip() to arch_find_msi_chip().
 - Dropped the use of struct device as the msi_chip argument; we will do
   that later in another patchset.

Yijing Wang (21):
  PCI/MSI: Clean up struct msi_chip argument
  PCI/MSI: Remove useless bus->msi assignment
  MSI: Remove the redundant irq_set_chip_data()
  x86/xen/MSI: Eliminate arch_msix_mask_irq() and arch_msi_mask_irq()
  PCI/MSI: Introduce weak arch_find_msi_chip() to find MSI chip
  PCI/MSI: Refactor struct msi_chip to make it become more common
  x86/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  x86/xen/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  Irq_remapping/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  x86/MSI: Remove unused MSI weak arch functions
  MIPS/Octeon/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  MIPS/Xlp: Remove the dead function destroy_irq() to fix build error
  MIPS/Xlp/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  MIPS/Xlr/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  Powerpc/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  s390/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  arm/iop13xx/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  IA64/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  Sparc/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  tile/MSI: Use MSI chip framework to configure MSI/MSI-X irq
  PCI/MSI: Clean up unused MSI arch functions

 arch/arm/mach-iop13xx/include/mach/pci.h |    2 +
 arch/arm/mach-iop13xx/iq81340mc.c        |    1 +
 arch/arm/mach-iop13xx/iq81340sc.c        |    1 +
 arch/arm/mach-iop13xx/msi.c              |    9 ++-
 arch/arm/mach-iop13xx/pci.c              |    6 ++
 arch/ia64/kernel/msi_ia64.c              |   18 -
 arch/mips/pci/msi-octeon.c               |   35 +---
 arch/mips/pci/msi-xlp.c                  |   18 +++-
 arch/mips/pci/pci-xlr.c                  |   15 +++-
 arch/powerpc/kernel/msi.c                |   14 +++-
 arch/s390/pci/pci.c                      |   18 -
 arch/sparc/kernel/pci.c                  |   14 +++-
 arch/tile/kernel/pci_gx.c                |   14 +++-
 arch/x86/include/asm/apic.h              |    4 +
 arch/x86/include/asm/pci.h               |    4 +-
 arch/x86/include/asm/x86_init.h          |    7 --
 arch/x86/kernel/apic/io_apic.c           |   16 -
 arch/x86/kernel/x86_init.c               |   34
 arch/x86/pci/xen.c                       |   60 +--
 drivers/iommu/irq_remapping.c            |    9 ++-
 drivers/irqchip/irq-armada-370-xp.c      |   12 +--
 drivers/pci/host/pci-tegra.c             |    8 +-
 drivers/pci/host/pcie-designware.c       |    4 +-
 drivers/pci/host/pcie-rcar.c             |    8 +-
 drivers/pci/msi.c                        |  123 +-
 drivers/pci/probe.c                      |    1 -
 include/linux/msi.h                      |   26 ++-
 27 files changed, 268 insertions(+), 213 deletions(-)
[PATCH v1 10/21] x86/MSI: Remove unused MSI weak arch functions
Now we can clean up MSI weak arch functions in x86.

Signed-off-by: Yijing Wang <wangyij...@huawei.com>
---
 arch/x86/include/asm/pci.h      |    3 ---
 arch/x86/include/asm/x86_init.h |    4 ----
 arch/x86/kernel/apic/io_apic.c  |    2 +-
 arch/x86/kernel/x86_init.c      |   24 ------------------------
 drivers/iommu/irq_remapping.c   |    1 -
 5 files changed, 1 insertions(+), 33 deletions(-)

diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index 878a06d..34f9676 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -96,14 +96,11 @@ extern void pci_iommu_alloc(void);
 #ifdef CONFIG_PCI_MSI
 /* implemented in arch/x86/kernel/apic/io_apic. */
 struct msi_desc;
-int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
 void native_teardown_msi_irq(unsigned int irq);
-void native_restore_msi_irqs(struct pci_dev *dev);
 int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
 		  unsigned int irq_base, unsigned int irq_offset);
 extern struct msi_chip *x86_msi_chip;
 #else
-#define native_setup_msi_irqs		NULL
 #define native_teardown_msi_irq	NULL
 #endif
 
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index f58a9c7..2514f67 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -174,13 +174,9 @@ struct pci_dev;
 struct msi_msg;
 
 struct x86_msi_ops {
-	int (*setup_msi_irqs)(struct pci_dev *dev, int nvec, int type);
 	void (*compose_msi_msg)(struct pci_dev *dev, unsigned int irq,
 				unsigned int dest, struct msi_msg *msg,
 				u8 hpet_id);
-	void (*teardown_msi_irq)(unsigned int irq);
-	void (*teardown_msi_irqs)(struct pci_dev *dev);
-	void (*restore_msi_irqs)(struct pci_dev *dev);
 	int  (*setup_hpet_msi)(unsigned int irq, unsigned int id);
 };
 
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 882b95e..f998192 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3200,7 +3200,7 @@ int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
 	return 0;
 }
 
-int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
+static int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
 	struct msi_desc *msidesc;
 	unsigned int irq;
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index 234b072..cc32568 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -110,34 +110,10 @@ EXPORT_SYMBOL_GPL(x86_platform);
 
 #if defined(CONFIG_PCI_MSI)
 struct x86_msi_ops x86_msi = {
-	.setup_msi_irqs		= native_setup_msi_irqs,
 	.compose_msi_msg	= native_compose_msi_msg,
-	.teardown_msi_irq	= native_teardown_msi_irq,
-	.teardown_msi_irqs	= default_teardown_msi_irqs,
-	.restore_msi_irqs	= default_restore_msi_irqs,
 	.setup_hpet_msi		= default_setup_hpet_msi,
 };
 
-/* MSI arch specific hooks */
-int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
-{
-	return x86_msi.setup_msi_irqs(dev, nvec, type);
-}
-
-void arch_teardown_msi_irqs(struct pci_dev *dev)
-{
-	x86_msi.teardown_msi_irqs(dev);
-}
-
-void arch_teardown_msi_irq(unsigned int irq)
-{
-	x86_msi.teardown_msi_irq(irq);
-}
-
-void arch_restore_msi_irqs(struct pci_dev *dev)
-{
-	x86_msi.restore_msi_irqs(dev);
-}
 #endif
 
 struct x86_io_apic_ops x86_io_apic_ops = {
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index e75026e..99b1c0f 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -170,7 +170,6 @@ static void __init irq_remapping_modify_x86_ops(void)
 	x86_io_apic_ops.set_affinity	= set_remapped_irq_affinity;
 	x86_io_apic_ops.setup_entry	= setup_ioapic_remapped_entry;
 	x86_io_apic_ops.eoi_ioapic_pin	= eoi_ioapic_pin_remapped;
-	x86_msi.setup_msi_irqs		= irq_remapping_setup_msi_irqs;
 	x86_msi.setup_hpet_msi		= setup_hpet_msi_remapped;
 	x86_msi.compose_msi_msg		= compose_remapped_msi_msg;
 	x86_msi_chip = remap_msi_chip;
-- 
1.7.1
[PATCH v1 11/21] MIPS/Octeon/MSI: Use MSI chip framework to configure MSI/MSI-X irq
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang <wangyij...@huawei.com>
---
 arch/mips/pci/msi-octeon.c |   35 ++++++++++++++++++++++-------------
 1 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/arch/mips/pci/msi-octeon.c b/arch/mips/pci/msi-octeon.c
index ab0c5d1..0335d75 100644
--- a/arch/mips/pci/msi-octeon.c
+++ b/arch/mips/pci/msi-octeon.c
@@ -57,7 +57,7 @@ static int msi_irq_size;
  *
  * Returns 0 on success.
  */
-int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
+static int octeon_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
 {
 	struct msi_msg msg;
 	u16 control;
@@ -133,12 +133,12 @@ msi_irq_allocated:
 	/* Make sure the search for available interrupts didn't fail */
 	if (irq >= 64) {
 		if (request_private_bits) {
-			pr_err("arch_setup_msi_irq: Unable to find %d free interrupts, trying just one",
+			pr_err("octeon_setup_msi_irq: Unable to find %d free interrupts, trying just one",
 			       1 << request_private_bits);
 			request_private_bits = 0;
 			goto try_only_one;
 		} else
-			panic("arch_setup_msi_irq: Unable to find a free MSI interrupt");
+			panic("octeon_setup_msi_irq: Unable to find a free MSI interrupt");
 	}
 
 	/* MSI interrupts start at logical IRQ OCTEON_IRQ_MSI_BIT0 */
@@ -169,7 +169,7 @@ msi_irq_allocated:
 		msg.address_hi = (0 + CVMX_SLI_PCIE_MSI_RCV) >> 32;
 		break;
 	default:
-		panic("arch_setup_msi_irq: Invalid octeon_dma_bar_type");
+		panic("octeon_setup_msi_irq: Invalid octeon_dma_bar_type");
 	}
 	msg.data = irq - OCTEON_IRQ_MSI_BIT0;
 
@@ -184,7 +184,7 @@ msi_irq_allocated:
 	return 0;
 }
 
-int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
+static int octeon_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
 	struct msi_desc *entry;
 	int ret;
@@ -203,7 +203,7 @@ int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 		return 1;
 
 	list_for_each_entry(entry, &dev->msi_list, list) {
-		ret = arch_setup_msi_irq(dev, entry);
+		ret = octeon_setup_msi_irq(dev, entry);
 		if (ret < 0)
 			return ret;
 		if (ret > 0)
@@ -212,14 +212,13 @@ int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 
 	return 0;
 }
-
 /**
  * Called when a device no longer needs its MSI interrupts. All
  * MSI interrupts for the device are freed.
  *
  * @irq:    The devices first irq number. There may be multple in sequence.
  */
-void arch_teardown_msi_irq(unsigned int irq)
+static void octeon_teardown_msi_irq(unsigned int irq)
 {
 	int number_irqs;
 	u64 bitmask;
@@ -228,8 +227,8 @@ void arch_teardown_msi_irq(unsigned int irq)
 
 	if ((irq < OCTEON_IRQ_MSI_BIT0)
 		|| (irq > msi_irq_size + OCTEON_IRQ_MSI_BIT0))
-		panic("arch_teardown_msi_irq: Attempted to teardown illegal "
-		      "MSI interrupt (%d)", irq);
+		panic("octeon_teardown_msi_irq: Attempted to teardown illegal "
+			"MSI interrupt (%d)", irq);
 
 	irq -= OCTEON_IRQ_MSI_BIT0;
 	index = irq / 64;
@@ -242,7 +241,7 @@ void arch_teardown_msi_irq(unsigned int irq)
 	 */
 	number_irqs = 0;
 	while ((irq0 + number_irqs < 64) &&
-	       (msi_multiple_irq_bitmask[index]
+		(msi_multiple_irq_bitmask[index]
 		& (1ull << (irq0 + number_irqs))))
 		number_irqs++;
 	number_irqs++;
@@ -251,8 +250,8 @@ void arch_teardown_msi_irq(unsigned int irq)
 	/* Shift the mask to the correct bit location */
 	bitmask <<= irq0;
 	if ((msi_free_irq_bitmask[index] & bitmask) != bitmask)
-		panic("arch_teardown_msi_irq: Attempted to teardown MSI "
-		      "interrupt (%d) not in use", irq);
+		panic("octeon_teardown_msi_irq: Attempted to teardown MSI "
+			"interrupt (%d) not in use", irq);
 
 	/* Checks are done, update the in use bitmask */
 	spin_lock(&msi_free_irq_bitmask_lock);
@@ -261,6 +260,16 @@ void arch_teardown_msi_irq(unsigned int irq)
 	spin_unlock(&msi_free_irq_bitmask_lock);
 }
 
+static struct msi_chip octeon_msi_chip = {
+	.setup_irqs = octeon_setup_msi_irqs,
+	.teardown_irq = octeon_teardown_msi_irq,
+};
+
+struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
+{
+	return &octeon_msi_chip;
+}
+
 static DEFINE_RAW_SPINLOCK(octeon_irq_msi_lock);
 
 static u64 msi_rcv_reg[4];
-- 
1.7.1
[PATCH v1 09/21] Irq_remapping/MSI: Use MSI chip framework to configure MSI/MSI-X irq
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang <wangyij...@huawei.com>
---
 drivers/iommu/irq_remapping.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 33c4395..e75026e 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -148,6 +148,11 @@ static int irq_remapping_setup_msi_irqs(struct pci_dev *dev,
 		return do_setup_msix_irqs(dev, nvec);
 }
 
+static struct msi_chip remap_msi_chip = {
+	.setup_irqs = irq_remapping_setup_msi_irqs,
+	.teardown_irq = native_teardown_msi_irq,
+};
+
 static void eoi_ioapic_pin_remapped(int apic, int pin, int vector)
 {
 	/*
@@ -165,9 +170,10 @@ static void __init irq_remapping_modify_x86_ops(void)
 	x86_io_apic_ops.set_affinity	= set_remapped_irq_affinity;
 	x86_io_apic_ops.setup_entry	= setup_ioapic_remapped_entry;
 	x86_io_apic_ops.eoi_ioapic_pin	= eoi_ioapic_pin_remapped;
-	x86_msi.setup_msi_irqs	= irq_remapping_setup_msi_irqs;
+	x86_msi.setup_msi_irqs	= irq_remapping_setup_msi_irqs;
 	x86_msi.setup_hpet_msi	= setup_hpet_msi_remapped;
 	x86_msi.compose_msi_msg	= compose_remapped_msi_msg;
+	x86_msi_chip = &remap_msi_chip;
 }
 
 static __init int setup_nointremap(char *str)
-- 
1.7.1
[PATCH v1 07/21] x86/MSI: Use MSI chip framework to configure MSI/MSI-X irq
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang <wangyij...@huawei.com>
---
 arch/x86/include/asm/pci.h     |    1 +
 arch/x86/kernel/apic/io_apic.c |   12 ++++++++++++
 2 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index 0892ea0..878a06d 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -101,6 +101,7 @@ void native_teardown_msi_irq(unsigned int irq);
 void native_restore_msi_irqs(struct pci_dev *dev);
 int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
 		  unsigned int irq_base, unsigned int irq_offset);
+extern struct msi_chip *x86_msi_chip;
 #else
 #define native_setup_msi_irqs		NULL
 #define native_teardown_msi_irq		NULL
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 2a2ec28..882b95e 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3337,6 +3337,18 @@ int default_setup_hpet_msi(unsigned int irq, unsigned int id)
 }
 #endif
 
+struct msi_chip apic_msi_chip = {
+	.setup_irqs = native_setup_msi_irqs,
+	.teardown_irq = native_teardown_msi_irq,
+};
+
+struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
+{
+	return x86_msi_chip;
+}
+
+struct msi_chip *x86_msi_chip = &apic_msi_chip;
+
 #endif /* CONFIG_PCI_MSI */
 /*
  * Hypertransport interrupt support
-- 
1.7.1
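
For readers following the series, the chip selection on x86 can be
sketched in user space roughly as below. This is only an illustration
under stated assumptions: the model_* names are hypothetical stand-ins
and not kernel symbols. It shows a global pointer that defaults to the
APIC chip, a lookup that ignores the device, and a later init step
(interrupt remapping here) that retargets the pointer.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct msi_chip; only a name for identification. */
struct model_chip {
	const char *name;
};

static struct model_chip model_apic_chip = { "apic" };
static struct model_chip model_remap_chip = { "remap" };

/* Plays the role of x86_msi_chip: defaults to the native APIC chip. */
struct model_chip *model_active_chip = &model_apic_chip;

/* Plays the role of arch_find_msi_chip(): every device gets the same chip. */
struct model_chip *model_find_msi_chip(void)
{
	return model_active_chip;
}

/* Plays the role of an init hook that swaps in the remapping chip. */
void model_enable_remapping(void)
{
	model_active_chip = &model_remap_chip;
}
```

Because the lookup reads the pointer on every call, whichever init hook
ran last decides which chip all later MSI setup requests go through.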
[PATCH v1 06/21] PCI/MSI: Refactor struct msi_chip to make it become more common
Now there are a lot of __weak arch functions in the MSI code. These
functions make the MSI driver complex. Thierry Reding introduced a new
MSI chip framework to configure MSI/MSI-X irq on ARM. Use the new MSI
chip framework to refactor all other platform MSI arch code and
eliminate the weak arch MSI functions. This patch adds .setup_irqs(),
.teardown_irqs() and .restore_irqs() to struct msi_chip to make it more
generic.

Signed-off-by: Yijing Wang <wangyij...@huawei.com>
---
 drivers/pci/msi.c   |   15 +++++++++++++++
 include/linux/msi.h |    3 +++
 2 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 539c11d..d78d637 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -63,6 +63,11 @@ int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
 	struct msi_desc *entry;
 	int ret;
+	struct msi_chip *chip;
+
+	chip = arch_find_msi_chip(dev);
+	if (chip && chip->setup_irqs)
+		return chip->setup_irqs(dev, nvec, type);
 
 	/*
 	 * If an architecture wants to support multiple MSI, it needs to
@@ -105,6 +110,11 @@ void default_teardown_msi_irqs(struct pci_dev *dev)
 
 void __weak arch_teardown_msi_irqs(struct pci_dev *dev)
 {
+	struct msi_chip *chip = arch_find_msi_chip(dev);
+
+	if (chip && chip->teardown_irqs)
+		return chip->teardown_irqs(dev);
+
 	return default_teardown_msi_irqs(dev);
 }
 
@@ -128,6 +138,11 @@ static void default_restore_msi_irq(struct pci_dev *dev, int irq)
 
 void __weak arch_restore_msi_irqs(struct pci_dev *dev)
 {
+	struct msi_chip *chip = arch_find_msi_chip(dev);
+
+	if (chip && chip->restore_irqs)
+		return chip->restore_irqs(dev);
+
 	return default_restore_msi_irqs(dev);
 }
 
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 5650848..92a51e7 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -72,7 +72,10 @@ struct msi_chip {
 	struct list_head list;
 
 	int (*setup_irq)(struct pci_dev *dev, struct msi_desc *desc);
+	int (*setup_irqs)(struct pci_dev *dev, int nvec, int type);
 	void (*teardown_irq)(unsigned int irq);
+	void (*teardown_irqs)(struct pci_dev *dev);
+	void (*restore_irqs)(struct pci_dev *dev);
 };
 
 #endif /* LINUX_MSI_H */
-- 
1.7.1
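
The dispatch order this patch introduces for setup can be modelled in
user space roughly as follows. It is a simplified sketch, not the
kernel code: struct model_dev, struct model_chip and the model_*
functions are hypothetical stand-ins, and the multi-MSI check on
(type, nvec) is left out. The point is only that a bulk setup_irqs()
callback takes precedence, and the per-descriptor setup_irq() path is
used when no bulk callback is provided.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for struct pci_dev and struct msi_chip. */
struct model_dev {
	int ndesc;	/* number of MSI descriptors on the device */
	int programmed;	/* how many vectors the callbacks have set up */
};

struct model_chip {
	int (*setup_irq)(struct model_dev *dev, int desc);
	int (*setup_irqs)(struct model_dev *dev, int nvec);
};

/* Per-descriptor callback: sets up exactly one vector. */
int model_setup_one(struct model_dev *dev, int desc)
{
	(void)desc;
	dev->programmed++;
	return 0;
}

/* Bulk callback: sets up all requested vectors in one call. */
int model_setup_all(struct model_dev *dev, int nvec)
{
	dev->programmed += nvec;
	return 0;
}

/*
 * A bulk setup_irqs() callback wins when the chip provides one;
 * otherwise every descriptor is handed to setup_irq() in turn.
 */
int model_setup_msi_irqs(struct model_chip *chip, struct model_dev *dev,
			 int nvec)
{
	int i, ret;

	if (chip && chip->setup_irqs)
		return chip->setup_irqs(dev, nvec);

	if (!chip || !chip->setup_irq)
		return -EINVAL;

	for (i = 0; i < dev->ndesc; i++) {
		ret = chip->setup_irq(dev, i);
		if (ret < 0)
			return ret;
		if (ret > 0)
			return -ENOSPC;	/* positive return: no vector available */
	}
	return 0;
}
```

A chip that fills in only setup_irq gets one call per descriptor, while
a chip that also fills in setup_irqs sees a single call carrying nvec.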
[PATCH v1 08/21] x86/xen/MSI: Use MSI chip framework to configure MSI/MSI-X irq
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang <wangyij...@huawei.com>
CC: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
---
 arch/x86/pci/xen.c |   46 ++++++++++++++++++++++++++++++----------------
 1 files changed, 30 insertions(+), 16 deletions(-)

diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index 84c2fce..e669ee4 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -376,6 +376,11 @@ static void xen_initdom_restore_msi_irqs(struct pci_dev *dev)
 }
 #endif
 
+static void xen_teardown_msi_irq(unsigned int irq)
+{
+	xen_destroy_irq(irq);
+}
+
 static void xen_teardown_msi_irqs(struct pci_dev *dev)
 {
 	struct msi_desc *msidesc;
@@ -385,19 +390,26 @@ static void xen_teardown_msi_irqs(struct pci_dev *dev)
 		xen_pci_frontend_disable_msix(dev);
 	else
 		xen_pci_frontend_disable_msi(dev);
-
-	/* Free the IRQ's and the msidesc using the generic code. */
-	default_teardown_msi_irqs(dev);
-}
-
-static void xen_teardown_msi_irq(unsigned int irq)
-{
-	xen_destroy_irq(irq);
+
+	list_for_each_entry(msidesc, &dev->msi_list, list) {
+		int i, nvec;
+		if (msidesc->irq == 0)
+			continue;
+		if (msidesc->nvec_used)
+			nvec = msidesc->nvec_used;
+		else
+			nvec = 1 << msidesc->msi_attrib.multiple;
+		for (i = 0; i < nvec; i++)
+			xen_teardown_msi_irq(msidesc->irq + i);
+	}
 }
 
 void xen_nop_msi_mask(struct irq_data *data)
 {
 }
+
+struct msi_chip xen_msi_chip;
+
 #endif
 
 int __init pci_xen_init(void)
@@ -418,9 +430,9 @@ int __init pci_xen_init(void)
 #endif
 
 #ifdef CONFIG_PCI_MSI
-	x86_msi.setup_msi_irqs = xen_setup_msi_irqs;
-	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
-	x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
+	xen_msi_chip.setup_irqs = xen_setup_msi_irqs;
+	xen_msi_chip.teardown_irqs = xen_teardown_msi_irqs;
+	x86_msi_chip = &xen_msi_chip;
 	msi_chip.irq_mask = xen_nop_msi_mask;
 	msi_chip.irq_unmask = xen_nop_msi_mask;
 #endif
@@ -441,8 +453,9 @@ int __init pci_xen_hvm_init(void)
 #endif
 
 #ifdef CONFIG_PCI_MSI
-	x86_msi.setup_msi_irqs = xen_hvm_setup_msi_irqs;
-	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
+	xen_msi_chip.setup_irqs = xen_hvm_setup_msi_irqs;
+	xen_msi_chip.teardown_irq = xen_teardown_msi_irq;
+	x86_msi_chip = &xen_msi_chip;
 #endif
 	return 0;
 }
@@ -499,9 +512,10 @@ int __init pci_xen_initial_domain(void)
 	int irq;
 
 #ifdef CONFIG_PCI_MSI
-	x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs;
-	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
-	x86_msi.restore_msi_irqs = xen_initdom_restore_msi_irqs;
+	xen_msi_chip.setup_irqs = xen_initdom_setup_msi_irqs;
+	xen_msi_chip.teardown_irq = xen_teardown_msi_irq;
+	xen_msi_chip.restore_irqs = xen_initdom_restore_msi_irqs;
+	x86_msi_chip = &xen_msi_chip;
 	msi_chip.irq_mask = xen_nop_msi_mask;
 	msi_chip.irq_unmask = xen_nop_msi_mask;
 #endif
-- 
1.7.1
[PATCH v1 16/21] s390/MSI: Use MSI chip framework to configure MSI/MSI-X irq
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang <wangyij...@huawei.com>
---
 arch/s390/pci/pci.c |   18 ++++++++++++++----
 1 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index 2fa7b14..da5316e 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -358,7 +358,7 @@ static void zpci_irq_handler(struct airq_struct *airq)
 	}
 }
 
-int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
+int zpci_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
 {
 	struct zpci_dev *zdev = get_zdev(pdev);
 	unsigned int hwirq, msi_vecs;
@@ -434,7 +434,7 @@ out:
 	return rc;
 }
 
-void arch_teardown_msi_irqs(struct pci_dev *pdev)
+static void zpci_teardown_msi_irqs(struct pci_dev *pdev)
 {
 	struct zpci_dev *zdev = get_zdev(pdev);
 	struct msi_desc *msi;
@@ -448,9 +448,9 @@ void arch_teardown_msi_irqs(struct pci_dev *pdev)
 	/* Release MSI interrupts */
 	list_for_each_entry(msi, &pdev->msi_list, list) {
 		if (msi->msi_attrib.is_msix)
-			default_msix_mask_irq(msi, 1);
+			__msix_mask_irq(msi, 1);
 		else
-			default_msi_mask_irq(msi, 1, 1);
+			__msi_mask_irq(msi, 1, 1);
 		irq_set_msi_desc(msi->irq, NULL);
 		irq_free_desc(msi->irq);
 		msi->msg.address_lo = 0;
@@ -464,6 +464,16 @@ void arch_teardown_msi_irqs(struct pci_dev *pdev)
 	airq_iv_free_bit(zpci_aisb_iv, zdev->aisb);
 }
 
+static struct msi_chip zpci_msi_chip = {
+	.setup_irqs = zpci_setup_msi_irqs,
+	.teardown_irqs = zpci_teardown_msi_irqs,
+};
+
+struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
+{
+	return &zpci_msi_chip;
+}
+
 static void zpci_map_resources(struct zpci_dev *zdev)
 {
 	struct pci_dev *pdev = zdev->pdev;
-- 
1.7.1
[PATCH v1 15/21] Powerpc/MSI: Use MSI chip framework to configure MSI/MSI-X irq
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang <wangyij...@huawei.com>
---
 arch/powerpc/kernel/msi.c |   14 ++++++++++++--
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/msi.c b/arch/powerpc/kernel/msi.c
index 71bd161..01781a4 100644
--- a/arch/powerpc/kernel/msi.c
+++ b/arch/powerpc/kernel/msi.c
@@ -13,7 +13,7 @@
 
 #include <asm/machdep.h>
 
-int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
+static int ppc_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
 	if (!ppc_md.setup_msi_irqs || !ppc_md.teardown_msi_irqs) {
 		pr_debug("msi: Platform doesn't provide MSI callbacks.\n");
@@ -27,7 +27,17 @@ int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 	return ppc_md.setup_msi_irqs(dev, nvec, type);
 }
 
-void arch_teardown_msi_irqs(struct pci_dev *dev)
+static void ppc_teardown_msi_irqs(struct pci_dev *dev)
 {
 	ppc_md.teardown_msi_irqs(dev);
 }
+
+static struct msi_chip ppc_msi_chip = {
+	.setup_irqs = ppc_setup_msi_irqs,
+	.teardown_irqs = ppc_teardown_msi_irqs,
+};
+
+struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
+{
+	return &ppc_msi_chip;
+}
-- 
1.7.1
[PATCH v1 02/21] PCI/MSI: Remove useless bus->msi assignment
Currently, PCI drivers will initialize bus->msi in pcibios_add_bus().
pcibios_add_bus() will be called in every pci bus initialization. So
the bus->msi assignment in pci_alloc_child_bus() is useless.

Signed-off-by: Yijing Wang <wangyij...@huawei.com>
CC: Thierry Reding <thierry.red...@avionic-design.de>
CC: Thomas Petazzoni <thomas.petazz...@free-electrons.com>
---
 drivers/pci/probe.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index e3cf8a2..8296576 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -677,7 +677,6 @@ static struct pci_bus *pci_alloc_child_bus(struct pci_bus *parent,
 
 	child->parent = parent;
 	child->ops = parent->ops;
-	child->msi = parent->msi;
 	child->sysdata = parent->sysdata;
 	child->bus_flags = parent->bus_flags;
 
-- 
1.7.1
[PATCH v1 17/21] arm/iop13xx/MSI: Use MSI chip framework to configure MSI/MSI-X irq
Use MSI chip framework instead of arch MSI functions to configure MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework. Signed-off-by: Yijing Wang wangyij...@huawei.com --- arch/arm/mach-iop13xx/include/mach/pci.h |2 ++ arch/arm/mach-iop13xx/iq81340mc.c|1 + arch/arm/mach-iop13xx/iq81340sc.c|1 + arch/arm/mach-iop13xx/msi.c |9 +++-- arch/arm/mach-iop13xx/pci.c |6 ++ 5 files changed, 17 insertions(+), 2 deletions(-) diff --git a/arch/arm/mach-iop13xx/include/mach/pci.h b/arch/arm/mach-iop13xx/include/mach/pci.h index 59f42b5..7a073cb 100644 --- a/arch/arm/mach-iop13xx/include/mach/pci.h +++ b/arch/arm/mach-iop13xx/include/mach/pci.h @@ -10,6 +10,8 @@ struct pci_bus *iop13xx_scan_bus(int nr, struct pci_sys_data *); void iop13xx_atu_select(struct hw_pci *plat_pci); void iop13xx_pci_init(void); void iop13xx_map_pci_memory(void); +void iop13xx_add_bus(struct pci_bus *bus); +extern struct msi_chip iop13xx_msi_chip; #define IOP_PCI_STATUS_ERROR (PCI_STATUS_PARITY | \ PCI_STATUS_SIG_TARGET_ABORT | \ diff --git a/arch/arm/mach-iop13xx/iq81340mc.c b/arch/arm/mach-iop13xx/iq81340mc.c index 9cd07d3..19d47cb 100644 --- a/arch/arm/mach-iop13xx/iq81340mc.c +++ b/arch/arm/mach-iop13xx/iq81340mc.c @@ -59,6 +59,7 @@ static struct hw_pci iq81340mc_pci __initdata = { .map_irq= iq81340mc_pcix_map_irq, .scan = iop13xx_scan_bus, .preinit= iop13xx_pci_init, + .add_bus= iop13xx_add_bus; }; static int __init iq81340mc_pci_init(void) diff --git a/arch/arm/mach-iop13xx/iq81340sc.c b/arch/arm/mach-iop13xx/iq81340sc.c index b3ec11c..4d56993 100644 --- a/arch/arm/mach-iop13xx/iq81340sc.c +++ b/arch/arm/mach-iop13xx/iq81340sc.c @@ -61,6 +61,7 @@ static struct hw_pci iq81340sc_pci __initdata = { .scan = iop13xx_scan_bus, .map_irq= iq81340sc_atux_map_irq, .preinit= iop13xx_pci_init + .add_bus= iop13xx_add_bus; }; static int __init iq81340sc_pci_init(void) diff --git a/arch/arm/mach-iop13xx/msi.c b/arch/arm/mach-iop13xx/msi.c index e7730cf..1a8cb2f 100644 --- 
a/arch/arm/mach-iop13xx/msi.c +++ b/arch/arm/mach-iop13xx/msi.c @@ -132,7 +132,7 @@ static struct irq_chip iop13xx_msi_chip = { .irq_unmask = unmask_msi_irq, }; -int arch_setup_msi_irq(struct pci_dev *pdev, struct msi_desc *desc) +static int iop13xx_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc) { int id, irq = irq_alloc_desc_from(IRQ_IOP13XX_MSI_0, -1); struct msi_msg msg; @@ -159,7 +159,12 @@ int arch_setup_msi_irq(struct pci_dev *pdev, struct msi_desc *desc) return 0; } -void arch_teardown_msi_irq(unsigned int irq) +static void iop13xx_teardown_msi_irq(unsigned int irq) { irq_free_desc(irq); } + +struct msi_chip iop13xx_chip = { + .setup_irq = iop13xx_setup_msi_irq, + .teardown_irq = iop13xx_teardown_msi_irq, +}; diff --git a/arch/arm/mach-iop13xx/pci.c b/arch/arm/mach-iop13xx/pci.c index 9082b84..f498800 100644 --- a/arch/arm/mach-iop13xx/pci.c +++ b/arch/arm/mach-iop13xx/pci.c @@ -962,6 +962,12 @@ void __init iop13xx_atu_select(struct hw_pci *plat_pci) } } +void iop13xx_add_bus(struct pci_bus *bus) +{ + if (IS_ENABLED(CONFIG_PCI_MSI)) + bus-msi = iop13xx_msi_chip; +} + void __init iop13xx_pci_init(void) { /* clear pre-existing south bridge errors */ -- 1.7.1 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v1 19/21] Sparc/MSI: Use MSI chip framework to configure MSI/MSI-X irq
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang <wangyij...@huawei.com>
---
 arch/sparc/kernel/pci.c |   14 ++++++++++++--
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/sparc/kernel/pci.c b/arch/sparc/kernel/pci.c
index b36365f..2a89ee2 100644
--- a/arch/sparc/kernel/pci.c
+++ b/arch/sparc/kernel/pci.c
@@ -905,7 +905,7 @@ int pci_domain_nr(struct pci_bus *pbus)
 EXPORT_SYMBOL(pci_domain_nr);
 
 #ifdef CONFIG_PCI_MSI
-int arch_setup_msi_irq(struct pci_dev *pdev, struct msi_desc *desc)
+int sparc_setup_msi_irq(struct pci_dev *pdev, struct msi_desc *desc)
 {
 	struct pci_pbm_info *pbm = pdev->dev.archdata.host_controller;
 	unsigned int irq;
@@ -916,7 +916,7 @@ int arch_setup_msi_irq(struct pci_dev *pdev, struct msi_desc *desc)
 	return pbm->setup_msi_irq(&irq, pdev, desc);
 }
 
-void arch_teardown_msi_irq(unsigned int irq)
+void sparc_teardown_msi_irq(unsigned int irq)
 {
 	struct msi_desc *entry = irq_get_msi_desc(irq);
 	struct pci_dev *pdev = entry->dev;
@@ -925,6 +925,16 @@ void arch_teardown_msi_irq(unsigned int irq)
 	if (pbm->teardown_msi_irq)
 		pbm->teardown_msi_irq(irq, pdev);
 }
+
+static struct msi_chip sparc_msi_chip = {
+	.setup_irq = sparc_setup_msi_irq,
+	.teardown_irq = sparc_teardown_msi_irq,
+};
+
+struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
+{
+	return &sparc_msi_chip;
+}
 #endif /* !(CONFIG_PCI_MSI) */
 
 static void ali_sound_dma_hack(struct pci_dev *pdev, int set_bit)
-- 
1.7.1
[PATCH v1 13/21] MIPS/Xlp/MSI: Use MSI chip framework to configure MSI/MSI-X irq
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang <wangyij...@huawei.com>
---
 arch/mips/pci/msi-xlp.c |   14 ++++++++++++--
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/mips/pci/msi-xlp.c b/arch/mips/pci/msi-xlp.c
index e469dc7..6b791ef 100644
--- a/arch/mips/pci/msi-xlp.c
+++ b/arch/mips/pci/msi-xlp.c
@@ -245,7 +245,7 @@ static struct irq_chip xlp_msix_chip = {
 	.irq_unmask = unmask_msi_irq,
 };
 
-void arch_teardown_msi_irq(unsigned int irq)
+void xlp_teardown_msi_irq(unsigned int irq)
 {
 }
 
@@ -450,7 +450,7 @@ static int xlp_setup_msix(uint64_t lnkbase, int node, int link,
 	return 0;
 }
 
-int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
+static int xlp_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
 {
 	struct pci_dev *lnkdev;
 	uint64_t lnkbase;
@@ -472,6 +472,16 @@ int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
 	return xlp_setup_msi(lnkbase, node, link, desc);
 }
 
+static struct msi_chip xlp_chip = {
+	.setup_irq = xlp_setup_msi_irq,
+	.teardown_irq = xlp_teardown_msi_irq,
+};
+
+struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
+{
+	return &xlp_chip;
+}
+
 void __init xlp_init_node_msi_irqs(int node, int link)
 {
 	struct nlm_soc_info *nodep;
-- 
1.7.1
[PATCH v1 18/21] IA64/MSI: Use MSI chip framework to configure MSI/MSI-X irq
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang <wangyij...@huawei.com>
---
 arch/ia64/kernel/msi_ia64.c |   18 ++++++++++++++----
 1 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/arch/ia64/kernel/msi_ia64.c b/arch/ia64/kernel/msi_ia64.c
index 4efe748..55ac859 100644
--- a/arch/ia64/kernel/msi_ia64.c
+++ b/arch/ia64/kernel/msi_ia64.c
@@ -112,15 +112,15 @@ static struct irq_chip ia64_msi_chip = {
 };
 
 
-int arch_setup_msi_irq(struct pci_dev *pdev, struct msi_desc *desc)
+static int arch_ia64_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
 {
 	if (platform_setup_msi_irq)
-		return platform_setup_msi_irq(pdev, desc);
+		return platform_setup_msi_irq(dev, desc);
 
-	return ia64_setup_msi_irq(pdev, desc);
+	return ia64_setup_msi_irq(dev, desc);
 }
 
-void arch_teardown_msi_irq(unsigned int irq)
+static void arch_ia64_teardown_msi_irq(unsigned int irq)
 {
 	if (platform_teardown_msi_irq)
 		return platform_teardown_msi_irq(irq);
@@ -128,6 +128,16 @@ void arch_teardown_msi_irq(unsigned int irq)
 	return ia64_teardown_msi_irq(irq);
 }
 
+static struct msi_chip chip = {
+	.setup_irq = arch_ia64_setup_msi_irq,
+	.teardown_irq = arch_ia64_teardown_msi_irq,
+};
+
+struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
+{
+	return &chip;
+}
+
 #ifdef CONFIG_INTEL_IOMMU
 #ifdef CONFIG_SMP
 static int dmar_msi_set_affinity(struct irq_data *data,
-- 
1.7.1
[PATCH v1 12/21] MIPS/Xlp: Remove the dead function destroy_irq() to fix build error
Commit 465665f78a7 ("mips: Kill pointless destroy_irq()") removed the
destroy_irq(). So remove the leftover one in xlp_setup_msix() to fix
build error.

arch/mips/pci/msi-xlp.c: In function 'xlp_setup_msix':
arch/mips/pci/msi-xlp.c:447:3: error: implicit declaration of function 'destroy_irq'..
cc1: some warnings being treated as errors
make[1]: *** [arch/mips/pci/msi-xlp.o] Error 1
make: *** [arch/mips/pci/] Error 2

Signed-off-by: Yijing Wang <wangyii...@huawei.com>
Cc: Thomas Gleixner <t...@linutronix.de>
---
 arch/mips/pci/msi-xlp.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/arch/mips/pci/msi-xlp.c b/arch/mips/pci/msi-xlp.c
index fa374fe..e469dc7 100644
--- a/arch/mips/pci/msi-xlp.c
+++ b/arch/mips/pci/msi-xlp.c
@@ -443,10 +443,8 @@ static int xlp_setup_msix(uint64_t lnkbase, int node, int link,
 	msg.data = 0xc00 | msixvec;
 
 	ret = irq_set_msi_desc(xirq, desc);
-	if (ret < 0) {
-		destroy_irq(xirq);
+	if (ret < 0)
 		return ret;
-	}
 
 	write_msi_msg(xirq, &msg);
 	return 0;
-- 
1.7.1
[PATCH v1 14/21] MIPS/Xlr/MSI: Use MSI chip framework to configure MSI/MSI-X irq
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang <wangyij...@huawei.com>
---
 arch/mips/pci/pci-xlr.c |   15 +++++++++++++--
 1 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/mips/pci/pci-xlr.c b/arch/mips/pci/pci-xlr.c
index 0dde803..7bd91cc 100644
--- a/arch/mips/pci/pci-xlr.c
+++ b/arch/mips/pci/pci-xlr.c
@@ -214,11 +214,11 @@ static int get_irq_vector(const struct pci_dev *dev)
 }
 
 #ifdef CONFIG_PCI_MSI
-void arch_teardown_msi_irq(unsigned int irq)
+void xlr_teardown_msi_irq(unsigned int irq)
 {
 }
 
-int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
+int xlr_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
 {
 	struct msi_msg msg;
 	struct pci_dev *lnk;
@@ -263,6 +263,17 @@ int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
 	write_msi_msg(irq, &msg);
 	return 0;
 }
+
+static struct msi_chip xlr_msi_chip = {
+	.setup_irq = xlr_setup_msi_irq,
+	.teardown_irq = xlr_teardown_msi_irq,
+};
+
+struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
+{
+	return &xlr_msi_chip;
+}
+
 #endif
 
 /* Extra ACK needed for XLR on chip PCI controller */
-- 
1.7.1
[PATCH v1 21/21] PCI/MSI: Clean up unused MSI arch functions
Now we use struct msi_chip in all platforms to configure MSI/MSI-X. We can clean up the unused arch functions. Signed-off-by: Yijing Wang wangyij...@huawei.com --- drivers/iommu/irq_remapping.c |2 +- drivers/pci/msi.c | 99 - include/linux/msi.h | 14 -- 3 files changed, 39 insertions(+), 76 deletions(-) diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c index 99b1c0f..6e645f0 100644 --- a/drivers/iommu/irq_remapping.c +++ b/drivers/iommu/irq_remapping.c @@ -92,7 +92,7 @@ error: /* * Restore altered MSI descriptor fields and prevent just destroyed -* IRQs from tearing down again in default_teardown_msi_irqs() +* IRQs from tearing down again in teardown_msi_irqs() */ msidesc-irq = 0; msidesc-nvec_used = 0; diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index d78d637..e3e7f4f 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -34,50 +34,31 @@ struct msi_chip * __weak arch_find_msi_chip(struct pci_dev *dev) return dev-bus-msi; } -int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc) -{ - struct msi_chip *chip = arch_find_msi_chip(dev); - int err; - - if (!chip || !chip-setup_irq) - return -EINVAL; - - err = chip-setup_irq(dev, desc); - if (err 0) - return err; - - return 0; -} - -void __weak arch_teardown_msi_irq(unsigned int irq) -{ - struct msi_chip *chip = irq_get_chip_data(irq); - - if (!chip || !chip-teardown_irq) - return; - - chip-teardown_irq(irq); -} - -int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type) +int setup_msi_irqs(struct pci_dev *dev, int nvec, int type) { struct msi_desc *entry; int ret; struct msi_chip *chip; chip = arch_find_msi_chip(dev); - if (chip chip-setup_irqs) + if (!chip) + return -EINVAL; + + if (chip-setup_irqs) return chip-setup_irqs(dev, nvec, type); /* * If an architecture wants to support multiple MSI, it needs to -* override arch_setup_msi_irqs() +* implement chip-setup_irqs(). 
*/ if (type == PCI_CAP_ID_MSI nvec 1) return 1; + if (!chip-setup_irq) + return -EINVAL; + list_for_each_entry(entry, dev-msi_list, list) { - ret = arch_setup_msi_irq(dev, entry); + ret = chip-setup_irq(dev, entry); if (ret 0) return ret; if (ret 0) @@ -87,13 +68,20 @@ int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type) return 0; } -/* - * We have a default implementation available as a separate non-weak - * function, as it is used by the Xen x86 PCI code - */ -void default_teardown_msi_irqs(struct pci_dev *dev) +static void teardown_msi_irqs(struct pci_dev *dev) { struct msi_desc *entry; + struct msi_chip *chip; + + chip = arch_find_msi_chip(dev); + if (!chip) + return; + + if (chip-teardown_irqs) + return chip-teardown_irqs(dev); + + if (!chip-teardown_irq) + return; list_for_each_entry(entry, dev-msi_list, list) { int i, nvec; @@ -104,20 +92,10 @@ void default_teardown_msi_irqs(struct pci_dev *dev) else nvec = 1 entry-msi_attrib.multiple; for (i = 0; i nvec; i++) - arch_teardown_msi_irq(entry-irq + i); + chip-teardown_irq(entry-irq + i); } } -void __weak arch_teardown_msi_irqs(struct pci_dev *dev) -{ - struct msi_chip *chip = arch_find_msi_chip(dev); - - if (chip chip-teardown_irqs) - return chip-teardown_irqs(dev); - - return default_teardown_msi_irqs(dev); -} - static void default_restore_msi_irq(struct pci_dev *dev, int irq) { struct msi_desc *entry; @@ -136,10 +114,18 @@ static void default_restore_msi_irq(struct pci_dev *dev, int irq) write_msi_msg(irq, entry-msg); } -void __weak arch_restore_msi_irqs(struct pci_dev *dev) +static void default_restore_msi_irqs(struct pci_dev *dev) { - struct msi_chip *chip = arch_find_msi_chip(dev); + struct msi_desc *entry = NULL; + + list_for_each_entry(entry, dev-msi_list, list) { + default_restore_msi_irq(dev, entry-irq); + } +} +static void restore_msi_irqs(struct pci_dev *dev) +{ + struct msi_chip *chip = arch_find_msi_chip(dev); if (chip chip-restore_irqs) return chip-restore_irqs(dev); @@ -248,15 
+234,6 @@ void unmask_msi_irq(struct irq_data *data) msi_set_mask_bit(data, 0); } -void default_restore_msi_irqs(struct pci_dev *dev) -{ - struct msi_desc *entry; - -
[PATCH v1 20/21] tile/MSI: Use MSI chip framework to configure MSI/MSI-X irq
Use MSI chip framework instead of arch MSI functions to configure
MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

Signed-off-by: Yijing Wang <wangyij...@huawei.com>
---
 arch/tile/kernel/pci_gx.c |   14 ++++++++++++--
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/tile/kernel/pci_gx.c b/arch/tile/kernel/pci_gx.c
index e39f9c5..4912b75 100644
--- a/arch/tile/kernel/pci_gx.c
+++ b/arch/tile/kernel/pci_gx.c
@@ -1485,7 +1485,7 @@ static struct irq_chip tilegx_msi_chip = {
 	/* TBD: support set_affinity. */
 };
 
-int arch_setup_msi_irq(struct pci_dev *pdev, struct msi_desc *desc)
+static int tile_setup_msi_irq(struct pci_dev *pdev, struct msi_desc *desc)
 {
 	struct pci_controller *controller;
 	gxio_trio_context_t *trio_context;
@@ -1604,7 +1604,17 @@ is_64_failure:
 	return ret;
 }
 
-void arch_teardown_msi_irq(unsigned int irq)
+void tile_teardown_msi_irq(unsigned int irq)
 {
 	irq_free_hwirq(irq);
 }
+
+static struct msi_chip tile_msi_chip = {
+	.setup_irq = tile_setup_msi_irq,
+	.teardown_irq = tile_teardown_msi_irq,
+};
+
+struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
+{
+	return &tile_msi_chip;
+}
-- 
1.7.1
[PATCH v1 01/21] PCI/MSI: Clean up struct msi_chip argument
Msi_chip functions setup_irq/teardown_irq rarely use msi_chip argument.
We can look up msi_chip pointer by the device pointer or irq number,
so clean up msi_chip argument.

Signed-off-by: Yijing Wang wangyij...@huawei.com
CC: Thierry Reding thierry.red...@gmail.com
CC: Thomas Petazzoni thomas.petazz...@free-electrons.com
---
 drivers/irqchip/irq-armada-370-xp.c |   12 +---
 drivers/pci/host/pci-tegra.c        |    8 +---
 drivers/pci/host/pcie-designware.c  |    4 ++--
 drivers/pci/host/pcie-rcar.c        |    8 +---
 drivers/pci/msi.c                   |    4 ++--
 include/linux/msi.h                 |    5 ++---
 6 files changed, 21 insertions(+), 20 deletions(-)

diff --git a/drivers/irqchip/irq-armada-370-xp.c b/drivers/irqchip/irq-armada-370-xp.c
index 574aba0..658990c 100644
--- a/drivers/irqchip/irq-armada-370-xp.c
+++ b/drivers/irqchip/irq-armada-370-xp.c
@@ -129,9 +129,8 @@ static void armada_370_xp_free_msi(int hwirq)
 	mutex_unlock(&msi_used_lock);
 }
 
-static int armada_370_xp_setup_msi_irq(struct msi_chip *chip,
-				       struct pci_dev *pdev,
-				       struct msi_desc *desc)
+static int armada_370_xp_setup_msi_irq(struct pci_dev *pdev,
+				       struct msi_desc *desc)
 {
 	struct msi_msg msg;
 	int virq, hwirq;
@@ -156,8 +155,7 @@ static int armada_370_xp_setup_msi_irq(struct msi_chip *chip,
 	return 0;
 }
 
-static void armada_370_xp_teardown_msi_irq(struct msi_chip *chip,
-					   unsigned int irq)
+static void armada_370_xp_teardown_msi_irq(unsigned int irq)
 {
 	struct irq_data *d = irq_get_irq_data(irq);
 	unsigned long hwirq = d->hwirq;
@@ -166,8 +164,8 @@ static void armada_370_xp_teardown_msi_irq(struct msi_chip *chip,
 	armada_370_xp_free_msi(hwirq);
 }
 
-static int armada_370_xp_check_msi_device(struct msi_chip *chip, struct pci_dev *dev,
-					  int nvec, int type)
+static int armada_370_xp_check_msi_device(struct pci_dev *dev,
+					  int nvec, int type)
 {
 	/* We support MSI, but not MSI-X */
 	if (type == PCI_CAP_ID_MSI)
diff --git a/drivers/pci/host/pci-tegra.c b/drivers/pci/host/pci-tegra.c
index 0fb0fdb..edd4040 100644
--- a/drivers/pci/host/pci-tegra.c
+++ b/drivers/pci/host/pci-tegra.c
@@ -1157,9 +1157,10 @@ static irqreturn_t tegra_pcie_msi_irq(int irq, void *data)
 	return processed > 0 ? IRQ_HANDLED : IRQ_NONE;
 }
 
-static int tegra_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
+static int tegra_msi_setup_irq(struct pci_dev *pdev,
 			       struct msi_desc *desc)
 {
+	struct msi_chip *chip = pdev->bus->msi;
 	struct tegra_msi *msi = to_tegra_msi(chip);
 	struct msi_msg msg;
 	unsigned int irq;
@@ -1185,10 +1186,11 @@ static int tegra_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
 	return 0;
 }
 
-static void tegra_msi_teardown_irq(struct msi_chip *chip, unsigned int irq)
+static void tegra_msi_teardown_irq(unsigned int irq)
 {
-	struct tegra_msi *msi = to_tegra_msi(chip);
 	struct irq_data *d = irq_get_irq_data(irq);
+	struct msi_chip *chip = irq_get_chip_data(irq);
+	struct tegra_msi *msi = to_tegra_msi(chip);
 
 	tegra_msi_free(msi, d->hwirq);
 }
diff --git a/drivers/pci/host/pcie-designware.c b/drivers/pci/host/pcie-designware.c
index 52bd3a1..2204456 100644
--- a/drivers/pci/host/pcie-designware.c
+++ b/drivers/pci/host/pcie-designware.c
@@ -342,7 +342,7 @@ static void clear_irq(unsigned int irq)
 	msi->msi_attrib.multiple = 0;
 }
 
-static int dw_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
+static int dw_msi_setup_irq(struct pci_dev *pdev,
 			    struct msi_desc *desc)
 {
 	int irq, pos, msgvec;
@@ -384,7 +384,7 @@ static int dw_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
 	return 0;
 }
 
-static void dw_msi_teardown_irq(struct msi_chip *chip, unsigned int irq)
+static void dw_msi_teardown_irq(unsigned int irq)
 {
 	clear_irq(irq);
 }
diff --git a/drivers/pci/host/pcie-rcar.c b/drivers/pci/host/pcie-rcar.c
index 4884ee5..647bc9f 100644
--- a/drivers/pci/host/pcie-rcar.c
+++ b/drivers/pci/host/pcie-rcar.c
@@ -615,9 +615,10 @@ static irqreturn_t rcar_pcie_msi_irq(int irq, void *data)
 	return IRQ_HANDLED;
 }
 
-static int rcar_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
+static int rcar_msi_setup_irq(struct pci_dev *pdev,
 			      struct msi_desc *desc)
 {
+	struct msi_chip *chip = pdev->bus->msi;
 	struct rcar_msi *msi = to_rcar_msi(chip);
 	struct rcar_pcie *pcie = container_of(chip, struct rcar_pcie, msi.chip);
 	struct msi_msg msg;
@@ -645,10 +646,11 @@ static int rcar_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
 	return 0;
 }
 
-static void rcar_msi_teardown_irq(struct msi_chip *chip,
Re: [PATCH v1 09/21] Irq_remapping/MSI: Use MSI chip framework to configure MSI/MSI-X irq
Hello.

On 9/5/2014 2:09 PM, Yijing Wang wrote:

> Use MSI chip framework instead of arch MSI functions to configure
> MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

> Signed-off-by: Yijing Wang wangyij...@huawei.com
> ---
>  drivers/iommu/irq_remapping.c |    8 +++-
>  1 files changed, 7 insertions(+), 1 deletions(-)

> diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
> index 33c4395..e75026e 100644
> --- a/drivers/iommu/irq_remapping.c
> +++ b/drivers/iommu/irq_remapping.c
[...]
> @@ -165,9 +170,10 @@ static void __init irq_remapping_modify_x86_ops(void)
>  	x86_io_apic_ops.set_affinity	= set_remapped_irq_affinity;
>  	x86_io_apic_ops.setup_entry	= setup_ioapic_remapped_entry;
>  	x86_io_apic_ops.eoi_ioapic_pin	= eoi_ioapic_pin_remapped;
> -	x86_msi.setup_msi_irqs		= irq_remapping_setup_msi_irqs;
> +	x86_msi.setup_msi_irqs          = irq_remapping_setup_msi_irqs;

   AFAICS, this change only converts tabs to spaces, so not needed at all.

>  	x86_msi.setup_hpet_msi		= setup_hpet_msi_remapped;
>  	x86_msi.compose_msi_msg		= compose_remapped_msi_msg;
> +	x86_msi_chip = remap_msi_chip;

   Please align = with the rest of assignments.

WBR, Sergei
Re: [PATCH v1 15/21] Powerpc/MSI: Use MSI chip framework to configure MSI/MSI-X irq
Hello.

On 9/5/2014 2:10 PM, Yijing Wang wrote:

> Use MSI chip framework instead of arch MSI functions to configure
> MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

> Signed-off-by: Yijing Wang wangyij...@huawei.com
> ---
>  arch/powerpc/kernel/msi.c |   14 --
>  1 files changed, 12 insertions(+), 2 deletions(-)

> diff --git a/arch/powerpc/kernel/msi.c b/arch/powerpc/kernel/msi.c
> index 71bd161..01781a4 100644
> --- a/arch/powerpc/kernel/msi.c
> +++ b/arch/powerpc/kernel/msi.c
[...]
> @@ -27,7 +27,17 @@ int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
>  	return ppc_md.setup_msi_irqs(dev, nvec, type);
>  }
>
> -void arch_teardown_msi_irqs(struct pci_dev *dev)
> +static void ppc_teardown_msi_irqs(struct pci_dev *dev)

   Shouldn't this function take IRQ # instead?

>  {
>  	ppc_md.teardown_msi_irqs(dev);
>  }
> +
> +static struct msi_chip ppc_msi_chip = {
> +	.setup_irqs = ppc_setup_msi_irqs,
> +	.teardown_irqs = ppc_teardown_msi_irqs,
> +};
> +
> +struct msi_chip *arch_find_msi_chip(struct pci_dev *dev)
> +{
> +	return &ppc_msi_chip;
> +}

WBR, Sergei
[PATCH 1/2][v4] powerpc/fsl-booke: Add initial T1040/T1042 RDB board support
T1040/T1042RDB is Freescale Reference Design Board. The board can support both T1040/T1042 QorIQ Power Architecture™ processor. T1040/T1042RDB board Overview --- - SERDES Connections, 8 lanes supporting: - PCI - SGMII - QSGMII - SATA 2.0 - DDR Controller - Supports rates of up to 1600 MHz data-rate - Supports one DDR3LP UDIMM -IFC/Local Bus - NAND flash: 1GB 8-bit NAND flash - NOR: 128MB 16-bit NOR Flash - Ethernet - Two on-board RGMII 10/100/1G ethernet ports. - PHY #0 remains powered up during deep-sleep - CPLD - Clocks - System and DDR clock (SYSCLK, “DDRCLK”) - SERDES clocks - Power Supplies - USB - Supports two USB 2.0 ports with integrated PHYs - Two type A ports with 5V@1.5A per port. - SDHC - SDHC/SDXC connector - SPI - On-board 64MB SPI flash - I2C - Devices connected: EEPROM, thermal monitor, VID controller - Other IO - Two Serial ports - ProfiBus port Add support for T1040/T1042 RDB board: -add device tree -add entry in Kconfig to build -Add entry in corenet_generic.c, as it is similar to other corenet platforms Signed-off-by: Priyanka Jain priyanka.j...@freescale.com Signed-off-by: Poonam Aggrwal poonam.aggr...@freescale.com Signed-off-by: Prabhakar Kushwaha prabha...@freescale.com --- changes for v4: Updated cpld compatible string field changes for v3: Incorporated Scott comments on moving cpld compatible field to board specific file as cpld binaries are different changes for v2: Incorporated Scott comments on using common name for compatible string for cpld as register set is same arch/powerpc/boot/dts/t1040rdb.dts| 48 arch/powerpc/boot/dts/t1042rdb.dts| 48 arch/powerpc/boot/dts/t104xrdb.dtsi | 156 + arch/powerpc/platforms/85xx/Kconfig |2 +- arch/powerpc/platforms/85xx/corenet_generic.c |2 + 5 files changed, 255 insertions(+), 1 deletions(-) create mode 100644 arch/powerpc/boot/dts/t1040rdb.dts create mode 100644 arch/powerpc/boot/dts/t1042rdb.dts create mode 100644 arch/powerpc/boot/dts/t104xrdb.dtsi diff --git a/arch/powerpc/boot/dts/t1040rdb.dts 
b/arch/powerpc/boot/dts/t1040rdb.dts new file mode 100644 index 000..79a0bed --- /dev/null +++ b/arch/powerpc/boot/dts/t1040rdb.dts @@ -0,0 +1,48 @@ +/* + * T1040RDB Device Tree Source + * + * Copyright 2014 Freescale Semiconductor Inc. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * * Neither the name of Freescale Semiconductor nor the + * names of its contributors may be used to endorse or promote products + * derived from this software without specific prior written permission. + * + * + * ALTERNATIVELY, this software may be distributed under the terms of the + * GNU General Public License (GPL) as published by the Free Software + * Foundation, either version 2 of that License or (at your option) any + * later version. + * + * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor AS IS AND ANY + * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY + * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES + * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+
+/include/ "fsl/t104xsi-pre.dtsi"
+/include/ "t104xrdb.dtsi"
+
+/ {
+	model = "fsl,T1040RDB";
+	compatible = "fsl,T1040RDB";
+	ifc: localbus@ffe124000 {
+		cpld@3,0 {
+			compatible = "fsl,t1040rdb-cpld";
+		};
+	};
+};
+
+/include/ "fsl/t1040si-post.dtsi"
diff --git a/arch/powerpc/boot/dts/t1042rdb.dts b/arch/powerpc/boot/dts/t1042rdb.dts
new file mode 100644
index 000..228a635
--- /dev/null
+++ b/arch/powerpc/boot/dts/t1042rdb.dts
@@ -0,0 +1,48 @@
+/*
+ * T1042RDB Device Tree Source
+ *
+ * Copyright 2014 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ *
[PATCH 2/2][v4] powerpc/fsl-booke: Add initial T1042RDB_PI board support
T1042RDB_PI is Freescale Reference Design Board supporting the T1042 QorIQ Power Architecture™ processor. T1042 is a reduced personality of T1040 SoC without Integrated 8-port Gigabit. The board is designed with low power features targeted for Printing Image Market. T1042RDB_PI is similar to T1040RDB board with few differences like it has video interface, supports T1042 personality only T1042RDB_PI board Overview --- - SERDES Connections, 8 lanes supporting: - PCI - SATA 2.0 - DDR Controller - Supports rates of up to 1600 MHz data-rate - Supports one DDR3LP UDIMM -IFC/Local Bus - NAND flash: 1GB 8-bit NAND flash - NOR: 128MB 16-bit NOR Flash - Ethernet - Two on-board RGMII 10/100/1G ethernet ports. - PHY #0 remains powered up during deep-sleep - CPLD - Clocks - System and DDR clock (SYSCLK, “DDRCLK”) - SERDES clocks - Power Supplies - USB - Supports two USB 2.0 ports with integrated PHYs - Two type A ports with 5V@1.5A per port. - SDHC - SDHC/SDXC connector - SPI - On-board 64MB SPI flash - I2C - Device connected: EEPROM, thermal monitor, VID controller, RTC - Other IO - Two Serial ports - ProfiBus port Add support for T1042RDB_PI board: -add device tree -Add entry in corenet_generic.c, as it is similar to other corenet platforms Signed-off-by: Poonam Aggrwal poonam.aggr...@freescale.com Signed-off-by: Prabhakar Kushwaha prabha...@freescale.com Signed-off-by: Priyanka Jain priyanka.j...@freescale.com --- changes for v4: Updated cpld compatible string field changes for v3: Incorporated Scott comments on moving cpld compatible field to board specific file as cpld binaries are different changes for v2: Incorporated Scott comments on using common name for compatible string for cpld as register set is same arch/powerpc/boot/dts/t1042rdb_pi.dts | 57 + arch/powerpc/platforms/85xx/corenet_generic.c |1 + 2 files changed, 58 insertions(+), 0 deletions(-) create mode 100644 arch/powerpc/boot/dts/t1042rdb_pi.dts diff --git a/arch/powerpc/boot/dts/t1042rdb_pi.dts 
b/arch/powerpc/boot/dts/t1042rdb_pi.dts new file mode 100644 index 000..b9d0877 --- /dev/null +++ b/arch/powerpc/boot/dts/t1042rdb_pi.dts @@ -0,0 +1,57 @@ +/* + * T1042RDB_PI Device Tree Source + * + * Copyright 2014 Freescale Semiconductor Inc. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * * Neither the name of Freescale Semiconductor nor the + * names of its contributors may be used to endorse or promote products + * derived from this software without specific prior written permission. + * + * + * ALTERNATIVELY, this software may be distributed under the terms of the + * GNU General Public License (GPL) as published by the Free Software + * Foundation, either version 2 of that License or (at your option) any + * later version. + * + * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor AS IS AND ANY + * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY + * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES + * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+
+/include/ "fsl/t104xsi-pre.dtsi"
+/include/ "t104xrdb.dtsi"
+
+/ {
+	model = "fsl,T1042RDB_PI";
+	compatible = "fsl,T1042RDB_PI";
+	ifc: localbus@ffe124000 {
+		cpld@3,0 {
+			compatible = "fsl,t1042rdb_pi-cpld";
+		};
+	};
+	soc: soc@ffe00 {
+		i2c@118000 {
+			rtc@68 {
+				compatible = "dallas,ds1337";
+				reg = <0x68>;
+				interrupts = <0x2 0x1 0 0>;
+			};
+		};
+	};
+};
+
+/include/ "fsl/t1042si-post.dtsi"
diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c b/arch/powerpc/platforms/85xx/corenet_generic.c
index c268f89..100e80d 100644
Re: [PATCH v1 15/21] Powerpc/MSI: Use MSI chip framework to configure MSI/MSI-X irq
On 9/5/2014 3:33 PM, wangyijing wrote:

>>> Use MSI chip framework instead of arch MSI functions to configure
>>> MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.

>>> Signed-off-by: Yijing Wang wangyij...@huawei.com
>>> ---
>>>  arch/powerpc/kernel/msi.c |   14 --
>>>  1 files changed, 12 insertions(+), 2 deletions(-)

>>> diff --git a/arch/powerpc/kernel/msi.c b/arch/powerpc/kernel/msi.c
>>> index 71bd161..01781a4 100644
>>> --- a/arch/powerpc/kernel/msi.c
>>> +++ b/arch/powerpc/kernel/msi.c
>> [...]
>>> @@ -27,7 +27,17 @@ int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
>>>  	return ppc_md.setup_msi_irqs(dev, nvec, type);
>>>  }
>>>
>>> -void arch_teardown_msi_irqs(struct pci_dev *dev)
>>> +static void ppc_teardown_msi_irqs(struct pci_dev *dev)

>>     Shouldn't this function take IRQ # instead?

> This function need to teardown all msi irqs of the pci dev, we should
> pass the pci dev as argument .

   Ah, I've mixed up the teardown_irqs() method with teardown_irq()!
Too similar. :-)

> Thanks!
> Yijing.

WBR, Sergei
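The teardown_irq()/teardown_irqs() distinction that caused the mix-up above can be sketched as follows. This is only an illustration: `struct pci_dev_sketch`, `struct msi_chip_sketch` and the `sketch_*` functions are hypothetical stand-ins, not the kernel's types or the code from this series. The per-irq hook takes an irq number and releases one interrupt; the per-device hook takes the device and releases all of its MSI irqs.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a PCI device (not the kernel type). */
struct pci_dev_sketch {
	unsigned int irqs[4];	/* MSI irq numbers owned by the device */
	size_t nr_irqs;		/* how many entries of irqs[] are in use */
};

/* Hypothetical chip with both hooks, to show why their arguments differ. */
struct msi_chip_sketch {
	/* Per-irq hook: releases exactly one interrupt. */
	void (*teardown_irq)(unsigned int irq);
	/* Per-device hook: releases every MSI irq of the device. */
	void (*teardown_irqs)(struct pci_dev_sketch *dev);
};

/* Counts released irqs, purely for illustration. */
static unsigned int torn_down;

static void sketch_teardown_irq(unsigned int irq)
{
	(void)irq;	/* a real driver would free the hwirq here */
	torn_down++;
}

static void sketch_teardown_irqs(struct pci_dev_sketch *dev)
{
	size_t i;

	/* The device is needed to find all of its irqs. */
	for (i = 0; i < dev->nr_irqs; i++)
		sketch_teardown_irq(dev->irqs[i]);
	dev->nr_irqs = 0;
}
```

With a device owning three irqs, calling the per-device hook once releases all three, which is why it cannot be driven by a single irq number.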
Re: [PATCH v2] deb-pkg: Add support for powerpc little endian
On Fri, Sep 05, 2014 at 05:55:18PM +1000, Michael Neuling wrote:
> On Fri, 2014-09-05 at 09:13 +0200, Gabriel Paubert wrote:
> > On Fri, Sep 05, 2014 at 03:28:47PM +1000, Michael Neuling wrote:
> > > The Debian powerpc little endian architecture is called ppc64le.  This
> >
> > Huh? ppc64le or ppc64el?
>
> ppc64el.  Commit message is wrong.  Fixed below.
>
> Mikey

What about ppc64?

Also, I sent that already a month ago. Both linuxppc-dev and Michal
Marek were on cc.

http://marc.info/?l=linux-kernel&m=140744360328562&w=2

Cascardo.

> From: Michael Neuling mi...@neuling.org
>
> deb-pkg: Add support for powerpc little endian
>
> The Debian powerpc little endian architecture is called ppc64el.  This
> is the default architecture used by Ubuntu for powerpc.
>
> The below checks the kernel config to see if we are compiling little
> endian and sets the Debian arch appropriately.
>
> Signed-off-by: Michael Neuling mi...@neuling.org
>
> diff --git a/scripts/package/builddeb b/scripts/package/builddeb
> index 35d5a58..6f4a1af 100644
> --- a/scripts/package/builddeb
> +++ b/scripts/package/builddeb
> @@ -37,7 +37,7 @@ create_package() {
>  	s390*)
>  		debarch=s390$(grep -q CONFIG_64BIT=y $KCONFIG_CONFIG && echo x || true) ;;
>  	ppc*)
> -		debarch=powerpc ;;
> +		debarch=$(grep -q CPU_LITTLE_ENDIAN=y $KCONFIG_CONFIG && echo ppc64el || echo powerpc) ;;
>  	parisc*)
>  		debarch=hppa ;;
>  	mips*)
Re: [PATCH v2] deb-pkg: Add support for powerpc little endian
On Fri, Sep 5, 2014 at 3:55 AM, Michael Neuling mi...@neuling.org wrote:
> On Fri, 2014-09-05 at 09:13 +0200, Gabriel Paubert wrote:
>> On Fri, Sep 05, 2014 at 03:28:47PM +1000, Michael Neuling wrote:
>> > The Debian powerpc little endian architecture is called ppc64le.  This
>>
>> Huh? ppc64le or ppc64el?
>
> ppc64el.  Commit message is wrong.  Fixed below.

Yay!  Just like every other architecture, we continue to have the deb
based distros call it one thing, and the RPM based distros call it
another.  At least we're consistent in our inconsistency.

josh
Re: bit fields data tearing
[ +cc linux-arm ]

Hi David,

On 09/05/2014 04:30 AM, David Laight wrote:
> I've seen gcc generate 32bit accesses for 16bit structure members on arm.
> It does this because of the more limited range of the offsets for the
> 16bit access.
> OTOH I don't know if it ever did this for writes - so it may be moot.

Can you recall the particulars, like what ARM config or what code?

I tried an overly-simple test to see if gcc would bump up to the word
load for the 12-bit offset mode, but it stuck with register offset
rather than immediate offset. [I used the compiler options for
allmodconfig and a 4.8 cross-compiler.]

Maybe the test doesn't generate enough register pressure on the compiler?

Regards,
Peter Hurley

#define ARRAY_SIZE(x) (sizeof(x)/sizeof((x)[0]))

struct x {
	long unused[64];
	short b[12];
	int unused2[10];
	short c;
};

void store_c(struct x *p, short a[])
{
	int i;

	for (i = 0; i < ARRAY_SIZE(p->b); i++)
		p->b[i] = a[i];
	p->c = 2;
}

void store_c(struct x *p, short a[])
{
   0:	e1a0c00d 	mov	ip, sp
   4:	e3a03000 	mov	r3, #0
   8:	e92dd800 	push	{fp, ip, lr, pc}
   c:	e24cb004 	sub	fp, ip, #4
	int i;

	for (i = 0; i < ARRAY_SIZE(p->b); i++)
		p->b[i] = a[i];
  10:	e191c0b3 	ldrh	ip, [r1, r3]
  14:	e0802003 	add	r2, r0, r3
  18:	e2822c01 	add	r2, r2, #256	; 0x100
  1c:	e2833002 	add	r3, r3, #2
  20:	e3530018 	cmp	r3, #24
  24:	e1c2c0b0 	strh	ip, [r2]
  28:	1afffff8 	bne	10 <store_c+0x10>
	p->c = 2;
  2c:	e3a03d05 	mov	r3, #320	; 0x140
  30:	e3a02002 	mov	r2, #2
  34:	e18020b3 	strh	r2, [r0, r3]
  38:	e89da800 	ldm	sp, {fp, sp, pc}
RE: bit fields data tearing
From: Peter Hurley
> [ +cc linux-arm ]
>
> Hi David,
>
> On 09/05/2014 04:30 AM, David Laight wrote:
> > I've seen gcc generate 32bit accesses for 16bit structure members on arm.
> > It does this because of the more limited range of the offsets for the
> > 16bit access.
> > OTOH I don't know if it ever did this for writes - so it may be moot.
>
> Can you recall the particulars, like what ARM config or what code?
>
> I tried an overly-simple test to see if gcc would bump up to the word
> load for the 12-bit offset mode, but it stuck with register offset
> rather than immediate offset. [I used the compiler options for
> allmodconfig and a 4.8 cross-compiler.]
>
> Maybe the test doesn't generate enough register pressure on the compiler?

Dunno, I would have been using a much older version of the compiler.
It is possible that it doesn't do it any more.
It might only have done it for loads.

The compiler used to use misaligned 32bit loads for structure
members on large 4n+2 byte boundaries as well.
I'm pretty sure it doesn't do that either.

There have been a lot of compiler versions since I was compiling
anything for arm.

	David

> Regards,
> Peter Hurley
>
> #define ARRAY_SIZE(x) (sizeof(x)/sizeof((x)[0]))
>
> struct x {
> 	long unused[64];
> 	short b[12];
> 	int unused2[10];
> 	short c;
> };
>
> void store_c(struct x *p, short a[])
> {
> 	int i;
>
> 	for (i = 0; i < ARRAY_SIZE(p->b); i++)
> 		p->b[i] = a[i];
> 	p->c = 2;
> }
>
> void store_c(struct x *p, short a[])
> {
>    0:	e1a0c00d 	mov	ip, sp
>    4:	e3a03000 	mov	r3, #0
>    8:	e92dd800 	push	{fp, ip, lr, pc}
>    c:	e24cb004 	sub	fp, ip, #4
> 	int i;
>
> 	for (i = 0; i < ARRAY_SIZE(p->b); i++)
> 		p->b[i] = a[i];
>   10:	e191c0b3 	ldrh	ip, [r1, r3]
>   14:	e0802003 	add	r2, r0, r3
>   18:	e2822c01 	add	r2, r2, #256	; 0x100
>   1c:	e2833002 	add	r3, r3, #2
>   20:	e3530018 	cmp	r3, #24
>   24:	e1c2c0b0 	strh	ip, [r2]
>   28:	1afffff8 	bne	10 <store_c+0x10>
> 	p->c = 2;
>   2c:	e3a03d05 	mov	r3, #320	; 0x140
>   30:	e3a02002 	mov	r2, #2
>   34:	e18020b3 	strh	r2, [r0, r3]
>   38:	e89da800 	ldm	sp, {fp, sp, pc}
Re: [PATCH] pseries: Make CPU hotplug path endian safe
On 09/05/2014 04:16 AM, bharata@gmail.com wrote:
> From: Bharata B Rao bhar...@linux.vnet.ibm.com
>
> - ibm,rtas-configure-connector should treat the RTAS data as big endian.
> - Treat ibm,ppc-interrupt-server#s as big-endian when setting
>   smp_processor_id during hotplug.
>
> Signed-off-by: Bharata B Rao bhar...@linux.vnet.ibm.com
> ---
>  arch/powerpc/platforms/pseries/dlpar.c       | 10 +-
>  arch/powerpc/platforms/pseries/hotplug-cpu.c |  4 ++--
>  2 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
> index 2d0b4d6..dc55f9c 100644
> --- a/arch/powerpc/platforms/pseries/dlpar.c
> +++ b/arch/powerpc/platforms/pseries/dlpar.c
> @@ -48,11 +48,11 @@ static struct property *dlpar_parse_cc_property(struct cc_workarea *ccwa)
>  	if (!prop)
>  		return NULL;
>
> -	name = (char *)ccwa + ccwa->name_offset;
> +	name = (char *)ccwa + be32_to_cpu(ccwa->name_offset);
>  	prop->name = kstrdup(name, GFP_KERNEL);
>
> -	prop->length = ccwa->prop_length;
> -	value = (char *)ccwa + ccwa->prop_offset;
> +	prop->length = be32_to_cpu(ccwa->prop_length);
> +	value = (char *)ccwa + be32_to_cpu(ccwa->prop_offset);
>  	prop->value = kmemdup(value, prop->length, GFP_KERNEL);
>  	if (!prop->value) {
>  		dlpar_free_cc_property(prop);
> @@ -78,7 +78,7 @@ static struct device_node *dlpar_parse_cc_node(struct cc_workarea *ccwa,
>  	if (!dn)
>  		return NULL;
>
> -	name = (char *)ccwa + ccwa->name_offset;
> +	name = (char *)ccwa + be32_to_cpu(ccwa->name_offset);
>  	dn->full_name = kasprintf(GFP_KERNEL, "%s/%s", path, name);
>  	if (!dn->full_name) {
>  		kfree(dn);
> @@ -148,7 +148,7 @@ struct device_node *dlpar_configure_connector(u32 drc_index,
>  		return NULL;
>
>  	ccwa = (struct cc_workarea *)&data_buf[0];
> -	ccwa->drc_index = drc_index;
> +	ccwa->drc_index = cpu_to_be32(drc_index);

I need to look at this some more but I think this may cause an issue
for partition migration. If I am following the code correctly, starting
in pseries_devicetree_update(), the drc_index value passed to
dlpar_configure_connector is pulled directly out of a buffer we get from
firmware. This would mean the drc_index value is already in BE format.

Whereas for cpu hotplug the drc_index value is passed in from userspace
via the cpu probe interface in sysfs. I assume that you are seeing the
drc_index value getting passed in in LE format.

-Nathan

>  	ccwa->zero = 0;
>
>  	do {
> diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> index 20d6297..447f8c6 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> @@ -247,7 +247,7 @@ static int pseries_add_processor(struct device_node *np)
>  	unsigned int cpu;
>  	cpumask_var_t candidate_mask, tmp;
>  	int err = -ENOSPC, len, nthreads, i;
> -	const u32 *intserv;
> +	const __be32 *intserv;
>
>  	intserv = of_get_property(np, "ibm,ppc-interrupt-server#s", &len);
>  	if (!intserv)
> @@ -293,7 +293,7 @@ static int pseries_add_processor(struct device_node *np)
>  	for_each_cpu(cpu, tmp) {
>  		BUG_ON(cpu_present(cpu));
>  		set_cpu_present(cpu, true);
> -		set_hard_smp_processor_id(cpu, *intserv++);
> +		set_hard_smp_processor_id(cpu, be32_to_cpu(*intserv++));
>  	}
>  	err = 0;
>  out_unlock:
Re: [Xen-devel] [PATCH v1 08/21] x86/xen/MSI: Use MSI chip framework to configure MSI/MSI-X irq
On 05/09/14 11:09, Yijing Wang wrote:
> Use MSI chip framework instead of arch MSI functions to configure
> MSI/MSI-X irq. So we can manage MSI/MSI-X irq in a unified framework.
[...]
> --- a/arch/x86/pci/xen.c
> +++ b/arch/x86/pci/xen.c
[...]
> @@ -418,9 +430,9 @@ int __init pci_xen_init(void)
>  #endif
>  #ifdef CONFIG_PCI_MSI
> -	x86_msi.setup_msi_irqs = xen_setup_msi_irqs;
> -	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
> -	x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
> +	xen_msi_chip.setup_irqs = xen_setup_msi_irqs;
> +	xen_msi_chip.teardown_irqs = xen_teardown_msi_irqs;
> +	x86_msi_chip = xen_msi_chip;
>  	msi_chip.irq_mask = xen_nop_msi_mask;
>  	msi_chip.irq_unmask = xen_nop_msi_mask;

Why have these not been changed to set the x86_msi_chip.mask/unmask
fields instead?

David
Re: bit fields data tearing
On 09/04/2014 10:08 PM, H. Peter Anvin wrote:
> On 09/04/2014 05:59 PM, Peter Hurley wrote:
>> I have no idea how prevalent the ev56 is compared to the ev5.
>> Still we're talking about a chip that came out in 1996.
>
> Ah yes, I stand corrected.  According to Wikipedia, the affected CPUs
> were all the 2106x CPUs (EV4, EV45, LCA4, LCA45) plus the 21164 with no
> suffix (EV5).  However, we're still talking about museum pieces here.
>
> I wonder what the one I have in my garage is... I'm sure I could emulate
> it faster, though.

Which is a bit ironic because I remember when Digital had a team
working on emulating native x86 apps on Alpha/NT.
Re: [PATCH] pseries: Make CPU hotplug path endian safe
On Fri, Sep 5, 2014 at 7:38 PM, Nathan Fontenot nf...@linux.vnet.ibm.com wrote:
> On 09/05/2014 04:16 AM, bharata@gmail.com wrote:
>> From: Bharata B Rao bhar...@linux.vnet.ibm.com
>>
>> - ibm,rtas-configure-connector should treat the RTAS data as big endian.
>> - Treat ibm,ppc-interrupt-server#s as big-endian when setting
>>   smp_processor_id during hotplug.
>>
>> Signed-off-by: Bharata B Rao bhar...@linux.vnet.ibm.com
>> ---
>>  arch/powerpc/platforms/pseries/dlpar.c       | 10 +-
>>  arch/powerpc/platforms/pseries/hotplug-cpu.c |  4 ++--
>>  2 files changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
>> index 2d0b4d6..dc55f9c 100644
>> --- a/arch/powerpc/platforms/pseries/dlpar.c
>> +++ b/arch/powerpc/platforms/pseries/dlpar.c
>> @@ -48,11 +48,11 @@ static struct property *dlpar_parse_cc_property(struct cc_workarea *ccwa)
>>  	if (!prop)
>>  		return NULL;
>>
>> -	name = (char *)ccwa + ccwa->name_offset;
>> +	name = (char *)ccwa + be32_to_cpu(ccwa->name_offset);
>>  	prop->name = kstrdup(name, GFP_KERNEL);
>>
>> -	prop->length = ccwa->prop_length;
>> -	value = (char *)ccwa + ccwa->prop_offset;
>> +	prop->length = be32_to_cpu(ccwa->prop_length);
>> +	value = (char *)ccwa + be32_to_cpu(ccwa->prop_offset);
>>  	prop->value = kmemdup(value, prop->length, GFP_KERNEL);
>>  	if (!prop->value) {
>>  		dlpar_free_cc_property(prop);
>> @@ -78,7 +78,7 @@ static struct device_node *dlpar_parse_cc_node(struct cc_workarea *ccwa,
>>  	if (!dn)
>>  		return NULL;
>>
>> -	name = (char *)ccwa + ccwa->name_offset;
>> +	name = (char *)ccwa + be32_to_cpu(ccwa->name_offset);
>>  	dn->full_name = kasprintf(GFP_KERNEL, "%s/%s", path, name);
>>  	if (!dn->full_name) {
>>  		kfree(dn);
>> @@ -148,7 +148,7 @@ struct device_node *dlpar_configure_connector(u32 drc_index,
>>  		return NULL;
>>
>>  	ccwa = (struct cc_workarea *)&data_buf[0];
>> -	ccwa->drc_index = drc_index;
>> +	ccwa->drc_index = cpu_to_be32(drc_index);
>
> I need to look at this some more but I think this may cause an issue
> for partition migration. If I am following the code correctly, starting
> in pseries_devicetree_update(), the drc_index value passed to
> dlpar_configure_connector is pulled directly out of a buffer we get from
> firmware. This would mean the drc_index value is already in BE format.

Yes I see that now.

> Whereas for cpu hotplug the drc_index value is passed in from userspace
> via the cpu probe interface in sysfs. I assume that you are seeing the
> drc_index value getting passed in in LE format.

Yes I am seeing drc_index in LE format for an LE guest during CPU
hotplug operation.

Regards,
Bharata.
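The concern raised in this thread, that a drc_index already in big-endian form would be converted a second time, can be illustrated outside the kernel. The sketch below is portable C with hypothetical `sketch_*` helpers; it does not use the kernel's `cpu_to_be32()`/`be32_to_cpu()` themselves, it only models what they amount to on a little-endian host (an unconditional byte swap), and the sample value is made up.

```c
#include <stdint.h>

/* Decode a 32-bit big-endian value from raw bytes, independent of host
 * endianness. Hypothetical helper for illustration only. */
static uint32_t sketch_be32_decode(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Encode a CPU-native value as four big-endian bytes. */
static void sketch_be32_encode(uint32_t v, unsigned char *p)
{
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}

/* Unconditional byte swap: the effect of a CPU-to-BE (or BE-to-CPU)
 * conversion when the host is little endian. */
static uint32_t sketch_swab32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}
```

Encoding then decoding returns the original value, and swapping twice is the identity. A single extra swap applied to a value that was already in the target byte order therefore produces the wrong number, which is why each caller has to agree on which byte order the argument is in.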
Re: bit fields data tearing
On 09/05/2014 08:31 AM, Peter Hurley wrote:
>
> Which is a bit ironic because I remember when Digital had a team
> working on emulating native x86 apps on Alpha/NT.
>

Right, because the x86 architecture was obsolete and would never scale...

	-hpa
Re: bit fields data tearing
On 09/05/2014 08:37 AM, David Laight wrote:
> From: Peter Hurley
>> On 09/05/2014 04:30 AM, David Laight wrote:
>>> I've seen gcc generate 32bit accesses for 16bit structure members on arm.
>>> It does this because of the more limited range of the offsets for the
>>> 16bit access.
>>> OTOH I don't know if it ever did this for writes - so it may be moot.
>>
>> Can you recall the particulars, like what ARM config or what code?
>>
>> I tried an overly-simple test to see if gcc would bump up to the word
>> load for the 12-bit offset mode, but it stuck with register offset
>> rather than immediate offset. [I used the compiler options for
>> allmodconfig and a 4.8 cross-compiler.]
>>
>> Maybe the test doesn't generate enough register pressure on the compiler?
>
> Dunno, I would have been using a much older version of the compiler.
> It is possible that it doesn't do it any more.
> It might only have done it for loads.
>
> The compiler used to use misaligned 32bit loads for structure
> members on large 4n+2 byte boundaries as well.
> I'm pretty sure it doesn't do that either.
>
> There have been a lot of compiler versions since I was compiling
> anything for arm.

Yeah, it seems gcc for ARM no longer uses the larger operand size as a
substitute for 12-bit immediate offset addressing mode, even for reads.

While this test:

struct x {
	short b[12];
};

short load_b(struct x *p)
{
	return p->b[8];
}

generates the 8-bit immediate offset form,

short load_b(struct x *p)
{
   0:	e1d001f0 	ldrsh	r0, [r0, #16]
   4:	e12fff1e 	bx	lr

pushing the offset out past 256:

struct x {
	long unused[64];
	short b[12];
};

short load_b(struct x *p)
{
	return p->b[8];
}

generates the register offset addressing mode instead of 12-bit immediate:

short load_b(struct x *p)
{
   0:	e3a03e11 	mov	r3, #272	; 0x110
   4:	e19000f3 	ldrsh	r0, [r0, r3]
   8:	e12fff1e 	bx	lr

Regards,
Peter Hurley

[Note: I compiled without the frame pointer to simplify the code generation]
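The offsets in the two tests above can be checked against the immediate ranges mentioned in the thread: an 8-bit immediate for the halfword load/store forms and a 12-bit immediate for the word forms. The following is a small sketch; the `SKETCH_*` limits and `sketch_*` helpers are hypothetical names, and `struct sketch_x` uses fixed-width types so that the layout matches a 32-bit ARM target (4-byte `long`) even when compiled on a 64-bit host.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Largest offsets encodable as an immediate: 8 bits for the halfword
 * forms (ldrh/ldrsh/strh), 12 bits for the word forms (ldr/str). */
#define SKETCH_HALFWORD_IMM_MAX	255u
#define SKETCH_WORD_IMM_MAX	4095u

/* Fixed-width mirror of the second test struct from the email. */
struct sketch_x {
	int32_t unused[64];
	int16_t b[12];
};

/* Byte offset of b[index] from the start of the struct. */
static size_t sketch_offset_of_b(size_t index)
{
	return offsetof(struct sketch_x, b) + index * sizeof(int16_t);
}

static bool sketch_fits_halfword_imm(size_t offset)
{
	return offset <= SKETCH_HALFWORD_IMM_MAX;
}

static bool sketch_fits_word_imm(size_t offset)
{
	return offset <= SKETCH_WORD_IMM_MAX;
}
```

For `b[8]` in the padded struct the offset is 256 + 16 = 272, matching the `#272` in the listing: too large for the halfword immediate, yet within the word-load immediate range, which is the case where an older compiler might have been tempted to use a wider access.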
Re: bit fields data tearing
On Fri, Sep 05, 2014 at 08:16:48PM +1200, Michael Cree wrote: On Thu, Sep 04, 2014 at 07:08:48PM -0700, H. Peter Anvin wrote: On 09/04/2014 05:59 PM, Peter Hurley wrote: I have no idea how prevalent the ev56 is compared to the ev5. Still we're talking about a chip that came out in 1996. Ah yes, I stand corrected. According to Wikipedia, the affected CPUs were all the 2106x CPUs (EV4, EV45, LCA4, LCA45) plus the 21164 with no suffix (EV5). However, we're still talking about museum pieces here. Yes, that is correct, EV56 is the first Alpha CPU to have the byte-word extension (BWX) CPU instructions. It would not worry me if the kernel decided to assume atomic aligned scalar accesses for all arches, thus terminating support for Alphas without BWX. The X server, ever since the libpciaccess change, does not work on Alphas without BWX. Debian Alpha (pretty much up to date at Debian-Ports) is still compiled for all Alphas, i.e., without BWX. The last attempt to start compiling Debian Alpha with BWX, about three years ago when Alpha was kicked out to Debian-Ports, resulted in a couple or so complaints so got nowhere. It's frustrating supporting the lowest common denominator as many of the bugs specific to Alpha can be resolved by recompiling with the BWX. The kernel no longer supporting Alphas without BWX might just be the incentive we need to switch Debian Alpha to compiling with BWX.

Very good, then I update my patch as follows. Thoughts?

Thanx, Paul

documentation: Record limitations of bitfields and small variables

This commit documents the fact that it is not safe to use bitfields as shared variables in synchronization algorithms. It also documents that CPUs must provide one-byte and two-byte load and store instructions in order to be supported by the Linux kernel. (Michael Cree has agreed to the resulting non-support of pre-EV56 Alpha CPUs: https://lkml.org/lkml/2014/9/5/143.)

Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 87be0a8a78de..455df6b298f7 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -269,6 +269,30 @@ And there are a number of things that _must_ or _must_not_ be assumed:
 	STORE *(A + 4) = Y; STORE *A = X;
 	STORE {*A, *(A + 4) } = {X, Y};
 
+And there are anti-guarantees:
+
+ (*) These guarantees do not apply to bitfields, because compilers often
+     generate code to modify these using non-atomic read-modify-write
+     sequences. Do not attempt to use bitfields to synchronize parallel
+     algorithms.
+
+ (*) Even in cases where bitfields are protected by locks, all fields
+     in a given bitfield must be protected by one lock. If two fields
+     in a given bitfield are protected by different locks, the compiler's
+     non-atomic read-modify-write sequences can cause an update to one
+     field to corrupt the value of an adjacent field.
+
+ (*) These guarantees apply only to properly aligned and sized scalar
+     variables. Properly sized currently means variables that are the
+     same size as char, short, int and long. Properly aligned
+     means the natural alignment, thus no constraints for char,
+     two-byte alignment for short, four-byte alignment for int,
+     and either four-byte or eight-byte alignment for long, on 32-bit
+     and 64-bit systems, respectively. Note that this means that the
+     Linux kernel does not support pre-EV56 Alpha CPUs, because these
+     older CPUs do not provide one-byte and two-byte loads and stores.
+     Alpha EV56 and later Alpha CPUs are still supported.
+
 =========================
 WHAT ARE MEMORY BARRIERS?
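To make the second anti-guarantee concrete, here is a minimal sketch that replays one bad interleaving by hand rather than racing real threads. It assumes two 4-bit fields packed into one byte (a in the low nibble, b in the high nibble); that layout and all helper names are illustrative assumptions, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Two 4-bit fields sharing one byte, as a compiler might lay out
 *     struct { unsigned a:4; unsigned b:4; };
 * The exact bit positions are an assumption for this sketch. */
#define A_MASK 0x0fu
#define B_MASK 0xf0u

/* The steps of a non-atomic bitfield store, split apart so that the
 * interleaving of two CPUs can be written out explicitly. */
uint8_t rmw_load(const uint8_t *unit) { return *unit; }

uint8_t rmw_set_a(uint8_t tmp, uint8_t v)
{
    return (uint8_t)((tmp & ~A_MASK) | (v & A_MASK));
}

uint8_t rmw_set_b(uint8_t tmp, uint8_t v)
{
    return (uint8_t)((tmp & ~B_MASK) | (((unsigned)v << 4) & B_MASK));
}

void rmw_store(uint8_t *unit, uint8_t tmp) { *unit = tmp; }

/* Replays one bad interleaving and returns the final storage unit. */
uint8_t replay_lost_update(void)
{
    uint8_t unit = 0x00;

    uint8_t t1 = rmw_load(&unit);  /* CPU 1, holding lock1, loads 0x00 */
    uint8_t t2 = rmw_load(&unit);  /* CPU 2, holding lock2, loads 0x00 */
    t1 = rmw_set_a(t1, 0x5);       /* CPU 1 prepares a = 5  -> 0x05    */
    t2 = rmw_set_b(t2, 0xA);       /* CPU 2 prepares b = 10 -> 0xA0    */
    rmw_store(&unit, t1);          /* unit is now 0x05                 */
    rmw_store(&unit, t2);          /* unit is now 0xA0: a = 5 is lost  */

    return unit;
}
```

Each CPU held its own lock throughout, yet the update to a vanished, which is why all fields of a given bitfield need to sit under one lock.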
Re: bit fields data tearing
On Fri, Sep 05, 2014 at 11:09:50AM -0700, Paul E. McKenney wrote: On Fri, Sep 05, 2014 at 08:16:48PM +1200, Michael Cree wrote: On Thu, Sep 04, 2014 at 07:08:48PM -0700, H. Peter Anvin wrote: On 09/04/2014 05:59 PM, Peter Hurley wrote: I have no idea how prevalent the ev56 is compared to the ev5. Still we're talking about a chip that came out in 1996. Ah yes, I stand corrected. According to Wikipedia, the affected CPUs were all the 2106x CPUs (EV4, EV45, LCA4, LCA45) plus the 21164 with no suffix (EV5). However, we're still talking about museum pieces here. Yes, that is correct, EV56 is the first Alpha CPU to have the byte-word extension (BWX) CPU instructions. It would not worry me if the kernel decided to assume atomic aligned scalar accesses for all arches, thus terminating support for Alphas without BWX. The X server, ever since the libpciaccess change, does not work on Alphas without BWX. Debian Alpha (pretty much up to date at Debian-Ports) is still compiled for all Alphas, i.e., without BWX. The last attempt to start compiling Debian Alpha with BWX, about three years ago when Alpha was kicked out to Debian-Ports, resulted in a couple or so complaints so got nowhere. It's frustrating supporting the lowest common denominator as many of the bugs specific to Alpha can be resolved by recompiling with the BWX. The kernel no longer supporting Alphas without BWX might just be the incentive we need to switch Debian Alpha to compiling with BWX. Very good, then I update my patch as follows. Thoughts?

And, while I am at it, fix smp_load_acquire() and smp_store_release() to allow single-byte and double-byte accesses. (Adding Peter Zijlstra on CC.)

Thanx, Paul

compiler: Allow 1- and 2-byte smp_load_acquire() and smp_store_release()

CPUs without single-byte and double-byte loads and stores place some interesting requirements on concurrent code. For example (adapted from Peter Hurley's test code), suppose we have the following structure:

    struct foo {
        spinlock_t lock1;
        spinlock_t lock2;
        char a; /* Protected by lock1. */
        char b; /* Protected by lock2. */
    };
    struct foo *foop;

Of course, it is common (and good) practice to place data protected by different locks in separate cache lines. However, if the locks are rarely acquired (for example, only in rare error cases), and there are a great many instances of the data structure, then memory footprint can trump false-sharing concerns, so that it can be better to place them in the same cache line as above.

But if the CPU does not support single-byte loads and stores, a store to foop->a will do a non-atomic read-modify-write operation on foop->b, which will come as a nasty surprise to someone holding foop->lock2. So we now require CPUs to support single-byte and double-byte loads and stores. Therefore, this commit adjusts the definition of __native_word() to allow these sizes to be used by smp_load_acquire() and smp_store_release().

Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com
Cc: Peter Zijlstra pet...@infradead.org

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index d5ad7b1118fc..934a834ab9f9 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -311,7 +311,7 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
 
 /* Is this type a native word size -- useful for atomic operations */
 #ifndef __native_word
-# define __native_word(t) (sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
+# define __native_word(t) (sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
 #endif
 
 /* Compile time object size, -1 for unknown */
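For anyone who wants to see the effect of the one-line change outside the kernel tree, here is a small userspace sketch. The two macros copy the old and new bodies of __native_word() from the diff; the names native_word_old and native_word_new and the three_bytes struct are made up for this example, and the static assertions assume a conventional ABI where char, short and int have distinct sizes.

```c
#include <assert.h>

/* Before the patch: only int- and long-sized types qualify. */
#define native_word_old(t) \
    (sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))

/* After the patch: char- and short-sized types qualify as well. */
#define native_word_new(t) \
    (sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \
     sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))

/* An odd-sized type that should be rejected both before and after. */
struct three_bytes { char c[3]; };

/* Compile-time checks, in the spirit of how the kernel uses the macro. */
_Static_assert(!native_word_old(char), "char was rejected before the patch");
_Static_assert(native_word_new(char), "char is accepted after the patch");
_Static_assert(native_word_new(short), "short is accepted after the patch");
_Static_assert(!native_word_new(struct three_bytes), "3-byte struct stays rejected");
```

The point of keeping the check at all is the last assertion: sizes with no matching load or store instruction are still refused.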
Re: bit fields data tearing
On 09/05/2014 02:09 PM, Paul E. McKenney wrote: On Fri, Sep 05, 2014 at 08:16:48PM +1200, Michael Cree wrote: On Thu, Sep 04, 2014 at 07:08:48PM -0700, H. Peter Anvin wrote: On 09/04/2014 05:59 PM, Peter Hurley wrote: I have no idea how prevalent the ev56 is compared to the ev5. Still we're talking about a chip that came out in 1996. Ah yes, I stand corrected. According to Wikipedia, the affected CPUs were all the 2106x CPUs (EV4, EV45, LCA4, LCA45) plus the 21164 with no suffix (EV5). However, we're still talking about museum pieces here. Yes, that is correct, EV56 is the first Alpha CPU to have the byte-word extension (BWX) CPU instructions. It would not worry me if the kernel decided to assume atomic aligned scalar accesses for all arches, thus terminating support for Alphas without BWX. The X server, ever since the libpciaccess change, does not work on Alphas without BWX. Debian Alpha (pretty much up to date at Debian-Ports) is still compiled for all Alphas, i.e., without BWX. The last attempt to start compiling Debian Alpha with BWX, about three years ago when Alpha was kicked out to Debian-Ports resulted in a couple or so complaints so got nowhere. It's frustrating supporting the lowest common demoninator as many of the bugs specific to Alpha can be resolved by recompiling with the BWX. The kernel no longer supporting Alphas without BWX might just be the incentive we need to switch Debian Alpha to compiling with BWX. Very good, then I update my patch as follows. Thoughts? Thanx, Paul Minor [optional] edits. Thanks, Peter Hurley documentation: Record limitations of bitfields and small variables This commit documents the fact that it is not safe to use bitfields as shared variables in synchronization algorithms. It also documents that CPUs must provide one-byte and two-byte load and store instructions ^ atomic in order to be supported by the Linux kernel. 
(Michael Cree has agreed to the resulting non-support of pre-EV56 Alpha CPUs: https://lkml.org/lkml/2014/9/5/143. Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index 87be0a8a78de..455df6b298f7 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -269,6 +269,30 @@ And there are a number of things that _must_ or _must_not_ be assumed: STORE *(A + 4) = Y; STORE *A = X; STORE {*A, *(A + 4) } = {X, Y}; +And there are anti-guarantees: + + (*) These guarantees do not apply to bitfields, because compilers often + generate code to modify these using non-atomic read-modify-write + sequences. Do not attempt to use bitfields to synchronize parallel + algorithms. + + (*) Even in cases where bitfields are protected by locks, all fields + in a given bitfield must be protected by one lock. If two fields + in a given bitfield are protected by different locks, the compiler's + non-atomic read-modify-write sequences can cause an update to one + field to corrupt the value of an adjacent field. + + (*) These guarantees apply only to properly aligned and sized scalar + variables. Properly sized currently means variables that are the + same size as char, short, int and long. Properly aligned + means the natural alignment, thus no constraints for char, + two-byte alignment for short, four-byte alignment for int, + and either four-byte or eight-byte alignment for long, on 32-bit + and 64-bit systems, respectively. Note that this means that the + Linux kernel does not support pre-EV56 Alpha CPUs, because these + older CPUs do not provide one-byte and two-byte loads and stores. ^ non-atomic + Alpha EV56 and later Alpha CPUs are still supported. + = WHAT ARE MEMORY BARRIERS? ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: bit fields data tearing
On Fri, Sep 05, 2014 at 02:50:31PM -0400, Peter Hurley wrote: On 09/05/2014 02:09 PM, Paul E. McKenney wrote: On Fri, Sep 05, 2014 at 08:16:48PM +1200, Michael Cree wrote: On Thu, Sep 04, 2014 at 07:08:48PM -0700, H. Peter Anvin wrote: On 09/04/2014 05:59 PM, Peter Hurley wrote: I have no idea how prevalent the ev56 is compared to the ev5. Still we're talking about a chip that came out in 1996. Ah yes, I stand corrected. According to Wikipedia, the affected CPUs were all the 2106x CPUs (EV4, EV45, LCA4, LCA45) plus the 21164 with no suffix (EV5). However, we're still talking about museum pieces here. Yes, that is correct, EV56 is the first Alpha CPU to have the byte-word extension (BWX) CPU instructions. It would not worry me if the kernel decided to assume atomic aligned scalar accesses for all arches, thus terminating support for Alphas without BWX. The X server, ever since the libpciaccess change, does not work on Alphas without BWX. Debian Alpha (pretty much up to date at Debian-Ports) is still compiled for all Alphas, i.e., without BWX. The last attempt to start compiling Debian Alpha with BWX, about three years ago when Alpha was kicked out to Debian-Ports resulted in a couple or so complaints so got nowhere. It's frustrating supporting the lowest common demoninator as many of the bugs specific to Alpha can be resolved by recompiling with the BWX. The kernel no longer supporting Alphas without BWX might just be the incentive we need to switch Debian Alpha to compiling with BWX. Very good, then I update my patch as follows. Thoughts? Thanx, Paul Minor [optional] edits. Thanks, Peter Hurley documentation: Record limitations of bitfields and small variables This commit documents the fact that it is not safe to use bitfields as shared variables in synchronization algorithms. It also documents that CPUs must provide one-byte and two-byte load and store instructions ^ atomic Here you meant non-atomic? 
My guess is that you are referring to the fact that you could emulate a one-byte store on pre-EV56 Alpha CPUs using the ll and sc atomic-read-modify-write instructions, correct? in order to be supported by the Linux kernel. (Michael Cree has agreed to the resulting non-support of pre-EV56 Alpha CPUs: https://lkml.org/lkml/2014/9/5/143. Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index 87be0a8a78de..455df6b298f7 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -269,6 +269,30 @@ And there are a number of things that _must_ or _must_not_ be assumed: STORE *(A + 4) = Y; STORE *A = X; STORE {*A, *(A + 4) } = {X, Y}; +And there are anti-guarantees: + + (*) These guarantees do not apply to bitfields, because compilers often + generate code to modify these using non-atomic read-modify-write + sequences. Do not attempt to use bitfields to synchronize parallel + algorithms. + + (*) Even in cases where bitfields are protected by locks, all fields + in a given bitfield must be protected by one lock. If two fields + in a given bitfield are protected by different locks, the compiler's + non-atomic read-modify-write sequences can cause an update to one + field to corrupt the value of an adjacent field. + + (*) These guarantees apply only to properly aligned and sized scalar + variables. Properly sized currently means variables that are the + same size as char, short, int and long. Properly aligned + means the natural alignment, thus no constraints for char, + two-byte alignment for short, four-byte alignment for int, + and either four-byte or eight-byte alignment for long, on 32-bit + and 64-bit systems, respectively. Note that this means that the + Linux kernel does not support pre-EV56 Alpha CPUs, because these + older CPUs do not provide one-byte and two-byte loads and stores. ^ non-atomic I took this, thank you! 
Thanx, Paul + Alpha EV56 and later Alpha CPUs are still supported. + = WHAT ARE MEMORY BARRIERS? ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH] pseries: Fix endianness in cpu hotplug and hotremove
This patch attempts to ensure that all values are in the proper endianness format when both hotadding and hotremoving cpus.

Signed-off-by: Thomas Falcon tlfal...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/pseries/dlpar.c       | 56 ++--
 arch/powerpc/platforms/pseries/hotplug-cpu.c | 20 +-
 2 files changed, 38 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index a2450b8..c1d7e40 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -24,11 +24,11 @@
 #include <asm/rtas.h>
 
 struct cc_workarea {
-	u32 drc_index;
-	u32 zero;
-	u32 name_offset;
-	u32 prop_length;
-	u32 prop_offset;
+	__be32 drc_index;
+	__be32 zero;
+	__be32 name_offset;
+	__be32 prop_length;
+	__be32 prop_offset;
 };
 
 void dlpar_free_cc_property(struct property *prop)
@@ -48,11 +48,11 @@ static struct property *dlpar_parse_cc_property(struct cc_workarea *ccwa)
 	if (!prop)
 		return NULL;
 
-	name = (char *)ccwa + ccwa->name_offset;
+	name = (char *)ccwa + be32_to_cpu(ccwa->name_offset);
 	prop->name = kstrdup(name, GFP_KERNEL);
 
-	prop->length = ccwa->prop_length;
-	value = (char *)ccwa + ccwa->prop_offset;
+	prop->length = be32_to_cpu(ccwa->prop_length);
+	value = (char *)ccwa + be32_to_cpu(ccwa->prop_offset);
 	prop->value = kmemdup(value, prop->length, GFP_KERNEL);
 	if (!prop->value) {
 		dlpar_free_cc_property(prop);
@@ -78,7 +78,7 @@ static struct device_node *dlpar_parse_cc_node(struct cc_workarea *ccwa,
 	if (!dn)
 		return NULL;
 
-	name = (char *)ccwa + ccwa->name_offset;
+	name = (char *)ccwa + be32_to_cpu(ccwa->name_offset);
 	dn->full_name = kasprintf(GFP_KERNEL, "%s/%s", path, name);
 	if (!dn->full_name) {
 		kfree(dn);
@@ -148,7 +148,7 @@ struct device_node *dlpar_configure_connector(u32 drc_index,
 		return NULL;
 
 	ccwa = (struct cc_workarea *)&data_buf[0];
-	ccwa->drc_index = drc_index;
+	ccwa->drc_index = cpu_to_be32(drc_index);
 	ccwa->zero = 0;
 
 	do {
@@ -363,10 +363,10 @@ static int dlpar_online_cpu(struct device_node *dn)
 	int rc = 0;
 	unsigned int cpu;
 	int len, nthreads, i;
-	const u32 *intserv;
+	const __be32 *intserv_be;
 
-	intserv = of_get_property(dn, "ibm,ppc-interrupt-server#s", &len);
-	if (!intserv)
+	intserv_be = of_get_property(dn, "ibm,ppc-interrupt-server#s", &len);
+	if (!intserv_be)
 		return -EINVAL;
 
 	nthreads = len / sizeof(u32);
@@ -374,7 +374,7 @@ static int dlpar_online_cpu(struct device_node *dn)
 	cpu_maps_update_begin();
 	for (i = 0; i < nthreads; i++) {
 		for_each_present_cpu(cpu) {
-			if (get_hard_smp_processor_id(cpu) != intserv[i])
+			if (get_hard_smp_processor_id(cpu) != be32_to_cpu(intserv_be[i]))
 				continue;
 			BUG_ON(get_cpu_current_state(cpu)
 					!= CPU_STATE_OFFLINE);
@@ -388,7 +388,7 @@ static int dlpar_online_cpu(struct device_node *dn)
 		}
 		if (cpu == num_possible_cpus())
 			printk(KERN_WARNING "Could not find cpu to online "
-			       "with physical id 0x%x\n", intserv[i]);
+			       "with physical id 0x%x\n", be32_to_cpu(intserv_be[i]));
 	}
 	cpu_maps_update_done();
 
@@ -442,18 +442,17 @@ static int dlpar_offline_cpu(struct device_node *dn)
 	int rc = 0;
 	unsigned int cpu;
 	int len, nthreads, i;
-	const u32 *intserv;
+	const __be32 *intserv_be;
 
-	intserv = of_get_property(dn, "ibm,ppc-interrupt-server#s", &len);
-	if (!intserv)
+	intserv_be = of_get_property(dn, "ibm,ppc-interrupt-server#s", &len);
+	if (!intserv_be)
 		return -EINVAL;
 
 	nthreads = len / sizeof(u32);
-
 	cpu_maps_update_begin();
 	for (i = 0; i < nthreads; i++) {
 		for_each_present_cpu(cpu) {
-			if (get_hard_smp_processor_id(cpu) != intserv[i])
+			if (get_hard_smp_processor_id(cpu) != be32_to_cpu(intserv_be[i]))
 				continue;
 
 			if (get_cpu_current_state(cpu) == CPU_STATE_OFFLINE)
@@ -469,20 +468,19 @@ static int dlpar_offline_cpu(struct device_node *dn)
 				break;
 			}
-
 			/*
 			 * The cpu is in CPU_STATE_INACTIVE.
 			 * Upgrade it's state to CPU_STATE_OFFLINE.
 			 */
 			set_preferred_offline_state(cpu, CPU_STATE_OFFLINE);
-
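For readers without a kernel tree at hand, the conversions the patch adds can be illustrated in userspace. The sketch below is only a stand-in: be32_load() and be32_store() are hypothetical helpers that play the role of be32_to_cpu() and cpu_to_be32(), and workarea_name() mirrors the name lookup on a raw byte buffer, assuming the five 32-bit fields of struct cc_workarea are packed in declaration order (so name_offset sits at byte 8).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for be32_to_cpu(): assemble a host integer from four
 * big-endian bytes, independent of the host's own endianness. */
uint32_t be32_load(const void *p)
{
    const uint8_t *b = p;

    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Stand-in for cpu_to_be32(): write v as four big-endian bytes. */
void be32_store(void *p, uint32_t v)
{
    uint8_t *b = p;

    b[0] = (uint8_t)(v >> 24);
    b[1] = (uint8_t)(v >> 16);
    b[2] = (uint8_t)(v >> 8);
    b[3] = (uint8_t)v;
}

/* Byte position of name_offset, the third 32-bit field (assumed layout). */
#define NAME_OFFSET_POS 8

/* Mirrors "name = (char *)ccwa + be32_to_cpu(ccwa->name_offset)". */
const char *workarea_name(const uint8_t *buf)
{
    return (const char *)buf + be32_load(buf + NAME_OFFSET_POS);
}
```

On a big-endian host the conversion is a no-op, which is why the missing calls went unnoticed until the code ran little-endian.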
Re: bit fields data tearing
On 09/05/2014 03:05 PM, Paul E. McKenney wrote: On Fri, Sep 05, 2014 at 02:50:31PM -0400, Peter Hurley wrote: On 09/05/2014 02:09 PM, Paul E. McKenney wrote: [cut] documentation: Record limitations of bitfields and small variables This commit documents the fact that it is not safe to use bitfields as shared variables in synchronization algorithms. It also documents that CPUs must provide one-byte and two-byte load and store instructions ^ atomic Here you meant non-atomic? My guess is that you are referring to the fact that you could emulate a one-byte store on pre-EV56 Alpha CPUs using the ll and sc atomic-read-modify-write instructions, correct?

Yes, that's what I meant. I must be tired and am misreading the commit message, or misinterpreting its meaning.
Re: bit fields data tearing
On Fri, Sep 05, 2014 at 11:31:09AM -0700, Paul E. McKenney wrote: compiler: Allow 1- and 2-byte smp_load_acquire() and smp_store_release() CPUs without single-byte and double-byte loads and stores place some interesting requirements on concurrent code. For example (adapted from Peter Hurley's test code), suppose we have the following structure:

    struct foo {
        spinlock_t lock1;
        spinlock_t lock2;
        char a; /* Protected by lock1. */
        char b; /* Protected by lock2. */
    };
    struct foo *foop;

Of course, it is common (and good) practice to place data protected by different locks in separate cache lines. However, if the locks are rarely acquired (for example, only in rare error cases), and there are a great many instances of the data structure, then memory footprint can trump false-sharing concerns, so that it can be better to place them in the same cache line as above. But if the CPU does not support single-byte loads and stores, a store to foop->a will do a non-atomic read-modify-write operation on foop->b, which will come as a nasty surprise to someone holding foop->lock2. So we now require CPUs to support single-byte and double-byte loads and stores. Therefore, this commit adjusts the definition of __native_word() to allow these sizes to be used by smp_load_acquire() and smp_store_release().

So does this patch depend on a patch that removes pre-EV56 Alpha support? I'm all for removing that, but I need to see the patch merged before we can do this.
Re: bit fields data tearing
On 09/05/2014 03:52 PM, Peter Zijlstra wrote: On Fri, Sep 05, 2014 at 11:31:09AM -0700, Paul E. McKenney wrote: compiler: Allow 1- and 2-byte smp_load_acquire() and smp_store_release() CPUs without single-byte and double-byte loads and stores place some interesting requirements on concurrent code. For example (adapted from Peter Hurley's test code), suppose we have the following structure:

    struct foo {
        spinlock_t lock1;
        spinlock_t lock2;
        char a; /* Protected by lock1. */
        char b; /* Protected by lock2. */
    };
    struct foo *foop;

Of course, it is common (and good) practice to place data protected by different locks in separate cache lines. However, if the locks are rarely acquired (for example, only in rare error cases), and there are a great many instances of the data structure, then memory footprint can trump false-sharing concerns, so that it can be better to place them in the same cache line as above. But if the CPU does not support single-byte loads and stores, a store to foop->a will do a non-atomic read-modify-write operation on foop->b, which will come as a nasty surprise to someone holding foop->lock2. So we now require CPUs to support single-byte and double-byte loads and stores. Therefore, this commit adjusts the definition of __native_word() to allow these sizes to be used by smp_load_acquire() and smp_store_release(). So does this patch depend on a patch that removes pre-EV56 Alpha support? I'm all for removing that, but I need to see the patch merged before we can do this.

I'm working on that but Alpha's Kconfig is not quite straightforward. ... and I'm wondering if I should _remove_ pre-EV56 configurations or move the default choice and produce a warning about unsupported Alpha CPUs instead?

Regards, Peter Hurley

[ How does one do a red popup in kbuild? The 'comment' approach is too subtle. ]
Re: bit fields data tearing
On Fri, Sep 05, 2014 at 03:24:35PM -0400, Peter Hurley wrote: On 09/05/2014 03:05 PM, Paul E. McKenney wrote: On Fri, Sep 05, 2014 at 02:50:31PM -0400, Peter Hurley wrote: On 09/05/2014 02:09 PM, Paul E. McKenney wrote: [cut] documentation: Record limitations of bitfields and small variables This commit documents the fact that it is not safe to use bitfields as shared variables in synchronization algorithms. It also documents that CPUs must provide one-byte and two-byte load and store instructions ^ atomic Here you meant non-atomic? My guess is that you are referring to the fact that you could emulate a one-byte store on pre-EV56 Alpha CPUs using the ll and sc atomic-read-modify-write instructions, correct? Yes, that's what I meant. I must be tired and am misreading the commit message, or misinterpreting its meaning.

Very good, got it!

Thanx, Paul
Re: bit fields data tearing
On 09/05/2014 03:38 PM, Marc Gauthier wrote: Paul E. McKenney wrote: On Fri, Sep 05, 2014 at 02:50:31PM -0400, Peter Hurley wrote: On 09/05/2014 02:09 PM, Paul E. McKenney wrote: This commit documents the fact that it is not safe to use bitfields as shared variables in synchronization algorithms. It also documents that CPUs must provide one-byte and two-byte load and store instructions ^ atomic Here you meant non-atomic? My guess is that you are referring to the fact that you could emulate a one-byte store on pre-EV56 Alpha CPUs using the ll and sc atomic-read-modify-write instructions, correct? in order to be supported by the Linux kernel. (Michael Cree has agreed to the resulting non-support of pre-EV56 Alpha CPUs: https://lkml.org/lkml/2014/9/5/143.) [...] + and 64-bit systems, respectively. Note that this means that the + Linux kernel does not support pre-EV56 Alpha CPUs, because these + older CPUs do not provide one-byte and two-byte loads and stores. ^ non-atomic I took this, thank you! Eum, am I totally lost, or aren't both of these supposed to say atomic? Can't imagine requiring a CPU to provide non-atomic loads and stores (i.e. requiring old Alpha behavior?).

Here's how I read the two statements.

First, the commit message: It [this commit] documents that CPUs [supported by the Linux kernel] _must provide_ atomic one-byte and two-byte naturally aligned loads and stores.

Second, in the body of the document: The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these older CPUs _do not provide_ atomic one-byte and two-byte loads and stores.

Regards, Peter Hurley
Re: bit fields data tearing
On 09/05/2014 01:12 PM, Peter Zijlstra wrote: ... and I'm wondering if I should _remove_ pre-EV56 configurations or move the default choice and produce a warning about unsupported Alpha CPUs instead? depends BROKEN or is that deprecated? Just rip it out, like I did for the i386. -hpa
Re: bit fields data tearing
On Fri, Sep 05, 2014 at 04:01:35PM -0400, Peter Hurley wrote: On 09/05/2014 03:52 PM, Peter Zijlstra wrote: On Fri, Sep 05, 2014 at 11:31:09AM -0700, Paul E. McKenney wrote: compiler: Allow 1- and 2-byte smp_load_acquire() and smp_store_release() CPUs without single-byte and double-byte loads and stores place some interesting requirements on concurrent code. For example (adapted from Peter Hurley's test code), suppose we have the following structure:

    struct foo {
        spinlock_t lock1;
        spinlock_t lock2;
        char a; /* Protected by lock1. */
        char b; /* Protected by lock2. */
    };
    struct foo *foop;

Of course, it is common (and good) practice to place data protected by different locks in separate cache lines. However, if the locks are rarely acquired (for example, only in rare error cases), and there are a great many instances of the data structure, then memory footprint can trump false-sharing concerns, so that it can be better to place them in the same cache line as above. But if the CPU does not support single-byte loads and stores, a store to foop->a will do a non-atomic read-modify-write operation on foop->b, which will come as a nasty surprise to someone holding foop->lock2. So we now require CPUs to support single-byte and double-byte loads and stores. Therefore, this commit adjusts the definition of __native_word() to allow these sizes to be used by smp_load_acquire() and smp_store_release(). So does this patch depend on a patch that removes pre-EV56 Alpha support? I'm all for removing that, but I need to see the patch merged before we can do this. I'm working on that but Alpha's Kconfig is not quite straightforward. ... and I'm wondering if I should _remove_ pre-EV56 configurations or move the default choice and produce a warning about unsupported Alpha CPUs instead?

I suspect that either would work, given that the Alpha community is pretty close-knit. Just setting the appropriate flag to make the compiler generate one-byte and two-byte loads and stores would probably suffice. ;-)

Thanx, Paul

Regards, Peter Hurley [ How does one do a red popup in kbuild? The 'comment' approach is too subtle. ]
Re: bit fields data tearing
On 09/05/2014 01:14 PM, Peter Hurley wrote: Here's how I read the two statements. First, the commit message: It [this commit] documents that CPUs [supported by the Linux kernel] _must provide_ atomic one-byte and two-byte naturally aligned loads and stores. Second, in the body of the document: The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these older CPUs _do not provide_ atomic one-byte and two-byte loads and stores. Does this apply in general or only to SMP configurations? I guess non-SMP configurations would still have problems if interrupted in the wrong place... -hpa
Re: bit fields data tearing
On Fri, Sep 05, 2014 at 04:14:48PM -0400, Peter Hurley wrote: Second, in the body of the document: The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these older CPUs _do not provide_ atomic one-byte and two-byte loads and stores. Let's be clear here, the pre-EV56 Alpha CPUs do provide an atomic one-byte and two-byte load and store; it's just that one must use locked load and store sequences to achieve atomicity. The point, I think, is that the pre-EV56 Alpha CPUs provide non-atomic one-byte and two-byte load and stores as the norm, and that is the problem. Cheers Michael.
Re: bit fields data tearing
On Fri, Sep 05, 2014 at 04:14:48PM -0400, Peter Hurley wrote:
> On 09/05/2014 03:38 PM, Marc Gauthier wrote:
> > Paul E. McKenney wrote:
> > > On Fri, Sep 05, 2014 at 02:50:31PM -0400, Peter Hurley wrote:
> > > > On 09/05/2014 02:09 PM, Paul E. McKenney wrote:
> > > > > This commit documents the fact that it is not safe to use bitfields as
> > > > > shared variables in synchronization algorithms.  It also documents that
> > > > > CPUs must provide one-byte and two-byte load and store instructions
> > > >                    ^
> > > >                 atomic
> > >
> > > Here you meant non-atomic?  My guess is that you are referring to the
> > > fact that you could emulate a one-byte store on pre-EV56 Alpha CPUs
> > > using the ll and sc atomic-read-modify-write instructions, correct?
> > >
> > > > > in order to be supported by the Linux kernel.  (Michael Cree
> > > > > has agreed to the resulting non-support of pre-EV56 Alpha CPUs:
> > > > > https://lkml.org/lkml/2014/9/5/143.
> > > > [...]
> > > > > +     and 64-bit systems, respectively.  Note that this means that the
> > > > > +     Linux kernel does not support pre-EV56 Alpha CPUs, because these
> > > > > +     older CPUs do not provide one-byte and two-byte loads and stores.
> > > >                                  ^
> > > >                             non-atomic
> > >
> > > I took this, thank you!
> >
> > Eum, am I totally lost, or aren't both of these supposed to say "atomic" ?
> >
> > Can't imagine requiring a CPU to provide non-atomic loads and stores
> > (i.e. requiring old Alpha behavior?).
>
> Here's how I read the two statements.
>
> First, the commit message:
>
> It [this commit] documents that CPUs [supported by the Linux kernel]
> _must provide_ atomic one-byte and two-byte naturally aligned loads and stores.
>
> Second, in the body of the document:
>
> The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these
> older CPUs _do not provide_ atomic one-byte and two-byte loads and stores.

Hmmm...  It is a bit ambiguous.  How about the following?

							Thanx, Paul

documentation: Record limitations of bitfields and small variables

This commit documents the fact that it is not safe to use bitfields as
shared variables in synchronization algorithms.  It also documents that
CPUs must provide one-byte and two-byte normal load and store instructions
in order to be supported by the Linux kernel.  (Michael Cree has agreed
to the resulting non-support of pre-EV56 Alpha CPUs:
https://lkml.org/lkml/2014/9/5/143.)

Signed-off-by: Paul E. McKenney <paul...@linux.vnet.ibm.com>

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 87be0a8a78de..fe4d51b704c5 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -269,6 +269,37 @@ And there are a number of things that _must_ or _must_not_ be assumed:
 	STORE *(A + 4) = Y; STORE *A = X;
 	STORE {*A, *(A + 4) } = {X, Y};
 
+And there are anti-guarantees:
+
+ (*) These guarantees do not apply to bitfields, because compilers often
+     generate code to modify these using non-atomic read-modify-write
+     sequences.  Do not attempt to use bitfields to synchronize parallel
+     algorithms.
+
+ (*) Even in cases where bitfields are protected by locks, all fields
+     in a given bitfield must be protected by one lock.  If two fields
+     in a given bitfield are protected by different locks, the compiler's
+     non-atomic read-modify-write sequences can cause an update to one
+     field to corrupt the value of an adjacent field.
+
+ (*) These guarantees apply only to properly aligned and sized scalar
+     variables.  "Properly sized" currently means variables that are
+     the same size as "char", "short", "int" and "long".  "Properly
+     aligned" means the natural alignment, thus no constraints for
+     "char", two-byte alignment for "short", four-byte alignment for
+     "int", and either four-byte or eight-byte alignment for "long",
+     on 32-bit and 64-bit systems, respectively.  Note that this means
+     that the Linux kernel does not support pre-EV56 Alpha CPUs,
+     because these older CPUs do not provide one-byte and two-byte
+     load and store instructions.  (In theory, the pre-EV56 Alpha CPUs
+     can emulate these instructions using load-linked/store-conditional
+     instructions, but in practice this approach has excessive overhead.
+     Keep in mind that this emulation would be required on -all- single-
+     and double-byte loads and stores in order to handle adjacent bytes
+     protected by different locks.)
+
+     Alpha EV56 and later Alpha CPUs are still supported.
+
 =========================
 WHAT ARE MEMORY BARRIERS?
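To illustrate the second anti-guarantee in the patch above, here is a minimal user-space sketch of how two fields sharing one storage unit can lose an update even when each field has its own lock. It is an assumption-laden illustration, not kernel code: FIELD_A, FIELD_B and replay_lost_update() are made-up names, the fields are modeled as bit masks in a uint32_t rather than as real C bitfields, and one unlucky interleaving is replayed sequentially instead of being raced on real CPUs.

```c
#include <assert.h>
#include <stdint.h>

/* Two one-bit fields packed into the same 32-bit storage unit. */
#define FIELD_A (1u << 0)	/* imagine: protected by lock_a */
#define FIELD_B (1u << 1)	/* imagine: protected by lock_b */

/*
 * Replay one unlucky interleaving of two field updates.  Each "CPU"
 * performs load / modify / store on the whole unit, which is what a
 * compiler typically emits for a bitfield assignment.  The two locks
 * do not help, because they do not exclude each other.
 */
static uint32_t replay_lost_update(uint32_t unit)
{
	uint32_t cpu0, cpu1;

	assert((unit & (FIELD_A | FIELD_B)) == 0);	/* both fields start clear */

	cpu0 = unit;		/* CPU 0, holding lock_a, loads the unit */
	cpu1 = unit;		/* CPU 1, holding lock_b, loads the same unit */
	cpu0 |= FIELD_A;	/* CPU 0 sets its field in its private copy */
	cpu1 |= FIELD_B;	/* CPU 1 sets its field in its private copy */
	unit = cpu0;		/* CPU 0 stores: field A is now set in memory */
	unit = cpu1;		/* CPU 1 stores its stale copy: field A is lost */

	return unit;
}
```

Calling replay_lost_update(0) returns a unit with only FIELD_B set. Giving each field its own properly sized scalar (for example one unsigned char per flag), or covering the whole unit with a single lock, avoids the shared read-modify-write.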
Re: bit fields data tearing
On Fri, Sep 05, 2014 at 01:34:52PM -0700, H. Peter Anvin wrote:
> On 09/05/2014 01:14 PM, Peter Hurley wrote:
> > Here's how I read the two statements.
> >
> > First, the commit message:
> >
> > It [this commit] documents that CPUs [supported by the Linux kernel]
> > _must provide_ atomic one-byte and two-byte naturally aligned loads and stores.
> >
> > Second, in the body of the document:
> >
> > The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these
> > older CPUs _do not provide_ atomic one-byte and two-byte loads and stores.
>
> Does this apply in general or only to SMP configurations?  I guess
> non-SMP configurations would still have problems if interrupted in the
> wrong place...

And preemption could cause problems, too.  So I believe that it needs
to be universal.

							Thanx, Paul
Re: bit fields data tearing
On Fri, Sep 05, 2014 at 01:34:52PM -0700, H. Peter Anvin wrote:
> On 09/05/2014 01:14 PM, Peter Hurley wrote:
> > Here's how I read the two statements.
> >
> > First, the commit message:
> >
> > It [this commit] documents that CPUs [supported by the Linux kernel]
> > _must provide_ atomic one-byte and two-byte naturally aligned loads and stores.
> >
> > Second, in the body of the document:
> >
> > The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these
> > older CPUs _do not provide_ atomic one-byte and two-byte loads and stores.
>
> Does this apply in general or only to SMP configurations?  I guess
> non-SMP configurations would still have problems if interrupted in the
> wrong place...

Yes.

Cheers
Michael.
Re: bit fields data tearing
On Fri, 5 Sep 2014, Paul E. McKenney wrote:
> On Fri, Sep 05, 2014 at 01:34:52PM -0700, H. Peter Anvin wrote:
> > On 09/05/2014 01:14 PM, Peter Hurley wrote:
> > > Here's how I read the two statements.
> > >
> > > First, the commit message:
> > >
> > > It [this commit] documents that CPUs [supported by the Linux kernel]
> > > _must provide_ atomic one-byte and two-byte naturally aligned loads and stores.
> > >
> > > Second, in the body of the document:
> > >
> > > The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these
> > > older CPUs _do not provide_ atomic one-byte and two-byte loads and stores.
> >
> > Does this apply in general or only to SMP configurations?  I guess
> > non-SMP configurations would still have problems if interrupted in the
> > wrong place...
>
> And preemption could cause problems, too.  So I believe that it needs
> to be universal.

Well preemption is usually caused by an interrupt, except you have a
combined load and preempt instruction :)

Thanks,

	tglx
Re: bit fields data tearing
On Fri, Sep 05, 2014 at 10:48:34PM +0200, Thomas Gleixner wrote:
> On Fri, 5 Sep 2014, Paul E. McKenney wrote:
> > On Fri, Sep 05, 2014 at 01:34:52PM -0700, H. Peter Anvin wrote:
> > > On 09/05/2014 01:14 PM, Peter Hurley wrote:
> > > > Here's how I read the two statements.
> > > >
> > > > First, the commit message:
> > > >
> > > > It [this commit] documents that CPUs [supported by the Linux kernel]
> > > > _must provide_ atomic one-byte and two-byte naturally aligned loads and stores.
> > > >
> > > > Second, in the body of the document:
> > > >
> > > > The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these
> > > > older CPUs _do not provide_ atomic one-byte and two-byte loads and stores.
> > >
> > > Does this apply in general or only to SMP configurations?  I guess
> > > non-SMP configurations would still have problems if interrupted in the
> > > wrong place...
> >
> > And preemption could cause problems, too.  So I believe that it needs
> > to be universal.
>
> Well preemption is usually caused by an interrupt, except you have a
> combined load and preempt instruction :)

Fair point!  ;-)

							Thanx, Paul
Re: bit fields data tearing
On 09/05/2014 04:39 PM, Michael Cree wrote:
> On Fri, Sep 05, 2014 at 04:14:48PM -0400, Peter Hurley wrote:
> > Second, in the body of the document:
> >
> > The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these
> > older CPUs _do not provide_ atomic one-byte and two-byte loads and stores.
>
> Let's be clear here, the pre-EV56 Alpha CPUs do provide an atomic
> one-byte and two-byte load and store; it's just that one must use
> locked load and store sequences to achieve atomicity.  The point,
> I think, is that the pre-EV56 Alpha CPUs provide non-atomic one-byte
> and two-byte load and stores as the norm, and that is the problem.

I'm all for an Alpha expert to jump in here and meet the criteria;
which is that byte stores cannot corrupt adjacent storage (nor can
aligned short stores).

To my mind, a quick look at Documentation/circular-buffers.txt will
pretty much convince anyone that trying to differentiate by execution
context is undoable.

If someone wants to make Alphas do cmpxchg loops for every byte store,
then ok.  Or any other solution that doesn't require subsystem code
changes.

Regards,
Peter Hurley
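As a rough picture of what "cmpxchg loops for every byte store" would mean, the following user-space sketch emulates a one-byte store on a machine that can only update whole 32-bit words atomically. It uses C11 <stdatomic.h> rather than any kernel or Alpha primitive, and emulated_byte_store() is a hypothetical helper named only for this example. Bytes are indexed from the least significant end of the word, so memory endianness is deliberately ignored.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): replace one byte of a 32-bit
 * word while leaving the other three bytes untouched, by retrying a
 * compare-and-swap on the containing word until no other writer has
 * changed it in between.
 */
static void emulated_byte_store(_Atomic uint32_t *word, unsigned int byte_idx,
				uint8_t val)
{
	uint32_t mask, old, next;
	unsigned int shift;

	assert(byte_idx < 4);
	shift = byte_idx * 8;
	mask = (uint32_t)0xff << shift;

	old = atomic_load(word);
	do {
		/* Splice the new byte into the most recently observed word. */
		next = (old & ~mask) | ((uint32_t)val << shift);
		/* On failure, 'old' is refreshed with the current contents. */
	} while (!atomic_compare_exchange_weak(word, &old, next));
}
```

Every byte store turns into a load, some masking and at least one compare-and-swap, which gives a feel for the overhead the thread is concerned about. The retry loop is also what keeps a concurrent update to a neighbouring byte from being overwritten.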
Re: [PATCH 0/2] PCI/MSI: Remove arch_msi_check_device()
On Sat, Jul 12, 2014 at 01:21:06PM +0200, Alexander Gordeev wrote:
> Hello,
>
> This is a cleanup effort to get rid of useless arch_msi_check_device().
> I am not sure what were the reasons for its existence in the first place,
> but at the moment it appears totally unnecessary.
>
> Thanks!
>
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-...@vger.kernel.org
>
> Alexander Gordeev (2):
>   PCI/MSI/PPC: Remove arch_msi_check_device()
>   PCI/MSI: Remove arch_msi_check_device()

I applied these (with Michael's ack on the first, and v2 of the second)
to pci/msi for v3.18, thanks!

>  arch/powerpc/include/asm/machdep.h     |    2 -
>  arch/powerpc/kernel/msi.c              |   12 +---
>  arch/powerpc/platforms/cell/axon_msi.c |    9 --
>  arch/powerpc/platforms/powernv/pci.c   |   19 +++-
>  arch/powerpc/platforms/pseries/msi.c   |   42 ++-
>  arch/powerpc/sysdev/fsl_msi.c          |   12 ++--
>  arch/powerpc/sysdev/mpic_pasemi_msi.c  |   11 +--
>  arch/powerpc/sysdev/mpic_u3msi.c       |   28 +++---
>  arch/powerpc/sysdev/ppc4xx_hsta_msi.c  |   18
>  arch/powerpc/sysdev/ppc4xx_msi.c       |   19
>  drivers/pci/msi.c                      |   49 ---
>  include/linux/msi.h                    |    3 --
>  12 files changed, 63 insertions(+), 161 deletions(-)
>
> --
> 1.7.7.6
Re: [PATCH 0/2] PCI/MSI: Remove arch_msi_check_device()
On Fri, Sep 5, 2014 at 3:25 PM, Bjorn Helgaas <bhelg...@google.com> wrote:
> On Sat, Jul 12, 2014 at 01:21:06PM +0200, Alexander Gordeev wrote:
> > Hello,
> >
> > This is a cleanup effort to get rid of useless arch_msi_check_device().
> > I am not sure what were the reasons for its existence in the first place,
> > but at the moment it appears totally unnecessary.
> >
> > Thanks!
> >
> > Cc: linuxppc-dev@lists.ozlabs.org
> > Cc: linux-...@vger.kernel.org
> >
> > Alexander Gordeev (2):
> >   PCI/MSI/PPC: Remove arch_msi_check_device()
> >   PCI/MSI: Remove arch_msi_check_device()
>
> I applied these (with Michael's ack on the first, and v2 of the second)
> to pci/msi for v3.18, thanks!
>
> >  arch/powerpc/include/asm/machdep.h     |    2 -
> >  arch/powerpc/kernel/msi.c              |   12 +---
> >  arch/powerpc/platforms/cell/axon_msi.c |    9 --
> >  arch/powerpc/platforms/powernv/pci.c   |   19 +++-
> >  arch/powerpc/platforms/pseries/msi.c   |   42 ++-
> >  arch/powerpc/sysdev/fsl_msi.c          |   12 ++--
> >  arch/powerpc/sysdev/mpic_pasemi_msi.c  |   11 +--
> >  arch/powerpc/sysdev/mpic_u3msi.c       |   28 +++---
> >  arch/powerpc/sysdev/ppc4xx_hsta_msi.c  |   18
> >  arch/powerpc/sysdev/ppc4xx_msi.c       |   19
> >  drivers/pci/msi.c                      |   49 ---
> >  include/linux/msi.h                    |    3 --
> >  12 files changed, 63 insertions(+), 161 deletions(-)

Oh, I forgot -- if you'd rather take the first one through the PPC
tree, you can do that and I can merge the second one later.  Let me
know if you want to do that.

Bjorn
Re: bit fields data tearing
On Fri, Sep 05, 2014 at 05:12:28PM -0400, Peter Hurley wrote:
> On 09/05/2014 04:39 PM, Michael Cree wrote:
> > On Fri, Sep 05, 2014 at 04:14:48PM -0400, Peter Hurley wrote:
> > > Second, in the body of the document:
> > >
> > > The Linux kernel no longer supports pre-EV56 Alpha CPUs, because these
> > > older CPUs _do not provide_ atomic one-byte and two-byte loads and stores.
> >
> > Let's be clear here, the pre-EV56 Alpha CPUs do provide an atomic
> > one-byte and two-byte load and store; it's just that one must use
> > locked load and store sequences to achieve atomicity.  The point,
> > I think, is that the pre-EV56 Alpha CPUs provide non-atomic one-byte
> > and two-byte load and stores as the norm, and that is the problem.
>
> I'm all for an Alpha expert to jump in here and meet the criteria;
> which is that byte stores cannot corrupt adjacent storage (nor can
> aligned short stores).
>
> To my mind, a quick look at Documentation/circular-buffers.txt will
> pretty much convince anyone that trying to differentiate by execution
> context is undoable.
>
> If someone wants to make Alphas do cmpxchg loops for every byte store,
> then ok.  Or any other solution that doesn't require subsystem code
> changes.

I am not suggesting that anyone do that work.  I'm certainly not going
to do it.

All I was pointing out is that the claim that _do not provide_ made
above with emphasis is, strictly interpreted, not true, thus should not
be committed to the documentation without further clarification.

Cheers
Michael.
Re: [PATCH] PCI: Increase BAR size quirk for IBM ipr SAS Crocodile adapters
On Thu, Aug 21, 2014 at 09:26:52AM +1000, Anton Blanchard wrote:
> From: Douglas Lehr <dll...@us.ibm.com>
>
> The Crocodile chip occasionally comes up with 4k and 8k BAR sizes.
> Due to an errata, setting the SR-IOV page size causes the physical
> function BARs to expand to the system page size.  Since ppc64 uses
> 64k pages, when Linux tries to assign the smaller resource sizes
> to the now 64k BARs the address will be truncated and the BARs will
> overlap.
>
> This quirk will force Linux to allocate the resource as a full page,
> which will avoid the overlap.
>
> Cc: <sta...@vger.kernel.org>
> Signed-off-by: Douglas Lehr <dll...@us.ibm.com>
> Signed-off-by: Anton Blanchard <an...@samba.org>
> Acked-by: Milton Miller <milt...@us.ibm.com>

Applied to pci/misc for v3.18, thanks!  I tweaked it to print the
expanded resource, see below.

> ---
>  drivers/pci/quirks.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 80c2d01..45b946d 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -24,6 +24,7 @@
>  #include <linux/ioport.h>
>  #include <linux/sched.h>
>  #include <linux/ktime.h>
> +#include <linux/mm.h>
>  #include <asm/dma.h>	/* isa_dma_bridge_buggy */
>  #include "pci.h"
>
> @@ -287,6 +288,24 @@ static void quirk_citrine(struct pci_dev *dev)
>  }
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_IBM, PCI_DEVICE_ID_IBM_CITRINE, quirk_citrine);
>
> +/* On IBM Crocodile ipr SAS adapters, expand bar size to system page size. */
> +static void quirk_extend_bar_to_page(struct pci_dev *dev)
> +{
> +	int i;
> +
> +	for (i = 0; i < PCI_STD_RESOURCE_END; i++) {
> +		struct resource *r = &dev->resource[i];
> +
> +		if (r->flags & IORESOURCE_MEM && resource_size(r) < PAGE_SIZE) {
> +			dev_info(&dev->dev, "Setting Bar size to Page size");
> +			r->end = PAGE_SIZE-1;
> +			r->start = 0;
> +			r->flags |= IORESOURCE_UNSET;
> +		}
> +	}
> +}
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_IBM, 0x034a, quirk_extend_bar_to_page);
> +
>  /*
>   * S3 868 and 968 chips report region size equal to 32M, but they decode 64M.
>   * If it's needed, re-allocate the region.
--

commit 86b6431a306ab5a5204c436a45a3337fb17efa21
Author: Douglas Lehr <dll...@us.ibm.com>
Date:   Thu Aug 21 09:26:52 2014 +1000

    PCI: Increase IBM ipr SAS Crocodile BARs to at least system page size

    The Crocodile chip occasionally comes up with 4k and 8k BAR sizes.  Due to
    an erratum, setting the SR-IOV page size causes the physical function BARs
    to expand to the system page size.  Since ppc64 uses 64k pages, when Linux
    tries to assign the smaller resource sizes to the now 64k BARs the address
    will be truncated and the BARs will overlap.

    Force Linux to allocate the resource as a full page, which avoids the
    overlap.

    [bhelgaas: print expanded resource, too]
    Signed-off-by: Douglas Lehr <dll...@us.ibm.com>
    Signed-off-by: Anton Blanchard <an...@samba.org>
    Signed-off-by: Bjorn Helgaas <bhelg...@google.com>
    Acked-by: Milton Miller <milt...@us.ibm.com>
    CC: sta...@vger.kernel.org

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 80c2d014283d..e73960311fb4 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -24,6 +24,7 @@
 #include <linux/ioport.h>
 #include <linux/sched.h>
 #include <linux/ktime.h>
+#include <linux/mm.h>
 #include <asm/dma.h>	/* isa_dma_bridge_buggy */
 #include "pci.h"

@@ -287,6 +288,25 @@ static void quirk_citrine(struct pci_dev *dev)
 }
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_IBM, PCI_DEVICE_ID_IBM_CITRINE, quirk_citrine);

+/* On IBM Crocodile ipr SAS adapters, expand BAR to system page size */
+static void quirk_extend_bar_to_page(struct pci_dev *dev)
+{
+	int i;
+
+	for (i = 0; i < PCI_STD_RESOURCE_END; i++) {
+		struct resource *r = &dev->resource[i];
+
+		if (r->flags & IORESOURCE_MEM && resource_size(r) < PAGE_SIZE) {
+			r->end = PAGE_SIZE - 1;
+			r->start = 0;
+			r->flags |= IORESOURCE_UNSET;
+			dev_info(&dev->dev, "expanded BAR %d to page size: %pR\n",
+				 i, r);
+		}
+	}
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_IBM, 0x034a, quirk_extend_bar_to_page);
+
 /*
  * S3 868 and 968 chips report region size equal to 32M, but they decode 64M.
  * If it's needed, re-allocate the region.
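The effect of the quirk on a set of BARs can be tried out in user space with the sketch below. It only mirrors the resize logic described in the commit message; every name prefixed with sketch_ or SKETCH_ is a stand-in invented for this example (the flag values in particular are arbitrary and are not the kernel's IORESOURCE_* values), and the 64k page size is assumed from the ppc64 case mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for kernel definitions; values are for illustration only. */
#define SKETCH_PAGE_SIZE	0x10000u	/* 64k pages, as on ppc64 */
#define SKETCH_RES_MEM		0x1u		/* plays the role of IORESOURCE_MEM */
#define SKETCH_RES_UNSET	0x2u		/* plays the role of IORESOURCE_UNSET */

struct sketch_resource {
	uint64_t start;
	uint64_t end;		/* inclusive, as in the kernel's struct resource */
	unsigned long flags;
};

static uint64_t sketch_resource_size(const struct sketch_resource *r)
{
	return r->end - r->start + 1;
}

/*
 * Grow every memory BAR smaller than one page to exactly one page and
 * mark it unassigned, so that a later allocation pass would place it at
 * a page-aligned address.  Returns the number of BARs that were changed.
 */
static int sketch_extend_bars_to_page(struct sketch_resource *bars, int n)
{
	int i, expanded = 0;

	assert(bars && n >= 0);

	for (i = 0; i < n; i++) {
		struct sketch_resource *r = &bars[i];

		if ((r->flags & SKETCH_RES_MEM) &&
		    sketch_resource_size(r) < SKETCH_PAGE_SIZE) {
			r->end = SKETCH_PAGE_SIZE - 1;
			r->start = 0;
			r->flags |= SKETCH_RES_UNSET;
			expanded++;
		}
	}
	return expanded;
}
```

A 4k memory BAR at 0x1000-0x1fff comes out as 0x0-0xffff with the unset flag, while a 128k BAR or an unused slot is left alone. Resetting start to 0 and flagging the resource as unset is what hands placement back to the resource allocator instead of keeping the firmware-assigned address.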