Re: [RFC PATCH 0/6] Change vendor prefix for Intersil Corporation
On Wed, Aug 20, 2014 at 08:16:43AM -0400, Jason Cooper wrote: Philipp, On Wed, Aug 20, 2014 at 10:42:58AM +0200, Philipp Zabel wrote: Hi, currently there is a wild mixture of isl, isil, and intersil compatibles in the kernel. I figure at this point it is still possible to change this to use isil everywhere without too much pain, but it might be preferred to keep the already documented isl prefix, even though it doesn't follow the convention to use the NASDAQ symbol where available. The current users of the isil prefix are the following drivers and device trees, so we could just as well change those instead: arch/arm/boot/dts/tegra20-seaboard.dts arch/arm/boot/dts/tegra20-ventana.dts arch/arm/boot/dts/tegra30-cardhu.dtsi arch/powerpc/boot/dts/p1022rdk.dts drivers/staging/iio/light/isl29018.c drivers/staging/iio/light/isl29028.c regards Philipp Philipp Zabel (6): of: Change vendor prefix for Intersil Corporation to isil Documentation: Add isl1208 and isl12022 to trivial-devices list ARM: mvebu: Change vendor prefix for Intersil Corporation to isil powerpc/85xx: Change vendor prefix for Intersil Corporation to isil rtc: rtc-isl12022: Change vendor prefix for Intersil Corporation to isil rtc: rtc-isl12057: Change vendor prefix for Intersil Corporation to isil Documentation/devicetree/bindings/i2c/trivial-devices.txt | 4 +++- Documentation/devicetree/bindings/vendor-prefixes.txt | 2 +- arch/arm/boot/dts/armada-370-netgear-rn102.dts| 2 +- arch/arm/boot/dts/armada-370-netgear-rn104.dts| 2 +- arch/arm/boot/dts/armada-xp-netgear-rn2120.dts| 2 +- arch/powerpc/boot/dts/ppa8548.dts | 2 +- drivers/rtc/rtc-isl12022.c| 3 ++- drivers/rtc/rtc-isl12057.c| 3 ++- 8 files changed, 12 insertions(+), 8 deletions(-) This looks good overall. My only nit is that I'd like to see the legacy name(s) captured in the binding docs. I'll take the Armada dts changes through mvebu/arm-soc after a few days. Just to dot our i's and cross our t's, would you mind sending out a non-rfc version of this series? thx, Jason. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: bit fields data tearing
On Sat, Sep 06, 2014 at 10:07:22PM -0700, James Bottomley wrote: On Thu, 2014-09-04 at 21:06 -0700, Paul E. McKenney wrote: On Thu, Sep 04, 2014 at 10:47:24PM -0400, Peter Hurley wrote: Hi James, On 09/04/2014 10:11 PM, James Bottomley wrote: On Thu, 2014-09-04 at 17:17 -0700, Paul E. McKenney wrote: +And there are anti-guarantees: + + (*) These guarantees do not apply to bitfields, because compilers often + generate code to modify these using non-atomic read-modify-write + sequences. Do not attempt to use bitfields to synchronize parallel + algorithms. + + (*) Even in cases where bitfields are protected by locks, all fields + in a given bitfield must be protected by one lock. If two fields + in a given bitfield are protected by different locks, the compiler's + non-atomic read-modify-write sequences can cause an update to one + field to corrupt the value of an adjacent field. + + (*) These guarantees apply only to properly aligned and sized scalar + variables. Properly sized currently means int and long, + because some CPU families do not support loads and stores of + other sizes. (Some CPU families is currently believed to + be only Alpha 21064. If this is actually the case, a different + non-guarantee is likely to be formulated.) This is a bit unclear. Presumably you're talking about definiteness of the outcome (as in what's seen after multiple stores to the same variable). No, the last conditions refers to adjacent byte stores from different cpu contexts (either interrupt or SMP). The guarantees are only for natural width on Parisc as well, so you would get a mess if you did byte stores to adjacent memory locations. For a simple test like: struct x { long a; char b; char c; char d; char e; }; void store_bc(struct x *p) { p-b = 1; p-c = 2; } on parisc, gcc generates separate byte stores void store_bc(struct x *p) { 0: 34 1c 00 02 ldi 1,ret0 4: 0f 5c 12 08 stb ret0,4(r26) 8: 34 1c 00 04 ldi 2,ret0 c: e8 40 c0 00 bv r0(rp) 10: 0f 5c 12 0a stb ret0,5(r26) which appears to confirm that on parisc adjacent byte data is safe from corruption by concurrent cpu updates; that is, CPU 0| CPU 1 | p-b = 1 | p-c = 2 | will result in p-b == 1 p-c == 2 (assume both values were 0 before the call to store_bc()). What Peter said. I would ask for suggestions for better wording, but I would much rather be able to say that single-byte reads and writes are atomic and that aligned-short reads and writes are also atomic. Thus far, it looks like we lose only very old Alpha systems, so unless I hear otherwise, I update my patch to outlaw these very old systems. This isn't universally true according to the architecture manual. The PARISC CPU can make byte to long word stores atomic against the memory bus but not against the I/O bus for instance. Atomicity is a property of the underlying substrate, not of the CPU. Implying that atomicity is a CPU property is incorrect. OK, fair point. But are there in-use-for-Linux PARISC memory fabrics (for normal memory, not I/O) that do not support single-byte and double-byte stores? Thanx, Paul ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 0/2 v5] powerpc/kvm: support to handle sw breakpoint
This patchset adds ppc64 server side support for software breakpoint and extends the use of illegal instruction as software breakpoint across ppc platform. Patch 1, adds kernel side support for software breakpoint. Design is that, by using an illegal instruction, we trap to hypervisor via Emulation Assistance interrupt, where we check for the illegal instruction and accordingly we return to Host or Guest. Patch also adds support for software breakpoint in PR KVM. Patch 2,extends the use of illegal instruction as software breakpoint instruction across the ppc platform. Patch extends booke program interrupt code to support software breakpoint. Patch 2 is only compile tested. Will really help if someone can try it out and let me know comments. Changes v4-v5: Made changes to code comments and commit messages Added debugging active checks for illegal instr comparison Added debug instruction check in emulate code Extended SW breakpoint to booke Changes v3-v4: Made changes to code comments and removed #define of zero opcode Added a new function to handle the debug instruction emulation in book3s_hv Rebased the code to latest upstream source. Changes v2-v3: Changed the debug instructions. Using the all zero opcode in the instruction word as illegal instruction as mentioned in Power ISA instead of ABS Removed reg updated in emulation assist and added a call to kvmppc_emulate_instruction for reg update. Changes v1-v2: Moved the debug instruction #def to kvm_book3s.h. This way PR_KVM can also share it. Added code to use KVM get one reg infrastructure to get debug opcode. Updated emulate.c to include emulation of debug instruction incase of PR_KVM. Made changes to commit message. Madhavan Srinivasan (2): powerpc/kvm: support to handle sw breakpoint powerpc/kvm: common sw breakpoint instr across ppc arch/powerpc/include/asm/kvm_ppc.h | 6 ++ arch/powerpc/kvm/book3s.c | 3 ++- arch/powerpc/kvm/book3s_hv.c | 41 ++ arch/powerpc/kvm/book3s_pr.c | 3 +++ arch/powerpc/kvm/booke.c | 18 +++-- arch/powerpc/kvm/emulate.c | 18 + 6 files changed, 82 insertions(+), 7 deletions(-) -- 1.7.11.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 1/2 v5] powerpc/kvm: support to handle sw breakpoint
This patch adds kernel side support for software breakpoint. Design is that, by using an illegal instruction, we trap to hypervisor via Emulation Assistance interrupt, where we check for the illegal instruction and accordingly we return to Host or Guest. Patch also adds support for software breakpoint in PR KVM. Signed-off-by: Madhavan Srinivasan ma...@linux.vnet.ibm.com --- arch/powerpc/include/asm/kvm_ppc.h | 6 ++ arch/powerpc/kvm/book3s.c | 3 ++- arch/powerpc/kvm/book3s_hv.c | 41 ++ arch/powerpc/kvm/book3s_pr.c | 3 +++ arch/powerpc/kvm/emulate.c | 18 + 5 files changed, 66 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index fb86a22..dd83c9a 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -38,6 +38,12 @@ #include asm/paca.h #endif +/* + * KVMPPC_INST_SW_BREAKPOINT is debug Instruction + * for supporting software breakpoint. + */ +#define KVMPPC_INST_SW_BREAKPOINT 0x0000 + enum emulation_result { EMULATE_DONE, /* no further processing */ EMULATE_DO_MMIO, /* kvm_run filled with MMIO request */ diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index dd03f6b..00e9c9f 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -778,7 +778,8 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu, int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, struct kvm_guest_debug *dbg) { - return -EINVAL; + vcpu-guest_debug = dbg-control; + return 0; } void kvmppc_decrementer_func(unsigned long data) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 27cced9..3a2414c 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -725,6 +725,30 @@ static int kvmppc_hcall_impl_hv(unsigned long cmd) return kvmppc_hcall_impl_hv_realmode(cmd); } +static int kvmppc_emulate_debug_inst(struct kvm_run *run, + struct kvm_vcpu *vcpu) +{ + u32 last_inst; + + if(kvmppc_get_last_inst(vcpu, INST_GENERIC, last_inst) != + EMULATE_DONE) { + /* +* Fetch failed, so return to guest and +* try executing it again. +*/ + return RESUME_GUEST; + } else { + if (last_inst == KVMPPC_INST_SW_BREAKPOINT) { + run-exit_reason = KVM_EXIT_DEBUG; + run-debug.arch.address = kvmppc_get_pc(vcpu); + return RESUME_HOST; + } else { + kvmppc_core_queue_program(vcpu, SRR1_PROGILL); + return RESUME_GUEST; + } + } +} + static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu, struct task_struct *tsk) { @@ -807,12 +831,18 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu, break; /* * This occurs if the guest executes an illegal instruction. -* We just generate a program interrupt to the guest, since -* we don't emulate any guest instructions at this stage. +* If the guest debug is disabled, generate a program interrupt +* to the guest. If guest debug is enabled, we need to check +* whether the instruction is a software breakpoint instruction. +* Accordingly return to Guest or Host. */ case BOOK3S_INTERRUPT_H_EMUL_ASSIST: - kvmppc_core_queue_program(vcpu, SRR1_PROGILL); - r = RESUME_GUEST; + if(vcpu-guest_debug KVM_GUESTDBG_USE_SW_BP) { + r = kvmppc_emulate_debug_inst(run, vcpu); + } else { + kvmppc_core_queue_program(vcpu, SRR1_PROGILL); + r = RESUME_GUEST; + } break; /* * This occurs if the guest (kernel or userspace), does something that @@ -922,6 +952,9 @@ static int kvmppc_get_one_reg_hv(struct kvm_vcpu *vcpu, u64 id, long int i; switch (id) { + case KVM_REG_PPC_DEBUG_INST: + *val = get_reg_val(id, KVMPPC_INST_SW_BREAKPOINT); + break; case KVM_REG_PPC_HIOR: *val = get_reg_val(id, 0); break; diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c index faffb27..6d73708 100644 --- a/arch/powerpc/kvm/book3s_pr.c +++ b/arch/powerpc/kvm/book3s_pr.c @@ -1319,6 +1319,9 @@ static int kvmppc_get_one_reg_pr(struct kvm_vcpu *vcpu, u64 id, int r = 0; switch (id) { + case KVM_REG_PPC_DEBUG_INST: + *val = get_reg_val(id, KVMPPC_INST_SW_BREAKPOINT); +
[PATCH 2/2 v5] powerpc/kvm: common sw breakpoint instr across ppc
This patch extends the use of illegal instruction as software breakpoint instruction across the ppc platform. Patch extends booke program interrupt code to support software breakpoint. Signed-off-by: Madhavan Srinivasan ma...@linux.vnet.ibm.com --- Patch is only compile tested. Will really help if someone can try it out and let me know comments. arch/powerpc/kvm/booke.c | 18 -- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index b4c89fa..1b84853 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -870,6 +870,10 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, case BOOKE_INTERRUPT_HV_PRIV: emulated = kvmppc_get_last_inst(vcpu, false, last_inst); break; + case BOOKE_INTERRUPT_PROGRAM: + /*SW breakpoints arrive as illegal instructions on HV */ + emulated = kvmppc_get_last_inst(vcpu, false, last_inst); + break; default: break; } @@ -947,7 +951,17 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, break; case BOOKE_INTERRUPT_PROGRAM: - if (vcpu-arch.shared-msr (MSR_PR | MSR_GS)) { + if ((vcpu-arch.shared-msr (MSR_PR | MSR_GS)) + (last_inst == KVMPPC_INST_SW_BREAKPOINT)) { + /* +* We are here because of an SW breakpoint instr, +* so lets return to host to handle. +*/ + r = kvmppc_handle_debug(run, vcpu); + run-exit_reason = KVM_EXIT_DEBUG; + kvmppc_account_exit(vcpu, DEBUG_EXITS); + break; + } else { /* * Program traps generated by user-level software must * be handled by the guest kernel. @@ -1505,7 +1519,7 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg) val = get_reg_val(reg-id, vcpu-arch.tsr); break; case KVM_REG_PPC_DEBUG_INST: - val = get_reg_val(reg-id, KVMPPC_INST_EHPRIV_DEBUG); + val = get_reg_val(reg-id, KVMPPC_INST_SW_BREAKPOINT); break; case KVM_REG_PPC_VRSAVE: val = get_reg_val(reg-id, vcpu-arch.vrsave); -- 1.7.11.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: bit fields data tearing
On Sun, 2014-09-07 at 09:21 -0700, Paul E. McKenney wrote: On Sat, Sep 06, 2014 at 10:07:22PM -0700, James Bottomley wrote: On Thu, 2014-09-04 at 21:06 -0700, Paul E. McKenney wrote: On Thu, Sep 04, 2014 at 10:47:24PM -0400, Peter Hurley wrote: Hi James, On 09/04/2014 10:11 PM, James Bottomley wrote: On Thu, 2014-09-04 at 17:17 -0700, Paul E. McKenney wrote: +And there are anti-guarantees: + + (*) These guarantees do not apply to bitfields, because compilers often + generate code to modify these using non-atomic read-modify-write + sequences. Do not attempt to use bitfields to synchronize parallel + algorithms. + + (*) Even in cases where bitfields are protected by locks, all fields + in a given bitfield must be protected by one lock. If two fields + in a given bitfield are protected by different locks, the compiler's + non-atomic read-modify-write sequences can cause an update to one + field to corrupt the value of an adjacent field. + + (*) These guarantees apply only to properly aligned and sized scalar + variables. Properly sized currently means int and long, + because some CPU families do not support loads and stores of + other sizes. (Some CPU families is currently believed to + be only Alpha 21064. If this is actually the case, a different + non-guarantee is likely to be formulated.) This is a bit unclear. Presumably you're talking about definiteness of the outcome (as in what's seen after multiple stores to the same variable). No, the last conditions refers to adjacent byte stores from different cpu contexts (either interrupt or SMP). The guarantees are only for natural width on Parisc as well, so you would get a mess if you did byte stores to adjacent memory locations. For a simple test like: struct x { long a; char b; char c; char d; char e; }; void store_bc(struct x *p) { p-b = 1; p-c = 2; } on parisc, gcc generates separate byte stores void store_bc(struct x *p) { 0: 34 1c 00 02 ldi 1,ret0 4: 0f 5c 12 08 stb ret0,4(r26) 8: 34 1c 00 04 ldi 2,ret0 c: e8 40 c0 00 bv r0(rp) 10: 0f 5c 12 0a stb ret0,5(r26) which appears to confirm that on parisc adjacent byte data is safe from corruption by concurrent cpu updates; that is, CPU 0| CPU 1 | p-b = 1 | p-c = 2 | will result in p-b == 1 p-c == 2 (assume both values were 0 before the call to store_bc()). What Peter said. I would ask for suggestions for better wording, but I would much rather be able to say that single-byte reads and writes are atomic and that aligned-short reads and writes are also atomic. Thus far, it looks like we lose only very old Alpha systems, so unless I hear otherwise, I update my patch to outlaw these very old systems. This isn't universally true according to the architecture manual. The PARISC CPU can make byte to long word stores atomic against the memory bus but not against the I/O bus for instance. Atomicity is a property of the underlying substrate, not of the CPU. Implying that atomicity is a CPU property is incorrect. OK, fair point. But are there in-use-for-Linux PARISC memory fabrics (for normal memory, not I/O) that do not support single-byte and double-byte stores? For aligned access, I believe that's always the case for the memory bus (on both 32 and 64 bit systems). However, it only applies to machine instruction loads and stores of the same width.. If you mix the widths on the loads and stores, all bets are off. That means you have to beware of the gcc penchant for coalescing loads and stores: if it sees two adjacent byte stores it can coalesce them into a short store instead ... that screws up the atomicity guarantees. James ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 0/2] PCI/MSI: Remove arch_msi_check_device()
On Fri, Sep 05, 2014 at 03:27:49PM -0600, Bjorn Helgaas wrote: I applied these (with Michael's ack on the first, and v2 of the second) to pci/msi for v3.18, thanks! Hi Bjorn, I resent a series with updates that fix kbuild robot errors. Hopefully, the rebase for pci/msi would not cause trouble for anyone. Oh, I forgot -- if you'd rather take the first one through the PPC tree, you can do that and I can merge the second one later. Let me know if you want to do that. Nah, your treee is just fine. Thanks! Bjorn -- Regards, Alexander Gordeev agord...@redhat.com ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: bit fields data tearing
On 09/07/2014 03:04 PM, James Bottomley wrote: On Sun, 2014-09-07 at 09:21 -0700, Paul E. McKenney wrote: On Sat, Sep 06, 2014 at 10:07:22PM -0700, James Bottomley wrote: On Thu, 2014-09-04 at 21:06 -0700, Paul E. McKenney wrote: On Thu, Sep 04, 2014 at 10:47:24PM -0400, Peter Hurley wrote: Hi James, On 09/04/2014 10:11 PM, James Bottomley wrote: On Thu, 2014-09-04 at 17:17 -0700, Paul E. McKenney wrote: +And there are anti-guarantees: + + (*) These guarantees do not apply to bitfields, because compilers often + generate code to modify these using non-atomic read-modify-write + sequences. Do not attempt to use bitfields to synchronize parallel + algorithms. + + (*) Even in cases where bitfields are protected by locks, all fields + in a given bitfield must be protected by one lock. If two fields + in a given bitfield are protected by different locks, the compiler's + non-atomic read-modify-write sequences can cause an update to one + field to corrupt the value of an adjacent field. + + (*) These guarantees apply only to properly aligned and sized scalar + variables. Properly sized currently means int and long, + because some CPU families do not support loads and stores of + other sizes. (Some CPU families is currently believed to + be only Alpha 21064. If this is actually the case, a different + non-guarantee is likely to be formulated.) This is a bit unclear. Presumably you're talking about definiteness of the outcome (as in what's seen after multiple stores to the same variable). No, the last conditions refers to adjacent byte stores from different cpu contexts (either interrupt or SMP). The guarantees are only for natural width on Parisc as well, so you would get a mess if you did byte stores to adjacent memory locations. For a simple test like: struct x { long a; char b; char c; char d; char e; }; void store_bc(struct x *p) { p-b = 1; p-c = 2; } on parisc, gcc generates separate byte stores void store_bc(struct x *p) { 0: 34 1c 00 02 ldi 1,ret0 4: 0f 5c 12 08 stb ret0,4(r26) 8: 34 1c 00 04 ldi 2,ret0 c: e8 40 c0 00 bv r0(rp) 10: 0f 5c 12 0a stb ret0,5(r26) which appears to confirm that on parisc adjacent byte data is safe from corruption by concurrent cpu updates; that is, CPU 0| CPU 1 | p-b = 1 | p-c = 2 | will result in p-b == 1 p-c == 2 (assume both values were 0 before the call to store_bc()). What Peter said. I would ask for suggestions for better wording, but I would much rather be able to say that single-byte reads and writes are atomic and that aligned-short reads and writes are also atomic. Thus far, it looks like we lose only very old Alpha systems, so unless I hear otherwise, I update my patch to outlaw these very old systems. This isn't universally true according to the architecture manual. The PARISC CPU can make byte to long word stores atomic against the memory bus but not against the I/O bus for instance. Atomicity is a property of the underlying substrate, not of the CPU. Implying that atomicity is a CPU property is incorrect. To go back to this briefly, while it's true that atomicity is not a CPU property, atomicity is not possible if the CPU is not cooperating. OK, fair point. But are there in-use-for-Linux PARISC memory fabrics (for normal memory, not I/O) that do not support single-byte and double-byte stores? For aligned access, I believe that's always the case for the memory bus (on both 32 and 64 bit systems). However, it only applies to machine instruction loads and stores of the same width.. If you mix the widths on the loads and stores, all bets are off. That means you have to beware of the gcc penchant for coalescing loads and stores: if it sees two adjacent byte stores it can coalesce them into a short store instead ... that screws up the atomicity guarantees. Two things: I think that gcc has given up on combining adjacent writes, perhaps because unaligned writes on some arches are prohibitive, so whatever minor optimization was believed to be gained was quickly lost, multi-fold. (Although this might not be the reason since one would expect that gcc would know the alignment of the promoted store). But additionally, even if gcc combines adjacent writes _that are part of the program flow_ then I believe the situation is no worse than would otherwise exist. For instance, given the following: struct x { spinlock_t lock; long a; byte b; byte c; }; void locked_store_b(struct x *p) { spin_lock(p-lock); p-b = 1; spin_unlock(p-lock); p-c = 2; } Granted, the author probably expects ordered writes of STORE B STORE C but that's not guaranteed because there is no memory barrier ordering B before C. I
Re: bit fields data tearing
On Sun, Sep 07, 2014 at 12:04:47PM -0700, James Bottomley wrote: On Sun, 2014-09-07 at 09:21 -0700, Paul E. McKenney wrote: On Sat, Sep 06, 2014 at 10:07:22PM -0700, James Bottomley wrote: On Thu, 2014-09-04 at 21:06 -0700, Paul E. McKenney wrote: On Thu, Sep 04, 2014 at 10:47:24PM -0400, Peter Hurley wrote: Hi James, On 09/04/2014 10:11 PM, James Bottomley wrote: On Thu, 2014-09-04 at 17:17 -0700, Paul E. McKenney wrote: +And there are anti-guarantees: + + (*) These guarantees do not apply to bitfields, because compilers often + generate code to modify these using non-atomic read-modify-write + sequences. Do not attempt to use bitfields to synchronize parallel + algorithms. + + (*) Even in cases where bitfields are protected by locks, all fields + in a given bitfield must be protected by one lock. If two fields + in a given bitfield are protected by different locks, the compiler's + non-atomic read-modify-write sequences can cause an update to one + field to corrupt the value of an adjacent field. + + (*) These guarantees apply only to properly aligned and sized scalar + variables. Properly sized currently means int and long, + because some CPU families do not support loads and stores of + other sizes. (Some CPU families is currently believed to + be only Alpha 21064. If this is actually the case, a different + non-guarantee is likely to be formulated.) This is a bit unclear. Presumably you're talking about definiteness of the outcome (as in what's seen after multiple stores to the same variable). No, the last conditions refers to adjacent byte stores from different cpu contexts (either interrupt or SMP). The guarantees are only for natural width on Parisc as well, so you would get a mess if you did byte stores to adjacent memory locations. For a simple test like: struct x { long a; char b; char c; char d; char e; }; void store_bc(struct x *p) { p-b = 1; p-c = 2; } on parisc, gcc generates separate byte stores void store_bc(struct x *p) { 0: 34 1c 00 02 ldi 1,ret0 4: 0f 5c 12 08 stb ret0,4(r26) 8: 34 1c 00 04 ldi 2,ret0 c: e8 40 c0 00 bv r0(rp) 10: 0f 5c 12 0a stb ret0,5(r26) which appears to confirm that on parisc adjacent byte data is safe from corruption by concurrent cpu updates; that is, CPU 0| CPU 1 | p-b = 1 | p-c = 2 | will result in p-b == 1 p-c == 2 (assume both values were 0 before the call to store_bc()). What Peter said. I would ask for suggestions for better wording, but I would much rather be able to say that single-byte reads and writes are atomic and that aligned-short reads and writes are also atomic. Thus far, it looks like we lose only very old Alpha systems, so unless I hear otherwise, I update my patch to outlaw these very old systems. This isn't universally true according to the architecture manual. The PARISC CPU can make byte to long word stores atomic against the memory bus but not against the I/O bus for instance. Atomicity is a property of the underlying substrate, not of the CPU. Implying that atomicity is a CPU property is incorrect. OK, fair point. But are there in-use-for-Linux PARISC memory fabrics (for normal memory, not I/O) that do not support single-byte and double-byte stores? For aligned access, I believe that's always the case for the memory bus (on both 32 and 64 bit systems). However, it only applies to machine instruction loads and stores of the same width.. If you mix the widths on the loads and stores, all bets are off. That means you have to beware of the gcc penchant for coalescing loads and stores: if it sees two adjacent byte stores it can coalesce them into a short store instead ... that screws up the atomicity guarantees. OK, that means that to make PARISC work reliably, we need to use ACCESS_ONCE() for loads and stores that could have racing accesses. If I understand correctly, this will -not- be needed for code guarded by locks, even with Peter's examples. So if we have something like this: struct foo { char a; char b; }; struct foo *fp; then this code would be bad: fp-a = 1; fp-b = 2; The reason is (as you say) that GCC would be happy to store 0x0102 (or vice versa, depending on endianness) to the pair. We instead need:
Re: bit fields data tearing
I'm confused why storing 0x0102 would be a problem. I think gcc does that even on other cpus. More atomicity can't hurt, can it? On September 7, 2014 4:00:19 PM PDT, Paul E. McKenney paul...@linux.vnet.ibm.com wrote: On Sun, Sep 07, 2014 at 12:04:47PM -0700, James Bottomley wrote: On Sun, 2014-09-07 at 09:21 -0700, Paul E. McKenney wrote: On Sat, Sep 06, 2014 at 10:07:22PM -0700, James Bottomley wrote: On Thu, 2014-09-04 at 21:06 -0700, Paul E. McKenney wrote: On Thu, Sep 04, 2014 at 10:47:24PM -0400, Peter Hurley wrote: Hi James, On 09/04/2014 10:11 PM, James Bottomley wrote: On Thu, 2014-09-04 at 17:17 -0700, Paul E. McKenney wrote: +And there are anti-guarantees: + + (*) These guarantees do not apply to bitfields, because compilers often + generate code to modify these using non-atomic read-modify-write + sequences. Do not attempt to use bitfields to synchronize parallel + algorithms. + + (*) Even in cases where bitfields are protected by locks, all fields + in a given bitfield must be protected by one lock. If two fields + in a given bitfield are protected by different locks, the compiler's + non-atomic read-modify-write sequences can cause an update to one + field to corrupt the value of an adjacent field. + + (*) These guarantees apply only to properly aligned and sized scalar + variables. Properly sized currently means int and long, + because some CPU families do not support loads and stores of + other sizes. (Some CPU families is currently believed to + be only Alpha 21064. If this is actually the case, a different + non-guarantee is likely to be formulated.) This is a bit unclear. Presumably you're talking about definiteness of the outcome (as in what's seen after multiple stores to the same variable). No, the last conditions refers to adjacent byte stores from different cpu contexts (either interrupt or SMP). The guarantees are only for natural width on Parisc as well, so you would get a mess if you did byte stores to adjacent memory locations. For a simple test like: struct x { long a; char b; char c; char d; char e; }; void store_bc(struct x *p) { p-b = 1; p-c = 2; } on parisc, gcc generates separate byte stores void store_bc(struct x *p) { 0:34 1c 00 02 ldi 1,ret0 4:0f 5c 12 08 stb ret0,4(r26) 8:34 1c 00 04 ldi 2,ret0 c:e8 40 c0 00 bv r0(rp) 10:0f 5c 12 0a stb ret0,5(r26) which appears to confirm that on parisc adjacent byte data is safe from corruption by concurrent cpu updates; that is, CPU 0| CPU 1 | p-b = 1 | p-c = 2 | will result in p-b == 1 p-c == 2 (assume both values were 0 before the call to store_bc()). What Peter said. I would ask for suggestions for better wording, but I would much rather be able to say that single-byte reads and writes are atomic and that aligned-short reads and writes are also atomic. Thus far, it looks like we lose only very old Alpha systems, so unless I hear otherwise, I update my patch to outlaw these very old systems. This isn't universally true according to the architecture manual. The PARISC CPU can make byte to long word stores atomic against the memory bus but not against the I/O bus for instance. Atomicity is a property of the underlying substrate, not of the CPU. Implying that atomicity is a CPU property is incorrect. OK, fair point. But are there in-use-for-Linux PARISC memory fabrics (for normal memory, not I/O) that do not support single-byte and double-byte stores? For aligned access, I believe that's always the case for the memory bus (on both 32 and 64 bit systems). However, it only applies to machine instruction loads and stores of the same width.. If you mix the widths on the loads and stores, all bets are off. That means you have to beware of the gcc penchant for coalescing loads and stores: if it sees two adjacent byte stores it can coalesce them into a short store instead ... that screws up the atomicity guarantees. OK, that means that to make PARISC work reliably, we need to use ACCESS_ONCE() for loads and stores that could have racing accesses. If I understand correctly, this will -not- be needed for code guarded by locks, even with Peter's examples. So if we have something like this: struct foo { char a; char b; }; struct foo *fp; then this code would be bad: fp-a = 1;
Re: bit fields data tearing
On Sun, Sep 07, 2014 at 04:17:30PM -0700, H. Peter Anvin wrote: I'm confused why storing 0x0102 would be a problem. I think gcc does that even on other cpus. More atomicity can't hurt, can it? I must defer to James for any additional details on why PARISC systems don't provide atomicity for partially overlapping stores. ;-) Thanx, Paul On September 7, 2014 4:00:19 PM PDT, Paul E. McKenney paul...@linux.vnet.ibm.com wrote: On Sun, Sep 07, 2014 at 12:04:47PM -0700, James Bottomley wrote: On Sun, 2014-09-07 at 09:21 -0700, Paul E. McKenney wrote: On Sat, Sep 06, 2014 at 10:07:22PM -0700, James Bottomley wrote: On Thu, 2014-09-04 at 21:06 -0700, Paul E. McKenney wrote: On Thu, Sep 04, 2014 at 10:47:24PM -0400, Peter Hurley wrote: Hi James, On 09/04/2014 10:11 PM, James Bottomley wrote: On Thu, 2014-09-04 at 17:17 -0700, Paul E. McKenney wrote: +And there are anti-guarantees: + + (*) These guarantees do not apply to bitfields, because compilers often + generate code to modify these using non-atomic read-modify-write + sequences. Do not attempt to use bitfields to synchronize parallel + algorithms. + + (*) Even in cases where bitfields are protected by locks, all fields + in a given bitfield must be protected by one lock. If two fields + in a given bitfield are protected by different locks, the compiler's + non-atomic read-modify-write sequences can cause an update to one + field to corrupt the value of an adjacent field. + + (*) These guarantees apply only to properly aligned and sized scalar + variables. Properly sized currently means int and long, + because some CPU families do not support loads and stores of + other sizes. (Some CPU families is currently believed to + be only Alpha 21064. If this is actually the case, a different + non-guarantee is likely to be formulated.) This is a bit unclear. Presumably you're talking about definiteness of the outcome (as in what's seen after multiple stores to the same variable). No, the last conditions refers to adjacent byte stores from different cpu contexts (either interrupt or SMP). The guarantees are only for natural width on Parisc as well, so you would get a mess if you did byte stores to adjacent memory locations. For a simple test like: struct x { long a; char b; char c; char d; char e; }; void store_bc(struct x *p) { p-b = 1; p-c = 2; } on parisc, gcc generates separate byte stores void store_bc(struct x *p) { 0: 34 1c 00 02 ldi 1,ret0 4: 0f 5c 12 08 stb ret0,4(r26) 8: 34 1c 00 04 ldi 2,ret0 c: e8 40 c0 00 bv r0(rp) 10: 0f 5c 12 0a stb ret0,5(r26) which appears to confirm that on parisc adjacent byte data is safe from corruption by concurrent cpu updates; that is, CPU 0| CPU 1 | p-b = 1 | p-c = 2 | will result in p-b == 1 p-c == 2 (assume both values were 0 before the call to store_bc()). What Peter said. I would ask for suggestions for better wording, but I would much rather be able to say that single-byte reads and writes are atomic and that aligned-short reads and writes are also atomic. Thus far, it looks like we lose only very old Alpha systems, so unless I hear otherwise, I update my patch to outlaw these very old systems. This isn't universally true according to the architecture manual. The PARISC CPU can make byte to long word stores atomic against the memory bus but not against the I/O bus for instance. Atomicity is a property of the underlying substrate, not of the CPU. Implying that atomicity is a CPU property is incorrect. OK, fair point. But are there in-use-for-Linux PARISC memory fabrics (for normal memory, not I/O) that do not support single-byte and double-byte stores? For aligned access, I believe that's always the case for the memory bus (on both 32 and 64 bit systems). However, it only applies to machine instruction loads and stores of the same width.. If you mix the widths on the loads and stores, all bets are off. That means you have to beware of the gcc penchant for coalescing loads and stores: if it sees two adjacent byte stores it can coalesce them into a short store instead ... that screws up the atomicity guarantees. OK, that means that to make PARISC work reliably, we
Re: bit fields data tearing
How many PARISC systems do we have that actually do real work on Linux? On September 7, 2014 4:36:55 PM PDT, Paul E. McKenney paul...@linux.vnet.ibm.com wrote: On Sun, Sep 07, 2014 at 04:17:30PM -0700, H. Peter Anvin wrote: I'm confused why storing 0x0102 would be a problem. I think gcc does that even on other cpus. More atomicity can't hurt, can it? I must defer to James for any additional details on why PARISC systems don't provide atomicity for partially overlapping stores. ;-) Thanx, Paul On September 7, 2014 4:00:19 PM PDT, Paul E. McKenney paul...@linux.vnet.ibm.com wrote: On Sun, Sep 07, 2014 at 12:04:47PM -0700, James Bottomley wrote: On Sun, 2014-09-07 at 09:21 -0700, Paul E. McKenney wrote: On Sat, Sep 06, 2014 at 10:07:22PM -0700, James Bottomley wrote: On Thu, 2014-09-04 at 21:06 -0700, Paul E. McKenney wrote: On Thu, Sep 04, 2014 at 10:47:24PM -0400, Peter Hurley wrote: Hi James, On 09/04/2014 10:11 PM, James Bottomley wrote: On Thu, 2014-09-04 at 17:17 -0700, Paul E. McKenney wrote: +And there are anti-guarantees: + + (*) These guarantees do not apply to bitfields, because compilers often + generate code to modify these using non-atomic read-modify-write + sequences. Do not attempt to use bitfields to synchronize parallel + algorithms. + + (*) Even in cases where bitfields are protected by locks, all fields + in a given bitfield must be protected by one lock. If two fields + in a given bitfield are protected by different locks, the compiler's + non-atomic read-modify-write sequences can cause an update to one + field to corrupt the value of an adjacent field. + + (*) These guarantees apply only to properly aligned and sized scalar + variables. Properly sized currently means int and long, + because some CPU families do not support loads and stores of + other sizes. (Some CPU families is currently believed to + be only Alpha 21064. If this is actually the case, a different + non-guarantee is likely to be formulated.) This is a bit unclear. Presumably you're talking about definiteness of the outcome (as in what's seen after multiple stores to the same variable). No, the last conditions refers to adjacent byte stores from different cpu contexts (either interrupt or SMP). The guarantees are only for natural width on Parisc as well, so you would get a mess if you did byte stores to adjacent memory locations. For a simple test like: struct x { long a; char b; char c; char d; char e; }; void store_bc(struct x *p) { p-b = 1; p-c = 2; } on parisc, gcc generates separate byte stores void store_bc(struct x *p) { 0: 34 1c 00 02 ldi 1,ret0 4: 0f 5c 12 08 stb ret0,4(r26) 8: 34 1c 00 04 ldi 2,ret0 c: e8 40 c0 00 bv r0(rp) 10: 0f 5c 12 0a stb ret0,5(r26) which appears to confirm that on parisc adjacent byte data is safe from corruption by concurrent cpu updates; that is, CPU 0| CPU 1 | p-b = 1 | p-c = 2 | will result in p-b == 1 p-c == 2 (assume both values were 0 before the call to store_bc()). What Peter said. I would ask for suggestions for better wording, but I would much rather be able to say that single-byte reads and writes are atomic and that aligned-short reads and writes are also atomic. Thus far, it looks like we lose only very old Alpha systems, so unless I hear otherwise, I update my patch to outlaw these very old systems. This isn't universally true according to the architecture manual. The PARISC CPU can make byte to long word stores atomic against the memory bus but not against the I/O bus for instance. Atomicity is a property of the underlying substrate, not of the CPU. Implying that atomicity is a CPU property is incorrect. OK, fair point. But are there in-use-for-Linux PARISC memory fabrics (for normal memory, not I/O) that do not support single-byte and double-byte stores? For aligned access, I believe that's always the case for the memory bus (on both 32 and 64 bit systems). However, it only applies to machine instruction loads and stores of the same width.. If you mix the widths on the loads and stores, all bets are off. That means you have to beware of the gcc penchant for coalescing loads and stores: if it sees two adjacent byte stores it can
[PATCH v2 3/3] PCI/MSI: Remove arch_msi_check_device()
There are no archs that override arch_msi_check_device() hook. Remove it as it is completely redundant. If an arch would need to check MSI/MSI-X possibility for a device it should make it within arch_setup_msi_irqs() hook. Cc: Michael Ellerman m...@ellerman.id.au Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-...@vger.kernel.org Signed-off-by: Alexander Gordeev agord...@redhat.com --- drivers/pci/msi.c | 49 + include/linux/msi.h | 3 --- 2 files changed, 13 insertions(+), 39 deletions(-) diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 5a40516..6c2cc41 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -56,16 +56,6 @@ void __weak arch_teardown_msi_irq(unsigned int irq) chip-teardown_irq(chip, irq); } -int __weak arch_msi_check_device(struct pci_dev *dev, int nvec, int type) -{ - struct msi_chip *chip = dev-bus-msi; - - if (!chip || !chip-check_device) - return 0; - - return chip-check_device(chip, dev, nvec, type); -} - int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type) { struct msi_desc *entry; @@ -806,22 +796,23 @@ out_free: } /** - * pci_msi_check_device - check whether MSI may be enabled on a device + * pci_msi_supported - check whether MSI may be enabled on a device * @dev: pointer to the pci_dev data structure of MSI device function * @nvec: how many MSIs have been requested ? - * @type: are we checking for MSI or MSI-X ? * * Look at global flags, the device itself, and its parent buses * to determine if MSI/-X are supported for the device. If MSI/-X is * supported return 0, else return an error code. **/ -static int pci_msi_check_device(struct pci_dev *dev, int nvec, int type) +static int pci_msi_supported(struct pci_dev *dev, int nvec) { struct pci_bus *bus; - int ret; /* MSI must be globally enabled and supported by the device */ - if (!pci_msi_enable || !dev || dev-no_msi) + if (!pci_msi_enable) + return -EINVAL; + + if (!dev || dev-no_msi || dev-current_state != PCI_D0) return -EINVAL; /* @@ -843,10 +834,6 @@ static int pci_msi_check_device(struct pci_dev *dev, int nvec, int type) if (bus-bus_flags PCI_BUS_FLAGS_NO_MSI) return -EINVAL; - ret = arch_msi_check_device(dev, nvec, type); - if (ret) - return ret; - return 0; } @@ -949,13 +936,13 @@ int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec) int status, nr_entries; int i, j; - if (!entries || !dev-msix_cap || dev-current_state != PCI_D0) - return -EINVAL; - - status = pci_msi_check_device(dev, nvec, PCI_CAP_ID_MSIX); + status = pci_msi_supported(dev, nvec); if (status) return status; + if (!entries) + return -EINVAL; + nr_entries = pci_msix_vec_count(dev); if (nr_entries 0) return nr_entries; @@ -1062,8 +1049,9 @@ int pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec) int nvec; int rc; - if (dev-current_state != PCI_D0) - return -EINVAL; + rc = pci_msi_supported(dev, minvec); + if (rc) + return rc; WARN_ON(!!dev-msi_enabled); @@ -1086,17 +1074,6 @@ int pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec) nvec = maxvec; do { - rc = pci_msi_check_device(dev, nvec, PCI_CAP_ID_MSI); - if (rc 0) { - return rc; - } else if (rc 0) { - if (rc minvec) - return -ENOSPC; - nvec = rc; - } - } while (rc); - - do { rc = msi_capability_init(dev, nvec); if (rc 0) { return rc; diff --git a/include/linux/msi.h b/include/linux/msi.h index 8103f32..dbf7cc9 100644 --- a/include/linux/msi.h +++ b/include/linux/msi.h @@ -60,7 +60,6 @@ int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc); void arch_teardown_msi_irq(unsigned int irq); int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type); void arch_teardown_msi_irqs(struct pci_dev *dev); -int arch_msi_check_device(struct pci_dev* dev, int nvec, int type); void arch_restore_msi_irqs(struct pci_dev *dev); void default_teardown_msi_irqs(struct pci_dev *dev); @@ -77,8 +76,6 @@ struct msi_chip { int (*setup_irq)(struct msi_chip *chip, struct pci_dev *dev, struct msi_desc *desc); void (*teardown_irq)(struct msi_chip *chip, unsigned int irq); - int (*check_device)(struct msi_chip *chip, struct pci_dev *dev, - int nvec, int type); }; #endif /* LINUX_MSI_H */ -- 1.9.3
[PATCH v2 1/3] PCI/MSI/PPC: Remove arch_msi_check_device()
Moving MSI checks from arch_msi_check_device() function to arch_setup_msi_irqs() function makes code more compact and allows removing unnecessary hook arch_msi_check_device() from generic MSI code. Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-...@vger.kernel.org Cc: Michael Ellerman m...@ellerman.id.au Signed-off-by: Alexander Gordeev agord...@redhat.com --- arch/powerpc/include/asm/machdep.h | 2 -- arch/powerpc/kernel/msi.c | 12 +- arch/powerpc/platforms/cell/axon_msi.c | 9 arch/powerpc/platforms/powernv/pci.c | 19 --- arch/powerpc/platforms/pseries/msi.c | 42 +- arch/powerpc/sysdev/fsl_msi.c | 12 +++--- arch/powerpc/sysdev/mpic_pasemi_msi.c | 11 ++--- arch/powerpc/sysdev/mpic_u3msi.c | 28 +-- arch/powerpc/sysdev/ppc4xx_hsta_msi.c | 18 +-- arch/powerpc/sysdev/ppc4xx_msi.c | 19 +-- 10 files changed, 50 insertions(+), 122 deletions(-) diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h index b125cea..3af7216 100644 --- a/arch/powerpc/include/asm/machdep.h +++ b/arch/powerpc/include/asm/machdep.h @@ -136,8 +136,6 @@ struct machdep_calls { int (*pci_setup_phb)(struct pci_controller *host); #ifdef CONFIG_PCI_MSI - int (*msi_check_device)(struct pci_dev* dev, - int nvec, int type); int (*setup_msi_irqs)(struct pci_dev *dev, int nvec, int type); void(*teardown_msi_irqs)(struct pci_dev *dev); diff --git a/arch/powerpc/kernel/msi.c b/arch/powerpc/kernel/msi.c index 8bbc12d..71bd161 100644 --- a/arch/powerpc/kernel/msi.c +++ b/arch/powerpc/kernel/msi.c @@ -13,7 +13,7 @@ #include asm/machdep.h -int arch_msi_check_device(struct pci_dev* dev, int nvec, int type) +int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type) { if (!ppc_md.setup_msi_irqs || !ppc_md.teardown_msi_irqs) { pr_debug(msi: Platform doesn't provide MSI callbacks.\n); @@ -24,16 +24,6 @@ int arch_msi_check_device(struct pci_dev* dev, int nvec, int type) if (type == PCI_CAP_ID_MSI nvec 1) return 1; - if (ppc_md.msi_check_device) { - pr_debug(msi: Using platform check routine.\n); - return ppc_md.msi_check_device(dev, nvec, type); - } - -return 0; -} - -int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type) -{ return ppc_md.setup_msi_irqs(dev, nvec, type); } diff --git a/arch/powerpc/platforms/cell/axon_msi.c b/arch/powerpc/platforms/cell/axon_msi.c index 85825b5..862b327 100644 --- a/arch/powerpc/platforms/cell/axon_msi.c +++ b/arch/powerpc/platforms/cell/axon_msi.c @@ -199,14 +199,6 @@ out_error: return msic; } -static int axon_msi_check_device(struct pci_dev *dev, int nvec, int type) -{ - if (!find_msi_translator(dev)) - return -ENODEV; - - return 0; -} - static int setup_msi_msg_address(struct pci_dev *dev, struct msi_msg *msg) { struct device_node *dn; @@ -416,7 +408,6 @@ static int axon_msi_probe(struct platform_device *device) ppc_md.setup_msi_irqs = axon_msi_setup_msi_irqs; ppc_md.teardown_msi_irqs = axon_msi_teardown_msi_irqs; - ppc_md.msi_check_device = axon_msi_check_device; axon_msi_debug_setup(dn, msic); diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c index b854b57..b45c492 100644 --- a/arch/powerpc/platforms/powernv/pci.c +++ b/arch/powerpc/platforms/powernv/pci.c @@ -46,29 +46,21 @@ //#define cfg_dbg(fmt...) printk(fmt) #ifdef CONFIG_PCI_MSI -static int pnv_msi_check_device(struct pci_dev* pdev, int nvec, int type) -{ - struct pci_controller *hose = pci_bus_to_host(pdev-bus); - struct pnv_phb *phb = hose-private_data; - struct pci_dn *pdn = pci_get_pdn(pdev); - - if (pdn pdn-force_32bit_msi !phb-msi32_support) - return -ENODEV; - - return (phb phb-msi_bmp.bitmap) ? 0 : -ENODEV; -} - static int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type) { struct pci_controller *hose = pci_bus_to_host(pdev-bus); struct pnv_phb *phb = hose-private_data; + struct pci_dn *pdn = pci_get_pdn(pdev); struct msi_desc *entry; struct msi_msg msg; int hwirq; unsigned int virq; int rc; - if (WARN_ON(!phb)) + if (WARN_ON(!phb) || !phb-msi_bmp.bitmap) + return -ENODEV; + + if (pdn pdn-force_32bit_msi !phb-msi32_support) return -ENODEV; list_for_each_entry(entry, pdev-msi_list, list) { @@ -860,7 +852,6 @@ void __init pnv_pci_init(void) /* Configure MSIs */ #ifdef CONFIG_PCI_MSI - ppc_md.msi_check_device = pnv_msi_check_device;
[PATCH v2 0/3] PCI/MSI: Remove arch_msi_check_device()
Hello, This is a cleanup effort to get rid of arch_msi_check_device() function. I am sending v2 series, since kbuild for v1 reports compile errors on ppc4xx and Armada 370. Still, I have not checked the fixes on these platforms. Changes since v1: - patch 1: 'pdev' undeclared compile error fixed on ppc4xx. I updated changelog and removed ACK from this patch in case there are any objections; - patch 2: 'struct msi_chip' has no 'check_device' error fixed on Armada 370 - armada_370_xp_check_msi_device() hook removed; - patch 3: msi_check_device() renamed to pci_msi_supported(). This is the very same patch I sent earlier; Thanks! Cc: Thomas Gleixner t...@linutronix.de Cc: Jason Cooper ja...@lakedaemon.net Cc: Michael Ellerman m...@ellerman.id.au Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-...@vger.kernel.org Alexander Gordeev (3): patch 1 - PCI/MSI/PPC: Remove arch_msi_check_device() patch 2 - PCI/MSI/Armada-370-xp: Remove arch_msi_check_device() patch 3 - PCI/MSI: Remove arch_msi_check_device() arch/powerpc/include/asm/machdep.h | 2 -- arch/powerpc/kernel/msi.c | 12 + arch/powerpc/platforms/cell/axon_msi.c | 9 --- arch/powerpc/platforms/powernv/pci.c | 19 - arch/powerpc/platforms/pseries/msi.c | 42 +++-- arch/powerpc/sysdev/fsl_msi.c | 12 +++-- arch/powerpc/sysdev/mpic_pasemi_msi.c | 11 ++-- arch/powerpc/sysdev/mpic_u3msi.c | 28 --- arch/powerpc/sysdev/ppc4xx_hsta_msi.c | 18 + arch/powerpc/sysdev/ppc4xx_msi.c | 19 + drivers/irqchip/irq-armada-370-xp.c| 14 +++--- drivers/pci/msi.c | 49 +- include/linux/msi.h| 3 --- 13 files changed, 67 insertions(+), 171 deletions(-) -- 1.9.3 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: bit fields data tearing
On Sun, 2014-09-07 at 16:41 -0400, Peter Hurley wrote: On 09/07/2014 03:04 PM, James Bottomley wrote: On Sun, 2014-09-07 at 09:21 -0700, Paul E. McKenney wrote: On Sat, Sep 06, 2014 at 10:07:22PM -0700, James Bottomley wrote: On Thu, 2014-09-04 at 21:06 -0700, Paul E. McKenney wrote: On Thu, Sep 04, 2014 at 10:47:24PM -0400, Peter Hurley wrote: Hi James, On 09/04/2014 10:11 PM, James Bottomley wrote: On Thu, 2014-09-04 at 17:17 -0700, Paul E. McKenney wrote: +And there are anti-guarantees: + + (*) These guarantees do not apply to bitfields, because compilers often + generate code to modify these using non-atomic read-modify-write + sequences. Do not attempt to use bitfields to synchronize parallel + algorithms. + + (*) Even in cases where bitfields are protected by locks, all fields + in a given bitfield must be protected by one lock. If two fields + in a given bitfield are protected by different locks, the compiler's + non-atomic read-modify-write sequences can cause an update to one + field to corrupt the value of an adjacent field. + + (*) These guarantees apply only to properly aligned and sized scalar + variables. Properly sized currently means int and long, + because some CPU families do not support loads and stores of + other sizes. (Some CPU families is currently believed to + be only Alpha 21064. If this is actually the case, a different + non-guarantee is likely to be formulated.) This is a bit unclear. Presumably you're talking about definiteness of the outcome (as in what's seen after multiple stores to the same variable). No, the last conditions refers to adjacent byte stores from different cpu contexts (either interrupt or SMP). The guarantees are only for natural width on Parisc as well, so you would get a mess if you did byte stores to adjacent memory locations. For a simple test like: struct x { long a; char b; char c; char d; char e; }; void store_bc(struct x *p) { p-b = 1; p-c = 2; } on parisc, gcc generates separate byte stores void store_bc(struct x *p) { 0: 34 1c 00 02 ldi 1,ret0 4: 0f 5c 12 08 stb ret0,4(r26) 8: 34 1c 00 04 ldi 2,ret0 c: e8 40 c0 00 bv r0(rp) 10: 0f 5c 12 0a stb ret0,5(r26) which appears to confirm that on parisc adjacent byte data is safe from corruption by concurrent cpu updates; that is, CPU 0| CPU 1 | p-b = 1 | p-c = 2 | will result in p-b == 1 p-c == 2 (assume both values were 0 before the call to store_bc()). What Peter said. I would ask for suggestions for better wording, but I would much rather be able to say that single-byte reads and writes are atomic and that aligned-short reads and writes are also atomic. Thus far, it looks like we lose only very old Alpha systems, so unless I hear otherwise, I update my patch to outlaw these very old systems. This isn't universally true according to the architecture manual. The PARISC CPU can make byte to long word stores atomic against the memory bus but not against the I/O bus for instance. Atomicity is a property of the underlying substrate, not of the CPU. Implying that atomicity is a CPU property is incorrect. To go back to this briefly, while it's true that atomicity is not a CPU property, atomicity is not possible if the CPU is not cooperating. You mean if it doesn't have the bus logic to emit? Like ia32 mostly can't do 64 bit writes ... sure. OK, fair point. But are there in-use-for-Linux PARISC memory fabrics (for normal memory, not I/O) that do not support single-byte and double-byte stores? For aligned access, I believe that's always the case for the memory bus (on both 32 and 64 bit systems). However, it only applies to machine instruction loads and stores of the same width.. If you mix the widths on the loads and stores, all bets are off. That means you have to beware of the gcc penchant for coalescing loads and stores: if it sees two adjacent byte stores it can coalesce them into a short store instead ... that screws up the atomicity guarantees. Two things: I think that gcc has given up on combining adjacent writes, perhaps because unaligned writes on some arches are prohibitive, so whatever minor optimization was believed to be gained was quickly lost, multi-fold. (Although this might not be the reason since one would expect that gcc would know the alignment of the promoted store). Um, gcc assumes architecturally correct alignment; that's why it pads structures. Only when accessing packed structures will it use the lowest unit load/store. if you have a struct { char a, char b }; and load first a then b with a constant gcc will obligingly optimise to a
Re: bit fields data tearing
On Sun, 2014-09-07 at 16:39 -0700, H. Peter Anvin wrote: How many PARISC systems do we have that actually do real work on Linux? I'd be very surprised if this problem didn't exist on all alignment requiring architectures, like PPC and Sparc as well. I know it would be very convenient if all the world were an x86 ... but it would also be very boring as well. The rules for coping with it are well known and a relaxation of what we currently do in the kernel, so I don't see what the actual problem is. In the context of this thread, PA can't do atomic bit sets (no atomic RMW except the ldcw operation) it can do atomic writes to fundamental sizes (byte, short, int, long) provided gcc emits the correct primitive (there are lots of gotchas in this, but that's not an architectural problem). These atomicity guarantees depend on the underlying storage and are respected for main memory but not for any other type of bus. James On September 7, 2014 4:36:55 PM PDT, Paul E. McKenney paul...@linux.vnet.ibm.com wrote: On Sun, Sep 07, 2014 at 04:17:30PM -0700, H. Peter Anvin wrote: I'm confused why storing 0x0102 would be a problem. I think gcc does that even on other cpus. More atomicity can't hurt, can it? I must defer to James for any additional details on why PARISC systems don't provide atomicity for partially overlapping stores. ;-) Thanx, Paul On September 7, 2014 4:00:19 PM PDT, Paul E. McKenney paul...@linux.vnet.ibm.com wrote: On Sun, Sep 07, 2014 at 12:04:47PM -0700, James Bottomley wrote: On Sun, 2014-09-07 at 09:21 -0700, Paul E. McKenney wrote: On Sat, Sep 06, 2014 at 10:07:22PM -0700, James Bottomley wrote: On Thu, 2014-09-04 at 21:06 -0700, Paul E. McKenney wrote: On Thu, Sep 04, 2014 at 10:47:24PM -0400, Peter Hurley wrote: Hi James, On 09/04/2014 10:11 PM, James Bottomley wrote: On Thu, 2014-09-04 at 17:17 -0700, Paul E. McKenney wrote: +And there are anti-guarantees: + + (*) These guarantees do not apply to bitfields, because compilers often + generate code to modify these using non-atomic read-modify-write + sequences. Do not attempt to use bitfields to synchronize parallel + algorithms. + + (*) Even in cases where bitfields are protected by locks, all fields + in a given bitfield must be protected by one lock. If two fields + in a given bitfield are protected by different locks, the compiler's + non-atomic read-modify-write sequences can cause an update to one + field to corrupt the value of an adjacent field. + + (*) These guarantees apply only to properly aligned and sized scalar + variables. Properly sized currently means int and long, + because some CPU families do not support loads and stores of + other sizes. (Some CPU families is currently believed to + be only Alpha 21064. If this is actually the case, a different + non-guarantee is likely to be formulated.) This is a bit unclear. Presumably you're talking about definiteness of the outcome (as in what's seen after multiple stores to the same variable). No, the last conditions refers to adjacent byte stores from different cpu contexts (either interrupt or SMP). The guarantees are only for natural width on Parisc as well, so you would get a mess if you did byte stores to adjacent memory locations. For a simple test like: struct x { long a; char b; char c; char d; char e; }; void store_bc(struct x *p) { p-b = 1; p-c = 2; } on parisc, gcc generates separate byte stores void store_bc(struct x *p) { 0: 34 1c 00 02 ldi 1,ret0 4: 0f 5c 12 08 stb ret0,4(r26) 8: 34 1c 00 04 ldi 2,ret0 c: e8 40 c0 00 bv r0(rp) 10: 0f 5c 12 0a stb ret0,5(r26) which appears to confirm that on parisc adjacent byte data is safe from corruption by concurrent cpu updates; that is, CPU 0| CPU 1 | p-b = 1 | p-c = 2 | will result in p-b == 1 p-c == 2 (assume both values were 0 before the call to store_bc()). What Peter said. I would ask for suggestions for better wording, but I would much rather be able to say that single-byte reads and writes are atomic and that aligned-short reads and writes are also atomic. Thus far, it looks like we lose only very