Re: [PATCH v3] powerpc/64: Fix memcmp reading past the end of src/dest
On Tue, Mar 26, 2019 at 08:18:07PM +1100, Michael Ellerman wrote:
> Segher Boessenkool writes:
> > On Mon, Mar 25, 2019 at 11:33:56PM +1100, Michael Ellerman wrote:
> >> Segher Boessenkool writes:
> >> > On Fri, Mar 22, 2019 at 11:37:24PM +1100, Michael Ellerman wrote:
> >> >> +	clrldi	r6,r4,(64-12)	// r6 = r4 & 0xfff
> >> >
> >> > You can just write
> >> >   rlwinm r6,r4,0,0x0fff
> >> > if that is clearer?  Or do you still want a comment with that :-)
> >>
> >> I don't think it's clearer doing a rotate of zero bits :)
> >>
> >> And yeah I'd probably still leave the comment, so I'm inclined to stick
> >> with the clrldi?
> >
> > I always have to think what the clrldi etc. do exactly, while with rlwinm
> > it is obvious.  But yeah this may be different for other people who are
> > used to different idiom.
>
> Interesting, I'm the opposite. You know ppc assembler better than me so
> I guess I just need to spend more time on it and embrace the zen of the
> rotate instructions.

There is only one rlwinm instruction, but there are 9 extended mnemonics
for it.  (And similarly for the 64-bit ops, where there are 3 basic
mnemonics but 9 extended mnemonics.)

Of course I use {s,rot}{l,r}{w,d}i, those are actually *simpler* than
rlwinm / rldic{,l,r}, but not the other stuff.

It may also be because the disassembler doesn't show these things.  Dunno.

> I'm not sure it's vastly more hostile though than `andi. 6,4,0x` being
> valid but `andi. 6,4,0x1` being not valid.

But that is the same for *all* UIMM insns.

> Yeah, I guess a new `andi` instruction is the only real answer :)

Too bad there is no space in the opcode map for it, already not in the
original POWER architecture (where this is "andil.") :-/


Segher
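[Editor's note: a hypothetical C helper, not part of the thread, illustrating the "contiguous mask" restriction an rlwinm-backed `andi` macro would have to enforce. A 32-bit mask fits rlwinm's MB/ME fields exactly when its 1-bits form a single contiguous run, possibly wrapping around bit 0; the function names are invented for this sketch.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when the set bits of x form one contiguous, non-wrapping run.
 * Adding the lowest set bit to such an x yields a power of two (or
 * overflows to 0 for a run reaching bit 31). */
static bool is_run(uint32_t x)
{
	uint32_t t = x + (x & -x);

	return x && (t & (t - 1)) == 0;
}

/* A mask is encodable as rlwinm's MB/ME mask iff it is a plain run or a
 * wrapping run (whose complement is then a plain run); 0 never is. */
static bool rlwinm_mask_ok(uint32_t m)
{
	return m && (is_run(m) || is_run(~m));
}
```

So 0x0fff is encodable, the 0x0f0f from the discussion above is not, and a wrapping mask like 0x80000001 is.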
Re: [PATCH v3] powerpc/64: Fix memcmp reading past the end of src/dest
Segher Boessenkool writes:
> On Mon, Mar 25, 2019 at 11:33:56PM +1100, Michael Ellerman wrote:
>> Segher Boessenkool writes:
>> > On Fri, Mar 22, 2019 at 11:37:24PM +1100, Michael Ellerman wrote:
>> >> +	clrldi	r6,r4,(64-12)	// r6 = r4 & 0xfff
>> >
>> > You can just write
>> >   rlwinm r6,r4,0,0x0fff
>> > if that is clearer?  Or do you still want a comment with that :-)
>>
>> I don't think it's clearer doing a rotate of zero bits :)
>>
>> And yeah I'd probably still leave the comment, so I'm inclined to stick
>> with the clrldi?
>
> I always have to think what the clrldi etc. do exactly, while with rlwinm
> it is obvious.  But yeah this may be different for other people who are
> used to different idiom.

Interesting, I'm the opposite. You know ppc assembler better than me so
I guess I just need to spend more time on it and embrace the zen of the
rotate instructions.

>> Would be nice if the assembler could support:
>>
>>   andi	r6, r4, 0x0fff
>>
>> And turn it into the rlwinm, or rldicl :)
>
> The extended mnemonics are *simple*, *one-to-one* mappings.

It would still be simple and 1:1, but would only be valid for certain
constants :)

> Having "andi. 6,4,0x0f0f" a valid insn, but an extended mnemonic
> "andi 6,4,0x0f0f" that is not (and the other way around for say 0xffff)
> would violate that.

I agree that's a bit of a foot gun.

I'm not sure it's vastly more hostile though than `andi. 6,4,0x` being
valid but `andi. 6,4,0x1` being not valid.

The assembler could print a nice error saying you need to use a
contiguous mask. And I mean how often do you andi. with a mask that
isn't contiguous?

> You could do some assembler macro, that can also expand to multiple insns
> where that is useful.  Also one for loading constants, etc.  The downside
> to that is you often do care how many insns are generated.
>
> Instead you could do a macro for only those cases that can be done with
> *one* insn.  But that then is pretty restricted in use, and people have
> to learn what values are valid.
>
> I don't see a perfect solution.

Yeah, I guess a new `andi` instruction is the only real answer :)

cheers
Re: [PATCH v3] powerpc/64: Fix memcmp reading past the end of src/dest
On Mon, Mar 25, 2019 at 11:33:56PM +1100, Michael Ellerman wrote:
> Segher Boessenkool writes:
> > On Fri, Mar 22, 2019 at 11:37:24PM +1100, Michael Ellerman wrote:
> >> +	clrldi	r6,r4,(64-12)	// r6 = r4 & 0xfff
> >
> > You can just write
> >   rlwinm r6,r4,0,0x0fff
> > if that is clearer?  Or do you still want a comment with that :-)
>
> I don't think it's clearer doing a rotate of zero bits :)
>
> And yeah I'd probably still leave the comment, so I'm inclined to stick
> with the clrldi?

I always have to think what the clrldi etc. do exactly, while with rlwinm
it is obvious.  But yeah this may be different for other people who are
used to different idiom.

> Would be nice if the assembler could support:
>
>   andi	r6, r4, 0x0fff
>
> And turn it into the rlwinm, or rldicl :)

The extended mnemonics are *simple*, *one-to-one* mappings.  Having
"andi. 6,4,0x0f0f" a valid insn, but an extended mnemonic "andi 6,4,0x0f0f"
that is not (and the other way around for say 0xffff) would violate that.

You could do some assembler macro, that can also expand to multiple insns
where that is useful.  Also one for loading constants, etc.  The downside
to that is you often do care how many insns are generated.

Instead you could do a macro for only those cases that can be done with
*one* insn.  But that then is pretty restricted in use, and people have
to learn what values are valid.

I don't see a perfect solution.


Segher
Re: [PATCH v3] powerpc/64: Fix memcmp reading past the end of src/dest
Segher Boessenkool writes:
> On Fri, Mar 22, 2019 at 11:37:24PM +1100, Michael Ellerman wrote:
>> .Lcmp_rest_lt8bytes:
>> -	/* Here we have only less than 8 bytes to compare with. at least s1
>> -	 * Address is aligned with 8 bytes.
>> -	 * The next double words are load and shift right with appropriate
>> -	 * bits.
>> +	/*
>> +	 * Here we have less than 8 bytes to compare. At least s1 is aligned to
>> +	 * 8 bytes, but s2 may not be. We must make sure s2 + 8 doesn't cross a
>
> "s2 + 7"?  The code is fine though (bgt, not bge).

Duh, thanks for catching it.

>> +	 * page boundary, otherwise we might read past the end of the buffer and
>> +	 * trigger a page fault. We use 4K as the conservative minimum page
>> +	 * size. If we detect that case we go to the byte-by-byte loop.
>> +	 *
>> +	 * Otherwise the next double word is loaded from s1 and s2, and shifted
>> +	 * right to compare the appropriate bits.
>> 	 */
>> +	clrldi	r6,r4,(64-12)	// r6 = r4 & 0xfff
>
> You can just write
>   rlwinm r6,r4,0,0x0fff
> if that is clearer?  Or do you still want a comment with that :-)

I don't think it's clearer doing a rotate of zero bits :)

And yeah I'd probably still leave the comment, so I'm inclined to stick
with the clrldi?

Would be nice if the assembler could support:

  andi	r6, r4, 0x0fff

And turn it into the rlwinm, or rldicl :)

>> +	cmpdi	r6,0xff8
>> +	bgt	.Lshort
>
> Reviewed-by: Segher Boessenkool

I'll fixup the comment.

Thanks.

cheers
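[Editor's note: a small C model, not from the thread, of the instruction being debated. `clrldi rD,rS,n` clears the high n bits of rS, so `clrldi r6,r4,(64-12)` computes exactly `r4 & 0xfff`; the rlwinm spelling with a rotate of zero and mask 0x0fff produces the same result.]

```c
#include <assert.h>
#include <stdint.h>

/* Model of clrldi rD,rS,n for 0 <= n < 64: shift the top n bits out and
 * back in as zeros (logical shifts, since the operand is unsigned). */
static uint64_t clrldi(uint64_t rs, unsigned n)
{
	return (rs << n) >> n;
}
```

For example, `clrldi(x, 64 - 12)` keeps only the low 12 bits of x, i.e. the page offset the patch compares against 0xff8.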
Re: [PATCH v3] powerpc/64: Fix memcmp reading past the end of src/dest
On Friday, March 22, 2019 6:07:24 PM IST Michael Ellerman wrote:
> Chandan reported that fstests' generic/026 test hit a crash:
>
>   BUG: Unable to handle kernel data access at 0xc0062ac4
>   Faulting instruction address: 0xc0092240
>   Oops: Kernel access of bad area, sig: 11 [#1]
>   LE SMP NR_CPUS=2048 DEBUG_PAGEALLOC NUMA pSeries
>   CPU: 0 PID: 27828 Comm: chacl Not tainted 5.0.0-rc2-next-20190115-1-g6de6dba64dda #1
>   NIP: c0092240 LR: c066a55c CTR:
>   REGS: c0062c0c3430 TRAP: 0300 Not tainted (5.0.0-rc2-next-20190115-1-g6de6dba64dda)
>   MSR: 82009033 CR: 44000842 XER: 2000
>   CFAR: 7fff7f3108ac DAR: c0062ac4 DSISR: 4000 IRQMASK: 0
>   GPR00: c0062c0c36c0 c17f4c00 c121a660
>   GPR04: c0062ac3fff9 0004 0020 275b19c4
>   GPR08: 000c 46494c45 5347495f41434c5f c26073a0
>   GPR12: c27a
>   GPR16:
>   GPR20: c0062ea70020 c0062c0c38d0 0002 0002
>   GPR24: c0062ac3ffe8 275b19c4 0001 c0062ac3
>   GPR28: c0062c0c38d0 c0062ac30050 c0062ac30058
>   NIP memcmp+0x120/0x690
>   LR xfs_attr3_leaf_lookup_int+0x53c/0x5b0
>   Call Trace:
>     xfs_attr3_leaf_lookup_int+0x78/0x5b0 (unreliable)
>     xfs_da3_node_lookup_int+0x32c/0x5a0
>     xfs_attr_node_addname+0x170/0x6b0
>     xfs_attr_set+0x2ac/0x340
>     __xfs_set_acl+0xf0/0x230
>     xfs_set_acl+0xd0/0x160
>     set_posix_acl+0xc0/0x130
>     posix_acl_xattr_set+0x68/0x110
>     __vfs_setxattr+0xa4/0x110
>     __vfs_setxattr_noperm+0xac/0x240
>     vfs_setxattr+0x128/0x130
>     setxattr+0x248/0x600
>     path_setxattr+0x108/0x120
>     sys_setxattr+0x28/0x40
>     system_call+0x5c/0x70
>   Instruction dump:
>   7d201c28 7d402428 7c295040 38630008 38840008 408201f0 4200ffe8 2c05
>   4182ff6c 20c50008 54c61838 7d201c28 <7d402428> 7d293436 7d4a3436 7c295040
>
> The instruction dump decodes as:
>   subfic  r6,r5,8
>   rlwinm  r6,r6,3,0,28
>   ldbrx   r9,0,r3
>   ldbrx   r10,0,r4	<-
>
> Which shows us doing an 8 byte load from c0062ac3fff9, which
> crosses the page boundary at c0062ac4 and faults.
>
> It's not OK for memcmp to read past the end of the source or
> destination buffers if that would cross a page boundary, because we
> don't know that the next page is mapped.
>
> As pointed out by Segher, we can read past the end of the source or
> destination as long as we don't cross a 4K boundary, because that's
> our minimum page size on all platforms.
>
> The bug is in the code at the .Lcmp_rest_lt8bytes label. When we get
> there we know that s1 is 8-byte aligned and we have at least 1 byte to
> read, so a single 8-byte load won't read past the end of s1 and cross
> a page boundary.
>
> But we have to be more careful with s2. So check if it's within 8
> bytes of a 4K boundary and if so go to the byte-by-byte loop.
>
> Fixes: 2d9ee327adce ("powerpc/64: Align bytes before fall back to .Lshort in
> powerpc64 memcmp()")
> Cc: sta...@vger.kernel.org # v4.19+
> Reported-by: Chandan Rajendra
> Signed-off-by: Michael Ellerman

For unknown reasons, I am unable to recreate this bug on the unmodified
next-20190115 which was the original kernel I had found this bug on.

FWIW, I have executed generic/026 on a next-20190115 kernel with this
patch applied and I wasn't able to recreate the bug. Hence,

Tested-by: Chandan Rajendra

--
chandan
Re: [PATCH v3] powerpc/64: Fix memcmp reading past the end of src/dest
On Fri, Mar 22, 2019 at 11:37:24PM +1100, Michael Ellerman wrote:
> .Lcmp_rest_lt8bytes:
> -	/* Here we have only less than 8 bytes to compare with. at least s1
> -	 * Address is aligned with 8 bytes.
> -	 * The next double words are load and shift right with appropriate
> -	 * bits.
> +	/*
> +	 * Here we have less than 8 bytes to compare. At least s1 is aligned to
> +	 * 8 bytes, but s2 may not be. We must make sure s2 + 8 doesn't cross a

"s2 + 7"?  The code is fine though (bgt, not bge).

> +	 * page boundary, otherwise we might read past the end of the buffer and
> +	 * trigger a page fault. We use 4K as the conservative minimum page
> +	 * size. If we detect that case we go to the byte-by-byte loop.
> +	 *
> +	 * Otherwise the next double word is loaded from s1 and s2, and shifted
> +	 * right to compare the appropriate bits.
> 	 */
> +	clrldi	r6,r4,(64-12)	// r6 = r4 & 0xfff

You can just write
  rlwinm r6,r4,0,0x0fff
if that is clearer?  Or do you still want a comment with that :-)

> +	cmpdi	r6,0xff8
> +	bgt	.Lshort

Reviewed-by: Segher Boessenkool


Segher
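[Editor's note: a C sketch, not from the thread, of the guard under review and of Segher's "s2 + 7" observation. An 8-byte load at addr touches bytes addr..addr+7, so it crosses a 4K boundary exactly when the page offset exceeds 0xff8, which is why the code uses bgt rather than bge.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when an 8-byte load starting at addr would cross a 4K boundary
 * (4K being the conservative minimum page size the patch assumes).
 * Mirrors the clrldi/cmpdi/bgt sequence. */
static bool load8_crosses_4k(uintptr_t addr)
{
	return (addr & 0xfff) > 0xff8;	/* load covers addr..addr+7 */
}
```

Offset 0xff8 is still safe (the load covers bytes 0xff8..0xfff of the page); 0xff9 is the first offset that spills into the next page.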
[PATCH v3] powerpc/64: Fix memcmp reading past the end of src/dest
Chandan reported that fstests' generic/026 test hit a crash:

  BUG: Unable to handle kernel data access at 0xc0062ac4
  Faulting instruction address: 0xc0092240
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE SMP NR_CPUS=2048 DEBUG_PAGEALLOC NUMA pSeries
  CPU: 0 PID: 27828 Comm: chacl Not tainted 5.0.0-rc2-next-20190115-1-g6de6dba64dda #1
  NIP: c0092240 LR: c066a55c CTR:
  REGS: c0062c0c3430 TRAP: 0300 Not tainted (5.0.0-rc2-next-20190115-1-g6de6dba64dda)
  MSR: 82009033 CR: 44000842 XER: 2000
  CFAR: 7fff7f3108ac DAR: c0062ac4 DSISR: 4000 IRQMASK: 0
  GPR00: c0062c0c36c0 c17f4c00 c121a660
  GPR04: c0062ac3fff9 0004 0020 275b19c4
  GPR08: 000c 46494c45 5347495f41434c5f c26073a0
  GPR12: c27a
  GPR16:
  GPR20: c0062ea70020 c0062c0c38d0 0002 0002
  GPR24: c0062ac3ffe8 275b19c4 0001 c0062ac3
  GPR28: c0062c0c38d0 c0062ac30050 c0062ac30058
  NIP memcmp+0x120/0x690
  LR xfs_attr3_leaf_lookup_int+0x53c/0x5b0
  Call Trace:
    xfs_attr3_leaf_lookup_int+0x78/0x5b0 (unreliable)
    xfs_da3_node_lookup_int+0x32c/0x5a0
    xfs_attr_node_addname+0x170/0x6b0
    xfs_attr_set+0x2ac/0x340
    __xfs_set_acl+0xf0/0x230
    xfs_set_acl+0xd0/0x160
    set_posix_acl+0xc0/0x130
    posix_acl_xattr_set+0x68/0x110
    __vfs_setxattr+0xa4/0x110
    __vfs_setxattr_noperm+0xac/0x240
    vfs_setxattr+0x128/0x130
    setxattr+0x248/0x600
    path_setxattr+0x108/0x120
    sys_setxattr+0x28/0x40
    system_call+0x5c/0x70
  Instruction dump:
  7d201c28 7d402428 7c295040 38630008 38840008 408201f0 4200ffe8 2c05
  4182ff6c 20c50008 54c61838 7d201c28 <7d402428> 7d293436 7d4a3436 7c295040

The instruction dump decodes as:
  subfic  r6,r5,8
  rlwinm  r6,r6,3,0,28
  ldbrx   r9,0,r3
  ldbrx   r10,0,r4	<-

Which shows us doing an 8 byte load from c0062ac3fff9, which
crosses the page boundary at c0062ac4 and faults.

It's not OK for memcmp to read past the end of the source or
destination buffers if that would cross a page boundary, because we
don't know that the next page is mapped.

As pointed out by Segher, we can read past the end of the source or
destination as long as we don't cross a 4K boundary, because that's
our minimum page size on all platforms.

The bug is in the code at the .Lcmp_rest_lt8bytes label. When we get
there we know that s1 is 8-byte aligned and we have at least 1 byte to
read, so a single 8-byte load won't read past the end of s1 and cross
a page boundary.

But we have to be more careful with s2. So check if it's within 8
bytes of a 4K boundary and if so go to the byte-by-byte loop.

Fixes: 2d9ee327adce ("powerpc/64: Align bytes before fall back to .Lshort in powerpc64 memcmp()")
Cc: sta...@vger.kernel.org # v4.19+
Reported-by: Chandan Rajendra
Signed-off-by: Michael Ellerman
---
 arch/powerpc/lib/memcmp_64.S | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

v3: Just check if we're crossing a 4K boundary.

Oops, I wrote this a while back but forgot to send v3.

diff --git a/arch/powerpc/lib/memcmp_64.S b/arch/powerpc/lib/memcmp_64.S
index 844d8e774492..f6554ebebeb5 100644
--- a/arch/powerpc/lib/memcmp_64.S
+++ b/arch/powerpc/lib/memcmp_64.S
@@ -215,11 +215,20 @@ _GLOBAL_TOC(memcmp)
 	beq	.Lzero
 
 .Lcmp_rest_lt8bytes:
-	/* Here we have only less than 8 bytes to compare with. at least s1
-	 * Address is aligned with 8 bytes.
-	 * The next double words are load and shift right with appropriate
-	 * bits.
+	/*
+	 * Here we have less than 8 bytes to compare. At least s1 is aligned to
+	 * 8 bytes, but s2 may not be. We must make sure s2 + 8 doesn't cross a
+	 * page boundary, otherwise we might read past the end of the buffer and
+	 * trigger a page fault. We use 4K as the conservative minimum page
+	 * size. If we detect that case we go to the byte-by-byte loop.
+	 *
+	 * Otherwise the next double word is loaded from s1 and s2, and shifted
+	 * right to compare the appropriate bits.
 	 */
+	clrldi	r6,r4,(64-12)	// r6 = r4 & 0xfff
+	cmpdi	r6,0xff8
+	bgt	.Lshort
+
 	subfic	r6,r5,8
 	slwi	r6,r6,3
 	LD	rA,0,r3
-- 
2.20.1
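[Editor's note: an illustrative C analogue, not from the patch, of the fixed .Lcmp_rest_lt8bytes logic. The real code is assembly; this only models the control flow for a tail of n < 8 bytes: fall back to a byte loop when s2 is within 8 bytes of a 4K boundary, otherwise do one over-long 8-byte load per side and mask off the bytes past n. The over-read is page-safe for the asm but out of bounds in strict C, so treat this purely as a model; the function name is invented.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static int tail_cmp_lt8(const unsigned char *s1, const unsigned char *s2,
			size_t n)
{
	uint64_t a, b, keep;
	size_t i;

	if (((uintptr_t)s2 & 0xfff) > 0xff8)	/* s2 + 7 may cross a page */
		goto bytewise;

	memcpy(&a, s1, 8);			/* over-long 8-byte loads */
	memcpy(&b, s2, 8);
	/* keep only the low n bytes (little-endian layout assumed) */
	keep = n >= 8 ? ~0ULL : (1ULL << (8 * n)) - 1;
	if ((a & keep) == (b & keep))
		return 0;
bytewise:
	/* near a page boundary, or when a difference was found: go byte
	 * by byte so the first differing byte decides the result */
	for (i = 0; i < n; i++)
		if (s1[i] != s2[i])
			return s1[i] < s2[i] ? -1 : 1;
	return 0;
}
```

The key point the patch makes is visible here: only s2 needs the boundary check, because the caller guarantees s1 is 8-byte aligned, so an 8-byte load from s1 can never cross a page.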