Re: [PATCH v3] powerpc/64: Fix memcmp reading past the end of src/dest

2019-03-25 Thread Chandan Rajendra
On Friday, March 22, 2019 6:07:24 PM IST Michael Ellerman wrote:
> Chandan reported that fstests' generic/026 test hit a crash:
> 
>   BUG: Unable to handle kernel data access at 0xc0062ac4
>   Faulting instruction address: 0xc0092240
>   Oops: Kernel access of bad area, sig: 11 [#1]
>   LE SMP NR_CPUS=2048 DEBUG_PAGEALLOC NUMA pSeries
>   CPU: 0 PID: 27828 Comm: chacl Not tainted 5.0.0-rc2-next-20190115-1-g6de6dba64dda #1
>   NIP:  c0092240 LR: c066a55c CTR: 
>   REGS: c0062c0c3430 TRAP: 0300   Not tainted  (5.0.0-rc2-next-20190115-1-g6de6dba64dda)
>   MSR:  82009033   CR: 44000842  XER: 2000
>   CFAR: 7fff7f3108ac DAR: c0062ac4 DSISR: 4000 IRQMASK: 0
>   GPR00:  c0062c0c36c0 c17f4c00 c121a660
>   GPR04: c0062ac3fff9 0004 0020 275b19c4
>   GPR08: 000c 46494c45 5347495f41434c5f c26073a0
>   GPR12:  c27a  
>   GPR16:    
>   GPR20: c0062ea70020 c0062c0c38d0 0002 0002
>   GPR24: c0062ac3ffe8 275b19c4 0001 c0062ac3
>   GPR28: c0062c0c38d0 c0062ac30050 c0062ac30058 
>   NIP memcmp+0x120/0x690
>   LR  xfs_attr3_leaf_lookup_int+0x53c/0x5b0
>   Call Trace:
> xfs_attr3_leaf_lookup_int+0x78/0x5b0 (unreliable)
> xfs_da3_node_lookup_int+0x32c/0x5a0
> xfs_attr_node_addname+0x170/0x6b0
> xfs_attr_set+0x2ac/0x340
> __xfs_set_acl+0xf0/0x230
> xfs_set_acl+0xd0/0x160
> set_posix_acl+0xc0/0x130
> posix_acl_xattr_set+0x68/0x110
> __vfs_setxattr+0xa4/0x110
> __vfs_setxattr_noperm+0xac/0x240
> vfs_setxattr+0x128/0x130
> setxattr+0x248/0x600
> path_setxattr+0x108/0x120
> sys_setxattr+0x28/0x40
> system_call+0x5c/0x70
>   Instruction dump:
>   7d201c28 7d402428 7c295040 38630008 38840008 408201f0 4200ffe8 2c05
>   4182ff6c 20c50008 54c61838 7d201c28 <7d402428> 7d293436 7d4a3436 7c295040
> 
> The instruction dump decodes as:
>   subfic  r6,r5,8
>   rlwinm  r6,r6,3,0,28
>   ldbrx   r9,0,r3
>   ldbrx   r10,0,r4  <-
> 
> Which shows us doing an 8 byte load from c0062ac3fff9, which
> crosses the page boundary at c0062ac4 and faults.
> 
> It's not OK for memcmp to read past the end of the source or
> destination buffers if that would cross a page boundary, because we
> don't know that the next page is mapped.
> 
> As pointed out by Segher, we can read past the end of the source or
> destination as long as we don't cross a 4K boundary, because that's
> our minimum page size on all platforms.
> 
> The bug is in the code at the .Lcmp_rest_lt8bytes label. When we get
> there we know that s1 is 8-byte aligned and we have at least 1 byte to
> read, so a single 8-byte load won't read past the end of s1 and cross
> a page boundary.
> 
> But we have to be more careful with s2. So check if it's within 8
> bytes of a 4K boundary and if so go to the byte-by-byte loop.
> 
> Fixes: 2d9ee327adce ("powerpc/64: Align bytes before fall back to .Lshort in powerpc64 memcmp()")
> Cc: sta...@vger.kernel.org # v4.19+
> Reported-by: Chandan Rajendra 
> Signed-off-by: Michael Ellerman 
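
For readers following along, the check described above can be modelled in C.
This is only a sketch of the decision, not the assembly from the patch:
choose_tail_path(), the enum and MIN_PAGE_MASK are hypothetical names, and
the 0xff8 threshold is derived from "within 8 bytes of a 4K boundary".

```c
#include <assert.h>
#include <stdint.h>

/* Conservative minimum page size: 4K, per the patch description. */
#define MIN_PAGE_MASK 0xfffu

static_assert(MIN_PAGE_MASK + 1 == 4096, "mask must describe a 4K page");

enum tail_path {
	TAIL_WIDE_LOAD,	/* one 8-byte load, then shift away the unwanted bytes */
	TAIL_BYTE_LOOP,	/* compare one byte at a time */
};

/*
 * Hypothetical helper: decide how to compare the final (< 8) bytes.
 * An 8-byte load at s2 touches s2 .. s2 + 7, so it stays inside the
 * current 4K page only when the page offset of s2 is at most 0xff8.
 */
static enum tail_path choose_tail_path(uint64_t s2)
{
	uint64_t off = s2 & MIN_PAGE_MASK;

	return off > 0xff8 ? TAIL_BYTE_LOOP : TAIL_WIDE_LOAD;
}
```

With the s2 value from the oops (page offset 0xff9), this model picks the
byte-by-byte path, so the next page is never touched.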

For unknown reasons, I am unable to recreate this bug on unmodified
next-20190115, which was the kernel on which I originally found it.

FWIW, I have executed generic/026 on a next-20190115 kernel with this patch
applied and I wasn't able to recreate the bug. Hence,

Tested-by: Chandan Rajendra 

-- 
chandan





Re: BUG: memcmp(): Accessing invalid memory location

2019-02-06 Thread Chandan Rajendra
On Thursday, February 7, 2019 5:27:09 AM IST Michael Ellerman wrote:
> Chandan Rajendra  writes:
> > On Wednesday, February 6, 2019 5:20:04 PM IST Michael Ellerman wrote:
> >> Chandan Rajendra  writes:
> >> > On Friday, February 1, 2019 4:43:52 PM IST Michael Ellerman wrote:
> >> >> Michael Ellerman  writes:
> >> >> 
> >> >> > Adding Simon who wrote the code.
> >> >> >
> >> >> > Chandan Rajendra  writes:
> >> >> >> When executing fstests' generic/026 test, I hit the following call 
> >> >> >> trace,
> >> >> >>
> >> >> >> [  417.061038] BUG: Unable to handle kernel data access at 
> >> >> >> 0xc0062ac4
> >> >> >> [  417.062172] Faulting instruction address: 0xc0092240
> >> >> >> [  417.062242] Oops: Kernel access of bad area, sig: 11 [#1]
> >> >> >> [  417.062299] LE SMP NR_CPUS=2048 DEBUG_PAGEALLOC NUMA pSeries
> >> >> >> [  417.062366] Modules linked in:
> >> >> >> [  417.062401] CPU: 0 PID: 27828 Comm: chacl Not tainted 
> >> >> >> 5.0.0-rc2-next-20190115-1-g6de6dba64dda #1
> >> >> >> [  417.062495] NIP:  c0092240 LR: c066a55c CTR: 
> >> >> >> 
> >> >> >> [  417.062567] REGS: c0062c0c3430 TRAP: 0300   Not tainted  
> >> >> >> (5.0.0-rc2-next-20190115-1-g6de6dba64dda)
> >> >> >> [  417.062660] MSR:  82009033   
> >> >> >> CR: 44000842  XER: 2000
> >> >> >> [  417.062750] CFAR: 7fff7f3108ac DAR: c0062ac4 DSISR: 
> >> >> >> 4000 IRQMASK: 0
> >> >> >>GPR00:  c0062c0c36c0 
> >> >> >> c17f4c00 c121a660
> >> >> >>GPR04: c0062ac3fff9 0004 
> >> >> >> 0020 275b19c4
> >> >> >>GPR08: 000c 46494c45 
> >> >> >> 5347495f41434c5f c26073a0
> >> >> >>GPR12:  c27a 
> >> >> >>  
> >> >> >>GPR16:   
> >> >> >>  
> >> >> >>GPR20: c0062ea70020 c0062c0c38d0 
> >> >> >> 0002 0002
> >> >> >>GPR24: c0062ac3ffe8 275b19c4 
> >> >> >> 0001 c0062ac3
> >> >> >>GPR28: c0062c0c38d0 c0062ac30050 
> >> >> >> c0062ac30058 
> >> >> >> [  417.063563] NIP [c0092240] memcmp+0x120/0x690
> >> >> >> [  417.063635] LR [c066a55c] 
> >> >> >> xfs_attr3_leaf_lookup_int+0x53c/0x5b0
> >> >> >> [  417.063709] Call Trace:
> >> >> >> [  417.063744] [c0062c0c36c0] [c066a098] 
> >> >> >> xfs_attr3_leaf_lookup_int+0x78/0x5b0 (unreliable)
> >> >> >> [  417.063851] [c0062c0c3760] [c0693f8c] 
> >> >> >> xfs_da3_node_lookup_int+0x32c/0x5a0
> >> >> >> [  417.063944] [c0062c0c3820] [c06634a0] 
> >> >> >> xfs_attr_node_addname+0x170/0x6b0
> >> >> >> [  417.064034] [c0062c0c38b0] [c0664ffc] 
> >> >> >> xfs_attr_set+0x2ac/0x340
> >> >> >> [  417.064118] [c0062c0c39a0] [c0758d40] 
> >> >> >> __xfs_set_acl+0xf0/0x230
> >> >> >> [  417.064190] [c0062c0c3a00] [c0758f50] 
> >> >> >> xfs_set_acl+0xd0/0x160
> >> >> >> [  417.064268] [c0062c0c3aa0] [c04b69b0] 
> >> >> >> set_posix_acl+0xc0/0x130
> >> >> >> [  417.064339] [c0062c0c3ae0] [c04b6a88] 
> >> >> >> posix_acl_xattr_set+0x68/0x110
> >> >> >> [  417.064412] [c0062c0c3b20] [c04532d4] 
> >> >> >> __vfs_setxattr+0xa4/0x110
> >> >> >> [  417.064485] [c0062c0c3b80] [c0454c2c] 
> >> >> >> __vfs_setxattr_noperm+0xac/0x240

Re: BUG: memcmp(): Accessing invalid memory location

2019-02-06 Thread Chandan Rajendra
On Wednesday, February 6, 2019 5:20:04 PM IST Michael Ellerman wrote:
> Chandan Rajendra  writes:
> > On Friday, February 1, 2019 4:43:52 PM IST Michael Ellerman wrote:
> >> Michael Ellerman  writes:
> >> 
> >> > Adding Simon who wrote the code.
> >> >
> >> > Chandan Rajendra  writes:
> >> >> When executing fstests' generic/026 test, I hit the following call 
> >> >> trace,
> >> >>
> >> >> [  417.061038] BUG: Unable to handle kernel data access at 
> >> >> 0xc0062ac4
> >> >> [  417.062172] Faulting instruction address: 0xc0092240
> >> >> [  417.062242] Oops: Kernel access of bad area, sig: 11 [#1]
> >> >> [  417.062299] LE SMP NR_CPUS=2048 DEBUG_PAGEALLOC NUMA pSeries
> >> >> [  417.062366] Modules linked in:
> >> >> [  417.062401] CPU: 0 PID: 27828 Comm: chacl Not tainted 
> >> >> 5.0.0-rc2-next-20190115-1-g6de6dba64dda #1
> >> >> [  417.062495] NIP:  c0092240 LR: c066a55c CTR: 
> >> >> 
> >> >> [  417.062567] REGS: c0062c0c3430 TRAP: 0300   Not tainted  
> >> >> (5.0.0-rc2-next-20190115-1-g6de6dba64dda)
> >> >> [  417.062660] MSR:  82009033   CR: 
> >> >> 44000842  XER: 2000
> >> >> [  417.062750] CFAR: 7fff7f3108ac DAR: c0062ac4 DSISR: 
> >> >> 4000 IRQMASK: 0
> >> >>GPR00:  c0062c0c36c0 
> >> >> c17f4c00 c121a660
> >> >>GPR04: c0062ac3fff9 0004 
> >> >> 0020 275b19c4
> >> >>GPR08: 000c 46494c45 
> >> >> 5347495f41434c5f c26073a0
> >> >>GPR12:  c27a 
> >> >>  
> >> >>GPR16:   
> >> >>  
> >> >>GPR20: c0062ea70020 c0062c0c38d0 
> >> >> 0002 0002
> >> >>GPR24: c0062ac3ffe8 275b19c4 
> >> >> 0001 c0062ac3
> >> >>GPR28: c0062c0c38d0 c0062ac30050 
> >> >> c0062ac30058 
> >> >> [  417.063563] NIP [c0092240] memcmp+0x120/0x690
> >> >> [  417.063635] LR [c066a55c] 
> >> >> xfs_attr3_leaf_lookup_int+0x53c/0x5b0
> >> >> [  417.063709] Call Trace:
> >> >> [  417.063744] [c0062c0c36c0] [c066a098] 
> >> >> xfs_attr3_leaf_lookup_int+0x78/0x5b0 (unreliable)
> >> >> [  417.063851] [c0062c0c3760] [c0693f8c] 
> >> >> xfs_da3_node_lookup_int+0x32c/0x5a0
> >> >> [  417.063944] [c0062c0c3820] [c06634a0] 
> >> >> xfs_attr_node_addname+0x170/0x6b0
> >> >> [  417.064034] [c0062c0c38b0] [c0664ffc] 
> >> >> xfs_attr_set+0x2ac/0x340
> >> >> [  417.064118] [c0062c0c39a0] [c0758d40] 
> >> >> __xfs_set_acl+0xf0/0x230
> >> >> [  417.064190] [c0062c0c3a00] [c0758f50] 
> >> >> xfs_set_acl+0xd0/0x160
> >> >> [  417.064268] [c0062c0c3aa0] [c04b69b0] 
> >> >> set_posix_acl+0xc0/0x130
> >> >> [  417.064339] [c0062c0c3ae0] [c04b6a88] 
> >> >> posix_acl_xattr_set+0x68/0x110
> >> >> [  417.064412] [c0062c0c3b20] [c04532d4] 
> >> >> __vfs_setxattr+0xa4/0x110
> >> >> [  417.064485] [c0062c0c3b80] [c0454c2c] 
> >> >> __vfs_setxattr_noperm+0xac/0x240
> >> >> [  417.064566] [c0062c0c3bd0] [c0454ee8] 
> >> >> vfs_setxattr+0x128/0x130
> >> >> [  417.064638] [c0062c0c3c30] [c0455138] 
> >> >> setxattr+0x248/0x600
> >> >> [  417.064710] [c0062c0c3d90] [c0455738] 
> >> >> path_setxattr+0x108/0x120
> >> >> [  417.064785] [c0062c0c3e00] [c0455778] 
> >> >> sys_setxattr+0x28/0x40
> >> >> [  417.064858] [c0062c0c3e20] [c000bae4] 
> >> >> system_call+0x5c/0x70
> >> >> [  417.064930] Instruction dump:
> >> >> [  417.064964] 7d201c2

Re: BUG: memcmp(): Accessing invalid memory location

2019-02-03 Thread Chandan Rajendra
On Friday, February 1, 2019 4:43:52 PM IST Michael Ellerman wrote:
> Michael Ellerman  writes:
> 
> > Adding Simon who wrote the code.
> >
> > Chandan Rajendra  writes:
> >> When executing fstests' generic/026 test, I hit the following call trace,
> >>
> >> [  417.061038] BUG: Unable to handle kernel data access at 
> >> 0xc0062ac4
> >> [  417.062172] Faulting instruction address: 0xc0092240
> >> [  417.062242] Oops: Kernel access of bad area, sig: 11 [#1]
> >> [  417.062299] LE SMP NR_CPUS=2048 DEBUG_PAGEALLOC NUMA pSeries
> >> [  417.062366] Modules linked in:
> >> [  417.062401] CPU: 0 PID: 27828 Comm: chacl Not tainted 
> >> 5.0.0-rc2-next-20190115-1-g6de6dba64dda #1
> >> [  417.062495] NIP:  c0092240 LR: c066a55c CTR: 
> >> 
> >> [  417.062567] REGS: c0062c0c3430 TRAP: 0300   Not tainted  
> >> (5.0.0-rc2-next-20190115-1-g6de6dba64dda)
> >> [  417.062660] MSR:  82009033   CR: 
> >> 44000842  XER: 2000
> >> [  417.062750] CFAR: 7fff7f3108ac DAR: c0062ac4 DSISR: 
> >> 4000 IRQMASK: 0
> >>GPR00:  c0062c0c36c0 c17f4c00 
> >> c121a660
> >>GPR04: c0062ac3fff9 0004 0020 
> >> 275b19c4
> >>GPR08: 000c 46494c45 5347495f41434c5f 
> >> c26073a0
> >>GPR12:  c27a  
> >> 
> >>GPR16:    
> >> 
> >>GPR20: c0062ea70020 c0062c0c38d0 0002 
> >> 0002
> >>GPR24: c0062ac3ffe8 275b19c4 0001 
> >> c0062ac3
> >>GPR28: c0062c0c38d0 c0062ac30050 c0062ac30058 
> >> 
> >> [  417.063563] NIP [c0092240] memcmp+0x120/0x690
> >> [  417.063635] LR [c066a55c] xfs_attr3_leaf_lookup_int+0x53c/0x5b0
> >> [  417.063709] Call Trace:
> >> [  417.063744] [c0062c0c36c0] [c066a098] 
> >> xfs_attr3_leaf_lookup_int+0x78/0x5b0 (unreliable)
> >> [  417.063851] [c0062c0c3760] [c0693f8c] 
> >> xfs_da3_node_lookup_int+0x32c/0x5a0
> >> [  417.063944] [c0062c0c3820] [c06634a0] 
> >> xfs_attr_node_addname+0x170/0x6b0
> >> [  417.064034] [c0062c0c38b0] [c0664ffc] 
> >> xfs_attr_set+0x2ac/0x340
> >> [  417.064118] [c0062c0c39a0] [c0758d40] 
> >> __xfs_set_acl+0xf0/0x230
> >> [  417.064190] [c0062c0c3a00] [c0758f50] xfs_set_acl+0xd0/0x160
> >> [  417.064268] [c0062c0c3aa0] [c04b69b0] 
> >> set_posix_acl+0xc0/0x130
> >> [  417.064339] [c0062c0c3ae0] [c04b6a88] 
> >> posix_acl_xattr_set+0x68/0x110
> >> [  417.064412] [c0062c0c3b20] [c04532d4] 
> >> __vfs_setxattr+0xa4/0x110
> >> [  417.064485] [c0062c0c3b80] [c0454c2c] 
> >> __vfs_setxattr_noperm+0xac/0x240
> >> [  417.064566] [c0062c0c3bd0] [c0454ee8] 
> >> vfs_setxattr+0x128/0x130
> >> [  417.064638] [c0062c0c3c30] [c0455138] setxattr+0x248/0x600
> >> [  417.064710] [c0062c0c3d90] [c0455738] 
> >> path_setxattr+0x108/0x120
> >> [  417.064785] [c0062c0c3e00] [c0455778] sys_setxattr+0x28/0x40
> >> [  417.064858] [c0062c0c3e20] [c000bae4] system_call+0x5c/0x70
> >> [  417.064930] Instruction dump:
> >> [  417.064964] 7d201c28 7d402428 7c295040 38630008 38840008 408201f0 
> >> 4200ffe8 2c05
> >> [  417.065051] 4182ff6c 20c50008 54c61838 7d201c28 <7d402428> 7d293436 
> >> 7d4a3436 7c295040
> >> [  417.065150] ---[ end trace 0d060411b5e3741b ]---
> >>
> >>
> >> Both the memory locations passed to memcmp() had "SGI_ACL_FILE" and len
> >> argument of memcmp() was set to 12. s1 argument of memcmp() had the value
> >> 0xf4af0485, while s2 argument had the value 0xce9e316f.
> >>
> >> The following is the code path within memcmp() that gets executed for the
> >> above mentioned values,
> >>
> >> - Since len (i.e. 12) is greater than 7, we branch to .Lno_short.
> >> - We then prefetch the contents of r3 & r4 and branch to
> >

BUG: memcmp(): Accessing invalid memory location

2019-01-24 Thread Chandan Rajendra
When executing fstests' generic/026 test, I hit the following call trace,

[  417.061038] BUG: Unable to handle kernel data access at 0xc0062ac4
[  417.062172] Faulting instruction address: 0xc0092240
[  417.062242] Oops: Kernel access of bad area, sig: 11 [#1]
[  417.062299] LE SMP NR_CPUS=2048 DEBUG_PAGEALLOC NUMA pSeries
[  417.062366] Modules linked in:
[  417.062401] CPU: 0 PID: 27828 Comm: chacl Not tainted 
5.0.0-rc2-next-20190115-1-g6de6dba64dda #1
[  417.062495] NIP:  c0092240 LR: c066a55c CTR: 
[  417.062567] REGS: c0062c0c3430 TRAP: 0300   Not tainted  
(5.0.0-rc2-next-20190115-1-g6de6dba64dda)
[  417.062660] MSR:  82009033   CR: 44000842  
XER: 2000
[  417.062750] CFAR: 7fff7f3108ac DAR: c0062ac4 DSISR: 4000 
IRQMASK: 0
   GPR00:  c0062c0c36c0 c17f4c00 
c121a660
   GPR04: c0062ac3fff9 0004 0020 
275b19c4
   GPR08: 000c 46494c45 5347495f41434c5f 
c26073a0
   GPR12:  c27a  

   GPR16:    

   GPR20: c0062ea70020 c0062c0c38d0 0002 
0002
   GPR24: c0062ac3ffe8 275b19c4 0001 
c0062ac3
   GPR28: c0062c0c38d0 c0062ac30050 c0062ac30058 

[  417.063563] NIP [c0092240] memcmp+0x120/0x690
[  417.063635] LR [c066a55c] xfs_attr3_leaf_lookup_int+0x53c/0x5b0
[  417.063709] Call Trace:
[  417.063744] [c0062c0c36c0] [c066a098] 
xfs_attr3_leaf_lookup_int+0x78/0x5b0 (unreliable)
[  417.063851] [c0062c0c3760] [c0693f8c] 
xfs_da3_node_lookup_int+0x32c/0x5a0
[  417.063944] [c0062c0c3820] [c06634a0] 
xfs_attr_node_addname+0x170/0x6b0
[  417.064034] [c0062c0c38b0] [c0664ffc] xfs_attr_set+0x2ac/0x340
[  417.064118] [c0062c0c39a0] [c0758d40] __xfs_set_acl+0xf0/0x230
[  417.064190] [c0062c0c3a00] [c0758f50] xfs_set_acl+0xd0/0x160
[  417.064268] [c0062c0c3aa0] [c04b69b0] set_posix_acl+0xc0/0x130
[  417.064339] [c0062c0c3ae0] [c04b6a88] 
posix_acl_xattr_set+0x68/0x110
[  417.064412] [c0062c0c3b20] [c04532d4] __vfs_setxattr+0xa4/0x110
[  417.064485] [c0062c0c3b80] [c0454c2c] 
__vfs_setxattr_noperm+0xac/0x240
[  417.064566] [c0062c0c3bd0] [c0454ee8] vfs_setxattr+0x128/0x130
[  417.064638] [c0062c0c3c30] [c0455138] setxattr+0x248/0x600
[  417.064710] [c0062c0c3d90] [c0455738] path_setxattr+0x108/0x120
[  417.064785] [c0062c0c3e00] [c0455778] sys_setxattr+0x28/0x40
[  417.064858] [c0062c0c3e20] [c000bae4] system_call+0x5c/0x70
[  417.064930] Instruction dump:
[  417.064964] 7d201c28 7d402428 7c295040 38630008 38840008 408201f0 4200ffe8 
2c05
[  417.065051] 4182ff6c 20c50008 54c61838 7d201c28 <7d402428> 7d293436 7d4a3436 
7c295040
[  417.065150] ---[ end trace 0d060411b5e3741b ]---


Both memory locations passed to memcmp() contained "SGI_ACL_FILE", and the len
argument of memcmp() was set to 12. The s1 argument of memcmp() had the value
0xf4af0485, while the s2 argument had the value 0xce9e316f.

The following is the code path within memcmp() that gets executed for the
above-mentioned values:

- Since len (i.e. 12) is greater than 7, we branch to .Lno_short.
- We then prefetch the contents of r3 & r4 and branch to
  .Ldiffoffset_8bytes_make_align_start.
- Under .Ldiffoffset_novmx_cmp, since r3 is unaligned, we end up comparing
  the "SGI" part of the string. r3's value is then aligned, and r4's value is
  incremented by 3. To compare the remaining 9 bytes, we jump to
  .Lcmp_lt32bytes.
- Here, 8 bytes of the remaining 9 bytes are compared and execution moves to
  .Lcmp_rest_lt8bytes.
- Here we execute "LD rB,0,r4". In the case of this bug, r4 holds an unaligned
  address, so the 8-byte load extends into the "next" double word. That double
  word happens to lie beyond the last page mapped into the kernel's address
  space, and hence this leads to the previously listed oops.
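
The pointer arithmetic in the steps above can be replayed with a small C
model. This is a sketch under assumptions: only page offsets are tracked, the
starting offset 0xfee is back-computed from the faulting r4 value (which ends
in 0xff9) minus the 11 bytes already compared, and struct cmp_state,
consume() and last_byte_touched() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_4K 0x1000u

struct cmp_state {
	uint64_t s2_off;	/* offset of s2 within its 4K page */
	size_t remaining;	/* bytes still to be compared */
};

/* Account for n compared bytes: advance the pointer, shrink the length. */
static struct cmp_state consume(struct cmp_state st, size_t n)
{
	assert(n <= st.remaining);
	st.s2_off += n;
	st.remaining -= n;
	return st;
}

/* Page offset of the last byte touched by a load of `width` bytes at s2. */
static uint64_t last_byte_touched(struct cmp_state st, unsigned int width)
{
	return st.s2_off + width - 1;
}
```

Starting from { 0xfee, 12 }, consuming 3 and then 8 bytes leaves s2 at offset
0xff9 with 1 byte remaining. An 8-byte load from there touches offset 0x1000,
which is the first byte of the following page.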
  
-- 
chandan





BUG_ON() in irq_work_run_list

2017-10-14 Thread Chandan Rajendra
Executing fstests' generic/036 test in a loop on a next-20171013 kernel causes
the BUG_ON() condition to evaluate to true:

run fstests generic/036 at 2017-10-14 09:23:29
------------[ cut here ]------------
kernel BUG at /root/repos/linux/kernel/irq_work.c:138!
Oops: Exception in kernel mode, sig: 5 [#1]
BE SMP NR_CPUS=2048 DEBUG_PAGEALLOC NUMA pSeries
Modules linked in:
CPU: 3 PID: 0 Comm: swapper/3 Tainted: GW 4.14.0-rc4-next-20171013 #7
task: c0063862c780 task.stack: c006387e4000
NIP:  c02476ac LR: c02477c8 CTR: 
REGS: c0063ffd3810 TRAP: 0700   Tainted: GW (4.14.0-rc4-next-20171013)
MSR:  80029032   CR: 24002442  XER: 2000
CFAR: c02477c4 SOFTE: 1
GPR00: c01b70c4 c0063ffd3a90 c143bb00 c0063fee1a60
GPR04: 002b c00635ad1b0c c0063383c9e8 
GPR08: 00063ecd 0001 0001 
GPR12: 28002482 cfd41080 c006387e7f90 0200
GPR16: f0b048c0  c13a0920 c13a0920
GPR20: 0003  0001 0002
GPR24: 0010 c0063e22c498 c0063ffd3df0 
GPR28: 00063ecd   c1211a60
NIP [c02476ac] .irq_work_run_list+0xc/0x100
LR [c02477c8] .irq_work_run+0x28/0x50
Call Trace:
[c0063ffd3a90] [c0787638] .__blk_mq_complete_request_remote+0x38/0x50 (unreliable)
[c0063ffd3b10] [c01b70c4] .flush_smp_call_function_queue+0xd4/0x1e0
[c0063ffd3ba0] [c0044a4c] .smp_ipi_demux_relaxed+0x9c/0x110
[c0063ffd3c30] [c008dbdc] .icp_hv_ipi_action+0x5c/0xb0
[c0063ffd3cb0] [c0174384] .__handle_irq_event_percpu+0x94/0x2d0
[c0063ffd3d80] [c01745f4] .handle_irq_event_percpu+0x34/0x90
[c0063ffd3e10] [c017ae20] .handle_percpu_irq+0x80/0xd0
[c0063ffd3e90] [c0172ad0] .generic_handle_irq+0x50/0x80
[c0063ffd3f10] [c0016cd0] .__do_irq+0x90/0x210
[c0063ffd3f90] [c002a900] .call_do_irq+0x14/0x24
[c006387e77a0] [c0016ee0] .do_IRQ+0x90/0x140
[c006387e7840] [c0008c20] hardware_interrupt_common+0x150/0x160
--- interrupt: 501 at .plpar_hcall_norets+0x14/0x20
LR = .check_and_cede_processor+0x2c/0x40
[c006387e7b30] [c0b3f028] .check_and_cede_processor+0x18/0x40 (unreliable)
[c006387e7ba0] [c0b3f3c8] .shared_cede_loop+0x48/0x140
[c006387e7c20] [c0b3c644] .cpuidle_enter_state+0xa4/0x410
[c006387e7cd0] [c0159158] .call_cpuidle+0x68/0xd0
[c006387e7d60] [c0159640] .do_idle+0x2b0/0x310
[c006387e7e20] [c01598b0] .cpu_startup_entry+0x30/0x40
[c006387e7ea0] [c0045e38] .start_secondary+0x4e8/0x530
[c006387e7f90] [c000b06c] start_secondary_prolog+0x10/0x14
Instruction dump:
3861 4e800020 6000 6000 6000 3860 4e800020 6000
6000 894d027a 312a 7d295110 <0b09> e923 2fa9 4d9e0020
---[ end trace 921006f210ad28ba ]---

The corresponding code is,

static void irq_work_run_list(struct llist_head *list)
{
	unsigned long flags;
	struct irq_work *work;
	struct llist_node *llnode;

	BUG_ON(!irqs_disabled());


-- 
chandan



[PATCH] powerpc: Wire up statx() syscall

2017-03-16 Thread Chandan Rajendra
Test runs on a ppc64 BE guest succeeded. The linux/samples/statx/test-statx
program was executed on the following file types:

1. Regular file
2. Directory
3. Device file
4. Symlink
5. Named pipe

The test run also included invoking test-statx with the runtime options
provided in the main() function of test-statx.c

Signed-off-by: Chandan Rajendra <chan...@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/systbl.h  | 1 +
 arch/powerpc/include/asm/unistd.h  | 2 +-
 arch/powerpc/include/uapi/asm/unistd.h | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/systbl.h b/arch/powerpc/include/asm/systbl.h
index 4b369d8..1c94708 100644
--- a/arch/powerpc/include/asm/systbl.h
+++ b/arch/powerpc/include/asm/systbl.h
@@ -387,3 +387,4 @@ SYSCALL(copy_file_range)
 COMPAT_SYS_SPU(preadv2)
 COMPAT_SYS_SPU(pwritev2)
 SYSCALL(kexec_file_load)
+SYSCALL(statx)
diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h
index eb1acee..9ba11db 100644
--- a/arch/powerpc/include/asm/unistd.h
+++ b/arch/powerpc/include/asm/unistd.h
@@ -12,7 +12,7 @@
 #include 
 
 
-#define NR_syscalls   383
+#define NR_syscalls   384
 
 #define __NR__exit __NR_exit
 
diff --git a/arch/powerpc/include/uapi/asm/unistd.h b/arch/powerpc/include/uapi/asm/unistd.h
index 2f26335..b85f142 100644
--- a/arch/powerpc/include/uapi/asm/unistd.h
+++ b/arch/powerpc/include/uapi/asm/unistd.h
@@ -393,5 +393,6 @@
 #define __NR_preadv2   380
 #define __NR_pwritev2  381
 #define __NR_kexec_file_load   382
+#define __NR_statx 383
 
 #endif /* _UAPI_ASM_POWERPC_UNISTD_H_ */
-- 
2.5.5



Re: syscall statx not implemented on powerpc

2017-03-13 Thread Chandan Rajendra
On Monday, March 13, 2017 03:33:07 AM Chris Packham wrote:
> Hi,
> 
> I've just attempted to build a powerpc kernel from 4.11-rc2 using a 
> custom defconfig (available on request) and I'm hitting the following 
> error in the early stages of compilation.
> 
> <stdin>:1325:2: error: #warning syscall statx not implemented [-Werror=cpp]
> 
> Same thing seems to happen with mpc85xx_basic_defconfig.
> 
> I don't actually need this syscall so I'd be happy to turn something off 
> to get things building. I did a quick search and couldn't see anything 
> on linuxppc-dev but google keeps correcting "statx" to "stats" so I 
> could have missed it.
> 
> 

The upstream commit
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a528d35e8bfcc521d7cb70aaf03e1bd296c8493f)
that introduces the statx syscall provides a test program. I will wire up the
syscall on ppc64, run that test program, and post the patch if the test program
works well.

-- 
chandan



Re: [PATCH] direct-io: don't introduce another read of inode->i_blkbits

2017-01-09 Thread Chandan Rajendra
On Monday, January 09, 2017 04:42:58 PM Jeff Moyer wrote:
> Commit 20ce44d545844 ("do_direct_IO: Use inode->i_blkbits to compute
> block count to be cleaned") introduced a regression: if the block size
> of the block device is changed while a direct I/O request is being
> setup, it can result in a panic.  See commit ab73857e354ab ("direct-io:
> don't read inode->i_blkbits multiple times") for the reasoning, and
> commit b87570f5d3496 ("Fix a crash when block device is read and block
> size is changed at the same time") for a more detailed problem
> description and reproducer.
> 
> Fixes: 20ce44d545844
> Signed-off-by: Jeff Moyer <jmo...@redhat.com>
> 
> ---
> Chandan, can you please test this to ensure this still fixes your problem?

This patch fixes the failure,

Tested-by: Chandan Rajendra <chan...@linux.vnet.ibm.com>

-- 
chandan



Re: ext4 filesystem corruption with 4.10-rc2 on ppc64le

2017-01-08 Thread Chandan Rajendra
On Wednesday, January 04, 2017 10:28:37 AM Theodore Ts'o wrote:
> On Wed, Jan 04, 2017 at 11:32:42AM +0530, Chandan Rajendra wrote:
> > On Wednesday, January 04, 2017 04:18:08 PM Anton Blanchard wrote:
> > > I'm consistently seeing ext4 filesystem corruption using a mainline
> > > kernel. It doesn't take much to trigger it - download a ppc64le Ubuntu
> > > cloud image, boot it in KVM and run:
> > > 
> > > sudo apt-get update
> > > sudo apt-get dist-upgrade
> > > sudo reboot
> > > 
> > > And it never makes it back up, dying with rather severe filesystem
> > > corruption.
> > 
> > The patch at https://patchwork.kernel.org/patch/9488235/ should fix the
> > bug.
> 
> It looks like this patch is already queued up on the "for-linus"
> branch on the linux-block.git tree.
> 
> Chandra, thanks for pointing this out!  I had missed your e-mail from
> Christmas day, and it was on my todo list to figure out why I was
> seeing lots of 1k block regressions on gce-xfstests post-merge window
> that wasn't showing up on the ext4.git tree before I sent my pull
> request to Linus.
> 
> Jens, could you expedite a pull request to Linus?  This is affecting
> ext4 on 1k block file systems on x86/x86_64, so this is not a ppc-only
> regression.  
> 
> Anton or Chandan, could you do me a favor and verify whether or not
> 64k block sizes are working for you on ppcle on ext4 by running
> xfstests?  Light duty testing works for me but when I stress ext4 with
> pagesize==blocksize on ppcle64 via xfstests, it blows up.  I suspect
> (but am not sure) it's due to (non-upstream) device driver issues, and
> a verification that you can run xfstests on your ppcle64 systems using
> standard upstream device drivers would be very helpful, since I don't
> have easy console access on the machines I have access to at $WORK.  :-(

Hi Ted,

I found one regression w.r.t 64k blocksize. I posted a patch
(http://marc.info/?l=linux-block&m=148388687722745&w=2) to fix the issue.

-- 
chandan



[PATCH] do_direct_IO: Use inode->i_blkbits to compute block count to be cleaned

2017-01-08 Thread Chandan Rajendra
The code currently uses sdio->blkbits to compute the number of blocks to
be cleaned. However, sdio->blkbits is derived from the logical block size
of the underlying block device (refer to the definition of
do_blockdev_direct_IO()). Because of this, the generic/299 test fails on
rare occasions when executed on an ext4 filesystem with a 64k block size
on a virtio-based disk (with a 512-byte logical block size) inside a KVM
guest.

This commit fixes the bug by using inode->i_blkbits to compute the
number of blocks to be cleaned.

Signed-off-by: Chandan Rajendra <chan...@linux.vnet.ibm.com>
---
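As a worked example of the arithmetic only (a sketch, not kernel code): the
helper blocks_in() below is a hypothetical name, and the sizes are the ones
from the description above, i.e. a 64k filesystem block (16 bits) and a
512-byte logical block (9 bits).

```c
#include <assert.h>
#include <stdint.h>

/* Number of whole blocks in `size` bytes when a block is (1 << blkbits) bytes. */
static uint64_t blocks_in(uint64_t size, unsigned int blkbits)
{
	assert(blkbits < 64);	/* shifting by 64 or more would be undefined */
	return size >> blkbits;
}
```

For a mapping with b_size of 65536 bytes, shifting by 9 gives 128, while
shifting by 16 gives 1. The two shifts therefore disagree by a factor of 128
on how many blocks are to be cleaned.
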
 fs/direct-io.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index aeae8c0..b20adf9 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -905,6 +905,7 @@ static inline void dio_zero_block(struct dio *dio, struct dio_submit *sdio,
 static int do_direct_IO(struct dio *dio, struct dio_submit *sdio,
struct buffer_head *map_bh)
 {
+   const unsigned i_blkbits = dio->inode->i_blkbits;
const unsigned blkbits = sdio->blkbits;
int ret = 0;
 
@@ -949,7 +950,7 @@ static int do_direct_IO(struct dio *dio, struct dio_submit *sdio,
clean_bdev_aliases(
map_bh->b_bdev,
map_bh->b_blocknr,
-   map_bh->b_size >> blkbits);
+   map_bh->b_size >> i_blkbits);
}
 
if (!sdio->blkfactor)
-- 
2.5.5



Re: ext4 filesystem corruption with 4.10-rc2 on ppc64le

2017-01-03 Thread Chandan Rajendra
On Wednesday, January 04, 2017 04:18:08 PM Anton Blanchard wrote:
> Hi,
> 
> I'm consistently seeing ext4 filesystem corruption using a mainline
> kernel. It doesn't take much to trigger it - download a ppc64le Ubuntu
> cloud image, boot it in KVM and run:
> 
> sudo apt-get update
> sudo apt-get dist-upgrade
> sudo reboot
> 
> And it never makes it back up, dying with rather severe filesystem
> corruption.

Hi,

The patch at https://patchwork.kernel.org/patch/9488235/ should fix the
bug.

> 
> I've narrowed it down to:
> 
> 64e1c57fa474 ("ext4: Use clean_bdev_aliases() instead of iteration")
> e64855c6cfaa ("fs: Add helper to clean bdev aliases under a bh and use it")
> ce98321bf7d2 ("fs: Remove unmap_underlying_metadata")
> 
> Backing these patches out fixes the issue.
> 
> Anton

-- 
chandan



[PATCH] ext4: ext4_mb_mark_free_simple: Fix integer value truncation

2016-11-03 Thread Chandan Rajendra
The 'border' variable is set to 2 times the block size of the
underlying filesystem. With a 64k block size, the resulting value won't
fit into a 16-bit variable. Hence this commit changes the data type of
'border' to 'unsigned int'.

Signed-off-by: Chandan Rajendra <chan...@linux.vnet.ibm.com>
---
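The truncation can be seen in a few lines of C. This is a sketch rather than
the mballoc code: border_16bit() and border_uint() are hypothetical helpers,
uint16_t stands in for a 16-bit 'unsigned short', and "2 times the block
size" is written as 2u << blocksize_bits.

```c
#include <assert.h>
#include <stdint.h>

/* Models the old declaration: the result is stored in 16 bits. */
static uint16_t border_16bit(unsigned int blocksize_bits)
{
	assert(blocksize_bits < 30);	/* keep the shift within 32 bits */
	return (uint16_t)(2u << blocksize_bits);
}

/* Models the new declaration: the result is stored in an unsigned int. */
static unsigned int border_uint(unsigned int blocksize_bits)
{
	assert(blocksize_bits < 30);
	return 2u << blocksize_bits;
}
```

With a 64k block size (blocksize_bits of 16), the value is 131072. Stored in
16 bits it wraps to 0, whereas the unsigned int version keeps 131072. With 4k
blocks (12 bits) both versions hold 8192, which is why smaller block sizes
are unaffected.
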
 fs/ext4/mballoc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index f418f55..a937ac7 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -669,7 +669,7 @@ static void ext4_mb_mark_free_simple(struct super_block *sb,
ext4_grpblk_t min;
ext4_grpblk_t max;
ext4_grpblk_t chunk;
-   unsigned short border;
+   unsigned int border;
 
BUG_ON(len > EXT4_CLUSTERS_PER_GROUP(sb));
 
-- 
2.5.5



Re: [PATCH] powerpc: Wire up copy_file_range() syscall

2016-01-13 Thread Chandan Rajendra
On Thursday 14 Jan 2016 09:53:31 Michael Ellerman wrote:
> On Wed, 2016-01-13 at 22:20 +0530, Chandan Rajendra wrote:
> > Test runs on a ppc64 BE guest succeeded.
> 
> Were the tests built 64-bit or 32-bit?
> 

The test tool (xfs_io to be precise) was built as a 64-bit binary.

-- 
chandan

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH] powerpc: Wire up copy_file_range() syscall

2016-01-13 Thread Chandan Rajendra
Test runs on a ppc64 BE guest succeeded.

Signed-off-by: Chandan Rajendra <chan...@linux.vnet.ibm.com>
---
The "yet to be upstreamed" fstests test
(https://github.com/chandanr/xfstests/commit/c2ce6196711e02792b434448e29f45b5f9a955f6)
was used to test the syscall. The test in turn depends on the usage of xfs_io's
copy_file_range command.
(https://github.com/chandanr/xfsprogs-dev/commit/9222b48a3d03fb9d690323b460d882e559bd1080)
I will post these patches to the respective mailing lists once this patch is
mainlined.

 arch/powerpc/include/asm/systbl.h  | 1 +
 arch/powerpc/include/asm/unistd.h  | 2 +-
 arch/powerpc/include/uapi/asm/unistd.h | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/systbl.h b/arch/powerpc/include/asm/systbl.h
index 5654ece..3fa9df7 100644
--- a/arch/powerpc/include/asm/systbl.h
+++ b/arch/powerpc/include/asm/systbl.h
@@ -383,3 +383,4 @@ SYSCALL(ni_syscall)
 SYSCALL(ni_syscall)
 SYSCALL(ni_syscall)
 SYSCALL(mlock2)
+SYSCALL(copy_file_range)
diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h
index 6a5ace5..1f2594d 100644
--- a/arch/powerpc/include/asm/unistd.h
+++ b/arch/powerpc/include/asm/unistd.h
@@ -12,7 +12,7 @@
 #include 
 
 
-#define NR_syscalls   379
+#define NR_syscalls   380
 
 #define __NR__exit __NR_exit
 
diff --git a/arch/powerpc/include/uapi/asm/unistd.h b/arch/powerpc/include/uapi/asm/unistd.h
index 12a0565..940290d 100644
--- a/arch/powerpc/include/uapi/asm/unistd.h
+++ b/arch/powerpc/include/uapi/asm/unistd.h
@@ -389,5 +389,6 @@
 #define __NR_userfaultfd   364
 #define __NR_membarrier   365
 #define __NR_mlock2   378
+#define __NR_copy_file_range   379
 
 #endif /* _UAPI_ASM_POWERPC_UNISTD_H_ */
-- 
2.1.0
