Re: [PATCH] powerpc: add crtsavres.o to always-y instead of extra-y

2024-02-05 Thread Jan Stancek
On Mon, Feb 5, 2024 at 12:50 PM Michael Ellerman  wrote:
>
> Jan Stancek  writes:
> > On Tue, Nov 21, 2023 at 10:51:34AM +1000, Nicholas Piggin wrote:
> >>On Tue Nov 21, 2023 at 9:23 AM AEST, Masahiro Yamada wrote:
> >>> crtsavres.o is linked to modules. However, as explained in commit
> >>> d0e628cd817f ("kbuild: doc: clarify the difference between extra-y
> >>> and always-y"), 'make modules' does not build extra-y.
> >>>
> >>> For example, the following command fails:
> >>>
> >>>   $ make ARCH=powerpc LLVM=1 KBUILD_MODPOST_WARN=1 mrproper ps3_defconfig modules
> >>> [snip]
> >>> LD [M]  arch/powerpc/platforms/cell/spufs/spufs.ko
> >>>   ld.lld: error: cannot open arch/powerpc/lib/crtsavres.o: No such file or directory
> >>>   make[3]: *** [scripts/Makefile.modfinal:56: arch/powerpc/platforms/cell/spufs/spufs.ko] Error 1
> >>>   make[2]: *** [Makefile:1844: modules] Error 2
> >>>   make[1]: *** [/home/masahiro/workspace/linux-kbuild/Makefile:350: __build_one_by_one] Error 2
> >>>   make: *** [Makefile:234: __sub-make] Error 2
> >>>
> >>
> >>Thanks. Is this the correct Fixes tag?
> >>
> >>Fixes: d0e628cd817f ("powerpc/64: Do not link crtsavres.o in vmlinux")
> >>
> >>Hmm, looks like LLD might just do this now automatically for us
> >>too without --save-restore-funcs (https://reviews.llvm.org/D79977).
> >>But we probably still need it for older versions, so we still need
> >>your patch.
> >
> > Hi,
> >
> > I'm still seeing the error of crtsavres.o missing when building external modules
> > after "make LLVM=1 modules_prepare". Should it be built also in archprepare?
>
> Or modules_prepare?
>
> Example patch below.

I tested your patch with my setup and that works for me as well.

>
> cheers
>
>
> diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
> index 051247027da0..82cdef40a9cd 100644
> --- a/arch/powerpc/Makefile
> +++ b/arch/powerpc/Makefile
> @@ -59,6 +59,11 @@ ifeq ($(CONFIG_PPC64)$(CONFIG_LD_IS_BFD),yy)
>  KBUILD_LDFLAGS_MODULE += --save-restore-funcs
>  else
>  KBUILD_LDFLAGS_MODULE += arch/powerpc/lib/crtsavres.o
> +
> +crtsavres_prepare: scripts
> +   $(MAKE) $(build)=arch/powerpc/lib arch/powerpc/lib/crtsavres.o
> +
> +modules_prepare: crtsavres_prepare
>  endif
>
>  ifdef CONFIG_CPU_LITTLE_ENDIAN
>



Re: [PATCH] powerpc: add crtsavres.o to always-y instead of extra-y

2024-01-30 Thread Jan Stancek

On Tue, Nov 21, 2023 at 10:51:34AM +1000, Nicholas Piggin wrote:

On Tue Nov 21, 2023 at 9:23 AM AEST, Masahiro Yamada wrote:

crtsavres.o is linked to modules. However, as explained in commit
d0e628cd817f ("kbuild: doc: clarify the difference between extra-y
and always-y"), 'make modules' does not build extra-y.

For example, the following command fails:

  $ make ARCH=powerpc LLVM=1 KBUILD_MODPOST_WARN=1 mrproper ps3_defconfig modules
[snip]
LD [M]  arch/powerpc/platforms/cell/spufs/spufs.ko
  ld.lld: error: cannot open arch/powerpc/lib/crtsavres.o: No such file or directory
  make[3]: *** [scripts/Makefile.modfinal:56: arch/powerpc/platforms/cell/spufs/spufs.ko] Error 1
  make[2]: *** [Makefile:1844: modules] Error 2
  make[1]: *** [/home/masahiro/workspace/linux-kbuild/Makefile:350: __build_one_by_one] Error 2
  make: *** [Makefile:234: __sub-make] Error 2



Thanks. Is this the correct Fixes tag?

Fixes: d0e628cd817f ("powerpc/64: Do not link crtsavres.o in vmlinux")

Hmm, looks like LLD might just do this now automatically for us
too without --save-restore-funcs (https://reviews.llvm.org/D79977).
But we probably still need it for older versions, so we still need
your patch.


Hi,

I'm still seeing the error of crtsavres.o missing when building external modules
after "make LLVM=1 modules_prepare". Should it be built also in archprepare?

Thanks,
Jan


diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 051247027..a62334194 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -57,8 +57,11 @@ ifeq ($(CONFIG_PPC64)$(CONFIG_LD_IS_BFD),yy)
 # Have the linker provide sfpr if possible.
 # There is a corresponding test in arch/powerpc/lib/Makefile
 KBUILD_LDFLAGS_MODULE += --save-restore-funcs
+crtsavres_prepare:
 else
 KBUILD_LDFLAGS_MODULE += arch/powerpc/lib/crtsavres.o
+crtsavres_prepare:
+   $(MAKE) $(build)=arch/powerpc/lib arch/powerpc/lib/crtsavres.o
 endif

 ifdef CONFIG_CPU_LITTLE_ENDIAN
@@ -389,7 +392,7 @@ vdso_prepare: prepare0
 		$(build)=arch/powerpc/kernel/vdso include/generated/vdso64-offsets.h)
 endif

-archprepare: checkbin
+archprepare: checkbin crtsavres_prepare

 archheaders:
$(Q)$(MAKE) $(build)=arch/powerpc/kernel/syscalls all



Re: [PATCH/RFC] powerpc/module_64: allow .init_array constructors to run

2022-07-07 Thread Jan Stancek
On Thu, Jul 7, 2022 at 1:20 PM Christophe Leroy wrote:
>
>
>
> On 17/08/2021 at 15:02, Jan Stancek wrote:
> > gcov and kasan rely on compiler generated constructor code.
> > For modules, gcc-8 with gcov enabled generates .init_array section,
> > but on ppc64le it doesn't get executed. find_module_sections() never
> > finds .init_array section, because module_frob_arch_sections() renames
> > it to _init_array.
> >
> > Avoid renaming .init_array section, so do_mod_ctors() can use it.
> >
> > Cc: Michael Ellerman 
> > Cc: Benjamin Herrenschmidt 
> > Cc: Paul Mackerras 
> > Cc: Christophe Leroy 
> > Signed-off-by: Jan Stancek 
>
> Does commit d4be60fe66b7 ("powerpc/module_64: use module_init_section
> instead of patching names") fix your issue?

Yes, it does fix gcov for me. Thanks.

>
> If not, please rebase and resubmit.
>
> Thanks
> Christophe
>
>
> > ---
> > I wasn't able to trace the comment:
> >"We don't handle .init for the moment: rename to _init"
> > to original patch (it pre-dates .git). I'm not sure if it
> > still applies today, so I limited patch to .init_array. This
> > fixes gcov for modules for me on ppc64le 5.14.0-rc6.
> >
> > Renaming issue is also mentioned in kasan patches here:
> >
> > https://patchwork.ozlabs.org/project/linuxppc-dev/cover/20210319144058.772525-1-dja@axtens
> >
> >   arch/powerpc/kernel/module_64.c | 10 +-
> >   1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
> > index 6baa676e7cb6..c604b13ea6bf 100644
> > --- a/arch/powerpc/kernel/module_64.c
> > +++ b/arch/powerpc/kernel/module_64.c
> > @@ -299,8 +299,16 @@ int module_frob_arch_sections(Elf64_Ehdr *hdr,
> > sechdrs[i].sh_size);
> >
> >   /* We don't handle .init for the moment: rename to _init */
> > - while ((p = strstr(secstrings + sechdrs[i].sh_name, ".init")))
> > + while ((p = strstr(secstrings + sechdrs[i].sh_name, ".init"))) {
> > +#ifdef CONFIG_CONSTRUCTORS
> > + /* find_module_sections() needs .init_array intact */
> > + if (strstr(secstrings + sechdrs[i].sh_name,
> > + ".init_array")) {
> > + break;
> > + }
> > +#endif
> >   p[0] = '_';
> > + }
> >
> >   if (sechdrs[i].sh_type == SHT_SYMTAB)
> >   dedotify((void *)hdr + sechdrs[i].sh_offset,
>



[PATCH/RFC] powerpc/module_64: allow .init_array constructors to run

2021-08-17 Thread Jan Stancek
gcov and kasan rely on compiler generated constructor code.
For modules, gcc-8 with gcov enabled generates .init_array section,
but on ppc64le it doesn't get executed. find_module_sections() never
finds .init_array section, because module_frob_arch_sections() renames
it to _init_array.

Avoid renaming .init_array section, so do_mod_ctors() can use it.

Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Christophe Leroy 
Signed-off-by: Jan Stancek 
---
I wasn't able to trace the comment:
  "We don't handle .init for the moment: rename to _init"
to original patch (it pre-dates .git). I'm not sure if it
still applies today, so I limited patch to .init_array. This
fixes gcov for modules for me on ppc64le 5.14.0-rc6.

Renaming issue is also mentioned in kasan patches here:
  
https://patchwork.ozlabs.org/project/linuxppc-dev/cover/20210319144058.772525-1-dja@axtens

 arch/powerpc/kernel/module_64.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
index 6baa676e7cb6..c604b13ea6bf 100644
--- a/arch/powerpc/kernel/module_64.c
+++ b/arch/powerpc/kernel/module_64.c
@@ -299,8 +299,16 @@ int module_frob_arch_sections(Elf64_Ehdr *hdr,
  sechdrs[i].sh_size);
 
/* We don't handle .init for the moment: rename to _init */
-   while ((p = strstr(secstrings + sechdrs[i].sh_name, ".init")))
+   while ((p = strstr(secstrings + sechdrs[i].sh_name, ".init"))) {
+#ifdef CONFIG_CONSTRUCTORS
+   /* find_module_sections() needs .init_array intact */
+   if (strstr(secstrings + sechdrs[i].sh_name,
+   ".init_array")) {
+   break;
+   }
+#endif
p[0] = '_';
+   }
 
if (sechdrs[i].sh_type == SHT_SYMTAB)
dedotify((void *)hdr + sechdrs[i].sh_offset,
-- 
2.27.0



[bug] LTP mmap03 stuck in page fault loop after c46241a370a6 ("powerpc/pkeys: Check vma before returning key fault error to the user")

2020-06-26 Thread Jan Stancek
Hi,

LTP mmap03 is getting stuck in page fault loop after commit
  c46241a370a6 ("powerpc/pkeys: Check vma before returning key fault error to the user")

System is ppc64le P9 lpar [1] running v5.8-rc2-34-g3e08a95294a4.

Here's a minimized reproducer:
- 8< -
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int ac, char **av)
{
	int page_sz = getpagesize();
	int fildes;
	char *addr;

	/* create a small file to map */
	fildes = open("tempfile", O_WRONLY | O_CREAT, 0666);
	write(fildes, &fildes, sizeof(fildes));
	close(fildes);

	fildes = open("tempfile", O_RDONLY);
	unlink("tempfile");

	/* execute-only private mapping */
	addr = mmap(0, page_sz, PROT_EXEC, MAP_FILE | MAP_PRIVATE, fildes, 0);

	/* read from the execute-only page */
	printf("%d\n", *addr);
	return 0;
}
- >8 -

This would previously end quickly with a segmentation fault. After
commit c46241a370a6 the test is stuck:

# perf stat timeout 5 ./a.out

 Performance counter stats for 'timeout 5 ./a.out':

          5,001.74 msec task-clock                #    1.000 CPUs utilized
                 9      context-switches          #    0.002 K/sec
                 0      cpu-migrations            #    0.000 K/sec
         3,094,893      page-faults               #    0.619 M/sec
    18,940,869,512      cycles                    #    3.787 GHz                      (33.39%)
     1,377,005,087      stalled-cycles-frontend   #    7.27% frontend cycles idle     (50.19%)
    10,949,936,056      stalled-cycles-backend    #   57.81% backend cycles idle      (16.62%)
    21,133,828,748      instructions              #    1.12  insn per cycle
                                                  #    0.52  stalled cycles per insn  (33.22%)
     4,395,016,137      branches                  #  878.698 M/sec                    (49.81%)
       164,499,002      branch-misses             #    3.74% of all branches          (16.60%)

       5.001237248 seconds time elapsed

       0.321276000 seconds user
       4.680772000 seconds sys


access_pkey_error() in the page fault handler now always seems to return false:

  __do_page_fault
    access_pkey_error(is_pkey: 1, is_exec: 0, is_write: 0)
      arch_vma_access_permitted
        pkey_access_permitted
          if (!is_pkey_enabled(pkey))
            return true
      return false

Regards,
Jan

[1]
Architecture:ppc64le
Byte Order:  Little Endian
CPU(s):  8
On-line CPU(s) list: 0-7
Thread(s) per core:  8
Core(s) per socket:  1
Socket(s):   1
NUMA node(s):2
Model:   2.2 (pvr 004e 0202)
Model name:  POWER9 (architected), altivec supported
Hypervisor vendor:   pHyp
Virtualization type: para
L1d cache:   32 KiB
L1i cache:   32 KiB
NUMA node0 CPU(s):
NUMA node1 CPU(s):   0-7
Physical sockets:2
Physical chips:  1
Physical cores/chip: 8
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf:  Mitigation; RFI Flush, L1D private per thread
Vulnerability Mds:   Not affected
Vulnerability Meltdown:  Mitigation; RFI Flush, L1D private per thread
Vulnerability Spec store bypass: Mitigation; Kernel entry/exit barrier (eieio)
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization, ori31 speculation barrier enabled
Vulnerability Spectre v2:        Mitigation; Indirect branch cache disabled, Software link stack flush
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort:   Not affected



Re: [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later

2019-12-09 Thread Jan Stancek


- Original Message -
> 
> 
> On 12/6/19 6:09 PM, dftxbs3e wrote:
> > Hello!
> > 
> > I am very happy that someone has found this issue.
> > 
> > I have been suffering from rather random SIGBUS errors in similar
> > conditions described by the author.
> > 
> > I don't have much troubleshooting information to provide, however, I hit
> > the issue regularly so I could investigate during that.
> > 
> > How do you debug such an issue? I tried a debugger etc. but besides
> > crashing with SIGBUS, I couldnt get any other meaningful information.

If it's the same issue, you could check whether dropping caches helps.
Figure out which page it is with crash or systemtap and look at page->flags
and the ((struct iomap_page *)page->private)->uptodate bitmap.

> 
> You may want to test the patch Christoph sent on the original thread for
> this issue.

Or v5.5-rc1, Christoph's patch has been merged:
  1cea335d1db1 ("iomap: fix sub-page uptodate handling")



Re: [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later

2019-12-04 Thread Jan Stancek


- Original Message -
> Please try the patch below:

I ran the reproducer for 18 hours on 2 systems where it previously reproduced;
there were no crashes / SIGBUS.



Re: [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later

2019-12-03 Thread Jan Stancek


- Original Message -
> On Tue, Dec 03, 2019 at 07:50:39AM -0500, Jan Stancek wrote:
> > My theory is that there's a race in iomap. There appear to be
> > interleaved calls to iomap_set_range_uptodate() for same page
> > with varying offset and length. Each call sees bitmap as _not_
> > entirely "uptodate" and hence doesn't call SetPageUptodate().
> > Even though each bit in bitmap ends up uptodate by the time
> > all calls finish.
> 
> Weird.  That should be prevented by the page lock that all callers
> of iomap_set_range_uptodate.  But in case I miss something, does
> the patch below trigger?  If not it is not just a race, but might
> be some weird ordering problem with the bitops, especially if it
> only triggers on ppc, which is very weakly ordered.
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index d33c7bc5ee92..25e942c71590 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -148,6 +148,8 @@ iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
>   unsigned int i;
>   bool uptodate = true;
>  
> + WARN_ON_ONCE(!PageLocked(page));
> +
>   if (iop) {
>   for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
>   if (i >= first && i <= last)
> 

Hit it pretty quick this time:

# uptime
 09:27:42 up 22 min,  2 users,  load average: 0.09, 13.38, 26.18

# /mnt/testarea/ltp/testcases/bin/genbessel 


Bus error (core dumped)

# dmesg | grep -i -e warn -e call   


[0.00] dt-cpu-ftrs: not enabling: system-call-vectored (disabled or unsupported by kernel)
[0.00] random: get_random_u64 called from cache_random_seq_create+0x98/0x1e0 with crng_init=0
[0.00] rcu: Offload RCU callbacks from CPUs: (none).
[5.312075] megaraid_sas 0031:01:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x4009
[5.357307] megaraid_sas 0031:01:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x4009
[5.485126] megaraid_sas 0031:01:00.0: megasas_enable_intr_fusion is called outbound_intr_mask:0x4000

So, extra WARN_ON_ONCE applied on top of v5.4-8836-g81b6b96475ac
did not trigger.

Is it possible for iomap code to submit multiple bio-s for same
locked page and then receive callbacks in parallel?



[bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later

2019-12-03 Thread Jan Stancek
Hi,

(This bug report is a summary of thread [1] with some additions)

User-space binaries on Power9 ppc64le (with 64k pages) on xfs
filesystem are sporadically hitting SIGBUS:

-- 8< --
(gdb) r
Starting program: /mnt/testarea/ltp/testcases/bin/genasin

Program received signal SIGBUS, Bus error.
dl_main (phdr=0x1040, phnum=, user_entry=0x7fffe760, 
auxv=) at rtld.c:1362
1362switch (ph->p_type)

(gdb) p ph
$1 = (const Elf64_Phdr *) 0x1040

(gdb) p *ph
Cannot access memory at address 0x1040

(gdb) info proc map
process 1110670
Mapped address spaces:

  Start Addr  End Addr  Size  Offset  objfile
  0x1000      0x1001    0x1   0x0     /mnt/testarea/ltp/testcases/bin/genasin
  0x1001      0x1003    0x2   0x0     /mnt/testarea/ltp/testcases/bin/genasin
  0x77f9      0x77fb    0x2   0x0     [vdso]
  0x77fb      0x77fe    0x3   0x0     /usr/lib64/ld-2.30.so
  0x77fe      0x7800    0x2   0x2     /usr/lib64/ld-2.30.so
  0x7ffd      0x8000    0x3   0x0     [stack]

(gdb) x/1x 0x1040
0x1040: Cannot access memory at address 0x1040
-- >8 --

When this happens the binary continues to hit SIGBUS until page
is released, for example by: echo 3 > /proc/sys/vm/drop_caches

The issue goes back to at least v4.19.

I can semi-reliably reproduce it (with LTP installed to /mnt/testarea/ltp) by running:
while [ True ]; do
        echo 3 > /proc/sys/vm/drop_caches
        rm -f /mnt/testarea/ltp/results/RUNTEST.log /mnt/testarea/ltp/output/RUNTEST.run.log
        ./runltp -p -d results -l RUNTEST.log -o RUNTEST.run.log -f math
        grep FAIL /mnt/testarea/ltp/results/RUNTEST.log && exit 1
done

and some stress activity in other terminal (e.g. kernel build).
Sometimes in minutes, sometimes in hours. It is not reliable
enough to get meaningful bisect results.

My theory is that there's a race in iomap. There appear to be
interleaved calls to iomap_set_range_uptodate() for same page
with varying offset and length. Each call sees bitmap as _not_
entirely "uptodate" and hence doesn't call SetPageUptodate().
Even though each bit in bitmap ends up uptodate by the time
all calls finish.

For example, with following traces:

iomap_set_range_uptodate()
...
        if (uptodate && !PageError(page))
                SetPageUptodate(page);
+
+       if (mycheck(page)) {
+               trace_printk("page: %px, iop: %px, uptodate: %d, !PageError(page): %d, flags: %lx\n", page, iop, uptodate, !PageError(page), page->flags);
+               trace_printk("first: %u, last: %u, off: %u, len: %u, i: %u\n", first, last, off, len, i);
+       }

I get:
 genacos-18471 [057]    162.465730: iomap_readpages: mapping: c03f185a1ab0
 genacos-18471 [057]    162.465732: iomap_page_create: iomap_page_create page: c00c0fe26180, page->private: , iop: c03fc70a19c0, flags: 380001
 genacos-18471 [057]    162.465736: iomap_set_range_uptodate: page: c00c0fe26180, iop: c03fc70a19c0, uptodate: 0, !PageError(page): 1, flags: 382001
 genacos-18471 [057]    162.465736: iomap_set_range_uptodate: first: 1, last: 14, off: 4096, len: 57344, i: 16
  -0 [060] ..s.   162.534862: iomap_set_range_uptodate: page: c00c0fe26180, iop: c03fc70a19c0, uptodate: 0, !PageError(page): 1, flags: 382081
  -0 [061] ..s.   162.534862: iomap_set_range_uptodate: page: c00c0fe26180, iop: c03fc70a19c0, uptodate: 0, !PageError(page): 1, flags: 382081
  -0 [060] ..s.   162.534864: iomap_set_range_uptodate: first: 0, last: 0, off: 0, len: 4096, i: 16
  -0 [061] ..s.   162.534864: iomap_set_range_uptodate: first: 15, last: 15, off: 61440, len: 4096, i: 16

This page doesn't have Uptodate flag set, which leads to filemap_fault()
returning VM_FAULT_SIGBUS:

crash> p/x ((struct page *) 0xc00c0fe26180)->flags
$1 = 0x382032

crash> kmem -g 0x382032
FLAGS: 382032
  PAGE-FLAG      BIT  VALUE
  PG_error         1  002
  PG_dirty         4  010
  PG_lru           5  020
  PG_private_2    13  0002000
  PG_fscache      13  0002000
  PG_savepinned    4  010
  PG_double_map   13  0002000

But iomap_page->uptodate in page->private suggests all bits are uptodate:

crash> p/x ((struct page *) 0xc00c0fe26180)->private
$2 = 0xc03fc70a19c0

crash> p/x ((struct iomap_page *) 0xc03fc70a19c0)->uptodate
$3 = {0x, 0x0}


It appears (after ~4 hours) that I 

Re: ❌ FAIL: Test report for kernel 5.3.13-3b5f971.cki (stable-queue)

2019-12-02 Thread Jan Stancek



- Original Message -
> Hi Jan,
> 
> Jan Stancek  writes:
> > - Original Message -
> >> 
> >> Hello,
> >> 
> >> We ran automated tests on a recent commit from this kernel tree:
> >> 
> >>Kernel repo:
> >>
> >> git://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git
> >> Commit: 3b5f97139acc - KVM: PPC: Book3S HV: Flush link stack
> >> on
> >> guest exit to host kernel
> 
> I can't find this commit, I assume it's roughly the same as:
> 
>   
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/commit/?h=linux-5.3.y&id=0815f75f90178bc7e1933cf0d0c818b5f3f5a20c

Hi,

yes, that looks like same one:
  
https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/commit/?h=3b5f97139acc

Looking at CKI reports for past 2 weeks, there were 3 (unexplained) SIGBUS related failures:

5.3.13-3b5f971.cki@upstream-stable
LTP genpower Bus error

5.4.0-rc8-4b17a56.cki@upstream-stable
LTP genatan Bus error

5.3.11-200.fc30
xfstests
+/var/lib/xfstests/tests/generic/248: line 38: 161943 Bus error   (core dumped) $TEST_PROG $TESTFILE

All 3 are from ppc64le, all power9 systems.

> 
> >> The results of these automated tests are provided below.
> >> 
> >> Overall result: FAILED (see details below)
> >>  Merge: OK
> >>Compile: OK
> >>  Tests: FAILED
> >> 
> >> All kernel binaries, config files, and logs are available for download
> >> here:
> >> 
> >>   https://artifacts.cki-project.org/pipelines/314344
> >> 
> >> One or more kernel tests failed:
> >> 
> >> ppc64le:
> >>  ❌ LTP
> >
> > I suspect kernel bug.
> 
> Looks that way, but I can't reproduce it on a machine here.
> 
> I have the same CPU revision and am booting the exact kernel binary &
> modules linked above.

I can semi-reliably reproduce it with:
(where LTP is installed to /mnt/testarea/ltp)

while [ True ]; do
        echo 3 > /proc/sys/vm/drop_caches
        rm -f /mnt/testarea/ltp/results/RUNTEST.log /mnt/testarea/ltp/output/RUNTEST.run.log
        ./runltp -p -d results -l RUNTEST.log -o RUNTEST.run.log -f math
        grep FAIL /mnt/testarea/ltp/results/RUNTEST.log && exit 1
done

and some stress activity in another terminal (e.g. a kernel build).
Sometimes it takes minutes, sometimes hours. I tried a couple of
older kernels and could reproduce it with v4.19 and v5.0 as well.

v4.18 ran OK for 2 hours. Assuming that one is good, it could be
related to xfs switching to iomap in 4.19-rc1.

Tracing so far led me to filemap_fault(), where it reached this -EIO,
before returning SIGBUS.

page_not_uptodate:
	/*
	 * Umm, take care of errors if the page isn't up-to-date.
	 * Try to re-read it _once_. We do this synchronously,
	 * because there really aren't any performance issues here
	 * and we need to check for errors.
	 */
	ClearPageError(page);
	fpin = maybe_unlock_mmap_for_io(vmf, fpin);
	error = mapping->a_ops->readpage(file, page);
	if (!error) {
		wait_on_page_locked(page);
		if (!PageUptodate(page))
			error = -EIO;
	}

	...
	return VM_FAULT_SIGBUS;

> 
> > There were couple of 'math' runtest related failures in recent couple days.
> > In all cases, some data file used by test was missing. Presumably because
> > binary that generates it crashed.
> >
> > I managed to reproduce one failure with this CKI build, which I believe
> > is the same problem.
> >
> > We crash early during load, before any LTP code runs:
> >
> > (gdb) r
> > Starting program: /mnt/testarea/ltp/testcases/bin/genasin
> 
> What is this /mnt/testarea? Looks like it's setup by some of the beaker
> scripts or something?

Correct, it's where beaker script installs LTP. It's not a real mount,
just a directory on /. In my case it's xfs. It should match default
Fedora-31 Server ppc64le installation.

> 
> I'm running LTP out of /home, which is ext4 directly on disk.
> 
> I tried getting the tests-beaker stuff working on my machine, but I
> couldn't find all the libraries and so on it requires.
> 
> 
> > Program received signal SIGBUS, Bus error.
> > dl_main (phdr=0x1040, phnum=, user_entry=0x7fffe760,
> > auxv=) at rtld.c:1362
> > 1362switch (ph->p_type)
> > (gdb) bt
> > #0  dl_main (phdr=0x1040, phnum=,
> > user_entry=0x7fffe760, auxv=) at rtld.c:1362
> > #1  0x77fcf3c8 in _dl_sysdep_start (start_argptr=,
&

Re: [bug] KVM: Unrecoverable TM Unavailable Exception f60

2017-07-14 Thread Jan Stancek


- Original Message -
> On Thu, 2017-07-13 at 08:07 -0400, Jan Stancek wrote:
> 
> (You may want to CC the patch author... Added Paul).

I did CC him using email address from patch. Maybe some list
de-duplication dropped it?

> 
> > - Original Message -
> > > Hi,
> > > 
> > > I'm running into Oops below on IBM PowerNV system (model 8247-22L)
> > > with 4.12 trees and qemu-kvm-2.9. It triggers quickly after I start
> > > KVM guest installation:
> > > 
> > > virt-install  --name ppc64le_kvm_1cpu --mac 52:56:00:00:00:06 --location
> > > nfs://XXX --ram=1024 --vcpus=1 --file-size=20 --hvm --nonsparse --debug
> > > --nographics --noautoconsole --wait -1 --prompt --accelerate
> > > --os-variant=virtio26 --network bridge:br3,model=virtio --serial pty
> > > --console pty --file
> > > /home/virtimages/VirtualMachines/ppc64le_kvm_1cpu.img
> > > --extra-args "serial console=tty0 console=hvc0" --noreboot
> > > 
> > > # git describe
> > > v4.12-10985-g4ca6df1
> > > 
> > > # cat /proc/cpuinfo | head
> > > processor : 0
> > > cpu   : POWER8E (raw), altivec supported
> > > clock : 3325.00MHz
> > > revision  : 2.1 (pvr 004b 0201)
> > > 
> > > 4.11 works OK
> > > 4.11 with these 4 patches applied panics in same way as latest HEAD
> > > (v4.12-10985-g4ca6df1)
> > >   KVM: PPC: Book3S HV: Save/restore host values of debug registers
> > >   KVM: PPC: Book3S HV: Preserve userspace HTM state properly
> > >   KVM: PPC: Book3S HV: Restore critical SPRs to host values on guest exit
> > >   KVM: PPC: Book3S HV: Context-switch EBB registers properly
> > 
> > Bisect on
> > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> > tree identified this as first BAD patch:
> > 
> >   commit 46a704f8409f79fd66567ad3f8a7304830a84293
> >   Author: Paul Mackerras <pau...@ozlabs.org>
> >   Date:   Thu Jun 15 16:10:27 2017 +1000
> > KVM: PPC: Book3S HV: Preserve userspace HTM state properly
> > 
> > Regards,
> > Jan
> > 
> > > 
> > > ---
> > > 
> > > [  181.328511] Unrecoverable TM Unavailable Exception f60 at
> > > d0001e7d9980
> > > [  181.328605] Oops: Unrecoverable TM Unavailable Exception, sig: 6 [#1]
> > > [  181.328613] SMP NR_CPUS=2048
> > > [  181.328613] NUMA
> > > [  181.328618] PowerNV
> > > [  181.328646] Modules linked in: vhost_net vhost tap
> > > nfs_layout_nfsv41_files
> > > rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xt_CHECKSUM iptable_mangle
> > > ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat
> > > nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT
> > > nf_reject_ipv4 tun ebtable_filter ebtables ip6table_filter ip6_tables
> > > iptable_filter bridge stp llc kvm_hv kvm nfsd ses enclosure
> > > scsi_transport_sas ghash_generic auth_rpcgss gf128mul xts sg ctr nfs_acl
> > > lockd vmx_crypto shpchp ipmi_powernv i2c_opal grace ipmi_devintf i2c_core
> > > powernv_rng sunrpc ipmi_msghandler ibmpowernv uio_pdrv_genirq uio
> > > leds_powernv powernv_op_panel ip_tables xfs sd_mod lpfc ipr bnx2x libata
> > > mdio ptp pps_core scsi_transport_fc libcrc32c dm_mirror dm_region_hash
> > > dm_log dm_mod
> > > [  181.329278] CPU: 40 PID: 9926 Comm: CPU 0/KVM Not tainted 4.12.0+ #1
> > > [  181.329337] task: c03fc698 task.stack: c03fe4d8
> > > [  181.329396] NIP: d0001e7d9980 LR: d0001e77381c CTR:
> > > d0001e7d98f0
> > > [  181.329465] REGS: c03fe4d837e0 TRAP: 0f60   Not tainted  (4.12.0+)
> > > [  181.329523] MSR: 90009033 <SF,HV,EE,ME,IR,DR,RI,LE>
> > > [  181.329527]   CR: 24022448  XER: 
> > > [  181.329608] CFAR: d0001e773818 SOFTE: 1
> > > [  181.329608] GPR00: d0001e77381c c03fe4d83a60 d0001e7ef410
> > > c03fdcfe
> > > [  181.329608] GPR04: c03fe4f0  
> > > c03fd7954800
> > > [  181.329608] GPR08: 0001 c03fc698 
> > > d0001e7e2880
> > > [  181.329608] GPR12: d0001e7d98f0 c7b19000 0001295220e0
> > > 7fffc0ce2090
> > > [  181.329608] GPR16: 010011886608 7fff8c89f260 0001
> > > 7fff8c080028
> > > [  181.329608] GPR20:  0100118500a6 01001185
>

Re: [bug] KVM: Unrecoverable TM Unavailable Exception f60

2017-07-13 Thread Jan Stancek


- Original Message -
> Hi,
> 
> I'm running into Oops below on IBM PowerNV system (model 8247-22L)
> with 4.12 trees and qemu-kvm-2.9. It triggers quickly after I start
> KVM guest installation:
> 
> virt-install  --name ppc64le_kvm_1cpu --mac 52:56:00:00:00:06 --location
> nfs://XXX --ram=1024 --vcpus=1 --file-size=20 --hvm --nonsparse --debug
> --nographics --noautoconsole --wait -1 --prompt --accelerate
> --os-variant=virtio26 --network bridge:br3,model=virtio --serial pty
> --console pty --file /home/virtimages/VirtualMachines/ppc64le_kvm_1cpu.img
> --extra-args "serial console=tty0 console=hvc0" --noreboot
> 
> # git describe
> v4.12-10985-g4ca6df1
> 
> # cat /proc/cpuinfo | head
> processor : 0
> cpu   : POWER8E (raw), altivec supported
> clock : 3325.00MHz
> revision  : 2.1 (pvr 004b 0201)
> 
> 4.11 works OK
> 4.11 with these 4 patches applied panics in same way as latest HEAD
> (v4.12-10985-g4ca6df1)
>   KVM: PPC: Book3S HV: Save/restore host values of debug registers
>   KVM: PPC: Book3S HV: Preserve userspace HTM state properly
>   KVM: PPC: Book3S HV: Restore critical SPRs to host values on guest exit
>   KVM: PPC: Book3S HV: Context-switch EBB registers properly

Bisect on git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
tree identified this as first BAD patch:

  commit 46a704f8409f79fd66567ad3f8a7304830a84293
  Author: Paul Mackerras 
  Date:   Thu Jun 15 16:10:27 2017 +1000
KVM: PPC: Book3S HV: Preserve userspace HTM state properly

Regards,
Jan

> 
> ---
> 
> [  181.328511] Unrecoverable TM Unavailable Exception f60 at d0001e7d9980
> [  181.328605] Oops: Unrecoverable TM Unavailable Exception, sig: 6 [#1]
> [  181.328613] SMP NR_CPUS=2048
> [  181.328613] NUMA
> [  181.328618] PowerNV
> [  181.328646] Modules linked in: vhost_net vhost tap nfs_layout_nfsv41_files
> rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xt_CHECKSUM iptable_mangle
> ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat
> nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT
> nf_reject_ipv4 tun ebtable_filter ebtables ip6table_filter ip6_tables
> iptable_filter bridge stp llc kvm_hv kvm nfsd ses enclosure
> scsi_transport_sas ghash_generic auth_rpcgss gf128mul xts sg ctr nfs_acl
> lockd vmx_crypto shpchp ipmi_powernv i2c_opal grace ipmi_devintf i2c_core
> powernv_rng sunrpc ipmi_msghandler ibmpowernv uio_pdrv_genirq uio
> leds_powernv powernv_op_panel ip_tables xfs sd_mod lpfc ipr bnx2x libata
> mdio ptp pps_core scsi_transport_fc libcrc32c dm_mirror dm_region_hash
> dm_log dm_mod
> [  181.329278] CPU: 40 PID: 9926 Comm: CPU 0/KVM Not tainted 4.12.0+ #1
> [  181.329337] task: c03fc698 task.stack: c03fe4d8
> [  181.329396] NIP: d0001e7d9980 LR: d0001e77381c CTR:
> d0001e7d98f0
> [  181.329465] REGS: c03fe4d837e0 TRAP: 0f60   Not tainted  (4.12.0+)
> [  181.329523] MSR: 90009033 
> [  181.329527]   CR: 24022448  XER: 
> [  181.329608] CFAR: d0001e773818 SOFTE: 1
> [  181.329608] GPR00: d0001e77381c c03fe4d83a60 d0001e7ef410
> c03fdcfe
> [  181.329608] GPR04: c03fe4f0  
> c03fd7954800
> [  181.329608] GPR08: 0001 c03fc698 
> d0001e7e2880
> [  181.329608] GPR12: d0001e7d98f0 c7b19000 0001295220e0
> 7fffc0ce2090
> [  181.329608] GPR16: 010011886608 7fff8c89f260 0001
> 7fff8c080028
> [  181.329608] GPR20:  0100118500a6 01001185
> 01001185
> [  181.329608] GPR24: 7fffc0ce1b48 01001185 d673b901
> 
> [  181.329608] GPR28:  c03fdcfe c03fdcfe
> c03fe4f0
> [  181.330199] NIP [d0001e7d9980] kvmppc_vcpu_run_hv+0x90/0x6b0 [kvm_hv]
> [  181.330264] LR [d0001e77381c] kvmppc_vcpu_run+0x2c/0x40 [kvm]
> [  181.330322] Call Trace:
> [  181.330351] [c03fe4d83a60] [d0001e773478]
> kvmppc_set_one_reg+0x48/0x340 [kvm] (unreliable)
> [  181.330437] [c03fe4d83b30] [d0001e77381c]
> kvmppc_vcpu_run+0x2c/0x40 [kvm]
> [  181.330513] [c03fe4d83b50] [d0001e7700b4]
> kvm_arch_vcpu_ioctl_run+0x114/0x2a0 [kvm]
> [  181.330586] [c03fe4d83bd0] [d0001e7642f8]
> kvm_vcpu_ioctl+0x598/0x7a0 [kvm]
> [  181.330658] [c03fe4d83d40] [c03451b8] do_vfs_ioctl+0xc8/0x8b0
> [  181.330717] [c03fe4d83de0] [c0345a64] SyS_ioctl+0xc4/0x120
> [  181.330776] [c03fe4d83e30] [c000b004] system_call+0x58/0x6c
> [  181.330833] Instruction dump:
> [  181.330869] e92d0260 e9290b50 e9290108 792807e3 41820058 e92d0260 e9290b50
> e9290108
> [  181.330941] 792ae8a4 794a1f87 408204f4 e92d0260 <7d4022a6> f9490ff0
> e92d0260 7d4122a6
> [  181.331013] ---[ end trace 6f6ddeb4bfe92a92 ]---
> [  181.334574]
> [  183.334758] Kernel panic - not syncing: Fatal exception
> [  

[bug] KVM: Unrecoverable TM Unavailable Exception f60

2017-07-13 Thread Jan Stancek
Hi,

I'm running into the Oops below on an IBM PowerNV system (model 8247-22L)
with 4.12 trees and qemu-kvm-2.9. It triggers quickly after I start
a KVM guest installation:

virt-install  --name ppc64le_kvm_1cpu --mac 52:56:00:00:00:06 --location 
nfs://XXX --ram=1024 --vcpus=1 --file-size=20 --hvm --nonsparse --debug 
--nographics --noautoconsole --wait -1 --prompt --accelerate 
--os-variant=virtio26 --network bridge:br3,model=virtio --serial pty --console 
pty --file /home/virtimages/VirtualMachines/ppc64le_kvm_1cpu.img --extra-args 
"serial console=tty0 console=hvc0" --noreboot

# git describe
v4.12-10985-g4ca6df1

# cat /proc/cpuinfo | head
processor   : 0
cpu : POWER8E (raw), altivec supported
clock   : 3325.00MHz
revision: 2.1 (pvr 004b 0201)

4.11 works OK.
4.11 with these 4 patches applied panics in the same way as the latest HEAD
(v4.12-10985-g4ca6df1):
  KVM: PPC: Book3S HV: Save/restore host values of debug registers
  KVM: PPC: Book3S HV: Preserve userspace HTM state properly
  KVM: PPC: Book3S HV: Restore critical SPRs to host values on guest exit
  KVM: PPC: Book3S HV: Context-switch EBB registers properly

---

[  181.328511] Unrecoverable TM Unavailable Exception f60 at d0001e7d9980
[  181.328605] Oops: Unrecoverable TM Unavailable Exception, sig: 6 [#1]
[  181.328613] SMP NR_CPUS=2048 
[  181.328613] NUMA 
[  181.328618] PowerNV
[  181.328646] Modules linked in: vhost_net vhost tap nfs_layout_nfsv41_files 
rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xt_CHECKSUM iptable_mangle 
ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat 
nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT 
nf_reject_ipv4 tun ebtable_filter ebtables ip6table_filter ip6_tables 
iptable_filter bridge stp llc kvm_hv kvm nfsd ses enclosure scsi_transport_sas 
ghash_generic auth_rpcgss gf128mul xts sg ctr nfs_acl lockd vmx_crypto shpchp 
ipmi_powernv i2c_opal grace ipmi_devintf i2c_core powernv_rng sunrpc 
ipmi_msghandler ibmpowernv uio_pdrv_genirq uio leds_powernv powernv_op_panel 
ip_tables xfs sd_mod lpfc ipr bnx2x libata mdio ptp pps_core scsi_transport_fc 
libcrc32c dm_mirror dm_region_hash dm_log dm_mod
[  181.329278] CPU: 40 PID: 9926 Comm: CPU 0/KVM Not tainted 4.12.0+ #1
[  181.329337] task: c03fc698 task.stack: c03fe4d8
[  181.329396] NIP: d0001e7d9980 LR: d0001e77381c CTR: d0001e7d98f0
[  181.329465] REGS: c03fe4d837e0 TRAP: 0f60   Not tainted  (4.12.0+)
[  181.329523] MSR: 90009033 
[  181.329527]   CR: 24022448  XER: 
[  181.329608] CFAR: d0001e773818 SOFTE: 1 
[  181.329608] GPR00: d0001e77381c c03fe4d83a60 d0001e7ef410 
c03fdcfe 
[  181.329608] GPR04: c03fe4f0   
c03fd7954800 
[  181.329608] GPR08: 0001 c03fc698  
d0001e7e2880 
[  181.329608] GPR12: d0001e7d98f0 c7b19000 0001295220e0 
7fffc0ce2090 
[  181.329608] GPR16: 010011886608 7fff8c89f260 0001 
7fff8c080028 
[  181.329608] GPR20:  0100118500a6 01001185 
01001185 
[  181.329608] GPR24: 7fffc0ce1b48 01001185 d673b901 
 
[  181.329608] GPR28:  c03fdcfe c03fdcfe 
c03fe4f0 
[  181.330199] NIP [d0001e7d9980] kvmppc_vcpu_run_hv+0x90/0x6b0 [kvm_hv]
[  181.330264] LR [d0001e77381c] kvmppc_vcpu_run+0x2c/0x40 [kvm]
[  181.330322] Call Trace:
[  181.330351] [c03fe4d83a60] [d0001e773478] 
kvmppc_set_one_reg+0x48/0x340 [kvm] (unreliable)
[  181.330437] [c03fe4d83b30] [d0001e77381c] kvmppc_vcpu_run+0x2c/0x40 
[kvm]
[  181.330513] [c03fe4d83b50] [d0001e7700b4] 
kvm_arch_vcpu_ioctl_run+0x114/0x2a0 [kvm]
[  181.330586] [c03fe4d83bd0] [d0001e7642f8] kvm_vcpu_ioctl+0x598/0x7a0 
[kvm]
[  181.330658] [c03fe4d83d40] [c03451b8] do_vfs_ioctl+0xc8/0x8b0
[  181.330717] [c03fe4d83de0] [c0345a64] SyS_ioctl+0xc4/0x120
[  181.330776] [c03fe4d83e30] [c000b004] system_call+0x58/0x6c
[  181.330833] Instruction dump:
[  181.330869] e92d0260 e9290b50 e9290108 792807e3 41820058 e92d0260 e9290b50 
e9290108 
[  181.330941] 792ae8a4 794a1f87 408204f4 e92d0260 <7d4022a6> f9490ff0 e92d0260 
7d4122a6 
[  181.331013] ---[ end trace 6f6ddeb4bfe92a92 ]---
[  181.334574] 
[  183.334758] Kernel panic - not syncing: Fatal exception
[  183.338352] Rebooting in 10 seconds..

---

Regards,
Jan


[bug] stack protector panics on v4.10-rc1+

2017-01-23 Thread Jan Stancek
Hi,

I'm running into panics with stack protector enabled on a ppc64le
lpar (IBM,8408-E8E), starting with:

commit 6533b7c16ee5712041b4e324100550e02a9a5dda
Author: Christophe Leroy 
Date:   Tue Nov 22 11:49:30 2016 +0100
powerpc: Initial stack protector (-fstack-protector) support

CONFIG_HAVE_CC_STACKPROTECTOR=y
CONFIG_CC_STACKPROTECTOR=y
# CONFIG_CC_STACKPROTECTOR_NONE is not set
# CONFIG_CC_STACKPROTECTOR_REGULAR is not set
CONFIG_CC_STACKPROTECTOR_STRONG=y

For example (it crashes at various places):
[1.028466] systemd[1]: Set hostname to . 
[1.036105] Kernel panic - not syncing: stack-protector: Kernel stack is 
corrupted in: c0ad2250 
[1.036105]  
[1.036124] CPU: 5 PID: 168 Comm: dracut-rootfs-g Tainted: GW   
4.0.0+ #11 
[1.036131] Call Trace: 
[1.036141] [c000fe113a80] [c0af13e8] dump_stack+0xa0/0xdc 
(unreliable) 
[1.036153] [c000fe113ab0] [c0ae5138] panic+0x110/0x2bc 
[1.036163] [c000fe113b40] [c00dd664] __stack_chk_fail+0x24/0x30 
[1.036172] [c000fe113ba0] [c0ad2250] 
wait_for_completion+0x190/0x1a0 
[1.036182] [c000fe113c20] [c0221920] stop_one_cpu+0x110/0x1b0 
[1.036191] [c000fe113d00] [c0134a58] sched_exec+0xf8/0x180 
[1.036200] [c000fe113d60] [c03b0f74] SyS_execve+0x414/0xb10 
[1.036210] [c000fe113e30] [c0009308] system_call+0x38/0xb4 
[1.052902] Rebooting in 10 seconds.. 

I tried applying this commit to older kernels, and every kernel I tried, going
back as far as 3.10, panicked early during boot on stack corruption.
I tried gcc-4.8.5-11.el7 and Fedora 25's gcc-6.3.1-1.fc25, with the same result.

(gdb) disassemble wait_for_completion
Dump of assembler code for function wait_for_completion:
...
   0xc0c6642c <+140>:   ld  r9,-28688(r13)
   0xc0c66430 <+144>:   xor.r8,r8,r9
   0xc0c66434 <+148>:   li  r9,0
   0xc0c66438 <+152>:   bne-0xc0c665d8 

...
   0xc0c665d8 <+568>:   bl  0xc00f5c68 <__stack_chk_fail+8>

I came across the following gcc commit:
  https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=0d55f4d0aeaeb16629a2c07c96a190695b83a7e6
which mentions the offset above:
  "If TARGET_THREAD_SSP_OFFSET is defined, use -0x7010(13) resp.
   -0x7008(2) instead of reading __stack_chk_guard variable."

It looks like it's not reading the canary value from the __stack_chk_guard
variable. At the moment I'm not sure where -28688(r13) falls in the ppc kernel
(somewhere near the paca struct?).
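
As a quick sanity check on the numbers above: the displacement in
"ld r9,-28688(r13)" is the same value as the -0x7010 offset named in the gcc
commit message. A minimal C sketch of that arithmetic follows; the macro and
function names are hypothetical and exist only for this illustration.

```c
#include <stdio.h>

/* Offset quoted in the gcc commit message: -0x7010 relative to r13. */
#define GCC_SSP_OFFSET_R13 (-0x7010L)

/* Hypothetical helper: does a load displacement match that offset? */
static int is_gcc_ssp_offset(long displacement)
{
	return displacement == GCC_SSP_OFFSET_R13;
}

/* Print a (negative) displacement in decimal and hex, and whether it matches. */
static void print_ssp_offset(long displacement)
{
	printf("%ld == -0x%lx, matches gcc SSP offset: %s\n",
	       displacement, (unsigned long)-displacement,
	       is_gcc_ssp_offset(displacement) ? "yes" : "no");
}
```

Calling print_ssp_offset(-28688) from a test program reports a match, since
0x7010 = 7 * 4096 + 16 = 28688. That is consistent with the compiler loading
the canary from a fixed r13-relative slot rather than from __stack_chk_guard.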

Is anyone else seeing these panics?

Regards,
Jan


Re: ehea crash on boot

2016-10-11 Thread Jan Stancek




- Original Message -
> From: "Michael Ellerman" <m...@ellerman.id.au>
> To: "Jan Stancek" <jstan...@redhat.com>, "Denis Kirjanov" 
> <k...@linux-powerpc.org>
> Cc: linuxppc-dev@lists.ozlabs.org
> Sent: Tuesday, 11 October, 2016 7:46:31 AM
> Subject: Re: ehea crash on boot
> 
> Jan Stancek <jstan...@redhat.com> writes:
> 
> > Hi Denis / all,
> >
> > Do you know if there is a patch or lead for this problem? I seem
> > to be hitting the same Oops on a P730 lpar when running 4.8 (see below),
> > but 4.7.7 looks OK.
> 
> Does this fix it?

Yes, it does. dmesg looks clean and network is up.

Regards,
Jan

> 
> cheers
> 
> 
> diff --git a/arch/powerpc/mm/hash_utils_64.c
> b/arch/powerpc/mm/hash_utils_64.c
> index 4cebc31e53de..4e83d872872d 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -526,7 +526,7 @@ static bool might_have_hea(void)
>*/
>  #ifdef CONFIG_IBMEBUS
>   return !cpu_has_feature(CPU_FTR_ARCH_207S) &&
> - !firmware_has_feature(FW_FEATURE_SPLPAR);
> + firmware_has_feature(FW_FEATURE_SPLPAR);
>  #else
>   return false;
>  #endif
> 


Re: ehea crash on boot

2016-10-10 Thread Jan Stancek
Hi Denis / all,

Do you know if there is a patch or lead for this problem? I seem
to be hitting the same Oops on a P730 lpar when running 4.8 (see below),
but 4.7.7 looks OK.

Regards,
Jan

[8.698424] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready 
[8.713373] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready 
[8.713940] mm: Hashing failure ! EA=0xd80080004040 
access=0x800e current=NetworkManager 
[8.713949] trap=0x300 vsid=0x13d349c ssize=1 base psize=2 psize 2 
pte=0xc0003cc033e701ae 
[8.713958] mm: Hashing failure ! EA=0xd80080004040 
access=0x800e current=NetworkManager 
[8.713966] trap=0x300 vsid=0x13d349c ssize=1 base psize=2 psize 2 
pte=0xc0003cc033e701ae 
[8.713979] Unable to handle kernel paging request for data at address 
0xd80080004040 
[8.713985] Faulting instruction address: 0xd11cc250 
[8.713992] Oops: Kernel access of bad area, sig: 7 [#1] 
[8.713996] SMP NR_CPUS=2048 NUMA pSeries 
[8.714008] Modules linked in: sg uio_pdrv_genirq uio nfsd auth_rpcgss 
nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod ibmvscsi 
scsi_transport_srp ibmveth ehea dm_mirror dm_region_hash dm_log dm_mod 
[8.714063] CPU: 2 PID: 1148 Comm: NetworkManager Not tainted 
4.8.0-1.el7.test.ppc64.debug #1 
[8.714072] task: c65e2080 task.stack: c6668000 
[8.714078] NIP: d11cc250 LR: d11cc118 CTR: 0042c120 
[8.714086] REGS: c666ab00 TRAP: 0300   Not tainted  
(4.8.0-1.el7.test.ppc64.debug) 
[8.714092] MSR: 80009032   CR: 24288442  XER: 
0020 
[8.714120] CFAR: c00087d0 DAR: d80080004040 DSISR: 4200 
SOFTE: 1  
GPR00: d11cc118 c666ad80 d11dbdd8 c6327f80  
GPR04:  c000b080 00029000 00028000  
GPR08: c000b080  d80080004000 d953  
GPR12: 8001 cea61200    
GPR16: 07fe  0001   
GPR20: c000b53ecbd0 c000b53ecb00 c000b53ec1e8 c000b53ec1d0  
GPR24: c000b53ec1b8 c000b53ec200  0015  
GPR28: 09fd c000bbb59418 0028 c6327f80  
[8.714254] NIP [d11cc250] .ehea_create_cq+0x280/0x340 [ehea] 
[8.714263] LR [d11cc118] .ehea_create_cq+0x148/0x340 [ehea] 
[8.714270] Call Trace: 
[8.714278] [c666ad80] [d11cc118] 
.ehea_create_cq+0x148/0x340 [ehea] (unreliable) 
[8.714292] [c666ae30] [d11c5e28] .ehea_up+0x258/0x1200 
[ehea] 
[8.714304] [c666afa0] [d11c6e14] .ehea_open+0x44/0x1a0 
[ehea] 
[8.714316] [c666b030] [c09bc4c4] .__dev_open+0x164/0x310 
[8.714328] [c666b0d0] [c09c6998] 
.__dev_change_flags+0x158/0x4f0 
[8.714339] [c666b180] [c09c6d5c] 
.dev_change_flags+0x2c/0x220 
[8.714349] [c666b220] [c09e2d3c] .do_setlink+0x38c/0xef0 
[8.714359] [c666b3a0] [c09e65cc] .rtnl_newlink+0x97c/0xb10 
[8.714369] [c666b6b0] [c09e4ae4] 
.rtnetlink_rcv_msg+0xc4/0x380 
[8.714379] [c666b7a0] [c0a1c05c] 
.netlink_rcv_skb+0x12c/0x150 
[8.714388] [c666b830] [c09e1b68] .rtnetlink_rcv+0x38/0x60 
[8.714396] [c666b8b0] [c0a1bb74] 
.netlink_unicast+0x554/0x6b0 
[8.714405] [c666b990] [c0a1cbcc] 
.netlink_sendmsg+0x41c/0x490 
[8.714415] [c666ba70] [c0986e18] 
.___sys_sendmsg+0x278/0x370 
[8.714425] [c666bc50] [c09892d4] .SyS_sendmsg+0xc4/0x130 
[8.714436] [c666bd50] [c098a180] 
.SyS_socketcall+0x3d0/0x4e0 
[8.714448] [c666be30] [c0009590] system_call+0x38/0xec 
[8.714455] Instruction dump: 
[8.714462] 38a1 4bffe7fd 6000 7fe3fb78 48003081 e8410028 3860 
4830  
[8.714484] e95f0038 3920 7fe3fb78 f93f0010  3920 79290004 
e95f0038  
[8.714506] ---[ end trace fe4fbc224578dd0c ]--- 


Re: [bug] crypto/vmx/p8_ghash memory corruption in 4.8-rc7

2016-09-28 Thread Jan Stancek
> Jan,
> 
> Can you check if the problem occurs with this patch?

No issues in over-night test with this patch.

> --- a/drivers/crypto/vmx/vmx.c
> +++ b/drivers/crypto/vmx/vmx.c
> @@ -28,6 +28,8 @@
>  #include 
>  #include 
>  
> +int p8_ghash_fallback_descsize(void);
> +
>  extern struct shash_alg p8_ghash_alg;
>  extern struct crypto_alg p8_aes_alg;
>  extern struct crypto_alg p8_aes_cbc_alg;
> @@ -45,6 +47,7 @@ int __init p8_init(void)
>  {
>   int ret = 0;
>   struct crypto_alg **alg_it;
> + int ghash_descsize;
>  
>   for (alg_it = algs; *alg_it; alg_it++) {
>   ret = crypto_register_alg(*alg_it);
> @@ -59,6 +62,12 @@ int __init p8_init(void)
>   if (ret)
>   return ret;
>  
> + ghash_descsize = p8_ghash_fallback_descsize();
> + if (ghash_descsize < 0) {
> + printk(KERN_ERR "Cannot get descsize for p8_ghash fallback\n");
> + return ghash_descsize;
> + }
> + p8_ghash_alg.descsize += ghash_descsize;
>   ret = crypto_register_shash(&p8_ghash_alg);
>   if (ret) {
>   for (alg_it = algs; *alg_it; alg_it++)

I'd suggest moving this into a new function in vmx/ghash.c, so that all
p8_ghash-related code is in a single place. p8_init() would then just call
the new function:
-ret = crypto_register_shash(&p8_ghash_alg);
+ret = register_p8_ghash();

Regards,
Jan


Re: [bug] crypto/vmx/p8_ghash memory corruption in 4.8-rc7

2016-09-28 Thread Jan Stancek




- Original Message -
> From: "Herbert Xu" <herb...@gondor.apana.org.au>
> To: "Marcelo Cerri" <marcelo.ce...@canonical.com>
> Cc: "Jan Stancek" <jstan...@redhat.com>, "rui y wang" <rui.y.w...@intel.com>, 
> mhce...@linux.vnet.ibm.com,
> leosi...@linux.vnet.ibm.com, pfsmor...@linux.vnet.ibm.com, 
> linux-cry...@vger.kernel.org,
> linuxppc-dev@lists.ozlabs.org, linux-ker...@vger.kernel.org
> Sent: Wednesday, 28 September, 2016 4:45:49 AM
> Subject: Re: [bug] crypto/vmx/p8_ghash memory corruption in 4.8-rc7
> 
> On Tue, Sep 27, 2016 at 04:46:44PM -0300, Marcelo Cerri wrote:
> > 
> > Can you check if the problem occurs with this patch?
> 
> In light of the fact that padlock-sha is the correct example
> to follow, you only need to add one line to the init_tfm function
> to update the descsize based on that of the fallback.

Thanks for clearing up how this works in padlock-sha, but
we are not in exactly the same situation with p8_ghash.

p8_ghash_init_tfm() already updates descsize. The problem in the original
report is that, without a custom export/import/statesize, p8_ghash_alg.statesize
gets initialized by shash_prepare_alg() to alg->descsize:

crash> p p8_ghash_alg.statesize
$1 = 56

testmgr allocates space for export based on crypto_shash_statesize(),
but shash_default_export() writes based on crypto_shash_descsize():
[8.297902] state: c004b873aa80, statesize: 56
[8.297932] shash_default_export memcpy c004b873aa80 c004b8607da0, 
len: 76

so I think we need either:
1) make sure p8_ghash_alg.descsize is correct before we register shash,
   this is what Marcelo's last patch is doing
2) provide custom export/import/statesize for p8_ghash_alg
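
To make the size mismatch concrete, here is a small userspace model of it. It
is only a sketch: the struct and function names are hypothetical, and the
defaulting rule is taken from the description above (statesize falls back to
the descsize known at registration when no custom export is supplied), not
copied from the kernel sources.

```c
#include <stddef.h>

/* Userspace model of the two size fields discussed above. */
struct model_shash_alg {
	size_t descsize;   /* context size declared by the driver at registration */
	size_t statesize;  /* size callers allocate for export() */
	int has_export;    /* nonzero if the driver supplies its own export */
};

/* Defaulting as described above: without a custom export, statesize is
 * taken from the descsize known at registration time. */
static void model_prepare_alg(struct model_shash_alg *alg)
{
	if (!alg->has_export)
		alg->statesize = alg->descsize;
}

/* Bytes a default export would write past the caller's buffer, given the
 * descsize in effect at run time (driver context plus fallback context). */
static size_t model_export_overflow(const struct model_shash_alg *alg,
				    size_t runtime_descsize)
{
	return runtime_descsize > alg->statesize ?
	       runtime_descsize - alg->statesize : 0;
}
```

With the numbers from the report (56 at registration, 76 at run time) the
model gives a 20-byte overflow. Registering with descsize already set to 76,
as in option 1, gives 0. Option 2 would correspond to has_export being set
together with an explicit statesize of at least 76.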

Regards,
Jan

> 
> Thanks,
> --
> Email: Herbert Xu <herb...@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
> 


Re: [bug] crypto/vmx/p8_ghash memory corruption in 4.8-rc7

2016-09-27 Thread Jan Stancek




- Original Message -
> From: "Herbert Xu" <herb...@gondor.apana.org.au>
> To: "Marcelo Cerri" <marcelo.ce...@canonical.com>
> Cc: "Jan Stancek" <jstan...@redhat.com>, "rui y wang" <rui.y.w...@intel.com>, 
> mhce...@linux.vnet.ibm.com,
> leosi...@linux.vnet.ibm.com, pfsmor...@linux.vnet.ibm.com, 
> linux-cry...@vger.kernel.org,
> linuxppc-dev@lists.ozlabs.org, linux-ker...@vger.kernel.org
> Sent: Tuesday, 27 September, 2016 5:08:26 AM
> Subject: Re: [bug] crypto/vmx/p8_ghash memory corruption in 4.8-rc7
> 
> On Mon, Sep 26, 2016 at 02:43:17PM -0300, Marcelo Cerri wrote:
> > 
> > Wouldn't it be enough to provide a pair of import/export functions as the
> > padlock-sha driver does?
> 
> I don't think that will help as ultimately you need to call the
> export function on the fallback and that's what requires the extra
> memory.  In fact every operation involving the fallback will need
> that extra memory too.

So, if we extended p8_ghash_desc_ctx to accommodate fallback_desc's ctx
and then provided statesize/import/export, would that be acceptable?

struct p8_ghash_desc_ctx {
...
struct shash_desc fallback_desc;
+   char fallback_ctx[sizeof(struct ghash_desc_ctx)];


Also, does that mean that padlock_sha has a similar problem?
It does not seem to reserve any space for the fallback's __ctx, and it calls
init()/update()/export() with padlock_sha_desc's fallback:

struct padlock_sha_desc {
struct shash_desc fallback;
};

static struct shash_alg sha1_alg = {
.descsize   =   sizeof(struct padlock_sha_desc),

Regards,
Jan

> 
> Cheers,
> --
> Email: Herbert Xu <herb...@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
> 


Re: [bug] crypto/vmx/p8_ghash memory corruption in 4.8-rc7

2016-09-26 Thread Jan Stancek



- Original Message -
> From: "Marcelo Cerri" <marcelo.ce...@canonical.com>
> To: "Jan Stancek" <jstan...@redhat.com>
> Cc: "rui y wang" <rui.y.w...@intel.com>, herb...@gondor.apana.org.au, 
> mhce...@linux.vnet.ibm.com,
> leosi...@linux.vnet.ibm.com, pfsmor...@linux.vnet.ibm.com, 
> linux-cry...@vger.kernel.org,
> linuxppc-dev@lists.ozlabs.org, linux-ker...@vger.kernel.org
> Sent: Monday, 26 September, 2016 4:15:10 PM
> Subject: Re: [bug] crypto/vmx/p8_ghash memory corruption in 4.8-rc7
> 
> Hi Jan,
> 
> Just out of curiosity, have you tried to use "76" on both values to
> check if the problem still happens?

I did, I haven't seen any panics with such patch:

@@ -211,7 +212,7 @@ struct shash_alg p8_ghash_alg = {
.update = p8_ghash_update,
.final = p8_ghash_final,
.setkey = p8_ghash_setkey,
-   .descsize = sizeof(struct p8_ghash_desc_ctx),
+   .descsize = sizeof(struct p8_ghash_desc_ctx) + 20,
.base = {
 .cra_name = "ghash",
 .cra_driver_name = "p8_ghash",


[bug] crypto/vmx/p8_ghash memory corruption in 4.8-rc7

2016-09-23 Thread Jan Stancek
Hi,

I'm chasing a memory corruption with 4.8-rc7, as I'm observing random Oopses
on ppc BE/LE systems (lpars, KVM guests). In about 30% of cases the
module list gets corrupted, and "cat /proc/modules" or "lsmod" triggers
an Oops, for example:

[   88.486041] Unable to handle kernel paging request for data at address 
0x0020
...
[   88.487658] NIP [c020f820] m_show+0xa0/0x240
[   88.487689] LR [c020f834] m_show+0xb4/0x240
[   88.487719] Call Trace:
[   88.487736] [c004b605bbb0] [c020f834] m_show+0xb4/0x240 
(unreliable)
[   88.487796] [c004b605bc50] [c045e73c] seq_read+0x36c/0x520
[   88.487843] [c004b605bcf0] [c04e1014] proc_reg_read+0x84/0x120
[   88.487889] [c004b605bd30] [c040df88] vfs_read+0xf8/0x380
[   88.487934] [c004b605bde0] [c040fd40] SyS_read+0x60/0x110
[   88.487981] [c004b605be30] [c0009590] system_call+0x38/0xec

The 0x20 offset corresponds to module_use->source; module_use is NULL because
module.source_list gets corrupted.
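
The 0x20 figure can be checked with offsetof() on a userspace copy of the
structure. The sketch below assumes the field order of struct module_use as I
understand it from the kernel sources (two list heads followed by the source
and target pointers). Treat that layout, and the helper name, as assumptions
made for this illustration.

```c
#include <stddef.h>

/* Minimal stand-ins for the kernel types; only the layout matters here. */
struct list_head {
	struct list_head *next, *prev;
};

struct module; /* opaque for this sketch */

/* Assumed field order of the kernel's struct module_use. */
struct module_use {
	struct list_head source_list;
	struct list_head target_list;
	struct module *source, *target;
};

/* Address that a load of use->source touches for a given base address. */
static size_t source_field_address(size_t base)
{
	return base + offsetof(struct module_use, source);
}
```

On a 64-bit build each list_head takes 16 bytes, so source_field_address(0)
evaluates to 0x20, which matches the faulting data address in the Oops.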

The corruption appears to originate from an 'ahash' test for p8_ghash:

cryptomgr_test
 alg_test
  alg_test_hash
   test_hash
__test_hash
 ahash_partial_update
  shash_async_export
   memcpy

With some extra traces [1], I'm seeing that ahash_partial_update() allocates
56 bytes for 'state', and then crypto_ahash_export() writes 76 bytes into it:

[5.970887] __test_hash alg name p8_ghash, result: c4333ac0, key: 
c004b860a500, req: c004b860a380
[5.970963] state: c4333f00, statesize: 56
[5.970995] shash_default_export memcpy c4333f00 c004b860a3e0, 
len: 76

This seems to directly correspond with:
  p8_ghash_alg.descsize = sizeof(struct p8_ghash_desc_ctx) == 56
  shash_tfm->descsize = sizeof(struct p8_ghash_desc_ctx) + 
crypto_shash_descsize(fallback) == 56 + 20
where 20 is presumably coming from "ghash_alg.descsize".

My gut feeling was that these 2 should match, but I'd love to hear
what crypto people think.

Thank you,
Jan

[1]
diff --git a/crypto/shash.c b/crypto/shash.c
index a051541..49fe182 100644
--- a/crypto/shash.c
+++ b/crypto/shash.c
@@ -188,6 +188,8 @@ EXPORT_SYMBOL_GPL(crypto_shash_digest);

 static int shash_default_export(struct shash_desc *desc, void *out)
 {
+   int len = crypto_shash_descsize(desc->tfm);
+   printk("shash_default_export memcpy %p %p, len: %d\n", out, 
shash_desc_ctx(desc), len);
memcpy(out, shash_desc_ctx(desc), crypto_shash_descsize(desc->tfm));
return 0;
 }
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index 5c9d5a5..2e54579 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -218,6 +218,8 @@ static int ahash_partial_update(struct ahash_request **preq,
pr_err("alt: hash: Failed to alloc state for %s\n", algo);
goto out_nostate;
}
+   printk("state: %p, statesize: %d\n", state, statesize);
+
ret = crypto_ahash_export(req, state);
if (ret) {
pr_err("alt: hash: Failed to export() for %s\n", algo);
@@ -288,6 +290,7 @@ static int __test_hash(struct crypto_ahash *tfm, struct 
hash_testvec *template,
   "%s\n", algo);
goto out_noreq;
}
+   printk("__test_hash alg name %s, result: %p, key: %p, req: %p\n", algo, 
result, key, req);
ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
   tcrypt_complete, );


Re: [PATCH] Fix: PowerNV crash with 4.4.0-rc8 at sched_init_numa

2016-01-18 Thread Jan Stancek




- Original Message -
> From: "Raghavendra K T" <raghavendra...@linux.vnet.ibm.com>
> To: mi...@redhat.com, pet...@infradead.org, b...@kernel.crashing.org, 
> pau...@samba.org, m...@ellerman.id.au,
> an...@samba.org, a...@linux-foundation.org
> Cc: jstan...@redhat.com, gk...@linux.vnet.ibm.com, "grant likely" 
> <grant.lik...@linaro.org>,
> nik...@linux.vnet.ibm.com, vdavy...@parallels.com, "raghavendra kt" 
> <raghavendra...@linux.vnet.ibm.com>,
> linuxppc-dev@lists.ozlabs.org, linux-ker...@vger.kernel.org, 
> linux...@kvack.org
> Sent: Friday, 15 January, 2016 8:01:23 PM
> Subject: [PATCH] Fix: PowerNV crash with 4.4.0-rc8 at sched_init_numa
> 
> Commit c118baf80256 ("arch/powerpc/mm/numa.c: do not allocate bootmem
> memory for non existing nodes") avoided bootmem memory allocation for
> non existent nodes.
> 
> When DEBUG_PER_CPU_MAPS enabled, powerNV system failed to boot because
> in sched_init_numa, cpumask_or operation was done on unallocated nodes.
> Fix that by making cpumask_or operation only on existing nodes.
> 
> [ Tested with and w/o DEBUG_PER_CPU_MAPS on x86 and powerpc ]
> 
> Reported-by: Jan Stancek <jstan...@redhat.com>

Tested-by: Jan Stancek <jstan...@redhat.com>

I also verified on my setup that this makes the crash go away.
Report mail thread for reference:
  https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-January/137691.html
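
To illustrate why the loop change matters on this machine, here is a small
userspace model. It is only a sketch: the function names are hypothetical, the
node ids (0, 1, 16, 17) come from the numactl output in the original report,
and nr_node_ids = 18 (highest id plus one) is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_NODES 64

/* Count the ids that a dense loop (k = 0 .. nr_node_ids - 1) visits even
 * though the node is absent from the possible-node mask. */
static size_t dense_loop_missing_nodes(uint64_t possible_mask,
				       unsigned int nr_node_ids)
{
	size_t missing = 0;

	for (unsigned int k = 0; k < nr_node_ids && k < MODEL_MAX_NODES; k++)
		if (!(possible_mask & (UINT64_C(1) << k)))
			missing++;
	return missing;
}

/* Count the ids visited when iterating only over bits set in the mask,
 * which is the behaviour this sketch assumes for for_each_node(). */
static size_t mask_loop_visits(uint64_t possible_mask)
{
	size_t visits = 0;

	for (unsigned int k = 0; k < MODEL_MAX_NODES; k++)
		if (possible_mask & (UINT64_C(1) << k))
			visits++;
	return visits;
}
```

With a mask holding nodes 0, 1, 16 and 17, the dense loop touches 14 ids that
have no node behind them, while the mask-driven loop visits exactly 4. In the
kernel those 14 extra iterations are where cpumask_or would read masks that
were never allocated.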

Regards,
Jan

> Signed-off-by: Raghavendra K T <raghavendra...@linux.vnet.ibm.com>
> ---
>  kernel/sched/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 44253ad..474658b 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6840,7 +6840,7 @@ static void sched_init_numa(void)
>  
>   sched_domains_numa_masks[i][j] = mask;
>  
> - for (k = 0; k < nr_node_ids; k++) {
> + for_each_node(k) {
>   if (node_distance(j, k) > 
> sched_domains_numa_distance[i])
>   continue;
>  
> --
> 1.7.11.7
> 
> 
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [BUG] PowerNV crash with 4.4.0-rc8 at sched_init_numa (related to commit c118baf80256)

2016-01-15 Thread Jan Stancek


- Original Message -
> From: "Raghavendra K T" <raghavendra...@linux.vnet.ibm.com>
> To: "Jan Stancek" <jstan...@redhat.com>
> Cc: linuxppc-dev@lists.ozlabs.org, "raghavendra kt" 
> <raghavendra...@linux.vnet.ibm.com>, vdavy...@parallels.com,
> b...@kernel.crashing.org, pau...@samba.org, m...@ellerman.id.au, 
> an...@samba.org, n...@linux.vnet.ibm.com,
> gk...@linux.vnet.ibm.com, "grant likely" <grant.lik...@linaro.org>, 
> nik...@linux.vnet.ibm.com, "Steve Best"
> <sb...@redhat.com>, "Gustavo Duarte" <gdua...@redhat.com>, "Thomas Huth" 
> <th...@redhat.com>
> Sent: Friday, 15 January, 2016 2:43:07 PM
> Subject: Re: [BUG] PowerNV crash with 4.4.0-rc8 at sched_init_numa (related 
> to commit c118baf80256)
> 
> * Jan Stancek <jstan...@redhat.com> [2016-01-09 18:03:55]:
> 
> > Hi,
> > 
> > I'm seeing bare metal ppc64le system crashing early during boot
> > with latest upstream kernel (4.4.0-rc8):
> > 
> > # git describe
> > v4.4-rc8-96-g751e5f5
> > 
> > [0.625451] Unable to handle kernel paging request for data at address
> > 0x
> > [0.625586] Faulting instruction address: 0xc04ae000
> > [0.625698] Oops: Kernel access of bad area, sig: 11 [#1]
> > [0.625789] SMP NR_CPUS=2048 NUMA PowerNV
> > [0.625879] Modules linked in:
> > [0.625973] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.4.0-rc8+ #6
> > [0.626087] task: c02ff430 ti: c02ff6084000 task.ti:
> > c02ff6084000
> > [0.626224] NIP: c04ae000 LR: c090b9e4 CTR:
> > 0003
> > [0.626361] REGS: c02ff6087930 TRAP: 0300   Not tainted
> > (4.4.0-rc8+)
> > [0.626475] MSR: 90019033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR:
> > 48002044  XER: 2000
> > [0.626808] CFAR: c0008468 DAR:  DSISR: 4000
> > SOFTE: 1
> > GPR00: c090b9ac c02ff6087bb0 c1700900 c03ff229e080
> > GPR04: c03ff229e080  0003 0001
> > GPR08:   0010 90011003
> > GPR12: 2200 cfb4 c000bd68 0002
> > GPR16: 0028 c0b25940 c173ffa4 
> > GPR20: c0b259d8 c0b259e0 c0b259e8 
> > GPR24: c03ff229e080  c189b180 
> > GPR28:  c1740a94 0002 0002
> > [0.627925] NIP [c04ae000] __bitmap_or+0x30/0x50
> > [0.627973] LR [c090b9e4] sched_init_numa+0x440/0x7c8
> > [0.628030] Call Trace:
> > [0.628054] [c02ff6087bb0] [c090b9ac]
> > sched_init_numa+0x408/0x7c8 (unreliable)
> > [0.628136] [c02ff6087ca0] [c0c60718]
> > sched_init_smp+0x60/0x238
> > [0.628206] [c02ff6087d00] [c0c44294]
> > kernel_init_freeable+0x1fc/0x3b4
> > [0.628286] [c02ff6087dc0] [c000bd84] kernel_init+0x24/0x140
> > [0.628356] [c02ff6087e30] [c0009544]
> > ret_from_kernel_thread+0x5c/0x98
> > [0.628435] Instruction dump:
> > [0.628470] 38c6003f 78c9d183 4d820020 38c9 3920 78c60020
> > 38c60001 7cc903a6
> > [0.628587] 6000 6000 6000 6042 <7d05482a> 7d44482a
> > 7d0a5378 7d43492a
> > [0.628711] ---[ end trace b423f3e02b333fbf ]---
> > [0.628757]
> > [2.628822] Kernel panic - not syncing: Fatal exception
> > [2.628969] Rebooting in 10 seconds..[0.00] OPAL V3 detected !
> > 
> 
> > The crash goes away if I revert following commit:
> >   commit c118baf802562688d46e6002f2b5fe66b947da21
> >   Author: Raghavendra K T <raghavendra...@linux.vnet.ibm.com>
> >   Date:   Thu Nov 5 18:46:29 2015 -0800
> > arch/powerpc/mm/numa.c: do not allocate bootmem memory for non existing
> > nodes
> >
> 
> Something like below should fix it. I'll send it in a separate email
>  marking Peter and Ingo. Basically for_each_node conversion
> has targeted only slowpaths / used_once sort of functions.
> But it seems there was a cpumask_or in sched_init_numa that used
> unallocated node.
> 
> Sorry for getting back late.. Was overcautious checking x86/power
> w/ and w/o DEBUG_PER_CPU_MAPS

Hi,

I ran it on my setup (same config as before) on top of v4.4-5966-g7d1fc01.
System now booted OK, dmesg looks clean.

Regards,
Jan

> 
> ---8<-
> From 66809

Re: [BUG] PowerNV crash with 4.4.0-rc8 at sched_init_numa (related to commit c118baf80256)

2016-01-10 Thread Jan Stancek




- Original Message -
> From: "Raghavendra K T" <raghavendra...@linux.vnet.ibm.com>
> To: "Jan Stancek" <jstan...@redhat.com>
> Cc: linuxppc-dev@lists.ozlabs.org, vdavy...@parallels.com, 
> b...@kernel.crashing.org, pau...@samba.org,
> m...@ellerman.id.au, an...@samba.org, n...@linux.vnet.ibm.com, 
> gk...@linux.vnet.ibm.com, "grant likely"
> <grant.lik...@linaro.org>, nik...@linux.vnet.ibm.com, "Steve Best" 
> <sb...@redhat.com>, "Gustavo Duarte"
> <gdua...@redhat.com>, "Thomas Huth" <th...@redhat.com>
> Sent: Sunday, 10 January, 2016 7:47:31 AM
> Subject: Re: [BUG] PowerNV crash with 4.4.0-rc8 at sched_init_numa (related 
> to commit c118baf80256)
> 
> On 01/10/2016 04:33 AM, Jan Stancek wrote:
> > Hi,
> >
> > I'm seeing bare metal ppc64le system crashing early during boot
> > with latest upstream kernel (4.4.0-rc8):
> >
> 
> Hi Jan,
> Thanks for reporting. Let me try to reproduce the issue.
> 
> (BTW, if you think there is anything special in the .config
> that I need for testing .. please share).

Config has many debug options turned on, so my guess was SCHED_DEBUG.
I've uploaded my config here:
  
http://jan.stancek.eu/tmp/powernv_crash_sched_init_numa/config-powernv-crash-4.4.0-rc8

Regards,
Jan

> 
> - Raghu
> 
> > # git describe
> > v4.4-rc8-96-g751e5f5
> >
> > [0.625451] Unable to handle kernel paging request for data at address
> > 0x
> > [0.625586] Faulting instruction address: 0xc04ae000
> > [0.625698] Oops: Kernel access of bad area, sig: 11 [#1]
> > [0.625789] SMP NR_CPUS=2048 NUMA PowerNV
> > [0.625879] Modules linked in:
> > [0.625973] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.4.0-rc8+ #6
> > [0.626087] task: c02ff430 ti: c02ff6084000 task.ti:
> > c02ff6084000
> > [0.626224] NIP: c04ae000 LR: c090b9e4 CTR:
> > 0003
> > [0.626361] REGS: c02ff6087930 TRAP: 0300   Not tainted
> > (4.4.0-rc8+)
> > [0.626475] MSR: 90019033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR:
> > 48002044  XER: 2000
> > [0.626808] CFAR: c0008468 DAR:  DSISR: 4000
> > SOFTE: 1
> > GPR00: c090b9ac c02ff6087bb0 c1700900 c03ff229e080
> > GPR04: c03ff229e080  0003 0001
> > GPR08:   0010 90011003
> > GPR12: 2200 cfb4 c000bd68 0002
> > GPR16: 0028 c0b25940 c173ffa4 
> > GPR20: c0b259d8 c0b259e0 c0b259e8 
> > GPR24: c03ff229e080  c189b180 
> > GPR28:  c1740a94 0002 0002
> > [0.627925] NIP [c04ae000] __bitmap_or+0x30/0x50
> > [0.627973] LR [c090b9e4] sched_init_numa+0x440/0x7c8
> > [0.628030] Call Trace:
> > [0.628054] [c02ff6087bb0] [c090b9ac]
> > sched_init_numa+0x408/0x7c8 (unreliable)
> > [0.628136] [c02ff6087ca0] [c0c60718]
> > sched_init_smp+0x60/0x238
> > [0.628206] [c02ff6087d00] [c0c44294]
> > kernel_init_freeable+0x1fc/0x3b4
> > [0.628286] [c02ff6087dc0] [c000bd84] kernel_init+0x24/0x140
> > [0.628356] [c02ff6087e30] [c0009544]
> > ret_from_kernel_thread+0x5c/0x98
> > [0.628435] Instruction dump:
> > [0.628470] 38c6003f 78c9d183 4d820020 38c9 3920 78c60020
> > 38c60001 7cc903a6
> > [0.628587] 6000 6000 6000 6042 <7d05482a> 7d44482a
> > 7d0a5378 7d43492a
> > [0.628711] ---[ end trace b423f3e02b333fbf ]---
> > [0.628757]
> > [2.628822] Kernel panic - not syncing: Fatal exception
> > [2.628969] Rebooting in 10 seconds..[0.00] OPAL V3 detected !
> >
> > # numactl -H
> > available: 4 nodes (0-1,16-17)
> > node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
> > 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
> > node 0 size: 64941 MB
> > node 0 free: 64210 MB
> > node 1 cpus: 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
> > 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79
> > node 1 size: 65456 MB
> > node 1 free: 62424 MB
> > node 16 cpus: 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
> > 100 101 102 103 104 105 106 107 108 109 110 111 112 113

[BUG] PowerNV crash with 4.4.0-rc8 at sched_init_numa (related to commit c118baf80256)

2016-01-09 Thread Jan Stancek
Hi,

I'm seeing bare metal ppc64le system crashing early during boot
with latest upstream kernel (4.4.0-rc8):

# git describe
v4.4-rc8-96-g751e5f5

[0.625451] Unable to handle kernel paging request for data at address 
0x
[0.625586] Faulting instruction address: 0xc04ae000
[0.625698] Oops: Kernel access of bad area, sig: 11 [#1]
[0.625789] SMP NR_CPUS=2048 NUMA PowerNV
[0.625879] Modules linked in:
[0.625973] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.4.0-rc8+ #6
[0.626087] task: c02ff430 ti: c02ff6084000 task.ti: 
c02ff6084000
[0.626224] NIP: c04ae000 LR: c090b9e4 CTR: 0003
[0.626361] REGS: c02ff6087930 TRAP: 0300   Not tainted  (4.4.0-rc8+)
[0.626475] MSR: 90019033   CR: 48002044  
XER: 2000
[0.626808] CFAR: c0008468 DAR:  DSISR: 4000 
SOFTE: 1
GPR00: c090b9ac c02ff6087bb0 c1700900 c03ff229e080
GPR04: c03ff229e080  0003 0001
GPR08:   0010 90011003
GPR12: 2200 cfb4 c000bd68 0002
GPR16: 0028 c0b25940 c173ffa4 
GPR20: c0b259d8 c0b259e0 c0b259e8 
GPR24: c03ff229e080  c189b180 
GPR28:  c1740a94 0002 0002
[0.627925] NIP [c04ae000] __bitmap_or+0x30/0x50
[0.627973] LR [c090b9e4] sched_init_numa+0x440/0x7c8
[0.628030] Call Trace:
[0.628054] [c02ff6087bb0] [c090b9ac] 
sched_init_numa+0x408/0x7c8 (unreliable)
[0.628136] [c02ff6087ca0] [c0c60718] sched_init_smp+0x60/0x238
[0.628206] [c02ff6087d00] [c0c44294] 
kernel_init_freeable+0x1fc/0x3b4
[0.628286] [c02ff6087dc0] [c000bd84] kernel_init+0x24/0x140
[0.628356] [c02ff6087e30] [c0009544] 
ret_from_kernel_thread+0x5c/0x98
[0.628435] Instruction dump:
[0.628470] 38c6003f 78c9d183 4d820020 38c9 3920 78c60020 38c60001 
7cc903a6
[0.628587] 6000 6000 6000 6042 <7d05482a> 7d44482a 7d0a5378 
7d43492a
[0.628711] ---[ end trace b423f3e02b333fbf ]---
[0.628757]
[2.628822] Kernel panic - not syncing: Fatal exception
[2.628969] Rebooting in 10 seconds..[0.00] OPAL V3 detected !

# numactl -H
available: 4 nodes (0-1,16-17)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 
25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
node 0 size: 64941 MB
node 0 free: 64210 MB
node 1 cpus: 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 
62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79
node 1 size: 65456 MB
node 1 free: 62424 MB
node 16 cpus: 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 
101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119
node 16 size: 65457 MB
node 16 free: 65258 MB
node 17 cpus: 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 
136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151
node 17 size: 65186 MB
node 17 free: 65001 MB
node distances:
node   0   1  16  17
  0:  10  20  40  40
  1:  20  10  40  40
 16:  40  40  10  20
 17:  40  40  20  10

The crash goes away if I revert the following commit:
  commit c118baf802562688d46e6002f2b5fe66b947da21
  Author: Raghavendra K T
  Date:   Thu Nov 5 18:46:29 2015 -0800
      arch/powerpc/mm/numa.c: do not allocate bootmem memory for non existing nodes

Regards,
Jan
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] PPC: fix LOGMPP instruction opcode and inline asm

2015-10-17 Thread Jan Stancek




- Original Message -
> From: "Stewart Smith" <stew...@linux.vnet.ibm.com>
> To: linuxppc-dev@lists.ozlabs.org
> Cc: pau...@samba.org, th...@redhat.com, jstan...@redhat.com, 
> dgib...@redhat.com, b...@kernel.crashing.org, "Stewart
> Smith" <stew...@linux.vnet.ibm.com>, sta...@vger.kernel.org
> Sent: Friday, 16 October, 2015 3:20:35 AM
> Subject: [PATCH] PPC: fix LOGMPP instruction opcode and inline asm
> 
> Back in 9678cda when we started using the Micro Partition Prefetch Engine
> in POWER8 for KVM, there were two mistakes introduced from the original
> patch used for investigation and microbenchmarks.
> 
> One mistake was that the opcode was constructed incorrectly, putting
> the register in the wrong field in the opcode, meaning that we were
> asking the chip to read the memory address from some other register than
> what we intended - probably r0. For those unfortunate enough to have r0
> point somewhere in memory they cared about, the prefetch engine would
> gleefully trash all over it leading to some data you cared about being
> replaced with a list of physical addresses.
> 
> In addition, the logmpp inline function incorrectly used R1 rather than
> %0, leading to even if we got the construction of the instruction right,
> we'd still generate the wrong thing, looking at the address in r1 rather
> than whatever we were asked to look at.
> 
> So, this patch fixes the following:
> - the inline logmpp function's inline asm to be correct
> - puts the register in the right field of the instruction
> 
> This bug would overwrite a single 64k page.
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1269653
> https://bugzilla.redhat.com/show_bug.cgi?id=1271997
> 
> Cc: sta...@vger.kernel.org
> Fixes: 9678cda ("Use the POWER8 Micro Partition Prefetch Engine in KVM HV")
> Reported-by: David Gibson <da...@gibson.dropbear.id.au>
> Reported-by: Benjamin Herrenschmidt <b...@kernel.crashing.org>
> Suggested-by: Benjamin Herrenschmidt <b...@kernel.crashing.org>
> Suggested-by: Paul Mackerras <pau...@samba.org>
> Signed-off-by: Stewart Smith <stew...@linux.vnet.ibm.com>

Tested-by: Jan Stancek <jstan...@redhat.com>

After running a kernel with this patch for 36+ hours, I can no longer
reproduce the user-space corruption I have been seeing.

Previously it reproduced fairly quickly (within an hour, on an IBM Power S814
[8286-41A], 16GB RAM) by starting and stopping a couple of KVM guests in a loop.

> ---
>  arch/powerpc/include/asm/cache.h  | 2 +-
>  arch/powerpc/include/asm/ppc-opcode.h | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/cache.h
> b/arch/powerpc/include/asm/cache.h
> index 34a05a1a990b..3af1c1e35435 100644
> --- a/arch/powerpc/include/asm/cache.h
> +++ b/arch/powerpc/include/asm/cache.h
> @@ -43,7 +43,7 @@ extern struct ppc64_caches ppc64_caches;
>  
>  static inline void logmpp(u64 x)
>  {
> - asm volatile(PPC_LOGMPP(R1) : : "r" (x));
> + asm volatile(PPC_LOGMPP(%0) : : "r" (x));
>  }
>  
>  #endif /* __powerpc64__ && ! __ASSEMBLY__ */
> diff --git a/arch/powerpc/include/asm/ppc-opcode.h
> b/arch/powerpc/include/asm/ppc-opcode.h
> index 65136928a572..0dc2f6f9b445 100644
> --- a/arch/powerpc/include/asm/ppc-opcode.h
> +++ b/arch/powerpc/include/asm/ppc-opcode.h
> @@ -304,8 +304,8 @@
>  #define PPC_LDARX(t, a, b, eh)   stringify_in_c(.long PPC_INST_LDARX | \
>   ___PPC_RT(t) | ___PPC_RA(a) | \
>   ___PPC_RB(b) | __PPC_EH(eh))
> -#define PPC_LOGMPP(b)stringify_in_c(.long PPC_INST_LOGMPP | \
> - __PPC_RB(b))
> +#define PPC_LOGMPP(a)stringify_in_c(.long PPC_INST_LOGMPP | \
> + ___PPC_RA(a))
>  #define PPC_LWARX(t, a, b, eh)   stringify_in_c(.long PPC_INST_LWARX | \
>   ___PPC_RT(t) | ___PPC_RA(a) | \
>   ___PPC_RB(b) | __PPC_EH(eh))
> --
> 2.1.4
> 
> 
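To illustrate the encoding mistake described in the commit message, here is a minimal user-space sketch (not kernel code). It assumes the usual X-form layout where the RA field sits at a left shift of 16 and the RB field at a left shift of 11; `DEMO_INST_BASE`, `encode_buggy()` and `encode_fixed()` are hypothetical names, and the base opcode value is only a placeholder for whatever `PPC_INST_LOGMPP` expands to.

```c
#include <stdint.h>

/* Placeholder base opcode (assumption); both register fields are zero in it. */
#define DEMO_INST_BASE 0x7c0007e4u

/* Place a 5-bit register number into the RA or RB field. */
uint32_t field_ra(unsigned reg) { return (uint32_t)(reg & 0x1fu) << 16; }
uint32_t field_rb(unsigned reg) { return (uint32_t)(reg & 0x1fu) << 11; }

/* Extract the register number the hardware would see in each field. */
unsigned decode_ra(uint32_t insn) { return (insn >> 16) & 0x1fu; }
unsigned decode_rb(uint32_t insn) { return (insn >> 11) & 0x1fu; }

/* Before the patch: the register number lands in RB, so RA stays 0 (r0). */
uint32_t encode_buggy(unsigned reg) { return DEMO_INST_BASE | field_rb(reg); }

/* After the patch: the register number lands in RA, where it is expected. */
uint32_t encode_fixed(unsigned reg) { return DEMO_INST_BASE | field_ra(reg); }
```

With these helpers, `decode_ra(encode_buggy(5))` yields 0 while `decode_ra(encode_fixed(5))` yields 5, which matches the "probably r0" observation in the description.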

Re: powerpc/powernv/pci-ioda: fix kdump with non-power-of-2 crashkernel=

2015-09-04 Thread Jan Stancek
On Fri, Sep 04, 2015 at 09:59:38AM -0700, Nishanth Aravamudan wrote:
> The 32-bit TCE table initialization relies on the DMA window having a
> size equal to a power of 2 (and checks for it explicitly). But
> crashkernel= has no constraint that requires a power-of-2 be specified.
> This causes the kdump kernel to fail to boot as none of the PCI devices
> (including the disk controller) are successfully initialized.
> 
> After this change, the PCI devices successfully set up the 32-bit TCE
> table and kdump succeeds.
> 
> Fixes: aca6913f5551 ("powerpc/powernv/ioda2: Introduce helpers to allocate 
> TCE pages")
> Signed-off-by: Nishanth Aravamudan <n...@linux.vnet.ibm.com>
> Cc: sta...@vger.kernel.org # 4.2
> ---
> 
> Michael, I did this as a follow-on patch to my previous one. If you'd
> rather I made a v3 of that patch with the two fixes combined, I can
> resend.
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
> b/arch/powerpc/platforms/powernv/pci-ioda.c
> index f1c74c28e564..73914f4bd1ab 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -2084,6 +2084,12 @@ static long pnv_pci_ioda2_setup_default_config(struct 
> pnv_ioda_pe *pe)
>*/
>   const u64 window_size =
>   min((u64)pe->table_group.tce32_size, memory_hotplug_max());
> + /*
> +  * crashkernel= specifies the kdump kernel's maximum memory at
> +  * some offset and there is no guaranteed the result is a power
> +  * of 2, which will cause errors later.
> +  */
> + window_size = __rounddown_pow_of_two(window_size);

Hi,

Just wondering whether this would hide potential alignment issues in
"table_group.tce32_size", which currently trigger EINVAL down the road,
and whether it would be safer to round only "memory_hotplug_max()":

  const __u64 hotplug_max_p2 = __rounddown_pow_of_two(memory_hotplug_max());
  const __u64 window_size =
min((u64)pe->table_group.tce32_size, hotplug_max_p2);

Regards,
Jan

>  
>   rc = pnv_pci_ioda2_create_table(&pe->table_group, 0,
>   IOMMU_PAGE_SHIFT_4K,
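The difference between the two rounding placements discussed above can be sketched in plain C. This is a user-space model under stated assumptions: `rounddown_pow2()` stands in for the kernel's `__rounddown_pow_of_two()`, and `window_round_result()` / `window_round_limit()` are hypothetical names for the two variants.

```c
#include <stdint.h>

/* Largest power of two that is <= n; n must be non-zero. */
static uint64_t rounddown_pow2(uint64_t n)
{
	while (n & (n - 1))
		n &= n - 1;	/* clear the lowest set bit until one remains */
	return n;
}

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/* Non-zero when n is a power of two. */
int is_pow2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Variant A: round the final window size (as in the patch quoted above). */
uint64_t window_round_result(uint64_t tce32_size, uint64_t hotplug_max)
{
	return rounddown_pow2(min_u64(tce32_size, hotplug_max));
}

/* Variant B: round only the memory limit, leave tce32_size untouched. */
uint64_t window_round_limit(uint64_t tce32_size, uint64_t hotplug_max)
{
	return min_u64(tce32_size, rounddown_pow2(hotplug_max));
}
```

Both variants agree when only the memory limit is not a power of two. They differ when tce32_size itself is misaligned: variant A silently rounds it down, while variant B passes it through so the later power-of-2 check can still reject it.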

Re: [PATCH v2] powerpc/powernv/pci-ioda: fix kdump with non-power-of-2 crashkernel=

2015-09-04 Thread Jan Stancek




- Original Message -
> From: "Nishanth Aravamudan" <n...@linux.vnet.ibm.com>
> To: "Michael Ellerman" <m...@ellerman.id.au>
> Cc: "Hari Bathini" <hbath...@in.ibm.com>, "Gavin Shan" 
> <gws...@linux.vnet.ibm.com>, "Alexey Kardashevskiy"
> <a...@ozlabs.ru>, "Ben Herrenschmidt" <b...@kernel.crashing.org>, "Paul 
> Mackerras" <pau...@samba.org>, "David Gibson"
> <da...@gibson.dropbear.id.au>, "Wei Yang" <weiy...@linux.vnet.ibm.com>, 
> linuxppc-dev@lists.ozlabs.org, "Jan Stancek"
> <jstan...@redhat.com>
> Sent: Friday, 4 September, 2015 8:22:52 PM
> Subject: [PATCH v2] powerpc/powernv/pci-ioda: fix kdump with non-power-of-2 
> crashkernel=
> 
> The 32-bit TCE table initialization relies on the DMA window having a
> size equal to a power of 2 (and checks for it explicitly). But
> crashkernel= has no constraint that requires a power-of-2 be specified.
> This causes the kdump kernel to fail to boot as none of the PCI devices
> (including the disk controller) are successfully initialized.
> 
> After this change, the PCI devices successfully set up the 32-bit TCE
> table and kdump succeeds.
> 
> Fixes: aca6913f5551 ("powerpc/powernv/ioda2: Introduce helpers to allocate
> TCE pages")
> Signed-off-by: Nishanth Aravamudan <n...@linux.vnet.ibm.com>
> Cc: sta...@vger.kernel.org # 4.2

Tested-by: Jan Stancek <jstan...@redhat.com>

I can confirm that this patch, along with
  http://patchwork.ozlabs.org/patch/513229/
applied on top of 4.2, fixes kdump for me on a Power S812L
(ppc64le bare metal, crashkernel=1024M).

Regards,
Jan

> 
> ---
> 
> Michael, I kept this as a follow-on patch to my previous one. If you'd
> rather I made a v3 of that patch with the two fixes combined, I can
> resend. Also, I fixed up the context on my end to be u64, but not sure
> if that will match your tree (next doesn't have my prior patch applied
> yet, that I can see).
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c
> b/arch/powerpc/platforms/powernv/pci-ioda.c
> index f1c74c28e564..d5e635f2c3aa 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -2078,12 +2078,18 @@ static long pnv_pci_ioda2_setup_default_config(struct
> pnv_ioda_pe *pe)
>   struct iommu_table *tbl = NULL;
>   long rc;
>   /*
> +  * crashkernel= specifies the kdump kernel's maximum memory at
> +  * some offset and there is no guaranteed the result is a power
> +  * of 2, which will cause errors later.
> +  */
> + const u64 max_memory = __rounddown_pow_of_two(memory_hotplug_max());
> + /*
>* In memory constrained environments, e.g. kdump kernel, the
>* DMA window can be larger than available memory, which will
>* cause errors later.
>*/
>   const u64 window_size =
> - min((u64)pe->table_group.tce32_size, memory_hotplug_max());
> + min((u64)pe->table_group.tce32_size, max_memory);
>  
>   rc = pnv_pci_ioda2_create_table(&pe->table_group, 0,
>   IOMMU_PAGE_SHIFT_4K,
> 
> 

[PATCH] powerpc: fix memory corruption by pnv_alloc_idle_core_states

2015-03-31 Thread Jan Stancek
Space allocated for the paca array is based on nr_cpu_ids,
but pnv_alloc_idle_core_states() iterates over paca using
cpu_nr_cores() * threads_per_core, and cpu_nr_cores() is derived
from NR_CPUS.

When nr_cpu_ids is smaller than NR_CPUS, this causes
pnv_alloc_idle_core_states() to write to memory outside of the
paca array, which may later lead to various panics.

Fixes: 7cba160ad789 ("powernv/cpuidle: Redesign idle states management")
Signed-off-by: Jan Stancek <jstan...@redhat.com>
---
 arch/powerpc/include/asm/cputhreads.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/cputhreads.h 
b/arch/powerpc/include/asm/cputhreads.h
index 2bf8e93..4c8ad59 100644
--- a/arch/powerpc/include/asm/cputhreads.h
+++ b/arch/powerpc/include/asm/cputhreads.h
@@ -55,7 +55,7 @@ static inline cpumask_t cpu_thread_mask_to_cores(const struct 
cpumask *threads)
 
 static inline int cpu_nr_cores(void)
 {
-   return NR_CPUS >> threads_shift;
+   return nr_cpu_ids >> threads_shift;
 }
 
 static inline cpumask_t cpu_online_cores_map(void)
-- 
1.8.3.1
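The size mismatch behind this fix can be modelled with a few lines of arithmetic. The sketch below is a user-space illustration only; `nr_cores()`, `entries_walked()` and `overrun()` are hypothetical helpers, and the CPU counts used in the note that follows are made-up example values rather than data from a real machine.

```c
/* Number of cores for a given CPU count, with threads_per_core == 1 << threads_shift. */
unsigned nr_cores(unsigned cpus, unsigned threads_shift)
{
	return cpus >> threads_shift;
}

/* Number of per-CPU entries touched when visiting every thread of every core. */
unsigned entries_walked(unsigned cpus, unsigned threads_shift)
{
	return nr_cores(cpus, threads_shift) << threads_shift;
}

/* How many entries past the end of an array of alloc_entries the walk reaches. */
unsigned overrun(unsigned alloc_entries, unsigned walked)
{
	return walked > alloc_entries ? walked - alloc_entries : 0;
}
```

For example, assuming an array sized for 160 possible CPUs, a compile-time maximum of 2048 and 8 threads per core, `overrun(160, entries_walked(2048, 3))` reports 1888 out-of-bounds entries, whereas `overrun(160, entries_walked(160, 3))` reports none.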


Re: powerpc/perf: add missing put_cpu_var in power_pmu_event_init

2015-03-25 Thread Jan Stancek


- Original Message -
> From: Michael Ellerman <m...@ellerman.id.au>
> To: Jan Stancek <jstan...@redhat.com>, linuxppc-dev@lists.ozlabs.org
> Cc: linux-ker...@vger.kernel.org, pau...@samba.org, an...@samba.org,
> t...@kernel.org, c...@linux.com, jo...@redhat.com,
> jstan...@redhat.com, j...@jms.id.au
> Sent: Wednesday, 25 March, 2015 6:25:09 AM
> Subject: Re: powerpc/perf: add missing put_cpu_var in power_pmu_event_init
>
> On Tue, 2015-24-03 at 12:33:22 UTC, Jan Stancek wrote:
> > One path in power_pmu_event_init() calls get_cpu_var(), but is
> > missing matching call to put_cpu_var(), which causes preemption
> > imbalance and crash in user-space:
> >
> >   Page fault in user mode with in_atomic() = 1 mm = c01fefa5a280
> >   NIP = 3fff9bf2cae0  MSR = 90014280f032
> >   Oops: Weird page fault, sig: 11 [#23]
>
> <snip>
>
> Thanks. But I don't see this. I guess you have CONFIG_PREEMPT enabled?

Hi,

CONFIG_PREEMPT_NOTIFIERS=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_COUNT=y

but I think the difference comes from:
  CONFIG_DEBUG_ATOMIC_SLEEP=y

I did the following:
- took the default config from the RHEL7.1 kernel
- ran 'make oldnoconfig'
- reproducer didn't trigger anything
- then I added CONFIG_DEBUG_ATOMIC_SLEEP=y
- this time the reproducer triggered a panic (3 out of 3 attempts)

Here's the config from the panicking kernel: http://fpaste.org/202543/

[  133.957305] Page fault in user mode with in_atomic() = 1 mm = 
c5fc7e80
[  133.957399] NIP = 3fff9be0cae0  MSR = 90014280f032
[  133.957405] Oops: Weird page fault, sig: 11 [#1]
[  133.957409] SMP NR_CPUS=2048 NUMA PowerNV
[  133.957414] Modules linked in: ses enclosure shpchp uio_pdrv_genirq 
powernv_rng uio xfs libcrc32c sr_mod sd_mod cdrom ipr libata tg3 ptp pps_core 
dm_mirror dm_region_hash dm_log dm_mod
[  133.957638] CPU: 16 PID: 6035 Comm: a.out Not tainted 4.0.0-rc5+ #4
[  133.957693] task: c00fea44b640 ti: c00fea5e4000 task.ti: 
c00fea5e4000
[  133.957759] NIP: 3fff9be0cae0 LR: 3fff9bdc4898 CTR: 3fff9be0cae0
[  133.957825] REGS: c00fea5e7ea0 TRAP: 0401   Not tainted  (4.0.0-rc5+)
[  133.957880] MSR: 90014280f032 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI>  CR: 
2228  XER: 
[  133.958079] CFAR: 3fff9bdc4894 SOFTE: 1 
GPR00: 3fff9bdc494c 31fef3e0 3fff9bf64410 10020068 
GPR04:  0002 0008 0001 
GPR08: 0001 3fff9bf54a30 3fff9be0cae0 3fff9be0cd70 
GPR12: 5222 3fff9bfeb700 
[  133.958485] NIP [3fff9be0cae0] 0x3fff9be0cae0
[  133.958530] LR [3fff9bdc4898] 0x3fff9bdc4898
[  133.958574] Call Trace:
[  133.958597] ---[ end trace 56ec543903422cd9 ]---
[  133.958642] 
[  135.958709] Kernel panic - not syncing: Fatal exception
[  135.958863] Rebooting in 10 seconds..
[  145.970348] BUG: sleeping function called from invalid context at 
kernel/irq/manage.c:104
[  145.970453] in_atomic(): 1, irqs_disabled(): 1, pid: 6035, name: a.out
[  145.970515] CPU: 16 PID: 6035 Comm: a.out Tainted: G  D 
4.0.0-rc5+ #4
[  145.970588] Call Trace:
[  145.970618] [c00fea5e76d0] [c07c2090] .dump_stack+0x98/0xd4 
(unreliable)
[  145.970707] [c00fea5e7750] [c00d5fe4] .___might_sleep+0x124/0x170
[  145.970782] [c00fea5e77c0] [c0112860] .synchronize_irq+0x40/0xe0
[  145.970857] [c00fea5e7880] [c0112fa8] .__free_irq+0xf8/0x2b0
[  145.970931] [c00fea5e7920] [c0113258] .free_irq+0x78/0x100
[  145.971007] [c00fea5e79b0] [c0067ae8] .opal_shutdown+0x88/0x120
[  145.971081] [c00fea5e7a40] [c0063e88] .pnv_shutdown+0x18/0x30
[  145.971157] [c00fea5e7ab0] [c0020c98] .machine_shutdown+0x38/0x50
[  145.971231] [c00fea5e7b20] [c0020d24] .machine_restart+0x14/0x70
[  145.971307] [c00fea5e7ba0] [c00cdc10] 
.emergency_restart+0x20/0x40
[  145.971393] [c00fea5e7c10] [c07bb0a4] .panic+0x224/0x2a4
[  145.971468] [c00fea5e7cb0] [c001e1fc] .die+0x43c/0x450
[  145.971543] [c00fea5e7d60] [c07b62c4] .do_page_fault+0x2d4/0x8f0
[  145.971618] [c00fea5e7e30] [c0008664] handle_page_fault+0x10/0x30

Regards,
Jan

[PATCH] powerpc/perf: add missing put_cpu_var in power_pmu_event_init

2015-03-24 Thread Jan Stancek
One path in power_pmu_event_init() calls get_cpu_var(), but is
missing the matching call to put_cpu_var(), which causes a preemption
imbalance and a crash in user space:

  Page fault in user mode with in_atomic() = 1 mm = c01fefa5a280
  NIP = 3fff9bf2cae0  MSR = 90014280f032
  Oops: Weird page fault, sig: 11 [#23]
  SMP NR_CPUS=2048 NUMA PowerNV
  Modules linked in: <snip>
  CPU: 43 PID: 10285 Comm: a.out Tainted: G  D 4.0.0-rc5+ #1
  task: c01fe82c9200 ti: c01fe835c000 task.ti: c01fe835c000
  NIP: 3fff9bf2cae0 LR: 3fff9bee4898 CTR: 3fff9bf2cae0
  REGS: c01fe835fea0 TRAP: 0401   Tainted: G  D  (4.0.0-rc5+)
  MSR: 90014280f032 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI>  CR: 2228  
XER: 
  CFAR: 3fff9bee4894 SOFTE: 1
   GPR00: 3fff9bee494c 3fffe01c2ee0 3fff9c084410 10020068
   GPR04:  0002 0008 0001
   GPR08: 0001 3fff9c074a30 3fff9bf2cae0 3fff9bf2cd70
   GPR12: 5222 3fff9c10b700
  NIP [3fff9bf2cae0] 0x3fff9bf2cae0
  LR [3fff9bee4898] 0x3fff9bee4898
  Call Trace:
  ---[ end trace 5d3d952b5d4185d4 ]---

  BUG: sleeping function called from invalid context at 
kernel/locking/rwsem.c:41
  in_atomic(): 1, irqs_disabled(): 0, pid: 10285, name: a.out
  INFO: lockdep is turned off.
  CPU: 43 PID: 10285 Comm: a.out Tainted: G  D 4.0.0-rc5+ #1
  Call Trace:
  [c01fe835f990] [c089c014] .dump_stack+0x98/0xd4 (unreliable)
  [c01fe835fa10] [c00e4138] .___might_sleep+0x1d8/0x2e0
  [c01fe835faa0] [c0888da8] .down_read+0x38/0x110
  [c01fe835fb30] [c00bf2f4] .exit_signals+0x24/0x160
  [c01fe835fbc0] [c00abde0] .do_exit+0xd0/0xe70
  [c01fe835fcb0] [c001f4c4] .die+0x304/0x450
  [c01fe835fd60] [c088e1f4] .do_page_fault+0x2d4/0x900
  [c01fe835fe30] [c0008664] handle_page_fault+0x10/0x30
  note: a.out[10285] exited with preempt_count 1

Reproducer:
  #include <stdio.h>
  #include <unistd.h>
  #include <syscall.h>
  #include <sys/types.h>
  #include <sys/stat.h>
  #include <linux/perf_event.h>
  #include <linux/hw_breakpoint.h>

  static struct perf_event_attr event = {
          .type = PERF_TYPE_RAW,
          .size = sizeof(struct perf_event_attr),
          .sample_type = PERF_SAMPLE_BRANCH_STACK,
          .branch_sample_type = PERF_SAMPLE_BRANCH_ANY_RETURN,
  };

  int main()
  {
          syscall(__NR_perf_event_open, &event, 0, -1, -1, 0);
  }

Signed-off-by: Jan Stancek <jstan...@redhat.com>
---
 arch/powerpc/perf/core-book3s.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 7c4f669..b101c0b 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1832,8 +1832,10 @@ static int power_pmu_event_init(struct perf_event *event)
 		cpuhw->bhrb_filter = ppmu->bhrb_filter_map(
 					event->attr.branch_sample_type);
 
-		if(cpuhw->bhrb_filter == -1)
+		if (cpuhw->bhrb_filter == -1) {
+			put_cpu_var(cpu_hw_events);
 			return -EOPNOTSUPP;
+		}
 	}
 
 	put_cpu_var(cpu_hw_events);
-- 
1.7.1
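The imbalance fixed by this patch can be shown with a small user-space model. This is only a sketch: `demo_get_cpu_var()` and `demo_put_cpu_var()` are hypothetical stand-ins that bump a plain counter in place of the kernel's preempt count, and `filter == -1` models bhrb_filter_map() rejecting the requested branch sample type.

```c
#include <errno.h>

/* Stand-in for the kernel's preempt counter (assumption for illustration). */
static int demo_preempt_count;

static void demo_get_cpu_var(void) { demo_preempt_count++; }
static void demo_put_cpu_var(void) { demo_preempt_count--; }

int demo_count(void) { return demo_preempt_count; }
void demo_reset(void) { demo_preempt_count = 0; }

/* Before the patch: the error path returns without the matching put. */
int event_init_buggy(int filter)
{
	demo_get_cpu_var();
	if (filter == -1)
		return -EOPNOTSUPP;	/* counter is left at 1 */
	demo_put_cpu_var();
	return 0;
}

/* After the patch: every exit path balances the get with a put. */
int event_init_fixed(int filter)
{
	demo_get_cpu_var();
	if (filter == -1) {
		demo_put_cpu_var();
		return -EOPNOTSUPP;
	}
	demo_put_cpu_var();
	return 0;
}
```

Calling `event_init_buggy(-1)` leaves `demo_count()` at 1, which corresponds to the "exited with preempt_count 1" note in the log above; `event_init_fixed(-1)` returns the same error with the counter back at 0.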
