[Kernel-packages] [Bug 1768115] Comment bridged from LTC Bugzilla
--- Comment From mdr...@us.ibm.com 2018-05-09 08:47 EDT---

(In reply to comment #26)
> Is it essential to have two NUMA nodes for the guest memory to see this bug?
> Can we reproduce it without the NUMA node stuff in the xml?

I haven't attempted it on my end; I can give it a try. But we suspect https://bugzilla.linux.ibm.com/show_bug.cgi?id=167036 may be the same issue (but with Pegas), since they're doing IO tests and seeing various IO-related failures after migration, and in that particular config there were no additional NUMA nodes in the guest.

I am hoping to get the dump-bitmap-on-demand test you suggested going today, and hopefully that can reproduce at a high enough frequency that I can try the kernel patches, disabling THP, and the NUMA configurations within a reasonable timeframe. The test I kicked off yesterday to capture the first 128 MB of the dirty bitmap ran all night without triggering...

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1768115

Title:
  ISST-LTE:KVM:Ubuntu1804:BostonLC:boslcp3g1: Migration guest running with IO stress crashed@security_file_permission+0xf4/0x160.

Status in The Ubuntu-power-systems project: Triaged
Status in linux package in Ubuntu: New

Bug description:

  Problem Description: A guest being migrated while running an IO stress workload crashed @ security_file_permission+0xf4/0x160 after a couple of migrations.

  Steps to re-create:
  Source host - boslcp3
  Destination host - boslcp4

  1. boslcp3 & boslcp4 installed with the latest kernel:
     root@boslcp3:~# uname -a
     Linux boslcp3 4.15.0-20-generic #21+bug166588 SMP Thu Apr 26 15:05:59 CDT 2018 ppc64le ppc64le ppc64le GNU/Linux
     root@boslcp4:~# uname -a
     Linux boslcp4 4.15.0-20-generic #21+bug166588 SMP Thu Apr 26 15:05:59 CDT 2018 ppc64le ppc64le ppc64le GNU/Linux

  2. Installed guest boslcp3g1 with the kernel below and started an LTP run from the boslcp3 host:
     root@boslcp3g1:~# uname -a
     Linux boslcp3g1 4.15.0-15-generic #16+bug166877 SMP Wed Apr 18 14:47:30 CDT 2018 ppc64le ppc64le ppc64le GNU/Linux

  3. Started migrating the boslcp3g1 guest from source to destination and vice versa.

  4. After a couple of migrations it crashed on boslcp4 and entered xmon:

     8:mon> t
     [c004f8a23d20] c05a7674 security_file_permission+0xf4/0x160
     [c004f8a23d60] c03d1d30 rw_verify_area+0x70/0x120
     [c004f8a23d90] c03d375c vfs_read+0x8c/0x1b0
     [c004f8a23de0] c03d3d88 SyS_read+0x68/0x110
     [c004f8a23e30] c000b184 system_call+0x58/0x6c
     --- Exception: c01 (System Call) at 71f1779fe280
     SP (7fffe99ece50) is in userspace

     8:mon> S
     msr   = 80001033   sprg0 =            pvr   = 004e1202   sprg1 = c7a85800
     dec   = 591e3e03   sprg2 = c7a85800   sp    = c004f8a234a0   sprg3 = 00010008
     toc   = c16eae00   dar   = 023c       srr0  = c00c355c   srr1  = 80001033
     dsisr = 4000       dscr  =            ppr   = 0010       pir   = 0011
     amr   =            uamor =            dpdes =            tir   =            cir =
     fscr  = 05000180   tar   =            pspb  =
     mmcr0 = 8000       mmcr1 =            mmcr2 =
     pmc1  =            pmc2  =            pmc3  =            pmc4  =            mmcra =   siar =
     pmc5  = 026c       sdar  =            sier  =            pmc6  = 0861
     ebbhr =            ebbrr =            bescr =            iamr  = 4000
     pidr  = 0034       tidr  =
     cpu 0x8: Vector: 700 (Program Check) at [c004f8a23220]
     pc:  c00e4854: xmon_core+0x1f24/0x3520
     lr:  c00e4850: xmon_core+0x1f20/0x3520
     sp:  c004f8a234a0
     msr: 80041033
     current = 0xc004f89faf00
     paca    = 0xc7a85800   softe: 0   irq_happened: 0x01
     pid     = 24028, comm = top
     Linux version 4.15.0-20-generic (buildd@bos02-ppc64el-002) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #21-Ubuntu SMP Tue Apr 24 06:14:44 UTC 2018 (Ubuntu 4.15.0-20.21-generic 4.15.17)
     cpu 0x8: Exception 700 (Program Check) in xmon, returning to main loop
     [c004f8a23d20] c05a7674 security_file_permission+0xf4/0x160
     [c004f8a23d60] c03d1d30 rw_verify_area+0x70/0x120
     [c004f8a23d90] c03d375c vfs_read+0x8c/0x1b0
     [c004f8a23de0] c03d3d88 SyS_read+0x68/0x110
     [c004f8a23e30] c000b184 system_call+0x58/0x6c
     --- Exception: c01 (System Call) at 71f1779fe280
     SP (7fffe99ece50) is in userspace

     8:mon> r
     R00 = c043b7fc        R16 =
     R01 = c004f8a23c90    R17 = ff70
     R02 = c16eae00        R18 = 0a51b4bebfc8
     R03 = c00279557200    R19 = 7fffe99edbb0
     R04 = c003242499c0    R20 = 0a51b4c04db0
     R05 = 0002            R21 = 0a51b4c20e90
     R06 = 0004            R22 = 00040f00
     R07 = ff81            R23 = 0a51b4c06560
     R08 = ff80            R24 = ff80
     R09 =                 R25 = 0a51b4bec2b8
     R10 =                 R26 = 71f177bb0b20
     R11 =                 R27 =
     R12 = 2000            R28 = c00279557200
     R13 = c7a85800        R29 = c004c7734210
     R14 =                 R30 =
     R15 =                 R31 = c003242499c0
     pc  = c043b808 __fsnotify_parent+0x88/0x1a0
     cfar =
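  Step 3 above is typically driven with a back-and-forth live-migration loop. The exact invocation the reporters used is not shown in the report, so the following is only an illustrative sketch (hostnames and guest name are taken from the description; --timeout 60 matches the behavior described in later comments):

     # Illustrative migration loop, run from boslcp3. --timeout 60 forces the
     # guest to be suspended if live migration hasn't converged within 60s,
     # which is the source of the guest-visible time jumps discussed below.
     while true; do
         virsh migrate --live --timeout 60 boslcp3g1 qemu+ssh://boslcp4/system
         virsh -c qemu+ssh://boslcp4/system migrate --live --timeout 60 boslcp3g1 qemu+ssh://boslcp3/system
     done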
[Kernel-packages] [Bug 1768115] Comment bridged from LTC Bugzilla
--- Comment From p...@au1.ibm.com 2018-05-09 00:25 EDT---

Is it essential to have two NUMA nodes for the guest memory to see this bug? Can we reproduce it without the NUMA node stuff in the xml?
[Kernel-packages] [Bug 1768115] Comment bridged from LTC Bugzilla
--- Comment From mdr...@us.ibm.com 2018-05-08 16:37 EDT---

Hit another instance of the RAM inconsistencies prior to resuming the guest on the target side (this one is migrating from boslcp6 to boslcp5 and crashing after it resumes execution on boslcp5). The signature is eerily similar to the ones above... the workload is blast from LTP, but it's strange that 3 out of 3 so far have involved the same data structure. Maybe there's a relationship between something the process is doing and dirty syncing?

root@boslcp5:~/vm_logs/1525768538/dumps# xxd -s 20250624 -l 128 0-2.vm0.iteration2a
0135:
01350010:
01350020:
01350030:
01350040:
01350050:
01350060:
01350070:
root@boslcp5:~/vm_logs/1525768538/dumps# xxd -s 20250624 -l 128 0-2.vm0.iteration2a.boslcp6
0135:     d603 0100 2f62 6c61 7374 2f76    /blast/v
01350010: 6463 3400                        dc4.
01350020:
01350030:
01350040:
01350050:
01350060:
01350070:

For this run I included traces of the various stages of memory migration on the QEMU side relative to dirty bitmap sync (attached above). The phases are:

"ram_save_setup": enables dirty logging, sets up the data structures used for tracking dirty pages, and does the initial bitmap sync. QEMU keeps its own copy of the dirty bitmap, which gets OR'd with the one provided by KVM on each bitmap sync. There are 2 blocks (ram-node0/ram-node1), each with its own bitmap / KVM memslot, since the guest was defined with 2 NUMA nodes. Only ram-node0 would be relevant here, since it has offset 0 in the guest physical memory address space.

"ram_save_pending": called before each iteration to see if there are pages still pending. When the number of dirty pages in the QEMU bitmap drops below a certain value, it does another sync with KVM's bitmap.

"ram_save_iterate": walks the QEMU dirty bitmap and sends the corresponding pages until there are none left or some other limit (e.g. bandwidth throttling or max-pages-per-iteration) is hit. "ram_save_pending"/"ram_save_iterate" keep repeating until no more pages are left.

"ram_save_complete": does a final sync with the KVM bitmap, sends the final set of pages, then disables dirty logging and completes the migration.

"vm_stop" denotes when the guest VCPUs have all exited and stopped execution.

There are 2 migrations reflected in the posted traces. The first one (everything between the first ram_save_setup and the first ram_save_complete) can be ignored; it's just a backup of the VM. After the VM is backed up it resumes execution, and that is the state we're migrating here and seeing the crash with on the other end.

The sequence of events in this run is comparable to previous successful runs: no strange orderings or missed calls to sync with the KVM dirty bitmap, etc. The condensed version of the trace is below, but it looks like there's a sync prior to vm_stop and a sync afterward, and given that these syncs are OR'd into a persistent bitmap maintained by QEMU, there shouldn't be any loss of dirty-page information with this particular ordering of events.
117401@1525770831.423435: >ram_save_setup
117401@1525770831.424386: migration_bitmap_sync, count: 4
117401@1525770831.424400: qemu_global_log_sync
117401@1525770831.424410: qemu_global_log_sync, name: ram-node0, addr: 0
117401@1525770831.424419: kvm_log_sync, addr: 0, size: 28000
117401@1525770831.445270: qemu_global_log_sync, name: ram-node1, addr: 28000
117401@1525770831.445279: kvm_log_sync, addr: 28000, size: 28000
117401@1525770831.545805: qemu_global_log_sync, name: vga.vram, addr: 8000
117401@1525770831.545814: kvm_log_sync, addr: 20008000, size: 100
117401@1525770831.545831: qemu_global_log_sync, name: ram-node0, addr: 0
117401@1525770831.545905: qemu_global_log_sync, name: ram-node1, addr: 28000
117401@1525770831.545959: qemu_global_log_sync, name: vga.vram, addr: 8000
117401@1525770831.545965: migration_bitmap_sync, id: ram-node0, block->mr->name: ram-node0, block->used_length: 28000h
117401@1525770831.547606: migration_bitmap_sync, id: ram-node1, block->mr->name: ram-node1, block->used_length: 28000h
117401@1525770831.548986: ram_save_pending, dirty pages remaining: 5247120, page size: 4096
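For anyone trying to reproduce the tracing elsewhere: tracepoints like the ones above can be enabled through QEMU's -trace option, assuming they are defined in the trace-events list of the QEMU build being tested (several of the names here appear to be locally added rather than stock events). A minimal sketch:

    # List the migration/dirty-bitmap tracepoints of interest (wildcards are
    # allowed); only events actually defined in this QEMU build will match.
    printf '%s\n' 'migration_bitmap_sync*' 'kvm_log_sync*' 'ram_save_*' > /tmp/migration-events

    # Pass the event list and an output file on the QEMU command line (with
    # libvirt this would go through a <qemu:commandline> override in the XML):
    qemu-system-ppc64 ... -trace events=/tmp/migration-events,file=/tmp/migration-trace.log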
[Kernel-packages] [Bug 1768115] Comment bridged from LTC Bugzilla
--- Comment From mdr...@us.ibm.com 2018-05-07 14:48 EDT---

The RCU connection is possibly a red herring. I tested the above theory about RCU timeouts/warnings being a trigger by modifying QEMU to allow the guest timebase to be advanced artificially, triggering RCU timeouts/warnings in rapid succession, and ran this for 8 hours with the same workload without seeing a crash. It seems migration is a necessary component to reproduce this.

I did further tests to capture the guest memory state before/after migration to see if there's possibly an issue with dirty-page tracking or something similar that could explain the crashes, and have data from 2 crashes that show a difference of roughly 24 bytes between source/target after migration within the first 100 MB of the guest physical address range. I have more details in the summaries/logs I'm attaching, but one example is below (from the "migtest" log):

root@boslcp5:~/dumps-cmp# xxd -s 0x013e -l 128 0-2.boslcp5
013e:
013e0010:
013e0020:
013e0030:
013e0040:
013e0050:
013e0060:
013e0070:
root@boslcp5:~/dumps-cmp# xxd -s 0x013e -l 128 0-2.boslcp6
013e:     3403 0100 0002 2f62 6c61 7374 2f76    4.../blast/v
013e0010: 6463 3400                             dc4.
013e0020:
013e0030:
013e0040:
013e0050:
013e0060:
013e0070:

"blast" is part of the LTP IO test suite running in the guest. It seems some data structure related to it is present in the source guest memory but not on the target side. Part of the structure seems to be a trigger buffer, but the preceding value might be a pointer or something else, and that may explain the crashes if it ends up being zero'd on the target side.

The other summary/log I'm attaching has an almost identical inconsistency between source/target, from another guest using the same workload and hitting a crash:

root@boslcp5:~/dumps-cmp-migtest2# xxd -s 38273024 -l 128 0-2.boslcp5
0248:     c000 0100 2f62 6c61 7374 2f76    /blast/v
02480010: 6462 3400                        db4.
02480020:
02480030:
02480040:
02480050:
02480060:
02480070:
root@boslcp5:~/dumps-cmp-migtest2# xxd -s 38273024 -l 128 0-2.boslcp6
0248:
02480010:
02480020:
02480030:
02480040:
02480050:
02480060:
02480070:

It seems highly likely there's an issue related to dirty bitmap tracking at play here. Could use some help from kernel folks on figuring out where that might lie. The crashed guests are still live ATM, so let me know if there's anything I should try to gather.
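As a reference, one way such before/after dumps can be captured and compared is via the QEMU monitor's pmemsave command. The guest name, region size, and file paths below are hypothetical, and it assumes the guest is paused (e.g. right after migration completes, before resume) so the region isn't changing while it is dumped:

    # On the source host, dump the first 128 MB of guest physical memory
    # (pmemsave takes address, size, filename):
    virsh qemu-monitor-command boslcp3g1 --hmp 'pmemsave 0 134217728 /tmp/guestmem.src'

    # Repeat on the destination host after migration, copy one file across,
    # then locate and inspect differing regions:
    virsh qemu-monitor-command boslcp3g1 --hmp 'pmemsave 0 134217728 /tmp/guestmem.dst'
    cmp -l /tmp/guestmem.src /tmp/guestmem.dst | head   # 1-based offsets of differing bytes
    xxd -s 0x013e0000 -l 128 /tmp/guestmem.src          # inspect one of the differing regions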
[Kernel-packages] [Bug 1768115] Comment bridged from LTC Bugzilla
--- Comment From mdr...@us.ibm.com 2018-05-04 09:09 EDT---

(In reply to comment #15)
> This is not the same as the original bug, but I suspect they are part of a
> class of issues we're hitting while running under very particular
> circumstances which might not generally be seen during normal operation and
> triggering various corner cases. As such I think it makes sense to group
> them under this bug for the time being.
>
> The general workload is running IO-heavy disk workloads on large guests
> (20GB memory, 16 vcpus) with SAN-based storage, and then performing
> migration during the workload. During migration we begin to see a high
> occurrence of rcu_sched stall warnings, and after 1-3 hours of operations
> we hit filesystem-related crashes like the ones posted. We've seen this with
> 2 separate FC cards, emulex and qlogic, where we invoke QEMU through libvirt
> as:

We've been gathering additional traces while running under this scenario, and while most of the traces so far have been filesystem-related, we now have a couple that suggest the common thread between all of these failures is RCU-managed data structures. I'll attach the xmon summaries for these; they include the full dmesg log since guest start, timestamps in dmesg noting where migration started/stopped, and "WATCHDOG" messages noting any large jumps in wall-clock time. For example (from boslcp3g1-migtest-fail-on-lcp5):

[ 5757.347542] migration iteration 7: started at Thu May 3 05:59:14 CDT 2018
[ 5935.727884] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 5935.728567] 1-...!: (670 GPs behind) idle=486/140/0 softirq=218179/218180 fqs=0
[ 5935.730091] 2-...!: (3750 GPs behind) idle=006/140/0 softirq=203335/203335 fqs=0
[ 5935.731076] 4-...!: (96 GPs behind) idle=c2e/140/0 softirq=168607/168608 fqs=0
[ 5935.731783] 5-...!: (2270 GPs behind) idle=e16/140/0 softirq=152608/152608 fqs=1
[ 5935.732959] 6-...!: (322 GPs behind) idle=3ca/141/0 softirq=169452/169453 fqs=1
[ 5935.735061] 8-...!: (6 GPs behind) idle=c36/141/0 softirq=280514/280516 fqs=1
[ 5935.736638] 9-...!: (5 GPs behind) idle=c1e/141/0 softirq=248247/248249 fqs=1
[ 5935.738112] 10-...!: (4 GPs behind) idle=62a/1/0 softirq=228207/228208 fqs=1
[ 5935.738868] 11-...!: (32 GPs behind) idle=afe/140/0 softirq=228817/228818 fqs=1
[ 5935.739122] 12-...!: (3 GPs behind) idle=426/1/0 softirq=192716/192717 fqs=1
[ 5935.739295] 14-...!: (5 GPs behind) idle=e56/140/0 softirq=133888/133892 fqs=1
[ 5935.739486] 15-...!: (7 GPs behind) idle=36e/140/0 softirq=161010/161013 fqs=1
...
[ 5935.740031] Unable to handle kernel paging request for data at address 0x0008
[ 5935.740128] Faulting instruction address: 0xc0403d04

For the prior iterations where we don't crash we'd have messages like:

[ 2997.413561] WATCHDOG (Thu May 3 05:13:18 CDT 2018): time jump of 114 seconds
[ 3023.759629] migration iteration 1: completed at Thu May 3 05:13:25 CDT 2018
[ 3239.678964] migration iteration 2: started at Thu May 3 05:16:45 CDT 2018

The WATCHDOG message notes the amount of time the guest sees jump after it resumes execution. These are generally on the order of 1-2 minutes here, since we're doing migration via virsh migrate ... --timeout 60, which manually stops the guest if it hasn't finished migrating within 60s. We now know that the skip in time actually originates from behavior on the source side of the migration, due to handling within QEMU, and the guest is reacting to it after it wakes up from migration.

A patch has been sent which changes the behavior so that the guest doesn't see a jump in time after resuming: http://lists.nongnu.org/archive/html/qemu-devel/2018-05/msg00928.html

The patch is still under discussion, and it's not clear yet whether this is actually a QEMU bug or intended behavior. I'm still testing the patch in conjunction with the original workload and would like to see it run over the weekend before I can say anything with certainty, but so far it has run overnight, whereas prior to the change it would crash after an hour or two (though we have seen runs that survived as long as 8 hours, so that's not definitive). If it survives, that would suggest the RCU-related crashes occur as a result of a jump in the guest VCPUs' timebase register.

One interesting thing I've noticed is that with a QEMU that *doesn't have the patch above*, disabling RCU stall warning messages via:

  echo 1 > /sys/module/rcupdate/parameters/rcu_cpu_stall_suppress

allowed the workload to run for 16 hours without crashing. This may suggest the warning messages, in conjunction with rcu_cpu_stall_timeout being exceeded due to the jump in the timebase register, are triggering issues with RCU. What I plan to try next is raising rcu_cpu_stall_timeout to a much higher value (currently 21 on Ubuntu 18.04 it seems) and
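For reference, both stall-warning knobs mentioned above can be adjusted at runtime from inside the guest; the timeout value below is only an illustrative choice, not one taken from these runs:

  # Suppress RCU stall warnings entirely (as in the 16-hour run above):
  echo 1 > /sys/module/rcupdate/parameters/rcu_cpu_stall_suppress

  # Or raise the stall timeout (in seconds) well past the observed 1-2 minute
  # time jumps; 300 here is an arbitrary example value:
  echo 300 > /sys/module/rcupdate/parameters/rcu_cpu_stall_timeout

  # The same settings can be applied at boot via the kernel command line:
  #   rcupdate.rcu_cpu_stall_suppress=1 rcupdate.rcu_cpu_stall_timeout=300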
[Kernel-packages] [Bug 1768115] Comment bridged from LTC Bugzilla
--- Comment From dougm...@us.ibm.com 2018-04-30 15:31 EDT---

Both logs show that the dmesg buffer has been overrun, so by the time you get to xmon and run "dl" you've lost the messages that show what happened before things went wrong. You will need to be collecting console output from the beginning in order to show what happened.
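One way to collect console output from the beginning, as suggested above (the guest name and log paths are hypothetical; the <log> element needs a reasonably recent libvirt/QEMU):

  # Attach to the guest console and keep a copy of everything it prints:
  virsh console boslcp3g1 | tee /var/log/boslcp3g1-console.log

  # Alternatively, have libvirt log the serial console persistently by adding
  # a <log> element to the guest's serial/console device in its XML:
  #   <serial type='pty'>
  #     <log file='/var/log/libvirt/qemu/boslcp3g1-console.log' append='on'/>
  #     <target port='0'/>
  #   </serial>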