------- Comment (attachment only) From pavra...@in.ibm.com 2018-04-11 08:11 EDT-------


** Attachment added: "failure console log"
   https://bugs.launchpad.net/bugs/1761729/+attachment/5110708/+files/new_kernels.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1761729

Title:
  Ubuntu 18.04  Machine crashed while running ltp.

Status in The Ubuntu-power-systems project:
  Confirmed
Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Bionic:
  Confirmed

Bug description:
  ---Problem Description---
  Ubuntu 18.04 [ Briggs P8 ]: Machine crashed while running ltp.

  ---Environment--
  Kernel Build:  Ubuntu 18.04
  System Name :  ltc-briggs2
  Model/Type  :  P8
  Platform    :  BML

  ---Uname output---

  root@ltc-briggs2:~# uname -a
  Linux ltc-briggs2 4.15.0-13-generic #14-Ubuntu SMP Sat Mar 17 13:43:15 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

  ---Steps to reproduce--

  $ git clone https://github.com/linux-test-project/ltp.git
  $ cd ltp
  $ make autotools
  $ ./configure
  $ make
  $ make install

  
  ltp
  =====

  root@ltc-briggs2:~# 
  root@ltc-briggs2:~# [10781.098337] LTP: starting fs_inod01 (fs_inod $TMPDIR 10 10 10)
  [10782.837910] LTP: starting linker01 (linktest.sh 1000 1000)
  [10784.504474] LTP: starting openfile01 (openfile -f10 -t10)
  [10784.534953] LTP: starting inode01
  [10784.550767] LTP: starting inode02
  [10784.739104] LTP: starting stream01
  [10784.740840] LTP: starting stream02
  [10784.742487] LTP: starting stream03
  [10784.744532] LTP: starting stream04
  [10784.746087] LTP: starting stream05
  [10784.747722] LTP: starting ftest01
  [10785.142054] LTP: starting ftest02
  [10785.158852] LTP: starting ftest03
  [10785.404760] LTP: starting ftest04
  [10785.527197] LTP: starting ftest05
  [10785.937164] LTP: starting ftest06
  [10785.958360] LTP: starting ftest07
  [10786.463382] LTP: starting ftest08
  [10786.592998] LTP: starting lftest01 (lftest 100)
  [10786.672707] LTP: starting writetest01 (writetest)
  [10786.774292] LTP: starting fs_di (fs_di -d $TMPDIR)
  [10792.973510] LTP: starting proc01 (proc01 -m 128)
  [10793.865686] ICMPv6: process `proc01' is using deprecated sysctl (syscall) net.ipv6.neigh.default.base_reachable_time - use net.ipv6.neigh.default.base_reachable_time_ms instead
  [10795.785593] LTP: starting read_all_dev (read_all -d /dev -e '/dev/watchdog?(0)' -q -r 10)
  [10795.895774] NET: Registered protocol family 40
  [10795.918763] Bluetooth: Core ver 2.22
  [10795.918866] NET: Registered protocol family 31
  [10795.918909] Bluetooth: HCI device and connection manager initialized
  [10795.918955] Bluetooth: HCI socket layer initialized
  [10795.918991] Bluetooth: L2CAP socket layer initialized
  [10795.919032] Bluetooth: SCO socket layer initialized
  [10798.374850] usercopy: kernel memory exposure attempt detected from 0000000029431ea4 (<kernel text>) (1023 bytes)
  [10798.374952] ------------[ cut here ]------------
  [10798.374988] kernel BUG at /build/linux-2BXDjB/linux-4.15.0/mm/usercopy.c:72!
  [10798.375041] Oops: Exception in kernel mode, sig: 5 [#1]
  [10798.375080] LE SMP NR_CPUS=2048 NUMA PowerNV
  [10871.343999650,5] OPAL: Switch to big-endian OS
  [10876.190849323,5] OPAL: Switch to little-endian OS
  [10798.375117] Modules linked in: hci_vhci bluetooth ecdh_generic vhost_vsock cuse 
vmw_vsock_virtio_transport_common userio vsock uhid vhost_net vhost tap snd_seq 
snd_seq_device snd_timer snd soundcore binfmt_misc sctp quota_v2 quota_tree 
nls_iso8859_1 ntfs xfs xt_CHECKSUM iptable_mangle ipt_MASQUERADE 
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 
nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp 
bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables 
iptable_filter kvm_hv kvm idt_89hpesx vmx_crypto ofpart cmdlinepart 
ipmi_powernv powernv_flash ipmi_devintf mtd ipmi_msghandler ibmpowernv opal_prd 
at24 powernv_rng joydev input_leds mac_hid uio_pdrv_genirq uio sch_fq_codel 
nfsd ib_iser rdma_cm auth_rpcgss iw_cm nfs_acl lockd ib_cm grace iscsi_tcp
  [10798.375636]  libiscsi_tcp libiscsi sunrpc scsi_transport_iscsi ip_tables 
x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov 
async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 
multipath linear mlx5_ib ses enclosure scsi_transport_sas hid_generic usbhid 
hid ib_core qla2xxx ast i2c_algo_bit ttm mlx5_core drm_kms_helper syscopyarea 
sysfillrect sysimgblt fb_sys_fops nvme_fc crct10dif_vpmsum nvme_fabrics ahci 
mlxfw crc32c_vpmsum i40e drm devlink scsi_transport_fc megaraid_sas libahci
  [10798.375961] CPU: 87 PID: 4085 Comm: read_all Not tainted 4.15.0-13-generic #14-Ubuntu
  [10798.376013] NIP:  c0000000003c76f0 LR: c0000000003c76ec CTR: 00000000300378e8
  [10798.376068] REGS: c0000076c63aba00 TRAP: 0700   Not tainted  (4.15.0-13-generic)
  [10798.376120] MSR:  9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 28002222  XER: 20000000
  [10798.376176] CFAR: c00000000018cce4 SOFTE: 1
  [10798.376176] GPR00: c0000000003c76ec c0000076c63abc80 c0000000016eaf00 0000000000000064
  [10798.376176] GPR04: c000007ffc1cce18 c000007ffc1e4368 9000000000009033 000000000000040f
  [10798.376176] GPR08: 0000000000000007 c0000000011c3a74 0000007ffb010000 9000000000001003
  [10798.376176] GPR12: 0000000000002200 c000000007a8bd00 0000000000000000 0000000000000000
  [10798.376176] GPR16: 0000000000000000 0000000000000000 0000000000000006 00007ffff7a0a018
  [10798.376176] GPR20: 000008bb551c8908 000008bb551c88f8 000008bb551c88c8 c0000076c63abe00
  [10798.376176] GPR24: 0000000000010000 0000000000000000 00007ffff7a0a018 c0000076c63abe00
  [10798.376176] GPR28: c0000000000003ff 0000000000000001 00000000000003ff c000000000000000
  [10798.376619] NIP [c0000000003c76f0] __check_object_size+0x140/0x270
  [10798.376662] LR [c0000000003c76ec] __check_object_size+0x13c/0x270
  [10798.376706] Call Trace:
  [10798.376724] [c0000076c63abc80] [c0000000003c76ec] __check_object_size+0x13c/0x270 (unreliable)
  [10798.376787] [c0000076c63abd00] [c0000000008268a4] read_mem+0x84/0x220
  [10798.376835] [c0000076c63abd70] [c0000000003d109c] __vfs_read+0x3c/0x70
  [10798.376880] [c0000076c63abd90] [c0000000003d118c] vfs_read+0xbc/0x1b0
  [10798.376925] [c0000076c63abde0] [c0000000003d1788] SyS_read+0x68/0x110
  [10798.377012] [c0000076c63abe30] [c00000000000b184] system_call+0x58/0x6c
  [10798.377057] Instruction dump:
  [10798.377086] 2fbd0000 419e010c 3c82ff8b 3ca2ff94 3884c360 38a5ad68 3c62ff8b 7fc8f378
  [10798.377140] 7fe6fb78 3863c370 4bdc55b5 60000000 <0fe00000> 60000000 60000000 60420000
  [10798.377195] ---[ end trace 21abd4753a69334c ]---
  [10798.445038] 
  [10798.445135] Sending IPI to other CPUs
  [10798.446688] IPI complete
  [10798.449081] kexec: waiting for cpu 0 (physical 16) to enter OPAL
  [10798.450224] kexec: waiting for cpu 23 (physical 47) to enter OPAL
  [10798.451396] kexec: waiting for cpu 54 (physical 94) to enter OPAL
  [10800.049202] kexec: Starting switchover sequence.
  [    1.078053] integrity: Unable to open file: /etc/keys/x509_ima.der (-2)
  [    1.078057] integrity: Unable to open file: /etc/keys/x509_evm.der (-2)
  [    1.165219] vio vio: uevent: failed to send synthetic uevent
  /dev/nvme0n1p2: recovering journal
  /dev/nvme0n1p2: clean, 14017353/122101760 files, 57953106/488376576 blocks
  -.mount
  sys-kernel-debug.mount
  setvtrgb.service
  dev-hugepages.mount
  dev-mqueue.mount
  kmod-static-nodes.service
  lvm2-lvmetad.service
  systemd-remount-fs.service
  systemd-tmpfiles-setup-dev.service
  systemd-random-seed.service
  lvm2-monitor.service
  systemd-udevd.service
  systemd-modules-load.service
  sys-fs-fuse-connections.mount
  sys-kernel-config.mount
  systemd-sysctl.service
  systemd-networkd.service
  swapfile.swap
  [    5.177490] vio vio: uevent: failed to send synthetic uevent
  systemd-udev-trigger.service
  keyboard-setup.service
  systemd-journald.service
  [    5.458352] qla2xxx [0020:01:00.0]-00c6:17: MSI-X: Failed to enable support with 32 vectors, using 10 vectors.
  apparmor.service
  systemd-journal-flush.service
  systemd-tmpfiles-setup.service
  systemd-update-utmp.service
  [    6.119284] qla2xxx [0020:01:00.1]-00c6:18: MSI-X: Failed to enable support with 32 vectors, using 10 vectors.
  systemd-timesyncd.service
  [   10.052141] megaraid_sas 0001:03:00.0: Init cmd return status SUCCESS for SCSI host 1
  systemd-networkd-wait-online.service
  iscsid.service
  blk-availability.service
  [   10.675964] kdump-tools[2222]: Starting kdump-tools:  * running makedumpfile -c -d 31 /proc/vmcore /var/crash/201804050340/dump-incomplete
  lvm2-pvscan@8:195.service
  lvm2-pvscan@8:179.service
  Copying data                                      : [100.0 %] /           eta: 0s
  [   55.227083] kdump-tools[2222]: The kernel version is not supported.
  [   55.227300] kdump-tools[2222]: The makedumpfile operation may be incomplete.
  [   55.227471] kdump-tools[2222]: The dumpfile is saved to /var/crash/201804050340/dump-incomplete.
  [   55.227583] kdump-tools[2222]: makedumpfile Completed.
  [   55.230250] kdump-tools[2222]:  * kdump-tools: saved vmcore in /var/crash/201804050340
  [   55.311695] kdump-tools[2222]:  * running makedumpfile --dump-dmesg /proc/vmcore /var/crash/201804050340/dmesg.201804050340
  [   55.330032] kdump-tools[2222]: The kernel version is not supported.
  [   55.330206] kdump-tools[2222]: The makedumpfile operation may be incomplete.
  [   55.330302] kdump-tools[2222]: The dmesg log is saved to /var/crash/201804050340/dmesg.201804050340.
  [   55.330416] kdump-tools[2222]: makedumpfile Completed.
  [   55.330533] kdump-tools[2222]:  * kdump-tools: saved dmesg content in /var/crash/201804050340
  [   55.334722] kdump-tools[2222]: Thu, 05 Apr 2018 03:40:44 -0500
  [   55.338419] kdump-tools[2222]: Rebooting.
  [   55.546343] mlx5_core 0021:01:00.1: mlx5_enter_error_state:121:(pid 2715): start
  [   55.546414] mlx5_core 0021:01:00.1: mlx5_enter_error_state:128:(pid 2715): end
  [   55.942498] mlx5_core 0021:01:00.0: mlx5_enter_error_state:121:(pid 2715): start
  [   55.942631] mlx5_core 0021:01:00.0: mlx5_enter_error_state:128:(pid 2715): end
  [   59.836381] reboot: Restarting system
  [10963.485916127,5] OPAL: Reboot request...
    5.31149|Ignoring boot flags, incorrect version 0x0
    5.52090|ISTEP  6. 3
    6.16670|ISTEP  6. 4
    6.16957|ISTEP  6. 5
    8.74865|HWAS|PRESENT> DIMM[03]=00AA00AA00AA00AA
    8.74865|HWAS|PRESENT> Membuf[04]=4444000000000000
    8.74866|HWAS|PRESENT> Proc[05]=C000000000000000
   14.03690|ISTEP  6. 6
   14.11948|ISTEP  6. 7
   16.75478|ISTEP  6. 8
   16.91585|ISTEP  6. 9
   17.47534|ISTEP  6.10
   17.55249|ISTEP  6.11
   19.29629|ISTEP  6.12
   19.29926|ISTEP  6.13
   19.30139|ISTEP  7. 1
   19.51889|ISTEP  7. 2
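
To narrow down which LTP test was running when the BUG hit, the console log above can be scanned for the last "LTP: starting" marker before the "kernel BUG at" line. A minimal Python sketch follows. The helper names (last_test_before_bug, oops_comm) are illustrative and not part of LTP or any kernel tool, and the embedded excerpt is a hand-rejoined subset of the log above.

```python
import re

# Excerpt of the console log above, with wrapped lines rejoined by hand.
CONSOLE_EXCERPT = """\
[10792.973510] LTP: starting proc01 (proc01 -m 128)
[10795.785593] LTP: starting read_all_dev (read_all -d /dev -e '/dev/watchdog?(0)' -q -r 10)
[10795.918763] Bluetooth: Core ver 2.22
[10798.374850] usercopy: kernel memory exposure attempt detected from 0000000029431ea4 (<kernel text>) (1023 bytes)
[10798.374988] kernel BUG at /build/linux-2BXDjB/linux-4.15.0/mm/usercopy.c:72!
[10798.375961] CPU: 87 PID: 4085 Comm: read_all Not tainted 4.15.0-13-generic #14-Ubuntu
"""

LTP_START_RE = re.compile(r"LTP: starting (\S+)")
COMM_RE = re.compile(r"Comm: (\S+)")


def last_test_before_bug(text):
    """Return the tag of the last LTP test started before the first BUG line."""
    last = None
    for line in text.splitlines():
        match = LTP_START_RE.search(line)
        if match:
            last = match.group(1)
        if "kernel BUG at" in line:
            return last
    return None  # no BUG line found


def oops_comm(text):
    """Return the command name from the Oops 'CPU: ... Comm: ...' line, if any."""
    match = COMM_RE.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    print("last LTP test before BUG:", last_test_before_bug(CONSOLE_EXCERPT))
    print("faulting command:", oops_comm(CONSOLE_EXCERPT))
```

On this excerpt the two values agree with each other: the last test tag started is read_all_dev and the faulting command is read_all.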

  == Comment: #7 - Vaishnavi Bhat <vaish...@in.ibm.com> - 2018-04-06 04:52:31 ==
  A kernel memory exposure attempt was detected, and BUG() was called from
  mm/usercopy.c:72 (source shown below):

        KERNEL: /usr/lib/debug/boot/vmlinux-4.15.0-13-generic
      DUMPFILE: dump.201804050340  [PARTIAL DUMP]
          CPUS: 160
          DATE: Thu Apr  5 03:39:16 2018
        UPTIME: 00:48:44
  LOAD AVERAGE: 2.78, 11.61, 106.19
         TASKS: 1748
      NODENAME: ltc-briggs2
       RELEASE: 4.15.0-13-generic
       VERSION: #14-Ubuntu SMP Sat Mar 17 13:43:15 UTC 2018
       MACHINE: ppc64le  (2926 Mhz)
        MEMORY: 512 GB
         PANIC: "kernel BUG at /build/linux-2BXDjB/linux-4.15.0/mm/usercopy.c:72!"
           PID: 4085
       COMMAND: "read_all"
          TASK: c000007659f23f00  [THREAD_INFO: c0000076c63a8000]
           CPU: 87
         STATE: TASK_RUNNING (PANIC)

  crash> bt
  PID: 4085   TASK: c000007659f23f00  CPU: 87  COMMAND: "read_all"
   #0 [c0000076c63ab740] crash_kexec at c0000000001e22b0
   #1 [c0000076c63ab780] oops_end at c000000000025888
   #2 [c0000076c63ab800] _exception at c000000000026684
   #3 [c0000076c63ab990] program_check_common at c000000000008da4
   Program Check [700] exception frame:
   R0:  c0000000003c76ec    R1:  c0000076c63abc80    R2:  c0000000016eaf00   
   R3:  0000000000000064    R4:  c000007ffc1cce18    R5:  c000007ffc1e4368   
   R6:  9000000000009033    R7:  000000000000040f    R8:  0000000000000007   
   R9:  c0000000011c3a74    R10: 0000007ffb010000    R11: 9000000000001003   
   R12: 0000000000002200    R13: c000000007a8bd00    R14: 0000000000000000   
   R15: 0000000000000000    R16: 0000000000000000    R17: 0000000000000000   
   R18: 0000000000000006    R19: 00007ffff7a0a018    R20: 000008bb551c8908   
   R21: 000008bb551c88f8    R22: 000008bb551c88c8    R23: c0000076c63abe00   
   R24: 0000000000010000    R25: 0000000000000000    R26: 00007ffff7a0a018   
   R27: c0000076c63abe00    R28: c0000000000003ff    R29: 0000000000000001   
   R30: 00000000000003ff    R31: c000000000000000   
   NIP: c0000000003c76f0    MSR: 9000000000029033    OR3: c00000000018cce4
   CTR: 00000000300378e8    LR:  c0000000003c76ec    XER: 0000000020000000
   CCR: 0000000028002222    MQ:  0000000000000001    DAR: 0000000000000000
   DSISR: 0000000000000000     Syscall Result: 0000000000000000
   #4 [c0000076c63abc80] __check_object_size at c0000000003c76f0
   [Link Register] [c0000076c63abc80] __check_object_size at c0000000003c76ec  (unreliable)
   #5 [c0000076c63abd00] read_mem at c0000000008268a4
   #6 [c0000076c63abd70] __vfs_read at c0000000003d109c
   #7 [c0000076c63abd90] vfs_read at c0000000003d118c
   #8 [c0000076c63abde0] sys_read at c0000000003d1788
   #9 [c0000076c63abe30] system_call at c00000000000b184
   System Call [c01] exception frame:
   R0:  0000000000000003    R1:  00007ffff7a09ae0    R2:  0000753ec21b7f00   
   R3:  0000000000000006    R4:  00007ffff7a0a018    R5:  00000000000003ff   
   R6:  0000000000004000    R7:  0000753ec21898c4    R8:  900000010000d033   
   R9:  0000000000000000    R10: 0000000000000000    R11: 0000000000000000   
   R12: 0000000000000000    R13: 0000753ec224a8d0   
   NIP: 0000753ec2188580    MSR: 900000010000d033    OR3: 0000000000000006
   CTR: 0000000000000000    LR:  000008bb551b5f20    XER: 0000000000000000
   CCR: 0000000042002244    MQ:  0000000000000001    DAR: 0000753ec21affa8
   DSISR: 0000000040000000     Syscall Result: 0000000000000006
  crash> dis -s c0000000003c76f0
  FILE: /build/linux-2BXDjB/linux-4.15.0/mm/usercopy.c
  LINE: 72

  static void report_usercopy(const void *ptr, unsigned long len,
                              bool to_user, const char *type)
  {
          pr_emerg("kernel memory %s attempt detected %s %p (%s) (%lu bytes)\n",
                  to_user ? "exposure" : "overwrite",
                  to_user ? "from" : "to", ptr, type ? : "unknown", len);
          /*
           * For greater effect, it would be nice to do do_group_exit(),
           * but BUG() actually hooks all the lock-breaking and per-arch
           * Oops code, so that is used here instead.
           */
          BUG();
  }
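
To relate the console message to the arguments of report_usercopy(), the format string above can be mimicked in user space. This is only a sketch in Python: format_usercopy_report is a hypothetical helper, and the pointer is passed as the text already printed in the log. As far as I know, %p output is hashed on 4.15 kernels, so 0000000029431ea4 is probably not the real address.

```python
def format_usercopy_report(ptr_text, length, to_user, obj_type=None):
    """Mimic the pr_emerg() format string used by report_usercopy().

    ptr_text is the pointer exactly as printed in the log (assumed hashed),
    obj_type corresponds to the 'type' argument and defaults to "unknown".
    """
    return "kernel memory %s attempt detected %s %s (%s) (%d bytes)" % (
        "exposure" if to_user else "overwrite",
        "from" if to_user else "to",
        ptr_text,
        obj_type if obj_type else "unknown",
        length,
    )


# Arguments inferred from the log line in this report.
line = format_usercopy_report("0000000029431ea4", 1023, True, "<kernel text>")

if __name__ == "__main__":
    print(line)
```

Matching the log line this way suggests to_user was true (a copy towards user space), the object type was "<kernel text>", and the length was 1023 bytes, which agrees with R30 (00000000000003ff) in the exception frame.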

  
  From the logs, the memory exposure attempt is reported about 2.5 seconds
  after the Bluetooth stack is initialized. This might indicate an issue with
  the default Bluetooth driver provided by the distro.

  [10795.918866] NET: Registered protocol family 31
  [10795.918909] Bluetooth: HCI device and connection manager initialized
  [10795.918955] Bluetooth: HCI socket layer initialized
  [10795.918991] Bluetooth: L2CAP socket layer initialized
  [10795.919032] Bluetooth: SCO socket layer initialized
  [10798.374850] usercopy: kernel memory exposure attempt detected from 0000000029431ea4 (<kernel text>) (1023 bytes)
  [10798.374952] ------------[ cut here ]------------
  [10798.374988] kernel BUG at /build/linux-2BXDjB/linux-4.15.0/mm/usercopy.c:72!
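
The messages quoted above show ordering only, so it may be worth cross-checking them against the call trace from the Oops, which lists the functions on the faulting path. The Python sketch below extracts the symbol names. trace_symbols is a hypothetical helper, and the trace lines are copied from the Oops in the bug description with wrapped lines rejoined. In mainline kernels read_mem() is the read handler for /dev/mem (drivers/char/mem.c), which would be consistent with read_all walking /dev. Whether that path or the Bluetooth modules is the actual trigger is not established here.

```python
import re

# Call trace from the Oops in the bug description, wrapped lines rejoined.
CALL_TRACE = """\
[10798.376724] [c0000076c63abc80] [c0000000003c76ec] __check_object_size+0x13c/0x270 (unreliable)
[10798.376787] [c0000076c63abd00] [c0000000008268a4] read_mem+0x84/0x220
[10798.376835] [c0000076c63abd70] [c0000000003d109c] __vfs_read+0x3c/0x70
[10798.376880] [c0000076c63abd90] [c0000000003d118c] vfs_read+0xbc/0x1b0
[10798.376925] [c0000076c63abde0] [c0000000003d1788] SyS_read+0x68/0x110
[10798.377012] [c0000076c63abe30] [c00000000000b184] system_call+0x58/0x6c
"""

# One frame: [timestamp] [stack pointer] [return address] symbol+0xoff/0xsize
FRAME_RE = re.compile(
    r"\[\s*\d+\.\d+\]\s+\[[0-9a-f]{16}\]\s+\[[0-9a-f]{16}\]\s+"
    r"(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def trace_symbols(text):
    """Return the symbol names of all frames, innermost first."""
    return [match.group(1) for match in FRAME_RE.finditer(text)]


if __name__ == "__main__":
    for depth, symbol in enumerate(trace_symbols(CALL_TRACE)):
        print(depth, symbol)
```

The frame directly above __check_object_size in this trace is read_mem, reached through the ordinary read() system call path.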

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1761729/+subscriptions

