** Also affects: linux-xilinx-zynqmp (Ubuntu)
   Importance: Undecided
       Status: New

** Also affects: linux-xilinx-zynqmp (Ubuntu Focal)
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-xilinx-zynqmp in Ubuntu.
https://bugs.launchpad.net/bugs/1998738

Title:
  dev test from ubuntu_stress_smoke_tests cause kernel oops on F-5.4
  xilinx ZCU106

Status in ubuntu-kernel-tests:
  New
Status in linux-xilinx-zynqmp package in Ubuntu:
  New
Status in linux-xilinx-zynqmp source package in Focal:
  New

Bug description:
  This issue can only be reproduced on the ZCU106. It leaves some
  processes running, which eventually causes the Jenkins job to hang.

  stress-ng with commit 91ec6bccd7 (V0.15.00)

   stress-ng: invoked with './stress-ng -v -t 5 --dev 4 --dev-ops 3000 
--ignite-cpu --syslog --verbose --verify --oomable' by user 0 'root'
   stress-ng: system: '202008-28164-ZCU106' Linux 5.4.0-1019-xilinx-zynqmp 
#22-Ubuntu SMP Thu Nov 17 05:04:22 UTC 2022 aarch64
   stress-ng: memory (MB): total 3929.76, free 2479.07, shared 4.30, buffer 
59.98, swap 0.00, free swap 0.00
   stress-ng: info:  [3037] setting to a 5 second run per stressor
   stress-ng: info:  [3037] dispatching hogs: 4 dev
   kernel: [  981.702313] xilinx-multiscaler a00e0000.v_multi: Channel 0 
instance created
   kernel: [  981.702829] xilinx-multiscaler a00e0000.v_multi: Channel 0 
instance released
   kernel: [  981.708039] xilinx-multiscaler a00e0000.v_multi: Channel 0 
instance created
   kernel: [  981.708569] xilinx-multiscaler a00e0000.v_multi: Channel 0 
instance released
   kernel: [  981.709027] xilinx-multiscaler a00e0000.v_multi: Channel 0 
instance created
   kernel: [  981.709501] xilinx-multiscaler a00e0000.v_multi: Channel 0 
instance released
   kernel: [  981.734320] xilinx-multiscaler a00e0000.v_multi: Channel 0 
instance created
   kernel: [  981.734859] xilinx-multiscaler a00e0000.v_multi: Channel 0 
instance released

  Message from syslogd@202008-28164-ZCU106 at Dec  5 05:11:01 ...
   kernel:[  981.797006] Internal error: Oops: 96000004 [#1] SMP
   kernel: [  981.768878] xilinx-multiscaler a00e0000.v_multi: Channel 0 
instance created
   kernel: [  981.768958] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan 
already opened for minor = 1
   kernel: [  981.768961] Unable to handle kernel access to user memory outside 
uaccess routines at virtual address 0000087000000f48
   kernel: [  981.768966] Mem abort info:
   kernel: [  981.779704] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan 
already opened for minor = 1
   kernel: [  981.782475]   ESR = 0x96000004
   kernel: [  981.782478]   EC = 0x25: DABT (current EL), IL = 32 bits
   kernel: [  981.782480]   SET = 0, FnV = 0
   kernel: [  981.782484]   EA = 0, S1PTW = 0
   kernel: [  981.785524] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan 
already opened for minor = 1
   kernel: [  981.790822] Data abort info:
   kernel: [  981.790824]   ISV = 0, ISS = 0x00000004
   kernel: [  981.790826]   CM = 0, WnR = 0
   kernel: [  981.790830] user pgtable: 4k pages, 48-bit VAs, 
pgdp=0000000838768000
   kernel: [  981.790833] [0000087000000f48] pgd=0000000000000000
   kernel: [  981.793875] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan 
already opened for minor = 1
   kernel: [  981.797006] Internal error: Oops: 96000004 [#1] SMP
   kernel: [  981.797010] Modules linked in: xt_conntrack ipt_REJECT 
nf_reject_ipv4 ip6table_nat xt_CHECKSUM iptable_nat xt_MASQUERADE nf_nat 
iptable_filter fuse dm_multipath dm_mod al5e al5d allegro xlnx_vcu_clk xlnx_vcu 
xilinx_hdmi_tx xilinx_hdmi_rx xlnx_vcu_core dp159 xilinx_vphy lm63 ina2xx_adc 
mali dmaproxy nfsd zocl
   kernel: [  981.805628] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan 
already opened for minor = 1
   kernel: [  981.808485] CPU: 1 PID: 3044 Comm: stress-ng-dev Not tainted 
5.4.0-1019-xilinx-zynqmp #22-Ubuntu
   kernel: [  981.808487] Hardware name: ZynqMP ZCU106 RevA (DT)
   kernel: [  981.808491] pstate: 00400005 (nzcv daif +PAN -UAO)
   kernel: [  981.812321] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan 
already opened for minor = 1
   kernel: [  981.815269] pc : __mutex_lock.isra.0+0x170/0x510
   kernel: [  981.815273] lr : __mutex_lock_slowpath+0x28/0x38
   kernel: [  981.815276] sp : ffff800017c3bb30
   kernel: [  981.821772] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan 
already opened for minor = 1
   kernel: [  981.826563] x29: ffff800017c3bb30 x28: ffff00083460ec00
   kernel: [  981.826567] x27: 0000ffffb3f2f000 x26: ffff000855fda500
   kernel: [  981.826571] x25: 0000000000000000 x24: ffff0008498fd400
   kernel: [  981.826574] x23: 0000000000000031 x22: ffff000875878750
   kernel: [  981.826578] x21: 0000000000000002 x20: ffff0008385d4e40
   kernel: [  981.835222] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan 
already opened for minor = 1
   kernel: [  981.840035] x19: ffff0008758787f0 x18: 0000000000000000
   kernel: [  981.840039] x17: 0000000000000000 x16: 0000000000000000
   kernel: [  981.840042] x15: 0000000000000000 x14: 0000000000000000
   kernel: [  981.840046] x13: 0000000000000000 x12: 0000000000000000
   kernel: [  981.868428] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan 
already opened for minor = 1
   kernel: [  981.875905] x11: 0000000000000000 x10: 0000000000100000
   kernel: [  981.875909] x9 : 00000000000000fb x8 : 0000000010044400
   kernel: [  981.875912] x7 : 0000000000000000 x6 : ffff00083460e0c0
   kernel: [  981.875915] x5 : 0000000000000015 x4 : 0000000000000014
   kernel: [  981.875919] x3 : 0000087000000f00 x2 : ffff0008385d4e40
   kernel: [  981.875922] x1 : 0000087000000f00 x0 : 0000087000000f00
   kernel: [  981.875926] Call trace:
   kernel: [  981.875933]  __mutex_lock.isra.0+0x170/0x510
   kernel: [  981.875939]  __mutex_lock_slowpath+0x28/0x38
   kernel: [  981.885784] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan 
already opened for minor = 1
   kernel: [  981.889485]  mutex_lock+0x48/0x58
   kernel: [  981.889491]  xm2msc_mmap+0x38/0x68
   kernel: [  981.889497]  v4l2_mmap+0x7c/0xb8
   kernel: [  981.889504]  mmap_region+0x364/0x5b0
   kernel: [  981.889511]  do_mmap+0x294/0x478
   kernel: [  981.894358] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan 
already opened for minor = 1
   kernel: [  981.902880]  vm_mmap_pgoff+0xf4/0x120
   kernel: [  981.902885]  ksys_mmap_pgoff+0x1ac/0x240
   kernel: [  981.902891]  __arm64_sys_mmap+0x38/0x50
   kernel: [  981.902897]  el0_svc_common.constprop.0+0x78/0x180
   kernel: [  981.902903]  el0_svc_handler+0x84/0xa0

  Message from syslogd@202008-28164-ZCU106 at Dec  5 05:11:01 ...
   kernel:[  981.912115] Code: a94153f3 a9425bf5 a8c97bfd d65f03c0 (b9404801)
   kernel: [  981.907665] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan 
already opened for minor = 1
   kernel: [  981.912107]  el0_svc+0x8/0x1c0
   kernel: [  981.912115] Code: a94153f3 a9425bf5 a8c97bfd d65f03c0 (b9404801)
   kernel: [  981.912121] ---[ end trace bab66edb32cbb4db ]---
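
  The numbers in the oops above can be cross-checked by hand. The
  following is a minimal sketch, not kernel code: the function names are
  made up for illustration, and the field layout is the ARMv8 ESR_ELx
  layout as I understand it. It splits the ESR value into the fields the
  kernel printed, decodes the faulting instruction word (the one in
  parentheses on the "Code:" line), and checks that x0 plus the load
  offset equals the reported fault address.

```python
# Values copied verbatim from the oops above.
ESR = 0x96000004
FAULT_ADDR = 0x0000087000000F48
X0 = 0x0000087000000F00
INSN = 0xB9404801  # word in parentheses on the "Code:" line


def decode_esr(esr):
    """Split an ESR_ELx value into the fields shown in the oops."""
    iss = esr & 0x1FFFFFF              # instruction specific syndrome, bits [24:0]
    return {
        "EC": (esr >> 26) & 0x3F,      # exception class, bits [31:26]
        "IL": (esr >> 25) & 0x1,       # 1 = 32-bit instruction
        "ISS": iss,
        "WnR": (iss >> 6) & 0x1,       # 0 = read access, 1 = write access
        "DFSC": iss & 0x3F,            # data fault status code
    }


def decode_ldr_w_uimm(insn):
    """Decode 'LDR Wt, [Xn, #imm]' (32-bit, unsigned offset), else None."""
    if (insn >> 22) != 0x2E5:          # fixed top ten bits of this encoding
        return None
    imm12 = (insn >> 10) & 0xFFF       # offset in units of 4 bytes
    return {"Rt": insn & 0x1F, "Rn": (insn >> 5) & 0x1F, "offset": imm12 * 4}


if __name__ == "__main__":
    esr = decode_esr(ESR)
    ldr = decode_ldr_w_uimm(INSN)
    print({k: hex(v) for k, v in esr.items()})
    print(ldr)
    # The base register is x0, so the access address is x0 + offset.
    print(hex(X0 + ldr["offset"]), hex(FAULT_ADDR))
```

  Under these assumptions, EC is 0x25 (data abort taken from the current
  EL), DFSC is 0x04 (a level 0 translation fault, which agrees with
  pgd=0000000000000000), and the instruction is a 32-bit load from
  [x0, #72], so the access lands on x0 + 0x48 = 0000087000000f48. The x0
  value is not a kernel address, which fits the "access to user memory
  outside uaccess routines" message; the sketch says nothing about why
  the mutex pointer reached xm2msc_mmap in that state.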

  
  Here is the output when running this test:
  $ time sudo ./stress-ng -v -t 5 --dev 4 --dev-ops 3000 --ignite-cpu --syslog --verbose --verify --oomable
  stress-ng: debug: [3037] invoked with './stress-ng -v -t 5 --dev 4 --dev-ops 
3000 --ignite-cpu --syslog --verbose --verify --oomable' by user 0 'root'
  stress-ng: debug: [3037] stress-ng 0.15.00 g91ec6bccd7e9
  stress-ng: debug: [3037] system: Linux 202008-28164-ZCU106 
5.4.0-1019-xilinx-zynqmp #22-Ubuntu SMP Thu Nov 17 05:04:22 UTC 2022 aarch64
  stress-ng: debug: [3037] RAM total: 3.8G, RAM free: 2.4G, swap free: 0.0
  stress-ng: debug: [3037] temporary file path: '.', filesystem type: ext2
  stress-ng: debug: [3037] 4 processors online, 4 processors configured
  stress-ng: info:  [3037] setting to a 5 second run per stressor
  stress-ng: info:  [3037] dispatching hogs: 4 dev
  stress-ng: debug: [3037] cache allocate: using defaults, cannot determine 
cache level details
  stress-ng: debug: [3037] cache allocate: shared cache buffer size: 2048K
  stress-ng: debug: [3037] starting stressors
  stress-ng: debug: [3039] dev: started [3039] (instance 0)
  stress-ng: debug: [3040] dev: started [3040] (instance 1)
  stress-ng: debug: [3037] 4 stressors started
  stress-ng: debug: [3041] dev: started [3041] (instance 2)
  stress-ng: debug: [3042] dev: started [3042] (instance 3)

  Message from syslogd@202008-28164-ZCU106 at Dec  5 05:11:01 ...
   kernel:[  981.797006] Internal error: Oops: 96000004 [#1] SMP

  Message from syslogd@202008-28164-ZCU106 at Dec  5 05:11:01 ...
   kernel:[  981.912115] Code: a94153f3 a9425bf5 a8c97bfd d65f03c0 (b9404801) 
  stress-ng: debug: [3042] dev: exited [3042] (instance 3)
  stress-ng: debug: [3041] dev: exited [3041] (instance 2)
  stress-ng: info:  [3039] dev: 19 of 383 devices opened and exercised
  stress-ng: debug: [3039] dev: exited [3039] (instance 0)
  stress-ng: debug: [3037] process [3039] terminated
  (hung here)

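  Process 3040 (instance 1) never exits. The strace below shows it
  looping while it waits on PID 3044, the stress-ng-dev process named in
  the oops.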

  strace output:
  $ sudo strace -p 3040
  strace: Process 3040 attached
  wait4(3044, 0xffffda2c3214, 0, NULL)    = ? ERESTARTSYS (To be restarted if 
SA_RESTART is set)
  --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
  getpid()                                = 3040
  setitimer(ITIMER_REAL, {it_interval={tv_sec=0, tv_usec=0}, 
it_value={tv_sec=1, tv_usec=0}}, {it_interval={tv_sec=0, tv_usec=0}, 
it_value={tv_sec=0, tv_usec=0}}) = 0
  rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)
  kill(3044, SIGALRM)                     = 0
  kill(3044, SIGKILL)                     = 0
  clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=0}, {tv_sec=0, 
tv_nsec=989179}) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
  --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
  getpid()                                = 3040
  setitimer(ITIMER_REAL, {it_interval={tv_sec=0, tv_usec=0}, 
it_value={tv_sec=1, tv_usec=0}}, {it_interval={tv_sec=0, tv_usec=0}, 
it_value={tv_sec=0, tv_usec=0}}) = 0
  rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)
  wait4(3044, 0xffffda2c3214, 0, NULL)    = ? ERESTARTSYS (To be restarted if 
SA_RESTART is set)
  --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
  getpid()                                = 3040
  setitimer(ITIMER_REAL, {it_interval={tv_sec=0, tv_usec=0}, 
it_value={tv_sec=1, tv_usec=0}}, {it_interval={tv_sec=0, tv_usec=0}, 
it_value={tv_sec=0, tv_usec=0}}) = 0
  rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)
  kill(3044, SIGALRM)                     = 0
  kill(3044, SIGKILL)                     = 0
  clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=0}, {tv_sec=0, 
tv_nsec=505466}) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
  (repeats)
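
  The pattern in the strace output can be sketched as follows. This is
  an illustration of the observed system calls, not stress-ng source
  code: the function name, the `interval` parameter and the `max_rounds`
  bound are hypothetical, and the bound in particular is an addition,
  since the traced process repeats indefinitely. The waiter arms a
  one-shot timer, blocks waiting for the child, and when SIGALRM
  interrupts the wait it sends SIGALRM and SIGKILL to the child, sleeps,
  and tries again.

```python
import os
import signal
import time


class _Alarm(Exception):
    """Raised by the SIGALRM handler so a blocked waitpid() is interrupted."""


def _on_alarm(signum, frame):
    raise _Alarm


def reap_with_escalation(pid, interval=1.0, max_rounds=5):
    """Wait for `pid`; on each timeout send SIGALRM then SIGKILL and retry.

    Returns True once the child has been reaped, False if it is still
    unreaped after `max_rounds` timeouts. Must be called from the main
    thread, because it installs a signal handler.
    """
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    try:
        for _ in range(max_rounds):
            signal.setitimer(signal.ITIMER_REAL, interval)  # one-shot timer
            try:
                os.waitpid(pid, 0)      # corresponds to wait4(3044, ...) in the trace
                return True
            except _Alarm:
                os.kill(pid, signal.SIGALRM)
                os.kill(pid, signal.SIGKILL)
                time.sleep(interval)    # corresponds to clock_nanosleep in the trace
        return False
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)             # disarm the timer
        signal.signal(signal.SIGALRM, old_handler)
```

  With an ordinary child the second wait succeeds, because SIGKILL
  cannot be blocked or ignored. In the trace above, both kill() calls
  return 0 yet wait4() never completes, so PID 3044 is apparently never
  reaped after the oops and the waiter in 3040 keeps looping.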

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1998738/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
