You have been subscribed to a public bug:

== Comment: #0 - Santhosh G  ==
Problem Statement:
An NMI watchdog soft lockup BUG and call traces occur when trinity is executed.

Environment:
P8 PowerVM Lpar

uname output:
uname -a
Linux tuleta4u-lp5 4.4.0-11-generic #26-Ubuntu SMP Sat Mar 5 14:21:51 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux

Steps to reproduce:

1) Install Ubuntu 16.04 in a PowerVM LPAR.
2) Download trinity-1.5 and build it: ./configure.sh; make; make install
3) Execute trinity as
   './trinity --dangerous'

The test runs for more than an hour, after which trinity is killed and
the kernel logs the following call trace:

[19744.229979] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 21s! [trinity-c3:26544]
[19744.229991] Modules linked in: hidp hid bnep rfcomm l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel af_key mpls_router llc2 nfnetlink dn_rtmsg xfrm_user xfrm_algo can_raw crypto_user can_bcm cmtp kernelcapi scsi_transport_iscsi sctp libcrc32c nfc af_alg caif_socket caif phonet af_rxrpc bluetooth can pppoe pppox irda crc_ccitt atm appletalk ipx p8023 p8022 psnap llc pseries_rng rtc_generic autofs4 ibmvscsi ibmveth
[19744.230024] CPU: 3 PID: 26544 Comm: trinity-c3 Not tainted 4.4.0-11-generic #26-Ubuntu
[19744.230026] task: c00000000ae87e60 ti: c00000000ae24000 task.ti: c00000000ae24000
[19744.230028] NIP: c0000000003fac78 LR: c0000000003fabfc CTR: c00000000039ef10
[19744.230029] REGS: c00000000ae27980 TRAP: 0901   Not tainted  (4.4.0-11-generic)
[19744.230030] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24004444  XER: 20000000
[19744.230035] CFAR: c0000000003fae6c SOFTE: 1
               GPR00: c0000000003fabfc c00000000ae27c00 c0000000015a3b00 c0000000f7f03ba8
               GPR04: 000000000e02adcb c00000000ae27cb0 0000000000000000 0000000000000000
               GPR08: 8000000000000000 0000000000000000 c0000000ef886000 c000000000af0870
               GPR12: 0000000024004444 c00000000e7f1c80
[19744.230045] NIP [c0000000003fac78] ext4_es_lookup_extent+0xc8/0x2c0
[19744.230047] LR [c0000000003fabfc] ext4_es_lookup_extent+0x4c/0x2c0
[19744.230048] Call Trace:
[19744.230050] [c00000000ae27c00] [c0000000003fabfc] ext4_es_lookup_extent+0x4c/0x2c0 (unreliable)
[19744.230053] [c00000000ae27c50] [c0000000003a6f18] ext4_map_blocks+0x78/0x610
[19744.230055] [c00000000ae27d10] [c00000000039f14c] ext4_llseek+0x23c/0x3f0
[19744.230057] [c00000000ae27de0] [c0000000002e02a8] SyS_lseek+0xe8/0x130
[19744.230060] [c00000000ae27e30] [c000000000009204] system_call+0x38/0xb4
[19744.230061] Instruction dump:
[19744.230062] 2fa90000 409effec e93e0028 3b800000 e9490458 e92a0440 39290001 f92a0440
[19744.230065] 7c2004ac 7d20d828 3129ffff 7d20d92d <40c2fff4> 60000000 7f83e378 38210050
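
Each trace entry has the form symbol+0xOFFSET/0xSIZE, so subtracting the offset from the reported address gives the start of the function. The sketch below is only an illustration (the helper name function_start is hypothetical, not part of any kernel tooling). It applies that arithmetic to the NIP and LR values above to check that both fall inside the same ext4_es_lookup_extent body.

```python
import re

# Matches "symbol+0xOFFSET/0xSIZE" as printed in kernel call traces.
SYM_RE = re.compile(
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def function_start(addr_hex, location):
    """Return (symbol, function start address, function size) for one entry.

    Hypothetical helper for illustration only.
    """
    m = SYM_RE.search(location)
    if m is None:
        raise ValueError("no symbol+offset/size in %r" % location)
    start = int(addr_hex, 16) - int(m.group("off"), 16)
    return m.group("sym"), start, int(m.group("size"), 16)


# Values copied from the NIP and LR lines of the soft lockup report.
nip = function_start("c0000000003fac78", "ext4_es_lookup_extent+0xc8/0x2c0")
lr = function_start("c0000000003fabfc", "ext4_es_lookup_extent+0x4c/0x2c0")

print(nip[0], hex(nip[1]))
print(lr[0], hex(lr[1]))
```

Both entries resolve to the same start address, so the interrupted instruction and its return address lie in the same function.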


== Comment: #8 - Santhosh G  ==

Tried the scenario as given in 
https://bugzilla.linux.ibm.com/show_bug.cgi?id=128126#c26
-----
# Create a sparse file of about 584 GiB (1 MiB of data at offset 598382 MiB); the rest is holes
$ dd if=/dev/zero of=file-0.bin bs=1M count=1 seek=598382 
# Invoke lseek with SEEK_DATA option starting with file offset 0
$ while [ 1 ]; do xfs_io -f -c "seek -d 0" file-0.bin; done
----
and I was able to hit the issue in 16.04.1 
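
For reference, the xfs_io "seek -d 0" command above boils down to an lseek() call with SEEK_DATA from offset 0. The following is a minimal sketch of the same operation in Python on Linux, not the original test. The helper names and the 64 MiB hole size are assumptions chosen so that it runs quickly, and it writes non-zero bytes rather than the zeros dd copies from /dev/zero. To approximate the reported scenario, the hole would have to be as large as in the dd command and the file would have to live on ext4.

```python
import os
import tempfile


def make_sparse_file(path, hole_bytes, data=b"x" * 4096):
    """Write `data` at offset `hole_bytes`; nothing before it is written."""
    with open(path, "wb") as f:
        f.seek(hole_bytes)
        f.write(data)


def first_data_offset(path, start=0):
    """Return the offset of the first data region at or after `start`."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Same request as `xfs_io -c "seek -d 0"`: lseek(fd, 0, SEEK_DATA).
        return os.lseek(fd, start, os.SEEK_DATA)
    finally:
        os.close(fd)


if __name__ == "__main__":
    hole = 64 * 1024 * 1024  # assumed size, far smaller than in the report
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "file-0.bin")
        make_sparse_file(path, hole)
        print(first_data_offset(path))
```

On a filesystem that tracks holes, the printed offset is at or just below the start of the written data. A filesystem without hole support may simply report 0.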

kernel version:
4.4.0-28-generic

dmesg output:

[ 1197.994822]  40-...: (5249 ticks this GP) idle=975/140000000000001/0 softirq=7812/7812 fqs=5251
[ 1197.995071]   (t=5251 jiffies g=29144 c=29143 q=3418)
[ 1197.995115] Task dump for CPU 40:
[ 1197.995117] xfs_io          R  running task        0  3601   3489 0x00040004
[ 1197.995121] Call Trace:
[ 1197.995126] [c000003c7c8675b0] [c0000000000fbc00] sched_show_task+0xe0/0x180 (unreliable)
[ 1197.995131] [c000003c7c867620] [c00000000013eb74] rcu_dump_cpu_stacks+0xe4/0x150
[ 1197.995134] [c000003c7c867670] [c0000000001442a4] rcu_check_callbacks+0x6b4/0x9b0
[ 1197.995136] [c000003c7c8677a0] [c00000000014c108] update_process_times+0x58/0xa0
[ 1197.995140] [c000003c7c8677d0] [c000000000163818] tick_sched_handle.isra.6+0x48/0xe0
[ 1197.995143] [c000003c7c867810] [c000000000163914] tick_sched_timer+0x64/0xd0
[ 1197.995146] [c000003c7c867850] [c00000000014cbd4] __hrtimer_run_queues+0x124/0x450
[ 1197.995148] [c000003c7c8678e0] [c00000000014dbfc] hrtimer_interrupt+0xec/0x2c0
[ 1197.995152] [c000003c7c8679a0] [c00000000001f5bc] __timer_interrupt+0x8c/0x290
[ 1197.995154] [c000003c7c8679f0] [c00000000001f970] timer_interrupt+0xa0/0xe0
[ 1197.995157] [c000003c7c867a20] [c000000000002714] decrementer_common+0x114/0x180
[ 1197.995163] --- interrupt: 901 at ext4_es_find_delayed_extent_range+0x20/0x2b0
                   LR = ext4_llseek+0x268/0x3f0
[ 1197.995166] [c000003c7c867d10] [c0000000003a170c] ext4_llseek+0x23c/0x3f0 (unreliable)
[ 1197.995170] [c000003c7c867de0] [c0000000002e1f08] SyS_lseek+0xe8/0x130
[ 1197.995173] [c000003c7c867e30] [c000000000009204] system_call+0x38/0xb4

=====

The call traces do not occur when the same test is run on a kernel with the patch applied.

** Affects: linux (Ubuntu)
     Importance: High
     Assignee: Canonical Kernel Team (canonical-kernel-team)
         Status: Triaged


** Tags: architecture-ppc64le bugnameltc-138650 severity-high targetmilestone-inin16041
-- 
[LTC-Test] - NMI watchdog Bug and call traces when trinity is executed.
https://bugs.launchpad.net/bugs/1602524
You received this bug notification because you are a member of Kernel Packages, 
which is subscribed to linux in Ubuntu.
