Re: [uml-devel] BUG: soft lockup for a user mode linux image

Toralf Förster Sun, 06 Oct 2013 07:19:28 -0700

On 10/04/2013 07:52 AM, Richard Weinberger wrote:
> Am 03.10.2013 21:33, schrieb Toralf Förster:
>> On 10/03/2013 09:20 PM, Richard Weinberger wrote:
>>> Am 03.10.2013 21:16, schrieb Toralf Förster:
>>>> On 10/03/2013 09:04 PM, Richard Weinberger wrote:
>>>>> Am 03.10.2013 20:54, schrieb Toralf Förster:
>>>>>> On 10/02/2013 09:55 PM, Richard Weinberger wrote:
>>>>>>> On Wed, Oct 2, 2013 at 8:30 PM, Toralf Förster <[email protected]> 
>>>>>>> wrote:
>>>>>>>> Running trinity (1 process, no victim files, just "$>trinity -C1) for 
>>>>>>>> a longer time
>>>>>>>> within a 32 bit user mode linux image with a recent git kernel (host: 
>>>>>>>> 3.11.3 guest 3.12-rc3-g...)
>>>>>>>> yields into this konsole message :
>>>>>>>>
>>>>>>>>  * Starting local
>>>>>>>> net.core.warnings = 0                                                  
>>>>>>>>                        [ ok ]
>>>>>>>> BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]
>>>>>>>>
>>>>>>>>
>>>>>>>> and at the host t1 of the "linux"-processes eats all CPU cycles at 1 
>>>>>>>> CPU core.
>>>>>>>> 2 subsequent made back traces made with
>>>>>>>>
>>>>>>>> $> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt
>>>>>>>>
>>>>>>>> shows nearly a similar position around __get_user_pages() - both are 
>>>>>>>> attached.
>>>>>>>>
>>>>>>>> I'm not surprised that trinity harms a systems - I'm just wondering 
>>>>>>>> whether this particular picture is
>>>>>>>> expected or if it points to an issue.
>>>>>>>>
>>>>>>>>
>>>>>>>> FWIW the last lines of trinity log were :
>>>>>>>>
>>>>>>>>
>>>>>>>> [2031] [94] setsid() = 2031
>>>>>>>> [2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, 
>>>>>>>> sgid=0x22000040) = -1 (Operation not permitted)
>>>>>>>> [2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 
>>>>>>>> 0x3000
>>>>>>>> [2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, 
>>>>>>>> suid=0xff19b6fa) = -1 (Operation not permitted)
>>>>>>>> [2031] [98] setpriority(which=0xff010000, who=0xf3737373, 
>>>>>>>> niceval=0x8088960c) = -1 (Invalid argument)
>>>>>>>> [2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family 
>>>>>>>> not supported by protocol)
>>>>>>>> [2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
>>>>>>>> [2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 
>>>>>>>> 3
>>>>>>>> [2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid 
>>>>>>>> argument)
>>>>>>>> [2031] [103] sigpending(set=0x40025000) = 0
>>>>>>>> [2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, 
>>>>>>>> flags=0x80800) = -1 (Invalid argument)
>>>>>>>> [2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = 
>>>>>>>> -1 (Invalid argument)
>>>>>>>> [2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, 
>>>>>>>> dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid 
>>>>>>>> argument)
>>>>>>>> [2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 
>>>>>>>> (Invalid argument)
>>>>>>>> [2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
>>>>>>>> [2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, 
>>>>>>>> suid=0xffff2e33) = -1 (Operation not permitted)
>>>>>>>> [2031] [110] munlock(addr=0x40025000, len=34) = 0
>>>>>>>> [2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
>>>>>>>> [2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid 
>>>>>>>> argument)
>>>>>>>> [2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 
>>>>>>>> (Operation not permitted)
>>>>>>>> [2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, 
>>>>>>>> niceval=0xffff0682) = -1 (Invalid argument)
>>>>>>>> [2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad 
>>>>>>>> address)
>>>>>>>> [2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
>>>>>>>> [2031] [117] inotify_init() = 654
>>>>>>>> [2031] [118] getgid() = 100
>>>>>>>> [2031] [119] fstatat64(dfd=382, 
>>>>>>>> filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, 
>>>>>>>> flag=0xb545d727) = -1 (Invalid argument)
>>>>>>>> [2031] [120] unlinkat(dfd=382, 
>>>>>>>> pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", 
>>>>>>>> flag=0xc00ef76) = -1 (Invalid argument)
>>>>>>>> [2031] [121] timerfd_create(clockid=0, flags=0) = 655
>>>>>>>> [2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
>>>>>>>> [2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
>>>>>>>> [2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid 
>>>>>>>> argument)
>>>>>>>> [2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, 
>>>>>>>> u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
>>>>>>>> [2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a 
>>>>>>>> directory)
>>>>>>>> [2031] [127] ssetmask(newmask=0x88000092) = 0
>>>>>>>> [2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad 
>>>>>>>> address)
>>>>>>>> [2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 
>>>>>>>> (Operation not permitted)
>>>>>>>> [2031] [130] fchdir(fd=397) = -1 (Not a directory)
>>>>>>>> [2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", 
>>>>>>>> mode=525) = -1 (File exists)
>>>>>>>> [2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, 
>>>>>>>> size=0, flags=0) = -1 (Numerical result out of range)
>>>>>>>> [2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by 
>>>>>>>> signal 2.
>>>>>>>
>>>>>>> Reading your gdb backtraces show that schedule_timeout() got called
>>>>>>> with a negative value.
>>>>>>> Looks like an integer overflow.
>>>>>>> The soft-lockup might also origin from that (very big integer which
>>>>>>> did not overflow jet)
>>>>>>>
>>>>>>
>>>>>> If the culprit is solved by this patch I'd like to send it out. But I'm
>>>>>> unsure whether it catches the culprit or if it just covers the root 
>>>>>> cause.
>>>>>
>>>>> I fear your Patch will not fix the issue.
>>>>>
>>>>> Does the issue only trigger on 32bit UMLs?
>>>> No diea, I do only have a 32 bit system here (both host and client).
>>>>
>>>>> How long does it take till trinity hits it?
>>>> a command like
>>>>
>>>> $> ssh tfoerste@trinity "rm -rf t3; mkdir t3; cd t3; trinity -C4"
>>>>
>>>> usually needs 10 till 15 min to trigger the issue. With just 1 trinity
>>>> task (-C1) however it needs often a hour or more.
>>>
>>> That's good. :-)
>>> You can place some printk()s into balance_dirty_pages() and observe the 
>>> values
>>> of period, max_pause, min_pause, etc...
>>> Maybe this will give us a clue.
>>
>> Will do it again. I did it before already but period was the only thing
>> which becames otherwise negative -
>> and because pages_dirtied was often in the range of 20 - 30 and perioad
>> around -3000
> 
> Did you print it before or after
>                 if (current->dirty_paused_when)
>                         pause -= now - current->dirty_paused_when;
> 
> I'm also wondering why
>               if (pause < min_pause) {
> did not trigger or did it?
> 
>> I just wondered if HZ is casts as a short int in the calculation ?
>> BTW printk doesn't showed anything in the syslog (ok, I used
>> printk(KERN_ERR ...)
> 
> syslog or dmesg?
> 
> Thanks,
> //richard
>


back from going a mushroom foray - I nearly surrounded all "pause=" statements
with BUG_ON and downgrades trinity (b/c their latest enhancements made trinity 
too weak)

The UML stopped here :
...
                if (unlikely(task_ratelimit == 0)) {
                        period = max_pause;
                        pause = max_pause;
                        BUG_ON(pause < 0);
                        goto pause;
                }
                BUG_ON(pages_dirtied < 0);
                BUG_ON(task_ratelimit < 0);
                period = HZ * pages_dirtied / task_ratelimit;
                BUG_ON(period < 0);         <----------------------here
                pause = period;
                BUG_ON(pause < 0);
                if (current->dirty_paused_when)
                        pause -= now - current->dirty_paused_when;
                BUG_ON(pause < 0);
...

The back trace is :

tfoerste@n22 ~/devel/trinity $ gdb --core=/mnt/ramdisk/core 
/home/tfoerste/devel/linux/linux -batch -ex bt[New LWP 2522]
Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk 
ubda=/home/tfoerste/virtual/uml/tr'.
Program terminated with signal 6, Aborted.
#0  0xb777a424 in __kernel_vsyscall ()
#0  0xb777a424 in __kernel_vsyscall ()
#1  0x083b33b5 in kill ()
#2  0x0807190d in uml_abort () at arch/um/os-Linux/util.c:93
#3  0x08071c45 in os_dump_core () at arch/um/os-Linux/util.c:148
#4  0x08061417 in panic_exit (self=0x85b9558 <panic_exit_notifier>, unused1=0, 
unused2=0x85ef720 <buf.16084>) at arch/um/kernel/um_arch.c:240
#5  0x0809a7d8 in notifier_call_chain (nl=0x0, val=0, v=0x85ef720 <buf.16084>, 
nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
#6  0x0809a923 in __atomic_notifier_call_chain (nr_calls=<optimized out>, 
nr_to_call=<optimized out>, v=<optimized out>, val=<optimized out>, 
nh=<optimized out>) at kernel/notifier.c:182
#7  atomic_notifier_call_chain (nh=0x85ef704 <panic_notifier_list>, val=0, 
v=0x85ef720 <buf.16084>) at kernel/notifier.c:191
#8  0x08410d1c in panic (fmt=0x0) at kernel/panic.c:130
#9  0x08411c64 in balance_dirty_pages (pages_dirtied=9, mapping=<optimized 
out>) at mm/page-writeback.c:1471
#10 0x080d1ce4 in balance_dirty_pages_ratelimited (mapping=0x6) at 
mm/page-writeback.c:1663
#11 0x080e2d0c in __do_fault (mm=0x4870da00, vma=0x487484c8, 
address=1095131136, pmd=0x0, pgoff=0, flags=1212323992, orig_pte=<incomplete 
type>) at mm/memory.c:3452
#12 0x080e5286 in do_nonlinear_fault (orig_pte=..., flags=<optimized out>, 
pmd=<optimized out>, address=<optimized out>, vma=<optimized out>, 
mm=<optimized out>, page_table=<optimized out>) at mm/memory.c:3518
#13 handle_pte_fault (flags=<optimized out>, pmd=<optimized out>, 
pte=<optimized out>, address=<optimized out>, vma=<optimized out>, 
mm=<optimized out>) at mm/memory.c:3717
#14 __handle_mm_fault (flags=<optimized out>, address=<optimized out>, 
vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3845
#15 handle_mm_fault (mm=0x4870da00, vma=0x487484c8, address=1095131136, 
flags=1) at mm/memory.c:3868
#16 0x080e5a07 in __get_user_pages (tsk=0x47d5a400, mm=0x4870da00, 
start=1095131136, nr_pages=1025, gup_flags=519, pages=0x47c50000, vmas=0x0, 
nonblocking=0x0) at mm/memory.c:1822
#17 0x080e5cc3 in get_user_pages (tsk=0x0, mm=0x0, start=0, nr_pages=0, 
write=1, force=0, pages=0x47fafb98, vmas=0x6) at mm/memory.c:2019
#18 0x08140d0e in aio_setup_ring (ctx=<optimized out>) at fs/aio.c:340
#19 ioctx_alloc (nr_events=<optimized out>) at fs/aio.c:605
#20 SYSC_io_setup (ctxp=<optimized out>, nr_events=<optimized out>) at 
fs/aio.c:1122
#21 SyS_io_setup (nr_events=65535, ctxp=135045120) at fs/aio.c:1105
#22 0x080619c2 in handle_syscall (r=0x47d5a5d4) at 
arch/um/kernel/skas/syscall.c:35
#23 0x08073f2d in handle_trap (local_using_sysemu=<optimized out>, 
regs=<optimized out>, pid=<optimized out>) at 
arch/um/os-Linux/skas/process.c:198
#24 userspace (regs=0x47d5a5d4) at arch/um/os-Linux/skas/process.c:431
#25 0x0805e6ac in fork_handler () at arch/um/kernel/process.c:160
#26 0x00000000 in ?? ()

I'll try again to see if printk's will help - how do I force a sync after a 
printk ?

-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

------------------------------------------------------------------------------
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from 
the latest Intel processors and coprocessors. See abstracts and register >
http://pubads.g.doubleclick.net/gampad/clk?id=60134791&iu=/4140/ostg.clktrk
_______________________________________________
User-mode-linux-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

Re: [uml-devel] BUG: soft lockup for a user mode linux image

Reply via email to