On 05/02/2014 04:14 PM, Richard Weinberger wrote:
> On 02.05.2014 16:07, Toralf Förster wrote:
>> On 05/02/2014 09:46 AM, Richard Weinberger wrote:
>>> On 01.05.2014 23:34, Toralf Förster wrote:
>>>> On 05/01/2014 10:57 PM, Richard Weinberger wrote:
>>>>> Toralf,
>>>>>
>>>>> Yeah, this is because trinity destroys the UML stub code.
>>>>> Please test the attached patch, it should fix the root cause of the
>>>>> problem.
>>>>>
>>>>> Thanks,
>>>>> //richard
>>>>>
>>>>
>>>> If I apply just fix2.patch onto the latest git tree v3.15-rc3-113-gba6728f,
>>>> then after a while I get:
>>>>
>>>> * Starting sshd ...
>>>> [ ok ]
>>>> * Starting local
>>>> net.core.warnings = 0
>>>> [ ok ]
>>>> Kernel panic - not syncing: do_syscall_stub : PTRACE_SETREGS failed, errno = 3
>>>>
>>>> CPU: 0 PID: 1728 Comm: trinity-c0 Not tainted 3.15.0-rc3-00113-gba6728f-dirty #5
>>>> Stack:
>>>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-c0:1728]
>>>>
>>>> EIP: c500:[<47c6cf00>] CPU: 0 Not tainted EFLAGS: 476af700
>>>> Not tainted
>>>> EAX: 47cfc500 EBX: 0a024d00 ECX: 086c75fc EDX: 080fff88
>>>> ESI: 0839f4bc EDI: 47cfc500 EBP: 0839f4bc DS: c500 ES: cd62
>>>> EXT4-fs (ubda): error count: 1
>>>> EXT4-fs (ubda): initial error at 1398962134: ext4_mb_generate_buddy:756
>>>> EXT4-fs (ubda): last error at 1398962134: ext4_mb_generate_buddy:756
>>>>
>>>>
>>>> which is a big improvement, because before it crashed within a few
>>>> seconds.
>>>>
>>>> After applying both fixes, the test case has run without a crash so far.
>>>
>>> Can you please also try fix3 (without fix1/2)?
>>> I think I've found the other hidden issue.
>>> So far trinity did not crash my kernel...
>>>
>>> Thanks,
>>> //richard
>>>
>>
>> fix3 made it - it has run fine so far.
>> Of course the syslog of the UML guest is flooded with messages like:
>>
>> May 2 15:45:59 trinity kernel: BUG: Bad rss-counter state mm:47d4d8c0 idx:0 val:2
>> May 2 15:46:00 trinity kernel: fix_range_common: failed, killing current process: 2983
>> May 2 15:46:00 trinity kernel: fix_range_common: failed, killing current process: 2984
>> May 2 15:46:30 trinity kernel: fix_range_common: failed, killing current process: 2986
>> May 2 15:46:30 trinity kernel: fix_range_common: failed, killing current process: 2989
>> May 2 15:46:30 trinity kernel: fix_range_common: failed, killing current process: 2991
>> May 2 15:46:32 trinity kernel: Stub registers -
>> May 2 15:46:32 trinity kernel: 0 - 100000
>> May 2 15:46:32 trinity kernel: 1 - 1000
>> May 2 15:46:32 trinity kernel: 2 - 7
>> May 2 15:46:32 trinity kernel: 3 - 11
>> May 2 15:46:32 trinity kernel: 4 - 3
>> May 2 15:46:32 trinity kernel: 5 - 3cbae
>> May 2 15:46:32 trinity kernel: 6 - 100000
>> May 2 15:46:32 trinity kernel: 7 - 7b
>> May 2 15:46:32 trinity kernel: 8 - 7b
>> May 2 15:46:32 trinity kernel: 9 - 0
>> May 2 15:46:32 trinity kernel: 10 - 33
>> May 2 15:46:32 trinity kernel: 11 - ffffffff
>> May 2 15:46:32 trinity kernel: 12 - 100fff
>> May 2 15:46:32 trinity kernel: 13 - 73
>> May 2 15:46:32 trinity kernel: 14 - 10206
>> May 2 15:46:32 trinity kernel: 15 - 101028
>> May 2 15:46:32 trinity kernel: 16 - 7b
>> May 2 15:46:32 trinity kernel: wait_stub_done : failed to wait for SIGTRAP, pid = 483, n = 483, errno = 0, status = 0xb7f
>> May 2 15:46:32 trinity kernel: BUG: Bad rss-counter state mm:47d4d8c0 idx:0 val:1
>> May 2 15:46:32 trinity kernel: fix_range_common: failed, killing current process: 3000
>> May 2 15:46:33 trinity kernel: fix_range_common: failed, killing current process: 3002
>> May 2 15:46:33 trinity kernel: fix_range_common: failed, killing current process: 3004
>> May 2 15:46:33 trinity kernel: fix_range_common: failed, killing current process: 3006
>> May 2 15:46:34 trinity kernel: fix_range_common: failed, killing current process: 3009
>> May 2 15:46:34 trinity kernel: fix_range_common: failed, killing current process: 3010
>> May 2 15:46:34 trinity kernel: fix_range_common: failed, killing current process: 3012
>> May 2 15:46:35 trinity kernel: BUG: Bad rss-counter state mm:47d4d8c0 idx:0 val:2
>> May 2 15:46:35 trinity kernel: fix_range_common: failed, killing current process: 3015
>>
>>
>>
>> which is expected (right?) because I hammered the UML with the syscall "mremap"
>> using 2 trinity children for a while.
>
> Yeah. Maybe I can find a way to prevent "BUG: Bad rss-counter state mm:47d4d8c0
> idx:0 val:1" too.
>
> Thanks for testing!
> //richard
>

/me wonders whether fix3 will make it into mainline?
--
Toralf
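
Annotation on the numeric codes quoted in this thread: the panic reports "errno = 3" and wait_stub_done reports "status = 0xb7f". The sketch below decodes both, assuming the usual Linux wait-status bit layout and Linux signal/errno numbering. It is not taken from the UML sources, and the helper name describe_wait_status is made up for illustration.

```python
import errno
import os
import signal


def describe_wait_status(status: int) -> str:
    """Decode a raw wait status, assuming the Linux bit layout.

    Hypothetical helper for illustration; not part of UML or the kernel.
    """
    if status & 0xFF == 0x7F:
        # Low byte 0x7f marks a stopped child; bits 8-15 hold the stop signal.
        return f"stopped by signal {(status >> 8) & 0xFF}"
    if status & 0x7F == 0:
        # Low 7 bits zero: normal exit, exit code in bits 8-15.
        return f"exited with code {(status >> 8) & 0xFF}"
    # Otherwise the low 7 bits are the terminating signal.
    return f"killed by signal {status & 0x7F}"


# status = 0xb7f from the wait_stub_done message
print(describe_wait_status(0xB7F))  # stopped by signal 11

# Symbolic names for the numbers involved (Linux numbering assumed)
print(signal.Signals(11).name, signal.Signals(5).name)

# errno = 3 from the PTRACE_SETREGS panic
print(errno.errorcode[3], "-", os.strerror(3))
```

Under those assumptions, 0xb7f reads as "stopped by signal 11", which is SIGSEGV on Linux, where wait_stub_done was waiting for SIGTRAP (signal 5). errno 3 is ESRCH, which would fit the traced stub process no longer being there when PTRACE_SETREGS was issued.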