Hi,

At Antti's request I have been debugging rumprun-xen crashes on Ubuntu
14.04.1 LTS, which leave rumprun-xen non-functional on this platform.

The symptoms are that the domU crashes during boot in Mini-OS
arch_init_mm(), specifically during the first call to minios_printk() in
set_readonly(). Commenting out that printk causes the crash to move to the
next call to printk in the following while loop.

The crash is accompanied by the following dump from Xen (via xl dmesg):

(XEN) domain_crash_sync called from entry.S: fault at ffff82d08021b218 create_bounce_frame+0x63/0x13b
(XEN) Domain 50 (vcpu#0) crashed on cpu#2:
(XEN) ----[ Xen-4.4.1  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    2
(XEN) RIP:    e033:[<00000000000032e8>]
(XEN) RFLAGS: 0000000000010a82   EM: 0   CONTEXT: pv guest
(XEN) rax: 0000000000003067   rbx: 0000000000000424   rcx: 000000000032e845
(XEN) rdx: 00000000002453e0   rsi: 0000000000000000   rdi: 0000000000003308
(XEN) rbp: 00000000000033e8   rsp: 0000000000001008   r8:  000000000000fc70
(XEN) r9:  000000000027d530   r10: 0000000000007ff0   r11: 0000000000000213
(XEN) r12: 0000000000279400   r13: 00000000002453e0   r14: 000000000000fc70
(XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000001526f0
(XEN) cr3: 0000000583c37000   cr2: 0000000000000213
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
(XEN) Guest stack trace from rsp=0000000000001008:
(XEN)    00000000010a8640 0000000000000213 00000000000032e8 000000000000e030
(...)

RIP in this case points to code in
_minios_entry_coprocessor_segment_overrun; however, I have also seen it
point directly to _minios_entry_page_fault, so I am inclined to think it
is bogus.
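
As an aid to reading the segment registers in the dump above, here is a
minimal sketch that splits a selector into its architectural fields. The
struct and function names are made up for illustration and do not come
from Xen or Mini-OS:

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an x86 segment selector (architectural layout). */
struct selector_fields {
    unsigned index; /* bits 15..3: descriptor table index */
    unsigned ti;    /* bit 2: 0 = GDT, 1 = LDT */
    unsigned rpl;   /* bits 1..0: requested privilege level */
};

/* Hypothetical helper: split a 16-bit selector into its fields. */
struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;

    f.index = sel >> 3;
    f.ti = (sel >> 2) & 1;
    f.rpl = sel & 3;
    return f;
}
```

Applied to cs (e033) and ss (e02b), both decode to RPL 3 in the GDT, which
matches "CONTEXT: pv guest": a 64-bit PV guest kernel does not run in
ring 0.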

Single-stepping through the code with GDB results in the target freezing
somewhere around 0x00000000000143d7 inside _vsnprintf_l. I say "somewhere"
because I'm not 100% sure that "stepi" combined with gdbsx always steps
only a single instruction.

Comparing the disassembly of the start of _vsnprintf_l on a known working
machine vs Ubuntu 14.04.1 is interesting:

Known good (Debian jessie, GCC version Debian 4.9.1-19):

   0x0000000000013d40 <+0>:     push   %r14
   0x0000000000013d42 <+2>:     push   %r13
   0x0000000000013d44 <+4>:     mov    %rcx,%r14
   0x0000000000013d47 <+7>:     push   %r12
   0x0000000000013d49 <+9>:     push   %rbp
   0x0000000000013d4a <+10>:    mov    %rdx,%r12
   0x0000000000013d4d <+13>:    push   %rbx
   0x0000000000013d4e <+14>:    mov    %rdi,%rbp
   0x0000000000013d51 <+17>:    mov    %rsi,%rbx
   0x0000000000013d54 <+20>:    mov    %r8,%r13
   0x0000000000013d57 <+23>:    sub    $0x260,%rsp
   0x0000000000013d5e <+30>:    test   %rsi,%rsi
   0x0000000000013d61 <+33>:    je     0x13d6c <_vsnprintf_l+44>

Ubuntu (14.04.1 LTS, GCC version Ubuntu 4.8.2-19ubuntu1):

   0x00000000000143c0 <+0>:     push   %r14
   0x00000000000143c2 <+2>:     mov    %r8,%r14
   0x00000000000143c5 <+5>:     push   %r13
   0x00000000000143c7 <+7>:     mov    %rdx,%r13
   0x00000000000143ca <+10>:    push   %r12
   0x00000000000143cc <+12>:    mov    %rdi,%r12
   0x00000000000143cf <+15>:    push   %rbp
   0x00000000000143d0 <+16>:    mov    %rcx,%rbp
   0x00000000000143d3 <+19>:    push   %rbx
   0x00000000000143d4 <+20>:    mov    %rsi,%rbx
=> 0x00000000000143d7 <+23>:    sub    $0x260,%rsp
=> 0x00000000000143de <+30>:    mov    %fs:0x28,%rax
=> 0x00000000000143e7 <+39>:    mov    %rax,0x258(%rsp)
   0x00000000000143ef <+47>:    xor    %eax,%eax
   0x00000000000143f1 <+49>:    test   %rsi,%rsi
   0x00000000000143f4 <+52>:    jne    0x14518 <_vsnprintf_l+344>

The crash happens somewhere around the instructions marked by => above.

Oddly enough, %fs is always zero but this does not seem to cause a crash in
previous calls to printk/vsnprintf_l.
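
For reference, the extra instructions in the Ubuntu prologue (the load
from %fs:0x28 and the store to 0x258(%rsp)) match the pattern GCC emits
for -fstack-protector on x86-64 Linux targets, and Ubuntu's GCC enables
that option by default. I have not confirmed that this is related to the
crash. A minimal sketch of the kind of function that gets such a prologue
follows; demo_printk is a hypothetical stand-in, not Mini-OS code:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical printk-style helper. The local char array is what makes
 * GCC's -fstack-protector heuristic instrument this function with a
 * stack canary check.
 */
int demo_printk(char *out, size_t outlen, const char *fmt, ...)
{
    char buf[256];
    va_list ap;
    int n;

    /* Format into the on-stack buffer, as a printk would. */
    va_start(ap, fmt);
    n = vsnprintf(buf, sizeof(buf), fmt, ap);
    va_end(ap);
    assert(n >= 0);

    /* Copy out, truncating if needed and always NUL-terminating. */
    if (outlen > 0) {
        strncpy(out, buf, outlen - 1);
        out[outlen - 1] = '\0';
    }
    return n;
}
```

Comparing the output of "gcc -O2 -S" with -fstack-protector and with
-fno-stack-protector for a file like this should show whether the
%fs:0x28 load comes and goes with that option.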

Any ideas? At this point I'm not sure how to proceed. Also, any knowledge
on what the message from Xen actually means would help.

Thanks,

Martin
