Hi,

 

My company uses UML (User-Mode Linux) to simulate the Linux OS of our
embedded systems for off-target testing. However, during one particular,
complex test, a UML virtual machine hangs in roughly one out of every
four runs. I have spent a few weeks debugging it, but so far without a
breakthrough.

 

Symptom:

One virtual machine hangs (there are five in total, connected together
with uml_switch). It does not respond to key input; not even a blank
line appears.

 

Backtrace:

After it hangs, I attach gdb to it and usually get this backtrace
(lines are cut off at the terminal width):

#28 0x080b5ed0 in check_poison_obj (cachep=0x27c4e5a0, objp=0x27c9d000) at /cc/4
#29 0x080b71d9 in cache_alloc_debugcheck_after (cachep=0x27c4e5a0, flags=208, ob
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/mm/slab.c:3072
#30 0x080b7547 in kmem_cache_alloc (cachep=0x27c4e5a0, flags=0) at /cc/4gbts/oss
#31 0x080bfddf in getname (filename=0x619e1f <Address 0x619e1f out of bounds>) a
#32 0x080b955b in do_sys_open (dfd=-100, filename=0x619e1f <Address 0x619e1f out
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/fs/open.c:1086
#33 0x080b9623 in sys_open (filename=0x619e1f <Address 0x619e1f out of bounds>,
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/fs/open.c:1113
#34 0x0805f120 in handle_syscall (r=0x27365374) at /cc/4gbts/oss_1/target/uml/li
#35 0x0806d1a0 in handle_trap (pid=17303, regs=0x27365374, local_using_sysemu=0)
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/arch/um/os-Linux/skas/proce
#36 0x0806d63e in userspace (regs=0x27365374) at /cc/4gbts/oss_1/target/uml/linu
#37 0x0805c800 in fork_handler () at include2/asm/thread_info.h:49
#38 0x00000000 in ?? ()

 

Some debugging I did:

1. If I single-step it, execution never leaves the function userspace.
If I set a breakpoint at "schedule", it is never hit, so no scheduling
takes place.

2. If I set a breakpoint at do_IRQ, I get the following backtrace,
which shows that the process is still delivering real-time alarm
signals.

#0  do_IRQ (irq=0, regs=0x1)
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/include/asm-generic/irq_regs.h:33
#1  0x0805dc3b in timer_handler (sig=26, regs=0x26f8ef0c)
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/arch/um/kernel/time.c:29
#2  0x0806aaaf in real_alarm_handler (sc=0x1)
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/arch/um/os-Linux/signal.c:94
#3  0x0806ad3e in unblock_signals ()
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/arch/um/os-Linux/signal.c:277
#4  0x0806d7b4 in userspace (regs=0x27054354)
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/arch/um/os-Linux/skas/process.c:450
#5  0x0805c8de in fork_handler () at include2/asm/thread_info.h:49
#6  0x00000000 in ?? ()

3. If I set a breakpoint at sigio_handler, I get a backtrace which
shows that the process is still responding to key input.

#0  sigio_handler (sig=29, regs=0x8592c6c)
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/arch/um/kernel/irq.c:80
#1  0x0806aa2f in sig_handler_common (sig=29, sc=0x8592d28)
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/arch/um/os-Linux/signal.c:49
#2  0x0806aa72 in sig_handler (sig=29, sc=0x8592d28)
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/arch/um/os-Linux/signal.c:81
#3  0x0806ab95 in handle_signal (sig=-1207980044, sc=0x8592d28)
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/arch/um/os-Linux/signal.c:158
#4  0x0806c647 in hard_handler (sig=29)
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/arch/um/os-Linux/sys-i386/signal.c:12
#5  <signal handler called>
#6  check_poison_obj (cachep=0x27c4e5a0, objp=0x27c9d000)
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/mm/slab.c:1854
#7  0x080b7469 in cache_alloc_debugcheck_after (cachep=0x27c4e5a0, flags=208, objp=0x27c9d000, caller=0x27c4e56b)
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/mm/slab.c:3072
#8  0x080b77d7 in kmem_cache_alloc (cachep=0x27c4e5a0, flags=0)
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/mm/slab.c:3475
#9  0x080c006f in getname (filename=0x619e1f <Address 0x619e1f out of bounds>)
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/fs/namei.c:146
#10 0x080b97eb in do_sys_open (dfd=-100, filename=0x619e1f <Address 0x619e1f out of bounds>, flags=0, mode=438)
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/fs/open.c:1086
#11 0x080b98b3 in sys_open (filename=0x619e1f <Address 0x619e1f out of bounds>, flags=0, mode=438)
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/fs/open.c:1113
#12 0x0805f200 in handle_syscall (r=0x27054354)
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/arch/um/kernel/skas/syscall.c:35
#13 0x0806d2b4 in handle_trap (pid=11408, regs=0x27054354, local_using_sysemu=0)
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/arch/um/os-Linux/skas/process.c:210
#14 0x0806d785 in userspace (regs=0x27054354)
    at /cc/4gbts/oss_1/target/uml/linux-2.6.26.7/src/arch/um/os-Linux/skas/process.c:439
#15 0x0805c8de in fork_handler () at include2/asm/thread_info.h:49
#16 0x00000000 in ?? ()

4. I called a few more functions from gdb to print the current state,
to find out why the real-time alarm signal does not trigger a schedule:

(gdb) p get_current()
$2 = (void *) 0x0
(gdb) p *get_current_thread()
$3 = {task = 0x0, exec_domain = 0x0, flags = 4, cpu = 0, preempt_count = -257, addr_limit = {seg = 0}, restart_block = {
    fn = 0, {futex = {uaddr = 0x0, val = 0, flags = 0, bitset = 0, time = 0}, nanosleep = {index = 0, rmtp = 0x0,
        expires = 0}}}, real_thread = 0x8592000}
(gdb) p currentstatus()
$4 = 0x85944c0 "PreemtCount: -257,HardIrqCount:268369920, softIrqCount:65024, irqCount:268434944, In interrupt:268434944\n"
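
The counters printed in step 4 can be reproduced by hand from
preempt_count = -257. Below is a minimal sketch in Python. It assumes
the field layout of include/linux/hardirq.h in 2.6.26 with the default
widths (8 preempt bits, 8 softirq bits, 12 hardirq bits); that layout is
an assumption, as are the helper names field_mask and
decode_preempt_count, which exist only for this illustration. Check the
widths against the actual tree before relying on the result.

```python
# Assumed bit widths (defaults in include/linux/hardirq.h, 2.6.26).
PREEMPT_BITS = 8
SOFTIRQ_BITS = 8
HARDIRQ_BITS = 12

PREEMPT_SHIFT = 0
SOFTIRQ_SHIFT = PREEMPT_SHIFT + PREEMPT_BITS
HARDIRQ_SHIFT = SOFTIRQ_SHIFT + SOFTIRQ_BITS


def field_mask(bits, shift):
    """Return a mask of `bits` ones, shifted left by `shift`."""
    return ((1 << bits) - 1) << shift


PREEMPT_MASK = field_mask(PREEMPT_BITS, PREEMPT_SHIFT)
SOFTIRQ_MASK = field_mask(SOFTIRQ_BITS, SOFTIRQ_SHIFT)
HARDIRQ_MASK = field_mask(HARDIRQ_BITS, HARDIRQ_SHIFT)


def decode_preempt_count(value):
    """Split a signed 32-bit preempt_count into its masked fields."""
    raw = value & 0xFFFFFFFF  # reinterpret as unsigned 32-bit
    return {
        "raw_hex": f"{raw:#010x}",
        "preempt": raw & PREEMPT_MASK,
        "softirq": raw & SOFTIRQ_MASK,
        "hardirq": raw & HARDIRQ_MASK,
        "irq": raw & (HARDIRQ_MASK | SOFTIRQ_MASK),
    }


if __name__ == "__main__":
    for name, field in decode_preempt_count(-257).items():
        print(f"{name:8s} {field}")
```

Under these assumptions the hardirq, softirq and combined irq fields
come out as 268369920, 65024 and 268434944, the same numbers that
currentstatus() reports. Note also that -257 = -(1 + 256), that is, one
preempt unit plus one softirq unit below zero. This may point to an
unbalanced enable/disable pair, or to reading a thread_info that does
not belong to the running task (task is 0x0 above); the output alone
does not prove either reading.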

 

At this point I'm lost. My guess is that the problem lies in the
interrupt handling, which is hard for me to dig into. Can someone help?

Thanks

 

Mars Zhao

_______________________________________________
User-mode-linux-user mailing list
User-mode-linux-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-user
