Re: [gem5-users] x86 dual system - test system kernel panic in detailed mode

Fangfei Liu Thu, 11 Jul 2013 15:44:16 -0700

I solved this problem. With kernel v2.6.28.4 and the e1000 driver, running 
directly in atomic or detailed mode gives no error. But to take a checkpoint, I 
have to manually force requests to the memory range of the Ethernet device to 
be uncacheable while the checkpoint is being taken. When restoring from the 
checkpoint and running in detailed mode, it doesn’t matter whether the requests 
are forced to be uncacheable or not. So I guess there is a small bug in taking 
checkpoints…


Best regards
Fangfei

From: [email protected] [mailto:[email protected]] On 
Behalf Of Fangfei Liu
Sent: July 11, 2013 11:48
To: gem5 users mailing list
Subject: Re: [gem5-users] x86 dual system - test system kernel panic in 
detailed mode

After some debugging, I think the problem is caused by the caches. 
Memory-mapped I/O should be uncacheable, but requests to the memory range of 
the Ethernet device are treated as cacheable in gem5. Even in atomic mode, I 
get the same assertion error (packet size != 4) if I use caches. How should I 
specify that a memory range is uncacheable? Thanks!
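
To illustrate the idea, here is a minimal sketch in plain Python. It is not 
gem5 code and does not use any gem5 API. The names AddrRange, Request, 
force_uncacheable and size_seen_by_device, the 64-byte line size, and the NIC 
address window are all hypothetical, chosen only for illustration. It assumes 
that a cacheable access reaches the device as a full cache-line fill, which is 
one plausible way a 4-byte register read could fail a size == 4 check.

```python
from dataclasses import dataclass

CACHE_LINE_BYTES = 64  # assumed cache line size, for illustration only


@dataclass(frozen=True)
class AddrRange:
    """Hypothetical half-open address range [start, start + size)."""
    start: int
    size: int

    def contains(self, addr: int) -> bool:
        return self.start <= addr < self.start + self.size


@dataclass
class Request:
    """Hypothetical memory request issued by the CPU."""
    addr: int
    size: int
    uncacheable: bool = False


def force_uncacheable(req: Request, mmio_ranges: list) -> Request:
    """Mark the request uncacheable if it targets a memory-mapped I/O range."""
    if any(r.contains(req.addr) for r in mmio_ranges):
        req.uncacheable = True
    return req


def size_seen_by_device(req: Request) -> int:
    """Assumption: a cacheable access arrives at the device as a line fill."""
    return req.size if req.uncacheable else CACHE_LINE_BYTES


# Hypothetical register window of the Ethernet device.
NIC_MMIO = AddrRange(start=0xC0000000, size=0x20000)

reg_read = Request(addr=0xC0000008, size=4)
size_when_cacheable = size_seen_by_device(reg_read)

force_uncacheable(reg_read, [NIC_MMIO])
size_when_uncacheable = size_seen_by_device(reg_read)

print(size_when_cacheable, size_when_uncacheable)
```

In this sketch, only the request that has been forced uncacheable reaches the 
device with its original 4-byte size.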

Best regards
Fangfei

From: [email protected] [mailto:[email protected]] On 
Behalf Of Fangfei Liu
Sent: July 10, 2013 19:36
To: gem5 users mailing list
Subject: Re: [gem5-users] x86 dual system - test system kernel panic in 
detailed mode

I also tried running the dual system directly in detailed mode and hit the same 
problem. So this seems to be a problem with detailed mode rather than with 
checkpoint restore. Does anyone have any suggestions? Thanks!

Best regards
Fangfei

From: [email protected] [mailto:[email protected]] On 
Behalf Of Fangfei Liu
Sent: July 10, 2013 15:22
To: [email protected]
Subject: [gem5-users] x86 dual system - test system kernel panic in detailed 
mode

Hello,

I'm running the Apache server and the Apache benchmark in an x86 dual system. 
It works well in atomic mode with the default configuration. I take a 
checkpoint before the client sends its requests. When the simulator restores 
from the checkpoint and runs in detailed mode, the test system (server) gets a 
kernel panic upon receiving packets from the client:

build/X86/gem5.opt configs/example/fs.py --kernel=vmlinux --disk-image=big.img 
--dual -r 1 --caches --cpu-type=detailed

BUG: unable to handle kernel NULL pointer dereference at 0000000000000710
IP: [<ffffffff803fed22>] rx_irq+0x32/0x250
PGD 72d6067 PUD 72d7067 PMD 0
Oops: 0000 [#1]
last sysfs file:
CPU 0
Modules linked in:
Pid: 0, comm: swapper Tainted: G        W  2.6.28.4-dirty #5
RIP: 0010:[<ffffffff803fed22>]  [<ffffffff803fed22>] rx_irq+0x32/0x250
RSP: 0018:ffffffff8083beb0  EFLAGS: 000000a8
RAX: 0000000000000000 RBX: ffff88000706e640 RCX: 000000000000023c
RDX: ffff88000706e940 RSI: 0000000000000000 RDI: ffff88000706e000
RBP: ffffffff80892968 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000064 R11: ffffffff8021afc0 R12: ffff88000706e000
R13: ffff88000706e640 R14: ffff88000706e710 R15: 0000000000000000
FS:  000000004e7fb950(0000) GS:ffffffff80844020(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000710 CR3: 00000000072d3000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
Process swapper (pid: 0, threadinfo ffffffff807e6000, task ffffffff8076d380)
Stack:
 ffffffff802108e9 0000000000000000 00000000000002a8 ffff88000706e640
 ffffffff80892968 ffff88000706e000 0000000000000100 000000000000000a
 0000000000000000 ffffffff803fef5d ffffffff803ff438 0000000000000000
Call Trace:
 <IRQ> <0> [<ffffffff802108e9>] ? read_tsc+0x9/0x20
 [<ffffffff803fef5d>] ? rx_action+0x1d/0x70
 [<ffffffff803ff438>] ? ns83820_irq+0x1c8/0x2b0
 [<ffffffff80232c59>] ? tasklet_action+0x39/0x80
 [<ffffffff80232e8c>] ? __do_softirq+0x7c/0x130
 [<ffffffff8020c19c>] ? call_softirq+0x1c/0x30
 [<ffffffff8020d015>] ? do_softirq+0x35/0x70
 [<ffffffff8020d0f9>] ? do_IRQ+0xa9/0x1a0
 [<ffffffff8020b996>] ? ret_from_intr+0x0/0xa
 <EOI> <0> [<ffffffff80209d32>] ? __exit_idle+0x12/0x30
 [<ffffffff8020a23a>] ? cpu_idle+0x2a/0x50
Code: 49 81 c6 10 07 00 00 41 55 49 89 fd 49 81 c5 40 06 00 00 41 54 55 53 48 
83 ec 18 48 89 7c 24 08 9c 8f 44 24 10 fa 48 8b 44 24 08 <8b> 88 10 07 00 00 85 
c9 0f 84 cb 01 00 00 49 8b 96 10 02 00 00
RIP  [<ffffffff803fed22>] rx_irq+0x32/0x250
 RSP <ffffffff8083beb0>
CR2: 0000000000000710
Kernel panic - not syncing: Fatal exception in interrupt

I'm using kernel v2.6.28.4 and the NSGigE Ethernet card. I found a similar 
kernel panic in a previous post, but my problem is a little different. 
Restoring the checkpoint in atomic mode works well. I tried other kernel and 
Ethernet card combinations and got different errors. For example, with kernel 
v2.6.22.9 and E1000 there is no kernel panic, but there is an assertion 
error:

 build/X86/dev/i8254xGBe.cc:177: virtual Tick IGbE::read(PacketPtr): Assertion 
`pkt->getSize() == 4' failed.
Program aborted at cycle 5470903108500
./run_fs_test: line 6: 12004 Aborted                 (core dumped) 
build/X86/gem5.opt configs/example/fs.py --kernel=x86_64-vmlinux-2.6.22.9 
--disk-image=big.img --dual --checkpoint-dir=apache-e1000 -r 1 
--cpu-type=detailed --caches

Does anyone know how to solve this problem? Thanks!

Best regards
Fangfei

_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users