I solved this problem. With kernel v2.6.28.4 and the e1000 driver, running directly in atomic or detailed mode gives no error. To take a checkpoint, however, I have to manually force requests to the memory range of the Ethernet device to be uncacheable while the checkpoint is being taken. When restoring from the checkpoint and running in detailed mode, it doesn't matter whether the requests are forced to be uncacheable or not. So I guess there is a small bug in taking checkpoints.
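To illustrate the idea of forcing device accesses to be uncacheable, here is a toy model in Python. It is only a sketch and not gem5 code: `AddrRange`, `Request`, `mark_mmio_uncacheable`, `size_seen_by_device`, the 64-byte line size and the Ethernet address window are all hypothetical names and values chosen for illustration. It assumes that a cacheable miss reaches the device as a whole-line access, which would be consistent with the `pkt->getSize() == 4` assertion reported later in this thread, while an uncacheable request keeps its original size.

```python
from dataclasses import dataclass
from typing import Iterable

CACHE_LINE_SIZE = 64  # assumed line size in bytes, for illustration only


@dataclass(frozen=True)
class AddrRange:
    """Half-open physical address range [start, end)."""
    start: int
    end: int

    def contains(self, addr: int) -> bool:
        return self.start <= addr < self.end


@dataclass
class Request:
    """Toy memory request (hypothetical, not gem5's Request class)."""
    addr: int
    size: int
    uncacheable: bool = False


def mark_mmio_uncacheable(req: Request, mmio_ranges: Iterable[AddrRange]) -> Request:
    """Flag the request as uncacheable if it targets any device range."""
    if any(r.contains(req.addr) for r in mmio_ranges):
        req.uncacheable = True
    return req


def size_seen_by_device(req: Request) -> int:
    """Size of the access that reaches a device sitting behind a cache.

    Assumption: a cacheable miss becomes a whole-line fill, while an
    uncacheable request bypasses the cache and keeps its original size.
    """
    return req.size if req.uncacheable else CACHE_LINE_SIZE


# Hypothetical register window for the Ethernet device.
ETHERNET_MMIO = AddrRange(0xC0000000, 0xC0020000)

cacheable = Request(addr=0xC0000008, size=4)
fixed = mark_mmio_uncacheable(Request(addr=0xC0000008, size=4), [ETHERNET_MMIO])

print(size_seen_by_device(cacheable))  # 64: a device expecting 4 bytes would reject this
print(size_seen_by_device(fixed))      # 4: the register read arrives unchanged
```

In this model, a request to ordinary memory (outside `ETHERNET_MMIO`) is left cacheable, so only device register accesses are affected.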
Best regards
Fangfei

From: [email protected] [mailto:[email protected]] On Behalf Of Fangfei Liu
Sent: July 11, 2013 11:48
To: gem5 users mailing list
Subject: Re: [gem5-users] x86 dual system - test system kernel panic in detailed mode

After some debugging, I guess the problem is due to the use of caches. Memory-mapped I/O should be uncacheable, but requests to the memory range of the Ethernet device are cacheable in gem5. Even in atomic mode, I get the same assertion error (packet size != 4) if I use caches. How should I specify that a memory range is uncacheable?

Thanks!

Best regards
Fangfei

From: [email protected] [mailto:[email protected]] On Behalf Of Fangfei Liu
Sent: July 10, 2013 19:36
To: gem5 users mailing list
Subject: Re: [gem5-users] x86 dual system - test system kernel panic in detailed mode

I also tried to run the dual system directly in detailed mode and hit the same problem. It seems that this is not a problem with checkpoint restore but a problem with detailed mode. Does anyone have any suggestions?

Thanks!

Best regards
Fangfei

From: [email protected] [mailto:[email protected]] On Behalf Of Fangfei Liu
Sent: July 10, 2013 15:22
To: [email protected]
Subject: [gem5-users] x86 dual system - test system kernel panic in detailed mode

Hello,

I'm running an Apache server and the Apache benchmark in an x86 dual system. It works well in atomic mode with the default configuration. I take a checkpoint before the client sends its requests.

When the simulator restores from the checkpoint and runs in detailed mode, the test system (server) gets a kernel panic upon receiving packets from the client:

    build/X86/gem5.opt configs/example/fs.py --kernel=vmlinux --disk-image=big.img --dual -r 1 --caches --cpu-type=detailed

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000710
    IP: [<ffffffff803fed22>] rx_irq+0x32/0x250
    PGD 72d6067 PUD 72d7067 PMD 0
    Oops: 0000 [#1]
    last sysfs file:
    CPU 0
    Modules linked in:
    Pid: 0, comm: swapper Tainted: G W 2.6.28.4-dirty #5
    RIP: 0010:[<ffffffff803fed22>] [<ffffffff803fed22>] rx_irq+0x32/0x250
    RSP: 0018:ffffffff8083beb0 EFLAGS: 000000a8
    RAX: 0000000000000000 RBX: ffff88000706e640 RCX: 000000000000023c
    RDX: ffff88000706e940 RSI: 0000000000000000 RDI: ffff88000706e000
    RBP: ffffffff80892968 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000064 R11: ffffffff8021afc0 R12: ffff88000706e000
    R13: ffff88000706e640 R14: ffff88000706e710 R15: 0000000000000000
    FS: 000000004e7fb950(0000) GS:ffffffff80844020(0000) knlGS:0000000000000000
    CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: 0000000000000710 CR3: 00000000072d3000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
    Process swapper (pid: 0, threadinfo ffffffff807e6000, task ffffffff8076d380)
    Stack:
     ffffffff802108e9 0000000000000000 00000000000002a8 ffff88000706e640
     ffffffff80892968 ffff88000706e000 0000000000000100 000000000000000a
     0000000000000000 ffffffff803fef5d ffffffff803ff438 0000000000000000
    Call Trace:
     <IRQ> <0> [<ffffffff802108e9>] ? read_tsc+0x9/0x20
     [<ffffffff803fef5d>] ? rx_action+0x1d/0x70
     [<ffffffff803ff438>] ? ns83820_irq+0x1c8/0x2b0
     [<ffffffff80232c59>] ? tasklet_action+0x39/0x80
     [<ffffffff80232e8c>] ? __do_softirq+0x7c/0x130
     [<ffffffff8020c19c>] ? call_softirq+0x1c/0x30
     [<ffffffff8020d015>] ? do_softirq+0x35/0x70
     [<ffffffff8020d0f9>] ? do_IRQ+0xa9/0x1a0
     [<ffffffff8020b996>] ? ret_from_intr+0x0/0xa
     <EOI> <0> [<ffffffff80209d32>] ? __exit_idle+0x12/0x30
     [<ffffffff8020a23a>] ? cpu_idle+0x2a/0x50
    Code: 49 81 c6 10 07 00 00 41 55 49 89 fd 49 81 c5 40 06 00 00 41 54 55 53 48 83 ec 18 48 89 7c 24 08 9c 8f 44 24 10 fa 48 8b 44 24 08 <8b> 88 10 07 00 00 85 c9 0f 84 cb 01 00 00 49 8b 96 10 02 00 00
    RIP [<ffffffff803fed22>] rx_irq+0x32/0x250
     RSP <ffffffff8083beb0>
    CR2: 0000000000000710
    Kernel panic - not syncing: Fatal exception in interrupt

I'm using kernel v2.6.28.4 and the NSGigE Ethernet card. I found a similar kernel panic in a previous post, but my problem is a little different: restoring the checkpoint into atomic mode works well.

I also tried other kernel and Ethernet card combinations and got different errors. For example, with kernel v2.6.22.9 and E1000 there is no kernel panic, but there is an assertion error:

    build/X86/dev/i8254xGBe.cc:177: virtual Tick IGbE::read(PacketPtr): Assertion `pkt->getSize() == 4' failed.
    Program aborted at cycle 5470903108500
    ./run_fs_test: line 6: 12004 Aborted (core dumped) build/X86/gem5.opt configs/example/fs.py --kernel=x86_64-vmlinux-2.6.22.9 --disk-image=big.img --dual --checkpoint-dir=apache-e1000 -r 1 --cpu-type=detailed --caches

Does anyone know how to solve this problem?

Thanks!

Best regards
Fangfei
_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
