Re: [gem5-users] [EXT] Re: Look into my running simulation

nocua Thu, 18 Oct 2018 01:23:40 -0700

Hi Vitorio,

Based on your command line, I notice that you are configuring a system with L2/L3 caches. However, to collect traces using the Elastic Traces probe, the system should have only L1 caches (as explained in the cache configuration file).


./configs/common/CacheConfig.py

    # If elastic trace generation is enabled, make sure the memory system is
    # minimal so that compute delays do not include memory access latencies.
    # Configure the compulsory L1 caches for the O3CPU, do not configure
    # any more caches.
    if options.l2cache and options.elastic_trace_en:
        fatal("When elastic trace is enabled, do not configure L2 caches.")
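
The guard quoted above can be tried outside gem5 with a minimal sketch like the one below. The option names follow the options.l2cache and options.elastic_trace_en attributes in the snippet; the argparse parser and the fatal() stand-in are assumptions made only for illustration and are not gem5's own implementation.

```python
import argparse


def fatal(message):
    # Stand-in for gem5's fatal() (assumption for this sketch): abort with the message.
    raise SystemExit("fatal: " + message)


def build_parser():
    # Hypothetical parser exposing only the two options the quoted check reads.
    parser = argparse.ArgumentParser()
    parser.add_argument("--l2cache", action="store_true")
    parser.add_argument("--elastic-trace-en", action="store_true")
    return parser


def check_cache_options(options):
    # Same condition as the quoted check: L2 together with elastic trace is rejected.
    if options.l2cache and options.elastic_trace_en:
        fatal("When elastic trace is enabled, do not configure L2 caches.")
```

Calling check_cache_options(build_parser().parse_args(["--elastic-trace-en"])) returns normally, while adding "--l2cache" to that list makes the sketch exit with the fatal message.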

If you want to run the application directly rather than connecting to a terminal, you can add the --script option followed by the path to the rcS file. For instance, --script=/home/folder/folder/spec_benchmark.rcS.

If the simulation is working properly, you will see the traces you created (inst.proto.gz/data.prot.gz) in your m5out folder, and they should grow over time.
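
One way to check that the traces keep growing is to compare two snapshots of the file sizes. This is only a sketch: the file names are the ones from the command line in this thread, while the helper names (trace_sizes, grown, watch) and the 60-second default interval are hypothetical choices for illustration.

```python
import os
import time


def trace_sizes(outdir, names=("inst.proto.gz", "data.prot.gz")):
    # Map each trace file name to its current size in bytes (None if missing).
    sizes = {}
    for name in names:
        path = os.path.join(outdir, name)
        sizes[name] = os.path.getsize(path) if os.path.exists(path) else None
    return sizes


def grown(before, after):
    # Names of the files whose size increased between the two snapshots.
    return [name for name in after
            if before.get(name) is not None
            and after[name] is not None
            and after[name] > before[name]]


def watch(outdir, interval_s=60):
    # Take two snapshots interval_s seconds apart and report which traces grew.
    first = trace_sizes(outdir)
    time.sleep(interval_s)
    return grown(first, trace_sizes(outdir))
```

For example, watch("m5out") returns the list of trace files that grew during the interval; an empty list after a long interval would suggest the simulation is no longer making progress.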

You can also find some information on how to collect and replay Elastic Traces in [1].


I hope this helps.

Kind Regards,
Alejandro NOCUA
CNRS Postdoctoral Researcher
LIRMM
161 Rue Ada
34000

[1] http://gem5.org/TraceCPU



On 2018-10-18 02:41, Vitorio Cargnini (lcargnini) wrote:

Hi,

I am still trying to boot. My gut feeling is that everything goes
south as soon as I enable the elastic traces.

My parameters:
gem5.opt --smt --caches --cpu-type=DerivO3CPU --cpu-clock=3GHz
--mem-type=SimpleMemory --mem-channels=2 --mem-ranks=4 --mem-size=16GB
--l1d_size=32kB --l1i_size=64kB --l2_size=8MB --l3_size=22MB
--elastic-trace-en --inst-trace-file=inst.proto.gz
--data-trace-file=data.prot.gz
--disk-image=$M5_PATH/disks/ubuntu-14.04-amd64.img
--kernel=$M5_PATH/binaries/vmlinux-4.8.13-1_amd64

So far it was booting and working without the traces enabled.

What should I do to make it work? I want to trace a benchmark run on
my gem5 and later feed my system with the traces only.

Regards,
Vitorio.


From: Ciro Santilli [mailto:[email protected]]
Sent: Wednesday, October 17, 2018 12:05 AM
To: gem5 users mailing list <[email protected]>; Vitorio Cargnini
(lcargnini) <[email protected]>
Subject: [EXT] Re: [gem5-users] Look into my running simulation

On Wed, Oct 17, 2018 at 2:24 AM Gabe Black<mailto:[email protected]> wrote:

Hi Vitorio. It looks like the kernel panicked and never finished
booting. You can exit m5term by typing ~. (tilde and then period),

OMG, this is amazing!

or you can use whatever telnet client/terminal emulator you're comfortable with.

Beware, however, that telnet has some quirks (e.g. the arrow keys stop
working), so it is better to stick to m5term.
 
Gabe
On Tue, Oct 16, 2018, 3:41 PM Vitorio Cargnini (lcargnini)
<mailto:[email protected]> wrote:
Hello,
 
I set up a simulation with a SPEC benchmark. However, it has been
running for some days already, so I want to check whether everything is
fine with the simulated system and whether it is really running.
 
The reason is:
My installation is in /home/folder/folder. However, I’m running
from a different folder, like
/somewhere/folder/folder/folder_where_i_want_the_m5out_folder
 
Looking inside this target location, stats.txt is empty so far.
There is also a file system.pc.com_1.terminal, and when I look at it I
see at the end:
…
Mountpoint-cache hash table entries: 32768 (order: 6, 262144 bytes)
CPU: CPU feature monitor disabled, no CPUID level 0x5
CPU: CPU feature xsave disabled, no CPUID level 0xd
mce: CPU supports 4 MCE banks
mce: unknown CPU type - not enabling MCE support
Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0

Freeing SMP alternatives memory: 24K (ffffffff8198e000 - ffffffff81994000)

BUG: unable to handle kernel paging request at ffffffffcd82c740
IP: [<ffffffffcd82c771>] 0xffffffffcd82c771
PGD 1807067 PUD 1809067 PMD 0
Oops: 0010 [#1] SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.13 #1
Hardware name:  , BIOS  06/08/2008
task: ffffffff8180b500 task.stack: ffffffff81800000
RIP: 0010:[<ffffffffcd82c771>]  [<ffffffffcd82c771>] 0xffffffffcd82c771
RSP: 0000:ffffffff81803e30  EFLAGS: 00000028
RAX: 0000000000020f76 RBX: 0000000000000805 RCX: 0000000004000209
RDX: 00000000e7dbfbff RSI: ffffffff81803e59 RDI: 000000000000026c
RBP: ffffffff81803e52 R08: ffffffff810145f3 R09: 000000000000026c
R10: ffffffff81803e52 R11: ffffffff81803e52 R12: ffffffff819858bc
R13: ffffffff8192c2e0 R14: ffffffff81995000 R15: 0000000000090200

FS: 0000000000000000(0000) GS:ffff88043fc00000(0000) knlGS:0000000000000000

CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffcd82c740 CR3: 0000000001806000 CR4: 00000000000006b0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
Stack:
ffffffff810145fb 000000000000026c ffffffff8197b6a0 ffffffff81014a35
669066669d570006 0000000000009090 ffffffff819dfbe2 000000000000004a
ffffffff8106abcd 0000000000000000 0000000000000000 000000000000026c
Call Trace:
[<ffffffff810145fb>] ? text_poke_early+0x2c/0x30
[<ffffffff81014a35>] ? apply_paravirt.part.1+0x74/0x82
[<ffffffff8106abcd>] ? vprintk_emit+0x357/0x368
[<ffffffff810acaaf>] ? printk+0x43/0x4b
[<ffffffff810b347c>] ? free_reserved_area+0x105/0x114
[<ffffffff818b57a1>] ? alternative_instructions+0xbf/0xcf
[<ffffffff818b708e>] ? check_bugs+0xa/0x28
[<ffffffff818ace24>] ? start_kernel+0x412/0x424
[<ffffffff818ac120>] ? early_idt_handler_array+0x120/0x120
[<ffffffff818ac36c>] ? x86_64_start_kernel+0xe6/0xf5
Code:  Bad RIP value.
RIP  [<ffffffffcd82c771>] 0xffffffffcd82c771
RSP <ffffffff81803e30>
CR2: ffffffffcd82c740
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Attempted to kill the idle task!
---[ end Kernel panic - not syncing: Attempted to kill the idle task!
random: fast init done
 

As Gabe said, a kernel panic means the kernel has completely shut down;
there is no way your simulation can still be running after one.
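
A quick way to check for this without attaching m5term is to scan the terminal output file for the panic message. The sketch below assumes the log is a plain-text file such as the system.pc.com_1.terminal mentioned in the quoted message; the function name find_kernel_panic is hypothetical.

```python
def find_kernel_panic(terminal_path):
    # Return the first line that reports a kernel panic, or None if there is none.
    with open(terminal_path, "r", errors="replace") as log:
        for line in log:
            if "Kernel panic" in line:
                return line.strip()
    return None
```

If this returns a line, the guest kernel has stopped and the simulation can be killed and restarted with a different configuration.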

You can also have a look at the kernel source for the function at the
top of the backtrace, "text_poke_early", and see if it gives a clue
about what happened. Sometimes it is easy, sometimes not.
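
To get a list of functions to look up in the kernel source, the "Call Trace:" lines can be parsed into symbol names and offsets. This is a sketch that assumes the frame format shown in the log above ("[<address>] ? symbol+0xoffset/0xsize"); the regular expression and the function name parse_frames are mine, not part of gem5 or the kernel.

```python
import re

# Matches frames such as "[<ffffffff810145fb>] ? text_poke_early+0x2c/0x30".
FRAME = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(?:\?\s+)?([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_frames(log_text):
    # Return (symbol, offset, address) tuples in the order they appear in the log.
    frames = []
    for match in FRAME.finditer(log_text):
        address, symbol, offset, _size = match.groups()
        frames.append((symbol, int(offset, 16), int(address, 16)))
    return frames
```

Lines without a symbol+offset part, such as the "RIP" line with the bad address, are skipped by this pattern.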

You can find the exact line post-mortem with GDB's disassemble /rs:
https://stackoverflow.com/questions/22769246/how-to-disassemble-one-single-function-using-objdump/31138400#31138400

You can also try to connect through the GDB stub and step debug kernel code.


 
So I’m not sure if it is working. Still, I tested the application
before, and everything was working. However, I’m not so sure now, and
I didn’t want to use m5term, because I don’t know how to kill it
without killing the simulation and having to restart all over again.
This has been running for a few days already, I’m collecting
traces, and the trace file has grown over this entire time.
 

The contents of m5term also show up in m5out/system.terminal during the
later parts of the boot.
 
I will wait for your feedback.
 
Best Regards,
 
Luis Vitorio Cargnini, Ph.D.
mailto:[email protected]
Sr. Systems Architect,
Micron Technology, Inc.
This email and any attachments contained within may contain
confidential and proprietary information.
 
 
 
 
 
 
_______________________________________________
gem5-users mailing list
mailto:[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users