According to the gem5 status matrix (http://gem5.org/Status_Matrix),
"Classic caches do not support x86 locked (atomic RMW) accesses. The
AtomicSimple CPU model enforces atomic RMW accesses itself, so this only
affects correctness for timing-mode CPU models."
You seem to be using the classic memory system (I see no --ruby in your
command line), whereas the wiki implies that you should use Ruby with
timing-mode CPUs such as O3, unless the wiki is out of date, of course.
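If it helps, here is a minimal sketch of checking a gem5 command line for
--ruby mechanically rather than by eye. The command string is copied from
the original message quoted below; the helper name uses_ruby is made up for
illustration, and the check only looks for the literal flag, which is a
simplifying assumption.

```python
import shlex

# Command line copied from the original message in this thread; $N is kept
# as the literal shell placeholder used there.
COMMAND = (
    'build/X86/gem5.fast --remote-gdb-port=0 --stats-file=x264/x264.stats '
    '--dump-config=x264/x264.config configs/example/fs.py -n 12 --caches '
    '--l2cache --l3cache '
    '--script=configs/boot/parsec2/x264_8c_simmedium.rcS '
    '--terminal_name=.x264 --l3tech=stt --l3stochastic=true '
    '--cpu-clock="3 GHz" -s 100 -F 500000000 --maxinsts $N'
)


def uses_ruby(command):
    """Return True if the literal flag --ruby appears among the arguments.

    Hypothetical helper: it splits the command the way a POSIX shell
    would and looks for an exact "--ruby" token, nothing more.
    """
    return "--ruby" in shlex.split(command)


if __name__ == "__main__":
    print("Ruby requested:", uses_ruby(COMMAND))
```

Splitting with shlex keeps the quoted "3 GHz" value together as one
argument, which a plain split on whitespace would not.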
Arthur.
On 11/12/2015 15:25, Adi Fuchs wrote:
Hi Fabian,
I posted a mail about that exact same issue about a week ago.
From what I have seen online, it might be an issue with the way gem5
implements X86 interrupts.
The same thing happens for the timing CPU as well, and I have seen it
in all of PARSEC’s benchmarks, except for raytrace.
I have not seen any fix for that, though. After struggling for a few
weeks, I gave up and switched to Alpha, where it works better. I don’t
know whether that is an option for you. I would really like to see this
fixed as well.
Adi
From: gem5-users [mailto:[email protected]] On Behalf Of Fabian Oboril
Sent: Friday, December 11, 2015 7:10 AM
To: [email protected]
Subject: [gem5-users] Parsec in X86 FS mode uses not all cores in O3
Hi all,
I am currently struggling with a problem in gem5's X86 FS mode with
multi-core CPUs. I use the multithreaded SPLASH-2 and PARSEC 2.1
benchmarks with the kernel and disk images provided on the gem5
website. I use atomic mode to fast-forward to the region of interest
and then switch to the O3 CPU.
The problem is that although the benchmarks should use 8 cores (8
threads), only a few cores are actually used, while the rest seem to
remain idle. This occurs only with the O3 CPU. If I don't switch and
stay in atomic mode, 8 cores are used. I tried using taskset, I created
checkpoints to avoid fast-forwarding and switching, and I tried fewer
and more cores and threads, but the problem is still there. To rule out
problems with the Linux scheduler, I also used taskset to keep the
benchmark off the first core, which did not resolve the problem either.
To illustrate, here is a snapshot of the resulting stats file of the
x264 PARSEC benchmark for the region of interest.
atomic:
system.cpu00.committedInsts     929913  # Number of instructions committed
system.cpu01.committedInsts  139072589  # Number of instructions committed
system.cpu02.committedInsts  104962601  # Number of instructions committed
system.cpu03.committedInsts  176170052  # Number of instructions committed
system.cpu04.committedInsts  131944761  # Number of instructions committed
system.cpu05.committedInsts  126332143  # Number of instructions committed
system.cpu06.committedInsts  132647761  # Number of instructions committed
system.cpu07.committedInsts  132356819  # Number of instructions committed
system.cpu08.committedInsts  136295648  # Number of instructions committed
system.cpu09.committedInsts  165557480  # Number of instructions committed
system.cpu10.committedInsts     118983  # Number of instructions committed
system.cpu11.committedInsts     119255  # Number of instructions committed
with switch to O3:
system.switch_cpus_100.committedInsts_total       861885  # Number of Instructions Simulated
system.switch_cpus_101.committedInsts_total  10000000001  # Number of Instructions Simulated
system.switch_cpus_102.committedInsts_total   9999386643  # Number of Instructions Simulated
system.switch_cpus_103.committedInsts_total       850243  # Number of Instructions Simulated
system.switch_cpus_104.committedInsts_total       861640  # Number of Instructions Simulated
system.switch_cpus_105.committedInsts_total       846145  # Number of Instructions Simulated
system.switch_cpus_106.committedInsts_total       857511  # Number of Instructions Simulated
system.switch_cpus_107.committedInsts_total       846145  # Number of Instructions Simulated
system.switch_cpus_108.committedInsts_total       860387  # Number of Instructions Simulated
system.switch_cpus_109.committedInsts_total       850247  # Number of Instructions Simulated
system.switch_cpus_110.committedInsts_total       861648  # Number of Instructions Simulated
system.switch_cpus_111.committedInsts_total       862428  # Number of Instructions Simulated
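Per-core numbers like the ones above can be pulled out of a stats file
with a few lines of Python. This is only a rough sketch: it assumes one
statistic per line in the actual file (name, value, then a # comment),
and the 1% cutoff for calling a core "idle" is an arbitrary heuristic
chosen for illustration. The function names are made up.

```python
import re

# Matches "<cpu name>.committedInsts <value>" and the "_total" variant.
STAT_RE = re.compile(r"^(\S+)\.committedInsts(?:_total)?\s+(\d+)")


def committed_insts(stats_text):
    """Map each CPU name to its committed-instruction count."""
    counts = {}
    for line in stats_text.splitlines():
        match = STAT_RE.match(line.strip())
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def idle_cpus(counts, fraction=0.01):
    """Return CPUs whose count is below `fraction` of the busiest CPU.

    The default cutoff is an assumption, not a gem5 convention.
    """
    if not counts:
        return []
    busiest = max(counts.values())
    return sorted(cpu for cpu, n in counts.items() if n < fraction * busiest)


# Four of the O3 values from this message, one statistic per line.
SAMPLE = """\
system.switch_cpus_100.committedInsts_total 861885 # Number of Instructions Simulated
system.switch_cpus_101.committedInsts_total 10000000001 # Number of Instructions Simulated
system.switch_cpus_102.committedInsts_total 9999386643 # Number of Instructions Simulated
system.switch_cpus_103.committedInsts_total 850243 # Number of Instructions Simulated
"""

if __name__ == "__main__":
    for cpu in idle_cpus(committed_insts(SAMPLE)):
        print(cpu, "looks idle")
```

On the four sample lines, this flags switch_cpus_100 and switch_cpus_103
as idle relative to the two busy cores.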
I start gem5 like this:
build/X86/gem5.fast --remote-gdb-port=0 --stats-file=x264/x264.stats \
    --dump-config=x264/x264.config configs/example/fs.py -n 12 --caches \
    --l2cache --l3cache \
    --script=configs/boot/parsec2/x264_8c_simmedium.rcS \
    --terminal_name=.x264 --l3tech=stt --l3stochastic=true \
    --cpu-clock="3 GHz" -s 100 -F 500000000 --maxinsts $N
Has anybody observed this behavior before? Does anybody have an idea
how I can resolve this issue?
Best regards,
Fabian
--
Dr.-Ing. Fabian Oboril
Research Assistant (Wissenschaftl. Mitarbeiter)
Karlsruhe Institute of Technology (KIT)
Chair of Dependable Nano Computing (CDNC)
Institut für Technische Informatik (ITEC)
Haid-und-Neu-Str. 7
Building 07.21
76131 Karlsruhe, Germany
Phone: +49 721 608-44859
Fax: +49 721 608-43962
Email: fabian.oboril∂kit edu
Web: http://cdnc.itec.kit.edu/
KIT – The Research University in the Helmholtz Association
_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
--
Arthur Perais
INRIA Bretagne Atlantique
Bâtiment 12E, Bureau E303, Campus de Beaulieu
35042 Rennes, France