Hi,
Recently I ran a series of tests to find out which features of M5 are
usable.
The tests focus mainly on M5's CMP/SMT simulation capability. Some
configurations passed and others failed, and I don't know why the failures
occur. I hope the authors (Nathan, Steven, Ali, ...) can offer some
suggestions on the failed cases.
The test results are as follows:
[Notations]
C: cores
T: threads
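To make the C/T labels concrete, here is a minimal sketch that decodes them. The helper name `parse_label` is hypothetical (it is not part of M5), and it assumes that a label without a T part, such as "16C", means one thread per core.

```python
import re

# Hypothetical helper, not part of M5: decode labels such as "1C4T" or "16C".
_LABEL = re.compile(r"^(\d+)C(?:(\d+)T)?$")

def parse_label(label):
    """Return (cores, threads_per_core) for a label like '1C4T'.

    Assumption: a label without a T part means one thread per core.
    """
    match = _LABEL.match(label)
    if match is None:
        raise ValueError("unrecognised label: %r" % label)
    cores = int(match.group(1))
    threads = int(match.group(2)) if match.group(2) else 1
    return cores, threads

print(parse_label("1C4T"))  # (1, 4)
print(parse_label("16C"))   # (16, 1)
```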
[SE mode]:
[DetailedCPU]:
[none-cache]:
1C4T: OK;
8C: OK;
16C: ERROR
build/ALPHA_SE/cpu/base_dyn_inst_impl.hh:125: void
BaseDynInst<Impl>::initVars() [with Impl = O3CPUImpl]: Assertion `instcount <=
1500' failed.
[Just L1-cache]:
SMT-enable: ERROR
m5.opt: build/ALPHA_SE/mem/request.hh:229: int
Request::getThreadNum(): Assertion `validCpuAndThreadNums' failed.
4C: OK
8C: OK
16C: ERROR
build/ALPHA_SE/cpu/base_dyn_inst_impl.hh:125: void
BaseDynInst<Impl>::initVars() [with Impl = O3CPUImpl]: Assertion `instcount <=
1500' failed.
[L2-cache(shared)]:ERROR
Exiting @ tick 4611686018427387904 because simulate() limit reached
When run, the simulator reached the maximum tick 0x4000000000000000
very quickly and exited.
[TimingSimpleCPU]:
[SMT-enable]: ERROR
KeyError: 'system.cpu2.workload0
system.cpu2.workload1'
panic: resolveSimObject: failure on call to
Python for system.cpu2.workload0 system.cpu2.workload1
[none-cache]:
[Just L1-cache]:
32C: OK;
128C: OK;
[L2-cache(shared)]:
4C: OK;
32C: OK;
[AtomicSimpleCPU]:
[none-cache]:
128C: OK;
[Just L1-cache]:
32C: OK;
128C: OK;
[L2-cache(shared)]:
2C: ERROR;
Exiting @ tick 15583 because target called
exit()
[FS Mode]:
[DetailedCPU]: the same error output appeared in all cases
[none-cache]: ERROR
[L1-Cache]: ERROR
[L2-Cache]: ERROR
warn: Entering event queue @ 0. Starting simulation...
warn: cycle 490: Quiesce instruction encountered,
halting fetch!
warn: cycle 490: Quiesce instruction encountered,
halting fetch!
warn: cycle 532: Quiesce instruction encountered,
halting fetch!
warn: cycle 532: Quiesce instruction encountered,
halting fetch!
Segmentation fault
[TimingSimpleCPU]:
[none-cache]:
2C: OK
4C: it stops after running for a while (2-3 minutes), and the last
line printed is always "NET: Registered protocol family 2". Part of the
console output follows:
PIIX4: IDE controller at PCI slot 0000:00:00.0
PIIX4: chipset revision 0
PIIX4: 100% native mode on irq 31
ide0: BM-DMA at 0x8400-0x8407, BIOS settings: hda:DMA, hdb:DMA
ide1: BM-DMA at 0x8408-0x840f, BIOS settings: hdc:DMA, hdd:DMA
hda: M5 IDE Disk, ATA DISK drive
hdb: M5 IDE Disk, ATA DISK drive
ide0 at 0x8410-0x8417,0x8422 on irq 31
hda: max request size: 128KiB
hda: 524160 sectors (268 MB), CHS=520/16/63,
UDMA(33)
hda: cache flushes not supported
hda: hda1
hdb: max request size: 128KiB
hdb: 4177920 sectors (2139 MB), CHS=4144/16/63,
UDMA(33)
hdb: cache flushes not supported
hdb: unknown partition table
mice: PS/2 mouse device common for all mice
NET: Registered protocol family 2
[L1-Cache]:
1C: ERROR
Listening for console connection on port 3456
0: system.remote_gdb.listener: listening for
remote gdb #0 on port 7000
0: system.remote_gdb.listener: listening for
remote gdb #1 on port 7001
warn: Entering event queue @ 0. Starting
simulation...
m5.opt:
build/ALPHA_FS/mem/cache/miss/mshr_queue.cc:216: void
MSHRQueue::markInService(MSHR*): Assertion `mshr->getNumTargets() == 0' failed.
Program aborted at cycle 330
Aborted
[L2-Cache]:
1C: ERROR:
m5.opt:
build/ALPHA_FS/mem/cache/tags/cache_tags_impl.hh:178: typename CacheTags<Tags,
Compression>::BlkType* CacheTags<Tags, Compression>::handleFill(typename
Tags::BlkType*, Packet*&, unsigned int, PacketList&, Packet*) [with Tags = LRU,
Compression = NullCompression]: Assertion `tmp_blk == blk' failed.
Program aborted at cycle 806
Aborted
[AtomicSimpleCPU]:
[none-cache]:
2C: OK
4C: same as TimingSimpleCPU
[L1-Cache]: ERROR
warn: Entering event queue @ 0. Starting simulation...
panic: unimplemented
@ cycle 19335
[doFunctionalAccess:build/ALPHA_FS/mem/physical.cc,
line 136]
Program aborted at cycle 19335
Aborted
[L2-Cache]: ERROR
warn: Entering event queue @ 0. Starting simulation...
m5.opt: build/ALPHA_FS/mem/request.hh:229: int
Request::getThreadNum(): Assertion `validCpuAndThreadNums' failed.
Program aborted at cycle 19335
Aborted
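As a quick check on the tick value reported in the SE-mode DetailedCPU shared-L2 case, the sketch below confirms that the decimal and hex figures match and that both equal 2^62. The conversion to simulated time is only illustrative: it assumes one tick is one picosecond (a 1 THz root clock, as set in fs.py), which se.py does not state.

```python
# Tick value copied from the "simulate() limit reached" message.
reported_tick = 4611686018427387904

# The same value written in hex and as a power of two.
assert reported_tick == 0x4000000000000000
assert reported_tick == 2 ** 62

# Assumption: one tick is one picosecond (1 THz root clock, as in fs.py).
TICKS_PER_SECOND = 10 ** 12
simulated_seconds = reported_tick // TICKS_PER_SECOND
print(simulated_seconds)           # 4611686
print(simulated_seconds // 86400)  # 53
```

Under that assumption the limit corresponds to about 53 simulated days, so reaching it almost immediately suggests the simulation ran out of scheduled events rather than actually simulating that long.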
The configuration files used in testing are as follows:
[se.py]:
# Simple test script
#
# "m5 test.py"
import m5
from m5.objects import *
import os, optparse, sys
m5.AddToPath('../common')
from FullO3Config import *
parser = optparse.OptionParser()
parser.add_option("-c", "--cmd",
                  default="../../tests/test-progs/hello/bin/alpha/linux/hello",
                  help="The binary to run in syscall emulation mode.")
parser.add_option("-o", "--options", default="",
                  help="The options to pass to the binary, use \" \" around "
                       "the entire string.")
parser.add_option("-i", "--input", default="",
                  help="A file of input to give to the binary.")
parser.add_option("-d", "--detailed", action="store_true")
parser.add_option("-t", "--timing", action="store_true")
parser.add_option("-m", "--maxtick", type="int")
parser.add_option("-p", "--cpunum", type="int")
parser.add_option("-l", "--L1", action="store_true")
parser.add_option("-L", "--L2", action="store_true")
(options, args) = parser.parse_args()
if args:
    print "Error: script doesn't take any positional arguments"
    sys.exit(1)
n_cpus = 1
if options.cpunum:
    n_cpus = options.cpunum
if options.timing:
    cpus = [ TimingSimpleCPU() for i in xrange(n_cpus) ]
elif options.detailed:
    cpus = [ DetailedO3CPU() for i in xrange(n_cpus) ]
else:
    cpus = [ AtomicSimpleCPU() for i in xrange(n_cpus) ]
# --------------------
# Base L1 Cache
# ====================
class L1(BaseCache):
    latency = 1
    block_size = 64
    mshrs = 4
    tgts_per_mshr = 8
    #protocol = CoherenceProtocol(protocol='msi')
# ----------------------
# Base L2 Cache
# ----------------------
class L2(BaseCache):
    block_size = 64
    latency = 100
    mshrs = 92
    tgts_per_mshr = 16
    write_buffers = 8
# system simulated
system = System(cpu = cpus, physmem = PhysicalMemory(), membus = Bus())
# l2cache & bus
system.toL2Bus = Bus()
system.l2c = L2(size='4MB', assoc=8)
system.l2c.cpu_side = system.toL2Bus.port
# connect l2c to membus
system.l2c.mem_side = system.membus.port
for cpu in cpus:
    if options.L1:
        cpu.addPrivateSplitL1Caches(L1(size = '32kB', assoc = 1),
                                    L1(size = '32kB', assoc = 4))
        cpu.mem = cpu.dcache
        cpu.connectMemPorts(system.membus)
    elif options.L2:
        cpu.addPrivateSplitL1Caches(L1(size = '32kB', assoc = 1),
                                    L1(size = '32kB', assoc = 4))
        cpu.mem = cpu.dcache
        # connect cpu level-1 caches to shared level-2 cache
        cpu.connectMemPorts(system.toL2Bus)
    else:
        cpu.mem = system.physmem
        cpu.connectMemPorts(system.membus)
    process = LiveProcess()
    process.executable = options.cmd
    process.cmd = options.cmd + " " + options.options
    if options.input != "":
        process.input = options.input
    if options.detailed:
        #check for SMT workload
        workloads = options.cmd.split(';')
        if len(workloads) > 1:
            process = []
            smt_idx = 0
            inputs = []
            if options.input != "":
                inputs = options.input.split(';')
            for wrkld in workloads:
                smt_process = LiveProcess()
                smt_process.executable = wrkld
                smt_process.cmd = wrkld + " " + options.options
                if inputs and inputs[smt_idx]:
                    smt_process.input = inputs[smt_idx]
                process += [smt_process, ]
                smt_idx += 1
    cpu.workload = process
system.physmem.port = system.membus.port
root = Root(system = system)
if options.timing or options.detailed:
root.system.mem_mode = 'timing'
# instantiate configuration
m5.instantiate(root)
# simulate until program terminates
if options.maxtick:
    exit_event = m5.simulate(options.maxtick)
else:
    exit_event = m5.simulate()
print 'Exiting @ tick', m5.curTick(), 'because', exit_event.getCause()
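The SMT handling in se.py splits the --cmd and --input strings on ';' and builds one process per workload. The sketch below mirrors that splitting without needing M5. The function name `split_smt_workloads` and the plain dicts standing in for LiveProcess objects are hypothetical. Unlike the script, it also guards against fewer input files than workloads, which in se.py would raise an IndexError on `inputs[smt_idx]`.

```python
def split_smt_workloads(cmd, options="", inputs=""):
    """Mirror the ';' splitting done in se.py, using plain dicts.

    Hypothetical stand-in: each dict plays the role of one LiveProcess.
    """
    executables = cmd.split(';')
    input_files = inputs.split(';') if inputs != "" else []
    processes = []
    for idx, executable in enumerate(executables):
        proc = {"executable": executable, "cmd": executable + " " + options}
        # Only attach an input file when one was given for this thread.
        if idx < len(input_files) and input_files[idx]:
            proc["input"] = input_files[idx]
        processes.append(proc)
    return processes

procs = split_smt_workloads("hello;hello", inputs="in0.txt;")
print(len(procs))           # 2
print("input" in procs[0])  # True
print("input" in procs[1])  # False
```

With this scheme a two-thread SMT run is requested by passing both binaries in one quoted --cmd argument separated by ';'.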
[fs.py]:
import optparse, os, sys
import m5
from m5.objects import *
m5.AddToPath('../common')
from FSConfig import *
from SysPaths import *
from Benchmarks import *
parser = optparse.OptionParser()
parser.add_option("-d", "--detailed", action="store_true")
parser.add_option("-t", "--timing", action="store_true")
parser.add_option("-m", "--maxtick", type="int")
parser.add_option("--maxtime", type="float")
parser.add_option("--dual", action="store_true",
                  help="Simulate two systems attached with an ethernet link")
parser.add_option("-b", "--benchmark", action="store", type="string",
                  dest="benchmark",
                  help="Specify the benchmark to run. Available benchmarks: %s"\
                  % DefinedBenchmarks)
parser.add_option("--etherdump", action="store", type="string",
                  dest="etherdump",
                  help="Specify the filename to dump a pcap capture of the "
                       "ethernet traffic")
parser.add_option("-p", "--cpunum", type="int")
parser.add_option("-l", "--L1", action="store_true")
parser.add_option("-L", "--L2", action="store_true")
(options, args) = parser.parse_args()
if args:
    print "Error: script doesn't take any positional arguments"
    sys.exit(1)
#n_cpus =
#n_cpus = options.cpunum
n_cpus = 1
if options.cpunum:
    n_cpus = options.cpunum
if options.detailed:
    cpu_server = [ DetailedO3CPU() for i in xrange(n_cpus) ]
    cpu_client = [ DetailedO3CPU() for i in xrange(n_cpus) ]
    mem_mode = 'timing'
elif options.timing:
    cpu_server = [ TimingSimpleCPU() for i in xrange(n_cpus) ]
    cpu_client = [ TimingSimpleCPU() for i in xrange(n_cpus) ]
    mem_mode = 'timing'
else:
    cpu_server = [ AtomicSimpleCPU() for i in xrange(n_cpus) ]
    cpu_client = [ AtomicSimpleCPU() for i in xrange(n_cpus) ]
    mem_mode = 'atomic'
BaseCPU.clock = '2GHz'
if options.benchmark:
    if options.benchmark not in Benchmarks:
        print "Error benchmark %s has not been defined." % options.benchmark
        print "Valid benchmarks are: %s" % DefinedBenchmarks
        sys.exit(1)
    bm = Benchmarks[options.benchmark]
else:
    if options.dual:
        bm = [Machine(), Machine()]
    else:
        bm = [Machine()]
# --------------------
# Base L1 Cache
# ====================
class L1(BaseCache):
    latency = 1
    block_size = 64
    mshrs = 4
    tgts_per_mshr = 8
    #protocol = CoherenceProtocol(protocol='msi')
# ----------------------
# Base L2 Cache
# ----------------------
class L2(BaseCache):
    block_size = 64
    latency = 100
    mshrs = 92
    tgts_per_mshr = 16
    write_buffers = 8
if len(bm) == 2:
    # the two-machine case is not handled in these tests
    # (cpu and cpu2 are not defined in this script)
    s1 = makeLinuxAlphaSystem(mem_mode, bm[0])
    s1.cpu = cpu
    cpu.connectMemPorts(s1.membus)
    cpu.mem = s1.physmem
    s2 = makeLinuxAlphaSystem(mem_mode, bm[1])
    s2.cpu = cpu2
    cpu2.connectMemPorts(s2.membus)
    cpu2.mem = s2.physmem
    root = makeDualRoot(s1, s2, options.etherdump)
elif len(bm) == 1:
    # makeLinuxAlphaSystem (FSConfig.py) was modified to take a vector of
    # CPUs as a parameter
    root = Root(clock = '1THz',
                system = makeLinuxAlphaSystem(cpu_server, mem_mode, bm[0]))
    # l2cache & bus
    root.system.toL2Bus = Bus()
    root.system.l2c = L2(size='4MB', assoc=8)
    root.system.l2c.cpu_side = root.system.toL2Bus.port
    # connect l2c to membus
    root.system.l2c.mem_side = root.system.membus.port
    for cpu in cpu_server:
        if options.L1:
            cpu.addPrivateSplitL1Caches(L1(size = '32kB', assoc = 1),
                                        L1(size = '32kB', assoc = 4))
            cpu.mem = cpu.dcache
            cpu.connectMemPorts(root.system.membus)
        elif options.L2:
            cpu.addPrivateSplitL1Caches(L1(size = '32kB', assoc = 1),
                                        L1(size = '32kB', assoc = 4))
            cpu.mem = cpu.dcache
            # connect cpu level-1 caches to shared level-2 cache
            cpu.connectMemPorts(root.system.toL2Bus)
        else:
            cpu.mem = root.system.physmem
            cpu.connectMemPorts(root.system.membus)
else:
    print "Error I don't know how to create more than 2 systems."
    sys.exit(1)
m5.instantiate(root)
if options.maxtick:
    maxtick = options.maxtick
elif options.maxtime:
    simtime = int(options.maxtime * root.clock.value)
    print "simulating for: ", simtime
    maxtick = simtime
else:
    maxtick = -1
exit_event = m5.simulate(maxtick)
while exit_event.getCause() == "checkpoint":
    m5.checkpoint(root, "cpt.%d")
    exit_event = m5.simulate(maxtick - m5.curTick())
print 'Exiting @ cycle', m5.curTick(), 'because', exit_event.getCause()
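The --maxtime option in fs.py is converted to ticks by multiplying by root.clock.value. The sketch below reproduces that arithmetic on its own. It assumes that root.clock.value evaluates to the root frequency in Hz (10^12 for the '1THz' set above), which I have not verified against M5. The name `seconds_to_ticks` is hypothetical.

```python
def seconds_to_ticks(maxtime_seconds, root_clock_hz=1e12):
    """Convert a --maxtime value in seconds to a tick count.

    Assumption: root_clock_hz stands in for root.clock.value, taken here
    to be the root frequency in Hz (1 THz in fs.py).
    """
    # int() truncates toward zero, as in the script.
    return int(maxtime_seconds * root_clock_hz)

print(seconds_to_ticks(0.5))   # 500000000000
print(seconds_to_ticks(0.25))  # 250000000000
```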
Thanks and Best Regards!
xiaojun.chen
2006-11-05