Hi,
   Recently,I have made a serials of tests to get know about the availability 
of M5.
   These tests mainly focus on the CMP/SMT simulating ability of M5. Some 
configurations are passed and other are failed. I don't know what is the reason 
about those had failed. I wish that the authors, Nathan or Steven or Ali or..., 
can give some suggestion to those failed cases.
   Now,the test-results are as bellow:

[Notations]
    C:cores
    T:Threads

[SE mode]:
  [DetailedCPU]:
     [none-cache]: 
      1C4T:                             OK;
      8C:                               OK;
      16C:                              ERROR
         build/ALPHA_SE/cpu/base_dyn_inst_impl.hh:125: void 
BaseDynInst<Impl>::initVars() [with Impl = O3CPUImpl]: Assertion `instcount <= 
1500' failed.
     [Just L1-cache]:
        SMT-enable:     ERROR
          m5.opt: build/ALPHA_SE/mem/request.hh:229: int 
Request::getThreadNum(): Assertion `validCpuAndThreadNums' failed.
           
        4C:                     OK
        8C:                     OK
        16C:                    ERROR
          build/ALPHA_SE/cpu/base_dyn_inst_impl.hh:125: void 
BaseDynInst<Impl>::initVars() [with Impl = O3CPUImpl]: Assertion `instcount <= 
1500' failed.
     [L2-cache(shared)]:ERROR
       Exiting @ tick 4611686018427387904 because simulate() limit reached
       After running it , the simulator reached the max tick 0x4000000000000000 
very much quickly and exited.
  [TimingSimpleCPU]:
                [SMT-enable]:   ERROR
                                KeyError: 'system.cpu2.workload0 
system.cpu2.workload1'
                                panic: resolveSimObject: failure on call to 
Python for system.cpu2.workload0 system.cpu2.workload1
                [none-cache]:
                [Just L1-cache]:
                        32C:            OK;
                        128C:           OK;
                [L2-cache(shared)]:
                        4C:             OK;
                        32C:            OK;
AtomicSimpleCPU:
                [none-cache]:
                        128C:           OK;
                [Just L1-cach]e:
                        32C:            OK;
                        128C:      OK; 
                [L2-cache(shared)]:
                        2C:         ERROR;
                                Exiting @ tick 15583 because target called 
exit()       
                                

[FS Mode]:
  [DetailedCPU] the same error information appearenced in all cases
                [none-cache]:           ERROR       
                [L1-Cache]:             ERROR
                [L2-Cache]:             ERROR
                        warn: Entering event queue @ 0.  Starting simulation...
                        warn: cycle 490: Quiesce instruction encountered, 
halting fetch!
                        warn: cycle 490: Quiesce instruction encountered, 
halting fetch!
                        warn: cycle 532: Quiesce instruction encountered, 
halting fetch!
                        warn: cycle 532: Quiesce instruction encountered, 
halting fetch!
                        Segmentation fault
  [TimingSimpleCPU]:
                [none-cache]:
        2C:             OK
        4C:             after running it , it stopped after a while(2-3 
minutes), and the lastest outputted line is always "NET: Registered protocol 
family 2", and the part console output is as bellow:
                PIIX4: IDE controller at PCI slot 0000:00:00.0
                                PIIX4: chipset revision 0
                                PIIX4: 100% native mode on irq 31
                ide0: BM-DMA at 0x8400-0x8407, BIOS settings: hda:DMA, hdb:DMA
                ide1: BM-DMA at 0x8408-0x840f, BIOS settings: hdc:DMA, hdd:DMA
                                hda: M5 IDE Disk, ATA DISK drive
                                hdb: M5 IDE Disk, ATA DISK drive
                                ide0 at 0x8410-0x8417,0x8422 on irq 31
                                hda: max request size: 128KiB
                                hda: 524160 sectors (268 MB), CHS=520/16/63, 
UDMA(33)
                                hda: cache flushes not supported
                                hda: hda1
                                hdb: max request size: 128KiB
                                hdb: 4177920 sectors (2139 MB), CHS=4144/16/63, 
UDMA(33)
                                hdb: cache flushes not supported
                                hdb: unknown partition table
                                mice: PS/2 mouse device common for all mice
                                NET: Registered protocol family 2
                [L1-Cache]:
                        1C:     ERROR
                                Listening for console connection on port 3456
                                0: system.remote_gdb.listener: listening for 
remote gdb #0 on port 7000
                                0: system.remote_gdb.listener: listening for 
remote gdb #1 on port 7001
                                warn: Entering event queue @ 0.  Starting 
simulation...
                                m5.opt: 
build/ALPHA_FS/mem/cache/miss/mshr_queue.cc:216: void 
MSHRQueue::markInService(MSHR*): Assertion `mshr->getNumTargets() == 0' failed.
                                Program aborted at cycle 330
                                Aborted
                [L2-Cache]:
                        1C:             ERROR:
                                m5.opt: 
build/ALPHA_FS/mem/cache/tags/cache_tags_impl.hh:178: typename CacheTags<Tags, 
Compression>::BlkType* CacheTags<Tags, Compression>::handleFill(typename 
Tags::BlkType*, Packet*&, unsigned int, PacketList&, Packet*) [with Tags = LRU, 
Compression = NullCompression]: Assertion `tmp_blk == blk' failed.
                                Program aborted at cycle 806
                                Aborted

  [AtomicSimpleCPU]:
                [none-cache]:
                        2C:                     OK
                        4C:as same as TimingSimpleCPU
                [L1-Cache]:     ERROR
                        warn: Entering event queue @ 0.  Starting simulation...
                        panic: unimplemented
                         @ cycle 19335
                        [doFunctionalAccess:build/ALPHA_FS/mem/physical.cc, 
line 136]
                        Program aborted at cycle 19335
                        Aborted
                [L2-Cache]:    ERROR
                        warn: Entering event queue @ 0.  Starting simulation...
                        m5.opt: build/ALPHA_FS/mem/request.hh:229: int 
Request::getThreadNum(): Assertion `validCpuAndThreadNums' failed.
                        Program aborted at cycle 19335
                        Aborted


The configuration files uses in testing are as bellow:
[se.py]:


# Simple test script
#
# "m5 test.py"

import m5
from m5.objects import *
import os, optparse, sys
m5.AddToPath('../common')
from FullO3Config import *

parser = optparse.OptionParser()

parser.add_option("-c", "--cmd",
                  default="../../tests/test-progs/hello/bin/alpha/linux/hello",
                  help="The binary to run in syscall emulation mode.")
parser.add_option("-o", "--options", default="",
                  help="The options to pass to the binary, use \" \" around the 
entire\
                        string.")
parser.add_option("-i", "--input", default="",
                  help="A file of input to give to the binary.")
parser.add_option("-d", "--detailed", action="store_true")
parser.add_option("-t", "--timing", action="store_true")
parser.add_option("-m", "--maxtick", type="int")
parser.add_option("-p", "--cpunum", type="int")
parser.add_option("-l", "--L1", action="store_true")
parser.add_option("-L", "--L2", action="store_true")

(options, args) = parser.parse_args()

if args:
    print "Error: script doesn't take any positional arguments"
    sys.exit(1)

n_cpus = 1
if options.cpunum:
    n_cpus = options.cpunum

if options.timing:
    cpus = [ TimingSimpleCPU() for i in xrange(n_cpus) ]
elif options.detailed:
    cpus = [ DetailedO3CPU() for i in xrange(n_cpus) ]
else:
    cpus = [ AtomicSimpleCPU() for i in xrange(n_cpus) ]

# --------------------
# Base L1 Cache
# ====================

class L1(BaseCache):
    latency = 1
    block_size = 64 
    mshrs = 4
    tgts_per_mshr = 8
    #protocol = CoherenceProtocol(protocol='msi')

# ----------------------
# Base L2 Cache
# ----------------------
    
class L2(BaseCache):
    block_size = 64 
    latency = 100
    mshrs = 92
    tgts_per_mshr = 16
    write_buffers = 8

# system simulated
system = System(cpu = cpus, physmem = PhysicalMemory(), membus = Bus()) 

# l2cache & bus
system.toL2Bus = Bus()
system.l2c = L2(size='4MB', assoc=8)
system.l2c.cpu_side = system.toL2Bus.port

# connect l2c to membus
system.l2c.mem_side = system.membus.port

for cpu in cpus:
    if options.L1:
                    cpu.addPrivateSplitL1Caches(L1(size = '32kB', assoc = 1),
                                                L1(size = '32kB', assoc = 4))
                    cpu.mem = cpu.dcache
                    cpu.connectMemPorts(system.membus)
    elif options.L2:
        cpu.addPrivateSplitL1Caches(L1(size = '32kB', assoc = 1),
                                                L1(size = '32kB', assoc = 4))
                    cpu.mem = cpu.dcache
                    
                    # connect cpu level-1 caches to shared level-2 cache 
                    cpu.connectMemPorts(system.toL2Bus)
                else:
                    cpu.mem = system.physmem
                    cpu.connectMemPorts(system.membus)
                
                process = LiveProcess()
                process.executable = options.cmd
                process.cmd = options.cmd + " " + options.options
                if options.input != "":
                    process.input = options.input
                
                if options.detailed:
                    #check for SMT workload
                    workloads = options.cmd.split(';')
                    if len(workloads) > 1:
                        process = []
                        smt_idx = 0
                        inputs = []
                        
                        if options.input != "":
                            inputs = options.input.split(';')
                        
                        for wrkld in workloads:
                            smt_process = LiveProcess()
                            smt_process.executable = wrkld
                            smt_process.cmd = wrkld + " " + options.options
                            if inputs and inputs[smt_idx]:
                                smt_process.input = inputs[smt_idx]
                            process += [smt_process, ]
                            smt_idx += 1
    cpu.workload = process
    
system.physmem.port = system.membus.port

root = Root(system = system)

if options.timing or options.detailed:
    root.system.mem_mode = 'timing'

# instantiate configuration
m5.instantiate(root)

# simulate until program terminates
if options.maxtick:
    exit_event = m5.simulate(options.maxtick)
else:
    exit_event = m5.simulate()

print 'Exiting @ tick', m5.curTick(), 'because', exit_event.getCause()




[fs.py]:


import optparse, os, sys

import m5
from m5.objects import *
m5.AddToPath('../common')
from FSConfig import *
from SysPaths import *
from Benchmarks import *

parser = optparse.OptionParser()

parser.add_option("-d", "--detailed", action="store_true")
parser.add_option("-t", "--timing", action="store_true")
parser.add_option("-m", "--maxtick", type="int")
parser.add_option("--maxtime", type="float")
parser.add_option("--dual", action="store_true",
                  help="Simulate two systems attached with an ethernet link")
parser.add_option("-b", "--benchmark", action="store", type="string",
                  dest="benchmark",
                  help="Specify the benchmark to run. Available benchmarks: %s"\
                          % DefinedBenchmarks)
parser.add_option("--etherdump", action="store", type="string", 
dest="etherdump",
                  help="Specify the filename to dump a pcap capture of the 
ethernet"
                  "traffic")
parser.add_option("-p", "--cpunum", type="int")
parser.add_option("-l", "--L1", action="store_true")
parser.add_option("-L", "--L2", action="store_true")

(options, args) = parser.parse_args()

if args:
    print "Error: script doesn't take any positional arguments"
    sys.exit(1)
#n_cpus = 
#n_cpus = options.cpunum
n_cpus = 1
if options.cpunum:
    n_cpus = options.cpunum
 
if options.detailed:
    cpu_server = [ DetailedO3CPU() for i in xrange(n_cpus) ]
    cpu_client = [ DetailedO3CPU() for i in xrange(n_cpus) ]
    mem_mode = 'timing'
elif options.timing:
    cpu_server = [ TimingSimpleCPU() for i in xrange(n_cpus) ]
    cpu_client = [ TimingSimpleCPU() for i in xrange(n_cpus) ]
    mem_mode = 'timing'
else:
    cpu_server = [ AtomicSimpleCPU() for i in xrange(n_cpus) ]
    cpu_client = [ AtomicSimpleCPU() for i in xrange(n_cpus) ]

    mem_mode = 'atomic'

BaseCPU.clock = '2GHz'

if options.benchmark:
    if options.benchmark not in Benchmarks:
        print "Error benchmark %s has not been defined." % options.benchmark
        print "Valid benchmarks are: %s" % DefinedBenchmarks
        sys.exit(1)
    
    bm = Benchmarks[options.benchmark]
else:
    if options.dual:
        bm = [Machine(), Machine()]
    else:
        bm = [Machine()]

# --------------------
# Base L1 Cache
# ====================

class L1(BaseCache):
    latency = 1
    block_size = 64 
    mshrs = 4
    tgts_per_mshr = 8
    #protocol = CoherenceProtocol(protocol='msi')

# ----------------------
# Base L2 Cache
# ----------------------
    
class L2(BaseCache):
    block_size = 64 
    latency = 100
    mshrs = 92
    tgts_per_mshr = 16
    write_buffers = 8
    
if len(bm) == 2:
#not consider two machines case
    s1 = makeLinuxAlphaSystem(mem_mode, bm[0])
    s1.cpu = cpu
    cpu.connectMemPorts(s1.membus)
    cpu.mem = s1.physmem
    s2 = makeLinuxAlphaSystem(mem_mode, bm[1])
    s2.cpu = cpu2
    cpu2.connectMemPorts(s2.membus)
    cpu2.mem = s2.physmem
    root = makeDualRoot(s1, s2, options.etherdump)
elif len(bm) == 1:
    # modify the makeLinuxAlphaSystem func take a cpus vector as a 
parameter[FSConfig.py]
    root = Root(clock = '1THz',
                system = makeLinuxAlphaSystem(cpu_server, mem_mode, bm[0]))
    # l2cache & bus
    root.system.toL2Bus = Bus()
    root.system.l2c = L2(size='4MB', assoc=8)
    root.system.l2c.cpu_side = root.system.toL2Bus.port

    # connect l2c to membus
    root.system.l2c.mem_side = root.system.membus.port

    for cpu in cpu_server:
        if options.L1:
            cpu.addPrivateSplitL1Caches(L1(size = '32kB', assoc = 1),
                                        L1(size = '32kB', assoc = 4))
            cpu.mem = cpu.dcache
            cpu.connectMemPorts(root.system.membus)
        elif options.L2:
            cpu.addPrivateSplitL1Caches(L1(size = '32kB', assoc = 1),
                                        L1(size = '32kB', assoc = 4))
            cpu.mem = cpu.dcache
                    
            # connect cpu level-1 caches to shared level-2 cache 
            cpu.connectMemPorts(root.system.toL2Bus)
        else:
            cpu.mem = root.system.physmem
            cpu.connectMemPorts(root.system.membus)
            
else:
    print "Error I don't know how to create more than 2 systems."
    sys.exit(1)

m5.instantiate(root)

if options.maxtick:
    maxtick = options.maxtick
elif options.maxtime:
    simtime = int(options.maxtime * root.clock.value)
    print "simulating for: ", simtime
    maxtick = simtime
else:
    maxtick = -1

exit_event = m5.simulate(maxtick)

while exit_event.getCause() == "checkpoint":
    m5.checkpoint(root, "cpt.%d")
    exit_event = m5.simulate(maxtick - m5.curTick())    

print 'Exiting @ cycle', m5.curTick(), 'because', exit_event.getCause()

Thanks and Best Regards!
                                

        xiaojun.chen
[EMAIL PROTECTED]
          2006-11-05
_______________________________________________
m5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/m5-users

Reply via email to