Edith,

Although I have moved to running Splash2 in FS mode, I still have the SE version of v1-splash-alpha.tgz working for the simple CPUs. I have included in this email a configuration script to run the benchmarks. You will have to point the rootdir parameter to the precompiled splash2 directory you downloaded. The other attached file is a diff that enables the threads to be scheduled properly. However, it currently breaks any SMT cores in the system, and I have not verified that it works with the O3 CPU.

So in short, if you want to run with the simple CPUs, you can use the attached diff and configuration file along with the precompiled splash2 benchmarks from the m5 website to run them in SE mode. If you want O3 CPU support, you may have to make the proper changes to the initial thread state of each CPU, as was done for the simple CPU in the diff.
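
To illustrate the idea behind the diff, here is a toy Python model. It is not M5 code and it simplifies heavily: every CPU's thread context starts out unallocated, and a newly created thread claims the first unallocated context rather than a suspended one. The names Status, ToyContext and grab_free_context are made up for this sketch.

```python
from enum import Enum


class Status(Enum):
    # Illustrative stand-ins for the ThreadContext states named in the diff.
    UNALLOCATED = 0
    ACTIVE = 1
    SUSPENDED = 2


class ToyContext:
    """Hypothetical stand-in for one CPU's thread context."""

    def __init__(self, cpu_id, status=Status.UNALLOCATED):
        self.cpu_id = cpu_id
        self.status = status


def grab_free_context(contexts):
    """Activate and return the first unallocated context, or None."""
    for tc in contexts:
        if tc.status == Status.UNALLOCATED:
            tc.status = Status.ACTIVE
            return tc
    return None


# Four CPUs: CPU 0 runs the initial thread, CPU 2 holds a suspended thread.
contexts = [ToyContext(i) for i in range(4)]
contexts[0].status = Status.ACTIVE
contexts[2].status = Status.SUSPENDED

first = grab_free_context(contexts)
second = grab_free_context(contexts)
third = grab_free_context(contexts)
```

In this model the suspended context on CPU 2 is never handed to a new thread, which is the behavior the tru64.hh change is after.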

Please let me know if you have any difficulty getting them to run. The config file has several preset flags you can use to set the sizes and latencies of the L1 and L2 caches, as well as the speed and number of cores. The diff I have given here is from the beta3 release, which should be available shortly. It should work with beta2 as well, although the line numbers may not line up; I haven't tried it.
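
As a rough sketch of how those flags combine, the helper below assembles an argument list for run.py. The option names are the ones defined in the attached script; build_args itself and the example values are only illustrative assumptions.

```python
def build_args(benchmark, numcpus, rootdir, timing=False, **opts):
    """Assemble run.py arguments (hypothetical helper).

    Keys in opts must be long option names from the attached script.
    """
    allowed = {"l1size", "l1latency", "l2size", "l2latency",
               "frequency", "protocol"}
    unknown = set(opts) - allowed
    if unknown:
        raise ValueError("unknown option(s): " + ", ".join(sorted(unknown)))

    args = ["-n", str(numcpus),
            "--rootdir=" + rootdir,
            "--benchmark=" + benchmark]
    if timing:
        args.append("--timing")
    # Sort so the generated command line is stable between runs.
    for name in sorted(opts):
        args.append("--%s=%s" % (name, opts[name]))
    return args


# Example values only; pick sizes that match your experiment.
example = build_args("FMM", 2, "./splash2/codes", timing=True,
                     l1size="64kB", l2size="1MB")
print("build/ALPHA_SE/m5.debug run.py " + " ".join(example))
```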

example command line:

%: build/ALPHA_SE/m5.debug run.py -n2 --rootdir="./splash2/codes" --benchmark=FMM

Of course you will need to point the rootdir at your version.
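
A quick way to check that rootdir is right before launching M5 is to look for the files the script expects. The sketch below does this for FMM only, using the paths from the attached script; fmm_paths and missing_files are hypothetical helper names.

```python
import os


def fmm_paths(rootdir, numcpus):
    """Return the (executable, input) paths the run script uses for FMM."""
    exe = os.path.join(rootdir, "apps", "fmm", "FMM")
    if numcpus == 1:
        name = "input.2048"
    else:
        name = "input.2048.p%d" % numcpus
    return exe, os.path.join(rootdir, "apps", "fmm", "inputs", name)


def missing_files(paths):
    """Return the subset of paths that are not existing regular files."""
    return [p for p in paths if not os.path.isfile(p)]


# Example: report what is missing for a 2-CPU FMM run.
for path in missing_files(fmm_paths("./splash2/codes", 2)):
    print("missing: " + path)
```

If the input file for a given CPU count is reported missing, that count is probably not covered by the precompiled inputs.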

-Ron



On Sun, 22 Apr 2007, Edith Hand wrote:

Hello All,

Has anyone had success running the Splash2 benchmarks on M5 in SE mode?
I've read through a lot of the posts on this type of subject and it looks
like maybe as of Sep/Oct last year, it's best to run in FS mode because of
the lack of pthreads and PARMACS support in SE.  Is that still the case?

Here's what I have tried so far.  I have built M5 on my Linux box and I'm
running Alpha Linux in SE.  I have cross-compiled the Splash2 benchmarks for
Alpha.  I used the configs/example/se.py as a starting point to call the
various benchmarks.  The first problem I ran into was that _sysctl wasn't
emulated.  The second problem I ran into (with benchmarks that don't appear
to call _sysctl or if I tell the emulation to ignore sysctl) is a page table
fault.

I also tried running the pre-compiled Splash2 benchmarks from the m5 dist
page (v1-splash-alpha.tgz), but I'm having the same problem with those as
with the benchmarks that I cross compiled myself.

Any hints would be greatly appreciated.  I'd love to be able to run the
benchmarks in SE, but if that's not possible, I suppose I could brave the FS
world...

Regards,
-Edith
# Splash2 Run Script
#

import m5
from m5.objects import *
import os, optparse, sys
m5.AddToPath('../common')

# --------------------
# Define Command Line Options
# ====================

parser = optparse.OptionParser()

parser.add_option("-d", "--detailed", action="store_true")
parser.add_option("-t", "--timing", action="store_true")
parser.add_option("-m", "--maxtick", type="int")
parser.add_option("-n", "--numcpus",
                  help="Number of cpus in total", type="int")
parser.add_option("-f", "--frequency",
                  default = "1GHz",
                  help="Frequency of each CPU")
parser.add_option("-p", "--protocol",
                  default="moesi",
                  help="The coherence protocol to use for the L1s "
                       "(e.g. MOESI, MOSI)")
parser.add_option("--l1size",
                  default = "32kB")
parser.add_option("--l1latency",
                  default = 1)
parser.add_option("--l2size",
                  default = "256kB")
parser.add_option("--l2latency",
                  default = 10)
parser.add_option("--rootdir",
                  help="Root directory of Splash2",
                  default="/dist/splash2/codes")
parser.add_option("-b", "--benchmark",
                  help="Splash 2 benchmark to run")

(options, args) = parser.parse_args()

if args:
    print "Error: script doesn't take any positional arguments"
    sys.exit(1)

if not options.numcpus:
    print "Specify the number of cpus with -n"
    sys.exit(1)
    
# --------------------
# Define Splash2 Benchmarks
# ====================
class Cholesky(LiveProcess):
    cwd = options.rootdir + '/kernels/cholesky'
    executable = options.rootdir + '/kernels/cholesky/CHOLESKY'
    cmd = 'CHOLESKY -p' + str(options.numcpus) + ' '\
          + options.rootdir + '/kernels/cholesky/inputs/tk23.O'

class FFT(LiveProcess):
    cwd = options.rootdir + '/kernels/fft'
    executable = options.rootdir + '/kernels/fft/FFT'
    cmd = 'FFT -p' + str(options.numcpus) + ' -m18'

class LU_contig(LiveProcess):
    executable = options.rootdir + '/kernels/lu/contiguous_blocks/LU'
    cmd = 'LU -p' + str(options.numcpus)
    cwd = options.rootdir + '/kernels/lu/contiguous_blocks'

class LU_noncontig(LiveProcess):
    executable = options.rootdir + '/kernels/lu/non_contiguous_blocks/LU'
    cmd = 'LU -p' + str(options.numcpus)
    cwd = options.rootdir + '/kernels/lu/non_contiguous_blocks'

class Radix(LiveProcess):
    executable = options.rootdir + '/kernels/radix/RADIX'
    cmd = 'RADIX -n524288 -p' + str(options.numcpus)
    cwd = options.rootdir + '/kernels/radix'

class Barnes(LiveProcess):
    executable = options.rootdir + '/apps/barnes/BARNES'
    cmd = 'BARNES'
    input = options.rootdir + '/apps/barnes/input.p' + str(options.numcpus)
    cwd = options.rootdir + '/apps/barnes'

class FMM(LiveProcess):
    executable = options.rootdir + '/apps/fmm/FMM'
    cmd = 'FMM'
    if str(options.numcpus) == '1':
        input = options.rootdir + '/apps/fmm/inputs/input.2048'
    else:
        input = options.rootdir + '/apps/fmm/inputs/input.2048.p' + \
                str(options.numcpus)
    cwd = options.rootdir + '/apps/fmm'

class Ocean_contig(LiveProcess):
    executable = options.rootdir + '/apps/ocean/contiguous_partitions/OCEAN'
    cmd = 'OCEAN -p' + str(options.numcpus)
    cwd = options.rootdir + '/apps/ocean/contiguous_partitions'

class Ocean_noncontig(LiveProcess):
    executable = options.rootdir + '/apps/ocean/non_contiguous_partitions/OCEAN'
    cmd = 'OCEAN -p' + str(options.numcpus)
    cwd = options.rootdir + '/apps/ocean/non_contiguous_partitions'

class Raytrace(LiveProcess):
    executable = options.rootdir + '/apps/raytrace/RAYTRACE'
    cmd = 'RAYTRACE -p' + str(options.numcpus) + ' ' \
          + options.rootdir + '/apps/raytrace/inputs/teapot.env'
    cwd = options.rootdir + '/apps/raytrace'

class Water_nsquared(LiveProcess):
    executable = options.rootdir + '/apps/water-nsquared/WATER-NSQUARED'
    cmd = 'WATER-NSQUARED'
    if options.numcpus==1:
        input = options.rootdir + '/apps/water-nsquared/input'
    else:
        input = options.rootdir + '/apps/water-nsquared/input.p' + \
                str(options.numcpus)
    cwd = options.rootdir + '/apps/water-nsquared'
        
class Water_spatial(LiveProcess):
    executable = options.rootdir + '/apps/water-spatial/WATER-SPATIAL'
    cmd = 'WATER-SPATIAL'
    if options.numcpus==1:
        input = options.rootdir + '/apps/water-spatial/input'
    else:
        input = options.rootdir + '/apps/water-spatial/input.p' + \
                str(options.numcpus)
    cwd = options.rootdir + '/apps/water-spatial'
        
# --------------------
# Base L1 Cache Definition
# ====================

class L1(BaseCache):
    latency = options.l1latency
    block_size = 64
    mshrs = 12
    tgts_per_mshr = 8
    protocol = CoherenceProtocol(protocol=options.protocol)

# ----------------------
# Base L2 Cache Definition
# ----------------------

class L2(BaseCache):
    block_size = 64
    latency = options.l2latency
    mshrs = 92
    tgts_per_mshr = 16
    write_buffers = 8

# ----------------------
# Define the cpus
# ----------------------

busFrequency = Frequency(options.frequency)

if options.timing:
    cpus = [TimingSimpleCPU(cpu_id = i,
                            clock=options.frequency)
            for i in xrange(options.numcpus)]
elif options.detailed:
    cpus = [DerivO3CPU(cpu_id = i,
                       clock=options.frequency)
            for i in xrange(options.numcpus)]
else:
    cpus = [AtomicSimpleCPU(cpu_id = i,
                            clock=options.frequency)
            for i in xrange(options.numcpus)]

# ----------------------
# Create a system, and add system wide objects
# ----------------------        
system = System(cpu = cpus, physmem = PhysicalMemory(),
                membus = Bus(clock = busFrequency))

system.toL2bus = Bus(clock = busFrequency)
system.l2 = L2(size = options.l2size, assoc = 8)

# ----------------------
# Connect the L2 cache and memory together
# ----------------------

system.physmem.port = system.membus.port
system.l2.cpu_side = system.toL2bus.port
system.l2.mem_side = system.membus.port

# ----------------------
# Connect the L2 cache and clusters together
# ----------------------
for cpu in cpus:
    cpu.addPrivateSplitL1Caches(L1(size = options.l1size, assoc = 1),
                                L1(size = options.l1size, assoc = 4))
    cpu.mem = cpu.dcache
    # connect cpu level-1 caches to shared level-2 cache
    cpu.connectMemPorts(system.toL2bus)


# ----------------------
# Define the root
# ----------------------

root = Root(system = system)

# --------------------
# Pick the correct Splash2 Benchmarks
# ====================
if options.benchmark == 'Cholesky':
    root.workload = Cholesky()
elif options.benchmark == 'FFT':
    root.workload = FFT()
elif options.benchmark == 'LUContig':
    root.workload = LU_contig()
elif options.benchmark == 'LUNoncontig':
    root.workload = LU_noncontig()
elif options.benchmark == 'Radix':
    root.workload = Radix()
elif options.benchmark == 'Barnes':
    root.workload = Barnes()
elif options.benchmark == 'FMM':
    root.workload = FMM()
elif options.benchmark == 'OceanContig':
    root.workload = Ocean_contig()
elif options.benchmark == 'OceanNoncontig':
    root.workload = Ocean_noncontig()
elif options.benchmark == 'Raytrace':
    root.workload = Raytrace()
elif options.benchmark == 'WaterNSquared':
    root.workload = Water_nsquared()
elif options.benchmark == 'WaterSpatial':
    root.workload = Water_spatial()
else:
    panic("The --benchmark option was set to something" \
          +" improper.\nUse Cholesky, FFT, LUContig, LUNoncontig, Radix" \
          +", Barnes, FMM, OceanContig,\nOceanNoncontig, Raytrace," \
          +" WaterNSquared, or WaterSpatial\n")

# --------------------
# Assign the workload to the cpus
# ====================

for cpu in cpus:
    cpu.workload = root.workload

# ----------------------
# Run the simulation
# ----------------------

if options.timing or options.detailed:
    root.system.mem_mode = 'timing'

# instantiate configuration
m5.instantiate(root)

# simulate until program terminates
if options.maxtick:
    exit_event = m5.simulate(options.maxtick)
else:
    exit_event = m5.simulate(m5.MaxTick)

print 'Exiting @ tick', m5.curTick(), 'because', exit_event.getCause()

# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
#   2007/04/16 11:31:54-04:00 [EMAIL PROTECTED] 
#   Fixes for splash, may conflict with Korey's SMT work and doesn't support O3 cpu yet.
# 
# src/kern/tru64/tru64.hh
#   2007/04/16 11:31:53-04:00 [EMAIL PROTECTED] +1 -1
#   When looking for an open cpu to assign threads, look for an unallocated one, not a suspended one.
# 
# src/cpu/simple_thread.cc
#   2007/04/16 11:31:53-04:00 [EMAIL PROTECTED] +4 -4
#   Wait for a thread to be assigned to activate the cpu
# 
# src/cpu/simple/base.cc
#   2007/04/16 11:31:53-04:00 [EMAIL PROTECTED] +1 -1
#   CPUs should start as unallocated, not suspended
# 
diff -Nru a/src/cpu/simple/base.cc b/src/cpu/simple/base.cc
--- a/src/cpu/simple/base.cc    2007-04-23 15:57:04 -04:00
+++ b/src/cpu/simple/base.cc    2007-04-23 15:57:04 -04:00
@@ -79,7 +79,7 @@
             /* asid */ 0);
 #endif // !FULL_SYSTEM
 
-    thread->setStatus(ThreadContext::Suspended);
+    thread->setStatus(ThreadContext::Unallocated);
 
     tc = thread->getTC();
 
diff -Nru a/src/cpu/simple_thread.cc b/src/cpu/simple_thread.cc
--- a/src/cpu/simple_thread.cc  2007-04-23 15:57:04 -04:00
+++ b/src/cpu/simple_thread.cc  2007-04-23 15:57:04 -04:00
@@ -221,10 +221,10 @@
     
     lastActivate = curTick;
 
-    if (status() == ThreadContext::Unallocated) {
-       cpu->activateWhenReady(tid);
-       return;
-    }
+//    if (status() == ThreadContext::Unallocated) {
+//     cpu->activateWhenReady(tid);
+//     return;
+//   }
 
     _status = ThreadContext::Active;
 
diff -Nru a/src/kern/tru64/tru64.hh b/src/kern/tru64/tru64.hh
--- a/src/kern/tru64/tru64.hh   2007-04-23 15:57:04 -04:00
+++ b/src/kern/tru64/tru64.hh   2007-04-23 15:57:04 -04:00
@@ -792,7 +792,7 @@
            for (int i = 0; i < process->numCpus(); ++i) {
                ThreadContext *tc = process->threadContexts[i];
 
-                if (tc->status() == ThreadContext::Suspended) {
+                if (tc->status() == ThreadContext::Unallocated) {
                    // inactive context... grab it
                    init_thread_context(tc, attrp, uniq_val);
 
