I profiled M5 but surprisingly I did not find any mention of the function
findTagInSet() in the output obtained from gprof. Does it matter what
coherence protocol is in use? I carried out the following steps:
1. Compiled m5.prof using
scons -j 6 USE_MYSQL=False RUBY=True build/ALPHA_FS/m5.prof
2. Ran blackscholes benchmark using the instructions specified in the
technical report 'Running PARSEC v2.1 in the M5 Simulator' by Gebhart et
al. Specifically, I ran the following command:
./build/ALPHA_FS/m5.prof ./configs/example/fs.py -n 1 \
    --script=/scratch/nilay/GEM5/system/scripts/blackscholes.rcS --detailed \
    --caches --l2cache -F 5000000000
These are the contents of blackscholes.rcS:
#!/bin/sh
# File to run the blackscholes benchmark
cd /parsec/install/bin
/sbin/m5 switchcpu
/sbin/m5 dumpstats
/sbin/m5 resetstats
./blackscholes 64 /parsec/install/inputs/blackscholes/in_64K.txt \
    /parsec/install/inputs/blackscholes/prices.txt
echo "Done :D"
/sbin/m5 exit
/sbin/m5 exit
Thanks
Nilay
On Tue, 2 Nov 2010, Steve Reinhardt wrote:
I just compiled m5.prof and ran it (forgot what workload I ran on it,
probably one of the parsec benchmarks; it probably doesn't matter a lot).
If you've never used gprof before, this is a great time to learn!
Steve
On Tue, Nov 2, 2010 at 10:40 AM, Nilay Vaish <[email protected]> wrote:
I am looking at possible performance optimizations in Ruby. As you can see
from the mail excerpt below, the function findTagInSet() consumes lots
of time. I am thinking of making the changes as suggested by Brad. I have
questions for m5-dev members, in particular for Derek and Steve. How did you
arrive at the conclusion that findTagInSet() is a problem? What benchmarks
and profiling tools did you use?
Thanks
Nilay
---------- Forwarded message ----------
Date: Mon, 20 Sep 2010 22:57:39 -0500
From: "Beckmann, Brad" <[email protected]>
To: 'Nilay Vaish' <[email protected]>
Cc: Daniel Gibson <[email protected]>
Subject: RE: Performance Optimizations in Ruby
== CacheMemory findTagInSet ==
Recently Steve mentioned to me that a huge percentage of time was being
spent in CacheMemory's findTagInSet function. Right now that function uses
a hashmap across the entire cache to map tags to way ids. I think Derek
recently implemented this change in hopes of improving performance, and it
might have for small caches, but I don't think it helps for larger caches.
There are a couple of possible solutions: per-set hashmaps, or reordering
the ways so that the MRU blocks are at the lower ids and using a loop. I
think we should investigate both solutions and see which is better.
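
The two alternatives mentioned above can be sketched roughly as follows.
This is only an illustration under stated assumptions, not Ruby's actual
CacheMemory code: the class names PerSetHashCache and MruOrderedCache, the
Line struct, the install() and mruWay() helpers, and the kNoWay sentinel
are all hypothetical. The MRU variant also keeps way ids stable and
reorders a small per-set index array instead of physically moving blocks,
which is one possible reading of "reordering the ways".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <unordered_map>
#include <vector>

// Hypothetical types for illustration only.
using Tag = std::uint64_t;
constexpr int kNoWay = -1;

struct Line {
    Tag tag = 0;
    bool valid = false;
};

// Option 1: one small hash map per set, mapping tag -> way id.
class PerSetHashCache {
  public:
    PerSetHashCache(std::size_t numSets, std::size_t assoc)
        : lines_(numSets, std::vector<Line>(assoc)), maps_(numSets) {
        assert(numSets > 0 && assoc > 0);
    }

    // Place `tag` in `way` of `set`, dropping whatever tag was there.
    void install(std::size_t set, int way, Tag tag) {
        Line &line = lines_[set][way];
        if (line.valid)
            maps_[set].erase(line.tag);
        line.tag = tag;
        line.valid = true;
        maps_[set][tag] = way;
    }

    // Look up only the map that belongs to this set.
    int findTagInSet(std::size_t set, Tag tag) const {
        auto it = maps_[set].find(tag);
        return it == maps_[set].end() ? kNoWay : it->second;
    }

  private:
    std::vector<std::vector<Line>> lines_;
    std::vector<std::unordered_map<Tag, int>> maps_;
};

// Option 2: linear scan over the ways of a set in MRU-first order.
class MruOrderedCache {
  public:
    MruOrderedCache(std::size_t numSets, std::size_t assoc)
        : lines_(numSets, std::vector<Line>(assoc)),
          order_(numSets, std::vector<int>(assoc)) {
        assert(numSets > 0 && assoc > 0);
        for (auto &o : order_)
            std::iota(o.begin(), o.end(), 0);  // initial order: 0, 1, 2, ...
    }

    void install(std::size_t set, int way, Tag tag) {
        lines_[set][way].tag = tag;
        lines_[set][way].valid = true;
        auto &o = order_[set];
        moveToFront(o, std::find(o.begin(), o.end(), way));
    }

    // Scan MRU-first; on a hit, promote the way so the next scan finds it sooner.
    int findTagInSet(std::size_t set, Tag tag) {
        auto &o = order_[set];
        for (auto it = o.begin(); it != o.end(); ++it) {
            const Line &line = lines_[set][*it];
            if (line.valid && line.tag == tag) {
                int way = *it;
                moveToFront(o, it);
                return way;
            }
        }
        return kNoWay;
    }

    // Way id currently at the MRU position of `set`.
    int mruWay(std::size_t set) const { return order_[set].front(); }

  private:
    static void moveToFront(std::vector<int> &o, std::vector<int>::iterator it) {
        std::rotate(o.begin(), it, it + 1);
    }

    std::vector<std::vector<Line>> lines_;
    std::vector<std::vector<int>> order_;
};
```

Which variant wins presumably depends on associativity and access locality,
so both would need to be measured with the profiler before choosing one.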
_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev