Probably none of those benchmarks push the memory system with multiple
cores the way fft does. Why don't you give Nilay your fft benchmark?
Ali
On Fri, 1 Apr 2011 16:42:48 -0400, Korey Sewell <ksew...@umich.edu>
wrote:
Hi Nilay,
I think I've located the images for those benchmarks, so I'll test a
couple of these over the weekend and give an update.
On Wed, Mar 30, 2011 at 8:03 PM, Nilay Vaish <ni...@cs.wisc.edu>
wrote:
Korey, I do not have the FftBase32 benchmark. Is it possible for you
to run the simulation with one of the following benchmarks --
IScsiInitiator, IScsiTarget, MutexTest, NetperfMaerts, NetperfStream,
NetperfStreamNT, NetperfStreamUdp, NetperfUdpLocal, Nfs, NfsTcp,
Nhfsstone, Ping, PovrayAutumn, PovrayBench, SurgeSpecweb,
SurgeStandard, ValAccDelay, ValAccDelay2, ValCtxLat, ValMemLat,
ValMemLat2MB, ValMemLat8MB, ValStream, ValStreamCopy, ValStreamScale,
ValSysLat, ValTlbLat, Validation, bnAn
Which of these do you think would closely resemble FFT?
Nilay
On Wed, 30 Mar 2011, Korey Sewell wrote:
Hi all,
I had noticed that Ruby was running a little slower than the old M5
memory system and decided to run gprof on it to see if there was
anything obvious holding things up.
For 2, 4, and 8 core ALPHA_FS_MOESI_CMP_directory SimpleCPU runs of
the Fft benchmark, it seems that MemoryControl::executeCycle
accounts for 26-29% of the runtime. Looking at the comments for
that code, I see this:
"// executeCycle: This function is called once per memory clock cycle"
I'm not familiar with this Memory Controller code, but it would seem
that some type of optimization that avoids running this function on
every memory cycle would speed things up a good bit. So if someone
has the time or the need to do some Ruby optimization work (I know
Nilay did some previously), then I think this would be a good place
to start...
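To make the idea concrete, here is a minimal sketch of the difference
between running the controller body every memory cycle and waking it
only when a request shows up. This is a toy model, not the Ruby
MemoryControl code: RunStats, run_polled, and run_event_driven are
hypothetical names, and a real controller would also need wakeups for
things like bank timing and refresh.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy model, not the Ruby MemoryControl class.
// `arrivals` holds the cycles at which requests reach the controller,
// sorted in ascending order.
struct RunStats {
    std::uint64_t calls;   // how many times the controller body ran
    std::uint64_t served;  // how many requests were picked up
};

// Polled style: the controller body runs once per memory clock cycle,
// whether or not a request is waiting.
RunStats run_polled(const std::vector<std::uint64_t>& arrivals,
                    std::uint64_t end_cycle) {
    assert(std::is_sorted(arrivals.begin(), arrivals.end()));
    RunStats stats{0, 0};
    std::size_t next = 0;
    for (std::uint64_t cycle = 0; cycle < end_cycle; ++cycle) {
        ++stats.calls;
        while (next < arrivals.size() && arrivals[next] <= cycle) {
            ++stats.served;
            ++next;
        }
    }
    return stats;
}

// Event-driven style: the controller body runs only at cycles where at
// least one request arrives, skipping the idle cycles in between.
RunStats run_event_driven(const std::vector<std::uint64_t>& arrivals,
                          std::uint64_t end_cycle) {
    assert(std::is_sorted(arrivals.begin(), arrivals.end()));
    RunStats stats{0, 0};
    std::size_t next = 0;
    while (next < arrivals.size() && arrivals[next] < end_cycle) {
        const std::uint64_t cycle = arrivals[next];  // jump to next arrival
        ++stats.calls;
        while (next < arrivals.size() && arrivals[next] <= cycle) {
            ++stats.served;
            ++next;
        }
    }
    return stats;
}
```

Both functions serve the same requests. The difference is only in how
often the body runs: once per simulated cycle in the polled version,
once per distinct arrival cycle in the event-driven one.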
I've posted some of the gprof output below:
2 core
=====
time (%) name
29.17 MemoryControl::executeCycle()
4.19 RubyEventQueue::scheduleEventAbsolute(Consumer*, long long)
3.52 PerfectSwitch::wakeup()
3.47 Set::Set(Set const&)
3.46 RubyEventQueueNode::process()
4 core
=====
time (%) name
27.49 MemoryControl::executeCycle()
4.01 RubyEventQueue::scheduleEventAbsolute(Consumer*, long long)
3.66 PerfectSwitch::wakeup()
3.59 Set::Set(Set const&)
3.50 RubyEventQueueNode::process()
8 core
=====
time (%) name
26.09 MemoryControl::executeCycle()
4.12 Set::Set(Set const&)
3.91 PerfectSwitch::wakeup()
3.88 RubyEventQueue::scheduleEventAbsolute(Consumer*, long long)
3.41 RubyEventQueueNode::process()
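As a rough way to read these numbers, Amdahl's law gives an upper
bound on the overall speedup from optimizing one function. The sketch
below assumes the gprof percentages can be treated as fractions of
total host runtime, which is only approximately true for sampled
profiles. The helper names amdahl_speedup and max_speedup are mine,
not anything from the simulator.

```cpp
#include <cassert>

// Overall speedup when a function that takes `fraction` of the runtime
// (0 <= fraction < 1) is made `factor` times faster (factor > 0).
double amdahl_speedup(double fraction, double factor) {
    assert(fraction >= 0.0 && fraction < 1.0);
    assert(factor > 0.0);
    return 1.0 / ((1.0 - fraction) + fraction / factor);
}

// Limiting case: the function's cost is removed entirely.
double max_speedup(double fraction) {
    assert(fraction >= 0.0 && fraction < 1.0);
    return 1.0 / (1.0 - fraction);
}
```

With the fractions above (0.2917, 0.2749, 0.2609), max_speedup comes
out between roughly 1.35x and 1.41x, so that is the most one could
expect from removing the executeCycle cost altogether under these
assumptions.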
--
- Korey
_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev