I ran ALPHA_FS_MOESI_hammer using the following command --
./build/ALPHA_FS_MOESI_hammer/m5.prof ./configs/example/ruby_fs.py
I don't know how the benchmark is picked when none is specified. Below
is the gprof output --
  %   cumulative   self                self     total
 time   seconds   seconds      calls  s/call   s/call  name
19.72     51.22     51.22  925285266    0.00     0.00  CacheMemory::isTagPresent(Address const&) const
 5.59     65.74     14.52  229035720    0.00     0.00  Histogram::add(long long)
 3.57     75.02      9.28  212664644    0.00     0.00  CacheMemory::lookup(Address const&)
 2.53     81.59      6.57   47830136    0.00     0.00  L1Cache_Controller::wakeup()
The output shows that about a fifth of the time is spent in the
isTagPresent() function.
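The s/call columns print as 0.00 because each call is far shorter than the two decimal places gprof shows. As a rough cross-check, the per-call cost and the total sampled run time can be recovered from the self-seconds, calls, and % time columns. The sketch below only does that arithmetic; ProfileRow, nsPerCall() and totalSeconds() are hypothetical helper names, not part of M5 or gprof.

```cpp
#include <cassert>

// One row of a gprof flat profile (hypothetical helper type, not an M5 class).
struct ProfileRow {
    const char *name;     // function name as printed by gprof
    double selfSeconds;   // "self seconds" column
    long long calls;      // "calls" column
};

// Average self time per call, in nanoseconds.
double nsPerCall(const ProfileRow &row)
{
    assert(row.calls > 0);
    return row.selfSeconds / static_cast<double>(row.calls) * 1e9;
}

// Total sampled run time implied by one row's self seconds and "% time".
double totalSeconds(double selfSeconds, double percentTime)
{
    assert(percentTime > 0.0);
    return selfSeconds / (percentTime / 100.0);
}
```

With the isTagPresent() row (51.22 s self time, 925285266 calls, 19.72 %), nsPerCall() comes out at roughly 55 ns per call and totalSeconds() at roughly 260 s. So each call is cheap, and the cost comes from the sheer number of calls. The function in question is: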
bool
CacheMemory::isTagPresent(const Address& address) const
{
    assert(address == line_address(address));
    Index cacheSet = addressToCacheSet(address);
    int loc = findTagInSet(cacheSet, address);

    if (loc == -1) {
        // We didn't find the tag
        DPRINTF(RubyCache, "No tag match for address: %s\n", address);
        return false;
    }
    DPRINTF(RubyCache, "address: %s found\n", address);
    return true;
}
Since m5.prof is compiled with -DNDEBUG and -DTRACING_ON=0, the assert()
and the DPRINTF() calls are compiled out. The addressToCacheSet() function
only does a few bitwise and arithmetic operations, so it should not
consume much time. Most likely, then, the findTagInSet() function accounts
for the major portion of the time attributed to the isTagPresent()
function.
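To make the reasoning concrete, here is a minimal sketch of a set-associative lookup in which the set index is a shift and a mask while the tag search scans the ways of the set. This is an illustration only: ToyCache, Entry and LineAddress are hypothetical stand-ins, the linear scan is an assumption, and the real CacheMemory::findTagInSet() is not shown in this thread and may be structured differently.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for Ruby's Address type.
using LineAddress = std::uint64_t;

// Hypothetical cache entry: just a tag and a valid bit.
struct Entry {
    LineAddress tag = 0;
    bool valid = false;
};

class ToyCache {
  public:
    ToyCache(int numSets, int assoc, int blockBits)
        : m_numSets(numSets), m_blockBits(blockBits),
          m_sets(numSets, std::vector<Entry>(assoc))
    {
        // Require a power-of-two set count so indexing is a simple mask.
        assert(numSets > 0 && (numSets & (numSets - 1)) == 0);
        assert(assoc > 0);
    }

    // Cheap: one shift and one mask.
    int addressToCacheSet(LineAddress addr) const
    {
        return static_cast<int>((addr >> m_blockBits) & (m_numSets - 1));
    }

    // Assumed linear scan: up to one comparison per way in the set.
    int findTagInSet(int cacheSet, LineAddress addr) const
    {
        const std::vector<Entry> &set = m_sets[cacheSet];
        for (int way = 0; way < static_cast<int>(set.size()); ++way) {
            if (set[way].valid && set[way].tag == addr)
                return way;
        }
        return -1; // not found
    }

    bool isTagPresent(LineAddress addr) const
    {
        return findTagInSet(addressToCacheSet(addr), addr) != -1;
    }

    // Place a line in a given way of its set (no replacement policy here).
    void insert(LineAddress addr, int way)
    {
        Entry &e = m_sets[addressToCacheSet(addr)][way];
        e.tag = addr;
        e.valid = true;
    }

  private:
    int m_numSets;
    int m_blockBits;
    std::vector<std::vector<Entry>> m_sets;
};
```

In this sketch the index computation is constant-time, and the search is the only part whose cost depends on the cache configuration. If the real function behaves similarly, it is the natural place to look first.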
--
Nilay
On Thu, 4 Nov 2010, Steve Reinhardt wrote:
You also have to build a binary that supports ruby, like
ALPHA_FS_MOESI_hammer. If you can't get that to work, try
ALPHA_SE_MOESI_hammer and run one of the ALPHA_SE test workloads... the
workload you run doesn't really matter that much as long as it's long enough
to get a meaningful profile.
Steve
_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev