I ran ALPHA_FS_MOESI_hammer using the following command --

./build/ALPHA_FS_MOESI_hammer/m5.prof ./configs/example/ruby_fs.py

I don't know how the benchmark is picked when none is specified. Below is the gprof output:

  %   cumulative   self                 self     total
 time   seconds   seconds       calls  s/call   s/call  name
19.72     51.22     51.22   925285266    0.00     0.00  CacheMemory::isTagPresent(Address const&) const
 5.59     65.74     14.52   229035720    0.00     0.00  Histogram::add(long long)
 3.57     75.02      9.28   212664644    0.00     0.00  CacheMemory::lookup(Address const&)
 2.53     81.59      6.57    47830136    0.00     0.00  L1Cache_Controller::wakeup()


The output shows that about a fifth of the time is spent in the isTagPresent() function.
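
As a rough cross-check of that figure, the numbers in the isTagPresent() row can be turned into an implied total run time and an average self time per call. The sketch below only does arithmetic on the values from the gprof output; the constant and function names are made up for illustration.

```cpp
#include <cstdio>

// Figures copied from the isTagPresent() row of the gprof output.
const double kSelfSeconds = 51.22;      // "self seconds" column
const double kPercentTime = 19.72;      // "% time" column
const double kCalls       = 925285266.0; // "calls" column

// Total sampled run time implied by the row: self seconds / fraction of time.
double impliedTotalSeconds()
{
    return kSelfSeconds / (kPercentTime / 100.0);
}

// Average self time per call, in nanoseconds.
double selfNanosPerCall()
{
    return kSelfSeconds / kCalls * 1e9;
}

// Convenience printer for the two derived values.
void printSummary()
{
    std::printf("implied total: %.1f s\n", impliedTotalSeconds());
    std::printf("self time per call: %.1f ns\n", selfNanosPerCall());
}
```

This works out to roughly 260 seconds of sampled time and about 55 ns per call, which is why the s/call columns round to 0.00. The function is expensive in aggregate because it is called about 925 million times, not because a single call is slow.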

bool
CacheMemory::isTagPresent(const Address& address) const
{
    assert(address == line_address(address));
    Index cacheSet = addressToCacheSet(address);
    int loc = findTagInSet(cacheSet, address);

    if (loc == -1) {
        // We didn't find the tag
        DPRINTF(RubyCache, "No tag match for address: %s\n", address);
        return false;
    }
    DPRINTF(RubyCache, "address: %s found\n", address);
    return true;
}

Since m5.prof is compiled with -DNDEBUG and -DTRACING_ON=0, the assert() and DPRINTF() calls are compiled out. The addressToCacheSet() function performs only a few bitwise and arithmetic operations, so it should not consume much time. Most likely, then, findTagInSet() accounts for the major portion of the time spent in isTagPresent().
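
To illustrate what "compiled out" means here, this is a minimal sketch of a trace macro gated on TRACING_ON next to a standard assert(). SKETCH_DPRINTF, countedArg() and isPresentSketch() are hypothetical names for this example only; this is not the actual DPRINTF definition from the M5 sources, just the general pattern assumed above.

```cpp
// Build with -DNDEBUG -DTRACING_ON=0 to mimic the flags discussed above.
#include <cassert>
#include <cstdio>

#ifndef TRACING_ON
#define TRACING_ON 0   // assumption for this sketch: tracing off by default
#endif

// Hypothetical stand-in for a trace macro; NOT the real DPRINTF definition.
#if TRACING_ON
#define SKETCH_DPRINTF(...) std::printf(__VA_ARGS__)
#else
#define SKETCH_DPRINTF(...) do {} while (0)
#endif

static int g_argEvaluations = 0;

// Counts how often a trace argument expression is actually evaluated.
static int countedArg(int v)
{
    ++g_argEvaluations;
    return v;
}

bool isPresentSketch(int loc)
{
    assert(loc >= -1);  // expands to nothing when NDEBUG is defined
    if (loc == -1) {
        SKETCH_DPRINTF("No tag match, loc=%d\n", countedArg(loc));
        return false;
    }
    SKETCH_DPRINTF("found at loc=%d\n", countedArg(loc));
    return true;
}

int evaluatedArgs()
{
    return g_argEvaluations;
}
```

With TRACING_ON set to 0, the macro arguments are discarded by the preprocessor, so countedArg() is never called and the trace lines add no run-time cost. What remains of the function is just the lookup and the comparison, which is consistent with attributing the time to the tag search.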

--
Nilay

On Thu, 4 Nov 2010, Steve Reinhardt wrote:

You also have to build a binary that supports ruby, like
ALPHA_FS_MOESI_hammer.  If you can't get that to work, try
ALPHA_SE_MOESI_hammer and run one of the ALPHA_SE test workloads... the
workload you run doesn't really matter that much as long as it's long enough
to get a meaningful profile.

Steve


_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev
