You can look at the call graph profile further down in the gprof output to figure out how much time is spent in functions that get called from isTagPresent. If the profile doesn't list findTagInSet separately, it may be because it was inlined into isTagPresent.
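To illustrate the inlining point, here is a minimal sketch of how a helper can be kept out of line so that a profiler attributes samples to it rather than to its caller. Everything in it is an assumption made for illustration: TagStoreSketch, its flat tag layout, the modulo set mapping, and the SKETCH_NOINLINE macro are hypothetical and are not taken from the real CacheMemory code. The only real feature relied on is the GCC/Clang `__attribute__((noinline))` function attribute.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical macro: expands to the GCC/Clang noinline attribute when
// available, so the helper keeps its own entry in a profile.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_NOINLINE __attribute__((noinline))
#else
#define SKETCH_NOINLINE
#endif

// Marker for an empty way (assumed; not from the original code).
const std::uint64_t kInvalidTag = ~0ULL;

// Simplified stand-in for a set-associative tag store.
class TagStoreSketch
{
  public:
    TagStoreSketch(int numSets, int assoc)
        : m_numSets(numSets), m_assoc(assoc),
          m_tags(static_cast<std::size_t>(numSets) * assoc, kInvalidTag)
    {}

    // Place a line address into a given way of its set.
    void insert(std::uint64_t lineAddr, int way)
    {
        m_tags[slot(setOf(lineAddr), way)] = lineAddr;
    }

    // Thin wrapper, mirroring the shape of isTagPresent in the thread.
    bool isTagPresent(std::uint64_t lineAddr) const
    {
        return findTagInSet(setOf(lineAddr), lineAddr) != -1;
    }

    // Linear scan over the ways of one set; returns the way or -1.
    // Marked noinline so its cost is reported under its own name.
    SKETCH_NOINLINE int findTagInSet(int set, std::uint64_t lineAddr) const
    {
        for (int way = 0; way < m_assoc; ++way) {
            if (m_tags[slot(set, way)] == lineAddr)
                return way;
        }
        return -1;
    }

  private:
    // Assumed mapping: low-order bits of the line address pick the set.
    int setOf(std::uint64_t lineAddr) const
    {
        return static_cast<int>(lineAddr % m_numSets);
    }

    std::size_t slot(int set, int way) const
    {
        return static_cast<std::size_t>(set) * m_assoc + way;
    }

    int m_numSets;
    int m_assoc;
    std::vector<std::uint64_t> m_tags;
};
```

With the attribute in place, a profiled build should show findTagInSet as a child of isTagPresent in the call graph instead of folding its time into the caller. Building with GCC's `-fno-inline` would have a similar effect across the whole binary, at the cost of changing the performance being measured.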

Steve

On Fri, Nov 5, 2010 at 7:58 AM, Nilay Vaish <ni...@cs.wisc.edu> wrote:

> I ran ALPHA_FS_MOESI_hammer using the following command --
>
> ./build/ALPHA_FS_MOESI_hammer/m5.prof ./configs/example/ruby_fs.py
>
> I don't know how the benchmark is picked in case none is specified. Below
> is the gprof output --
>
>
>   %   cumulative   self              self     total
>  time   seconds   seconds     calls  s/call   s/call  name
>  19.72     51.22    51.22 925285266    0.00     0.00  CacheMemory::isTagPresent(Address const&) const
>   5.59     65.74    14.52 229035720    0.00     0.00  Histogram::add(long long)
>   3.57     75.02     9.28 212664644    0.00     0.00  CacheMemory::lookup(Address const&)
>   2.53     81.59     6.57  47830136    0.00     0.00  L1Cache_Controller::wakeup()
>
>
> The output shows that about a fifth of the time is spent in the
> isTagPresent() function.
>
> bool
> CacheMemory::isTagPresent(const Address& address) const
> {
>     assert(address == line_address(address));
>     Index cacheSet = addressToCacheSet(address);
>     int loc = findTagInSet(cacheSet, address);
>
>     if (loc == -1) {
>         // We didn't find the tag
>         DPRINTF(RubyCache, "No tag match for address: %s\n", address);
>         return false;
>     }
>     DPRINTF(RubyCache, "address: %s found\n", address);
>     return true;
> }
>
> Since m5.prof is compiled with -DNDEBUG and -DTRACING_ON=0, the assert()
> and the DPRINTF() will not get compiled. The addressToCacheSet() function
> does some bitwise operations and some arithmetic operations. So it is
> expected that it would not consume much time. So, most likely the
> findTagInSet() function takes a major portion of the overall time required
> by the isTagPresent() function.
>
> --
> Nilay
>
>
> On Thu, 4 Nov 2010, Steve Reinhardt wrote:
>
>> You also have to build a binary that supports ruby, like
>> ALPHA_FS_MOESI_hammer. If you can't get that to work, try
>> ALPHA_SE_MOESI_hammer and run one of the ALPHA_SE test workloads... the
>> workload you run doesn't really matter that much as long as it's long
>> enough to get a meaningful profile.
>>
>> Steve
>>
> _______________________________________________
> m5-dev mailing list
> m5-dev@m5sim.org
> http://m5sim.org/mailman/listinfo/m5-dev