Dear Sir/Madam,
With the O3 CPU model, I tested tgts_per_mshr = 1, 2, 4, 8, 16, and 32, and
found that performance is best when tgts_per_mshr = 1. Could someone explain
this? It seems that tgts_per_mshr = 1 behaves the same as an infinitely large
tgts_per_mshr.
| tgts_per_mshr | 1 | 2 | 4 | 8 | 16 | 32 |
| 401.bzip2 running time | 0.0450 | 0.072 | 0.063 | 0.046 | 0.0453 | 0.0452 |
| 403.gcc running time | 0.0141 | 0.015 | 0.015 | 0.014 | 0.0143 | 0.0142 |
Best regards.
Yuhang
BaseCache::BaseCache(const Params *p)
    : MemObject(p),
      mshrQueue("MSHRs", p->mshrs, 4, MSHRQueue_MSHRs),
      writeBuffer("write buffer", p->write_buffers, p->mshrs+1000,
                  MSHRQueue_WriteBuffer),
      blkSize(p->system->cacheLineSize()),
      hitLatency(p->hit_latency),
      responseLatency(p->response_latency),
      numTarget(p->tgts_per_mshr),
      forwardSnoops(p->forward_snoops),
      isTopLevel(p->is_top_level),
      blocked(0),
      noTargetMSHR(NULL),
      missCount(p->max_miss_count),
      addrRanges(p->addr_ranges.begin(), p->addr_ranges.end()),
      system(p->system)
{
}
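
For reference, here is a toy model of what I expected the per-MSHR target
limit to do. It is only a sketch under my own assumptions, not gem5 code:
ToyMSHR and countStalls are hypothetical names, and it assumes that every miss
to the same line beyond the limit simply stalls until the fill returns. Under
that assumption tgts_per_mshr = 1 should be the worst case, which is the
opposite of what I measured.

```cpp
#include <cstddef>

// Hypothetical toy model (NOT gem5 code): one MSHR tracking a single
// outstanding cache line, with at most maxTargets requests merged into it.
struct ToyMSHR {
    std::size_t maxTargets;   // plays the role of tgts_per_mshr
    std::size_t targets;      // requests currently merged into this MSHR

    explicit ToyMSHR(std::size_t n) : maxTargets(n), targets(0) {}

    // Returns true if the request was merged into the MSHR, false if the
    // MSHR is already full and the requester would have to stall.
    bool addTarget() {
        if (targets >= maxTargets)
            return false;
        ++targets;
        return true;
    }
};

// Count how many of `requests` back-to-back misses to the same line are
// refused before the fill returns (assumption: no target is freed meanwhile).
std::size_t countStalls(std::size_t maxTargets, std::size_t requests) {
    ToyMSHR mshr(maxTargets);
    std::size_t stalls = 0;
    for (std::size_t i = 0; i < requests; ++i) {
        if (!mshr.addTarget())
            ++stalls;
    }
    return stalls;
}
```

With 8 back-to-back misses to one line, this model refuses 7 of them when the
limit is 1 and none when the limit is 8 or more.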
_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users