On Sun, 16 Dec 2012, hanfeng QIN wrote:
> I am sorry that we have differing opinions. I think you should first
> refer to this paper to understand the simulation methodology I mentioned.
I don't think you have any opinion. If you had one, you would have clearly
stated why you believe the experiment you want to conduct makes sense. You
are just trying to do what someone else has done.
> Zhan, D. Locality & Utility Co-optimization for Practical Capacity
> Management of Shared Last Level Caches. ICS'12
Since you know whose methodology you are trying to replicate, it is
advisable that you contact the author(s) directly about what exactly they
did. In fact, since the author(s) used the M5 simulator, it should be
straightforward to replicate any changes that were made to the simulator.
> For your convenience, I extract the simulation methodology here.
> "In the experiments, all threads under a given workload are executed starting
> from a checkpoint that has already had the first 10 billion instructions
It is not clear whether the 10 billion instructions are the sum over all
the threads, or whether each individual thread executed 10 billion
instructions.
> bypassed. They are cache-warmed with 1 billion instructions and then
Again, it is not clear if 1 billion instructions is across all threads, or
for an individual thread.
> simulated in detail until all threads finish another 1 billion instructions.
> Performance statistics are reported for a thread when it reaches 1 billion
> instructions. If one thread completes the 1 billion instructions before
> others, it continues to run so as to still compete for the SLLC capacity, but
> its extra instructions are not taken into account in the final performance
> report. This is in conformation with the standard practice in CMP cache
> research"
From this paragraph, I infer that each thread would have executed 1
billion instructions after the cache warm-up phase. After reading Section
5.4 of the thesis of the author named above, it seems to me that when a
hardware thread completed a billion instructions for the first time, he
noted the IPC for that thread. Finally, these recorded IPCs were summed up
and used for comparing different cache replacement policies.
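If that inference is right, the accounting can be sketched in a few lines of plain Python. To be clear, this is only a toy model of the metric, not gem5 code, and the commit rates and instruction count below are made-up illustration values:

```python
# Toy model of the metric described above: record each thread's IPC at the
# first point it commits N instructions, then sum the recorded IPCs.
# This is NOT gem5 code; commit rates and n_insts are made-up values.

def summed_ipc(commit_rates, n_insts):
    """commit_rates: instructions committed per cycle for each core."""
    recorded = {}                      # core id -> IPC at first N-instruction mark
    committed = [0.0] * len(commit_rates)
    cycle = 0
    while len(recorded) < len(commit_rates):
        cycle += 1
        for core, rate in enumerate(commit_rates):
            committed[core] += rate    # every core keeps running (and competing)
            if core not in recorded and committed[core] >= n_insts:
                # First time this core crosses N committed instructions:
                # note its IPC; its later instructions are ignored.
                recorded[core] = committed[core] / cycle
    return sum(recorded.values())

# Core 0 commits 1.0 inst/cycle, core 1 commits 0.5 inst/cycle; both keep
# running until the slower one also reaches N, as in the quoted methodology.
print(summed_ipc([1.0, 0.5], 100))
```

The point of the model is only the bookkeeping: every core runs to completion, but each core's IPC is frozen at its own N-instruction mark.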
> For the 1st question, I do not insist on exactly N3 instructions at all.
> Actually, it is not possible to count exact instruction numbers. But to keep
> consistent with the above simulation methodology, I have to enforce that each
> core executes at least N3 instructions. I reviewed the current implementation
> of the option '-I' in configs/common/Simulation.py and src/cpu/base.cc. It
> just passes the '-I' value to cpu[i].MAX_INSTS_ANY_THREAD. In this case it
> only guarantees that the simulation exits once one core commits N3
> instructions, no matter how many instructions have retired on the other cores.
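For what it's worth, the re-simulate loop you describe might look roughly like the fragment below in a config script. The parameter and stats-API spellings (`max_insts_any_thread`, `m5.stats.dump()`) are assumptions taken from configs/common/Simulation.py conventions, not a tested implementation, so verify them against your revision:

```python
# Hedged sketch, not tested against any particular gem5 revision.
# How to identify which core triggered a given exit is left open (e.g. via
# exit_event.getCause() or by reading each core's committed-instruction stat).

for cpu in system.cpu:
    cpu.max_insts_any_thread = N3     # arm an exit event on every core

num_done = 0
while num_done < len(system.cpu):
    exit_event = m5.simulate()        # returns when some core reaches N3
    m5.stats.dump()                   # snapshot stats at that core's N3 mark
    num_done += 1
m5.stats.dump()                       # final dump: the (M + 1)th stats block
```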
> For the 2nd question, considering that some programs may finish N3
> instructions before the others, if we run only N3 instructions in total
> across all programs and report stats after N3 instructions, I don't think
> the stats can mirror the real impact of shared-resource contention, since
> during some phases no contention exists at all. On the other hand, by
> enforcing that every program runs N3 * 2 instructions, we have the chance to
> report stats after the first N3 instructions, so the stats can reflect the
> impact of shared-resource contention.
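A toy model illustrates why the measurement window matters here. The contended and uncontended commit rates below are invented numbers, not measurements; the only point is that stopping the fast core early inflates the slow core's measured IPC:

```python
# Toy contention model: core 1 commits 1.0 inst/cycle when running alone but
# only 0.6 inst/cycle while core 0 is active (made-up numbers). Compare
# core 1's IPC over its first n instructions when core 0 stops at n versus
# when core 0 keeps running past it (the N3 * 2 approach).

def core1_ipc(n, core0_stops_at):
    committed0 = committed1 = 0.0
    cycle = 0
    while committed1 < n:
        cycle += 1
        if committed0 < core0_stops_at:
            committed0 += 1.0
            committed1 += 0.6          # contended rate
        else:
            committed1 += 1.0          # core 1 has the shared resource alone
    return committed1 / cycle

print(core1_ipc(100, 100))      # core 0 stops early: tail runs contention-free
print(core1_ipc(100, 2 * 100))  # core 0 keeps running: contention all the way
```

The first call reports a higher IPC than the second even though nothing about core 1 changed, which is exactly the distortion you are trying to avoid.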
> Of course, I think the above simulation methodology still has pitfalls. For
> a program with a short lifetime, even after executing N3 * 2 instructions,
> we still cannot guarantee that it will contend for shared resources with the
> other programs.
> I implemented this methodology in gem5. For an M-program workload, it dumps
> (M + 1) stats blocks as expected. But I cannot yet determine the dump order,
> which I need in order to extract per-core information. E.g., if the dump
> order is c1->c2->c0->c3, then the stats for c1 are in the 1st dump, those
> for c2 in the 2nd dump, and so on. Note that the stats dump order mirrors
> the order in which the programs finish N3 instructions.
It is not too hard to print when a thread has executed a billion
instructions.
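If the dump-to-core mapping has to be recovered offline instead, one could scan the stats file for the first dump in which each core's committed-instruction counter crosses N3. Note the assumptions: the stat name (`committedInsts` here) varies across gem5 versions and CPU models, and the counters must be cumulative, i.e. not reset between dumps:

```python
# Sketch: map each stats dump to the core that triggered it, by finding which
# core's committed-instruction counter newly crossed N3 in that dump.
# Assumptions: stat name 'committedInsts' (adjust the regex for your tree)
# and cumulative stats (no m5.stats.reset() between dumps).
import re

BEGIN = '---------- Begin Simulation Statistics ----------'
STAT = re.compile(r'system\.cpu(\d+)\.committedInsts\s+(\d+)')

def dump_order(stats_text, n3):
    crossed = set()     # cores already past N3 in an earlier dump
    order = []          # core ids, in the order their dumps appear
    for dump in stats_text.split(BEGIN)[1:]:
        for core, insts in STAT.findall(dump):
            if int(insts) >= n3 and int(core) not in crossed:
                crossed.add(int(core))
                order.append(int(core))
    return order
```

For example, with two dumps in which cpu1 crosses N3 first and cpu0 second, `dump_order` returns `[1, 0]`, i.e. the 1st dump belongs to c1 and the 2nd to c0.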
--
Nilay
_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users