I've discovered a couple of stats oddities that should be fixed, but I'm not
sure what the best approach is.  They both have to do with looking at the
stats for an O3 CPU after an FS run in which we restore from a checkpoint to
a SimpleCPU and then switch to an O3 CPU.  Specifically I was trying to
figure out why, when comparing two runs where in one case the CPU should
have had much more idle time than the other, the stats seemed to indicate
that there was very little idle time in either case.

First, in O3ThreadContext<Impl>::takeOverFrom(), the kernelStats pointer of
the old context overwrites the kernelStats pointer of the new context, with
the comment "Transfer kernel stats from one CPU to the other."  The problem
is that the final output is confusing... all of the kernel stats are
associated with the SimpleCPU, and the kernel stats values of the O3 CPU
(which got registered when the O3 object was created, before the call to
takeOverFrom()) are all zero.  I was specifically looking at the number of
quiesce instructions executed, and the stats told me that the SimpleCPU
executed a few hundred in the small window in which it was running, while
the O3 CPU executed none.  The easy solution here is just to whack the
kernelStats assignment in takeOverFrom(), but I wanted to know if there was
a good reason for doing this that I'm missing.  Also, this same assignment
appears in most (all?) of the takeOverFrom() functions of the other CPU
models, so if we get rid of it at all we should get rid of it everywhere.
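
To make the effect concrete, here is a toy sketch of the pointer overwrite.
It is not the real M5 code: ToyKernelStats, ToyThreadContext and
o3QuiesceCountAfterSwitch() are made-up names, and the only thing it assumes
is that each context registers its own stats object at construction and that
takeOverFrom() then repoints kernelStats at the old context's object.

```cpp
#include <cstdint>

// Toy stand-ins for illustration only; these are not the real M5 classes.
struct ToyKernelStats {
    std::uint64_t quiesceCount = 0;
};

struct ToyThreadContext {
    ToyKernelStats ownStats;                  // "registered" at construction
    ToyKernelStats *kernelStats = &ownStats;  // where new events are counted

    void quiesce() { ++kernelStats->quiesceCount; }

    // Mimics the assignment described above: when transferStats is true,
    // the new context starts counting into the old context's stats object.
    void takeOverFrom(ToyThreadContext &old, bool transferStats) {
        if (transferStats)
            kernelStats = old.kernelStats;
    }
};

// Runs 3 quiesces on a "simple" context, switches to an "o3" context, runs
// 5 more, and returns what the o3 context's own registered stats report.
std::uint64_t o3QuiesceCountAfterSwitch(bool transferStats) {
    ToyThreadContext simple, o3;
    for (int i = 0; i < 3; ++i)
        simple.quiesce();
    o3.takeOverFrom(simple, transferStats);
    for (int i = 0; i < 5; ++i)
        o3.quiesce();
    return o3.ownStats.quiesceCount;
}
```

With the transfer enabled, the o3 context's own stats stay at zero and all
eight events land on the simple context's object; with the assignment
removed, the o3 context reports its own five.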

Second, the real culprit turned out to be the way that idle cycles are
accounted for in O3.  In wakeCPU(), both the numCycles and idleCycles stats
are updated with the number of cycles the CPU has been idling.  (When the
CPU is not idling, numCycles is incremented each cycle and of course
idleCycles is not updated at all.)  In general this is OK, but when the
simulation ends while the CPU is still idling, there is a potentially
substantial number of cycles that never get added to either counter.  One
way to fix this is to call wakeCPU() just before
terminating the simulation, but that seems very hackish and prone to
unwanted side effects.  Another way would be to have some "cleanup" function
that SimObjects can register (much like init()) that gets called before
stats are dumped, and then in the O3 model we could use this hook to update
these stats before they get dumped.  A final way would be to drop the
idleCycles stat and replace it with a Stat::Average idleFraction statistic
like we use in the other CPU models... this would also have the nice side
effect of making the statistics more consistent across the CPU models, but
it would not solve the problem with numCycles.
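
A toy sketch of the lost-cycles problem and of the "cleanup hook" option
follows.  Again, this is not the O3 code: ToyCPU, goIdle(), accountIdle()
and runUntilExitWhileIdle() are hypothetical names, and accountIdle() stands
in for whatever a pre-dump hook would call.  The only behavior it assumes is
the one described above: idle time is folded into the counters lazily, at
wakeup.

```cpp
#include <cstdint>

// Toy model of lazy idle accounting; not the real O3 CPU.
struct ToyCPU {
    std::uint64_t numCycles = 0;
    std::uint64_t idleCycles = 0;
    std::uint64_t idleSince = 0;  // cycle at which the current idle period began
    bool idle = false;

    // While active, numCycles advances one cycle at a time.
    void tick() { if (!idle) ++numCycles; }

    void goIdle(std::uint64_t now) { idle = true; idleSince = now; }

    // Fold the idle period so far into both counters.  Safe to call more
    // than once, since idleSince is advanced each time.
    void accountIdle(std::uint64_t now) {
        if (!idle)
            return;
        std::uint64_t cycles = now - idleSince;
        numCycles += cycles;
        idleCycles += cycles;
        idleSince = now;
    }

    void wakeCPU(std::uint64_t now) { accountIdle(now); idle = false; }
};

struct Totals {
    std::uint64_t numCycles;
    std::uint64_t idleCycles;
};

// 100 active cycles, then idle until the simulation ends at cycle 1000.
// flushBeforeDump plays the role of the hypothetical pre-dump hook.
Totals runUntilExitWhileIdle(bool flushBeforeDump) {
    ToyCPU cpu;
    std::uint64_t now = 0;
    for (; now < 100; ++now)
        cpu.tick();
    cpu.goIdle(now);
    now = 1000;  // simulation exits here; wakeCPU() is never called
    if (flushBeforeDump)
        cpu.accountIdle(now);
    return {cpu.numCycles, cpu.idleCycles};
}
```

Without the flush, the toy CPU reports 100 cycles and no idle time even
though it sat idle for 900 cycles; with the flush, it reports 1000 and 900.
Unlike calling wakeCPU() at exit, the flush leaves the CPU's idle state
alone.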

Thoughts?

Steve