(switching to email since this is getting a bit detailed)

These are valid concerns.  I'd like to explore them a little more.  I'm not
necessarily picking sides, just playing devil's advocate a bit.

On Sun, Apr 29, 2012 at 5:05 AM, Andreas Hansson <[email protected]> wrote:

>    This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/1159/
>
> On April 28th, 2012, 9:08 a.m., *Steve Reinhardt* wrote:
>
> I like the idea of having a common way of collecting these stats (and 
> probably more importantly, common names and semantics for the output).  Does 
> it make sense to have this in a separate object though?  It seems like that 
> just adds complexity to the configuration, and the possibility of not putting 
> one of these where you wish you had.  Did you consider putting all this code 
> in a base class like MemObject so that everyone automatically collects all 
> these stats?
>
>  A very valid comment, and this option was under serious consideration. The 
> benefits of having it behaviourally encapsulated (as you now propose) rather 
> than structurally encapsulated (as the current patch does) is indeed that:
> 1) it does not involve changing any configuration scripts, and
> 2) any communicating object would have a uniform set of stats relating to the 
> communication
>
> The issues are:
> 1) how to "hide" it in the interface timing calls, as we must catch the case 
> where sendTiming returns false and undo the changes to the senderState
>
If we make this common to all MemObjects, we could take the monitor state
(which is just a tick value) and put it in the base SenderState class. I
bet there are some MemObjects that already record the transmit time in the
sender state anyway, so we could get rid of that double recording.
Eliminating the memory allocation and chaining of the separate SenderState
object could possibly be a noticeable performance win when you're actually
collecting the stats.
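
As a rough sketch of this idea (a toy Python model, not gem5 code: the
SenderState, Packet, send_with_timestamp and latency names below are
simplified stand-ins I made up for illustration), the tick lives in the
base sender state and is rolled back when the send is refused:

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SenderState:
    """Toy stand-in for a base sender-state record (not gem5's class)."""
    transmit_tick: Optional[int] = None  # set when the packet is sent


@dataclass
class Packet:
    """Toy packet that carries one sender-state object."""
    sender_state: SenderState


def send_with_timestamp(pkt: Packet, now: int,
                        send_timing: Callable[[Packet], bool]) -> bool:
    """Stamp the packet, try to send it, and undo the stamp on failure."""
    previous = pkt.sender_state.transmit_tick
    pkt.sender_state.transmit_tick = now
    accepted = send_timing(pkt)
    if not accepted:
        # The receiver refused the packet, so restore the old value and
        # let the caller retry later with a fresh timestamp.
        pkt.sender_state.transmit_tick = previous
    return accepted


def latency(pkt: Packet, now: int) -> int:
    """Ticks elapsed since the packet was successfully sent."""
    if pkt.sender_state.transmit_tick is None:
        raise ValueError("packet was never sent")
    return now - pkt.sender_state.transmit_tick
```

No extra object is allocated or chained per packet here; the only cost
is one field write, plus one more on the refused-send path.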

> 2) the rather extensive set of disable flags in the object parameters
>
This doesn't bother me much.  If we define the flags on the base MemObject
and provide reasonable defaults, they'll be invisible to most people.
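
To illustrate what "invisible to most people" could mean (again a toy
Python model; MonitorFlags, MemObjectModel and the three flag names are
hypothetical and not the parameters in the patch), defaults on the base
class mean a config only mentions a flag when it wants to change it:

```python
from dataclasses import dataclass, field


@dataclass
class MonitorFlags:
    """Hypothetical per-object stat switches; every stat defaults to on."""
    disable_latency_stats: bool = False
    disable_bandwidth_stats: bool = False
    disable_outstanding_stats: bool = False


@dataclass
class MemObjectModel:
    """Toy base object: the flags exist on every instance by default."""
    name: str
    flags: MonitorFlags = field(default_factory=MonitorFlags)

    def enabled_stats(self):
        """List the stat groups this object would collect."""
        groups = {
            "latency": self.flags.disable_latency_stats,
            "bandwidth": self.flags.disable_bandwidth_stats,
            "outstanding": self.flags.disable_outstanding_stats,
        }
        return [group for group, disabled in groups.items() if not disabled]


# Most users never touch the flags.
cpu = MemObjectModel("cpu")
# Only the user who cares about overhead overrides one of them.
bus = MemObjectModel("bus", MonitorFlags(disable_latency_stats=True))
```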

> 3) the potential negative performance impact caused by collecting more stats 
> than we really want
>
How expensive are these stats?  A timing-mode transaction is not cheap to
begin with, so the percentage slowdown may not be that bad.  Plus I find
that, when you're running long simulations, you're better off collecting
all the stats you can (within reason), since having them and not needing
them is less costly than wanting them and not having them.

Plus, it looks like all the stats can be disabled, so if performance is a
concern, they could just be turned off, right?  This would eliminate the
first-order overheads (notwithstanding #5 below).

> 4) more "noise" in the stats output
>
The stats output really isn't designed to be read casually anyway... if
you're using grep or some other script, then the excess data won't be
noticeable.
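
For example, a few lines of Python are enough to pull out only the stats
of interest.  This is a minimal sketch that assumes each stat line has
the layout "name value # description"; filter_stats is a made-up helper
and the stat names in the usage comment are invented:

```python
import re


def filter_stats(lines, pattern):
    """Return (name, value) pairs whose stat name matches the regex."""
    matcher = re.compile(pattern)
    selected = []
    for line in lines:
        # Drop the trailing "# description" part, then split on whitespace.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and matcher.search(fields[0]):
            selected.append((fields[0], fields[1]))
    return selected


# Usage (hypothetical stat names):
#   with open("stats.txt") as dump:
#       for name, value in filter_stats(dump, r"\.monitor\."):
#           print(name, value)
```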

> 5) potential negative performance impact even when not collecting stats due 
> to the large number of if-statements
>
True.  As with #3, we don't know how big this impact is, though.  Branch
predictors can be pretty good, so although a sequence of
load/compare/branch instructions isn't free, it's not necessarily as bad as
you think.

Plus, there may be some existing stats in some MemObjects that are
redundant with respect to these (like bytes read and bytes written maybe?)
and if we go back and get rid of those then we'd save a little overhead
there.

> The current CommMonitor may not be ideal in that it requires config script 
> changes, and it is indeed up to the user to determine which points in the 
> memory system are of interest. However, I would expect no more than a handful 
> monitors in a typical system, and the stats in the monitor could become 
> multi-dimensional and indexed on the masterId in a future patch.
>
>
How many masters are there in a typical system anyway, and how many of them
(1) issue a lot of requests and (2) are ones we're not likely to care about
stats on?  These are the only cases where it would matter to not collect
stats, IMO.  My guess is that the only things that might fall into this
category are fabric components like the Bus and *maybe* the Bridge.  I
think the CPUs are the things you would definitely want to be capturing
stats on, and other things like DMA devices probably don't issue enough
requests typically to matter (from a simulation performance perspective).
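
To make the per-master question concrete, stats indexed on a master id
could be modelled along these lines (a toy Python sketch; PerMasterStats
is a made-up class and the master ids are plain integers here, not the
simulator's actual types):

```python
from collections import defaultdict


class PerMasterStats:
    """Toy request/byte counters keyed by a master id (plain ints here)."""

    def __init__(self):
        self.requests = defaultdict(int)
        self.bytes_moved = defaultdict(int)

    def record(self, master_id, size):
        """Account for one request of `size` bytes from `master_id`."""
        self.requests[master_id] += 1
        self.bytes_moved[master_id] += size

    def busiest(self):
        """Return the master id that issued the most requests, or None."""
        if not self.requests:
            return None
        return max(self.requests, key=self.requests.get)
```

A breakdown like this would show directly which masters issue enough
requests for the collection overhead to matter.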


> As a minor side note, the structural CommMonitor also more closely resembles 
> the bus monitors used in actual devices.
>
>
Only because in silicon it's too expensive to automatically put one in
every bus master ;-).  In software, it's not so bad...

As I said, I'm not convinced either way at this point, just trying to
explore the issues a little more deeply.  Thanks for bearing with me.

Steve
