(switching to email since this is getting a bit detailed) These are valid concerns. I'd like to explore them a little more. I'm not necessarily picking sides, just playing devil's advocate a bit.
On Sun, Apr 29, 2012 at 5:05 AM, Andreas Hansson <[email protected]> wrote:

> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/1159/
>
> On April 28th, 2012, 9:08 a.m., *Steve Reinhardt* wrote:
>
> > I like the idea of having a common way of collecting these stats (and
> > probably more importantly, common names and semantics for the output). Does
> > it make sense to have this in a separate object though? It seems like that
> > just adds complexity to the configuration, and the possibility of not putting
> > one of these where you wish you had. Did you consider putting all this code
> > in a base class like MemObject so that everyone automatically collects all
> > these stats?
>
> A very valid comment, and this option was under serious consideration. The
> benefits of having it behaviourally encapsulated (as you now propose) rather
> than structurally encapsulated (as the current patch does) is indeed that:
> 1) it does not involve changing any configuration scripts, and
> 2) any communicating object would have a uniform set of stats relating to the
> communication
>
> The issues are:
> 1) how to "hide" it in the interface timing calls, as we must catch the case
> where sendTiming returns false and undo the changes to the senderState

If we make this common to all MemObjects, we could take the monitor state (which is just a tick value) and put it in the base SenderState class. I bet there are some MemObjects that already record the transmit time in the sender state anyway so we could get rid of that double recording. Eliminating the memory allocation and chaining of the separate SenderState object could possibly be a noticeable performance win when you're actually collecting the stats.

> 2) the rather extensive set of disable flags in the object parameters

This doesn't bother me much. If we define the flags on the base MemObject and provide reasonable defaults, they'll be invisible to most people.
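
To make points 1 and 2 a bit more concrete, here is a rough sketch of the idea as a toy Python model. It is not gem5 code: the class names mirror the ones in this thread, but every field, method, and flag (transmit_tick, collect_latency, _do_send, send_timing, BusyPeerObject) is invented for illustration. It assumes the monitor state really is a single tick stored in the base sender state, that a default-on flag lives on the base object, and that a rejected send has to restore whatever was there before.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SenderState:
    """Toy base sender state; transmit_tick is the only monitor state."""
    transmit_tick: Optional[int] = None


@dataclass
class Packet:
    sender_state: SenderState


class MemObject:
    """Hypothetical base class that timestamps every outgoing packet."""

    def __init__(self, collect_latency: bool = True):
        # Default-on flag (assumed name): most users never have to set it.
        self.collect_latency = collect_latency
        self.sent = 0

    def _do_send(self, pkt: Packet) -> bool:
        """Stand-in for handing the packet to the peer; subclasses override."""
        return True

    def send_timing(self, pkt: Packet, cur_tick: int) -> bool:
        if not self.collect_latency:
            # Stats disabled: one branch, then the plain send.
            return self._do_send(pkt)
        previous = pkt.sender_state.transmit_tick
        pkt.sender_state.transmit_tick = cur_tick
        if self._do_send(pkt):
            self.sent += 1
            return True
        # The peer refused the packet: undo the change so a retry starts clean.
        pkt.sender_state.transmit_tick = previous
        return False


class BusyPeerObject(MemObject):
    """Peer rejects the first attempt, which exercises the undo path."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.attempts = 0

    def _do_send(self, pkt: Packet) -> bool:
        self.attempts += 1
        return self.attempts > 1


if __name__ == "__main__":
    obj = BusyPeerObject()
    pkt = Packet(SenderState())
    print(obj.send_timing(pkt, 100), pkt.sender_state.transmit_tick)
    print(obj.send_timing(pkt, 150), pkt.sender_state.transmit_tick)
```

In this model the first call is rejected and leaves the tick untouched, and the retry records the new tick. Whether the real timing interface can hide the undo this cleanly is exactly the open question in point 1.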

> 3) the potential negative performance impact caused by collecting more stats
> than we really want

How expensive are these stats? A timing-mode transaction is not cheap to begin with, so the percentage slowdown may not be that bad. Plus I find that, when you're running long simulations, you're better off collecting all the stats you can (within reason), since having them and not needing them is less costly than wanting them and not having them. Plus, it looks like all the stats can be disabled, so if performance is a concern, they could just be turned off, right? This would eliminate the first-order overheads (notwithstanding #5 below).

> 4) more "noise" in the stats output

The stats output really isn't designed to be read casually anyway... if you're using grep or some other script, then the excess data won't be noticeable.

> 5) potential negative performance impact even when not collecting stats due
> to the large number of if-statements

True. As with #3, we don't know how big this impact is though. Branch predictors can be pretty good, so although a sequence of load/compare/branch instructions isn't free, it's not necessarily as bad as you think. Plus, there may be some existing stats in some MemObjects that are redundant with respect to these (like bytes read and bytes written maybe?) and if we go back and get rid of those then we'd save a little overhead there.

> The current CommMonitor may not be ideal in that it requires config script
> changes, and it is indeed up to the user to determine which points in the
> memory system are of interest. However, I would expect no more than a handful
> monitors in a typical system, and the stats in the monitor could become
> multi-dimensional and indexed on the masterId in a future patch.

How many masters are there in a typical system anyway, and how many of them (1) issue a lot of requests and (2) are ones we're not likely to care about stats on?
These are the only cases where it would matter to not collect stats, IMO. My guess is that the only things that might fall into this category are fabric components like the Bus and *maybe* the Bridge. I think the CPUs are the things you would definitely want to be capturing stats on, and other things like DMA devices probably don't issue enough requests typically to matter (from a simulation performance perspective).

> As a minor side note, the structural CommMonitor also more closely resembles
> the bus monitors used in actual devices.

Only because in silicon it's too expensive to automatically put one in every bus master ;-). In software, it's not so bad...

As I said, I'm not convinced either way at this point, just trying to explore the issues a little more deeply. Thanks for bearing with me.

Steve

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
