On Tue, May 29, 2012 at 8:45 AM, nathan binkert <[email protected]> wrote:
> > - Those DynInst structures do seem huge. Slimming them down seems
> > like it would be a big win.
>
> Sometimes structs are just big.  Think about the task struct in linux.
> Slimming down isn't as important as reordering the values in the
> structure so that the important ones are in the front and packed
> densely so that the unimportant stuff gets evicted from the cache.

By "slimming down" I just meant making it smaller, including reordering
fields for better packing as well as getting rid of redundant info and
using denser encodings (e.g., a bitvector instead of bools). Reordering
things for better spatial locality within the struct makes sense too,
but seems like more of a second-order optimization.

Seems like there ought to be tools for this... combine some compiler
info with a valgrind-like cache simulation and you ought to be able to
calculate an optimal structure layout. I'd be surprised if that doesn't
already exist somewhere (in some PLDI paper if nowhere else).

> > - I wouldn't take it on faith that FastAlloc is faster. It might be,
> > but I wrote that a *long* time ago, and malloc has probably improved
> > in the interim. It's easy to compile without it (I think there's a
> > NO_FAST_ALLOC flag); it would be interesting to see if that matters.
> > Of course, if most of the allocations aren't using it anyway, maybe
> > that won't tell the whole story.
>
> My guess is that we should just stop using it since we're heading
> towards multithreaded code anyway.  I'd be quite surprised if it were
> better than malloc.

Good point, I wasn't thinking about thread safety. It'd be nice to see
some actual data, but I also would not be surprised if it's not buying
us much, and would be OK with seeing it put to bed if it's not. Sort of
sad, as it's some of the oldest code we have: I originally wrote it for
WWT when I was a grad student. It's also some of the first C++ I ever
wrote; certainly the most complex thing I'd done in C++ up to that
point in time.
> One real win that we could get is if we could
> take advantage of partially constructed objects (I think that was in
> the original slab allocator paper).

Yea, strangely I can't find any C++ allocators that appear to support
the partial-initialization feature. We could also consider using
another non-standard memory allocator if the glibc malloc isn't good
enough. I checked out Hoard but unfortunately it's GPL. Of course,
sticking with the default malloc is preferable unless the gains from
switching are pretty significant.

> > - I wonder if all the ref-counting pointer stuff is because we're
> > copying the pointers a lot instead of sharing references to them
> > (e.g., as parameters to short function calls). Basically you end up
> > incrementing then decrementing the ref count if you pass a pointer
> > by value instead of by reference, IIRC.
>
> This is an optimization that could lead to confusion if all are not
> on board:
>
> There really is no reason that we should be using RefCountingPtr
> instead of a regular pointer when calling a function or as a local
> variable in a function.  RefCountingPtr really should only be used
> when saving a pointer in some sort of data structure (i.e. as a
> member variable of a class, or as a value in some STL struct.)  Make
> sense?
>
> We could of course start being more conscious of these rules and for
> the parts of the system that make big use of these pointers, use the
> regular ones.  I'd like to avoid confusion though.  Perhaps by doing
> something like:
>
> typedef RefCountingPtr<Foo> FooSavedPtr;
> typedef Foo *FooTempPtr;

Yea, that's pretty subtle. I might prefer punting on RefCountingPtr and
using explicit ref incr/decr operations instead... it sort of ruins the
magic of RefCountingPtr if you have to think about when to use it and
when not to.

Steve
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
