Quoting Steve Reinhardt <[email protected]>:

On Tue, May 29, 2012 at 8:45 AM, nathan binkert <[email protected]> wrote:

> - Those DynInst structures do seem huge.  Slimming them down seems like it
> would be a big win.
Sometimes structs are just big.  Think about the task struct in Linux.
Slimming down isn't as important as reordering the values in the
structure so that the important ones are in the front and packed
densely so that the unimportant stuff gets evicted from the cache.


By "slimming down" I just meant making it smaller, including reordering
fields for better packing as well as getting rid of redundant info and
using denser encodings (e.g., a bitvector instead of bools).
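
To make the packing point concrete, here is a minimal sketch. The field names are made up for illustration and are not the real DynInst members; the exact sizes depend on the ABI, but on common 32- and 64-bit targets the second layout comes out smaller because the padding around the bools disappears.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical fields, NOT the actual DynInst layout.
// Bools interleaved with wider fields force padding after each one.
struct LooseInst {
    bool     issued;
    uint64_t seqNum;
    bool     executed;
    uint32_t threadId;
    bool     squashed;
};

// Same information: widest fields first, bools folded into one byte of flags.
struct PackedInst {
    uint64_t seqNum;
    uint32_t threadId;
    uint8_t  flags;

    enum : uint8_t { Issued = 1 << 0, Executed = 1 << 1, Squashed = 1 << 2 };

    void set(uint8_t f)        { flags |= f; }
    void clear(uint8_t f)      { flags &= static_cast<uint8_t>(~f); }
    bool test(uint8_t f) const { return (flags & f) != 0; }
};

// Small self-check that the flag helpers behave like the original bools.
inline bool packedFlagsRoundTrip()
{
    PackedInst p{42, 0, 0};
    p.set(PackedInst::Issued);
    p.set(PackedInst::Squashed);
    p.clear(PackedInst::Issued);
    assert(!p.test(PackedInst::Issued));
    return p.test(PackedInst::Squashed) && !p.test(PackedInst::Executed);
}
```

Comparing sizeof(LooseInst) against sizeof(PackedInst) on the build machine is a cheap way to see how much of the struct is padding before doing anything more invasive.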

Reordering things for better spatial locality within the struct makes sense
too, but seems like more of a second-order optimization.  Seems like there
ought to be tools for this... combine some compiler info with a
valgrind-like cache simulation and you ought to be able to calculate an
optimal structure layout.  I'd be surprised if that doesn't already exist
somewhere (in some PLDI paper if nowhere else).


> - I wouldn't take it on faith that FastAlloc is faster.  It might be, but I
> wrote that a *long* time ago, and malloc has probably improved in the
> interim.  It's easy to compile without it (I think there's a NO_FAST_ALLOC
> flag); it would be interesting to see if that matters.  Of course, if most
> of the allocations aren't using it anyway, maybe that won't tell the whole
> story.
My guess is that we should just stop using it since we're heading
towards multithreaded code anyway.  I'd be quite surprised if it were
better than malloc.


Good point, I wasn't thinking about thread safety.  It'd be nice to see
some actual data, but I also would not be surprised if it's not buying us
much, and would be OK with seeing it put to bed if it's not.  Sort of sad,
as it's some of the oldest code we have: I originally wrote it for WWT when
I was a grad student.  It's also some of the first C++ I ever wrote;
certainly the most complex thing I'd done in C++ up to that point in time.


One real win that we could get is if we could
take advantage of partially constructed objects (I think that was in
the original slab allocator paper).


Yeah, strangely I can't find any C++ allocators that appear to support the
partial initialization feature.
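
For reference, a rough sketch of what reusing partially constructed objects could look like in C++. This is only an illustration of the idea, not FastAlloc or any existing allocator; Inst, InstCache, acquire and release are hypothetical names, and the cache is not thread safe. Released objects keep their expensive state, so a reused object only needs the cheap per-use reset.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical object: 'regs' stands in for the expensive-to-build part,
// 'seqNum' for the cheap state that changes on every use.
struct Inst {
    std::vector<int> regs;
    int seqNum = 0;
    static int constructions;            // how many times the constructor ran

    Inst() : regs(64, 0) { ++constructions; }
    void reset(int seq) { seqNum = seq; } // cheap re-initialization only
};
int Inst::constructions = 0;

// Free list of already-constructed objects (single-threaded sketch).
class InstCache {
  public:
    std::unique_ptr<Inst> acquire(int seq)
    {
        std::unique_ptr<Inst> inst;
        if (free_.empty()) {
            inst.reset(new Inst());       // full construction only on a miss
        } else {
            inst = std::move(free_.back());
            free_.pop_back();             // reuse: constructor is skipped
        }
        inst->reset(seq);
        return inst;
    }

    // Hand the object back still constructed, instead of destroying it.
    void release(std::unique_ptr<Inst> inst) { free_.push_back(std::move(inst)); }

  private:
    std::vector<std::unique_ptr<Inst>> free_;
};
```

Whether this buys anything depends on how expensive construction really is compared to the reset, which is the kind of thing that would need measuring.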

We could also consider using another non-standard memory allocator if the
glibc malloc isn't good enough.  I checked out Hoard but unfortunately it's
GPL.  Of course, sticking with the default malloc is preferable unless the
gains from switching are pretty significant.


> - I wonder if all the ref-counting pointer stuff is because we're copying
> the pointers a lot instead of sharing references to them (e.g., as
> parameters to short function calls).  Basically you end up incrementing
> then decrementing the ref count if you pass a pointer by value instead of by
> reference, IIRC.
This is an optimization that could lead to confusion if all are not on
board:
There really is no reason that we should be using RefCountingPtr
instead of a regular pointer when calling a function or as a local
variable in a function.  RefCountingPtr really should only be used
when saving a pointer in some sort of data structure (e.g., as a member
variable of a class, or as a value in some STL container).  Make sense?
We could of course start being more conscious of these rules and for
the parts of the system that make big use of these pointers, use the
regular ones.  I'd like to avoid confusion though.  Perhaps by doing
something like:
typedef RefCountingPtr<Foo> FooSavedPtr;
typedef Foo *FooTempPtr;
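
A small sketch of the cost being discussed. CountedPtr below is a stripped-down stand-in written for this example, not gem5's RefCountingPtr: it never deletes the object and only tracks the count, so the ref-count traffic is easy to observe. Passing it by value bumps the count on entry and drops it on exit; passing by const reference or as a raw pointer leaves the count alone.

```cpp
#include <cassert>

// Hypothetical counted object; 'increments' records total ref-count bumps.
struct Counted {
    int refs = 0;
    int increments = 0;
};

// Minimal stand-in for a ref-counting pointer (does not own or delete).
class CountedPtr {
  public:
    explicit CountedPtr(Counted *p) : p_(p) { acquire(); }
    CountedPtr(const CountedPtr &o) : p_(o.p_) { acquire(); }
    CountedPtr &operator=(const CountedPtr &) = delete;
    ~CountedPtr() { if (p_) --p_->refs; }

    Counted *get() const { return p_; }

  private:
    void acquire() { if (p_) { ++p_->refs; ++p_->increments; } }
    Counted *p_;
};

// By value: the copy increments on entry and decrements on return.
inline int useByValue(CountedPtr p) { return p.get()->refs; }

// By const reference: no copy, so no ref-count traffic.
inline int useByRef(const CountedPtr &p) { return p.get()->refs; }

// Raw pointer, in the spirit of the FooTempPtr typedef: also no traffic,
// but only safe while some counted pointer keeps the object alive.
inline int useRaw(Counted *p) { return p->refs; }
```

The by-reference version gets most of the saving without needing a second pointer type, as long as the callee doesn't stash the pointer somewhere that outlives the call.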


Yeah, that's pretty subtle.  I might prefer punting on RefCountingPtr and
using explicit ref incr/decr operations instead... it sort of ruins the
magic of RefCountingPtr if you have to think about when to use it and when
not to.

Steve
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev



I added these two arrays:

_flatDestRegIdx
_flatSrcRegIdx

which I think are only computed on the way to the renamed registers and otherwise not used. If that's still the case we could look at getting rid of them. They're nice to have around for debugging purposes, but I expect that's where a decent part of that space is going.

Gabe