Still no file...

Glad to see you looking into this.  Quick thoughts:

- Those DynInst structures do seem huge.  Slimming them down seems like it
would be a big win.
- I wouldn't take it on faith that FastAlloc is faster.  It might be, but I
wrote that a *long* time ago, and malloc has probably improved in the
interim.  It's easy to compile without it (I think there's a NO_FAST_ALLOC
flag); it would be interesting to see if that matters.  Of course, if most
of the allocations aren't using it anyway, maybe that won't tell the whole
story.
- I wonder if all the ref-counting pointer stuff is because we're copying
the pointers a lot instead of sharing references to them (e.g., as
parameters to short function calls).  Basically you end up incrementing
then decrementing the ref count if you pass a pointer by value instead of
by reference, IIRC.
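
To illustrate the ref-counting point, here is a minimal sketch. It uses
std::shared_ptr purely as a stand-in for gem5's own ref-counting pointer
(an assumption; the two are implemented differently), and Inst and the
two helper functions are hypothetical names made up for the example. The
by-value call sees a count one higher than the caller's, which is the
extra increment/decrement pair; the by-reference call does not.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a dynamic instruction object.
struct Inst {
    int seqNum = 0;
};

// Pass by value: the smart pointer is copied, so the count is
// incremented on entry and decremented again when 'p' is destroyed.
long countSeenByValue(std::shared_ptr<Inst> p)
{
    return p.use_count();
}

// Pass by const reference: no copy is made, so the count is untouched.
long countSeenByRef(const std::shared_ptr<Inst> &p)
{
    return p.use_count();
}

// Small driver showing the difference for a single owner.
void demoRefCounts()
{
    auto inst = std::make_shared<Inst>();
    assert(inst.use_count() == 1);
    assert(countSeenByValue(inst) == 2);  // temporary extra reference
    assert(countSeenByRef(inst) == 1);    // no extra reference
    assert(inst.use_count() == 1);        // back to one after both calls
}
```

Whether this accounts for much of the profile would need measuring; the
sketch only shows where the extra count updates come from.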

Steve

On Tue, May 29, 2012 at 8:01 AM, Ali Saidi <[email protected]> wrote:

> Let's try to attach the file again...
>
> Ali
>
>
> On 5/29/12 10:27 AM, "Ali Saidi" <[email protected]> wrote:
>
> >
> >
> >We recently took a look at the callgraph from gem5 with an O3 cpu
> >and it's pretty startling (see attached picture). The majority of time
> >is spent in memory management. The biggest chunk of this is in fetch
> >when instructions are built; however, I assumed that FastAlloc would be
> >used. Nominally it would, except that with both ARM and x86 the size
> >of a DynInst is > 512 bytes, which is the max size FastAlloc handles.
> >Alpha seems to sneak under the limit, but either way it is astounding to
> >me that a single instruction requires over .5kB of storage. Doing some
> >quick math, if more than 64 dyninsts exist in the system they don't fit
> >in the L1 cache anymore. One thing we can do is increase the max size of
> >FastAlloc to 1kB, but it seems like we need to think about how to
> >slim down a DynInst. I've looked over it and it seems like we lose
> >around 48 bytes to alignment issues, as members are scattered throughout
> >and there are Addrs, bools and then more Addrs. It seems like changing
> >some of the bools we currently have to setters/getters with an
> >underlying bitvector might help, and we might want to think about
> >packing the most used members together as opposed to the somewhat random
> >approach we have right now. You could nearly halve it, if the processor
> >of interest doesn't have > 256 physical registers.
> >
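
A minimal sketch of the alignment point in the paragraph above, not the
actual DynInst layout. The member names, the Flags class, and the 64-bit
Addr typedef are all assumptions chosen for illustration. Interleaving
bools with 8-byte-aligned addresses forces padding after each bool, while
grouping the addresses and folding the bools into one flag byte behind
setters/getters avoids most of it. On a typical 64-bit ABI the two
structs below come out at 48 and 32 bytes, but the exact numbers depend
on the compiler and target.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;  // assumption: addresses are 64-bit integers

// Bools scattered between Addrs: each bool is followed by padding so
// that the next Addr lands on its alignment boundary.
struct Scattered {
    Addr pc;
    bool squashed;
    Addr effAddr;
    bool executed;
    Addr physAddr;
    bool committed;
};

// One byte of flag bits behind a getter/setter pair (hypothetical API).
class Flags {
  public:
    enum Bit : std::uint8_t {
        Squashed  = 1 << 0,
        Executed  = 1 << 1,
        Committed = 1 << 2
    };

    bool get(Bit b) const { return (bits & b) != 0; }

    void set(Bit b, bool v)
    {
        bits = static_cast<std::uint8_t>(v ? (bits | b) : (bits & ~b));
    }

  private:
    std::uint8_t bits = 0;
};

// Addrs grouped together, flags packed into a single trailing byte.
struct Packed {
    Addr pc;
    Addr effAddr;
    Addr physAddr;
    Flags flags;
};

static_assert(sizeof(Packed) < sizeof(Scattered),
              "grouping members and packing flags should shrink the struct");
```

The same reordering idea applies to any member narrower than Addr;
whether it recovers the full 48 bytes mentioned above would have to be
checked against the real class.
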
> >Furthermore, looking at the picture there seem to be plenty of other
> >places where there are a lot of calls to new (teal-ish)/free (orange).
> >It seems like we could certainly make more use of FastAlloc, assuming
> >it's actually helping.
> >
> >Thanks,
> >
> >Ali
> >
> >
>
>
> _______________________________________________
> gem5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/gem5-dev
>
>