Couple more comments below.

On Thu, Nov 29, 2018 at 3:08 PM Gabe Black <gabebl...@google.com> wrote:

> I looked at the code as is, and Packets themselves aren't ever inherited by
> anything, and they maintain a stack (I think) of sender state objects which
> doesn't affect the size of the packet. The sender state is inherited in a
> bunch of places since that is often customized. As far as compressed
> packets, what would the benefit be? It seems like there would be a lot of
> overhead to compress and decompress them, and I don't think they ever get
> all that big. I'm also hesitant to add "final" in there since I think it
> would be potentially limiting, but sometimes sensible limits are useful
> :-).
>

Not to speak for Daniel too much, but I think he was planning on modeling
the *data* in the packet being compressed. Then, there may be a different
packet size depending on the compression ratio of the data, for instance. I
think this would be cleaner to implement by inheriting from packet than
adding the interface to the current Packet. But, this isn't a hard
requirement. It was just an example of why we might want to inherit from
Packet at some point.

>
> The reason exactly how big Request/Packets are is a concern is that I think
> you'd get a meaningful performance benefit if you could use extra knowledge
> to do things faster than just using new. For instance, if you knew that
> what you were allocating was always X bytes, then you could just have a
> bunch of blocks of X bytes laying around in a list and return one when
> asked. You wouldn't have to try to keep things compacted like new would, or
> keep track of (or even check) how big things were. That marginal bit of
> savings could make a big difference if allocations happened a lot,
> particularly if the type of allocations were ones new (and/or malloc if
> it's built on it) was bad at. My brief Googling says malloc is bad at lots
> of small allocations, for what that's worth.
>

We use tcmalloc as our default allocator right now. I wonder how that
compares. IIRC, when we moved from malloc to tcmalloc we got a ~10%
performance boost. At least, that's what Hansson said ;).


>
> Also, a custom allocator could make some types of debugging marginally
> harder. For instance, if you used valgrind, it would replace malloc/new,
> and that wouldn't be used in the typical way for those types. They would
> likely be mostly batch allocated at the start of simulation, and then just
> returned by the various mechanisms later. That would make information about
> where a particular Request object came from less useful, although you would
> still likely be able to see that it was accessed in an illegal way, and
> where that access happened.
>

This is a big downside. I've found memory errors with valgrind in packets
*many times* due to how we copy data in to/out of them. This might be a
show stopper for me.

As a side note, tcmalloc has bitten me a number of times here. I've run
valgind with tcmalloc and seen "0 bytes allocated and 0 bytes deleted"
after waiting for hours too many times. I have finally learned to recompile
everything using "--without-tcmalloc" before running valgind.


> I agree that taking an incremental approach and doing some experiments are
> worth while, and that wrapping request allocation in a small function would
> be a good first step. I can't promise a timeline, but hopefully I'll have
> something to share at some point in the future.
>
> Gabe
>
> On Thu, Nov 29, 2018 at 9:21 AM Nikos Nikoleris <nikos.nikole...@arm.com>
> wrote:
>
> > Hey all,
> >
> > The bigger opportunity is probably around packets as sometimes we are
> > allocating a number of them per request.
> >
> > A custom allocator would be a great contribution if it would make gem5
> > substantially faster. But if it proved make things only marginally
> > better, I agree with Jason, we would need to factor in the fact that we
> > will be making the code more fragile. Would debugging/profiling be
> > harder if we used a custom deallocator?
> >
> > If the size of the Packet is a concern, we will probably have to deal
> > with members that are not fixed in size such as the senderState. It
> > might also be worth looking into something similar for the pointers from
> > packets such as data.
> >
> > Packet and Request memory management is something that we have been
> > concerned about as well. Giacomo, recently, changed all naked Request
> > pointers for shared_ptr and now we don't have to manually deallocate
> > them. But when we tried to do this for Packet pointers (figuring out
> > when to de-allocate a packet is not trivial) there was a substantial
> > overhead.
> >
> > Nikos
> >
> >
> > On 29/11/2018 16:40, Jason Lowe-Power wrote:
> > > Hey Gabe,
> > >
> > > Are you thinking that a custom allocator would make a difference in
> terms
> > > of memory footprint or in terms of performance (or both)?
> > >
> > > A couple of thoughts:
> > > - I'm hesitant to put the final keyword on Packet. I think we could see
> > > some code cleanup by making Packet a polymorphic object (e.g., Daniel
> has
> > > been talking about implementing packets with compressed data).
> Obviously,
> > > using polymorphism isn't required, and I don't have any plans to make
> > this
> > > change in the near future. But it's something that I've been thinking
> > about.
> > > - I like the idea of requiring the use of a small function to allocate
> > > requests and packets. In many cases, for Packets, we use
> > > Packet::createRead()/createWrite(), which is a step in that direction.
> I
> > > think this would be a relatively non-controversial change moving in the
> > > direction of custom allocators. It may also allow us to do some testing
> > on
> > > this more easily.
> > >
> > > Another argument against custom allocators is that we'll be adding a
> > > significant amount of error-prone code. I like the idea of using
> standard
> > > libraries as much as possible to aid in understanding the code and
> > reducing
> > > the bug surface.
> > >
> > > That said, if we get a 20% speedup or a reduction in memory footprint
> > then
> > > this change would likely be worth it!
> > >
> > > Cheers,
> > > Jason
> > >
> > > On Wed, Nov 28, 2018 at 7:22 PM Gabe Black <gabebl...@google.com>
> wrote:
> > >
> > >> I was just wondering whether it would make sense to have custom
> > allocators
> > >> defined for Request and Packet types which would keep around a pool of
> > them
> > >> rather than defaulting to the normal allocator. I suspect since both
> > types
> > >> of objects are allocated very frequently this could save a lot of heap
> > >> overhead.
> > >>
> > >> I think there are two complications to doing this. First, we'd want to
> > make
> > >> sure Request and Packet objects are truly those objects and are of the
> > >> appropriate size. Subclasses could add new members and make the types
> > >> bigger. I think to ensure that, we'd need to add the "final" keyword
> > (added
> > >> in C++11) to ensure those types can't be subclassed and have
> > unpredictable
> > >> sizes.
> > >>
> > >> Second, we use make_shared a lot to build Requests, and that actually
> > >> allocates a larger amount of memory to hold the object and reference
> > count
> > >> information. It allocates that larger object with new, and it looks
> like
> > >> you're supposed to use allocate_shared if you want to use a custom
> > >> allocator. Writing a custom allocator looks relatively complicated,
> but
> > it
> > >> might make a big difference if it works correctly.
> > >>
> > >> Also we'd probably want to put the Request allocator incantation in a
> > small
> > >> function rather than calling make_shared or allocate_shared directly
> to
> > >> avoid a lot of boilerplate and churn if things need to change.
> > >>
> > >> Thoughts? Like I said I suspect this may make a significant
> difference,
> > but
> > >> it might not be easy to implement and may not actually make much of a
> > >> difference at all.
> > >>
> > >> Gabe
> > >> _______________________________________________
> > >> gem5-dev mailing list
> > >> gem5-dev@gem5.org
> > >> http://m5sim.org/mailman/listinfo/gem5-dev
> > > _______________________________________________
> > > gem5-dev mailing list
> > > gem5-dev@gem5.org
> > > http://m5sim.org/mailman/listinfo/gem5-dev
> > >
> > IMPORTANT NOTICE: The contents of this email and any attachments are
> > confidential and may also be privileged. If you are not the intended
> > recipient, please notify the sender immediately and do not disclose the
> > contents to any other person, use it for any purpose, or store or copy
> the
> > information in any medium. Thank you.
> > _______________________________________________
> > gem5-dev mailing list
> > gem5-dev@gem5.org
> > http://m5sim.org/mailman/listinfo/gem5-dev
> _______________________________________________
> gem5-dev mailing list
> gem5-dev@gem5.org
> http://m5sim.org/mailman/listinfo/gem5-dev
_______________________________________________
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev

Reply via email to