Couple more comments below. On Thu, Nov 29, 2018 at 3:08 PM Gabe Black <gabebl...@google.com> wrote:
> I looked at the code as is, and Packets themselves aren't ever inherited by > anything, and they maintain a stack (I think) of sender state objects which > doesn't affect the size of the packet. The sender state is inherited in a > bunch of places since that is often customized. As far as compressed > packets, what would the benefit be? It seems like there would be a lot of > overhead to compress and decompress them, and I don't think they ever get > all that big. I'm also hesitant to add "final" in there since I think it > would be potentially limiting, but sometimes sensible limits are useful > :-). > Not to speak for Daniel too much, but I think he was planning on modeling the *data* in the packet being compressed. Then, there may be a different packet size depending on the compression ratio of the data, for instance. I think this would be cleaner to implement by inheriting from packet than adding the interface to the current Packet. But, this isn't a hard requirement. It was just an example of why we might want to inherit from Packet at some point. > > The reason exactly how big Request/Packets are is a concern is that I think > you'd get a meaningful performance benefit if you could use extra knowledge > to do things faster than just using new. For instance, if you knew that > what you were allocating was always X bytes, then you could just have a > bunch of blocks of X bytes laying around in a list and return one when > asked. You wouldn't have to try to keep things compacted like new would, or > keep track of (or even check) how big things were. That marginal bit of > savings could make a big difference if allocations happened a lot, > particularly if the type of allocations were ones new (and/or malloc if > it's built on it) was bad at. My brief Googling says malloc is bad at lots > of small allocations, for what that's worth. > We use tcmalloc as our default allocator right now. I wonder how that compares. IIRC, when we moved from malloc to tcmalloc we got a ~10% performance boost. At least, that's what Hansson said ;). > > Also, a custom allocator could make some types of debugging marginally > harder. For instance, if you used valgrind, it would replace malloc/new, > and that wouldn't be used in the typical way for those types. They would > likely be mostly batch allocated at the start of simulation, and then just > returned by the various mechanisms later. That would make information about > where a particular Request object came from less useful, although you would > still likely be able to see that it was accessed in an illegal way, and > where that access happened. > This is a big downside. I've found memory errors with valgrind in packets *many times* due to how we copy data in to/out of them. This might be a show stopper for me. As a side note, tcmalloc has bitten me a number of times here. I've run valgind with tcmalloc and seen "0 bytes allocated and 0 bytes deleted" after waiting for hours too many times. I have finally learned to recompile everything using "--without-tcmalloc" before running valgind. > I agree that taking an incremental approach and doing some experiments are > worth while, and that wrapping request allocation in a small function would > be a good first step. I can't promise a timeline, but hopefully I'll have > something to share at some point in the future. > > Gabe > > On Thu, Nov 29, 2018 at 9:21 AM Nikos Nikoleris <nikos.nikole...@arm.com> > wrote: > > > Hey all, > > > > The bigger opportunity is probably around packets as sometimes we are > > allocating a number of them per request. > > > > A custom allocator would be a great contribution if it would make gem5 > > substantially faster. But if it proved make things only marginally > > better, I agree with Jason, we would need to factor in the fact that we > > will be making the code more fragile. Would debugging/profiling be > > harder if we used a custom deallocator? > > > > If the size of the Packet is a concern, we will probably have to deal > > with members that are not fixed in size such as the senderState. It > > might also be worth looking into something similar for the pointers from > > packets such as data. > > > > Packet and Request memory management is something that we have been > > concerned about as well. Giacomo, recently, changed all naked Request > > pointers for shared_ptr and now we don't have to manually deallocate > > them. But when we tried to do this for Packet pointers (figuring out > > when to de-allocate a packet is not trivial) there was a substantial > > overhead. > > > > Nikos > > > > > > On 29/11/2018 16:40, Jason Lowe-Power wrote: > > > Hey Gabe, > > > > > > Are you thinking that a custom allocator would make a difference in > terms > > > of memory footprint or in terms of performance (or both)? > > > > > > A couple of thoughts: > > > - I'm hesitant to put the final keyword on Packet. I think we could see > > > some code cleanup by making Packet a polymorphic object (e.g., Daniel > has > > > been talking about implementing packets with compressed data). > Obviously, > > > using polymorphism isn't required, and I don't have any plans to make > > this > > > change in the near future. But it's something that I've been thinking > > about. > > > - I like the idea of requiring the use of a small function to allocate > > > requests and packets. In many cases, for Packets, we use > > > Packet::createRead()/createWrite(), which is a step in that direction. > I > > > think this would be a relatively non-controversial change moving in the > > > direction of custom allocators. It may also allow us to do some testing > > on > > > this more easily. > > > > > > Another argument against custom allocators is that we'll be adding a > > > significant amount of error-prone code. I like the idea of using > standard > > > libraries as much as possible to aid in understanding the code and > > reducing > > > the bug surface. > > > > > > That said, if we get a 20% speedup or a reduction in memory footprint > > then > > > this change would likely be worth it! > > > > > > Cheers, > > > Jason > > > > > > On Wed, Nov 28, 2018 at 7:22 PM Gabe Black <gabebl...@google.com> > wrote: > > > > > >> I was just wondering whether it would make sense to have custom > > allocators > > >> defined for Request and Packet types which would keep around a pool of > > them > > >> rather than defaulting to the normal allocator. I suspect since both > > types > > >> of objects are allocated very frequently this could save a lot of heap > > >> overhead. > > >> > > >> I think there are two complications to doing this. First, we'd want to > > make > > >> sure Request and Packet objects are truly those objects and are of the > > >> appropriate size. Subclasses could add new members and make the types > > >> bigger. I think to ensure that, we'd need to add the "final" keyword > > (added > > >> in C++11) to ensure those types can't be subclassed and have > > unpredictable > > >> sizes. > > >> > > >> Second, we use make_shared a lot to build Requests, and that actually > > >> allocates a larger amount of memory to hold the object and reference > > count > > >> information. It allocates that larger object with new, and it looks > like > > >> you're supposed to use allocate_shared if you want to use a custom > > >> allocator. Writing a custom allocator looks relatively complicated, > but > > it > > >> might make a big difference if it works correctly. > > >> > > >> Also we'd probably want to put the Request allocator incantation in a > > small > > >> function rather than calling make_shared or allocate_shared directly > to > > >> avoid a lot of boilerplate and churn if things need to change. > > >> > > >> Thoughts? Like I said I suspect this may make a significant > difference, > > but > > >> it might not be easy to implement and may not actually make much of a > > >> difference at all. > > >> > > >> Gabe > > >> _______________________________________________ > > >> gem5-dev mailing list > > >> gem5-dev@gem5.org > > >> http://m5sim.org/mailman/listinfo/gem5-dev > > > _______________________________________________ > > > gem5-dev mailing list > > > gem5-dev@gem5.org > > > http://m5sim.org/mailman/listinfo/gem5-dev > > > > > IMPORTANT NOTICE: The contents of this email and any attachments are > > confidential and may also be privileged. If you are not the intended > > recipient, please notify the sender immediately and do not disclose the > > contents to any other person, use it for any purpose, or store or copy > the > > information in any medium. Thank you. > > _______________________________________________ > > gem5-dev mailing list > > gem5-dev@gem5.org > > http://m5sim.org/mailman/listinfo/gem5-dev > _______________________________________________ > gem5-dev mailing list > gem5-dev@gem5.org > http://m5sim.org/mailman/listinfo/gem5-dev _______________________________________________ gem5-dev mailing list gem5-dev@gem5.org http://m5sim.org/mailman/listinfo/gem5-dev