Github user franz1981 commented on the issue: https://github.com/apache/activemq-artemis/pull/1918 @clebertsuconic Yep , don't worry I'm glad that you have taken a look on it! Will answer inline.. > i - There's a buffer sitting on the ProtonSenderContext now that may be unused for a while. What if you had 1000 connections at one time.. and then all of them go inactive? I don't understand how this will be Reducing footprint. I see quite the contrary here. I think that optimizing for the common case would be the bigger fish, but I don't know if what you have described is the common pattern...it is? If yes you're right, but consider that although Netty will release it to the pool, the memory footprint won't goes away. To solve it I will use `SoftReference` that will react on memory pressure demands, just to be safe. Re memory footprint, it helps due to how Netty heap pools work: just allocating 10 bytes on it from any thread will trigger all the netty heap arenas to be allocated: that means to have more than (using the default config) 16 MB * 16 = 250 MB of allocated heaps just sitting to perform buffer copies. And the sad fact is that it will add a constant memory footprint to Artemis when the broker is idle for real too. Let me show you it... That's on master (with the patch of Tim), after forcing a Full GC: 250 MB of Netty arena remaining allocated ![image](https://user-images.githubusercontent.com/13125299/36903689-ba230bc2-1e2e-11e8-8976-a79b87813411.png) That's is with this patch (and Tim commit too): all the allocated bytes disappear ![image](https://user-images.githubusercontent.com/13125299/36903735-dcbab05e-1e2e-11e8-8e21-2af7ff8c9427.png) For the graph is not visible the throughput improvement and we can just focus just on memory: it is pretty visbile that the memory used is lower than with pooled Netty ByteBuf too and that's because with too many threads (producers/consumers) the Netty pool isn't able to scale and will fallback to produce garbage. I hope to have explained my concerns. > ii - It is not clear to me when the buffer would be busy, since the ProtonSenderContext represents a single consumer. Maybe it would.. but it shouldn't. if that's happening you will be sending a large heap buffer to GC. That's why I needed you and @tabish121 :P : if it is uncontended the logic could be made much simpler. From my tests (with 64 pairs of producers/consumers) seems uncontended so I have supposed that the "busy" case is the uncommon one and just having one byte created that will die very soon isn't a big deal for any GC (if is uncommon). Re how much big, with G1 (our defualt GC) `-XX:G1HeapRegionSize` is 2 MB AFAIK and any byte allocation > 50% of it ie 1MB is "less optimized". Let me re-write it with `SoftReference` and without 'AtomicReference` (it isn't contended) and it could address all your point, thanks!!