Github user franz1981 commented on the issue:
    @clebertsuconic Yep , don't worry I'm glad that you have taken a look on 
it! Will answer inline..
    > i - There's a buffer sitting on the ProtonSenderContext now that may be 
unused for a while. What if you had 1000 connections at one time.. and then all 
of them go inactive? I don't understand how this will be Reducing footprint. I 
see quite the contrary here.
    I think that optimizing for the common case would be the bigger fish, but I 
don't know if what you have described is the common is? If yes 
you're right, but consider that although Netty will release it to the pool, the 
memory footprint won't goes away. To solve it I will use `SoftReference` that 
will react on memory pressure demands, just to be safe.
    Re memory footprint, it helps due to how Netty heap pools work: just 
allocating 10 bytes on it from any thread will trigger all the netty heap 
arenas to be allocated: that means to have more than (using the default config) 
16 MB * 16 = 250 MB of allocated heaps just sitting to perform buffer copies.
    And the sad fact is that it will add a constant memory footprint to Artemis 
when the broker is idle for real too.
    Let me show you it...
    That's on master (with the patch of Tim), after forcing a Full GC: 250 MB 
of Netty arena remaining allocated
    That's is with this patch (and Tim commit too): all the allocated bytes 
    For the graph is not visible the throughput improvement and we can just 
focus just on memory: it is pretty visbile that the memory used is lower than 
with pooled Netty ByteBuf too and that's because with too many threads 
(producers/consumers) the Netty pool isn't able to scale and will fallback to 
produce garbage.
    I hope to have explained my concerns.
    > ii - It is not clear to me when the buffer would be busy, since the 
ProtonSenderContext represents a single consumer. Maybe it would.. but it 
shouldn't. if that's happening you will be sending a large heap buffer to GC.
    That's why I needed you and @tabish121 :P : if it is uncontended the logic 
could be made much simpler.
    From my tests (with 64 pairs of producers/consumers) seems uncontended so I 
have supposed that the "busy" case is the uncommon one and just having one 
byte[] created that will die very soon isn't a big deal for any GC (if is 
uncommon). Re how much big, with G1 (our defualt GC) `-XX:G1HeapRegionSize` is 
2 MB AFAIK and any byte[] allocation > 50% of it ie 1MB is "less optimized".
    Let me re-write it with `SoftReference` and without 'AtomicReference` (it 
isn't contended) and it could address all your point, thanks!!


Reply via email to