Hi guys! I'm writing to get your opinion about something that looks like an "issue" I've found in the NettyConnection of Artemis (the InVM transport suffers from the same issue, but covers different use cases). If you put enough pressure on NettyConnection::createTransportBuffer with huge messages and/or a high count of live connections, there will be a moment when Artemis throws an OutOfDirectMemory exception because it has reached the limit imposed by the io.netty.maxDirectMemory property. I've noticed that the larger the speed gap between the journalling phase and message arrival (through the connection), the more likely the exception is to be thrown, but there are multiple scenarios that can make it happen.
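To make the failure mode concrete, here is a minimal, hypothetical sketch (plain JDK, not Netty's actual code; the class and method names are mine) of how a bounded direct-memory budget behaves: every buffer reserves bytes against a global limit, the way Netty accounts allocations against io.netty.maxDirectMemory, and once a burst of large messages outpaces the releases the next reservation fails — which is the point where Netty throws OutOfDirectMemoryError.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only: mimics a global direct-memory budget
// like the one Netty enforces via io.netty.maxDirectMemory.
public class DirectMemoryBudget {
    private final long limit;
    private final AtomicLong used = new AtomicLong();

    public DirectMemoryBudget(long limit) { this.limit = limit; }

    /** Reserve {@code bytes}; returns false once the budget is exhausted. */
    public boolean reserve(long bytes) {
        long prev, next;
        do {
            prev = used.get();
            next = prev + bytes;
            if (next > limit) return false; // Netty throws OutOfDirectMemoryError here
        } while (!used.compareAndSet(prev, next));
        return true;
    }

    /** Return bytes to the budget when a buffer is freed. */
    public void release(long bytes) { used.addAndGet(-bytes); }

    public static void main(String[] args) {
        DirectMemoryBudget budget = new DirectMemoryBudget(64 * 1024 * 1024);
        // Simulate a burst of 1 MiB messages arriving faster than the
        // journal frees them: the 65th reservation fails.
        int ok = 0;
        while (budget.reserve(1024 * 1024)) ok++;
        System.out.println("reservations before exhaustion: " + ok); // prints 64
    }
}
```

The point of the sketch is that the limit is global and shared across all connections, so it is the aggregate arrival/free imbalance that trips it, not any single message.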
IMHO this issue is not something that can be solved by switching to an unpooled heap ByteBuf allocator: that would just shift the responsibility to the JVM's GC, leading to even worse effects, like long and unpredictable GC pauses (e.g. due to faster aging of garbage and/or humongous-region exhaustion, depending on the rate/size/fragmentation of the byte[] allocations).

What I'm seeing is really a design need: an effective way to backpressure the different processing layers for messages. Netty doesn't provide one except in the form of that exception, but for an application like Artemis I think we need something that can be queried, monitored and tuned. Having an effective way to monitor and bound the latency between stages would enable the lower stages (i.e. the ones in front of the I/O) to choose a proper strategy, like dropping new connections when overloaded or blocking them until proper timeouts/freeing of resources.

I've noticed that all the concurrent stages of message processing are backed by unbounded queues (i.e. LinkedBlockingQueue or ConcurrentLinkedQueue) hidden behind the Executors/ExecutorServices, and that's the point that needs to be improved: switching to bounded queues and real message passing (not Runnables) between the stages would let the application know, in a simple and effective way, when the next stage can't keep up with the processing requests.

Sorry for the long post, but I've spent a lot of time figuring out (and trying/testing) different approaches before arriving at this conclusion... what do you think?

Cheers,
Franz

--
View this message in context: http://activemq.2283324.n4.nabble.com/DISCUSS-OutOfDirectMemory-on-high-memory-pressure-4-NettyConnection-tp4722964.html
Sent from the ActiveMQ - Dev mailing list archive at Nabble.com.
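P.S. As a rough illustration of the bounded-queue idea above (plain JDK, not Artemis code; all names are hypothetical): swapping the executor's implicit unbounded LinkedBlockingQueue for an ArrayBlockingQueue plus a rejection policy turns queue saturation into immediate backpressure on the producer, instead of unbounded buffering that eventually shows up as memory exhaustion.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of a bounded processing stage. With CallerRunsPolicy the
// submitting thread runs the task itself when the queue is full,
// which naturally slows the producer to the consumer's pace.
public class BoundedStage {
    /** Push {@code messages} tasks through a stage whose queue holds at most {@code capacity}. */
    static int runStage(int messages, int capacity) throws InterruptedException {
        AtomicInteger processed = new AtomicInteger();
        ThreadPoolExecutor stage = new ThreadPoolExecutor(
                1, 1,                                      // single worker, like a serialized stage
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(capacity),        // bounded, unlike the default unbounded queue
                new ThreadPoolExecutor.CallerRunsPolicy()  // backpressure instead of buffering
        );
        for (int i = 0; i < messages; i++) {
            stage.execute(processed::incrementAndGet);     // never buffers more than `capacity` tasks
        }
        stage.shutdown();
        stage.awaitTermination(10, TimeUnit.SECONDS);
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("processed: " + runStage(1000, 16)); // prints "processed: 1000"
    }
}
```

CallerRunsPolicy is only one possible strategy; the same bounded handoff could instead drop, block with a timeout, or surface a metric — which is exactly the kind of queryable/tunable policy choice I'm arguing for.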
