Hi guys! I'm writing to get your opinion about something that looks like an "issue" I've found in the NettyConnection of Artemis (the InVM transport suffers from the same issue, but covers different use cases). If you put enough pressure on NettyConnection::createTransportBuffer with huge messages and/or a high count of live connections, there will be a moment when Artemis throws an OutOfDirectMemory exception because it has reached the limit imposed by the io.netty.maxDirectMemory property. I've noticed that the larger the speed gap between the journalling phase and message arrival (through the connection), the more likely the exception is to be thrown, but there are multiple scenarios that can make it happen.
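To make the failure mode concrete, here is a minimal, hypothetical sketch (plain JDK, not Netty's actual code; the class and method names are mine) of how a bounded direct-memory budget behaves: every buffer reserves bytes against a global limit, the way Netty accounts allocations against io.netty.maxDirectMemory, and once a burst of large messages outpaces the releases the next reservation fails — which is the point where Netty throws OutOfDirectMemoryError.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only: mimics a global direct-memory budget
// like the one Netty enforces via io.netty.maxDirectMemory.
public class DirectMemoryBudget {
    private final long limit;
    private final AtomicLong used = new AtomicLong();

    public DirectMemoryBudget(long limit) { this.limit = limit; }

    /** Reserve {@code bytes}; returns false once the budget is exhausted. */
    public boolean reserve(long bytes) {
        long prev, next;
        do {
            prev = used.get();
            next = prev + bytes;
            if (next > limit) return false; // Netty throws OutOfDirectMemoryError here
        } while (!used.compareAndSet(prev, next));
        return true;
    }

    /** Return bytes to the budget when a buffer is freed. */
    public void release(long bytes) { used.addAndGet(-bytes); }

    public static void main(String[] args) {
        DirectMemoryBudget budget = new DirectMemoryBudget(64 * 1024 * 1024);
        // Simulate a burst of 1 MiB messages arriving faster than the
        // journal frees them: the 65th reservation fails.
        int ok = 0;
        while (budget.reserve(1024 * 1024)) ok++;
        System.out.println("reservations before exhaustion: " + ok); // prints 64
    }
}
```

The point of the sketch is that the limit is global and shared across all connections, so it is the aggregate arrival/free imbalance that trips it, not any single message.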
IMHO this issue is not something that can be solved by switching to an unpooled heap ByteBuf allocator: that would just shift the responsibility to the JVM's GC, leading to even worse effects, like long and unpredictable GC pauses (e.g. due to faster aging of garbage and/or humongous-region exhaustion, depending on the rate/size/fragmentation of the byte[] allocations).

What I'm seeing is really a design need: an effective way to backpressure the different processing layers for messages. Netty doesn't provide one except in the form of that exception, but for an application like Artemis I think we need something that can be queried, monitored and tuned. Having an effective way to monitor and bound the latency between stages would enable the lower stages (i.e. the ones in front of the I/O) to choose a proper strategy, like dropping new connections when overloaded or blocking them until proper timeouts/freeing of resources.

I've noticed that all the concurrent stages of message processing are backed by unbounded queues (i.e. LinkedBlockingQueue or ConcurrentLinkedQueue) hidden behind the Executors/ExecutorServices, and that's the point that needs to be improved: switching to bounded queues and real message passing (not Runnables) between the stages would let the application know, in a simple and effective way, when the next stage can't keep up with the processing requests.

Sorry for the long post, but I've spent a lot of time figuring out (and trying/testing) different approaches before arriving at this conclusion... what do you think?

Cheers,
Franz

--
View this message in context: http://activemq.2283324.n4.nabble.com/DISCUSS-OutOfDirectMemory-on-high-memory-pressure-4-NettyConnection-tp4722964.html
Sent from the ActiveMQ - Dev mailing list archive at Nabble.com.
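P.S. As a rough illustration of the bounded-queue idea above (plain JDK, not Artemis code; all names are hypothetical): swapping the executor's implicit unbounded LinkedBlockingQueue for an ArrayBlockingQueue plus a rejection policy turns queue saturation into immediate backpressure on the producer, instead of unbounded buffering that eventually shows up as memory exhaustion.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of a bounded processing stage. With CallerRunsPolicy the
// submitting thread runs the task itself when the queue is full,
// which naturally slows the producer to the consumer's pace.
public class BoundedStage {
    /** Push {@code messages} tasks through a stage whose queue holds at most {@code capacity}. */
    static int runStage(int messages, int capacity) throws InterruptedException {
        AtomicInteger processed = new AtomicInteger();
        ThreadPoolExecutor stage = new ThreadPoolExecutor(
                1, 1,                                      // single worker, like a serialized stage
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(capacity),        // bounded, unlike the default unbounded queue
                new ThreadPoolExecutor.CallerRunsPolicy()  // backpressure instead of buffering
        );
        for (int i = 0; i < messages; i++) {
            stage.execute(processed::incrementAndGet);     // never buffers more than `capacity` tasks
        }
        stage.shutdown();
        stage.awaitTermination(10, TimeUnit.SECONDS);
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("processed: " + runStage(1000, 16)); // prints "processed: 1000"
    }
}
```

CallerRunsPolicy is only one possible strategy; the same bounded handoff could instead drop, block with a timeout, or surface a metric — which is exactly the kind of queryable/tunable policy choice I'm arguing for.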
