Alex,

Below are the flow to disk logs from the broker, which is holding 3
million+ messages at this time. We only have one virtual host. Times are
in GMT. It looks like flow to disk is active for the whole virtual host
and not at the queue level.

When the same broker went OOM yesterday, I did not see any flow to disk
logs from the time it was started until it crashed (it crashed twice
within 4 hours).


4/19/17 4:17:43.509 AM INFO  [Housekeeping[test]] - [Housekeeping[test]]
BRK-1014 : Message flow to disk active :  Message memory use 3356539KB
exceeds threshold 3355443KB
4/19/17 2:31:13.502 AM INFO  [Housekeeping[test]] - [Housekeeping[test]]
BRK-1015 : Message flow to disk inactive : Message memory use 3354866KB
within threshold 3355443KB
4/19/17 2:28:43.511 AM INFO  [Housekeeping[test]] - [Housekeeping[test]]
BRK-1014 : Message flow to disk active :  Message memory use 3358509KB
exceeds threshold 3355443KB
4/19/17 2:20:13.500 AM INFO  [Housekeeping[test]] - [Housekeeping[test]]
BRK-1015 : Message flow to disk inactive : Message memory use 3353501KB
within threshold 3355443KB
4/19/17 2:18:13.500 AM INFO  [Housekeeping[test]] - [Housekeeping[test]]
BRK-1014 : Message flow to disk active :  Message memory use 3357544KB
exceeds threshold 3355443KB
4/19/17 2:08:43.501 AM INFO  [Housekeeping[test]] - [Housekeeping[test]]
BRK-1015 : Message flow to disk inactive : Message memory use 3353236KB
within threshold 3355443KB
4/19/17 2:08:13.501 AM INFO  [Housekeeping[test]] - [Housekeeping[test]]
BRK-1014 : Message flow to disk active :  Message memory use 3356704KB
exceeds threshold 3355443KB
4/19/17 2:00:43.500 AM INFO  [Housekeeping[test]] - [Housekeeping[test]]
BRK-1015 : Message flow to disk inactive : Message memory use 3353511KB
within threshold 3355443KB
4/19/17 2:00:13.504 AM INFO  [Housekeeping[test]] - [Housekeeping[test]]
BRK-1014 : Message flow to disk active :  Message memory use 3357948KB
exceeds threshold 3355443KB
4/19/17 1:50:43.501 AM INFO  [Housekeeping[test]] - [Housekeeping[test]]
BRK-1015 : Message flow to disk inactive : Message memory use 3355310KB
within threshold 3355443KB
4/19/17 1:47:43.501 AM INFO  [Housekeeping[test]] - [Housekeeping[test]]
BRK-1014 : Message flow to disk active :  Message memory use 3365624KB
exceeds threshold 3355443KB
4/19/17 1:43:43.501 AM INFO  [Housekeeping[test]] - [Housekeeping[test]]
BRK-1015 : Message flow to disk inactive : Message memory use 3355136KB
within threshold 3355443KB
4/19/17 1:31:43.509 AM INFO  [Housekeeping[test]] - [Housekeeping[test]]
BRK-1014 : Message flow to disk active :  Message memory use 3358683KB
exceeds threshold 3355443KB
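
A quick sanity check on the threshold in these logs (my own arithmetic,
assuming the 4g max direct memory from our setup described further down):
3355443KB is almost exactly 80% of 4GiB (4194304KB x 0.80 = 3355443KB), so
the threshold looks like it is derived from the broker-wide direct memory
limit rather than from any per-queue setting.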


Since the production release (2 days back), we have seen 4 crashes across
3 different brokers. This is the most pressing concern for us in deciding
whether we should roll back to 0.32. Any help is greatly appreciated.

Thanks
Ramayan

On Wed, Apr 19, 2017 at 9:36 AM, Oleksandr Rudyy <[email protected]> wrote:

> Ramayan,
> Thanks for the details. I would like to clarify: was flow to disk
> triggered today for the 3 million messages?
>
> The following logs are issued for flow to disk:
> BRK-1014 : Message flow to disk active :  Message memory use {0,number,#}KB
> exceeds threshold {1,number,#.##}KB
> BRK-1015 : Message flow to disk inactive : Message memory use
> {0,number,#}KB within threshold {1,number,#.##}KB
>
> Kind Regards,
> Alex
>
>
> On 19 April 2017 at 17:10, Ramayan Tiwari <[email protected]>
> wrote:
>
> > Hi Alex,
> >
> > Thanks for your response, here are the details:
> >
> > We use a "direct" exchange, without persistence (we specify
> > NON_PERSISTENT while sending from the client), and use the BDB store. We
> > use the JSON virtual host type. We are not using SSL.
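> >
> > For reference, the delivery mode is set on the producer along roughly
> > these lines (a simplified sketch of our sending path, not the actual
> > code; the class name is illustrative):
> >
> > import javax.jms.Connection;
> > import javax.jms.DeliveryMode;
> > import javax.jms.MessageProducer;
> > import javax.jms.Queue;
> > import javax.jms.Session;
> >
> > public class NonPersistentSender
> > {
> >     // Publish with NON_PERSISTENT delivery so the broker does not have
> >     // to write the payload to the message store.
> >     public static void send(Connection connection, Queue queue, String payload) throws Exception
> >     {
> >         Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
> >         MessageProducer producer = session.createProducer(queue);
> >         producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT);
> >         producer.send(session.createTextMessage(payload));
> >         session.close();
> >     }
> > }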
> >
> > When the broker went OOM, we had around 1.3 million messages with an
> > average message size of 100 bytes. Direct memory allocation (value read
> > from the MBean) kept going up, even though it shouldn't need more DM to
> > store that many messages. DM allocated stayed at 99% for about three and
> > a half hours before the crash.
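> >
> > (One straightforward way to sample a "DM allocated" figure from the JVM
> > is the direct buffer pool MXBean; the snippet below is a minimal
> > standalone illustration of that, not our actual monitoring code.)
> >
> > import java.lang.management.BufferPoolMXBean;
> > import java.lang.management.ManagementFactory;
> >
> > public class DirectMemorySampler
> > {
> >     public static void main(String[] args)
> >     {
> >         // The JDK exposes direct buffer usage via a platform MXBean;
> >         // totalCapacity is the native memory reserved by live direct buffers.
> >         for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class))
> >         {
> >             if ("direct".equals(pool.getName()))
> >             {
> >                 System.out.printf("direct buffers: count=%d used=%dKB capacity=%dKB%n",
> >                         pool.getCount(),
> >                         pool.getMemoryUsed() / 1024,
> >                         pool.getTotalCapacity() / 1024);
> >             }
> >         }
> >     }
> > }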
> >
> > Today, on the same broker, we have 3 million messages (same message
> > size) and DM allocated is only at 8%. It looks like there is an issue
> > with de-allocation, or a leak.
> >
> > I have uploaded the memory utilization graph here:
> > https://drive.google.com/file/d/0Bwi0MEV3srPRVHFEbDlIYUpLaUE/view?usp=sharing
> > The blue line is DM allocated, yellow is DM used (sum of queue payloads),
> > and red is heap usage.
> >
> > Thanks
> > Ramayan
> >
> > On Wed, Apr 19, 2017 at 4:10 AM, Oleksandr Rudyy <[email protected]>
> wrote:
> >
> > > Hi Ramayan,
> > >
> > > Could you please share with us the details of the messaging use case(s)
> > > which ended up in the OOM on the broker side?
> > > I would like to reproduce the issue on my local broker in order to fix
> > > it.
> > > I would appreciate it if you could provide as many details as possible,
> > > including messaging topology, message persistence type, message sizes,
> > > volumes, etc.
> > >
> > > Qpid Broker 6.0.x uses direct memory for keeping message content and for
> > > receiving/sending data. Each plain connection utilizes 512K of direct
> > > memory. Each SSL connection uses 1M of direct memory. Your memory
> > > settings look OK to me.
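> > >
> > > As a rough sanity check against the baseline figure reported in your
> > > original email below (my arithmetic, assuming all ~400 connections are
> > > plain): 400 connections x 512KB = ~200MB of direct memory for network
> > > buffers alone, before any message content, which is in line with the
> > > ~230MB baseline you observed.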
> > >
> > > Kind Regards,
> > > Alex
> > >
> > >
> > > On 18 April 2017 at 23:39, Ramayan Tiwari <[email protected]>
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > We are using Java broker 6.0.5, with a patch to use the
> > > > MultiQueueConsumer feature. We just finished deploying to production
> > > > and saw a couple of instances of broker OOM due to running out of
> > > > direct memory buffers (exceptions at the end of this email).
> > > >
> > > > Here is our setup:
> > > > 1. Max heap 12g, max direct memory 4g (this is the opposite of the
> > > > recommendation; however, for our use case the message payload is
> > > > really small, ~400 bytes, and is well below the per-message overhead
> > > > of 1KB). In perf testing, we were able to put 2 million messages
> > > > without any issues.
> > > > 2. ~400 connections to the broker.
> > > > 3. Each connection has 20 sessions and there is one multi-queue
> > > > consumer attached to each session, listening to around 1000 queues.
> > > > 4. We are still using the 0.16 client (I know).
> > > >
> > > > With the above setup, the baseline direct memory utilization (without
> > > > any messages) was around 230MB (with 410 connections each taking
> > > > 500KB).
> > > >
> > > > Based on our understanding of broker memory allocation, message
> > > > payload should be the only thing adding to direct memory utilization
> > > > (on top of the baseline); however, we are experiencing something
> > > > completely different. In our last broker crash, we saw that the broker
> > > > was constantly running with 90%+ direct memory allocated, even when
> > > > the sum of message payloads across all queues was only 6-8% (these
> > > > percentages are against the available DM of 4gb). During these periods
> > > > of high DM usage, heap usage was around 60% (of 12gb).
> > > >
> > > > We would like some help in understanding what could be the reason for
> > > > these high DM allocations. Are there things other than message
> > > > payloads and AMQP connections that use DM and could be contributing to
> > > > this high usage?
> > > >
> > > > Another thing we are puzzled by is the de-allocation of DM byte
> > > > buffers. From log mining of heap and DM utilization, de-allocation of
> > > > DM doesn't correlate with heap GC. If anyone has seen any
> > > > documentation related to this, it would be very helpful if you could
> > > > share it.
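> > > >
> > > > One general JVM point that may be relevant here (an illustration of
> > > > standard JDK behaviour, not anything Qpid-specific): the native memory
> > > > behind a direct ByteBuffer is only released once the buffer object
> > > > itself becomes unreachable and is collected, so pooled or
> > > > still-referenced buffers keep their native memory regardless of heap
> > > > GC activity. A minimal sketch:
> > > >
> > > > import java.lang.management.BufferPoolMXBean;
> > > > import java.lang.management.ManagementFactory;
> > > > import java.nio.ByteBuffer;
> > > > import java.util.ArrayList;
> > > > import java.util.List;
> > > >
> > > > public class DirectBufferReclaimDemo
> > > > {
> > > >     public static void main(String[] args) throws Exception
> > > >     {
> > > >         List<ByteBuffer> buffers = new ArrayList<>();
> > > >         for (int i = 0; i < 100; i++)
> > > >         {
> > > >             buffers.add(ByteBuffer.allocateDirect(1024 * 1024)); // 100 x 1MB
> > > >         }
> > > >         printDirectUsage("after allocation");
> > > >
> > > >         buffers.clear(); // drop the only references to the buffers
> > > >         System.gc();     // native memory is freed only once the buffers are collected
> > > >         Thread.sleep(1000);
> > > >         printDirectUsage("after clearing references and GC");
> > > >     }
> > > >
> > > >     private static void printDirectUsage(String label)
> > > >     {
> > > >         for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class))
> > > >         {
> > > >             if ("direct".equals(pool.getName()))
> > > >             {
> > > >                 System.out.printf("%s: used=%dKB%n", label, pool.getMemoryUsed() / 1024);
> > > >             }
> > > >         }
> > > >     }
> > > > }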
> > > >
> > > > Thanks
> > > > Ramayan
> > > >
> > > >
> > > > *Exceptions*
> > > >
> > > > java.lang.OutOfMemoryError: Direct buffer memory
> > > >     at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.8.0_40]
> > > >     at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) ~[na:1.8.0_40]
> > > >     at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311) ~[na:1.8.0_40]
> > > >     at org.apache.qpid.bytebuffer.QpidByteBuffer.allocateDirect(QpidByteBuffer.java:474) ~[qpid-common-6.0.5.jar:6.0.5]
> > > >     at org.apache.qpid.server.transport.NonBlockingConnectionPlainDelegate.restoreApplicationBufferForWrite(NonBlockingConnectionPlainDelegate.java:93) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > >     at org.apache.qpid.server.transport.NonBlockingConnectionPlainDelegate.processData(NonBlockingConnectionPlainDelegate.java:60) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > >     at org.apache.qpid.server.transport.NonBlockingConnection.doRead(NonBlockingConnection.java:506) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > >     at org.apache.qpid.server.transport.NonBlockingConnection.doWork(NonBlockingConnection.java:285) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > >     at org.apache.qpid.server.transport.NetworkConnectionScheduler.processConnection(NetworkConnectionScheduler.java:124) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > >     at org.apache.qpid.server.transport.SelectorThread$ConnectionProcessor.processConnection(SelectorThread.java:504) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > >     at org.apache.qpid.server.transport.SelectorThread$SelectionTask.performSelect(SelectorThread.java:337) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > >     at org.apache.qpid.server.transport.SelectorThread$SelectionTask.run(SelectorThread.java:87) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > >     at org.apache.qpid.server.transport.SelectorThread.run(SelectorThread.java:462) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_40]
> > > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_40]
> > > >     at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_40]
> > > >
> > > >
> > > >
> > > > *Second exception*
> > > > java.lang.OutOfMemoryError: Direct buffer memory
> > > >     at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.8.0_40]
> > > >     at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) ~[na:1.8.0_40]
> > > >     at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311) ~[na:1.8.0_40]
> > > >     at org.apache.qpid.bytebuffer.QpidByteBuffer.allocateDirect(QpidByteBuffer.java:474) ~[qpid-common-6.0.5.jar:6.0.5]
> > > >     at org.apache.qpid.server.transport.NonBlockingConnectionPlainDelegate.<init>(NonBlockingConnectionPlainDelegate.java:45) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > >     at org.apache.qpid.server.transport.NonBlockingConnection.setTransportEncryption(NonBlockingConnection.java:625) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > >     at org.apache.qpid.server.transport.NonBlockingConnection.<init>(NonBlockingConnection.java:117) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > >     at org.apache.qpid.server.transport.NonBlockingNetworkTransport.acceptSocketChannel(NonBlockingNetworkTransport.java:158) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > >     at org.apache.qpid.server.transport.SelectorThread$SelectionTask$1.run(SelectorThread.java:191) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > >     at org.apache.qpid.server.transport.SelectorThread.run(SelectorThread.java:462) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_40]
> > > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_40]
> > > >     at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_40]
> > > >
> > >
> >
>
