[ https://issues.apache.org/jira/browse/ARTEMIS-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Clebert Suconic closed ARTEMIS-3647.
------------------------------------
    Fix Version/s: 2.21.0
       Resolution: Fixed

> rolledbackMessageRefs can grow until OOM with OpenWire clients
> --------------------------------------------------------------
>
>                 Key: ARTEMIS-3647
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3647
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>            Reporter: Anton Roskvist
>            Priority: Major
>             Fix For: 2.21.0
>
>          Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> In my use case I have quite a few long-lived OpenWire consumers. I noticed
> that over time the heap usage increases. Looking through a heap dump, I
> found that memory is held in "rolledbackMessageRefs". In one case holding
> as much as 1.6GB of data with 0 messages on queue.
> Disconnecting the consumer and then reconnecting released the memory.
> Clients are running Spring with transactions. The clients affected by this
> have some small issue receiving messages such that some of them are retried
> a couple of times before getting processed properly.
> I suspect that "rolledbackMessageRefs" are not getting cleared with the
> message ref once it's finally processed for some reason.
> {-}I have not found a way to reproduce this yet and it happens over several
> days.{-}
> UPDATE: I can easily reproduce this by setting up a standalone Artemis
> broker with "out-of-the-box"-configuration and using these tools:
> -- [https://github.com/erik-wramner/JmsTools] (AmqJmsConsumer and optionally
> AmqJmsProducer)
> 1. Start the broker
> 2. Send 100k messages to "queue://TEST"
> {code:java}
> # java -jar JmsTools/shaded-jars/AmqJmsProducer.jar -url "tcp://localhost:61616" -user USER -pw PASSWORD -queue TEST -count 100000{code}
> 3. Receive one more message than produced and do a rollback on 30% of them
> (unrealistic, but means this can be done in minutes instead of days. Receive
> one more to ensure consumer stays live)
> {code:java}
> # java -jar JmsTools/shaded-jars/AmqJmsConsumer.jar -url "tcp://localhost:61616?jms.prefetchPolicy.all=100&jms.nonBlockingRedelivery=true" -user USER -pw PASSWORD -queue TEST -count 100001 -rollback 30{code}
> 4. Wait until no more messages are left on "queue://TEST" (a few might be on
> DLQ but that's okay)
> 5. Get a heap dump with the consumer still connected
> {code:java}
> # jmap -dump:format=b,file=dump.hprof Artemis_PID{code}
> 6. Running "Leak suspects" with MAT will show a (relatively) large amount of
> memory held by "rolledbackMessageRefs" for the consumer connected to
> queue://TEST
> The consumer is run with "jms.nonBlockingRedelivery=true" to speed things
> up, though it should not be strictly needed.
> As an added bonus this also shows that the prefetch limit
> "jms.prefetchPolicy.all=100" is not respected while messages are in the
> redelivery process, which can easily be seen in the console's
> "Attributes"-section for the queue. This is also true for the default
> prefetch value of 1000.
> Br,
> Anton



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
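
The suspicion stated in the report (references added to "rolledbackMessageRefs" on rollback but not cleared once the message is finally processed) can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not Artemis source code, the class and method names (RolledBackRefsSketch, ToyConsumer, retainedAfter) are hypothetical, and the 30% rollback is modelled deterministically rather than the way the JmsTools consumer does it. It says nothing about how the issue was actually fixed in 2.21.0.

```java
import java.util.HashSet;
import java.util.Set;

public class RolledBackRefsSketch {

    // Hypothetical stand-in for a long-lived consumer that tracks rolled-back refs.
    static final class ToyConsumer {
        final Set<Long> rolledbackMessageRefs = new HashSet<>();
        final boolean clearOnAck;

        ToyConsumer(boolean clearOnAck) {
            this.clearOnAck = clearOnAck;
        }

        // A rollback records the message reference.
        void rollback(long messageId) {
            rolledbackMessageRefs.add(messageId);
        }

        // An ack only forgets the reference when clearOnAck is set.
        void ack(long messageId) {
            if (clearOnAck) {
                rolledbackMessageRefs.remove(messageId);
            }
        }
    }

    // Delivers `count` messages; 3 out of every 10 are rolled back once,
    // then acked on redelivery. Returns how many refs are still retained.
    static int retainedAfter(int count, boolean clearOnAck) {
        ToyConsumer consumer = new ToyConsumer(clearOnAck);
        for (long id = 0; id < count; id++) {
            if (id % 10 < 3) {
                consumer.rollback(id);
            }
            consumer.ack(id);
        }
        return consumer.rolledbackMessageRefs.size();
    }

    public static void main(String[] args) {
        System.out.println("retained without cleanup: " + retainedAfter(100_000, false));
        System.out.println("retained with cleanup:    " + retainedAfter(100_000, true));
    }
}
```

In this model the set keeps one entry per rolled-back message for as long as the consumer object lives, which matches the observation that memory was released only when the consumer disconnected and reconnected.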