[
https://issues.apache.org/activemq/browse/AMQ-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rob Davies resolved AMQ-1925.
-----------------------------
Resolution: Fixed
Patch applied in SVN revision 693915.
> JDBC-Master/Slave Failover - Consumer stop after 1000 Messages
> --------------------------------------------------------------
>
> Key: AMQ-1925
> URL: https://issues.apache.org/activemq/browse/AMQ-1925
> Project: ActiveMQ
> Issue Type: Bug
> Components: Broker
> Affects Versions: 5.1.0
> Reporter: Mario Siegenthaler
> Assignee: Rob Davies
> Fix For: 5.3.0
>
> Attachments: AMQ1925Test.java, heapdump-1220373534484.hprof,
> patch-1925-1.diff, threaddump-1220371256910.tdump
>
>
> In a JDBC master/slave environment with ActiveMQ 5.1.0 (plus patches for 1710
> and 1838), failover for consumers works: the consumers resume receiving
> messages after the failover, but then they suddenly stop after approx. 1000
> messages (mostly 1000, one got to 1080). The consumers are using transacted
> sessions.
> The thread dump looks unsuspicious; everybody is waiting on the socket:
> java.lang.Thread.State: RUNNABLE
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:129)
> at org.apache.activemq.transport.tcp.TcpBufferedInputStream.fill(TcpBufferedInputStream.java:50)
> at org.apache.activemq.transport.tcp.TcpBufferedInputStream.read(TcpBufferedInputStream.java:58)
> at java.io.DataInputStream.readInt(DataInputStream.java:370)
> at org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269)
> at org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:203)
> at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:195)
> at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:183)
> at java.lang.Thread.run(Thread.java:619)
> A memory dump from the consumers shows that they've really run out of
> messages and are waiting for the broker to deliver new ones. I've attached
> both the thread dump and the heap dump to this issue (or better: I'll do so :)
> The broker doesn't do anything (also waits on the transport-socket), the
> queue has a full page-in buffer (100 messages) but obviously fails to do
> anything with it. If I manually trigger a doDispatch of all pagedIn messages
> (via the debugger, just a try to revive the thing) it returns doing nothing
> at all, since all subscriptions are full (s.isFull). I further investigated
> the issue and was confused to see the prefetchExtension field of the
> PrefetchSubscription having a value of -1000 (negative!). This explains why
> it was considered full:
> dispatched.size() - prefetchExtension >= info.getPrefetchSize()
> 0 - (-1000) >= 1000
> Quite nasty: even though the dispatched size was zero, the client didn't
> receive any new messages.
> The only place this value can become negative is inside acknowledge, where
> it's decremented (prefetchExtension--), all other places do a Math.max(0, X).
> So here's my guess at what happened: the clients had a full (1000 messages)
> prefetch buffer when I killed my master. As soon as the slave was done
> starting, they reconnected and started processing the messages in the
> prefetch buffer and acknowledging them. This gradually decremented the
> counter into a negative value, because the slave never got a chance to
> increment the prefetchExtension since it didn't actually deliver those
> messages.
> Possible solutions:
> - clear the prefetch buffer on a failover
> - just don't allow this value to become smaller than zero (not sure if that
> covers all bases)
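
The arithmetic in the report, and the second proposed solution, can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the Subscription class below is a hypothetical stand-in that models nothing but the two counters and the isFull comparison quoted above. It is not the real PrefetchSubscription, and it is not the patch applied in revision 693915.

```java
public class PrefetchExtensionSketch {

    // Hypothetical model of the counters described in the report.
    static final class Subscription {
        final int prefetchSize;
        final boolean clampAtZero;   // true = "don't allow the value below zero"
        int dispatched = 0;          // the new master has dispatched nothing yet
        int prefetchExtension = 0;

        Subscription(int prefetchSize, boolean clampAtZero) {
            this.prefetchSize = prefetchSize;
            this.clampAtZero = clampAtZero;
        }

        // An ack for a message that the old master delivered before failover,
        // so this broker never incremented prefetchExtension for it.
        void acknowledgeFromOldPrefetch() {
            int next = prefetchExtension - 1;
            prefetchExtension = clampAtZero ? Math.max(0, next) : next;
        }

        // Same comparison as quoted in the report.
        boolean isFull() {
            return dispatched - prefetchExtension >= prefetchSize;
        }
    }

    // Replays 1000 acks from a stale prefetch buffer and reports isFull().
    static boolean fullAfterFailover(boolean clampAtZero) {
        Subscription s = new Subscription(1000, clampAtZero);
        for (int i = 0; i < 1000; i++) {
            s.acknowledgeFromOldPrefetch();
        }
        return s.isFull();
    }

    public static void main(String[] args) {
        System.out.println("unclamped, full = " + fullAfterFailover(false));
        System.out.println("clamped,   full = " + fullAfterFailover(true));
    }
}
```

Without the clamp the counter ends at -1000, so 0 - (-1000) >= 1000 holds and the subscription looks full with nothing dispatched. With the clamp the counter stays at 0 and the check fails, so dispatch can continue. Whether clamping alone covers every case is left open in the report.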