[https://issues.apache.org/jira/browse/AMQ-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627877#comment-13627877]
Raul Kripalani commented on AMQ-4465:
-------------------------------------
At the risk of sparking an entirely different discussion: in my humble opinion,
the real culprit here is the store-and-forward technique. I think the AMQ model
may be fundamentally flawed for highly dynamic, elastic or cloud-like
scenarios, where consumers and producers can appear anywhere in the messaging
fabric and AMQ instances are provisioned and de-provisioned on the fly.
The replayWhenNoConsumers option was a solution for bouncing messages freely
across the cluster. But what we really need is for multiple ACTIVE brokers to
see a single view of reality, i.e. shared knowledge about which messages exist
and are pending delivery, which consumers are alive and where, etc.: a
messaging cloud.
In the era of big data and huge in-memory caches, this seems perfectly doable.
I'd advocate for a solution where:
- ACTIVE brokers can connect to a single cache/db; no more exclusivity or
master locks.
- Reads and writes must be atomic or transactional, but blazing fast in both
cases.
- All instances see all messages and consumers, but are responsible only for
local consumers. They decide when to pick a message from the cache and push it
to a consumer.
- May be embeddable, so that you don't have to start a separate process to use
AMQ OOTB.
- Can be persistent/non-persistent.
Many NoSQL databases and Java-based distributed cache technologies exist that
could fulfill these requirements (probably with some adaptations).
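To make this concrete, here is a minimal, purely hypothetical sketch (none of
this is existing AMQ code; SharedMessageStore and tryClaim are invented for
illustration) of how multiple ACTIVE brokers could claim pending messages
atomically from a shared store, with a ConcurrentHashMap standing in for the
distributed cache/db:
{code:java}
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for a shared cache/db visible to all ACTIVE brokers.
class SharedMessageStore {
    private final Map<String, String> payloads = new ConcurrentHashMap<>();
    // messageId -> brokerId of the instance that claimed it
    private final Map<String, String> owners = new ConcurrentHashMap<>();

    void publish(String messageId, String payload) {
        payloads.put(messageId, payload);
    }

    // Atomic claim: exactly one broker wins, with no master lock.
    // A real distributed cache would offer an equivalent
    // compare-and-set or conditional-write primitive.
    Optional<String> tryClaim(String messageId, String brokerId) {
        if (owners.putIfAbsent(messageId, brokerId) == null) {
            return Optional.ofNullable(payloads.get(messageId));
        }
        return Optional.empty(); // already owned by another broker
    }
}
{code}
Each broker instance would scan the store for pending messages and call
tryClaim only when it has a local consumer with capacity; because the claim is
atomic, no message is dispatched twice and no instance needs exclusive
ownership of the store.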
> Rethink replayWhenNoConsumers solution
> --------------------------------------
>
> Key: AMQ-4465
> URL: https://issues.apache.org/jira/browse/AMQ-4465
> Project: ActiveMQ
> Issue Type: Improvement
> Components: Broker
> Affects Versions: 5.8.0
> Reporter: Torsten Mielke
>
> I would like to start a discussion about the way we allow messages to be
> replayed back to the original broker in a broker network, i.e. setting
> replayWhenNoConsumers=true.
> This discussion is based on the blog post
> http://tmielke.blogspot.de/2012/03/i-have-messages-on-queue-but-they-dont.html
> but I will outline the full story here again.
> Consider a network of two brokers A and B.
> Broker A has a producer that sends one msg to queue Test.in. Broker B has a
> consumer connected, so the msg is transferred to broker B. Let's assume the
> consumer disconnects from B *before* it consumes the msg and reconnects to
> broker A. If broker B has replayWhenNoConsumers=true, the message will be
> replayed back to broker A.
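> For reference, a minimal sketch (assuming ActiveMQ's embedded-broker Java
> API; not taken from the original post) of how broker B might be configured,
> with replayWhenNoConsumers carried by a conditionalNetworkBridgeFilterFactory
> on the destination's policy entry:
> {code:java}
> import java.util.Arrays;
> import org.apache.activemq.broker.BrokerService;
> import org.apache.activemq.broker.region.policy.PolicyEntry;
> import org.apache.activemq.broker.region.policy.PolicyMap;
> import org.apache.activemq.command.ActiveMQQueue;
> import org.apache.activemq.network.ConditionalNetworkBridgeFilterFactory;
>
> public class BrokerB {
>     public static void main(String[] args) throws Exception {
>         BrokerService brokerB = new BrokerService();
>         brokerB.setBrokerName("brokerB");
>
>         // Allow messages to be replayed back over the network bridge
>         // when this broker no longer has a consumer for the queue.
>         ConditionalNetworkBridgeFilterFactory replay =
>                 new ConditionalNetworkBridgeFilterFactory();
>         replay.setReplayWhenNoConsumers(true);
>
>         PolicyEntry policy = new PolicyEntry();
>         policy.setDestination(new ActiveMQQueue("Test.in"));
>         policy.setNetworkBridgeFilterFactory(replay);
>
>         PolicyMap policyMap = new PolicyMap();
>         policyMap.setPolicyEntries(Arrays.asList(policy));
>         brokerB.setDestinationPolicy(policyMap);
>
>         brokerB.start();
>     }
> }
> {code}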
> If that replay happens within a short time frame, the cursor will mark the
> replayed msg as a duplicate and won't dispatch it. To overcome this, one
> needs to set enableAudit=false on the policyEntry for the destination.
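> Continuing the hedged sketch above, that would be a one-line change on the
> same policy entry:
> {code:java}
> // Disable duplicate detection in the cursor so that quickly
> // replayed messages are not suppressed as duplicates.
> policy.setEnableAudit(false);
> {code}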
> Disabling the audit has a consequence: it turns off duplicate detection in
> the cursor. External JMS producers will still be blocked from sending
> duplicates thanks to the duplicate detection built into the persistence
> adapter.
> However, you can now get duplicate messages over the network bridge. With
> enableAudit=false these duplicates will happily be added to the cursor. If
> the same consumer receives the duplicate message, it will likely detect the
> duplicate. However, if the duplicate message is dispatched to a different
> consumer, it won't be detected and will be processed by the application.
> For many use cases it's important not to receive duplicate messages, so the
> above setup (replayWhenNoConsumers=true and enableAudit=false) becomes a
> problem.
> There is the additional option of specifying auditNetworkProducers="true" on
> the transport connector, but that very likely has consequences as well. With
> auditNetworkProducers="true" we will now detect duplicates over the network
> bridge: if there is a network glitch while the message is replayed back on
> the bridge to broker A and broker B resends the message, it will be detected
> as a duplicate on broker A. This is good.
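> A minimal sketch of enabling this option, again assuming the embedded-broker
> Java API (host and port are placeholders):
> {code:java}
> import org.apache.activemq.broker.BrokerService;
> import org.apache.activemq.broker.TransportConnector;
>
> public class AuditedBroker {
>     public static void main(String[] args) throws Exception {
>         BrokerService broker = new BrokerService();
>         TransportConnector connector =
>                 broker.addConnector("tcp://0.0.0.0:61616");
>         // Audit producers arriving over network bridges so that a
>         // resent message can be detected as a duplicate.
>         connector.setAuditNetworkProducers(true);
>         broker.start();
>     }
> }
> {code}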
> However, let's assume the consumer now disconnects from broker A *after* the
> message was replayed back from broker B to broker A, but *before* the
> consumer actually received the message. The consumer then reconnects to
> broker B again.
> The replayed message is on broker A now. Broker B registers new demand for
> this message (due to the consumer reconnecting) and broker A will pass the
> message on to broker B again. However, due to auditNetworkProducers="true",
> broker B will treat the resent message as a duplicate and very likely not
> accept it (or, even worse, simply drop the message; it's not clear exactly
> how it will behave).
> So the message is stuck again and won't be dispatched to the consumer on
> broker B.
> The networkTTL setting will further affect this scenario, and so will other
> broker topologies such as a full mesh.
> It seems to me that
> - When allowing replayWhenNoConsumers=true you may receive duplicate
> messages unless you also set auditNetworkProducers="true", which has
> consequences as well.
> - If consumers reconnect to a different broker each time, you may end up
> with msgs stuck on a broker, from which they won't get dispatched.
> - Ideally you want sticky consumers, i.e. they reconnect to the same broker
> if possible in order to avoid replaying messages back. This implies that you
> don't want to use randomize=true on failover URLs (see the sketch after this
> list). I don't think we recommend this in any docs.
> - The networkTTL will potentially never be high enough, and the message may
> be stuck on a particular broker because the consumer may have reconnected to
> another broker in the network.
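> As referenced in the third point above, a minimal client-side sketch of a
> sticky failover URL (broker host names are placeholders):
> {code:java}
> import org.apache.activemq.ActiveMQConnectionFactory;
>
> public class StickyConsumer {
>     public static void main(String[] args) {
>         // randomize=false makes the client always try brokerA first and
>         // only fail over to brokerB when brokerA is down, so a consumer
>         // tends to reconnect to the same broker and no replay is needed.
>         ActiveMQConnectionFactory factory = new ActiveMQConnectionFactory(
>                 "failover:(tcp://brokerA:61616,tcp://brokerB:61616)"
>                         + "?randomize=false");
>     }
> }
> {code}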
> I am sure there are more sides to this discussion. I just wanted to capture
> what gtully and I found when discussing this problem.