Hi,

We run a network of ActiveMQ brokers that discover each other via
multicast. Some messages sent through that network disappear.

Our application is a large PHP website that uses STOMP and ActiveMQ to
offload some of the work for asynchronous processing. That work includes
notifications similar to Facebook's, among other queues. Traffic per
queue ranges from a few messages to over three million messages a day.

The network consists of 11 servers, to which we implicitly assign one
of two roles:
- Store and forward
Each of our 9 webservers has an ActiveMQ instance. This saves some
network overhead for the non-reused connections of PHP's stomp-client
(it connects to 'localhost') and provides some redundancy and buffering
in case the central node has a problem.

- Central node
We have 2 of these of which one is 'active'. We simply connect the
consumers using the failover protocol to these nodes, with the 'active'
one being tried first.
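
The 'active first' ordering described above can be sketched as follows.
This is a minimal Python illustration, not the failover implementation
of any particular STOMP client; the host names, the port choice and the
connect_in_order helper are hypothetical.

```python
import socket

# Hypothetical host names; the post does not name the two central nodes.
# 61613 is the STOMP port from the broker config, assumed for consumers.
CENTRAL_NODES = [("central-active", 61613), ("central-standby", 61613)]


def connect_in_order(nodes, connect=socket.create_connection, timeout=5.0):
    """Return a connection to the first reachable node, trying in list order."""
    last_error = None
    for host, port in nodes:
        try:
            return connect((host, port), timeout)
        except OSError as exc:
            # Remember the failure and fall through to the next node.
            last_error = exc
    raise ConnectionError(f"no central node reachable: {last_error}")
```

Because the list is never shuffled, the standby node is only contacted
when the 'active' one refuses or times out.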

The nodes all run ActiveMQ 5.13.3 and use multicast for transport discovery.

The flow for the messages in question is:
1. 'Something' happens on the website (e.g. a user's post is quoted)
2. The PHP code produces a STOMP message
3. The message is sent to the ActiveMQ on 'localhost' of that webserver
4. (Since there is no consumer there) that ActiveMQ forwards it to the
central node
5. A long-running PHP process consumes it from that central node and
acks each message as it arrives (i.e. does not buffer)
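
At the wire level, steps 2, 3 and 5 boil down to a handful of STOMP
frames. The sketch below builds them in Python purely for illustration.
The queue name, the payload, the persistent header and the 'client' ack
mode are assumptions and are not taken from the actual PHP code.

```python
def stomp_frame(command, headers, body=""):
    """Serialize one STOMP frame: command, header lines, blank line, body, NUL."""
    lines = [command] + [f"{key}:{value}" for key, value in headers.items()]
    return ("\n".join(lines) + "\n\n" + body).encode("utf-8") + b"\x00"


# Hypothetical queue name and payload, for illustration only.
QUEUE = "/queue/notifications"

# Steps 2-3: the producer on a webserver sends to its local broker.
# 'persistent:true' is assumed here; the post does not say which mode is used.
send = stomp_frame(
    "SEND", {"destination": QUEUE, "persistent": "true"}, '{"event": "quote"}'
)

# Step 5: the consumer subscribes with explicit acknowledgement (assumed mode)
# and answers every received MESSAGE frame with an ACK carrying its message-id.
subscribe = stomp_frame("SUBSCRIBE", {"destination": QUEUE, "ack": "client"})


def ack_for(message_id):
    """Build the ACK frame for one received message."""
    return stomp_frame("ACK", {"message-id": message_id})
```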

After users started reporting bugs about this, we added logging at many
levels. After a lot of digging, it turns out the missing messages
coincide with warnings like these in the central node's activemq.log:

2016-09-28 17:47:23,637 | WARN  | suppressing duplicate message send
[ID:panda-41468-1473163942165-83:18887261:-1:1:1] from network producer
with producerSequence [1] less than last stored: 2 |
org.apache.activemq.broker.ProducerBrokerExchange | ActiveMQ Transport:
tcp:///172.29.249.161:50026@61616
...
2016-09-28 22:14:55,310 | WARN  | suppressing duplicate message send
[ID:phobos-44763-1473074816679-26:18380536:-1:1:3] from network producer
with producerSequence [3] less than last stored: 5 |
org.apache.activemq.broker.ProducerBrokerExchange | ActiveMQ Transport:
tcp:///172.29.249.33:59068@61616

In total, this seems to happen about 3000 times a day.
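
To tally these warnings per originating broker, the fields can be pulled
out of each line. The Python sketch below relies only on the wording of
the warning as quoted above. The reading of the message id as
<producer id>:<sequence>, the host extraction and the looks_duplicate
model are assumptions, not a description of ActiveMQ's internals.

```python
import re

WARNING = re.compile(
    r"suppressing duplicate message send \[(?P<producer>\S+):(?P<seq>\d+)\] "
    r"from network producer with producerSequence \[\d+\] "
    r"less than last stored: (?P<stored>-?\d+)"
)


def parse_warning(text):
    """Extract host, producer id, sequence and last stored sequence."""
    flat = " ".join(text.split())  # undo the e-mail line wrapping
    match = WARNING.search(flat)
    if match is None:
        return None
    producer = match.group("producer")
    # Assumed id layout: ID:<host>-<port>-<timestamp>-<n>:<m>:...; keep <host>.
    host = producer.split(":")[1].rsplit("-", 3)[0]
    return {
        "host": host,
        "producer": producer,
        "sequence": int(match.group("seq")),
        "last_stored": int(match.group("stored")),
    }


def looks_duplicate(sequence, last_stored):
    """Model taken from the warning's wording: 'less than last stored'."""
    return sequence < last_stored
```

Whether a sequence equal to the stored value is also suppressed cannot be
told from the log text alone, so the model above leaves that case out.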

So if those are (considered) duplicates, where are the initial messages?
And if they're false positives, how do we prevent false positives while
keeping support for true positives?

Below is a somewhat stripped down version of our <broker>-section (no
comments and debug stuff). All 11 servers have the same config (apart
from the brokerName).

Best regards,

Arjen


<broker xmlns="http://activemq.apache.org/schema/core"
brokerName="nestor" dataDirectory="${activemq.data}"
schedulerSupport="true">
    <destinationPolicy>
        <policyMap>
            <policyEntries>
                <policyEntry topic=">" >
                    <pendingMessageLimitStrategy>
                        <constantPendingMessageLimitStrategy limit="1000"/>
                    </pendingMessageLimitStrategy>
                </policyEntry>
                <policyEntry queue=">" producerFlowControl="true"
memoryLimit="128mb" enableAudit="false">
                    <networkBridgeFilterFactory>
                        <conditionalNetworkBridgeFilterFactory
replayWhenNoConsumers="true"/>
                    </networkBridgeFilterFactory>
                </policyEntry>
            </policyEntries>
        </policyMap>
    </destinationPolicy>

    <networkConnectors>
        <networkConnector
uri="multicast://default?group=tweakersActiveMQProduction&amp;prefetchSize=1"
/>
    </networkConnectors>

    <persistenceAdapter>
        <kahaDB directory="${activemq.data}/kahadb"/>
    </persistenceAdapter>

    <transportConnectors>
        <transportConnector name="openwire"
uri="tcp://0.0.0.0:61616?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"
discoveryUri="multicast://default?group=tweakersActiveMQProduction"
auditNetworkProducers="true"/>
        <transportConnector name="stomp"
uri="stomp://0.0.0.0:61613?transport.closeAsync=false&amp;maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
    </transportConnectors>
</broker>
