Hello,

This is my first post, so I'll try to give as much information as possible.

My problem in a few words: I installed a 3-server ActiveMQ + Zookeeper +
_replicated_ leveldb store. I use 1 topic and 3 durable subscribers, each with
a filter. I send 1500 messages (500 for each durable subscriber). If I consume
them right away, everything is OK. But if I shut down either the ActiveMQ
master or the full cluster, most of my messages are gone when I restart it!
There are no exceptions in the ActiveMQ logs.


Now more in-depth reproduction steps :
1) I installed a 3-machine Zookeeper cluster (no password). See attached
zoo.cfg.

2) I uncompressed ActiveMQ 5.9.10 on 3 servers, using the attached
activemq.xml configuration file. I start with an EMPTY data / leveldb-data
dir.

3) To work around this bug (https://issues.apache.org/jira/browse/AMQ-5105), I
applied the workaround proposed there.

4) I start my 3 ActiveMQ servers (bin/activemq start); see the 3 attached
activemq.log files.

5) I create 3 durable subscribers on my Log.Raw topic (using jolokia API) :

curl --user admin:admin -s -XPOST \
'http://rechrds01t.bbo1t.local:8161/api/jolokia/exec' -d '
{
  "type":"exec",
  "mbean":"org.apache.activemq:brokerName=task,type=Broker",
  "operation":"createDurableSubscriber",
  "arguments":[
    "Log.Raw.ORDO.1",
    "Log.Raw.ORDO",
    "Log.Raw",
    "domain='"'"'ORDO'"'"'"
  ]
}';echo
curl --user admin:admin -s -XPOST \
'http://rechrds01t.bbo1t.local:8161/api/jolokia/exec' -d '
{
  "type":"exec",
  "mbean":"org.apache.activemq:brokerName=task,type=Broker",
  "operation":"createDurableSubscriber",
  "arguments":[
    "Log.Raw.BO.1",
    "Log.Raw.BO",
    "Log.Raw",
    "domain='"'"'BO'"'"'"
  ]
}';echo
curl --user admin:admin -s -XPOST \
'http://rechrds01t.bbo1t.local:8161/api/jolokia/exec' -d '
{
  "type":"exec",
  "mbean":"org.apache.activemq:brokerName=task,type=Broker",
  "operation":"createDurableSubscriber",
  "arguments":[
    "Log.Raw.PARU.1",
    "Log.Raw.PARU",
    "Log.Raw",
    "domain='"'"'PARU'"'"'"
  ]
}';echo
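
The three calls above differ only in the domain name, so they can be collapsed into a loop. This is just a sketch: the function names and the JOLOKIA_URL variable are made up for illustration, while the payload fields are the ones from the commands above.

```shell
#!/bin/sh
# Hypothetical helper: print the Jolokia "exec" payload that creates the
# durable subscriber for one domain (same fields as the commands above).
durable_sub_payload() {
  domain=$1
  cat <<EOF
{
  "type":"exec",
  "mbean":"org.apache.activemq:brokerName=task,type=Broker",
  "operation":"createDurableSubscriber",
  "arguments":[
    "Log.Raw.${domain}.1",
    "Log.Raw.${domain}",
    "Log.Raw",
    "domain='${domain}'"
  ]
}
EOF
}

# Hypothetical wrapper: POST one payload per domain to $JOLOKIA_URL.
# "-d @-" makes curl read the request body from stdin.
create_durable_subs() {
  for domain in ORDO BO PARU; do
    durable_sub_payload "$domain" |
      curl --user admin:admin -s -XPOST "$JOLOKIA_URL" -d @-
    echo
  done
}
```

Usage would be something like:
JOLOKIA_URL='http://rechrds01t.bbo1t.local:8161/api/jolokia/exec' create_durable_subs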

6) I send 1500 messages: 500 with domain='BO', 500 with domain='PARU', 500
with domain='ORDO'
   (NIO port, sync mode, AUTO_ACKNOWLEDGE session).

7) I check (using hawt.io or the ActiveMQ admin UI) that each of my 3 durable
subs has 500 enqueued messages.
   (If I try to consume them without stopping any ActiveMQ server, all is OK:
1500 messages received)
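
The same check can also be scripted through Jolokia instead of the UI. This is only a sketch under some assumptions: the function names and JOLOKIA_URL are made up, I am assuming Jolokia "read" requests can be POSTed to the base /api/jolokia endpoint, and I am assuming the broker MBean exposes InactiveDurableTopicSubscribers and each subscription MBean exposes PendingQueueSize (as in the BrokerViewMBean and SubscriptionViewMBean interfaces).

```shell
#!/bin/sh
# Hypothetical helper: Jolokia "read" payload for one attribute of one MBean.
# Assumes the MBean name contains no double quotes or backslashes.
jolokia_read_payload() {
  printf '{"type":"read","mbean":"%s","attribute":"%s"}\n' "$1" "$2"
}

# Broker MBean name taken from the createDurableSubscriber calls in step 5.
BROKER_MBEAN='org.apache.activemq:brokerName=task,type=Broker'

# List the offline durable subscribers (their MBean names) on the broker.
list_inactive_durable_subs() {
  jolokia_read_payload "$BROKER_MBEAN" InactiveDurableTopicSubscribers |
    curl --user admin:admin -s -XPOST "$JOLOKIA_URL" -d @-
  echo
}

# Read the pending message count of one subscription.
# $1 is an MBean name copied from the output of the function above.
pending_queue_size() {
  jolokia_read_payload "$1" PendingQueueSize |
    curl --user admin:admin -s -XPOST "$JOLOKIA_URL" -d @-
  echo
}
```

Running list_inactive_durable_subs before step 8 and again on the new master after step 8, then pending_queue_size on each returned name, would give the before/after counts in a form that can be pasted into a mail.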

8) I stop (bin/activemq stop) the activemq master (rechrds01t.bbo1t.local
for example) 
  => rechrds02t.bbo1t.local becomes master.

9) When I check the queue size of the 3 durable subs on the new master, it
shows that most of my messages
   have disappeared; only a very small number remain.

I googled this a lot and found nothing really relevant...

I also attached a text file that lists the leveldb-data content at various
steps.

This is fully reproducible. What am I missing? Is this a bug?

Thanks a lot if you can help.

zoo.cfg <http://activemq.2283324.n4.nabble.com/file/n4683411/zoo.cfg>
activemq.xml <http://activemq.2283324.n4.nabble.com/file/n4683411/activemq.xml>
activemq.log-on-rechrds01t <http://activemq.2283324.n4.nabble.com/file/n4683411/activemq.log-on-rechrds01t>
activemq.log-on-rechrds02t <http://activemq.2283324.n4.nabble.com/file/n4683411/activemq.log-on-rechrds02t>
activemq.log-on-rechrds03t <http://activemq.2283324.n4.nabble.com/file/n4683411/activemq.log-on-rechrds03t>
leveldb-ls-lR.txt <http://activemq.2283324.n4.nabble.com/file/n4683411/leveldb-ls-lR.txt>



--
View this message in context: 
http://activemq.2283324.n4.nabble.com/3-machine-cluster-replicated-leveldb-durable-subs-massive-message-loss-on-restart-tp4683411.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.
