Gabriel Nieves created AMQ-5899:
-----------------------------------

             Summary: Unable to recover after going below viable H/A
master/slave (Unknown data type)
                 Key: AMQ-5899
                 URL: https://issues.apache.org/jira/browse/AMQ-5899
             Project: ActiveMQ
          Issue Type: Bug
          Components: activemq-leveldb-store
    Affects Versions: 5.10.0
         Environment: 3 CentOS servers running ActiveMQ (5.10.0 or 5.11.0),
each located at a different site. H/A master/slave setup with SSL enabled,
using LevelDB and ZooKeeper.
            Reporter: Gabriel Nieves


I have 3 servers running ActiveMQ in high-availability mode. Let's call these
servers A, B and C, and let's say A is master. If you stop 2 servers, A and B,
while you are sending messages to server A, everything will go down, which
makes sense. Now if you start A and B back up, you will get an "Unknown data
type" error and a java.io.IOException or a NullPointerException after a master
has been selected and a slave has attached.
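
To make "below viable" concrete, here is a minimal sketch of the majority
check. It assumes the replicated LevelDB store needs (replicas / 2) + 1 nodes
online, as the ActiveMQ documentation states for the `replicas` setting. The
function names are made up for illustration and are not part of ActiveMQ.

```python
def quorum(replicas: int) -> int:
    """Smallest number of online nodes that still forms a majority."""
    return replicas // 2 + 1


def is_viable(replicas: int, online: int) -> bool:
    """True when enough nodes are online for the cluster to elect a master."""
    return online >= quorum(replicas)


if __name__ == "__main__":
    # Three configured replicas (A, B and C), as in this report.
    print("quorum:", quorum(3))
    # A and B stopped, only C left: below viable.
    print("only C online:", is_viable(3, 1))
    # A and B restarted: a master can be elected again.
    print("A and B online:", is_viable(3, 2))
```

With 3 replicas the quorum is 2, so stopping A and B leaves the cluster below
viable, and restarting both should be enough for it to recover.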

I suspect this is caused mainly because replication was occurring at the time
these servers were stopped, thus "corrupting" the LevelDB store. I say
"corrupting" because there have been cases where I started only one server
after going below viable and everything worked fine, so this could be caused by
a synchronization issue with LevelDB replication.

After I get this Unknown data type error, whose value changes every time I
reproduce the issue, the master server will restart. This happens many times
and eventually the ActiveMQ process dies.

So far, to get these servers up and running again I need to clear the
activemq-data folder where all the replication logs are located. This is not an
acceptable solution.
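
For anyone reproducing this, a slightly less destructive form of the workaround
is to move the folder aside instead of deleting it, so the replication logs
remain available for diagnosis. This is a minimal sketch that assumes the
broker has already been stopped. The helper name and the ".broken-" suffix are
hypothetical, and the path passed in depends on the installation.

```python
import shutil
import time
from pathlib import Path


def set_aside_data_dir(data_dir: str) -> Path:
    """Rename the data directory with a timestamp suffix and recreate it empty.

    Returns the path the old contents were moved to.
    """
    src = Path(data_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"no such data directory: {src}")

    stamp = time.strftime("%Y%m%d-%H%M%S")
    dst = src.with_name(f"{src.name}.broken-{stamp}")
    if dst.exists():
        # Refuse to merge into an earlier backup taken in the same second.
        raise FileExistsError(f"backup target already exists: {dst}")

    shutil.move(str(src), str(dst))
    src.mkdir()  # leave an empty directory for the broker to repopulate
    return dst
```

Calling `set_aside_data_dir` with the path of the activemq-data folder on each
affected server has the same effect as clearing it, but the old logs can still
be attached to this issue.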



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
