Gabriel Nieves created AMQ-5899:
-----------------------------------
Summary: Unable to recover after going below viable H/A
master/slave (Unknown data type
Key: AMQ-5899
URL: https://issues.apache.org/jira/browse/AMQ-5899
Project: ActiveMQ
Issue Type: Bug
Components: activemq-leveldb-store
Affects Versions: 5.10.0
Environment: 3 CentOS servers running ActiveMQ (5.10.0 or 5.11.0),
each located at a different site. H/A master/slave configuration with SSL
enabled, using LevelDB and ZooKeeper.
Reporter: Gabriel Nieves
I have 3 servers running ActiveMQ in high-availability mode. Let's call these
servers A, B and C, and let's say A is master. If you stop 2 servers, A and B,
while messages are being sent to server A, everything goes down, which makes
sense. If you then start A and B again, you get an "Unknown data type" error
and an IOException or a NullPointerException after a master has been
selected and a slave has attached.
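For reference, the "below viable" condition in the scenario above can be
sketched with the majority rule that the replicated LevelDB store documentation
describes (quorum = replicas/2 + 1, integer division). This is only an
illustration of the arithmetic, not ActiveMQ code; the function names are
hypothetical.

```python
def quorum_size(replicas: int) -> int:
    """Majority of the configured replicas: replicas // 2 + 1."""
    return replicas // 2 + 1


def is_viable(replicas: int, running: int) -> bool:
    """True when enough nodes are up for a master to be elected."""
    return running >= quorum_size(replicas)


if __name__ == "__main__":
    replicas = 3  # servers A, B and C
    for running in range(replicas, -1, -1):
        print(f"running={running} viable={is_viable(replicas, running)}")
```

With 3 replicas the quorum is 2, so stopping A and B leaves only C running and
the cluster is no longer viable. Starting A and B again restores the quorum,
which is the point at which the errors described here appear.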
I suspect this happens mainly because replication was in progress when the
servers were stopped, thus "corrupting" the LevelDB store. I say "corrupting"
because there have been cases where I started only one server back up after
going below viable and everything worked fine, so this could instead be caused
by a synchronization issue in LevelDB replication.
After I get this "Unknown data type" error, whose value changes every time I
reproduce the issue, the master server restarts. This happens many times
and eventually the ActiveMQ process dies.
So far, to get these servers up and running again I have to clear the
activemq-data folder, where all the replication logs are located. This is not an
acceptable solution.
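As a rough sketch of that workaround, the data folder could be moved aside
instead of deleted, so the replication logs remain available for analysis.
This assumes the broker is stopped first; the helper name and the backup naming
scheme are hypothetical, and the actual location of activemq-data depends on
the installation.

```python
import shutil
import time
from pathlib import Path


def set_aside_data_dir(data_dir: Path) -> Path:
    """Move data_dir to a timestamped sibling and recreate it empty.

    Assumes the broker using data_dir has already been stopped.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = data_dir.with_name(f"{data_dir.name}.bak-{stamp}")
    if backup.exists():
        raise FileExistsError(f"backup already exists: {backup}")
    shutil.move(str(data_dir), str(backup))
    data_dir.mkdir()  # leave an empty folder for the broker to repopulate
    return backup
```

Calling set_aside_data_dir(Path("activemq-data")) on each affected server
before restarting it would give the same clean start while keeping the old
logs. It does not address the underlying problem.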
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)