Re: [PROPOSAL] Rebirth of replicatedKahaDB
+1 for adding a replication backend.
+1 for keeping KahaDB intact for stability.
+1 for adding the replication as a separate backend adapter (perhaps one that depends on and extends activemq-kahadb-store rather than modifying it directly).
+1 on considering alternatives to ZooKeeper.

IMHO, fully-synchronous replication in distributed computing is more legend than reality. It's only practical on very reliable, low-latency networks, and there needs to be a plan for prolonged split-brain.

-Matt

> On Feb 18, 2021, at 8:55 AM, Christopher Shannon wrote:
> [...]
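Matt's split-brain concern ultimately comes down to quorum arithmetic. A toy illustration (not ActiveMQ code; all names here are invented for the sketch) of why requiring acknowledgements from a strict majority prevents two network partitions from both accepting writes:

```java
// Toy illustration of majority quorums and split-brain, not ActiveMQ code.
// With n replicas, a write (or a leader election) needs a strict majority of
// acks, so two disjoint partitions can never both make progress.
public class QuorumDemo {
    // Smallest number of nodes that constitutes a majority of n replicas.
    static int quorum(int n) {
        return n / 2 + 1;
    }

    // A partition of `partitionSize` nodes may elect a leader or accept
    // writes only if it still holds a majority of the full replica set.
    static boolean canAcceptWrites(int partitionSize, int clusterSize) {
        return partitionSize >= quorum(clusterSize);
    }

    public static void main(String[] args) {
        // 5 replicas split 3/2: only the 3-node side keeps accepting writes.
        System.out.println(canAcceptWrites(3, 5)); // true
        System.out.println(canAcceptWrites(2, 5)); // false
    }
}
```

Note that a quorum protects consistency but not availability: in a prolonged split, the minority side is simply down, which is exactly the operational plan Matt says is needed.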
Re: [PROPOSAL] Rebirth of replicatedKahaDB
Hey JB,

I am interested here. I know many approaches to replication have been tried, with AMQ 5 as well as Artemis: for example, "LevelDB replicated storage" and "Pure Master Slave" (where the active broker copied updates to the passive brokers) in AMQ 5. So I'm curious how the problem is getting solved in an effective manner. I've had people ask about GlusterFS, but I haven't heard of anyone successfully using it.

The reason the shared filesystem works so well is that a synchronous write to the shared filesystem is guaranteed "on disk" (and hence accessible by all clients of that filesystem). Even though the overhead of the sync write can be significant, high-speed networking and advanced hardware help minimize the latency introduced.

If we use replication, would it be doing effectively the same thing for all the replicated copies? If not, then how can message loss and duplication be prevented on a change of the active broker?

Of course, one big downside of the shared-filesystem solution is that it requires the file server itself to be redundant and highly available (like a filer, or EFS), so a distributed solution like this is appealing.

Cheers!
Art

On Wed, Feb 17, 2021 at 1:52 PM JB Onofré wrote:
> [...]
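The "synchronous write guaranteed on disk" property Art describes corresponds, in Java, to writing and then forcing the channel before acknowledging. A minimal sketch (illustrative names, not ActiveMQ's journal code) of the durability contract a replicated store would have to match:

```java
// Minimal sketch of a durable (synchronous) journal append, illustrating the
// guarantee Art describes: force(true) does not return until the data and
// metadata have been handed to the storage device. On a shared filesystem,
// that is what makes the record visible to a failover broker.
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SyncWriteDemo {
    static void syncAppend(Path journal, byte[] record) throws IOException {
        try (FileChannel ch = FileChannel.open(journal,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            ch.write(ByteBuffer.wrap(record));
            ch.force(true); // block until the record is durable on disk
        }
    }

    public static void main(String[] args) throws IOException {
        Path journal = Files.createTempFile("journal", ".log");
        syncAppend(journal, "msg-1".getBytes());
        // Once force() has returned, any reader of the filesystem sees it.
        System.out.println(Files.readString(journal)); // prints "msg-1"
    }
}
```

A replication scheme answers Art's question in the affirmative only if the broker delays its acknowledgement until a quorum of replicas has done the equivalent of this `force`.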
Re: [PROPOSAL] Rebirth of replicatedKahaDB
I like the idea of having a distributed store for 5.x (and Artemis too), but I don't think it makes sense to mess with KahaDB at this point, as it is quite stable and I think it would be tough to get replication to work properly there. There were a ton of problems with ZooKeeper and LevelDB, which is one reason that store was deprecated. If you do want to go that route, I would try to keep the original KahaDB intact, at least so as not to break existing users.

If you want to work on this, my vote would be to use BookKeeper and just write a store implementation for it, which should be much easier. I figure it makes sense to use an existing product that is designed to be replicated so we don't have to reinvent the wheel. I also think Artemis could potentially benefit from BookKeeper. Having multiple store choices would be good for users.

On Thu, Feb 18, 2021 at 8:03 AM Michael André Pearce wrote:
> [...]
Re: [PROPOSAL] Rebirth of replicatedKahaDB
Hi JB,

+1 from me in general.

I have to say, I like the idea of separating control plane from data plane, using ZooKeeper for topology and leader election. It's something I think we need to do in Artemis too, tbh.

Re KahaDB in particular, I think there were a lot of issues, and that's why the replicated store was removed/deprecated originally, no? I think it would be good to understand a bit more here, or else we risk re-introducing a historic issue.

I do like the idea of using BookKeeper for the storage layer as an alternative, though. It has done well as the storage layer for the Apache Pulsar project and has proven a very scalable setup. I wonder whether, with that, this new setup would put us in the realm of matching that kind of scalability, but with the advanced features and compatibility that ActiveMQ brings.

Best
Mike

On 17 February 2021 at 20:44, JB Onofré wrote:

Hi everyone

In a cloud environment, our current ActiveMQ 5 topologies have limitations:

- Master/slave works fine but requires either a shared file system (for instance AWS EFS) or a database. It also means that only one broker is active at a time.
- A network of brokers can be used to get a kind of partitioning of messages across several brokers. However, if a broker with pending messages is lost, those messages are not available until we restart the broker (with the same file system).

The idea of replicatedKahaDB is to replicate messages from one KahaDB to another. If we lose a broker, a broker holding the replica can load the messages and make them available.

I started to work on this implementation:
- adding a new configuration element as a persistence adapter
- adding a ZooKeeper client; ZooKeeper is used for topology storage, heartbeat, and leader election
- evaluating the use of BookKeeper as well (directly as storage)

I will share a branch on my local repo with you soon.

Any comment is welcome.

Thanks
Regards
JB
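For readers wondering what "a new configuration element as a persistence adapter" might look like: persistence adapters are configured under `<persistenceAdapter>` in activemq.xml, so the proposal would presumably add a new element there. The element name and attributes below are hypothetical (loosely modeled on the old replicated LevelDB store's `zkAddress`/`zkPath`/`replicas` settings); the real ones will come from JB's branch.

```xml
<broker xmlns="http://activemq.apache.org/schema/core" brokerName="broker-1">
  <persistenceAdapter>
    <!-- Hypothetical element and attributes, shown only to illustrate the
         shape of the idea: a KahaDB-style store plus ZooKeeper coordination
         for topology, heartbeat, and leader election. -->
    <replicatedKahaDB directory="${activemq.data}/kahadb"
                      zkAddress="zk1:2181,zk2:2181,zk3:2181"
                      zkPath="/activemq/replicated-kahadb"
                      replicas="3"/>
  </persistenceAdapter>
</broker>
```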