Hello

So what I've observed is a known behavior. Is there any workaround that allows chaining commands safely? We have to make some decisions about our future tooling for the setup, deployment, and maintenance of our features in Karaf clusters, and chaining commands in Jolokia requests was an attractive solution.
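For now, the only workaround I can imagine is to split the batch into one Jolokia call per command and wait between calls, something like this sketch (the endpoint URL and the delay are guesses on my part; the MBean and operations are the ones from my earlier post):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Workaround sketch: send the Cellar commands one per Jolokia POST,
// pausing between them so the cluster can sync, instead of batching
// them in a single request.
public class SequentialCellarCalls {

    private static final String JOLOKIA = "http://localhost:8181/jolokia"; // assumption
    private static final String MBEAN = "org.apache.karaf.cellar:name=root,type=feature";

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        exec(client, "addRepository(java.lang.String, java.lang.String)",
                "\"default\", \"mvn:my.tests/my-test-26-karaf-s-jms/0.0.1-SNAPSHOT/xml/features\"");

        // Give the cluster time to propagate the repository event before
        // asking for the feature install (the delay value is a guess).
        Thread.sleep(10_000);

        exec(client, "installFeature(java.lang.String, java.lang.String)",
                "\"default\", \"my-test-26-karaf-s-jms\"");
    }

    private static void exec(HttpClient client, String operation, String arguments)
            throws Exception {
        String body = "{ \"type\": \"exec\", \"mbean\": \"" + MBEAN + "\","
                + " \"operation\": \"" + operation + "\","
                + " \"arguments\": [" + arguments + "] }";
        HttpRequest request = HttpRequest.newBuilder(URI.create(JOLOKIA))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}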

I hope you can provide safer cluster synchronization soon... I'll be happy to help test it.

One last question about Camel (perhaps not the best list for it): I've tested the master component in routes with a file lock cluster service, since my tests rely on Docker containers running on the same machine, which can use a shared volume that accepts locks. Is there a Camel Cluster Service implementation that benefits from Cellar and Hazelcast (which provides distributed named locks) and would be more suitable for really distributed cases with no shared storage?
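For reference, my file-lock test setup looks roughly like this (a minimal sketch assuming camel-master and camel-file are installed; the shared path and timings are just my Docker test values):

import java.util.concurrent.TimeUnit;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.component.file.cluster.FileLockClusterService;
import org.apache.camel.impl.DefaultCamelContext;

// Sketch: all containers mount the same volume under /shared/cluster,
// and the master component elects one leader per namespace through a
// file lock on that volume.
public class FileLockMasterExample {
    public static void main(String[] args) throws Exception {
        DefaultCamelContext context = new DefaultCamelContext();

        FileLockClusterService service = new FileLockClusterService();
        service.setRoot("/shared/cluster"); // shared Docker volume (test value)
        service.setAcquireLockDelay(1, TimeUnit.SECONDS);
        service.setAcquireLockInterval(5, TimeUnit.SECONDS);
        context.addService(service);

        context.addRoutes(new RouteBuilder() {
            @Override
            public void configure() {
                // Only the elected leader of the "ns" namespace runs this route.
                from("master:ns:timer:clustered?period=5000")
                    .log("I am the master");
            }
        });

        context.start();
        Thread.sleep(Long.MAX_VALUE); // keep the JVM alive for the demo
    }
}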

Thanks again.

Regards.

Ephemeris Lappis

On 12/10/2022 at 18:39, Jean-Baptiste Onofré wrote:
Hi,

I guess you are "hitting":

  https://issues.apache.org/jira/browse/KARAF-6168

Currently, the "cluster commands" (sent between nodes) are not
"ordered". So, depending on the network, you don't have the guarantee
that command "2" is executed after "1" on a node (some nodes can be
OK, some not).

When you execute the commands step by step, and let the cluster sync,
it's OK. But if you send a bunch of commands at the same time, you
might get "non ordered" events.

I propose (in the Jira) to refactor the event mechanism as follows:
1. instead of sending the full state in the command (for example,
currently Cellar sends "install feature A on cluster group 1"), the
command will just send "feature update" to all nodes.
2. then, each node will check the cluster group status on the
distributed data grid (in Hazelcast), compare its local state with
the cluster state, and sync.
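In pseudo-code, the reconciliation on each node would look something
like this (a rough sketch with made-up names, not the actual Cellar
implementation):

import java.util.Map;
import com.hazelcast.core.HazelcastInstance;
import org.apache.karaf.features.FeaturesService;

// Sketch only: the event carries no state; each node pulls the desired
// state from the Hazelcast data grid and converges its local state.
public class FeatureSyncSketch {

    private final HazelcastInstance hazelcast;
    private final FeaturesService featuresService;

    public FeatureSyncSketch(HazelcastInstance hazelcast,
                             FeaturesService featuresService) {
        this.hazelcast = hazelcast;
        this.featuresService = featuresService;
    }

    public void onFeatureUpdateEvent(String clusterGroup) throws Exception {
        // Desired feature state of the cluster group (map name is made up).
        Map<String, Boolean> desired =
                hazelcast.getMap("cellar.features." + clusterGroup);

        for (Map.Entry<String, Boolean> entry : desired.entrySet()) {
            String name = entry.getKey();
            boolean wanted = entry.getValue();
            boolean installed =
                    featuresService.isInstalled(featuresService.getFeature(name));

            if (wanted && !installed) {
                featuresService.installFeature(name);   // install missing feature
            } else if (!wanted && installed) {
                featuresService.uninstallFeature(name); // remove extra feature
            }
        }
    }
}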

I'm working on Karaf runtime releases right now; I will be back on
Cellar during the weekend.

Regards
JB

On Wed, Oct 12, 2022 at 3:39 PM Ephemeris Lappis
<[email protected]> wrote:
Hello again.

I'm still testing Cellar. I've detected strange behaviours when
installing features. I've done the same tests with 2 or 3 nodes in my
cluster, using both the SSH shell and Jolokia to execute the commands
that add repositories and install new features.

When the commands are played "slowly", waiting some time between them,
everything seems to be OK.

But when the commands are chained, for example in a single Jolokia
post, the cluster seems to execute them in an unexpected order.

For example, the following Jolokia post (the same happens with two
commands on the shell CLI), which adds a repository and then installs
a new feature, is executed on the first node (the node I use as the
Jolokia target), but the "installFeature" seems to be propagated to
the second node before it receives or processes the repository
command: this results in an error because the feature is unknown.

[
    {
        "type" : "exec",
        "mbean" : "org.apache.karaf.cellar:name=root,type=feature",
        "operation" : "addRepository(java.lang.String, java.lang.String)",
        "arguments" : ["default", "mvn:my.tests/my-test-26-karaf-s-jms/0.0.1-SNAPSHOT/xml/features"]
    },
    {
        "type" : "exec",
        "mbean" : "org.apache.karaf.cellar:name=root,type=feature",
        "operation" : "installFeature(java.lang.String, java.lang.String)",
        "arguments" : ["default", "my-test-26-karaf-s-jms"]
    }
]

After this kind of error, the failing node seems unable to synchronize
itself: adding the feature again does nothing, and removing the
repository with the -u option also leads to errors...

The same happens if I try to install "standard" features like "http",
"webconsole", and so on, in a single Jolokia call...

During some tests, after removing the features and repositories, some
bundles are still present on some nodes. The only way to clean them up
is to create brand new Karaf containers and create the Cellar cluster
again.

So, some questions...

Is there some "time logic" for chaining commands on a Cellar cluster?

Is it important to have the same IDs for bundles on all the nodes?
When the commands are executed one by one, waiting some time between
them, this is always the case, but when the commands are sent in
batches, problems seem to appear with bundle ID differences.

What do people usually do to install large numbers of features on
Cellar clusters, using DevOps tooling for example?

Thanks for your help.

On Fri, Oct 7, 2022 at 6:48 PM, Jean-Baptiste Onofré <[email protected]> wrote:
The node ID should still be the same.

Let me try to reproduce the "lost alias" issue. It does sound like a
bug in the alias cluster sync.

Regards
JB

On Fri, Oct 7, 2022 at 6:25 PM Ephemeris Lappis
<[email protected]> wrote:
Hello.

Nice! I'm waiting for the new package ;)...

I'm still testing on my Docker environment. I've tried deploying
features using either the shell or Jolokia, always with the Cellar
commands or MBean. So far, no problem with the features, but twice I
hit a strange issue: I had set aliases on my 3 instances, and after a
feature install, the 3 aliases were lost. The same happens when
stopping the instances (compose stop in my case) and restarting them.
Perhaps some kind of bug with aliases?

Thanks again.

Regards.

On Fri, Oct 7, 2022 at 8:24 AM, Jean-Baptiste Onofré <[email protected]> wrote:
Hi,

Gotcha, let me create the Jira and work on this ;)

Thanks!
Regards
JB

On Thu, Oct 6, 2022 at 11:39 PM Ephemeris Lappis
<[email protected]> wrote:
Hello!

Thanks a lot for your very detailed answer.

After your explanations, I'm not sure that listeners are really needed
in our case, but I'm going to enable them and test again with basic
features/bundles commands. If we can use Cellar's MBeans to script our
deployment playbooks with Jolokia calls, perhaps the basic
synchronization is enough for us.

For the last point, a Karaf+Cellar "off the shelf" tarball would
obviously be a nice gift. I don't know whether anyone would use a
prebuilt image: we usually make our own Docker images based on common
Linux+Java stacks that are elaborated and managed by our DevOps team.
Anyway, working examples of configuration to build custom Karaf
assemblies would really help: the few examples I've found seem to
build limited feature distributions, enumerating known features and
adding some custom ones, but probably missing others. An explained
example with all the Karaf features plus Cellar would be interesting
for learning...

So if you could provide both... very happy :)!

Thanks again.

Ephemeris Lappis

On 06/10/2022 at 19:56, Jean-Baptiste Onofré wrote:
Hi,

1. By default, only the cluster:* commands spread state across the cluster.
2. If you want the "regular" non-cluster commands (like
feature:install) to also spread the state, you have to enable the
listeners. The listeners are all disabled by default. You can enable
them in etc/org.apache.karaf.cellar.node.cfg; you have one listener
per resource: bundle.listener, config.listener, feature.listener. If
you set all of them to true, you will get sync on regular commands and
even on local events (like changing a cfg file, for instance), as in
the snippet after this list. It's documented here:
https://karaf.apache.org/manual/cellar/latest-4/#_synchronizers_and_sync_policy
3. As Jolokia is just a JMX client, and Cellar exposes MBeans, you can
interact with the cluster using Jolokia.
4. About the distribution, I should definitely provide a full example
to create it, and even publish an official Karaf/Cellar distro and
Docker image. Thoughts?
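For example, to enable all three listeners (only the listener keys are
shown; the other properties of the file stay untouched):

# etc/org.apache.karaf.cellar.node.cfg
# Spread regular (non cluster:*) commands and local events to the cluster.
bundle.listener = true
config.listener = true
feature.listener = true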

Regards
JB

On Thu, Oct 6, 2022 at 5:58 PM Ephemeris Lappis
<[email protected]> wrote:
Hello again!

I've been testing Cellar on a simple cluster of 3 Karaf instances
created with Docker images and Compose.
I've seen that the cluster commands provide synchronized provisioning
of features, and, for example, that stopped nodes are synchronized
when they restart.
This is clearly what we need :)!

I've also noticed that the "non cluster" feature commands (repo-add,
install, uninstall) do not trigger the synchronization. I suppose
that's normal. So a new question: will it be possible to use Jolokia
to execute cluster commands the same way we do it with the default
feature commands?

Now I'd like to go a step further before testing on real k8s clusters.

In your presentation you said that, for now, there's no downloadable
Karaf distribution including Cellar, and that the best way to deploy
clusters is to generate such a custom distribution and then provide a
Docker image with it. I haven't found any example of the plugin
configuration that generates a Karaf+Cellar distribution with all the
default Karaf features and configurations, just adding Cellar.
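To be concrete, this is the kind of skeleton I imagine (a rough,
untested sketch; the versions, coordinates, and boot feature names are
my guesses):

<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>my.tests</groupId>  <!-- hypothetical coordinates -->
  <artifactId>karaf-cellar-assembly</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>karaf-assembly</packaging>

  <dependencies>
    <!-- base Karaf framework kar (version is a guess) -->
    <dependency>
      <groupId>org.apache.karaf.features</groupId>
      <artifactId>framework</artifactId>
      <version>4.4.1</version>
      <type>kar</type>
    </dependency>
    <!-- standard Karaf features repository -->
    <dependency>
      <groupId>org.apache.karaf.features</groupId>
      <artifactId>standard</artifactId>
      <version>4.4.1</version>
      <classifier>features</classifier>
      <type>xml</type>
      <scope>runtime</scope>
    </dependency>
    <!-- Cellar features repository (version is a guess) -->
    <dependency>
      <groupId>org.apache.karaf.cellar</groupId>
      <artifactId>apache-karaf-cellar</artifactId>
      <version>4.2.1</version>
      <classifier>features</classifier>
      <type>xml</type>
      <scope>runtime</scope>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.karaf.tooling</groupId>
        <artifactId>karaf-maven-plugin</artifactId>
        <version>4.4.1</version>
        <extensions>true</extensions>
        <configuration>
          <bootFeatures>
            <!-- the default Karaf boot features, plus cellar -->
            <feature>standard</feature>
            <feature>ssh</feature>
            <feature>cellar</feature>
          </bootFeatures>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>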

Could you please provide a link to working examples? This would be
very nice and would help a lot ;)!!!

Thanks again.

Regards.

On Tue, Oct 4, 2022 at 7:40 AM, Jean-Baptiste Onofré <[email protected]> wrote:
Yes, you can mix the approaches together. For instance, you can
package in a Docker image: Karaf runtime + Cellar + your apps, and
then you mix Kubernetes with Cellar. That's the presentation I did a
while ago at ApacheCon.

Regards
JB

On Tue, Oct 4, 2022 at 7:12 AM Ephemeris Lappis
<[email protected]> wrote:
Hello.

Thanks for your explanations.

I understand that your 3rd choice is the only one that provides
multiple active and synchronized instances. But can't I run them as
pods inside a Kubernetes namespace, using deployments of an image
based on Karaf+Cellar, then use the Jolokia API, for example, to
deploy and update my applications as features, targeting any one of
the scaled instances, and let Cellar synchronize the other instances?

We already use Jolokia this way via Ansible playbooks to deploy
applications, as profiles instead of features, on Fuse clusters...

Thanks again.

Regards.

Ephemeris Lappis

On 03/10/2022 at 18:36, Jean-Baptiste Onofré wrote:
Hi,

In order:

1. Karaf HA Lock: you have one active instance; the other instances
are passive.
2. Kubernetes: you can orchestrate start/stop of the Karaf Docker
image, but Kubernetes doesn't sync the Karaf instances' state (like
config, installed features, etc.).
3. Cellar: syncs the Karaf instances together (you install a feature
on one Karaf instance, and the feature will be installed on the other
Karaf instances in the cluster).

Regards
JB

On Mon, Oct 3, 2022 at 5:44 PM Ephemeris Lappis
<[email protected]> wrote:
Hello.

I've just looked at the Cellar presentation. If I understand it
correctly, the presentation says that Cellar's main goal is "big
clusters", allowing automatic synchronization between Karaf instances.
It looks really nice in the presentation :)!

On the other hand, the basic lock mechanism only provides an
active/passive solution.

What should I prefer if my need is to provide both failover and load
balancing over a limited number of active instances, and not a "big
cluster"? Today we use 6 Fuse Karaf instances distributed on 3 VMs.
Is Cellar the right way, or did I miss something in the presentation?

Another thing: for other kinds of applications, my customer manages
several Kubernetes clusters. So I suppose that if a containerized
solution is preferred, it should run on Kubernetes, since all the
existing DevOps tooling is already based on it.

The presentation focuses on Mesos/Marathon but says that Kubernetes is
also an alternative solution. Right? In this case, what is the
preferred way to package and deploy Karaf: just create a custom
Karaf+Cellar image (the same way the presentation shows), and then
create a Kubernetes deployment with the needed sizing and scaled
replicas?
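Something like this minimal deployment is what I have in mind (the
image name, ports, and sizing are hypothetical):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: karaf-cellar
spec:
  replicas: 3                 # scaled instances, sized to our needs
  selector:
    matchLabels:
      app: karaf-cellar
  template:
    metadata:
      labels:
        app: karaf-cellar
    spec:
      containers:
        - name: karaf
          # hypothetical custom Karaf+Cellar image
          image: registry.example.com/karaf-cellar:1.0
          ports:
            - containerPort: 8181   # Karaf HTTP / Jolokia
            - containerPort: 5701   # Hazelcast clustering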

Some examples, perhaps?

Thanks in advance for your help.

Regards.
