It's important to understand the oVirt design philosophy.

That may be somewhat understated in the documentation, because I am afraid they 
copied it from VMware's vSphere, which might have copied it from Nutanix, which 
might have copied it from who knows who else... which might explain why they 
are a little shy about it.

The basic truth is: HA is a chicken-and-egg problem. Having several management 
engines won't get you HA, because in case of conflict, those engines can't 
easily decide which one is the boss.

Which is why oVirt (and most likely vSphere/Nutanix, too) will concede defeat 
(or death) on every startup.
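To make that concrete, here is a toy sketch of "concede on startup". A plain 
flock on a file stands in for what oVirt's hosted-engine agents do, as far as 
I can tell, with sanlock leases on the shared storage; the lock path is made up.

import fcntl
import sys

# Toy version of "concede defeat on startup": whoever cannot grab the
# exclusive lock exits instead of arguing about who is boss. flock on a
# local file is a stand-in for a sanlock lease on shared storage.
lock_file = open("/shared/engine.lock", "w")   # hypothetical path
try:
    fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
except BlockingIOError:
    sys.exit("another engine holds the lock: dying without a fight")

# ...only now act as the one and only management engine...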

What that mostly means is that you don't really need to ensure that a smart and 
highly complex management engine (ME), which can juggle dozens of infrastructure 
pieces against an optimal plan of operations, is in fact highly available at 
all times.

It's quite enough to have the infrastructure pieces ensure that the last plan 
this ME produced is faithfully executed.

So oVirt has a super-intelligent management engine build a plan.
That plan is written to super-primitive but reliable storage.
All hosts will faithfully (and without personal ambitions to improve on it) 
execute that last plan, which includes launching the management engine...
And that single newly started management engine can read the basic 
infrastructure data, as well as the latest plan, to hopefully create a better 
new plan, before it dies...
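A minimal sketch of that division of labor, assuming a made-up JSON plan format 
(this is not oVirt's actual on-disk layout, and every name below is invented):

import json
import time

PLAN_PATH = "/shared/cluster-plan.json"    # hypothetical shared-storage file

def vm_is_running(name: str) -> bool:      # stub; a real agent would ask libvirt
    return False

def start_vm(name: str) -> None:           # stub; a real agent would call libvirt
    print(f"starting {name}")

def host_agent_loop(host_id: str) -> None:
    """Re-read the last written plan and obey it, with no ambition to improve it."""
    while True:
        with open(PLAN_PATH) as f:
            plan = json.load(f)
        for vm in plan.get("vms", []):
            # The plan pins VMs (the management engine among them) to hosts;
            # each agent only acts on its own assignments.
            if vm["host"] == host_id and not vm_is_running(vm["name"]):
                start_vm(vm["name"])
        time.sleep(10)   # no election, no negotiation: just read and execute

The point is that this loop needs no intelligence and no peers: reliable 
storage plus obedient hosts is what survives the engine's death.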

And that's why, unless your ME always dies before a new plan can be created, 
you don't need HA for the ME: It's sufficient to have a good-enough plan for 
all hosts.

Like far too many clusters, oVirt relegates HA to a passive storage device that 
is always highly available. With SANs and NFS filers, that's hopefully solved 
in hardware. With HCI Gluster setups, it's done with majority voting, hopefully.
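The voting idea itself fits in a few lines. A toy version (the real logic lives 
inside Gluster's replication code and is nothing like this simple):

from collections import Counter

def quorum_read(replica_values: list[bytes]) -> bytes:
    """Return the value a strict majority of replicas agrees on, or refuse."""
    value, votes = Counter(replica_values).most_common(1)[0]
    if votes * 2 <= len(replica_values):
        raise RuntimeError("no majority: refuse to serve possibly stale data")
    return value

# With 3 replicas, one node can be wrong (or down) and reads still succeed:
assert quorum_read([b"plan-v7", b"plan-v7", b"plan-v6"]) == b"plan-v7"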

All that said...

I've rarely had all three nodes register just perfectly in a 3-node oVirt HCI 
cluster, and I have no idea why; it's been like that on both 4.3 and 4.4.

I have almost always had to add the two additional nodes via 'add host' to make 
them available both as compute nodes and as Gluster peers. On the other hand, 
it just works, and doing it twice or a hundred times won't break a thing.
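For what it's worth, that 'add host' step can also be scripted against the 
engine API with the Python SDK (ovirt-engine-sdk4); the hostnames, cluster name 
and credentials below are placeholders:

import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url="https://engine.example.com/ovirt-engine/api",
    username="admin@internal",
    password="...",      # placeholder
    insecure=True,       # lab shortcut; use ca_file=... in production
)
hosts_service = connection.system_service().hosts_service()

for name in ("node2.example.com", "node3.example.com"):
    # In an HCI cluster with the Gluster service enabled, this should also
    # register the node as a Gluster peer.
    hosts_service.add(
        types.Host(
            name=name,
            address=name,
            root_password="...",       # placeholder
            cluster=types.Cluster(name="Default"),
        )
    )

connection.close()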

And that is true for almost every component of oVirt: practically all services 
can fail, or be restarted at any time, without causing a major disruption or 
outright failure. That's where I can't help but admire this fail-safe approach 
(which somewhat unfortunately might not have been invented at Red Hat, even if 
Moshe Bar most likely had a hand in it).

It never hurts to make sure that you add those extra nodes with the ability to 
run the management engine, but it's also something you can always add later to 
any host (it just takes Ansible patience to do so).
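If I read the API right, that late retrofit maps to a host reinstall with the 
hosted-engine deploy flag (what the admin portal offers as Reinstall with 
'Hosted Engine: Deploy'), which is what kicks off that Ansible run; a rough 
sketch, with the same placeholder connection details as above:

import ovirtsdk4 as sdk

connection = sdk.Connection(
    url="https://engine.example.com/ovirt-engine/api",
    username="admin@internal",
    password="...",      # placeholder
    insecure=True,
)
hosts_service = connection.system_service().hosts_service()

host = hosts_service.list(search="name=node3.example.com")[0]
host_service = hosts_service.host_service(host.id)
host_service.deactivate()    # request maintenance first
# (a real script would poll host status until it reaches 'maintenance')
host_service.install(deploy_hosted_engine=True)   # hosted-engine deploy run
connection.close()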

Today I just consider that one of dozens, if not hundreds, of quirks of oVirt 
that I find 'amazing' in a product that is also sold commercially.