Hello All, I've been struggling with a few issues upgrading my 3-node HCI cluster from 4.4 to 4.5.
At present the self-hosted engine VM is properly running oVirt 4.5 on CentOS Stream 8. I set the first host node into maintenance and installed the new node-ng image. I ran into an issue with rescue mode on boot, which appears to have been related to an LVM devices bug; I was able to work past that and get the node to boot. The node running the 4.5.2 image now boots properly, and the gluster/LVM mounts etc. all look good. I am able to activate the host and run VMs on it. However, the oVirt CLI is showing that all bricks on the host are DOWN, and I was unable to get the bricks back up even after doing a force start of the volumes.

Here is the glusterd log from the host in question when I try a force start on the engine volume (the other volumes are similar):

==> glusterd.log <==
The message "I [MSGID: 106568] [glusterd-svc-mgmt.c:266:glusterd_svc_stop] 0-management: bitd service is stopped" repeated 2 times between [2022-08-29 18:09:56.027147 +0000] and [2022-08-29 18:10:34.694144 +0000]
[2022-08-29 18:10:34.695348 +0000] I [MSGID: 106618] [glusterd-svc-helper.c:909:glusterd_attach_svc] 0-glusterd: adding svc glustershd (volume=engine) to existing process with pid 2473
[2022-08-29 18:10:34.695669 +0000] I [MSGID: 106131] [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: scrub already stopped
[2022-08-29 18:10:34.695691 +0000] I [MSGID: 106568] [glusterd-svc-mgmt.c:266:glusterd_svc_stop] 0-management: scrub service is stopped
[2022-08-29 18:10:34.695832 +0000] I [MSGID: 106617] [glusterd-svc-helper.c:698:glusterd_svc_attach_cbk] 0-management: svc glustershd of volume engine attached successfully to pid 2473
[2022-08-29 18:10:34.703718 +0000] E [MSGID: 106115] [glusterd-mgmt.c:119:gd_mgmt_v3_collate_errors] 0-management: Post commit failed on gluster2.xxxxx. Please check log file for details.
[2022-08-29 18:10:34.703774 +0000] E [MSGID: 106115] [glusterd-mgmt.c:119:gd_mgmt_v3_collate_errors] 0-management: Post commit failed on gluster1.xxxxx. Please check log file for details.
[2022-08-29 18:10:34.703797 +0000] E [MSGID: 106664] [glusterd-mgmt.c:1969:glusterd_mgmt_v3_post_commit] 0-management: Post commit failed on peers
[2022-08-29 18:10:34.703800 +0000] E [MSGID: 106664] [glusterd-mgmt.c:2664:glusterd_mgmt_v3_initiate_all_phases] 0-management: Post commit Op Failed

If I run the start command manually on the host CLI:

gluster volume start engine force
volume start: engine: failed: Post commit failed on gluster1.xxxx. Please check log file for details.
Post commit failed on gluster2.xxxx. Please check log file for details.

I feel like this may be an issue with the difference in major GlusterFS versions between the nodes, but I am unsure; the other nodes are still running ovirt-node-ng-4.4.6.3.

At this point I am afraid to bring down any other node to attempt upgrading it while the bricks on the first host are not in UP status, since I do not want to lose quorum and potentially disrupt running VMs.

Any idea why I can't seem to start the volumes on the upgraded host? Thanks!
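In case it helps, here is roughly what I can gather from the upgraded host to compare GlusterFS versions and brick state across the peers (a sketch; the volume name engine is from above, and hostnames are redacted):

```shell
# Installed GlusterFS version on this node (compare against the 4.4 nodes):
gluster --version

# Cluster-wide operating version: after a major upgrade the cluster may
# still be pinned to the old op-version until all peers are upgraded.
gluster volume get all cluster.op-version
gluster volume get all cluster.max-op-version

# Peer and brick state as glusterd sees it:
gluster peer status
gluster volume status engine

# Per-brick self-heal state, worth checking before taking another node down:
gluster volume heal engine info summary
```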
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/JOGBWNI5VY4URQPPBX4ALB5W4UD4YYKA/