[ovirt-users] Re: Gluster volumes not healing (perhaps after host maintenance?)
A & PTR records are pretty important. As long as you set up your /etc/hosts in a format like this, you will be OK:

10.10.10.10 host1.anysubdomain.domain host1
10.10.10.11 host2.anysubdomain.domain host2

Usually the hostname is defined for each peer in /var/lib/glusterd/peers. Can you check the contents on all nodes?

Best Regards,
Strahil Nikolov

On Sat, Apr 24, 2021 at 21:57, David White via Users wrote:

___ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-le...@ovirt.org Privacy Statement: https://www.ovirt.org/privacy-policy.html oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/WPTIX725OH43KE5FIK2I2G3H2FCMWEDH/
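A minimal sketch of the resolution check Strahil suggests. The helper and the host name passed to it are examples only (the real check would be run against your actual peer names on every node); it uses `getent`, so it exercises both DNS and /etc/hosts, whichever the system actually consults:

```shell
#!/bin/sh
# Sketch: verify that a peer name resolves to an IPv4 address via the
# system resolver (DNS and/or /etc/hosts). Names below are placeholders.
check_host() {
    name="$1"
    # First whitespace-separated field of the first line is the address.
    ip=$(getent ahostsv4 "$name" | awk 'NR==1 {print $1}')
    if [ -z "$ip" ]; then
        echo "$name: does not resolve"
        return 1
    fi
    echo "$name -> $ip"
}

# Replace with your real peer hostnames, e.g. host1.anysubdomain.domain
check_host localhost
```

Running this on each node, for each peer name listed in `gluster peer status`, quickly shows whether any node disagrees about a name.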
[ovirt-users] Re: Gluster volumes not healing (perhaps after host maintenance?)
As part of my troubleshooting earlier this morning, I gracefully shut down the ovirt-engine so that it would come up on a different host (I can't remember if I mentioned that or not). I just verified forward DNS on all 3 of the hosts. All 3 resolve each other just fine and are able to ping each other. The hostnames look good, too. I'm fairly certain that this problem didn't exist prior to me shutting the host down and replacing the network card. That said, I don't think I ever set up rDNS / PTR records to begin with. I don't recall reading that rDNS was a requirement, nor do I remember setting it up when I built the cluster a couple of weeks ago. Is this a requirement? I did set up forward DNS entries in /etc/hosts on each server, though.

Sent with ProtonMail Secure Email.

‐‐‐ Original Message ‐‐‐
On Saturday, April 24, 2021 11:03 AM, Strahil Nikolov wrote:

> Hi David,
>
> let's start with the DNS.
> Check that both nodes resolve each other (both A & PTR records).
>
> If you set entries in /etc/hosts, check them out.
>
> Also, check the output of 'hostname -s' & 'hostname -f' on both hosts.
>
> Best Regards,
> Strahil Nikolov
[ovirt-users] Re: How do I share a disk across multiple VMs?
On Sat, Apr 24, 2021 at 3:31 PM David White via Users wrote:
>
> Off topic, but something to address: We need a stable ovirt-guest-agent
> package. This doesn't seem to be working for me, although I'll take a look
> at it more closely again when I have some time:
> https://launchpad.net/ubuntu/focal/+source/ovirt-guest-agent

ovirt-guest-agent is deprecated in 4.4; its functionality is provided by qemu-guest-agent now. See also the downstream documentation: https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4/html/release_notes/deprecated_features_rhv

Gianluca
[ovirt-users] Re: Gluster volumes not healing (perhaps after host maintenance?)
Hi David,

let's start with the DNS. Check that both nodes resolve each other (both A & PTR records).

If you set entries in /etc/hosts, check them out.

Also, check the output of 'hostname -s' & 'hostname -f' on both hosts.

Best Regards,
Strahil Nikolov
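The hostname check above can be sketched as a one-off script. This is an illustration, not part of the original advice; `hostname -f` can fail on a box with no configured FQDN, so it falls back to the plain name:

```shell
#!/bin/sh
# Sketch: compare the short and fully-qualified hostnames, as suggested.
short=$(hostname -s)
fqdn=$(hostname -f 2>/dev/null || hostname)
echo "short=$short fqdn=$fqdn"
case "$fqdn" in
    "$short"*) echo "FQDN begins with the short name: looks consistent" ;;
    *)         echo "WARNING: short name and FQDN disagree" ;;
esac
```

If the two disagree on any host, that mismatch is a good first suspect for Gluster peers showing up under unexpected names.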
[ovirt-users] Re: How do I share a disk across multiple VMs?
This turned into quite a discussion. LOL. A lot of interesting points.

Thomas said -->
> If only oVirt was a product rather than only a patchwork design!

I think Sandro already spoke to this a little bit, but I would echo what they said. oVirt is an open source project, so there's really nothing preventing any of us from jumping in and assisting where we can. Granted, I'm not much of a software developer, but eventually I could see how I can contribute my time in some ways: replying to emails on the mailing list, providing engineering input on system-level decisions, testing RC releases, fixing / debugging Ansible scripts (I love Ansible!), helping to update documentation, etc.

Sandro said -->
> I can understand the position here, but the fact that oVirt is developed and
> stabilized against CentOS Stream which is upstream to RHEL doesn't prevent
> you to run oVirt on RHEL or any other RHEL rebuild on production. If you face
> any issue running oVirt on top of a downstream to CentOS Stream please report
> a bug for it and we'll be happy to handle.

My hosts are running RHEL 8.3, and I have no plans to move to something different. Ironically, though, the vast majority of my VMs are running Ubuntu.

Off topic, but something to address: We need a stable ovirt-guest-agent package. This doesn't seem to be working for me, although I'll take a look at it more closely again when I have some time: https://launchpad.net/ubuntu/focal/+source/ovirt-guest-agent

Thomas said -->
> Yes, after what I know today, I should not have started with oVirt on
> Gluster, but unfortunately HCI is exactly the most attractive grassroots use
> case and the biggest growth opportunity for oVirt.

Serious question: What's preventing you (or anyone) from just spinning up new storage with NFS, iSCSI or whatever, mounting it to the engine, and migrating your VMs to that new storage? Correct me if I'm wrong, but HCI is more of a philosophy and a framework than anything.
There's nothing that prevents us from moving away from an HCI model. In fact, I'm getting ready to do something that will already start to move my environment in that direction. I have a 4th server that I originally bought last fall for testing & spare parts, but I've decided I want to put it into the datacenter and use it as a backup destination, as well as a second DNS server for the overall environment. I have 3 spinning disks arriving next week that I'll put into a RAID 5 on that server, and my plan is then to build two different NFS mount points and expose those NFS shares to the oVirt cluster. I'm also half tempted to add that server as a host to the cluster as well, because it has 40 cores that would otherwise just sit there unused. I'll use 1 NFS share as a Backup domain, and I'll use the other NFS share to store images that don't need SSD speeds, such as ISOs.

Sorry for the rabbit trail, but I think this plan is a classic example of my point that HCI is a philosophy, not a hard and fast rule. You can do whatever you want. If you don't like HCI, why can't you move to something else?

Gianluca said -->
> And for sure many problems are there in Gluster implementations, but for NFS,
> FC or iSCSI based the situation in my opinion is quite better.

That's interesting to hear. Out of curiosity, why keep Gluster around? And also, why hasn't there been much of an effort to support Ceph in an HCI environment? I actually had someone who I know is a Red Hat TAM advise me (off the record, off the clock, not through official Red Hat channels) to stay away from Gluster as well. After my research & testing, though, it was the *only* way I could see to deploy my environment on the limited budget that I had (and still have). Eventually, yes, I would like to get better storage.
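For the two NFS shares described above, the server-side export might look something like the fragment below. The paths and the subnet are assumptions (not from the post); the `all_squash,anonuid=36,anongid=36` mapping follows the oVirt documentation's recommendation, since oVirt expects NFS storage domains to be accessible as vdsm:kvm (uid/gid 36):

```
# /etc/exports (sketch -- adjust paths and network to your environment)
/exports/backup  10.1.0.0/24(rw,sync,no_subtree_check,all_squash,anonuid=36,anongid=36)
/exports/isos    10.1.0.0/24(rw,sync,no_subtree_check,all_squash,anonuid=36,anongid=36)
```

After editing the file, `exportfs -ra` reloads the export table without restarting the NFS server.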
One idea that comes to mind that would be stable and still relatively cheap is to grab a couple of Synology NAS devices, stick SSDs into them, put them into an HA pair, and expose that storage as an iSCSI mount.

- David

Sent with ProtonMail Secure Email.
[ovirt-users] Gluster volumes not healing (perhaps after host maintenance?)
I discovered that the servers I purchased did not come with 10Gbps network cards, like I thought they did. So my storage network has been running on a 1Gbps connection for the past week, since I deployed the servers into the datacenter a little over a week ago. I purchased 10Gbps cards and put one of my hosts into maintenance mode yesterday, prior to replacing the daughter card. It is now back online, running fine on the 10Gbps card. All VMs seem to be working, even when I migrate them onto cha2, which is the host I did maintenance on yesterday morning. The other two hosts are still running on the 1Gbps connection, but I plan to do maintenance on them next week.

The oVirt manager shows that all 3 hosts are up, and that all of my volumes - and all of my bricks - are up. However, every time I look at the storage, it appears that the self-heal info for 1 of the volumes is 10 minutes, and the self-heal info for another volume is 50+ minutes. This morning is the first time in the last couple of days that I've paid close attention to the numbers, but I don't see them going down.

When I log into each of the hosts, I do see that everything is connected in Gluster. It is interesting to me in this particular case, though, that Gluster on cha3 reports the hostname of 10.1.0.10 as the IP address, and not the hostname (cha1). The host that I did the maintenance on is cha2.

[root@cha3-storage dwhite]# gluster peer status
Number of Peers: 2

Hostname: 10.1.0.10
Uuid: 87a4f344-321a-48b9-adfb-e3d2b56b8e7b
State: Peer in Cluster (Connected)

Hostname: cha2-storage.mgt.barredowlweb.com
Uuid: 93e12dee-c37d-43aa-a9e9-f4740b9cab14
State: Peer in Cluster (Connected)

When I run `gluster volume heal data`, I see the following:

[root@cha3-storage dwhite]# gluster volume heal data
Launching heal operation to perform index self heal on volume data has been unsuccessful:
Commit failed on cha2-storage.mgt.barredowlweb.com. Please check log file for details.
I get the same results if I run the command on cha2, for any volume:

[root@cha2-storage dwhite]# gluster volume heal data
Launching heal operation to perform index self heal on volume data has been unsuccessful:
Glusterd Syncop Mgmt brick op 'Heal' failed. Please check glustershd log file for details.

[root@cha2-storage dwhite]# gluster volume heal vmstore
Launching heal operation to perform index self heal on volume vmstore has been unsuccessful:
Glusterd Syncop Mgmt brick op 'Heal' failed. Please check glustershd log file for details.

I see a lot of stuff like this on cha2 in /var/log/glusterfs/glustershd.log:

[2021-04-24 11:33:01.319888] I [rpc-clnt.c:1975:rpc_clnt_reconfig] 2-engine-client-0: changing port to 49153 (from 0)
[2021-04-24 11:33:01.329463] I [MSGID: 114057] [client-handshake.c:1128:select_server_supported_programs] 2-engine-client-0: Using Program [{Program-name=GlusterFS 4.x v1}, {Num=1298437}, {Version=400}]
[2021-04-24 11:33:01.330075] W [MSGID: 114043] [client-handshake.c:727:client_setvolume_cbk] 2-engine-client-0: failed to set the volume [{errno=2}, {error=No such file or directory}]
[2021-04-24 11:33:01.330116] W [MSGID: 114007] [client-handshake.c:752:client_setvolume_cbk] 2-engine-client-0: failed to get from reply dict [{process-uuid}, {errno=22}, {error=Invalid argument}]
[2021-04-24 11:33:01.330140] E [MSGID: 114044] [client-handshake.c:757:client_setvolume_cbk] 2-engine-client-0: SETVOLUME on remote-host failed [{remote-error=Brick not found}, {errno=2}, {error=No such file or directory}]
[2021-04-24 11:33:01.330155] I [MSGID: 114051] [client-handshake.c:879:client_setvolume_cbk] 2-engine-client-0: sending CHILD_CONNECTING event []
[2021-04-24 11:33:01.640480] I [rpc-clnt.c:1975:rpc_clnt_reconfig] 3-vmstore-client-0: changing port to 49154 (from 0)
The message "W [MSGID: 114007] [client-handshake.c:752:client_setvolume_cbk] 3-vmstore-client-0: failed to get from reply dict [{process-uuid}, {errno=22}, {error=Invalid argument}]" repeated 4 times
between [2021-04-24 11:32:49.602164] and [2021-04-24 11:33:01.649850]
[2021-04-24 11:33:01.649867] E [MSGID: 114044] [client-handshake.c:757:client_setvolume_cbk] 3-vmstore-client-0: SETVOLUME on remote-host failed [{remote-error=Brick not found}, {errno=2}, {error=No such file or directory}]
[2021-04-24 11:33:01.649969] I [MSGID: 114051] [client-handshake.c:879:client_setvolume_cbk] 3-vmstore-client-0: sending CHILD_CONNECTING event []
[2021-04-24 11:33:01.650095] I [MSGID: 114018] [client.c:2225:client_rpc_notify] 3-vmstore-client-0: disconnected from client, process will keep trying to connect glusterd until brick's port is available [{conn-name=vmstore-client-0}]

How do I further troubleshoot?

Sent with ProtonMail Secure Email.
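One small first step when sifting a busy glustershd.log is to filter it down to just the W/E entries so failures like the "Brick not found" errors above stand out. This sketch assumes the bracketed-timestamp line format shown in the excerpt (level letter is the third field) and the default log path from the post:

```shell
#!/bin/sh
# Sketch: list only warning (W) and error (E) entries from glustershd.log.
# Gluster log lines start with "[DATE TIME] LEVEL ...", so the log level
# is the third whitespace-separated field.
extract_errors() {
    awk '$3 == "E" || $3 == "W"' "$1"
}

log=/var/log/glusterfs/glustershd.log
if [ -r "$log" ]; then
    extract_errors "$log"
else
    echo "no $log on this machine (run this on a Gluster host)"
fi
```

From there, the MSGID values in the surviving lines (e.g. 114044) can be searched in the Gluster documentation and bug tracker.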