[ovirt-users] Re: Gluster volumes not healing (perhaps after host maintenance?)

2021-04-24 Thread Strahil Nikolov via Users
A/AAAA & PTR records are pretty important. As long as you set up your /etc/hosts 
in a format like this, you will be OK:

10.10.10.10 host1.anysubdomain.domain host1
10.10.10.11 host2.anysubdomain.domain host2
Usually the hostname is defined for each peer in /var/lib/glusterd/peers/. 
Can you check the contents on all nodes?
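
For example (a sketch; the path is the glusterd default), you can compare 
them with:

    # on each node
    cat /var/lib/glusterd/peers/*

For a given peer, the uuid= and hostname1= entries should match across all 
nodes, and hostname1= should be the name the other nodes use for that peer.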
Best Regards,
Strahil Nikolov
 
On Sat, Apr 24, 2021 at 21:57, David White via Users wrote:
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/WPTIX725OH43KE5FIK2I2G3H2FCMWEDH/


[ovirt-users] Re: Gluster volumes not healing (perhaps after host maintenance?)

2021-04-24 Thread David White via Users
As part of my troubleshooting earlier this morning, I gracefully shut down the 
ovirt-engine so that it would come up on a different host (can't remember if I 
mentioned that or not).
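
(Roughly, via the hosted-engine CLI on one of the hosts; a sketch from 
memory, assuming a standard self-hosted engine setup where the HA agents 
restart the engine VM on another eligible host:)

    hosted-engine --vm-shutdown     # gracefully stop the engine VM
    hosted-engine --vm-status       # watch it come back up on another host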

I just verified forward DNS on all 3 of the hosts.
All 3 resolve each other just fine, and are able to ping each other. The 
hostnames look good, too.

I'm fairly certain that this problem didn't exist prior to me shutting the host 
down and replacing the network card.

That said, I don't think I ever set up rDNS / PTR records to begin with. I don't 
recall reading that rDNS was a requirement, nor do I remember setting it up 
when I built the cluster a couple weeks ago. Is this a requirement?

I did set up forward DNS entries in /etc/hosts on each server, though.
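
For reference, this is how I checked (example name/IP from my cluster; note 
that getent also consults /etc/hosts, while dig queries DNS directly):

    getent hosts cha2-storage.mgt.barredowlweb.com   # forward lookup via NSS
    dig +short -x 10.1.0.10                          # reverse (PTR) lookup, DNS only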

Sent with ProtonMail Secure Email.

‐‐‐ Original Message ‐‐‐
On Saturday, April 24, 2021 11:03 AM, Strahil Nikolov wrote:

> Hi David,
>
> let's start with the DNS.
> Check that both nodes resolve each other (both A/AAAA & PTR records).
>
> If you set entries in /etc/hosts, check them out.
>
> Also, check the output of 'hostname -s' & 'hostname -f' on both hosts.
>
> Best Regards,
> Strahil Nikolov

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/CYPYALTFM7ITZZENSI6R5E6ZNT7TRY5Y/


[ovirt-users] Re: How do I share a disk across multiple VMs?

2021-04-24 Thread Gianluca Cecchi
On Sat, Apr 24, 2021 at 3:31 PM David White via Users wrote:

>
> Off topic, but something to address: We need a stable ovirt-guest-agent
> package. This doesn't seem to be working for me, although I'll take a look
> at it more closely again when I have some time:
> https://launchpad.net/ubuntu/focal/+source/ovirt-guest-agent
>
>
ovirt-guest-agent is deprecated in 4.4. See also the downstream 
documentation:
https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4/html/release_notes/deprecated_features_rhv
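
In 4.4 its functionality is provided by qemu-guest-agent instead; a minimal 
sketch for an Ubuntu focal guest (assuming stock repositories):

    sudo apt-get install qemu-guest-agent
    sudo systemctl enable --now qemu-guest-agent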

Gianluca
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/JWJJSLK24YY4XNJ7NIFKCAD7M7FKQO5H/


[ovirt-users] Re: Gluster volumes not healing (perhaps after host maintenance?)

2021-04-24 Thread Strahil Nikolov via Users
Hi David,

let's start with the DNS.
Check that both nodes resolve each other (both A/AAAA & PTR records).

If you set entries in /etc/hosts, check them out.

Also, check the output of 'hostname -s' & 'hostname -f' on both hosts.
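
For example (hypothetical output, using the naming from this thread), the 
short and full forms should be consistent on every node:

    # hostname -s
    cha2-storage
    # hostname -f
    cha2-storage.mgt.barredowlweb.com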

Best Regards,
Strahil Nikolov
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/6S4SBF42ABLYDWJNBZEBBGNB3FLSL53W/


[ovirt-users] Re: How do I share a disk across multiple VMs?

2021-04-24 Thread David White via Users
This turned into quite a discussion. LOL.
A lot of interesting points.

Thomas said --> 

> If only oVirt was a product rather than only a patchwork design!
I think Sandro already spoke to this a little bit, but I would echo what they 
said. oVirt is an open source project, so there's really nothing preventing 
any of us from jumping in and assisting where we can. Granted, I'm not much of 
a software developer, but I can see ways I could eventually contribute my 
time: replying to emails on the mailing list, providing engineering input on 
system-level decisions, testing RC releases, fixing / debugging Ansible 
scripts (I love Ansible!), helping to update documentation, etc.

Sandro Said -->
> I can understand the position here, but the fact that oVirt is developed and 
> stabilized against CentOS Stream, which is upstream of RHEL, doesn't prevent 
> you from running oVirt on RHEL or any other RHEL rebuild in production. If 
> you face any issue running oVirt on top of a downstream of CentOS Stream, 
> please report a bug for it and we'll be happy to handle it.
My hosts are running RHEL 8.3, and I have no plans to move to something 
different. Ironically, though, the vast majority of my VMs are running Ubuntu. 

Off topic, but something to address: We need a stable ovirt-guest-agent 
package. This doesn't seem to be working for me, although I'll take a look at 
it more closely again when I have some time: 
https://launchpad.net/ubuntu/focal/+source/ovirt-guest-agent

Thomas said --> 

> Yes, after what I know today, I should not have started with oVirt on 
> Gluster, but unfortunately HCI is exactly the most attractive grassroots use 
> case and the biggest growth opportunity for oVirt.

Serious question: What's preventing you (or anyone) from just spinning up new 
storage with NFS, iSCSI or whatever, mounting it to the engine, and migrating 
your VMs to that new storage?
Correct me if I'm wrong, but HCI is more of a philosophy and a framework than 
anything. There's nothing that prevents us from moving away from an HCI model.

In fact, I'm getting ready to do something that will already start to move my 
environment in that direction.
I have a 4th server that I originally bought last fall for testing & spare 
parts, but I've decided I want to put it into the datacenter, and use it as a 
backup destination, as well as a second DNS server for the overall environment. 
I have 3 spinning disks arriving next week that I'll put into a RAID 5 on that 
server, and my plan is to then build two different NFS mount points and expose 
those NFS shares to the oVirt cluster. I'm also half tempted to add that 
server as a host to the cluster, because it has 40 cores that would otherwise 
just sit there unused. I'll use 1 NFS share as a Backup domain, and I'll use 
the other NFS share to store images that don't need SSD speeds, such as ISOs.
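
A minimal sketch of the exports I have in mind (hypothetical paths and 
subnet; oVirt expects NFS storage to be owned by vdsm:kvm, i.e. UID/GID 
36:36):

    /exports/backup   10.1.0.0/24(rw,anonuid=36,anongid=36,all_squash)
    /exports/isos     10.1.0.0/24(rw,anonuid=36,anongid=36,all_squash)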

Sorry for the rabbit trail, but I think this plan is a classic example of my 
point that HCI is a philosophy, not a hard and fast rule.
You can do whatever you want. If you don't like HCI, why can't you move to 
something else?

Gianluca said -->
> And for sure many problems are there in Gluster implementations, but for 
> NFS-, FC- or iSCSI-based setups the situation, in my opinion, is quite a 
> bit better.
That's interesting to hear. Out of curiosity, why keep Gluster around? And 
also, why hasn't there been much of an effort to support Ceph in an HCI 
environment?
I actually had someone who I know is a Red Hat TAM advise me (off the record, 
off the clock, not through official Red Hat channels) to stay away from 
Gluster as well. After my research & testing, though, it was the *only* way I 
could see to deploy my environment on the limited budget that I had (and 
still have).

Eventually, yes, I would like to get better storage. One idea that comes to 
mind that would be stable and still relatively cheap is to grab a couple of 
Synology NAS devices, stick SSDs into them, put them into an HA pair, and 
expose that storage as an iSCSI mount.


- David

Sent with ProtonMail Secure Email.


___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/KTZV4TELAWIZ4EKS5OB736H5QMN46MSU/


[ovirt-users] Gluster volumes not healing (perhaps after host maintenance?)

2021-04-24 Thread David White via Users
I discovered that the servers I purchased did not come with 10Gbps network 
cards, like I thought they did. So my storage network has been running on a 
1Gbps connection for the past week, since I deployed the servers into the 
datacenter a little over a week ago. I purchased 10Gbps cards, and put one of 
my hosts into maintenance mode yesterday, prior to replacing the daughter card. 
It is now back online running fine on the 10Gbps card.

All VMs seem to be working, even when I migrate them onto cha2, which is the 
host I did maintenance on yesterday morning.
The other two hosts are still running on the 1Gbps connection, but I plan to do 
maintenance on them next week.

The oVirt manager shows that all 3 hosts are up, and that all of my volumes - 
and all of my bricks - are up. However, every time I look at the storage, it 
appears that the self-heal info for 1 of the volumes is 10 minutes, and the 
self-heal info for another volume is 50+ minutes.

This morning is the first time in the last couple of days that I've paid close 
attention to the numbers, but I don't see them going down.

When I log into each of the hosts, I see that everything is connected in gluster.
Interestingly, in this particular case, gluster on cha3 reports the peer at 
10.1.0.10 by its IP address rather than by its hostname (cha1).
The host that I did the maintenance on is cha2.

[root@cha3-storage dwhite]# gluster peer status
Number of Peers: 2

Hostname: 10.1.0.10
Uuid: 87a4f344-321a-48b9-adfb-e3d2b56b8e7b
State: Peer in Cluster (Connected)

Hostname: cha2-storage.mgt.barredowlweb.com
Uuid: 93e12dee-c37d-43aa-a9e9-f4740b9cab14
State: Peer in Cluster (Connected)

When I run `gluster volume heal data`, I see the following:
[root@cha3-storage dwhite]# gluster volume heal data
Launching heal operation to perform index self heal on volume data has been 
unsuccessful:
Commit failed on cha2-storage.mgt.barredowlweb.com. Please check log file for 
details.

I get the same results if I run the command on cha2, for any volume:
[root@cha2-storage dwhite]# gluster volume heal data
Launching heal operation to perform index self heal on volume data has been 
unsuccessful:
Glusterd Syncop Mgmt brick op 'Heal' failed. Please check glustershd log file 
for details.
[root@cha2-storage dwhite]# gluster volume heal vmstore
Launching heal operation to perform index self heal on volume vmstore has been 
unsuccessful:
Glusterd Syncop Mgmt brick op 'Heal' failed. Please check glustershd log file 
for details.
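
(Commands I can run to gather more detail, if useful; standard gluster CLI:)

    gluster volume status data       # shows Online (Y/N) and TCP port per brick
    gluster volume heal data info    # lists entries pending heal, per brick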

I see a lot of stuff like this on cha2 in /var/log/glusterfs/glustershd.log:
[2021-04-24 11:33:01.319888] I [rpc-clnt.c:1975:rpc_clnt_reconfig] 2-engine-client-0: changing port to 49153 (from 0)
[2021-04-24 11:33:01.329463] I [MSGID: 114057] [client-handshake.c:1128:select_server_supported_programs] 2-engine-client-0: Using Program [{Program-name=GlusterFS 4.x v1}, {Num=1298437}, {Version=400}]
[2021-04-24 11:33:01.330075] W [MSGID: 114043] [client-handshake.c:727:client_setvolume_cbk] 2-engine-client-0: failed to set the volume [{errno=2}, {error=No such file or directory}]
[2021-04-24 11:33:01.330116] W [MSGID: 114007] [client-handshake.c:752:client_setvolume_cbk] 2-engine-client-0: failed to get from reply dict [{process-uuid}, {errno=22}, {error=Invalid argument}]
[2021-04-24 11:33:01.330140] E [MSGID: 114044] [client-handshake.c:757:client_setvolume_cbk] 2-engine-client-0: SETVOLUME on remote-host failed [{remote-error=Brick not found}, {errno=2}, {error=No such file or directory}]
[2021-04-24 11:33:01.330155] I [MSGID: 114051] [client-handshake.c:879:client_setvolume_cbk] 2-engine-client-0: sending CHILD_CONNECTING event []
[2021-04-24 11:33:01.640480] I [rpc-clnt.c:1975:rpc_clnt_reconfig] 3-vmstore-client-0: changing port to 49154 (from 0)
The message "W [MSGID: 114007] [client-handshake.c:752:client_setvolume_cbk] 3-vmstore-client-0: failed to get from reply dict [{process-uuid}, {errno=22}, {error=Invalid argument}]" repeated 4 times between [2021-04-24 11:32:49.602164] and [2021-04-24 11:33:01.649850]
[2021-04-24 11:33:01.649867] E [MSGID: 114044] [client-handshake.c:757:client_setvolume_cbk] 3-vmstore-client-0: SETVOLUME on remote-host failed [{remote-error=Brick not found}, {errno=2}, {error=No such file or directory}]
[2021-04-24 11:33:01.649969] I [MSGID: 114051] [client-handshake.c:879:client_setvolume_cbk] 3-vmstore-client-0: sending CHILD_CONNECTING event []
[2021-04-24 11:33:01.650095] I [MSGID: 114018] [client.c:2225:client_rpc_notify] 3-vmstore-client-0: disconnected from client, process will keep trying to connect glusterd until brick's port is available [{conn-name=vmstore-client-0}]

How do I further troubleshoot?

Sent with ProtonMail Secure Email.

___
Users mailing list -- users@ovirt.org
To unsubscribe