On Fri, Nov 9, 2018 at 3:42 AM Dev Ops <sipandb...@hotmail.com> wrote:
>
> The switches above our environment had some vPC issues and the port
> channels went offline. The ports that had issues belonged to 2 of the
> gfs nodes in our environment. We have 3 storage nodes total, with the
> 3rd being the arbiter. I wound up rebooting the first 2 nodes and
> everything came back happy. After a few hours I noticed that the
> storage was up but complaining about being out of sync and needing
> healing. Within the hour I noticed a VM had paused itself due to
> storage issues. This is a small environment, for now, with only 30
> VMs. I am new to oVirt, so this is uncharted territory for me. I am
> tailing some logs and things look sort of normal, but Google is
> sending me down a wormhole.
>
> If I run "gluster volume heal cps-vms-gfs info", the number of entries
> seems to be changing pretty regularly. The logs are showing lots of
> entries like this:
>
> [2018-11-08 21:55:05.996675] I [MSGID: 114047]
> [client-handshake.c:1242:client_setvolume_cbk] 0-cps-vms-gfs-client-1:
> Server and Client lk-version numbers are not same, reopening the fds
> [2018-11-08 21:55:05.997693] I [MSGID: 108002]
> [afr-common.c:5312:afr_notify] 0-cps-vms-gfs-replicate-0:
> Client-quorum is met
> [2018-11-08 21:55:05.997717] I [MSGID: 114035]
> [client-handshake.c:202:client_set_lk_version_cbk]
> 0-cps-vms-gfs-client-1: Server lk version = 1
>
> I guess I am curious what else I should be looking for. Is this just
> taking forever to heal? Is there something else I can run, or should
> do, to verify things are actually getting better? I ran an actual heal
> command and it cleared everything for a few seconds, and then the
> entries started to populate again when I ran the info command.
>
> [root@cps-vms-gfs01 glusterfs]# gluster volume status
> Status of volume: cps-vms-gfs
> Gluster process                                TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick 10.8.255.1:/gluster/cps-vms-gfs01/brick  49152     0          Y       4054
> Brick 10.8.255.2:/gluster/cps-vms-gfs02/brick  49152     0          Y       4144
> Brick 10.8.255.3:/gluster/cps-vms-gfs03/brick  49152     0          Y       4294
> Self-heal Daemon on localhost                  N/A       N/A        Y       4279
> Self-heal Daemon on cps-vms-gfs02.cisco.com    N/A       N/A        Y       5185
> Self-heal Daemon on 10.196.152.145             N/A       N/A        Y       50948
>
> Task Status of Volume cps-vms-gfs
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> I am running oVirt 4.2.5 and gluster 3.12.11.
>
> Thanks!
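About "is there something else I can run to verify things are actually
getting better": the heal info count bouncing around is normal while the
self-heal daemon works through its queue; what matters is whether the
per-brick counts trend towards zero. A rough sketch for sampling them over
time (volume name taken from your output; "statistics heal-count" should be
available on your 3.12 build):

# print the per-brick pending-heal counts once a minute;
# the numbers should trend towards zero if healing is making progress
while true; do
    date
    gluster volume heal cps-vms-gfs statistics heal-count
    sleep 60
done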
Can you provide the output of "gluster volume heal cps-vms-gfs info", the
logs from /var/log/glusterfs/glfsheal-cps-vms-gfs.log, and the brick logs
from /var/log/glusterfs/bricks for this volume?
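If it helps, something along these lines (just a sketch; it assumes the
default log locations named above and writes to /tmp, so adjust the paths
as you like) should bundle everything from one node into a single archive
you can attach:

# capture the current heal state plus the heal and brick logs from this
# node; default glusterfs log paths assumed, output lands in /tmp
gluster volume heal cps-vms-gfs info > /tmp/heal-info-$(hostname -s).txt
tar czf /tmp/gluster-logs-$(hostname -s).tar.gz \
    /tmp/heal-info-$(hostname -s).txt \
    /var/log/glusterfs/glfsheal-cps-vms-gfs.log \
    /var/log/glusterfs/bricks/

Running it on each of the three storage nodes would let us compare what
the bricks saw during the outage.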