[ovirt-users] Re: storage healing question
Hi,

Can you restart the self-heal daemon by running `gluster volume start bgl-vms-gfs force` and then launch the heal again? If you are seeing different entries and counts each time you run heal info, there is likely a network issue (a disconnect) between the (gluster fuse?) mount and the bricks of the volume, leading to pending heals. (A short sketch of the full sequence is at the bottom of this mail.)

Also, there was a bug in arbiter volumes [1] that was fixed in glusterfs 3.12.15. It can cause VMs to pause when you reboot the arbiter node, so upgrading to that gluster version is recommended.

HTH,
Ravi

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1637989

On Mon, Nov 12, 2018 at 1:09 PM, Dev Ops wrote:
> Any help would be appreciated. I have since rebooted the 3rd gluster node, which is the arbiter. This doesn't seem to want to heal.
>
> gluster volume heal bgl-vms-gfs info | grep Number
> Number of entries: 68
> Number of entries: 0
> Number of entries: 68
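A minimal sketch of the sequence I am suggesting (assuming the volume name bgl-vms-gfs from your mail; run on any node in the trusted pool):

# force-start the volume; this respawns the self-heal daemon (shd) if it is down
gluster volume start bgl-vms-gfs force

# check that the Self-heal Daemon shows Online = Y on all three nodes
gluster volume status bgl-vms-gfs

# re-trigger an index heal, then re-check the pending counts
gluster volume heal bgl-vms-gfs
gluster volume heal bgl-vms-gfs info | grep "Number of entries"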
[ovirt-users] Re: storage healing question
Any help would be appreciated. I have since rebooted the 3rd gluster node, which is the arbiter. This doesn't seem to want to heal.

gluster volume heal bgl-vms-gfs info | grep Number
Number of entries: 68
Number of entries: 0
Number of entries: 68
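For reference, a rough way to watch whether these counts are trending down rather than just oscillating (a hypothetical one-liner; the 60-second interval is arbitrary):

watch -n 60 'gluster volume heal bgl-vms-gfs info | grep "Number of entries"'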
[ovirt-users] Re: storage healing question
Just a quick note: the volume in question is actually called bgl-vms-gfs. The original message is still valid.

[root@bgl-vms-gfs03 bricks]# gluster volume heal bgl-vms-gfs info
Brick 10.8.255.1:/gluster/bgl-vms-gfs01/brick
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.989
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.988
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.423
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.612
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.614
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.611
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.236
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.48
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.52
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.423
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.424
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.611
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.612
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.799
/.shard/18954415-3210-4d93-8591-0b3e1e5b3a16.498
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.1175
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.1551
/c71bb8b0-c669-4bf6-8348-14aafd4a805f/images/9dc54d22-7cb3-4e07-adbb-70f0ec5b7e6b/5f8515f7-3fae-4af6-adc4-d38426a9aa72
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.611
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.424
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.50
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.51
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.424
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.236
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.425
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.428
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.427
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.614
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.1363
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.238
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.428
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.612
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.423
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.614
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.987
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.429
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.429
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.241
/c71bb8b0-c669-4bf6-8348-14aafd4a805f/dom_md/ids
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.987
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.987
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.241
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.429
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.424
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.987
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.987
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.238
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.428
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.238
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.611
/.shard/18954415-3210-4d93-8591-0b3e1e5b3a16.504
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.238
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.428
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.612
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.614
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.241
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.429
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.236
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.989
/.shard/18954415-3210-4d93-8591-0b3e1e5b3a16.909
/__DIRECT_IO_TEST__
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.240
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.429
/.shard/18954415-3210-4d93-8591-0b3e1e5b3a16.497
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.238
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.428
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.241
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.991
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.236
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.990
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.48
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.52
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.424
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.425
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.799
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.1175
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.1551
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.1363
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.612
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.614
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.236
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.611
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.989
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.988
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.990
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.991
Status: Connected
Number of entries: 86

Brick 10.8.255.2:/gluster/bgl-vms-gfs02/brick
Status: Connected
Number of entries: 0

Brick 10.8.255.3:/gluster/bgl-vms-gfs03/brick
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.236
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.48
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.52
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.423
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.424
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.611
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.612
/.sha
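For a shorter view of the same data, the per-brick counts can be pulled without the full file listing (a sketch; to my knowledge `statistics heal-count` is also available in this gluster release):

# summary lines only
gluster volume heal bgl-vms-gfs info | grep -E "^(Brick|Status|Number)"

# or ask gluster for the counts directly
gluster volume heal bgl-vms-gfs statistics heal-count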
[ovirt-users] Re: storage healing question
On Fri, Nov 9, 2018 at 3:42 AM Dev Ops wrote:
>
> The switches above our environment had some VPC issues and the port channels went offline. The ports that had issues belonged to 2 of the gfs nodes in our environment. We have 3 storage nodes total, with the 3rd being the arbiter. I wound up rebooting the first 2 nodes and everything came back happy. After a few hours I noticed that the storage was up but complaining about being out of sync and needing healing. Within the hour I noticed a VM had paused itself due to storage issues. This is a small environment, for now, with only 30 VMs. I am new to oVirt, so this is uncharted territory for me. I am tailing some logs and things look sort of normal, and Google is sending me down a wormhole.
>
> If I run "gluster volume heal cps-vms-gfs info", the number of entries seems to be changing pretty regularly. Logs are showing lots of entries like this:
>
> [2018-11-08 21:55:05.996675] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0-cps-vms-gfs-client-1: Server and Client lk-version numbers are not same, reopening the fds
> [2018-11-08 21:55:05.997693] I [MSGID: 108002] [afr-common.c:5312:afr_notify] 0-cps-vms-gfs-replicate-0: Client-quorum is met
> [2018-11-08 21:55:05.997717] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-cps-vms-gfs-client-1: Server lk version = 1
>
> I guess I am curious what else I should be looking for. Is this just taking forever to heal? Is there something else I can run, or something I should do, to verify things are actually getting better? I ran an actual heal command and it cleared everything for a few seconds, and then the entries started to populate again when I did the info command.
>
> [root@cps-vms-gfs01 glusterfs]# gluster volume status
> Status of volume: cps-vms-gfs
> Gluster process                                TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick 10.8.255.1:/gluster/cps-vms-gfs01/brick  49152     0          Y       4054
> Brick 10.8.255.2:/gluster/cps-vms-gfs02/brick  49152     0          Y       4144
> Brick 10.8.255.3:/gluster/cps-vms-gfs03/brick  49152     0          Y       4294
> Self-heal Daemon on localhost                  N/A       N/A        Y       4279
> Self-heal Daemon on cps-vms-gfs02.cisco.com    N/A       N/A        Y       5185
> Self-heal Daemon on 10.196.152.145             N/A       N/A        Y       50948
>
> Task Status of Volume cps-vms-gfs
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> I am running ovirt 4.2.5 and gluster 3.12.11.

Can you provide the output of `gluster volume heal cps-vms-gfs info`, the logs from /var/log/glusterfs/glfsheal-cps-vms-gfs.log, and the brick logs from /var/log/glusterfs/bricks for this volume?

> Thanks!
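While gathering those, a rough way to check the same logs for disconnects around the incident window (a sketch only; the paths are the ones mentioned above, and the grep pattern is just a guess at the relevant messages):

# scan the heal log and all brick logs for client/brick disconnects and ping timeouts
grep -iE "disconnect|ping timer" \
    /var/log/glusterfs/glfsheal-cps-vms-gfs.log \
    /var/log/glusterfs/bricks/*.log | tail -n 50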