On Sun, May 29, 2022 at 9:03 PM Jonathan Baecker <jonba...@gmail.com> wrote:
>
> On 29.05.22 at 19:24, Nir Soffer wrote:
>
> On Sun, May 29, 2022 at 7:50 PM Jonathan Baecker <jonba...@gmail.com> wrote:
>
> Hello everybody,
>
> we run a 3-node self-hosted cluster with GlusterFS. I had a lot of problems
> upgrading ovirt from 4.4.10 to 4.5.0.2 and now we have cluster instability.
>
> First I will write down the problems I had with upgrading, so you get a 
> bigger picture:
>
> The engine update went fine.
> But I could not update the nodes because of a wrong imgbase version, so I did a
> manual update to 4.5.0.1 and later to 4.5.0.2. The first time after updating, a
> node was still booting into 4.4.10, so I did a reinstall.
> Then after the second reboot I ended up in emergency mode. After a long search I
> figured out that lvm.conf now uses use_devicesfile, but there it uses the wrong
> filters. So I commented this out and added the old filters back.
> I did this procedure on all 3 nodes.
>
> When use_devicesfile (the default in 4.5) is enabled, the lvm filter is not
> used. During installation the old lvm filter is removed.
>
> Can you share more info on why it does not work for you?
>
> The problem was that the node could not mount the gluster volumes anymore
> and ended up in emergency mode.
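
For reference, a minimal way to check this from the emergency shell is to compare
the PVs that LVM can see with the entries in the LVM devices file, and to activate
the gluster VGs by hand if they are missing. This is only a sketch; the VG names
are taken from the lsblk output below, and the supported way to populate the
devices file is "vdsm-tool config-lvm-filter", discussed later in this thread:

    lvmdevices                        # list entries in /etc/lvm/devices/system.devices
    pvs -o pv_name,vg_name            # which PVs and VGs LVM can currently see
    vgchange -ay gluster_vg_sda gluster_vg_nvme0n1p3   # try to activate the gluster VGs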
>
> - output of lsblk
>
> NAME                                                       MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
> sda                                                          8:0    0   1.8T  0 disk
> `-XA1920LE10063_HKS028AV                                   253:0    0   1.8T  0 mpath
>   |-gluster_vg_sda-gluster_thinpool_gluster_vg_sda_tmeta   253:16   0     9G  0 lvm
>   | `-gluster_vg_sda-gluster_thinpool_gluster_vg_sda-tpool 253:18   0   1.7T  0 lvm
>   |   |-gluster_vg_sda-gluster_thinpool_gluster_vg_sda     253:19   0   1.7T  1 lvm
>   |   |-gluster_vg_sda-gluster_lv_data                     253:20   0   100G  0 lvm   /gluster_bricks/data
>   |   `-gluster_vg_sda-gluster_lv_vmstore                  253:21   0   1.6T  0 lvm   /gluster_bricks/vmstore
>   `-gluster_vg_sda-gluster_thinpool_gluster_vg_sda_tdata   253:17   0   1.7T  0 lvm
>     `-gluster_vg_sda-gluster_thinpool_gluster_vg_sda-tpool 253:18   0   1.7T  0 lvm
>       |-gluster_vg_sda-gluster_thinpool_gluster_vg_sda     253:19   0   1.7T  1 lvm
>       |-gluster_vg_sda-gluster_lv_data                     253:20   0   100G  0 lvm   /gluster_bricks/data
>       `-gluster_vg_sda-gluster_lv_vmstore                  253:21   0   1.6T  0 lvm   /gluster_bricks/vmstore
> sr0                                                         11:0    1  1024M  0 rom
> nvme0n1                                                    259:0    0 238.5G  0 disk
> |-nvme0n1p1                                                259:1    0     1G  0 part  /boot
> |-nvme0n1p2                                                259:2    0   134G  0 part
> | |-onn-pool00_tmeta                                       253:1    0     1G  0 lvm
> | | `-onn-pool00-tpool                                     253:3    0    87G  0 lvm
> | |   |-onn-ovirt--node--ng--4.5.0.2--0.20220513.0+1       253:4    0    50G  0 lvm   /
> | |   |-onn-pool00                                         253:7    0    87G  1 lvm
> | |   |-onn-home                                           253:8    0     1G  0 lvm   /home
> | |   |-onn-tmp                                            253:9    0     1G  0 lvm   /tmp
> | |   |-onn-var                                            253:10   0    15G  0 lvm   /var
> | |   |-onn-var_crash                                      253:11   0    10G  0 lvm   /var/crash
> | |   |-onn-var_log                                        253:12   0     8G  0 lvm   /var/log
> | |   |-onn-var_log_audit                                  253:13   0     2G  0 lvm   /var/log/audit
> | |   |-onn-ovirt--node--ng--4.5.0.1--0.20220511.0+1       253:14   0    50G  0 lvm
> | |   `-onn-var_tmp                                        253:15   0    10G  0 lvm   /var/tmp
> | |-onn-pool00_tdata                                       253:2    0    87G  0 lvm
> | | `-onn-pool00-tpool                                     253:3    0    87G  0 lvm
> | |   |-onn-ovirt--node--ng--4.5.0.2--0.20220513.0+1       253:4    0    50G  0 lvm   /
> | |   |-onn-pool00                                         253:7    0    87G  1 lvm
> | |   |-onn-home                                           253:8    0     1G  0 lvm   /home
> | |   |-onn-tmp                                            253:9    0     1G  0 lvm   /tmp
> | |   |-onn-var                                            253:10   0    15G  0 lvm   /var
> | |   |-onn-var_crash                                      253:11   0    10G  0 lvm   /var/crash
> | |   |-onn-var_log                                        253:12   0     8G  0 lvm   /var/log
> | |   |-onn-var_log_audit                                  253:13   0     2G  0 lvm   /var/log/audit
> | |   |-onn-ovirt--node--ng--4.5.0.1--0.20220511.0+1       253:14   0    50G  0 lvm
> | |   `-onn-var_tmp                                        253:15   0    10G  0 lvm   /var/tmp
> | `-onn-swap                                               253:5    0    20G  0 lvm   [SWAP]
> `-nvme0n1p3                                                259:3    0    95G  0 part
>   `-gluster_vg_nvme0n1p3-gluster_lv_engine                 253:6    0    94G  0 lvm   /gluster_bricks/engine
>
> - The old lvm filter used, and why it was needed
>
> filter = 
> ["a|^/dev/disk/by-id/lvm-pv-uuid-Nn7tZl-TFdY-BujO-VZG5-EaGW-5YFd-Lo5pwa$|", 
> "a|^/dev/disk/by-id/lvm-pv-uuid-Wcbxnx-2RhC-s1Re-s148-nLj9-Tr3f-jj4VvE$|", 
> "a|^/dev/disk/by-id/lvm-pv-uuid-lX51wm-H7V4-3CTn-qYob-Rkpx-Tptd-t94jNL$|", 
> "r|.*|"]
>
> I don't remember exactly anymore why it was needed, but without it the node was
> not working correctly. I think I even used vdsm-tool config-lvm-filter.

I think that if you list the devices in this filter:

    ls -lh /dev/disk/by-id/lvm-pv-uuid-Nn7tZl-TFdY-BujO-VZG5-EaGW-5YFd-Lo5pwa \
           /dev/disk/by-id/lvm-pv-uuid-Wcbxnx-2RhC-s1Re-s148-nLj9-Tr3f-jj4VvE \
           /dev/disk/by-id/lvm-pv-uuid-lX51wm-H7V4-3CTn-qYob-Rkpx-Tptd-t94jNL

You will see that these are the devices used by these vgs:

    gluster_vg_sda, gluster_vg_nvme0n1p3, onn
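
One way to confirm the mapping is to list the PVs with their UUIDs and VGs (a
sketch; the exact output will differ per node):

    pvs -o pv_name,pv_uuid,vg_name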

>
> - output of vdsm-tool config-lvm-filter
>
> Analyzing host...
> Found these mounted logical volumes on this host:
>
>   logical volume:  /dev/mapper/gluster_vg_nvme0n1p3-gluster_lv_engine
>   mountpoint:      /gluster_bricks/engine
>   devices:         /dev/nvme0n1p3
>
>   logical volume:  /dev/mapper/gluster_vg_sda-gluster_lv_data
>   mountpoint:      /gluster_bricks/data
>   devices:         /dev/mapper/XA1920LE10063_HKS028AV
>
>   logical volume:  /dev/mapper/gluster_vg_sda-gluster_lv_vmstore
>   mountpoint:      /gluster_bricks/vmstore
>   devices:         /dev/mapper/XA1920LE10063_HKS028AV
>
>   logical volume:  /dev/mapper/onn-home
>   mountpoint:      /home
>   devices:         /dev/nvme0n1p2
>
>   logical volume:  /dev/mapper/onn-ovirt--node--ng--4.5.0.2--0.20220513.0+1
>   mountpoint:      /
>   devices:         /dev/nvme0n1p2
>
>   logical volume:  /dev/mapper/onn-swap
>   mountpoint:      [SWAP]
>   devices:         /dev/nvme0n1p2
>
>   logical volume:  /dev/mapper/onn-tmp
>   mountpoint:      /tmp
>   devices:         /dev/nvme0n1p2
>
>   logical volume:  /dev/mapper/onn-var
>   mountpoint:      /var
>   devices:         /dev/nvme0n1p2
>
>   logical volume:  /dev/mapper/onn-var_crash
>   mountpoint:      /var/crash
>   devices:         /dev/nvme0n1p2
>
>   logical volume:  /dev/mapper/onn-var_log
>   mountpoint:      /var/log
>   devices:         /dev/nvme0n1p2
>
>   logical volume:  /dev/mapper/onn-var_log_audit
>   mountpoint:      /var/log/audit
>   devices:         /dev/nvme0n1p2
>
>   logical volume:  /dev/mapper/onn-var_tmp
>   mountpoint:      /var/tmp
>   devices:         /dev/nvme0n1p2
>
> Configuring LVM system.devices.
> Devices for following VGs will be imported:
>
>  gluster_vg_sda, gluster_vg_nvme0n1p3, onn
>
> To properly configure the host, we need to add multipath
> blacklist in /etc/multipath/conf.d/vdsm_blacklist.conf:
>
>   blacklist {
>       wwid "eui.0025388901b1e26f"
>   }
>
>
> Configure host? [yes,NO]

If you run "vdsm-tool config-lvm-filter" and confirm with "yes", I
think all the vgs
will be imported properly into lvm devices file.

I don't think it will solve the storage issues you have since Feb
2022, but at least
you will have a standard configuration and the next upgrade will not revert your
local settings.
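
After confirming, you can verify the result with something like this (a sketch;
the paths are taken from the tool output above):

    lvmdevices                                     # should list the PVs of gluster_vg_sda, gluster_vg_nvme0n1p3 and onn
    cat /etc/multipath/conf.d/vdsm_blacklist.conf  # should contain the blacklisted NVMe wwid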

> If using lvm devices does not work for you, you can enable the lvm filter in the
> vdsm configuration by adding a drop-in file:
>
> $ cat /etc/vdsm/vdsm.conf.d/99-local.conf
> [lvm]
> config_method = filter
>
> And run:
>
>     vdsm-tool config-lvm-filter
>
> to configure the lvm filter in the best way for vdsm. If this does not create
> the right filter we would like to know why, but in general you should use lvm
> devices, since it avoids the trouble of maintaining the filter and dealing with
> upgrades and user-edited lvm filters.
>
> If you disable use_devicesfile, the next vdsm upgrade will enable it back
> unless you change the configuration.
>
> I would be happy to just use the default, if there is a way to make
> use_devicesfile work.
>
> Also, even if you disable use_devicesfile in lvm.conf, vdsm still uses --devices
> instead of a filter when running lvm commands, and lvm commands run by vdsm
> ignore your lvm filter, since the --devices option overrides the system settings.
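
For illustration, a command run this way looks roughly like this; the point is
that --devices limits the command to the listed devices regardless of lvm.conf
(a sketch, using the multipath device from the output above):

    lvs --devices /dev/mapper/XA1920LE10063_HKS028AV gluster_vg_sda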
>
> ...
>
> I noticed some unsynced volume warnings, but because I also had this in the past
> after upgrading, I thought they would disappear after some time. The next day
> they were still there, so I decided to put the nodes into maintenance mode again
> and restart the glusterd service. After some time the sync warnings were gone.
>
> Not clear what these warnings are, I guess Gluster warnings?
>
> Yes, they were Gluster warnings; under Storage -> Volumes it was saying that
> some entries are unsynced.
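
If the unsynced entries come back, it is worth checking the heal state directly
on one of the nodes, for example (a sketch; the volume names are taken from the
brick paths above, and "info summary" needs a reasonably recent Gluster):

    gluster volume heal engine info summary
    gluster volume heal data info summary
    gluster volume heal vmstore info summary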
>
> So now the actual problem:
>
> Since then the cluster has been unstable. I get different errors and warnings,
> like:
>
> VM [name] is not responding
> out of nowhere an HA VM gets migrated
> VM migration can fail
> VM backup with snapshotting and export takes very long
>
> How do you back up the VMs? Do you use a backup application? How is it
> configured?
>
> I use a self-made Python script, which uses the REST API. I create a snapshot
> of the VM, build a new VM from that snapshot and move the new one to the
> export domain.

This is not very efficient - it copies the entire vm at the point in time of the
snapshot and then copies it again to the export domain.

If you use a backup application supporting the incremental backup API, the first
full backup will copy the entire vm once, but later incremental backups will copy
only the changes since the last backup.
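
For reference, starting an incremental backup through the REST API looks roughly
like this (a minimal sketch; the engine URL, credentials and UUIDs are
placeholders, the disks must have incremental backup enabled, and you still need
to poll the backup phase, download the disks via imageio and finalize the backup
afterwards; omit from_checkpoint_id for the initial full backup):

    curl -s -k --user admin@internal:password \
        -H 'Content-Type: application/xml' \
        -X POST \
        -d '<backup>
              <from_checkpoint_id>CHECKPOINT-UUID-FROM-LAST-BACKUP</from_checkpoint_id>
              <disks><disk id="DISK-UUID"/></disks>
            </backup>' \
        'https://engine.example.org/ovirt-engine/api/vms/VM-UUID/backups'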

>
> VMs are getting very slow sometimes
> Storage domain vmstore experienced a high latency of 9.14251
> ovs|00001|db_ctl_base|ERR|no key "dpdk-init" in Open_vSwitch record "." 
> column other_config
> 489279 [1064359]: s8 renewal error -202 delta_length 10 last_success 489249
> 444853 [2243175]: s27 delta_renew read timeout 10 sec offset 0 
> /rhev/data-center/mnt/glusterSD/onode1.example.org:_vmstore/3cf83851-1cc8-4f97-8960-08a60b9e25db/dom_md/ids
> 471099 [2243175]: s27 delta_renew read timeout 10 sec offset 0 
> /rhev/data-center/mnt/glusterSD/onode1.example.org:_vmstore/3cf83851-1cc8-4f97-8960-08a60b9e25db/dom_md/ids
> many of: 424035 [2243175]: s27 delta_renew long write time XX sec
>
> All these issues tell us that your storage is not working correctly.
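
A quick way to see the latency from a host is to time a direct read of the
sanlock ids file from the log above (a sketch; on healthy storage this should
complete well under a second):

    time dd if=/rhev/data-center/mnt/glusterSD/onode1.example.org:_vmstore/3cf83851-1cc8-4f97-8960-08a60b9e25db/dom_md/ids \
        of=/dev/null bs=1M count=1 iflag=direct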
>
> sanlock.log is full of renewal errors from May:
>
> $ grep 2022-05- sanlock.log | wc -l
> 4844
>
> $ grep 2022-05- sanlock.log | grep 'renewal error' | wc -l
> 631
>
> But there is a lot of trouble from earlier months:
>
> $ grep 2022-04- sanlock.log | wc -l
> 844
> $ grep 2022-04- sanlock.log | grep 'renewal error' | wc -l
> 29
>
> $ grep 2022-03- sanlock.log | wc -l
> 1609
> $ grep 2022-03- sanlock.log | grep 'renewal error' | wc -l
> 483
>
> $ grep 2022-02- sanlock.log | wc -l
> 826
> $ grep 2022-02- sanlock.log | grep 'renewal error' | wc -l
> 242
>
> Here sanlock log looks healthy:
>
> $ grep 2022-01- sanlock.log | wc -l
> 3
> $ grep 2022-01- sanlock.log | grep 'renewal error' | wc -l
> 0
>
> $ grep 2021-12- sanlock.log | wc -l
> 48
> $ grep 2021-12- sanlock.log | grep 'renewal error' | wc -l
> 0
>
> vdsm log shows that 2 domains are not accessible:
>
> $ grep ERROR vdsm.log
> 2022-05-29 15:07:19,048+0200 ERROR (check/loop) [storage.monitor]
> Error checking path
> /rhev/data-center/mnt/glusterSD/onode1.example.org:_data/de5f4123-0fac-4238-abcf-a329c142bd47/dom_md/metadata
> (monitor:511)
> 2022-05-29 16:33:59,049+0200 ERROR (check/loop) [storage.monitor]
> Error checking path
> /rhev/data-center/mnt/glusterSD/onode1.example.org:_data/de5f4123-0fac-4238-abcf-a329c142bd47/dom_md/metadata
> (monitor:511)
> 2022-05-29 16:34:39,049+0200 ERROR (check/loop) [storage.monitor]
> Error checking path
> /rhev/data-center/mnt/glusterSD/onode1.example.org:_data/de5f4123-0fac-4238-abcf-a329c142bd47/dom_md/metadata
> (monitor:511)
> 2022-05-29 17:21:39,050+0200 ERROR (check/loop) [storage.monitor]
> Error checking path
> /rhev/data-center/mnt/glusterSD/onode1.example.org:_data/de5f4123-0fac-4238-abcf-a329c142bd47/dom_md/metadata
> (monitor:511)
> 2022-05-29 17:55:59,712+0200 ERROR (check/loop) [storage.monitor]
> Error checking path
> /rhev/data-center/mnt/glusterSD/onode1.example.org:_vmstore/3cf83851-1cc8-4f97-8960-08a60b9e25db/dom_md/metadata
> (monitor:511)
> 2022-05-29 17:56:19,711+0200 ERROR (check/loop) [storage.monitor]
> Error checking path
> /rhev/data-center/mnt/glusterSD/onode1.example.org:_vmstore/3cf83851-1cc8-4f97-8960-08a60b9e25db/dom_md/metadata
> (monitor:511)
> 2022-05-29 17:56:39,050+0200 ERROR (check/loop) [storage.monitor]
> Error checking path
> /rhev/data-center/mnt/glusterSD/onode1.example.org:_data/de5f4123-0fac-4238-abcf-a329c142bd47/dom_md/metadata
> (monitor:511)
> 2022-05-29 17:56:39,711+0200 ERROR (check/loop) [storage.monitor]
> Error checking path
> /rhev/data-center/mnt/glusterSD/onode1.example.org:_vmstore/3cf83851-1cc8-4f97-8960-08a60b9e25db/dom_md/metadata
> (monitor:511)
>
> You need to find what is the issue with your Gluster storage.
>
> I hope that Ritesh can help debug the issue with Gluster.
>
> Nir
>
> I'm worried that I will do something that makes it even worse, and I have no
> idea what the problem is. To me it does not look exactly like a problem with
> data inconsistencies.

The problem is that your Gluster storage is not healthy, and reading
and writing to it times out.

Please keep users@ovirt.org in CC when you reply. Gluster storage is very popular
on this mailing list and you may get useful help from other users.

Nir
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/4NLSHJ6RNVNWPSJA3DD5ZLLP2KMEFREQ/
