Hi there, we recently had a network failure in our oVirt infrastructure, which caused the oVirt engine to become unstable. It would restart after 10-20 minutes. The load average was high, and commands issued would hang.
Looking at the host logs, there were endless locking errors in /var/log/sanlock.log (shown below). I tried to re-initialize the lockspace by stopping the HE HA agent and broker on all hosts and then issuing the following commands on one of the hosts:

# su - vdsm -s /bin/bash
$ sanlock direct init -s hosted-engine:0:/rhev/data-center/mnt/192.168.10.10\\:_engine/a184f8ac-b779-4bf8-81c3-751115e15436/ha_agent/hosted-engine.lockspace:0

and then restarted both the agent and the broker on the same host. However, I am still getting the same problem. Any advice on this matter?

Installation Infos:
----------------------
ovirt 3.5.3
vdsm-xmlrpc-4.16.24-0.el6.noarch
vdsm-python-zombiereaper-4.16.24-0.el6.noarch
vdsm-python-4.16.24-0.el6.noarch
vdsm-jsonrpc-4.16.24-0.el6.noarch
vdsm-4.16.24-0.el6.x86_64
vdsm-cli-4.16.24-0.el6.noarch
vdsm-yajsonrpc-4.16.24-0.el6.noarch
ovirt-hosted-engine-ha-1.2.6-2.el6.noarch
ovirt-hosted-engine-setup-1.2.6-0.0.master.20150812080635.git5295df1.el6.noarch
----- end installation infos ------

/var/log/sanlock.log
2015-08-18 04:20:02+0800 1704 [9385]: s2 delta_renew read rv -202 offset 0 /rhev/data-center/mnt/192.168.10.10:_engine/a184f8ac-b779-4bf8-81c3-751115e15436/dom_md/ids
2015-08-18 04:20:02+0800 1704 [9385]: s2 renewal error -202 delta_length 11 last_success 1662
2015-08-18 04:20:11+0800 1713 [9385]: a184f8ac aio collect 0 0x7fc5040008c0:0x7fc5040008d0:0x7fc50b9f7000 result 1048576:0 other free
2015-08-18 04:20:11+0800 1713 [9833]: hosted-e aio collect 0 0x7fc4f80008c0:0x7fc4f80008d0:0x7fc50baf9000 result 1048576:0 other free
2015-08-18 04:20:11+0800 1713 [9385]: a184f8ac aio collect 0 0x7fc504000910:0x7fc504000920:0x7fc50bbfb000 result 1048576:0 other free
2015-08-18 04:20:11+0800 1713 [9833]: hosted-e aio collect 0 0x7fc4f8000910:0x7fc4f8000920:0x7fc50beff000 result 1048576:0 other free
2015-08-18 04:21:43+0800 1805 [9385]: a184f8ac aio timeout 0 0x7fc5040008c0:0x7fc5040008d0:0x7fc50adf2000 ioto 10 to_count 18
2015-08-18 04:21:43+0800 1805 [9385]: s2 delta_renew read rv -202 offset 0 /rhev/data-center/mnt/192.168.10.10:_engine/a184f8ac-b779-4bf8-81c3-751115e15436/dom_md/ids
2015-08-18 04:21:43+0800 1805 [9385]: s2 renewal error -202 delta_length 10 last_success 1774
2015-08-18 04:21:43+0800 1805 [9833]: hosted-e aio timeout 0 0x7fc4f80008c0:0x7fc4f80008d0:0x7fc50aef4000 ioto 10 to_count 14
2015-08-18 04:21:43+0800 1805 [9833]: s3 delta_renew read rv -202 offset 0 /rhev/data-center/mnt/192.168.10.10:_engine/a184f8ac-b779-4bf8-81c3-751115e15436/images/190f4d2a-77f4-4403-af0d-62853560c653/2be7db4d-f30e-4873-b4ef-cff9e757341c
2015-08-18 04:21:43+0800 1805 [9833]: s3 renewal error -202 delta_length 10 last_success 1774
2015-08-18 04:21:52+0800 1814 [9385]: a184f8ac aio collect 0 0x7fc5040008c0:0x7fc5040008d0:0x7fc50adf2000 result 1048576:0 other free
2015-08-18 04:21:52+0800 1814 [9833]: hosted-e aio collect 0 0x7fc4f80008c0:0x7fc4f80008d0:0x7fc50aef4000 result 1048576:0 other free
2015-08-18 04:23:04+0800 1885 [9833]: hosted-e aio timeout 0 0x7fc4f80008c0:0x7fc4f80008d0:0x7fc50bbfb000 ioto 10 to_count 15
2015-08-18 04:23:04+0800 1885 [9833]: s3 delta_renew read rv -202 offset 0 /rhev/data-center/mnt/192.168.10.10:_engine/a184f8ac-b779-4bf8-81c3-751115e15436/images/190f4d2a-77f4-4403-af0d-62853560c653/2be7db4d-f30e-4873-b4ef-cff9e757341c
2015-08-18 04:23:04+0800 1885 [9833]: s3 renewal error -202 delta_length 10 last_success 1855
2015-08-18 04:23:04+0800 1886 [9385]: a184f8ac aio timeout 0 0x7fc5040008c0:0x7fc5040008d0:0x7fc50beff000 ioto 10 to_count 19
2015-08-18 04:23:04+0800 1886 [9385]: s2 delta_renew read rv -202 offset 0 /rhev/data-center/mnt/192.168.10.10:_engine/a184f8ac-b779-4bf8-81c3-751115e15436/dom_md/ids
2015-08-18 04:23:04+0800 1886 [9385]: s2 renewal error -202 delta_length 10 last_success 1855
2015-08-18 04:23:15+0800 1896 [9833]: hosted-e aio timeout 0 0x7fc4f8000910:0x7fc4f8000920:0x7fc50baf9000 ioto 10 to_count 16
2015-08-18 04:23:15+0800 1896 [9833]: s3 delta_renew read rv -202 offset 0 /rhev/data-center/mnt/192.168.10.10:_engine/a184f8ac-b779-4bf8-81c3-751115e15436/images/190f4d2a-77f4-4403-af0d-62853560c653/2be7db4d-f30e-4873-b4ef-cff9e757341c
2015-08-18 04:23:15+0800 1896 [9833]: s3 renewal error -202 delta_length 10 last_success 1855
2015-08-18 04:23:15+0800 1897 [9385]: a184f8ac aio timeout 0 0x7fc504000910:0x7fc504000920:0x7fc50b9f7000 ioto 10 to_count 20
2015-08-18 04:23:15+0800 1897 [9385]: s2 delta_renew read rv -202 offset 0 /rhev/data-center/mnt/192.168.10.10:_engine/a184f8ac-b779-4bf8-81c3-751115e15436/dom_md/ids
2015-08-18 04:23:15+0800 1897 [9385]: s2 renewal error -202 delta_length 11 last_success 1855
2015-08-18 04:23:26+0800 1907 [9833]: hosted-e aio timeout 0 0x7fc4f8000960:0x7fc4f8000970:0x7fc50aef4000 ioto 10 to_count 17
2015-08-18 04:23:26+0800 1907 [9833]: s3 delta_renew read rv -202 offset 0 /rhev/data-center/mnt/192.168.10.10:_engine/a184f8ac-b779-4bf8-81c3-751115e15436/images/190f4d2a-77f4-4403-af0d-62853560c653/2be7db4d-f30e-4873-b4ef-cff9e757341c
2015-08-18 04:23:26+0800 1907 [9833]: s3 renewal error -202 delta_length 10 last_success 1855
2015-08-18 04:23:26+0800 1908 [9385]: a184f8ac aio timeout 0 0x7fc504000960:0x7fc504000970:0x7fc50adf2000 ioto 10 to_count 21
2015-08-18 04:23:26+0800 1908 [9385]: s2 delta_renew read rv -202 offset 0 /rhev/data-center/mnt/192.168.10.10:_engine/a184f8ac-b779-4bf8-81c3-751115e15436/dom_md/ids
2015-08-18 04:23:26+0800 1908 [9385]: s2 renewal error -202 delta_length 11 last_success 1855
----- end /var/log/sanlock.log
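
In case it helps anyone reading the log above, here is a rough sketch of how the renewal failures could be tallied per lockspace. It assumes that the third field of each line is the host's monotonic time in seconds and that last_success uses the same clock, which is how I read the entries; the function name summarize_renewal_errors and the max_gap label are made up for this example and are not part of sanlock.

```shell
#!/bin/sh
# Count "renewal error" lines per lockspace (s2, s3, ...) and report the
# largest gap, in seconds, between the log entry and the last successful
# renewal. Assumed line layout (taken from the log excerpt above):
#   DATE TIME MONOTIME [TID]: sN renewal error RV delta_length N last_success MONOTIME
summarize_renewal_errors() {
    awk '
        $6 == "renewal" && $7 == "error" {
            space = $5
            gap = $3 - $NF          # seconds since the last good renewal
            count[space]++
            if (gap > max[space]) max[space] = gap
        }
        END {
            for (s in count)
                printf "%s errors=%d max_gap=%d\n", s, count[s], max[s]
        }
    ' "$@" | sort                   # sort because awk array order is undefined
}
```

It can be run as `summarize_renewal_errors /var/log/sanlock.log`, or fed from a pipe. A steadily growing max_gap would suggest the storage reads are still timing out rather than the lockspace itself being corrupt, though that is only my interpretation.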
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users