[ovirt-users] Re: Sanlock volume corrupted on deployment
On Thu, Jan 31, 2019 at 2:48 PM Nir Soffer wrote:

> On Thu, Jan 31, 2019 at 2:52 PM Strahil Nikolov wrote:
>
>> Dear Nir,
>>
>> the issue with the 'The method does not exist or is not available:
>> {'method': u'GlusterHost.list'}, code = -32601' is not related to the
>> sanlock. I don't know why the 'vdsm-gluster' package was not installed
>> as a dependency.
>
> Please file a bug about this.
>
>>> Can you share your sanlock log?
>>
>> I'm attaching the contents of /var/log, but here is a short snippet:
>>
>> About the sanlock issue - it reappeared with errors like:
>> 2019-01-31 13:33:10 27551 [17279]: leader1 delta_acquire_begin error -223 lockspace hosted-engine host_id 1
>
> As I said, the error is not -233 but -223, which makes sense: this error
> means sanlock did not find the magic number for a delta lease area, which
> means the area was not formatted, or is corrupted.
>
>> 2019-01-31 13:33:10 27551 [17279]: leader2 path /var/run/vdsm/storage/808423f9-8a5c-40cd-bc9f-2568c85b8c74/2c74697a-8bd9-4472-8a98-bf624f3462d5/411b6cee-5b01-47ca-8c28-bb1fed8ac83b offset 0
>> 2019-01-31 13:33:10 27551 [17279]: leader3 m 0 v 30003 ss 512 nh 0 mh 1 oi 0 og 0 lv 0
>> 2019-01-31 13:33:10 27551 [17279]: leader4 sn hosted-engine rn ts 0 cs 60346c59
>> 2019-01-31 13:33:11 27551 [21482]: s6 add_lockspace fail result -223
>> 2019-01-31 13:33:16 27556 [21482]: s7 lockspace hosted-engine:1:/var/run/vdsm/storage/808423f9-8a5c-40cd-bc9f-2568c85b8c74/2c74697a-8bd9-4472-8a98-bf624f3462d5/411b6cee-5b01-47ca-8c28-bb1fed8ac83b:0
>>
>> I have managed to fix it by running the following immediately after the
>> ha services were started by ansible:
>>
>> cd /rhev/data-center/mnt/glusterSD/ovirt1.localdomain\:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/ha_agent/
>
> This is not a path managed by vdsm, so I guess the issue is with the
> hosted engine specific lockspace, which is managed by hosted engine,
> not by vdsm.
>
>> sanlock direct init -s hosted-engine:0:hosted-engine.lockspace:0
>
> This formats the lockspace, and is expected to fix this issue.
>
>> systemctl stop ovirt-ha-agent ovirt-ha-broker
>> systemctl status vdsmd
>> systemctl start ovirt-ha-broker ovirt-ha-agent
>>
>> Once the VM started - ansible managed to finish the deployment without
>> any issues.
>> I hope someone can check the sanlock init stuff, as it is really
>> frustrating.

I'd suggest not playing directly with what is managed in the middle of the
deployment, to avoid further issues.

> If I understand the flow correctly, you create a new environment from
> scratch, so this is an issue with hosted engine deployment, not
> initializing the lockspace.
>
> I think filing a bug with the info in this thread is the first step.
>
> Simone, can you take a look at this?

On our CI env everything is working as expected and the lockspace volume
got initialised as expected.
In the attached logs a lot of steps got skipped, since a lot of things were
already up and running, so they are not really useful.

Strahil, can you please retry on a really clean environment and then
attach the relevant logs if you are able to reproduce the issue?

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZQWO6H5JRKIJI6ZDRTP2PTK7SX6XILQH/
[ovirt-users] Re: Sanlock volume corrupted on deployment
On Tue, Jan 29, 2019 at 2:00 PM Strahil wrote:

> Dear Nir,
>
> According to redhat solution 1179163 'add_lockspace fail result -233'
> indicates corrupted ids lockspace.

Good work finding the solution!

Note that the page mentions error -223, not -233:

2014-08-27 14:26:42+ 2244 [14497]: s30 add_lockspace fail result -223 #<-- corrupted ids lockspace

> During the install, the VM fails to get up.
> In order to fix it, I stop:
> ovirt-ha-agent, ovirt-ha-broker, vdsmd, supervdsmd, sanlock
> Then reinitialize the lockspace via 'sanlock direct init -s' (used
> bugreport 1116469 as guidance).
> Once the init is successful and all the services are up - the VM is
> started but the deployment was long over and the setup needs additional
> cleaning up.
>
> I will rebuild the gluster cluster and then will repeat the deployment.
>
> Can you guide me what information will be needed, as I'm quite new in
> ovirt/RHV?
>
> Best Regards,
> Strahil Nikolov
>
> On Jan 28, 2019 20:34, Nir Soffer wrote:
>
>> On Sat, Jan 26, 2019 at 6:13 PM Strahil wrote:
>>
>>> Hey guys,
>>>
>>> I have noticed that with 4.2.8 the sanlock issue (during deployment) is
>>> still not fixed.
>>> Am I the only one with bad luck or there is something broken there?
>>>
>>> The sanlock service reports code 's7 add_lockspace fail result -233'
>>> 'leader1 delta_acquire_begin error -233 lockspace hosted-engine host_id 1'.
>>
>> Sanlock does not have such error code - are you sure this is -233?
>>
>> Here sanlock return values:
>> https://pagure.io/sanlock/blob/master/f/src/sanlock_rv.h
>>
>> Can you share your sanlock log?
>>
>>> Best Regards,
>>> Strahil Nikolov
>>> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/SZMF5KKHSXOUTLGX3LR2NBN7E6QGS6G3/

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/FMAWDZO5UO2HAGMHXT7AKGEKKXTIJS5S/
[ovirt-users] Re: Sanlock volume corrupted on deployment
On Thu, Jan 31, 2019 at 2:52 PM Strahil Nikolov wrote:

> Dear Nir,
>
> the issue with the 'The method does not exist or is not available:
> {'method': u'GlusterHost.list'}, code = -32601' is not related to the
> sanlock. I don't know why the 'vdsm-gluster' package was not installed
> as a dependency.

Please file a bug about this.

>> Can you share your sanlock log?
>
> I'm attaching the contents of /var/log, but here is a short snippet:
>
> About the sanlock issue - it reappeared with errors like:
> 2019-01-31 13:33:10 27551 [17279]: leader1 delta_acquire_begin error -223 lockspace hosted-engine host_id 1

As I said, the error is not -233 but -223, which makes sense: this error
means sanlock did not find the magic number for a delta lease area, which
means the area was not formatted, or is corrupted.

> 2019-01-31 13:33:10 27551 [17279]: leader2 path /var/run/vdsm/storage/808423f9-8a5c-40cd-bc9f-2568c85b8c74/2c74697a-8bd9-4472-8a98-bf624f3462d5/411b6cee-5b01-47ca-8c28-bb1fed8ac83b offset 0
> 2019-01-31 13:33:10 27551 [17279]: leader3 m 0 v 30003 ss 512 nh 0 mh 1 oi 0 og 0 lv 0
> 2019-01-31 13:33:10 27551 [17279]: leader4 sn hosted-engine rn ts 0 cs 60346c59
> 2019-01-31 13:33:11 27551 [21482]: s6 add_lockspace fail result -223
> 2019-01-31 13:33:16 27556 [21482]: s7 lockspace hosted-engine:1:/var/run/vdsm/storage/808423f9-8a5c-40cd-bc9f-2568c85b8c74/2c74697a-8bd9-4472-8a98-bf624f3462d5/411b6cee-5b01-47ca-8c28-bb1fed8ac83b:0
>
> I have managed to fix it by running the following immediately after the
> ha services were started by ansible:
>
> cd /rhev/data-center/mnt/glusterSD/ovirt1.localdomain\:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/ha_agent/

This is not a path managed by vdsm, so I guess the issue is with the
hosted engine specific lockspace, which is managed by hosted engine,
not by vdsm.

> sanlock direct init -s hosted-engine:0:hosted-engine.lockspace:0

This formats the lockspace, and is expected to fix this issue.

> systemctl stop ovirt-ha-agent ovirt-ha-broker
> systemctl status vdsmd
> systemctl start ovirt-ha-broker ovirt-ha-agent
>
> Once the VM started - ansible managed to finish the deployment without
> any issues.
> I hope someone can check the sanlock init stuff, as it is really
> frustrating.

If I understand the flow correctly, you create a new environment from
scratch, so this is an issue with hosted engine deployment, not
initializing the lockspace.

I think filing a bug with the info in this thread is the first step.

Simone, can you take a look at this?

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/JWFLLOQS7AWN6P4XZS3HC4PTUWU2G5SP/
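For readers who want to check the same thing in their own sanlock.log, here is a minimal Python sketch that splits the `leader3` line from the snippet in this thread into name/value pairs. It assumes that `m` in that line is the magic number Nir refers to; the thread does not spell out the field names, so treat that mapping, and the helper names `parse_leader_fields` and `magic_looks_unset`, as illustrative only.

```python
def parse_leader_fields(line):
    """Return the name/value pairs that follow the 'leader3' token."""
    tokens = line.split()
    start = tokens.index("leader3") + 1
    fields = tokens[start:]
    # The fields are printed as alternating "name value" tokens:
    # m 0 v 30003 ss 512 ...
    return dict(zip(fields[0::2], fields[1::2]))


def magic_looks_unset(fields):
    # ASSUMPTION: 'm' is the magic number field; a value of 0 would then
    # match "sanlock did not find the magic number".
    return fields.get("m") == "0"


# Log line copied from the snippet posted in this thread.
LINE = ("2019-01-31 13:33:10 27551 [17279]: leader3 m 0 v 30003 ss 512 "
        "nh 0 mh 1 oi 0 og 0 lv 0")

if __name__ == "__main__":
    fields = parse_leader_fields(LINE)
    print(fields)
    print("magic looks unset:", magic_looks_unset(fields))
```

This only reads a log line; it does not touch the lockspace, so it is safe to run while a deployment is in progress.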
[ovirt-users] Re: Sanlock volume corrupted on deployment
Dear All,

I have rebuilt the gluster cluster, but it seems that with the latest
updates (I started over from scratch) I am not able to complete the
"Prepare VM" phase, and thus I cannot reach the last phase, where the
sanlock issue happens.

I have checked the contents of
"/var/log/ovirt-hosted-engine-setup/engine-logs-2019-01-31T06:54:22Z/ovirt-engine/engine.log"
and the only errors I see are:

[root@ovirt1 ovirt-engine]# grep ERROR engine.log
2019-01-31 08:56:33,326+02 ERROR [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-55) [3806b629] Failed in 'GlusterServersListVDS' method
2019-01-31 08:56:33,343+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-55) [3806b629] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM ovirt1.localdomain command GlusterServersListVDS failed: The method does not exist or is not available: {'method': u'GlusterHost.list'}
2019-01-31 08:56:33,344+02 ERROR [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-55) [3806b629] Command 'GlusterServersListVDSCommand(HostName = ovirt1.localdomain, VdsIdVDSCommandParametersBase:{hostId='07c6b36a-6939-4059-8dd3-4e47ea094538'})' execution failed: VDSGenericException: VDSErrorException: Failed to GlusterServersListVDS, error = The method does not exist or is not available: {'method': u'GlusterHost.list'}, code = -32601
2019-01-31 08:56:33,591+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-55) [51bf8a11] EVENT_ID: GLUSTER_COMMAND_FAILED(4,035), Gluster command [] failed on server .
2019-01-31 08:56:34,856+02 ERROR [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-60) [3ee4bd51] Failed in 'GlusterServersListVDS' method
2019-01-31 08:56:34,857+02 ERROR [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-60) [3ee4bd51] Command 'GlusterServersListVDSCommand(HostName = ovirt1.localdomain, VdsIdVDSCommandParametersBase:{hostId='07c6b36a-6939-4059-8dd3-4e47ea094538'})' execution failed: VDSGenericException: VDSErrorException: Failed to GlusterServersListVDS, error = The method does not exist or is not available: {'method': u'GlusterHost.list'}, code = -32601
2019-01-31 08:56:35,191+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-60) [3fd826e] EVENT_ID: GLUSTER_COMMAND_FAILED(4,035), Gluster command [] failed on server .

Any hint how to proceed further?

Best Regards,
Strahil Nikolov

On Tuesday, January 29, 2019, 14:01:17 GMT+2, Strahil wrote:

> Dear Nir,
>
> According to redhat solution 1179163 'add_lockspace fail result -233'
> indicates corrupted ids lockspace.
>
> During the install, the VM fails to get up.
> In order to fix it, I stop:
> ovirt-ha-agent, ovirt-ha-broker, vdsmd, supervdsmd, sanlock
> Then reinitialize the lockspace via 'sanlock direct init -s' (used
> bugreport 1116469 as guidance).
> Once the init is successful and all the services are up - the VM is
> started but the deployment was long over and the setup needs additional
> cleaning up.
>
> I will rebuild the gluster cluster and then will repeat the deployment.
>
> Can you guide me what information will be needed, as I'm quite new in
> ovirt/RHV?
>
> Best Regards,
> Strahil Nikolov
>
> On Jan 28, 2019 20:34, Nir Soffer wrote:
>
>> On Sat, Jan 26, 2019 at 6:13 PM Strahil wrote:
>>
>>> Hey guys,
>>>
>>> I have noticed that with 4.2.8 the sanlock issue (during deployment) is
>>> still not fixed.
>>> Am I the only one with bad luck or there is something broken there?
>>>
>>> The sanlock service reports code 's7 add_lockspace fail result -233'
>>> 'leader1 delta_acquire_begin error -233 lockspace hosted-engine host_id 1'.
>>
>> Sanlock does not have such error code - are you sure this is -233?
>>
>> Here sanlock return values:
>> https://pagure.io/sanlock/blob/master/f/src/sanlock_rv.h
>>
>> Can you share your sanlock log?
>>
>>> Best Regards,
>>> Strahil Nikolov
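The `grep ERROR engine.log` output in the message above repeats the same failure several times. As a rough aid, the following Python sketch counts ERROR lines per logger class so the distinct failures stand out. The regular expression is an assumption based only on the line layout visible in this thread (timestamp, level, bracketed logger name), and `count_errors_by_logger` is a hypothetical helper name.

```python
import re
from collections import Counter

# ASSUMPTION: each line starts with "<date> <time> ERROR [<logger class>]",
# as in the engine.log lines quoted in this thread.
ERROR_RE = re.compile(r"^\S+ \S+ ERROR \[([^\]]+)\]")


def count_errors_by_logger(lines):
    """Count ERROR lines, keyed by the short name of the logger class."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.match(line)
        if match:
            short_name = match.group(1).rsplit(".", 1)[-1]
            counts[short_name] += 1
    return counts


# Three of the lines posted in this thread, copied unchanged.
SAMPLE_LINES = [
    "2019-01-31 08:56:33,326+02 ERROR [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-55) [3806b629] Failed in 'GlusterServersListVDS' method",
    "2019-01-31 08:56:33,591+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-55) [51bf8a11] EVENT_ID: GLUSTER_COMMAND_FAILED(4,035), Gluster command [] failed on server .",
    "2019-01-31 08:56:34,856+02 ERROR [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-60) [3ee4bd51] Failed in 'GlusterServersListVDS' method",
]

if __name__ == "__main__":
    for name, count in count_errors_by_logger(SAMPLE_LINES).most_common():
        print(count, name)
```

To run it against a real file, pass `open("engine.log")` instead of `SAMPLE_LINES`; lines that do not match the assumed layout are simply skipped.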
[ovirt-users] Re: Sanlock volume corrupted on deployment
Dear Nir,

According to redhat solution 1179163 'add_lockspace fail result -233'
indicates corrupted ids lockspace.

During the install, the VM fails to get up.
In order to fix it, I stop:
ovirt-ha-agent, ovirt-ha-broker, vdsmd, supervdsmd, sanlock
Then reinitialize the lockspace via 'sanlock direct init -s' (used
bugreport 1116469 as guidance).
Once the init is successful and all the services are up - the VM is
started but the deployment was long over and the setup needs additional
cleaning up.

I will rebuild the gluster cluster and then will repeat the deployment.

Can you guide me what information will be needed, as I'm quite new in
ovirt/RHV?

Best Regards,
Strahil Nikolov

On Jan 28, 2019 20:34, Nir Soffer wrote:

> On Sat, Jan 26, 2019 at 6:13 PM Strahil wrote:
>
>> Hey guys,
>>
>> I have noticed that with 4.2.8 the sanlock issue (during deployment) is
>> still not fixed.
>> Am I the only one with bad luck or there is something broken there?
>>
>> The sanlock service reports code 's7 add_lockspace fail result -233'
>> 'leader1 delta_acquire_begin error -233 lockspace hosted-engine host_id 1'.
>
> Sanlock does not have such error code - are you sure this is -233?
>
> Here sanlock return values:
> https://pagure.io/sanlock/blob/master/f/src/sanlock_rv.h
>
> Can you share your sanlock log?
>
>> Best Regards,
>> Strahil Nikolov

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/OKPTTWQHN52CIG5ZM3VMLNKVJSJ6IVU6/
[ovirt-users] Re: Sanlock volume corrupted on deployment
On Sat, Jan 26, 2019 at 6:13 PM Strahil wrote:

> Hey guys,
>
> I have noticed that with 4.2.8 the sanlock issue (during deployment) is
> still not fixed.
> Am I the only one with bad luck or there is something broken there?
>
> The sanlock service reports code 's7 add_lockspace fail result -233'
> 'leader1 delta_acquire_begin error -233 lockspace hosted-engine host_id 1'.

Sanlock does not have such an error code - are you sure this is -233?

Here are the sanlock return values:
https://pagure.io/sanlock/blob/master/f/src/sanlock_rv.h

Can you share your sanlock log?

> Best Regards,
> Strahil Nikolov

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/RUONCMQFRH3HBTBFD4YMI7AGDPAS5D6T/
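Since the -233 versus -223 mix-up came from reading the code off the log by eye, a small Python sketch like the one below can pull the numeric codes out of sanlock log text directly. The lookup table deliberately contains only -223, with the meaning given elsewhere in this thread; anything else is left to sanlock_rv.h. The regular expression and the names `extract_codes` and `describe` are illustrative assumptions, not part of sanlock.

```python
import re

# Only the code explained in this thread is listed here; every other value
# should be looked up in sanlock_rv.h rather than guessed.
KNOWN_CODES = {
    -223: "magic number not found in the delta lease area "
          "(area not formatted, or corrupted)",
}

# ASSUMPTION: codes appear after the words "result" or "error", as in
# "add_lockspace fail result -223" and "delta_acquire_begin error -223".
CODE_RE = re.compile(r"\b(?:result|error)\s+(-\d+)\b")


def extract_codes(log_text):
    """Return every negative code found after 'result' or 'error'."""
    codes = []
    for line in log_text.splitlines():
        match = CODE_RE.search(line)
        if match:
            codes.append(int(match.group(1)))
    return codes


def describe(code):
    return KNOWN_CODES.get(code, "not explained in this thread; check sanlock_rv.h")


# Message fragments as quoted in this thread (without timestamps).
SAMPLE = """\
leader1 delta_acquire_begin error -223 lockspace hosted-engine host_id 1
s6 add_lockspace fail result -223
s7 add_lockspace fail result -233
"""

if __name__ == "__main__":
    for code in extract_codes(SAMPLE):
        print(code, "->", describe(code))
```

Copying the code straight from the log this way avoids transposed digits when searching knowledge-base articles or bug reports.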
[ovirt-users] Re: Sanlock volume corrupted on deployment
Hi Simone,

I will reinstall the nodes and will provide an update.

Best Regards,
Strahil Nikolov

> On Sat, Jan 26, 2019 at 5:13 PM Strahil wrote:
>
>> Hey guys,
>>
>> I have noticed that with 4.2.8 the sanlock issue (during deployment) is
>> still not fixed.
>> Am I the only one with bad luck or there is something broken there?
>
> Hi,
> I'm not aware of anything breaking hosted-engine deployment on 4.2.8.
> Which kind of storage are you using?
> Can you please share your logs?
>
>> The sanlock service reports code 's7 add_lockspace fail result -233'
>> 'leader1 delta_acquire_begin error -233 lockspace hosted-engine host_id 1'.
>>
>> Best Regards,
>> Strahil Nikolov

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/MTRUQE7Q7NHIK7MHROGIN56FXVK65ZOD/
[ovirt-users] Re: Sanlock volume corrupted on deployment
On Sat, Jan 26, 2019 at 5:13 PM Strahil wrote:

> Hey guys,
>
> I have noticed that with 4.2.8 the sanlock issue (during deployment) is
> still not fixed.
> Am I the only one with bad luck or there is something broken there?

Hi,
I'm not aware of anything breaking hosted-engine deployment on 4.2.8.
Which kind of storage are you using?
Can you please share your logs?

> The sanlock service reports code 's7 add_lockspace fail result -233'
> 'leader1 delta_acquire_begin error -233 lockspace hosted-engine host_id 1'.
>
> Best Regards,
> Strahil Nikolov

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/NDTZ67C3LDVA6OGLUGQXUX2LSIMW6HZW/