[ovirt-users] Re: Sanlock volume corrupted on deployment

2019-01-31 Thread Simone Tiraboschi
On Thu, Jan 31, 2019 at 2:48 PM Nir Soffer  wrote:

> On Thu, Jan 31, 2019 at 2:52 PM Strahil Nikolov 
> wrote:
>
>> Dear Nir,
>>
>> the issue with the 'The method does not exist or is not available:
>> {'method': u'GlusterHost.list'}, code = -32601' is not related to the
>> sanlock. I don't know why the 'vdsm-gluster' package was not installed as a
>> dependency.
>>
>
> Please file a bug about this.
>
>>> Can you share your sanlock log?
>>
>> I'm attaching the contents of /var/log , but here is a short snippet:
>>
>> About the sanlock issue - it reappeared with errors like :
>> 2019-01-31 13:33:10 27551 [17279]: leader1 delta_acquire_begin error -223
>> lockspace hosted-engine host_id 1
>>
>
> As I said, the error is not -233 but -223, which makes sense: this error
> means sanlock did not find the magic number for a delta lease area, which
> means the area was either never formatted or is corrupted.
>
>
>> 2019-01-31 13:33:10 27551 [17279]: leader2 path
>> /var/run/vdsm/storage/808423f9-8a5c-40cd-bc9f-2568c85b8c74/2c74697a-8bd9-4472-8a98-bf624f3462d5/411b6cee-5b01-47ca-8c28-bb1fed8ac83b
>> offset 0
>> 2019-01-31 13:33:10 27551 [17279]: leader3 m 0 v 30003 ss 512 nh 0 mh 1
>> oi 0 og 0 lv 0
>> 2019-01-31 13:33:10 27551 [17279]: leader4 sn hosted-engine rn  ts 0 cs
>> 60346c59
>> 2019-01-31 13:33:11 27551 [21482]: s6 add_lockspace fail result -223
>> 2019-01-31 13:33:16 27556 [21482]: s7 lockspace
>> hosted-engine:1:/var/run/vdsm/storage/808423f9-8a5c-40cd-bc9f-2568c85b8c74/2c74697a-8bd9-4472-8a98-bf624f3462d5/411b6cee-5b01-47ca-8c28-bb1fed8ac83b:0
>>
>>
>> I have managed to fix it by running the following immediately after the
>> ha services were started by ansible:
>>
>> cd
>> /rhev/data-center/mnt/glusterSD/ovirt1.localdomain\:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/ha_agent/
>>
>
> This is not a path managed by vdsm, so I guess the issue is with the
> hosted-engine-specific lockspace, which is managed by hosted engine,
> not by vdsm.
>
>
>> sanlock direct init -s hosted-engine:0:hosted-engine.lockspace:0
>>
>
> This formats the lockspace, and is expected to fix this issue.
>
>
>
>> systemctl stop ovirt-ha-agent ovirt-ha-broker
>> systemctl status vdsmd
>> systemctl start ovirt-ha-broker ovirt-ha-agent
>>
>> Once the VM started - ansible managed to finish the deployment without
>> any issues.
>> I hope someone can check the sanlock init stuff , as it is really
>> frustrating.
>>
>
I'd suggest not touching the managed lockspace and services directly in the
middle of the deployment, to avoid further issues.


>
> If I understand the flow correctly, you are creating a new environment from
> scratch, so this is
> an issue with hosted engine deployment, not initializing the lockspace.
>
> I think filing a bug with the info in this thread is the first step.
>
> Simone, can you take a look at this?
>

In our CI environment everything works as expected and the lockspace volume
is initialised correctly.
In the attached logs a lot of steps were skipped because many things were
already up and running, so the logs are not really useful.
Strahil, can you please retry on a really clean environment and, if you are
able to reproduce the issue, attach the relevant logs?
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZQWO6H5JRKIJI6ZDRTP2PTK7SX6XILQH/


[ovirt-users] Re: Sanlock volume corrupted on deployment

2019-01-31 Thread Nir Soffer
On Tue, Jan 29, 2019 at 2:00 PM Strahil  wrote:

> Dear Nir,
>
> According to redhat solution 1179163 'add_lockspace fail result -233'
> indicates corrupted ids lockspace.
>

Good work finding the solution!

Note that the page mentions error -223, not -233:

2014-08-27 14:26:42+ 2244 [14497]: s30 add_lockspace fail result -223  #<-- corrupted ids lockspace



>
> During the install, the VM fails to get up.
> In order to fix it, I stop:
> ovirt-ha-agent, ovirt-ha-broker, vdsmd, supervdsmd, sanlock
> Then reinitialize the lockspace via 'sanlock direct init -s' (used
> bugreport 1116469 as guidance).
> Once the init is successful and all the services are up - the VM is
> started but the deployment was long over and the setup needs additional
> cleaning up.
>
> I will rebuild the gluster cluster and then will repeat the deployment.
>
> Can you guide me what information will be needed , as I'm quite new in
> ovirt/RHV ?
>
> Best Regards,
> Strahil Nikolov
>
> On Jan 28, 2019 20:34, Nir Soffer  wrote:
>
> On Sat, Jan 26, 2019 at 6:13 PM Strahil  wrote:
>
> Hey guys,
>
> I have noticed that with 4.2.8 the sanlock issue (during deployment) is
> still not fixed.
> Am I the only one with bad luck or there is something broken there ?
>
> The sanlock service reports code 's7 add_lockspace fail result -233'
> 'leader1 delta_acquire_begin error -233 lockspace hosted-engine host_id
> 1'.
>
>
> Sanlock does not have such an error code - are you sure this is -233?
>
> Here are the sanlock return values:
> https://pagure.io/sanlock/blob/master/f/src/sanlock_rv.h
>
> Can you share your sanlock log?
>
>
>
>
> Best Regards,
> Strahil Nikolov
> ___
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/SZMF5KKHSXOUTLGX3LR2NBN7E6QGS6G3/
>
>
>
___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FMAWDZO5UO2HAGMHXT7AKGEKKXTIJS5S/


[ovirt-users] Re: Sanlock volume corrupted on deployment

2019-01-31 Thread Nir Soffer
On Thu, Jan 31, 2019 at 2:52 PM Strahil Nikolov 
wrote:

> Dear Nir,
>
> the issue with the 'The method does not exist or is not available:
> {'method': u'GlusterHost.list'}, code = -32601' is not related to the
> sanlock. I don't know why the 'vdsm-gluster' package was not installed as a
> dependency.
>

Please file a bug about this.

>> Can you share your sanlock log?
>
> I'm attaching the contents of /var/log , but here is a short snippet:
>
> About the sanlock issue - it reappeared with errors like :
> 2019-01-31 13:33:10 27551 [17279]: leader1 delta_acquire_begin error -223
> lockspace hosted-engine host_id 1
>

As I said, the error is not -233 but -223, which makes sense: this error
means sanlock did not find the magic number for a delta lease area, which
means the area was either never formatted or is corrupted.
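
To make the magic check concrete, here is a minimal Python sketch that decodes
a "leader3" log line like the one in this thread. It is an illustration only:
the field meanings (m = magic, v = version, ss = sector size, and so on), the
hex formatting of m and v, and the magic constant are assumptions based on my
reading of the sanlock source, so verify them against the version you run.
The helper names are hypothetical.

```python
# Assumption: delta-lease magic value as recalled from the sanlock source
# (DELTA_DISK_MAGIC). Verify against the sanlock version in use.
ASSUMED_DELTA_DISK_MAGIC = 0x12212010

# Assumed meaning of the abbreviated fields in a "leader3" log line.
LEADER3_FIELDS = {
    "m": "magic",
    "v": "version",
    "ss": "sector_size",
    "nh": "num_hosts",
    "mh": "max_hosts",
    "oi": "owner_id",
    "og": "owner_generation",
    "lv": "lver",
}
HEX_FIELDS = {"m", "v"}  # assumed to be logged in hexadecimal


def parse_leader3(line):
    """Return the leader-record fields found in a sanlock 'leader3' line."""
    _, sep, tail = line.partition("leader3 ")
    if not sep:
        raise ValueError("not a leader3 line")
    tokens = tail.split()
    record = {}
    # Tokens alternate between a short key and its value.
    for key, value in zip(tokens[0::2], tokens[1::2]):
        if key in LEADER3_FIELDS:
            base = 16 if key in HEX_FIELDS else 10
            record[LEADER3_FIELDS[key]] = int(value, base)
    return record


def magic_is_valid(record):
    """True only if the record carries the assumed delta-lease magic."""
    return record.get("magic") == ASSUMED_DELTA_DISK_MAGIC


line = ("2019-01-31 13:33:10 27551 [17279]: leader3 m 0 v 30003 ss 512 "
        "nh 0 mh 1 oi 0 og 0 lv 0")
record = parse_leader3(line)
print(record)
print("magic ok:", magic_is_valid(record))
```

With the line from this thread the magic field is 0, so the check fails, which
is consistent with an area that was never formatted or was overwritten.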


> 2019-01-31 13:33:10 27551 [17279]: leader2 path
> /var/run/vdsm/storage/808423f9-8a5c-40cd-bc9f-2568c85b8c74/2c74697a-8bd9-4472-8a98-bf624f3462d5/411b6cee-5b01-47ca-8c28-bb1fed8ac83b
> offset 0
> 2019-01-31 13:33:10 27551 [17279]: leader3 m 0 v 30003 ss 512 nh 0 mh 1 oi
> 0 og 0 lv 0
> 2019-01-31 13:33:10 27551 [17279]: leader4 sn hosted-engine rn  ts 0 cs
> 60346c59
> 2019-01-31 13:33:11 27551 [21482]: s6 add_lockspace fail result -223
> 2019-01-31 13:33:16 27556 [21482]: s7 lockspace
> hosted-engine:1:/var/run/vdsm/storage/808423f9-8a5c-40cd-bc9f-2568c85b8c74/2c74697a-8bd9-4472-8a98-bf624f3462d5/411b6cee-5b01-47ca-8c28-bb1fed8ac83b:0
>
>
> I have managed to fix it by running the following immediately after the ha
> services were started by ansible:
>
> cd
> /rhev/data-center/mnt/glusterSD/ovirt1.localdomain\:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/ha_agent/
>

This is not a path managed by vdsm, so I guess the issue is with the
hosted-engine-specific lockspace, which is managed by hosted engine,
not by vdsm.


> sanlock direct init -s hosted-engine:0:hosted-engine.lockspace:0
>

This formats the lockspace, and is expected to fix this issue.
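
For reference, the argument passed to -s has the form
name:host_id:path:offset. Below is a minimal Python sketch that assembles the
command from this thread, assuming the relative path and host id 0 used above.
The helper names are hypothetical, and the command is only printed, not run,
because formatting a lockspace that hosts are still using destroys their
leases.

```python
import shlex


def lockspace_arg(name, host_id, path, offset=0):
    """Format a lockspace as name:host_id:path:offset."""
    # Conservative choice for this sketch: ':' is the field separator, so
    # refuse paths that contain it. Changing into the directory first and
    # using a relative file name, as done in this thread, sidesteps that.
    if ":" in path:
        raise ValueError("path contains ':'; cd into its directory first")
    return f"{name}:{host_id}:{path}:{offset}"


def build_init_command(name, path, host_id=0, offset=0):
    """Build the argv for 'sanlock direct init -s ...' without running it."""
    return ["sanlock", "direct", "init", "-s",
            lockspace_arg(name, host_id, path, offset)]


cmd = build_init_command("hosted-engine", "hosted-engine.lockspace")
print(shlex.join(cmd))
# To actually run it (only with the HA services and sanlock users stopped):
#   subprocess.run(cmd, check=True)
```

Printing the command first makes it easy to compare against what was typed by
hand before anything touches the storage.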



> systemctl stop ovirt-ha-agent ovirt-ha-broker
> systemctl status vdsmd
> systemctl start ovirt-ha-broker ovirt-ha-agent
>
> Once the VM started - ansible managed to finish the deployment without any
> issues.
> I hope someone can check the sanlock init stuff , as it is really
> frustrating.
>

If I understand the flow correctly, you are creating a new environment from
scratch, so this is
an issue with hosted engine deployment, not initializing the lockspace.

I think filing a bug with the info in this thread is the first step.

Simone, can you take a look at this?
___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/JWFLLOQS7AWN6P4XZS3HC4PTUWU2G5SP/


[ovirt-users] Re: Sanlock volume corrupted on deployment

2019-01-30 Thread Strahil Nikolov
Dear All,
I have rebuilt the gluster cluster, but it seems that with the latest updates
(I started over from scratch) I am not able to complete the "Prepare VM" phase,
and thus I cannot reach the last phase, where the sanlock issue happens.

I have checked the contents of
"/var/log/ovirt-hosted-engine-setup/engine-logs-2019-01-31T06:54:22Z/ovirt-engine/engine.log"
and the only errors I see are:

[root@ovirt1 ovirt-engine]# grep ERROR engine.log
2019-01-31 08:56:33,326+02 ERROR 
[org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] 
(EE-ManagedThreadFactory-engineScheduled-Thread-55) [3806b629] Failed in 
'GlusterServersListVDS' method
2019-01-31 08:56:33,343+02 ERROR 
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] 
(EE-ManagedThreadFactory-engineScheduled-Thread-55) [3806b629] EVENT_ID: 
VDS_BROKER_COMMAND_FAILURE(10,802), VDSM ovirt1.localdomain command 
GlusterServersListVDS failed: The method does not exist or is not available: 
{'method': u'GlusterHost.list'}
2019-01-31 08:56:33,344+02 ERROR 
[org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] 
(EE-ManagedThreadFactory-engineScheduled-Thread-55) [3806b629] Command 
'GlusterServersListVDSCommand(HostName = ovirt1.localdomain, 
VdsIdVDSCommandParametersBase:{hostId='07c6b36a-6939-4059-8dd3-4e47ea094538'})' 
execution failed: VDSGenericException: VDSErrorException: Failed to 
GlusterServersListVDS, error = The method does not exist or is not available: 
{'method': u'GlusterHost.list'}, code = -32601
2019-01-31 08:56:33,591+02 ERROR 
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] 
(EE-ManagedThreadFactory-engineScheduled-Thread-55) [51bf8a11] EVENT_ID: 
GLUSTER_COMMAND_FAILED(4,035), Gluster command [] failed on server 
.
2019-01-31 08:56:34,856+02 ERROR 
[org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] 
(EE-ManagedThreadFactory-engineScheduled-Thread-60) [3ee4bd51] Failed in 
'GlusterServersListVDS' method
2019-01-31 08:56:34,857+02 ERROR 
[org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] 
(EE-ManagedThreadFactory-engineScheduled-Thread-60) [3ee4bd51] Command 
'GlusterServersListVDSCommand(HostName = ovirt1.localdomain, 
VdsIdVDSCommandParametersBase:{hostId='07c6b36a-6939-4059-8dd3-4e47ea094538'})' 
execution failed: VDSGenericException: VDSErrorException: Failed to 
GlusterServersListVDS, error = The method does not exist or is not available: 
{'method': u'GlusterHost.list'}, code = -32601
2019-01-31 08:56:35,191+02 ERROR 
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] 
(EE-ManagedThreadFactory-engineScheduled-Thread-60) [3fd826e] EVENT_ID: 
GLUSTER_COMMAND_FAILED(4,035), Gluster command [] failed on server 
.
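
A note on reading these entries: -32601 is the standard JSON-RPC 2.0 code for
"Method not found", which matches the "method does not exist or is not
available" text. A minimal Python sketch for pulling the method name and code
out of such log text follows; the regular expression and function name are
hypothetical and only fitted to the lines shown above, and it assumes VDSM
reports the standard codes here.

```python
import re

# Standard error codes defined by the JSON-RPC 2.0 specification.
JSONRPC_ERRORS = {
    -32700: "Parse error",
    -32600: "Invalid Request",
    -32601: "Method not found",
    -32602: "Invalid params",
    -32603: "Internal error",
}

# Fitted to the engine.log fragments quoted in this message.
CODE_RE = re.compile(r"\{'method': u?'([^']+)'\}, code = (-?\d+)")


def missing_methods(log_text):
    """Return sorted, de-duplicated (method, code, meaning) tuples."""
    found = set()
    for method, code in CODE_RE.findall(log_text):
        code = int(code)
        found.add((method, code, JSONRPC_ERRORS.get(code, "unknown")))
    return sorted(found)


sample = ("execution failed: VDSGenericException: VDSErrorException: Failed "
          "to GlusterServersListVDS, error = The method does not exist or is "
          "not available: {'method': u'GlusterHost.list'}, code = -32601")
print(missing_methods(sample))
```

A "Method not found" result points at the host side not exposing that verb at
all, rather than at the call failing while it runs.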



Any hints on how to proceed further?
Best Regards,
Strahil Nikolov



On Tuesday, 29 January 2019 at 14:01:17 GMT+2, Strahil wrote:

> Dear Nir,
>
> According to redhat solution 1179163 'add_lockspace fail result -233'
> indicates corrupted ids lockspace.
>
> During the install, the VM fails to get up.
> In order to fix it, I stop:
> ovirt-ha-agent, ovirt-ha-broker, vdsmd, supervdsmd, sanlock
> Then reinitialize the lockspace via 'sanlock direct init -s' (used
> bugreport 1116469 as guidance).
> Once the init is successful and all the services are up - the VM is
> started but the deployment was long over and the setup needs additional
> cleaning up.
>
> I will rebuild the gluster cluster and then will repeat the deployment.
>
> Can you guide me what information will be needed , as I'm quite new in
> ovirt/RHV ?
>
> Best Regards,
> Strahil Nikolov
>
> On Jan 28, 2019 20:34, Nir Soffer wrote:
>
>> On Sat, Jan 26, 2019 at 6:13 PM Strahil wrote:
>>
>>> Hey guys,
>>>
>>> I have noticed that with 4.2.8 the sanlock issue (during deployment) is
>>> still not fixed.
>>> Am I the only one with bad luck or there is something broken there ?
>>>
>>> The sanlock service reports code 's7 add_lockspace fail result -233'
>>> 'leader1 delta_acquire_begin error -233 lockspace hosted-engine host_id
>>> 1'.
>>
>> Sanlock does not have such error code - are you sure this is -233?
>>
>> Here sanlock return values:
>> https://pagure.io/sanlock/blob/master/f/src/sanlock_rv.h
>>
>> Can you share your sanlock log?
>>
>>> Best Regards,
>>> Strahil Nikolov




[ovirt-users] Re: Sanlock volume corrupted on deployment

2019-01-29 Thread Strahil
Dear Nir,

According to redhat solution 1179163 'add_lockspace fail result -233' indicates
corrupted ids lockspace.

During the install, the VM fails to get up.
In order to fix it, I stop:
ovirt-ha-agent, ovirt-ha-broker, vdsmd, supervdsmd, sanlock
Then reinitialize the lockspace via 'sanlock direct init -s' (used bugreport
1116469 as guidance).
Once the init is successful and all the services are up - the VM is started but
the deployment was long over and the setup needs additional cleaning up.

I will rebuild the gluster cluster and then will repeat the deployment.

Can you guide me what information will be needed , as I'm quite new in
ovirt/RHV ?

Best Regards,
Strahil Nikolov

On Jan 28, 2019 20:34, Nir Soffer wrote:

> On Sat, Jan 26, 2019 at 6:13 PM Strahil wrote:
>
>> Hey guys,
>>
>> I have noticed that with 4.2.8 the sanlock issue (during deployment) is
>> still not fixed.
>> Am I the only one with bad luck or there is something broken there ?
>>
>> The sanlock service reports code 's7 add_lockspace fail result -233'
>> 'leader1 delta_acquire_begin error -233 lockspace hosted-engine host_id
>> 1'.
>
> Sanlock does not have such error code - are you sure this is -233?
>
> Here sanlock return values:
> https://pagure.io/sanlock/blob/master/f/src/sanlock_rv.h
>
> Can you share your sanlock log?
>
>> Best Regards,
>> Strahil Nikolov

___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OKPTTWQHN52CIG5ZM3VMLNKVJSJ6IVU6/


[ovirt-users] Re: Sanlock volume corrupted on deployment

2019-01-28 Thread Nir Soffer
On Sat, Jan 26, 2019 at 6:13 PM Strahil  wrote:

> Hey guys,
>
> I have noticed that with 4.2.8 the sanlock issue (during deployment) is
> still not fixed.
> Am I the only one with bad luck or there is something broken there ?
>
> The sanlock service reports code 's7 add_lockspace fail result -233'
> 'leader1 delta_acquire_begin error -233 lockspace hosted-engine host_id
> 1'.
>

Sanlock does not have such an error code - are you sure this is -233?

Here are the sanlock return values:
https://pagure.io/sanlock/blob/master/f/src/sanlock_rv.h
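
A quick way to cross-check a reported code against that header is to extract
the numbers from the log lines and look them up. The Python sketch below does
that for lines taken from this thread. It is illustrative only: the two table
entries are assumptions transcribed from memory and should be confirmed
against sanlock_rv.h, and the function names are hypothetical.

```python
import re

# Assumption: partial table transcribed from memory of sanlock_rv.h.
# Confirm every entry against the header before relying on it.
ASSUMED_SANLOCK_RV = {
    -202: "SANLK_AIO_TIMEOUT",
    -223: "SANLK_LEADER_MAGIC",
}

# Matches the "... fail result -N" and "... error -N" forms seen in this thread.
RESULT_RE = re.compile(r"\b(?:result|error) (-\d+)")


def result_codes(log_text):
    """Return every negative result/error code found in the text, in order."""
    return [int(code) for code in RESULT_RE.findall(log_text)]


def describe(code):
    """Name a code if it is in the table, otherwise flag it for re-checking."""
    return ASSUMED_SANLOCK_RV.get(code, "unknown: re-check the log line")


lines = [
    "s7 add_lockspace fail result -233",
    "2019-01-31 13:33:11 27551 [21482]: s6 add_lockspace fail result -223",
    "2019-01-31 13:33:10 27551 [17279]: leader1 delta_acquire_begin "
    "error -223 lockspace hosted-engine host_id 1",
]
for line in lines:
    for code in result_codes(line):
        print(code, describe(code))
```

A code that does not appear in the header at all, such as -233, is a hint
that the number was misread and the raw log is worth a second look.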

Can you share your sanlock log?



>
> Best Regards,
> Strahil Nikolov
___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/RUONCMQFRH3HBTBFD4YMI7AGDPAS5D6T/


[ovirt-users] Re: Sanlock volume corrupted on deployment

2019-01-28 Thread Strahil Nikolov
Hi Simone,
I will reinstall the nodes and will provide an update.
Best Regards,
Strahil Nikolov

> On Sat, Jan 26, 2019 at 5:13 PM Strahil wrote:
>
>> Hey guys,
>> I have noticed that with 4.2.8 the sanlock issue (during deployment) is
>> still not fixed.
>> Am I the only one with bad luck or there is something broken there ?
>
> Hi,
> I'm not aware of anything breaking hosted-engine deployment on 4.2.8.
> Which kind of storage are you using?
> Can you please share your logs?
>
>> The sanlock service reports code 's7 add_lockspace fail result -233'
>> 'leader1 delta_acquire_begin error -233 lockspace hosted-engine host_id 1'.
>> Best Regards,
>> Strahil Nikolov

  ___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/MTRUQE7Q7NHIK7MHROGIN56FXVK65ZOD/


[ovirt-users] Re: Sanlock volume corrupted on deployment

2019-01-28 Thread Simone Tiraboschi
On Sat, Jan 26, 2019 at 5:13 PM Strahil  wrote:

> Hey guys,
>
> I have noticed that with 4.2.8 the sanlock issue (during deployment) is
> still not fixed.
> Am I the only one with bad luck or there is something broken there ?
>

Hi,
I'm not aware of anything breaking hosted-engine deployment on 4.2.8.
Which kind of storage are you using?
Can you please share your logs?


>
> The sanlock service reports code 's7 add_lockspace fail result -233'
> 'leader1 delta_acquire_begin error -233 lockspace hosted-engine host_id 1'.
>
> Best Regards,
> Strahil Nikolov
___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/NDTZ67C3LDVA6OGLUGQXUX2LSIMW6HZW/