[ovirt-users] Re: Failed to activate Storage Domain --- ovirt 4.2

2021-09-20 Thread fabian . rapetti
Nir,
Thanks for your time. 
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/LLYQP5LKXJDP37V5ZP63WNZZY2A7JJV5/


[ovirt-users] Re: Failed to activate Storage Domain --- ovirt 4.2

2021-09-19 Thread Nir Soffer
On Sat, Sep 18, 2021 at 9:26 AM  wrote:
>
> Hi all.
> I'm using oVirt 4.3.10 in a two-node cluster and am facing the same error
> (second metadata area corruption).
> Does anybody know if there is a solution for that?
>
> Our software includes:
> lvm2-2.02.186-7.el7_8.2.x86_64
> oVirt Node 4.3.10
> kernel  3.10.0-1127.8.2.el7.x86_64
> device-mapper-multipath-libs-0.4.9-131.el7.x86_64
> libvirt-4.5.0-33.el7_8.1.x86_64

This is a known issue in vdsm < 4.30.50 and lvm2 < 2.02.187-6:
https://bugzilla.redhat.com/1849595

The only way to avoid this is to upgrade to oVirt >= 4.3.11.
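
As a rough way to check a host against those thresholds, the sketch below
compares version strings with GNU sort -V. This only approximates RPM's own
version comparison, and the helper names version_lt and check_pkg are made up
for illustration; the two minimum versions are the ones quoted above.

```shell
# Succeed (exit 0) if version $1 sorts strictly before version $2.
# GNU sort -V is an approximation of rpm's comparison, not the same algorithm.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Minimum fixed versions, taken from the message above.
FIXED_VDSM="4.30.50"
FIXED_LVM2="2.02.187-6"

# Print whether an installed version is older than the fixed one.
check_pkg() {
    name=$1
    installed=$2
    fixed=$3
    if version_lt "$installed" "$fixed"; then
        echo "$name $installed: affected (fixed in $fixed)"
    else
        echo "$name $installed: ok"
    fi
}

# Example with the lvm2 version reported in this thread.
check_pkg lvm2 "2.02.186-7" "$FIXED_LVM2"
```

On a real host the installed version could be fed in from
rpm -q --qf '%{VERSION}-%{RELEASE}\n' lvm2 instead of a literal string.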

Nir
___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/C343NUCNXEHEFJAXTP2CLUOB3GEMD3S7/


[ovirt-users] Re: Failed to activate Storage Domain --- ovirt 4.2

2021-09-18 Thread fabian . rapetti
Hi all.
I'm using oVirt 4.3.10 in a two-node cluster and am facing the same error
(second metadata area corruption).
Does anybody know if there is a solution for that?

Our software includes:
lvm2-2.02.186-7.el7_8.2.x86_64
oVirt Node 4.3.10
kernel  3.10.0-1127.8.2.el7.x86_64
device-mapper-multipath-libs-0.4.9-131.el7.x86_64
libvirt-4.5.0-33.el7_8.1.x86_64

Thanks in advance 
___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZXM4DBBUCMBQBIK6JLYO3MMIOQSD3HTJ/


[ovirt-users] Re: Failed to activate Storage Domain --- ovirt 4.2

2019-06-11 Thread Aminur Rahman
Hi Nir

Yes, the metadata was corrupted but the VMs were running OK. The allocation on
this master storage domain increased significantly overnight; it hit its space
limit and went offline completely. The cluster stayed online and the VMs kept
running, but the affected storage domain was offline. I tried to extend the
storage domain, but oVirt would not allow me to expand the storage.

Due to time constraints, I had to restore the storage domain from a Compellent
snapshot. However, we need to prevent this from happening again when the master
storage domain fills up. Currently, we have the following parameters set on the
5 TB storage domain.

ID: 0e1f2a5d-a548-476c-94bd-3ab3fe239926
Size: 5119 GiB
Available: 2361 GiB
Used: 2758 GiB
Allocated: 3104 GiB
Over Allocation Ratio: 14%
Images: 13
Warning Low Space Indicator: 10% (511 GiB)
Critical Space Action Blocker: 5 GiB

Please kindly advise what action we need to take to prevent this from
occurring again in the future.

Thanks
Aminur Rahman
aminur.rah...@iongroup.com
t +44 20 7398 0243
m +44 7825 780697
iongroup.com <https://www.iongroup.com>

From: Nir Soffer 
Sent: 10 June 2019 22:07
To: David Teigland 
Cc: Aminur Rahman ; users 
Subject: Re: [ovirt-users] Failed to activate Storage Domain --- ovirt 4.2

On Mon, Jun 10, 2019 at 11:22 PM David Teigland wrote:

> On Mon, Jun 10, 2019 at 10:59:43PM +0300, Nir Soffer wrote:
> > > [root@uk1-ion-ovm-18  pvscan
> > >   /dev/mapper/36000d31005697814: Checksum error at offset
> > > 4397954425856
> > >   Couldn't read volume group metadata from
> > > /dev/mapper/36000d31005697814.
> > >   Metadata location on /dev/mapper/36000d31005697814 at
> > > 4397954425856 has invalid summary for VG.
> > >   Failed to read metadata summary from
> > > /dev/mapper/36000d31005697814
> > >   Failed to scan VG from /dev/mapper/36000d31005697814
> >
> > This looks like corrupted vg metadata.
>
> Yes, the second metadata area, at the end of the device is corrupted; the
> first metadata area is probably ok.  That version of lvm is not able to
> continue by just using the one good copy.

Can we copy the first metadata area into the second metadata area?

> Last week I pushed out major changes to LVM upstream to be able to handle
> and repair most of these cases.  So, one option is to build lvm from the
> upstream master branch, and check if that can read and repair this
> metadata.

This sounds pretty risky for production.

> > David, we keep 2 metadata copies on the first PV. Can we use one of the
> > copies on the PV to restore the metadata to the last good state?
>
> pvcreate with --restorefile and --uuid, and with the right backup metadata

What would be the right backup metadata?

> could probably correct things, but experiment with some temporary PVs
> first.

Aminur, can you copy and compress the metadata areas, and share them
somewhere?

To copy the first metadata area, use:

dd if=/dev/mapper/360014058ccaab4857eb40f393aaf0351 of=md1 bs=128M count=1 skip=4096 iflag=skip_bytes

To copy the second metadata area, you need to know the size of the PV. On
my setup with a 100G PV, I have 800 extents (128M each), and this works:

dd if=/dev/mapper/360014058ccaab4857eb40f393aaf0351 of=md2 bs=128M count=1 skip=799

gzip md1 md2

Nir
___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZCD5K4UTMZ3QVS7OC2KWBSXWCHWTXQLV/


[ovirt-users] Re: Failed to activate Storage Domain --- ovirt 4.2

2019-06-10 Thread Nir Soffer
On Mon, Jun 10, 2019 at 11:22 PM David Teigland  wrote:

> On Mon, Jun 10, 2019 at 10:59:43PM +0300, Nir Soffer wrote:
> > > [root@uk1-ion-ovm-18  pvscan
> > >   /dev/mapper/36000d31005697814: Checksum error at offset
> > > 4397954425856
> > >   Couldn't read volume group metadata from
> > > /dev/mapper/36000d31005697814.
> > >   Metadata location on /dev/mapper/36000d31005697814 at
> > > 4397954425856 has invalid summary for VG.
> > >   Failed to read metadata summary from
> > > /dev/mapper/36000d31005697814
> > >   Failed to scan VG from /dev/mapper/36000d31005697814
> >
> > This looks like corrupted vg metadata.
>
> Yes, the second metadata area, at the end of the device is corrupted; the
> first metadata area is probably ok.  That version of lvm is not able to
> continue by just using the one good copy.

Can we copy the first metadata area into the second metadata area?

> Last week I pushed out major changes to LVM upstream to be able to handle
> and repair most of these cases.  So, one option is to build lvm from the
> upstream master branch, and check if that can read and repair this
> metadata.

This sounds pretty risky for production.

> > David, we keep 2 metadata copies on the first PV. Can we use one of the
> > copies on the PV to restore the metadata to the last good state?
>
> pvcreate with --restorefile and --uuid, and with the right backup metadata

What would be the right backup metadata?

> could probably correct things, but experiment with some temporary PVs
> first.

Aminur, can you copy and compress the metadata areas, and share them
somewhere?

To copy the first metadata area, use:

dd if=/dev/mapper/360014058ccaab4857eb40f393aaf0351 of=md1 bs=128M count=1 skip=4096 iflag=skip_bytes

To copy the second metadata area, you need to know the size of the PV. On
my setup with a 100G PV, I have 800 extents (128M each), and this works:

dd if=/dev/mapper/360014058ccaab4857eb40f393aaf0351 of=md2 bs=128M count=1 skip=799

gzip md1 md2
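
The skip value for the second copy follows from the device size. A minimal
sketch of that arithmetic, assuming (as in the example above) that the PV size
is an exact multiple of 128 MiB and that the second metadata area sits in the
last 128 MiB chunk; the function names are illustrative only:

```shell
# Chunk size used by the dd commands above: 128 MiB.
CHUNK=$((128 * 1024 * 1024))

# Print the dd skip value that selects the last 128 MiB chunk of a device
# whose size in bytes is given as $1.
last_chunk_skip() {
    echo $(( $1 / CHUNK - 1 ))
}

# Print the index of the 128 MiB chunk that contains byte offset $1.
chunk_of_offset() {
    echo $(( $1 / CHUNK ))
}

# 100 GiB PV, as in the example above.
last_chunk_skip $((100 * 1024 * 1024 * 1024))   # prints 799

# Offset reported in the pvscan checksum error.
chunk_of_offset 4397954425856                   # prints 32767
```

On a real host the size in bytes could come from blockdev --getsize64 on the
multipath device.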

Nir
___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/RYQA4SXJQJJN7DV3U6KB2XQ3AOPLAHT6/


[ovirt-users] Re: Failed to activate Storage Domain --- ovirt 4.2

2019-06-10 Thread David Teigland
On Mon, Jun 10, 2019 at 10:59:43PM +0300, Nir Soffer wrote:
> > [root@uk1-ion-ovm-18  pvscan
> >   /dev/mapper/36000d31005697814: Checksum error at offset
> > 4397954425856
> >   Couldn't read volume group metadata from
> > /dev/mapper/36000d31005697814.
> >   Metadata location on /dev/mapper/36000d31005697814 at
> > 4397954425856 has invalid summary for VG.
> >   Failed to read metadata summary from
> > /dev/mapper/36000d31005697814
> >   Failed to scan VG from /dev/mapper/36000d31005697814
> 
> This looks like corrupted vg metadata.

Yes, the second metadata area, at the end of the device is corrupted; the
first metadata area is probably ok.  That version of lvm is not able to
continue by just using the one good copy.

Last week I pushed out major changes to LVM upstream to be able to handle
and repair most of these cases.  So, one option is to build lvm from the
upstream master branch, and check if that can read and repair this
metadata.

> David, we keep 2 metadata copies on the first PV. Can we use one of the
> copies on the PV to restore the metadata to the last good state?

pvcreate with --restorefile and --uuid, and with the right backup metadata
could probably correct things, but experiment with some temporary PVs
first.
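
One way to rehearse this on a temporary PV is sketched below. It only prints
the commands rather than running them. The scratch device, the placeholder
values, and the function name are assumptions for illustration, not something
from this thread; the PV UUID and backup file have to come from a real metadata
backup of the VG. Since the scratch PV would carry the same UUID and VG name as
the real one, this is probably best tried on a machine that cannot see the
production LUN.

```shell
# Print, without executing, a rehearsal of the PV restore on a scratch device.
print_restore_rehearsal() {
    vg=$1         # VG name as recorded in the backup file
    pv_uuid=$2    # PV UUID as recorded in the backup file
    backup=$3     # path to the metadata backup file
    scratch=$4    # scratch block device, for example a loop device
    echo "pvcreate --uuid $pv_uuid --restorefile $backup $scratch"
    echo "vgcfgrestore --file $backup $vg"
}

# Placeholder values only. A loop device over a sparse file could be prepared
# beforehand with truncate and losetup --find --show.
print_restore_rehearsal my_vg PV-UUID-FROM-BACKUP /tmp/my_vg.backup /dev/loop0
```

Printing the commands first leaves room to review them against the backup file
before anything touches a device.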
___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/6T7EM2R7422CXGBO3CKALMIHBYSTBUYK/


[ovirt-users] Re: Failed to activate Storage Domain --- ovirt 4.2

2019-06-10 Thread Nir Soffer
On Fri, Jun 7, 2019 at 5:03 PM  wrote:

> Hi
> Has anyone experienced the following issue with a Storage Domain -
>
> Failed to activate Storage Domain cLUN-R940-DC2-dstore01 --
> VDSM command ActivateStorageDomainVDS failed: Storage domain does not
> exist: (u'1b0ef853-fd71-45ea-8165-cc6047a267bc',)
>
> Currently, the storage domain is Inactive and, strangely, the VMs are
> running as normal. We can't manage or extend the volume size of this
> storage domain. The pvscan shows as:
> [root@uk1-ion-ovm-18  pvscan
>   /dev/mapper/36000d31005697814: Checksum error at offset
> 4397954425856
>   Couldn't read volume group metadata from
> /dev/mapper/36000d31005697814.
>   Metadata location on /dev/mapper/36000d31005697814 at
> 4397954425856 has invalid summary for VG.
>   Failed to read metadata summary from
> /dev/mapper/36000d31005697814
>   Failed to scan VG from /dev/mapper/36000d31005697814
>

This looks like corrupted vg metadata.

> I have tried the following steps:
> 1. Restarted ovirt-engine.service
> 2. tried to restore the metadata using vgcfgrestore but it failed with the
> following error:
>
> [root@uk1-ion-ovm-19 backup]# vgcfgrestore
> 36000d31005697814
>   Volume group 36000d31005697814 has active volume: .
>   WARNING: Found 1 active volume(s) in volume group
> "36000d31005697814".
>   Restoring VG with active LVs, may cause mismatch with its metadata.
> Do you really want to proceed with restore of volume group
> "36000d31005697814", while 1 volume(s) are active? [y/n]: y
>

This is not safe, you cannot fix the VG while it is being used by oVirt.

You need to migrate the running VMs to other storage, or shut down the VMs.
Then deactivate this storage domain. Only then can you try to restore the VG.

>   /dev/mapper/36000d31005697814: Checksum error at offset
> 4397954425856
>   Couldn't read volume group metadata from
> /dev/mapper/36000d31005697814.
>   Metadata location on /dev/mapper/36000d31005697814 at
> 4397954425856 has invalid summary for VG.
>   Failed to read metadata summary from
> /dev/mapper/36000d31005697814
>   Failed to scan VG from /dev/mapper/36000d31005697814
>   /etc/lvm/backup/36000d31005697814: stat failed: No such
> file or directory
>

Looks like you don't have a backup on this host. You may have the most
recent backup on another host.


>   Couldn't read volume group metadata from file.
>   Failed to read VG 36000d31005697814 from
> /etc/lvm/backup/36000d31005697814
>   Restore failed.
>
> Please let me know if anyone knows any possible resolution.
>

David, we keep 2 metadata copies on the first PV. Can we use one of the
copies on the PV to restore the metadata to the last good state?

David, how do you suggest we proceed?

Nir
___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/KA4TVUE775MMCQVD3YF7GSUZGEEOCQCF/


[ovirt-users] Re: Failed to activate Storage Domain --- ovirt 4.2

2019-06-10 Thread Eyal Shenitzky
Nir, can you please have a look?

On Mon, Jun 10, 2019 at 2:29 PM Aminur Rahman wrote:

> Hi Eyal
>
> We’re using:
>
> ovirt-engine-4.2.8.2-1.el7.noarch
> vdsm-client-4.20.46-1.el7.noarch
>
> Thanks
> Aminur Rahman
> aminur.rah...@iongroup.com
> t +44 20 7398 0243
> m +44 7825 780697
> iongroup.com <https://www.iongroup.com>
>
> From: Eyal Shenitzky
> Sent: 10 June 2019 07:20
> To: Aminur Rahman; Nir Soffer <nsof...@redhat.com>
> Cc: users
> Subject: Re: [ovirt-users] Failed to activate Storage Domain --- ovirt 4.2
>
> Hi Aminur,
>
> Can you please send the engine and vdsm versions?
>
> On Fri, Jun 7, 2019 at 5:03 PM  wrote:
>
> Hi
> Has anyone experienced the following issue with a Storage Domain -
>
> Failed to activate Storage Domain cLUN-R940-DC2-dstore01 --
> VDSM command ActivateStorageDomainVDS failed: Storage domain does not
> exist: (u'1b0ef853-fd71-45ea-8165-cc6047a267bc',)
>
> Currently, the storage domain is Inactive and, strangely, the VMs are
> running as normal. We can't manage or extend the volume size of this
> storage domain. The pvscan shows as:
> [root@uk1-ion-ovm-18  pvscan
>   /dev/mapper/36000d31005697814: Checksum error at offset
> 4397954425856
>   Couldn't read volume group metadata from
> /dev/mapper/36000d31005697814.
>   Metadata location on /dev/mapper/36000d31005697814 at
> 4397954425856 has invalid summary for VG.
>   Failed to read metadata summary from
> /dev/mapper/36000d31005697814
>   Failed to scan VG from /dev/mapper/36000d31005697814
>
> I have tried the following steps:
> 1. Restarted ovirt-engine.service
> 2. tried to restore the metadata using vgcfgrestore but it failed with the
> following error:
>
> [root@uk1-ion-ovm-19 backup]# vgcfgrestore
> 36000d31005697814
>   Volume group 36000d31005697814 has active volume: .
>   WARNING: Found 1 active volume(s) in volume group
> "36000d31005697814".
>   Restoring VG with active LVs, may cause mismatch with its metadata.
> Do you really want to proceed with restore of volume group
> "36000d31005697814", while 1 volume(s) are active? [y/n]: y
>   /dev/mapper/36000d31005697814: Checksum error at offset
> 4397954425856
>   Couldn't read volume group metadata from
> /dev/mapper/36000d31005697814.
>   Metadata location on /dev/mapper/36000d31005697814 at
> 4397954425856 has invalid summary for VG.
>   Failed to read metadata summary from
> /dev/mapper/36000d31005697814
>   Failed to scan VG from /dev/mapper/36000d31005697814
>   /etc/lvm/backup/36000d31005697814: stat failed: No such
> file or directory
>   Couldn't read volume group metadata from file.
>   Failed to read VG 36000d31005697814 from
> /etc/lvm/backup/36000d31005697814
>   Restore failed.
>
> Please let me know if anyone knows any possible resolution.
>
> -Aminur
> ___
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/W2JP7ZO5XMV66ATT3N33IKCZHKM6XPWJ/

[ovirt-users] Re: Failed to activate Storage Domain --- ovirt 4.2

2019-06-10 Thread Eyal Shenitzky
Hi Aminur,

Can you please send the engine and vdsm versions?


On Fri, Jun 7, 2019 at 5:03 PM  wrote:

> Hi
> Has anyone experienced the following issue with a Storage Domain -
>
> Failed to activate Storage Domain cLUN-R940-DC2-dstore01 --
> VDSM command ActivateStorageDomainVDS failed: Storage domain does not
> exist: (u'1b0ef853-fd71-45ea-8165-cc6047a267bc',)
>
> Currently, the storage domain is Inactive and, strangely, the VMs are
> running as normal. We can't manage or extend the volume size of this
> storage domain. The pvscan shows as:
> [root@uk1-ion-ovm-18  pvscan
>   /dev/mapper/36000d31005697814: Checksum error at offset
> 4397954425856
>   Couldn't read volume group metadata from
> /dev/mapper/36000d31005697814.
>   Metadata location on /dev/mapper/36000d31005697814 at
> 4397954425856 has invalid summary for VG.
>   Failed to read metadata summary from
> /dev/mapper/36000d31005697814
>   Failed to scan VG from /dev/mapper/36000d31005697814
>
> I have tried the following steps:
> 1. Restarted ovirt-engine.service
> 2. tried to restore the metadata using vgcfgrestore but it failed with the
> following error:
>
> [root@uk1-ion-ovm-19 backup]# vgcfgrestore
> 36000d31005697814
>   Volume group 36000d31005697814 has active volume: .
>   WARNING: Found 1 active volume(s) in volume group
> "36000d31005697814".
>   Restoring VG with active LVs, may cause mismatch with its metadata.
> Do you really want to proceed with restore of volume group
> "36000d31005697814", while 1 volume(s) are active? [y/n]: y
>   /dev/mapper/36000d31005697814: Checksum error at offset
> 4397954425856
>   Couldn't read volume group metadata from
> /dev/mapper/36000d31005697814.
>   Metadata location on /dev/mapper/36000d31005697814 at
> 4397954425856 has invalid summary for VG.
>   Failed to read metadata summary from
> /dev/mapper/36000d31005697814
>   Failed to scan VG from /dev/mapper/36000d31005697814
>   /etc/lvm/backup/36000d31005697814: stat failed: No such
> file or directory
>   Couldn't read volume group metadata from file.
>   Failed to read VG 36000d31005697814 from
> /etc/lvm/backup/36000d31005697814
>   Restore failed.
>
> Please let me know if anyone knows any possible resolution.
>
> -Aminur
> ___
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/W2JP7ZO5XMV66ATT3N33IKCZHKM6XPWJ/
>


-- 
Regards,
Eyal Shenitzky
___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/BSEQ2QQQ3SHQQOTWNFWJWRPKH7QM2YWA/


[ovirt-users] Re: Failed to activate Storage Domain --- ovirt 4.2

2019-06-07 Thread aminur . rahman
The Storage Domain was the master domain and it ran out of space.
___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/4Y6LEK5RZE33HDLZQXBJ4IODUKDI67R5/