I forgot to mention that the LVM configuration has to be modified in order to 
'inform' the local LVM stack to rely on clvmd/dlm for locking.
Yet, this brings another layer of complexity which I prefer to avoid, so I 
use HA-LVM on my pacemaker clusters.
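
For reference, a rough sketch of the lvm.conf change I mean (the exact option 
depends on your LVM version; locking_type = 3 is the classic clvmd setting, 
and the service names may differ per distribution):

# /etc/lvm/lvm.conf  (legacy clvmd-based clustered locking)
global {
    locking_type = 3    # built-in clustered locking via clvmd
}

# enable the locking daemons on every node (service names are an assumption)
systemctl enable --now dlm clvmd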

@Martijn,

Check the link from Benny and if possible check if the 2 cases are related.

Best Regards,
Strahil Nikolov

On Jul 24, 2019 11:07, Benny Zlotnik <bzlot...@redhat.com> wrote:
>
> We have seen something similar in the past and patches were posted to deal 
> with this issue, but it's still in progress [1].
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1553133
>
> On Mon, Jul 22, 2019 at 8:07 PM Strahil <hunter86...@yahoo.com> wrote:
>>
>> I have a theory... but without any proof, it will remain just a theory.
>>
>> The storage volumes are just VGs over shared storage. The SPM host is 
>> supposed to be the only one working with the LVM metadata, but I have 
>> observed that when someone executes a simple LVM command (for example 
>> lvs, vgs or pvs) while another operation is going on on another host, 
>> your metadata can get corrupted, due to the lack of clvmd.
>>
>> As a protection, I can suggest trying the following solution (a rough 
>> command sketch follows the steps):
>> 1. Create a new iSCSI LUN.
>> 2. Share it to all nodes and create the storage domain. Set it to 
>> maintenance.
>> 3. Start the dlm & clvmd services on all hosts.
>> 4. Convert the VG of your shared storage domain to have the 'clustered' flag:
>> vgchange -c y mynewVG
>> 5. Check the LVs of that VG.
>> 6. Activate the storage domain.
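>>
>> Roughly, the commands would look like this (an untested sketch; 'mynewVG' 
>> stands for the VG backing the new storage domain):
>>
>> # on every host
>> systemctl start dlm clvmd
>>
>> # on one host: mark the VG as clustered, then verify its LVs are still visible
>> vgchange -c y mynewVG
>> lvs mynewVG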
>>
>> Of course, test it on a test cluster before implementing it in Prod.
>> This is one of the approaches used in Linux HA clusters in order to avoid 
>> LVM metadata corruption.
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> On Jul 22, 2019 15:46, Martijn Grendelman <martijn.grendel...@isaac.nl> 
>> wrote:
>>>
>>> Hi,
>>>
>>> On 22-7-2019 at 14:30, Strahil wrote:
>>>>
>>>> If you can give directions (some kind of history), the devs might try to 
>>>> reproduce this type of issue.
>>>>
>>>> If it is reproducible, a fix can be provided.
>>>>
>>>> Based on my experience, when something as widely used as Linux LVM gets 
>>>> broken, the case is very hard to reproduce.
>>>
>>>
>>> Yes, I'd think so too, especially since this activity (online moving of 
>>> disk images) is done all the time, mostly without problems. In this case, 
>>> there was a lot of activity on all storage domains, because I'm moving all 
>>> my storage (> 10TB in 185 disk images) to a new storage platform. During 
>>> the online move of one of the images, the metadata checksum became corrupted 
>>> and the storage domain went offline.
>>>
>>> Of course, I could dig up the engine logs and vdsm logs of when it 
>>> happened, but that would be some work and I'm not very confident that the 
>>> actual cause would be in there.
>>>
>>> If any oVirt devs are interested in the logs, I'll provide them, but 
>>> otherwise I think I'll just see it as an incident and move on.
>>>
>>> Best regards,
>>> Martijn.
>>>
>>>
>>>
>>>
>>> On Jul 22, 2019 10:17, Martijn Grendelman <martijn.grendel...@isaac.nl> 
>>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Thanks for the tips! I didn't know about 'pvmove', thanks.
>>>>>
>>>>> In the meantime, I managed to get it fixed by restoring the VG metadata 
>>>>> on the iSCSI server, that is, on the underlying zvol directly, rather than 
>>>>> via the iSCSI session on the oVirt host. That allowed me to perform the 
>>>>> restore without bringing all VMs down, which was important to me, because 
>>>>> if I had to shut down VMs, I was sure I wouldn't be able to restart them 
>>>>> before the storage domain was back online.
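>>>>>
>>>>> In case anyone runs into the same thing: restoring VG metadata from a backup 
>>>>> file is normally done with vgcfgrestore, along these lines (the backup file 
>>>>> path and VG name below are just placeholders for your own setup):
>>>>>
>>>>> vgcfgrestore -f /etc/lvm/backup/myVG myVG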
>>>>>
>>>>> Of course, this is more a Linux problem than an oVirt problem, but oVirt 
>>>>> did cause it ;-)
>>>>>
>>>>> Thanks,
>>>>> Martijn.
>>>>>
>>>>>
>>>>>
>>>>>> On 19-7-2019 at 19:06, Strahil Nikolov wrote:
>>>>>>
>>>>>> Hi Martijn,
>>>>>>
>>>>>> First check what went wrong with the VG, as it could be something simple.
>>>>>> vgcfgbackup -f <backup file> VGname will create a file which you can use to 
>>>>>> compare the current metadata with a previous version.
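>>>>>>
>>>>>> For example (the file names here are just placeholders; LVM also keeps 
>>>>>> automatic backups under /etc/lvm/backup and /etc/lvm/archive):
>>>>>>
>>>>>> vgcfgbackup -f /tmp/myVG_now.vg myVG
>>>>>> diff /tmp/myVG_now.vg /etc/lvm/backup/myVG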
>>>>>>
>>>>>> If you have Linux boxes - you can add disks from another storage an