On 6/3/2016 at 6:37 PM, Nir Soffer wrote:
> On Fri, Jun 3, 2016 at 11:27 AM, InterNetX - Juergen Gotteswinter
> <juergen.gotteswin...@internetx.com> wrote:
>> What if we move all VMs off the LUN that causes this error, drop the LUN,
>> and recreate it? Will we "migrate" the error with the VMs to a different
>> LUN, or could this be a fix?
>
> This should fix the ids file, but since we don't know why this corruption
> happened, it may happen again.
>
I am pretty sure I know when and why this happened: these messages first
occurred after a major outage in which the engine went crazy fencing hosts,
plus a crash / hard reset of the SAN. But I can provide a log package, no
problem.

> Please open a bug with the log I requested so we can investigate this issue.
>
> To fix the ids file you don't have to recreate the lun, just
> initialize the ids lv.
>
> 1. Put the domain to maintenance (via engine)
>
>    No host should access it while you reconstruct the ids file
>
> 2. Activate the ids lv
>
>    You may need to connect to this iscsi target first, unless you have other
>    vgs connected on the same target.
>
>    lvchange -ay sd_uuid/ids
>
> 3. Initialize the lockspace
>
>    sanlock direct init -s <sd_uuid>:0:/dev/<sd_uuid>/ids:0
>
> 4. Deactivate the ids lv
>
>    lvchange -an sd_uuid/ids
>
> 5. Activate the domain (via engine)
>
>    The domain should become active after a while.
>

Oh, this is great, I am going to announce a maintenance window. Thanks a
lot, this had already started to drive me crazy. Will report back after we
have done this!

> Nir
>
>>
>> On 6/3/2016 at 10:08 AM, InterNetX - Juergen Gotteswinter wrote:
>>> Hello David,
>>>
>>> thanks for your explanation of those messages. Is there any possibility
>>> to get rid of this? I already figured out that it might be a corruption
>>> of the ids file, but I didn't find anything about re-creating it or other
>>> solutions to fix this.
>>>
>>> IMHO this occurred after an outage where several hosts and the iSCSI
>>> SAN were fenced and/or rebooted.
>>>
>>> Thanks,
>>>
>>> Juergen
>>>
>>>
>>> On 6/2/2016 at 6:03 PM, David Teigland wrote:
>>>> On Thu, Jun 02, 2016 at 06:47:37PM +0300, Nir Soffer wrote:
>>>>>> This is a mess that's been caused by improper use of storage, and various
>>>>>> sanity checks in sanlock have all reported errors for "impossible"
>>>>>> conditions indicating that something catastrophic has been done to the
>>>>>> storage it's using.
>>>>>> Some fundamental rules are not being followed.
>>>>>
>>>>> Thanks David.
>>>>>
>>>>> Do you need more output from sanlock to understand this issue?
>>>>
>>>> I can think of nothing more to learn from sanlock.  I'd suggest tighter,
>>>> higher level checking or control of storage.  Low level sanity checks
>>>> detecting lease corruption are not a convenient place to work from.
>>>>
>>>
>>> _______________________________________________
>>> Users mailing list
>>> Users@ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>>
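
For reference, the host-side commands Nir lists in this thread (activate the ids lv, initialize the lockspace, deactivate the ids lv) could be collected into a small shell helper. This is only a sketch under stated assumptions: the function name `ids_reinit_commands` is hypothetical, the storage domain UUID is passed in as a placeholder argument, and the function deliberately only prints the three commands from the thread instead of running them. The engine-side steps (putting the domain into maintenance beforehand, activating it afterwards) are not covered.

```shell
#!/bin/sh
# Hypothetical helper (not part of oVirt, vdsm, or sanlock): prints the
# commands quoted in this thread for a given storage domain UUID.
# It does NOT execute anything; review the output before running it by hand.
ids_reinit_commands() {
    if [ -z "$1" ]; then
        echo "usage: ids_reinit_commands <sd_uuid>" >&2
        return 1
    fi
    sd_uuid="$1"

    # Activate the ids lv (the domain must already be in maintenance)
    echo "lvchange -ay ${sd_uuid}/ids"

    # Initialize the lockspace on the ids lv
    echo "sanlock direct init -s ${sd_uuid}:0:/dev/${sd_uuid}/ids:0"

    # Deactivate the ids lv again
    echo "lvchange -an ${sd_uuid}/ids"
}
```

Calling `ids_reinit_commands <sd_uuid>` on a host prints the three lines in order, so they can be checked against the real UUID and pasted one at a time while no other host is accessing the domain.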