OK, so I'm confirming that the image is wrong somehow: with no snapshot, the disk size reported from inside the VM is 750G; with a snapshot, it reports 1100G. Neither shows any partitions, so I guess oVirt migrated the structure of the 750G disk onto the 1100G disk. Any ideas on how to troubleshoot this and see whether there's data to recover?
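
In case it helps, here is what I am checking next from the rescue CD (a rough sketch, assuming the disk still shows up as /dev/vda and the XFS filesystem sat directly on the disk with no partition table; adjust the device name to your setup):

    lsblk -b /dev/vda                      # raw size the guest sees, in bytes
    blkid -p /dev/vda                      # probe the whole device for filesystem signatures
    file -s /dev/vda                       # quick second opinion on what sits at offset 0
    dd if=/dev/vda bs=512 count=1 2>/dev/null | hexdump -C | head   # an intact XFS superblock starts with the magic "XFSB"

If nothing turns up at the start of the device, the filesystem may simply sit at a different offset after the move, which is the kind of thing testdisk or a raw scan for the "XFSB" magic could still find.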
Regards,

2018-05-13 15:25 GMT-03:00 Juan Pablo <pablo.localh...@gmail.com>:

> Two clues:
> - The original size of the disk was 750G and it was extended a month ago to 1100G. The system rebooted fine several times and picked up the new size with no problems.
> - I ran fdisk from a CentOS 7 rescue CD and '/dev/vda' reported 750G. Then I took a snapshot of the disk to play with recovery tools, and now fdisk reports 1100G... ¬¬
>
> So my guess is that the extend, followed by the later migration to a different storage domain, caused the issue.
> I'm currently running testdisk to see if there's any partition to recover.
>
> Regards,
>
> 2018-05-13 12:31 GMT-03:00 Juan Pablo <pablo.localh...@gmail.com>:
>
>> I removed the auto-snapshot and still no luck. No bootable disk found. =(
>> Ideas?
>>
>> 2018-05-13 12:26 GMT-03:00 Juan Pablo <pablo.localh...@gmail.com>:
>>
>>> Benny, thanks for your reply:
>>> OK, so the first step is to remove the snapshot. Then what do you suggest?
>>>
>>> 2018-05-12 15:23 GMT-03:00 Nir Soffer <nsof...@redhat.com>:
>>>
>>>> On Sat, 12 May 2018, 11:32 Benny Zlotnik, <bzlot...@redhat.com> wrote:
>>>>
>>>>> Using the auto-generated snapshot is generally a bad idea as it's inconsistent,
>>>>
>>>> What do you mean by inconsistent?
>>>>
>>>>> you should remove it before moving further
>>>>>
>>>>> On Fri, May 11, 2018 at 7:25 PM, Juan Pablo <pablo.localh...@gmail.com> wrote:
>>>>>
>>>>>> I rebooted it with no luck, then I used the auto-generated snapshot, same luck.
>>>>>> Attaching the logs in gdrive.
>>>>>>
>>>>>> Thanks in advance
>>>>>>
>>>>>> 2018-05-11 12:50 GMT-03:00 Benny Zlotnik <bzlot...@redhat.com>:
>>>>>>
>>>>>>> I see here a failed attempt:
>>>>>>> 2018-05-09 16:00:20,129-03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-67) [bd8eeb1d-f49a-4f91-a521-e0f31b4a7cbd] EVENT_ID: USER_MOVED_DISK_FINISHED_FAILURE(2,011), User admin@internal-authz have failed to move disk mail02-int_Disk1 to domain 2penLA.
>>>>>>>
>>>>>>> Then another:
>>>>>>> 2018-05-09 16:15:06,998-03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] EVENT_ID: USER_MOVED_DISK_FINISHED_FAILURE(2,011), User admin@internal-authz have failed to move disk mail02-int_Disk1 to domain 2penLA.
>>>>>>>
>>>>>>> Here I see a successful attempt:
>>>>>>> 2018-05-09 21:58:42,628-03 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-50) [940b051c-8c63-4711-baf9-f3520bb2b825] EVENT_ID: USER_MOVED_DISK(2,008), User admin@internal-authz moving disk mail02-int_Disk1 to domain 2penLA.
>>>>>>>
>>>>>>> Then, in the last attempt I see the attempt was successful but live merge failed:
>>>>>>> 2018-05-11 03:37:59,509-03 ERROR [org.ovirt.engine.core.bll.MergeStatusCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [d5b7fdf5-9c37-4c1f-8543-a7bc75c993a5] Failed to live merge, still in volume chain: [5d9d2958-96bc-49fa-9100-2f33a3ba737f, 52532d05-970e-4643-9774-96c31796062c]
>>>>>>> 2018-05-11 03:38:01,495-03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-51) [d5b7fdf5-9c37-4c1f-8543-a7bc75c993a5] Command 'LiveMigrateDisk' (id: '115fc375-6018-4d59-b9f2-51ee05ca49f8') waiting on child command id: '26bc52a4-4509-4577-b342-44a679bc628f' type:'RemoveSnapshot' to complete
>>>>>>> 2018-05-11 03:38:01,501-03 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-51) [d5b7fdf5-9c37-4c1f-8543-a7bc75c993a5] Command id: '4936d196-a891-4484-9cf5-fceaafbf3364 failed child command status for step 'MERGE_STATUS'
>>>>>>> 2018-05-11 03:38:01,501-03 INFO [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommandCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-51) [d5b7fdf5-9c37-4c1f-8543-a7bc75c993a5] Command 'RemoveSnapshotSingleDiskLive' id: '4936d196-a891-4484-9cf5-fceaafbf3364' child commands '[8da5f261-7edd-4930-8d9d-d34f232d84b3, 1c320f4b-7296-43c4-a3e6-8a868e23fc35, a0e9e70c-cd65-4dfb-bd00-076c4e99556a]' executions were completed, status 'FAILED'
>>>>>>> 2018-05-11 03:38:02,513-03 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-2) [d5b7fdf5-9c37-4c1f-8543-a7bc75c993a5] Merging of snapshot '319e8bbb-9efe-4de4-a9a6-862e3deb891f' images '52532d05-970e-4643-9774-96c31796062c'..'5d9d2958-96bc-49fa-9100-2f33a3ba737f' failed. Images have been marked illegal and can no longer be previewed or reverted to. Please retry Live Merge on the snapshot to complete the operation.
>>>>>>> 2018-05-11 03:38:02,519-03 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-2) [d5b7fdf5-9c37-4c1f-8543-a7bc75c993a5] Ending command 'org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand' with failure.
>>>>>>> 2018-05-11 03:38:03,530-03 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [d5b7fdf5-9c37-4c1f-8543-a7bc75c993a5] Command 'RemoveSnapshot' id: '26bc52a4-4509-4577-b342-44a679bc628f' child commands '[4936d196-a891-4484-9cf5-fceaafbf3364]' executions were completed, status 'FAILED'
>>>>>>> 2018-05-11 03:38:04,548-03 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-66) [d5b7fdf5-9c37-4c1f-8543-a7bc75c993a5] Ending command 'org.ovirt.engine.core.bll.snapshots.RemoveSnapshotCommand' with failure.
>>>>>>> 2018-05-11 03:38:04,557-03 INFO [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-66) [d5b7fdf5-9c37-4c1f-8543-a7bc75c993a5] Lock freed to object 'EngineLock:{exclusiveLocks='[4808bb70-c9cc-4286-aa39-16b5798213ac=LIVE_STORAGE_MIGRATION]', sharedLocks=''}'
>>>>>>>
>>>>>>> I do not see the merge attempt in the vdsm.log, so please send vdsm logs for node02.phy.eze.ampgn.com.ar from that time.
>>>>>>>
>>>>>>> Also, did you use the auto-generated snapshot to start the vm?
>>>>>>>
>>>>>>> On Fri, May 11, 2018 at 6:11 PM, Juan Pablo <pablo.localh...@gmail.com> wrote:
>>>>>>>
>>>>>>>> After the xfs_repair, it says: sorry I could not find valid secondary superblock
>>>>>>>>
>>>>>>>> 2018-05-11 12:09 GMT-03:00 Juan Pablo <pablo.localh...@gmail.com>:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>> Alias: mail02-int_Disk1
>>>>>>>>> Description:
>>>>>>>>> ID: 65ec515e-0aae-4fe6-a561-387929c7fb4d
>>>>>>>>> Alignment: Unknown
>>>>>>>>> Disk Profile:
>>>>>>>>> Wipe After Delete: No
>>>>>>>>>
>>>>>>>>> That one.
>>>>>>>>>
>>>>>>>>> 2018-05-11 11:12 GMT-03:00 Benny Zlotnik <bzlot...@redhat.com>:
>>>>>>>>>
>>>>>>>>>> I looked at the logs and I see some disks have moved successfully and some failed. Which disk is causing the problems?
>>>>>>>>>>
>>>>>>>>>> On Fri, May 11, 2018 at 5:02 PM, Juan Pablo <pablo.localh...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi, I just sent you the files via Drive. Attaching some extra info, thanks, thanks and thanks:
>>>>>>>>>>> From inside the migrated VM I had the attached dmesg output before rebooting.
>>>>>>>>>>>
>>>>>>>>>>> Regards and thanks again for the help,
>>>>>>>>>>>
>>>>>>>>>>> 2018-05-11 10:45 GMT-03:00 Benny Zlotnik <bzlot...@redhat.com>:
>>>>>>>>>>>
>>>>>>>>>>>> Dropbox or Google Drive, I guess. Also, can you attach engine.log?
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, May 11, 2018 at 4:43 PM, Juan Pablo <pablo.localh...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> vdsm is too big for Gmail... any other way I can share it with you?
>>>>>>>>>>>>>
>>>>>>>>>>>>> ---------- Forwarded message ----------
>>>>>>>>>>>>> From: Juan Pablo <pablo.localh...@gmail.com>
>>>>>>>>>>>>> Date: 2018-05-11 10:40 GMT-03:00
>>>>>>>>>>>>> Subject: Re: [ovirt-users] strange issue: vm lost info on disk
>>>>>>>>>>>>> To: Benny Zlotnik <bzlot...@redhat.com>
>>>>>>>>>>>>> Cc: users <Users@ovirt.org>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Benny, thanks for your reply! It was a live migration. Sorry, it was from NFS to iSCSI, not the other way around. I have rebooted the VM for rescue and it does not detect any partitions with fdisk. I'm running xfs_repair with -n and it found a corrupted primary superblock; it's still running... (so... maybe there's info on the disk?)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Attaching logs, let me know if those are the ones. Thanks again!
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2018-05-11 9:45 GMT-03:00 Benny Zlotnik <bzlot...@redhat.com>:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can you provide the logs? engine and vdsm.
>>>>>>>>>>>>>> Did you perform a live migration (the VM is running) or cold?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, May 11, 2018 at 2:49 PM, Juan Pablo <pablo.localh...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi! I'm struggling with an ongoing problem: after migrating a VM's disk from an iSCSI domain to an NFS domain, with oVirt reporting that the migration was successful, I see there's no data 'inside' the VM's disk. We never had this kind of issue with oVirt, so I'm puzzled about the root cause and whether there's a chance of recovering the information.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can you please help me troubleshoot this one? I would really appreciate it =)
>>>>>>>>>>>>>>> Running oVirt 4.2.1 here!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>>>>>> JP
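
PS: since the failed live merge above left two volumes in the chain (5d9d2958-96bc-49fa-9100-2f33a3ba737f and 52532d05-970e-4643-9774-96c31796062c), I also plan to look at the image chain directly from a host that can see the storage. A rough sketch only, with <path-to-top-volume> standing in for the actual volume path, which depends on whether the destination domain is file-based (volume files under the domain mount) or block-based (LVs named after the volume UUIDs):

    qemu-img info --backing-chain <path-to-top-volume>   # virtual size, format and backing file of each volume in the chain
    qemu-img check <path-to-top-volume>                   # consistency check of the qcow2 metadata

If the 750G of data is still sitting in one of those volumes, this should at least show where it is before I try anything destructive.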
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org