Please find the result:

psql -d engine -c "\x on" -c "select * from images where image_group_id = 
'd7bd480d-2c51-4141-a386-113abf75219e';"

Expanded display is on.

-[ RECORD 1 ]---------+-------------------------------------
image_guid            | 8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8
creation_date         | 2020-04-23 14:59:23+02
size                  | 161061273600
it_guid               | 00000000-0000-0000-0000-000000000000
parentid              | 00000000-0000-0000-0000-000000000000
imagestatus           | 1
lastmodified          | 2020-07-06 20:38:36.093+02
vm_snapshot_id        | 6bc03db7-82a3-4b7e-9674-0bdd76933eb8
volume_type           | 2
volume_format         | 4
image_group_id        | d7bd480d-2c51-4141-a386-113abf75219e
_create_date          | 2020-04-23 14:59:20.919344+02
_update_date          | 2020-07-06 20:38:36.093788+02
active                | f
volume_classification | 1
qcow_compat           | 2
-[ RECORD 2 ]---------+-------------------------------------
image_guid            | 6197b30d-0732-4cc7-aef0-12f9f6e9565b
creation_date         | 2020-07-06 20:38:38+02
size                  | 161061273600
it_guid               | 00000000-0000-0000-0000-000000000000
parentid              | 8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8
imagestatus           | 1
lastmodified          | 1970-01-01 01:00:00+01
vm_snapshot_id        | fd5193ac-dfbc-4ed2-b86c-21caa8009bb2
volume_type           | 2
volume_format         | 4
image_group_id        | d7bd480d-2c51-4141-a386-113abf75219e
_create_date          | 2020-07-06 20:38:36.093788+02
_update_date          | 2020-07-06 20:38:52.139003+02
active                | t
volume_classification | 0
qcow_compat           | 2


psql -d engine -c "\x on" -c "SELECT s.* FROM snapshots s, images i where 
i.vm_snapshot_id = s.snapshot_id and i.image_guid = 
'6197b30d-0732-4cc7-aef0-12f9f6e9565b';"

Expanded display is on.

-[ RECORD 1 ]-----------+------------------------------------------------------------------------
snapshot_id             | fd5193ac-dfbc-4ed2-b86c-21caa8009bb2
vm_id                   | b5534254-660f-44b1-bc83-d616c98ba0ba
snapshot_type           | ACTIVE
status                  | OK
description             | Active VM
creation_date           | 2020-04-23 14:59:20.171+02
app_list                | kernel-3.10.0-957.12.2.el7,xorg-x11-drv-qxl-0.1.5-4.el7.1,kernel-3.10.0-957.12.1.el7,kernel-3.10.0-957.38.1.el7,ovirt-guest-agent-common-1.0.14-1.el7
vm_configuration        |
_create_date            | 2020-04-23 14:59:20.154023+02
_update_date            | 2020-07-03 17:33:17.483215+02
memory_metadata_disk_id |
memory_dump_disk_id     |
vm_configuration_broken | f


Thanks.



On Tue, 2020-07-21 at 13:45 +0300, Benny Zlotnik wrote:
I forgot to add the `\x on` to make the output readable; can you run it with:
$ psql -U engine -d engine -c "\x on" -c "<rest of the query...>"

On Mon, Jul 20, 2020 at 2:50 PM Arsène Gschwind <arsene.gschw...@unibas.ch> wrote:
Hi,

Please find the output:

select * from images where image_group_id = 
'd7bd480d-2c51-4141-a386-113abf75219e';


image_guid | creation_date | size | it_guid | parentid | imagestatus | lastmodified | vm_snapshot_id | volume_type | volume_format | image_group_id | _create_date | _update_date | active | volume_classification | qcow_compat
8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8 | 2020-04-23 14:59:23+02 | 161061273600 | 00000000-0000-0000-0000-000000000000 | 00000000-0000-0000-0000-000000000000 | 1 | 2020-07-06 20:38:36.093+02 | 6bc03db7-82a3-4b7e-9674-0bdd76933eb8 | 2 | 4 | d7bd480d-2c51-4141-a386-113abf75219e | 2020-04-23 14:59:20.919344+02 | 2020-07-06 20:38:36.093788+02 | f | 1 | 2
6197b30d-0732-4cc7-aef0-12f9f6e9565b | 2020-07-06 20:38:38+02 | 161061273600 | 00000000-0000-0000-0000-000000000000 | 8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8 | 1 | 1970-01-01 01:00:00+01 | fd5193ac-dfbc-4ed2-b86c-21caa8009bb2 | 2 | 4 | d7bd480d-2c51-4141-a386-113abf75219e | 2020-07-06 20:38:36.093788+02 | 2020-07-06 20:38:52.139003+02 | t | 0 | 2

(2 rows)



SELECT s.* FROM snapshots s, images i where i.vm_snapshot_id = s.snapshot_id 
and i.image_guid = '6197b30d-0732-4cc7-aef0-12f9f6e9565b';

snapshot_id | vm_id | snapshot_type | status | description | creation_date | app_list | vm_configuration | _create_date | _update_date | memory_metadata_disk_id | memory_dump_disk_id | vm_configuration_broken
fd5193ac-dfbc-4ed2-b86c-21caa8009bb2 | b5534254-660f-44b1-bc83-d616c98ba0ba | ACTIVE | OK | Active VM | 2020-04-23 14:59:20.171+02 | kernel-3.10.0-957.12.2.el7,xorg-x11-drv-qxl-0.1.5-4.el7.1,kernel-3.10.0-957.12.1.el7,kernel-3.10.0-957.38.1.el7,ovirt-guest-agent-common-1.0.14-1.el7 | | 2020-04-23 14:59:20.154023+02 | 2020-07-03 17:33:17.483215+02 | | | f

(1 row)

Thanks,
Arsene

On Sun, 2020-07-19 at 16:34 +0300, Benny Zlotnik wrote:

Sorry, I only replied to the question. In addition to removing the image from
the images table, you may also need to set the parent as the active image and
remove the snapshot referenced by this image from the database. Can you
provide the output of:

$ psql -U engine -d engine -c "select * from images where image_group_id = <disk_id>;"

As well as

$ psql -U engine -d engine -c "SELECT s.* FROM snapshots s, images i where
i.vm_snapshot_id = s.snapshot_id and i.image_guid =
'6197b30d-0732-4cc7-aef0-12f9f6e9565b';"


On Sun, Jul 19, 2020 at 12:49 PM Benny Zlotnik <bzlot...@redhat.com> wrote:


It can be done by deleting from the images table:

$ psql -U engine -d engine -c "DELETE FROM images WHERE image_guid =
'6197b30d-0732-4cc7-aef0-12f9f6e9565b';"

Of course, the database should be backed up before doing this.
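
One way to take that backup (a sketch, assuming the standard engine-backup tool
is available on the engine host; adjust the file names to your environment):

$ engine-backup --mode=backup --scope=all \
      --file=engine-backup-before-image-delete.tar.bz2 \
      --log=engine-backup-before-image-delete.log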




On Fri, Jul 17, 2020 at 6:45 PM Nir Soffer <nsof...@redhat.com> wrote:


On Thu, Jul 16, 2020 at 11:33 AM Arsène Gschwind <arsene.gschw...@unibas.ch> wrote:


It looks like the pivot completed successfully; see the attached vdsm.log.
Is there a way to recover that VM? Or would it be better to recover the VM from backup?


This is what we see in the log:


1. Merge request received

2020-07-13 11:18:30,282+0200 INFO  (jsonrpc/7) [api.virt] START
merge(drive={u'imageID': u'd7bd480d-2c51-4141-a386-113abf75219e',
u'volumeID': u'6197b30d-0732-4cc7-aef0-12f9f6e9565b', u'domainID':
u'33777993-a3a5-4aad-a24c-dfe5e473faca', u'poolID':
u'00000002-0002-0002-0002-000000000289'},
baseVolUUID=u'8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8',
topVolUUID=u'6197b30d-0732-4cc7-aef0-12f9f6e9565b', bandwidth=u'0',
jobUUID=u'720410c3-f1a0-4b25-bf26-cf40aa6b1f97')
from=::ffff:10.34.38.31,39226,
flow_id=4a8b9527-06a3-4be6-9bb9-88630febc227,
vmId=b5534254-660f-44b1-bc83-d616c98ba0ba (api:48)


To track this job, we can use the jobUUID 720410c3-f1a0-4b25-bf26-cf40aa6b1f97
and the top volume UUID 6197b30d-0732-4cc7-aef0-12f9f6e9565b.
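
For reference, an easy way to follow both identifiers through the rest of the
log on the host that ran the merge (assuming the default vdsm log location):

$ grep -E '720410c3-f1a0-4b25-bf26-cf40aa6b1f97|6197b30d-0732-4cc7-aef0-12f9f6e9565b' /var/log/vdsm/vdsm.log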


2. Starting the merge

2020-07-13 11:18:30,690+0200 INFO  (jsonrpc/7) [virt.vm]
(vmId='b5534254-660f-44b1-bc83-d616c98ba0ba') Starting merge with
jobUUID=u'720410c3-f1a0-4b25-bf26-cf40aa6b1f97', original
chain=8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8 <
6197b30d-0732-4cc7-aef0-12f9f6e9565b (top), disk='sda', base='sda[1]',
top=None, bandwidth=0, flags=12 (vm:5945)


We see the original chain:
8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8 < 6197b30d-0732-4cc7-aef0-12f9f6e9565b (top)


3. The merge was completed, ready for pivot

2020-07-13 11:19:00,992+0200 INFO  (libvirt/events) [virt.vm]
(vmId='b5534254-660f-44b1-bc83-d616c98ba0ba') Block job ACTIVE_COMMIT
for drive /rhev/data-center/mnt/blockSD/33777993-a3a5-4aad-a24c-dfe5e473faca/images/d7bd480d-2c51-4141-a386-113abf75219e/6197b30d-0732-4cc7-aef0-12f9f6e9565b
is ready (vm:5847)


At this point the parent volume contains all the data in the top volume, and we
can pivot to the parent volume.


4. Vdsm detects that the merge is ready and starts the cleanup thread
that will complete the merge

2020-07-13 11:19:06,166+0200 INFO  (periodic/1) [virt.vm]
(vmId='b5534254-660f-44b1-bc83-d616c98ba0ba') Starting cleanup thread
for job: 720410c3-f1a0-4b25-bf26-cf40aa6b1f97 (vm:5809)


5. Requesting pivot to parent volume:

2020-07-13 11:19:06,717+0200 INFO  (merge/720410c3) [virt.vm]
(vmId='b5534254-660f-44b1-bc83-d616c98ba0ba') Requesting pivot to
complete active layer commit (job
720410c3-f1a0-4b25-bf26-cf40aa6b1f97) (vm:6205)


6. Pivot was successful

2020-07-13 11:19:06,734+0200 INFO  (libvirt/events) [virt.vm]
(vmId='b5534254-660f-44b1-bc83-d616c98ba0ba') Block job ACTIVE_COMMIT
for drive /rhev/data-center/mnt/blockSD/33777993-a3a5-4aad-a24c-dfe5e473faca/images/d7bd480d-2c51-4141-a386-113abf75219e/6197b30d-0732-4cc7-aef0-12f9f6e9565b
has completed (vm:5838)


7. Vdsm waits until libvirt updates the XML:

2020-07-13 11:19:06,756+0200 INFO  (merge/720410c3) [virt.vm]
(vmId='b5534254-660f-44b1-bc83-d616c98ba0ba') Pivot completed (job
720410c3-f1a0-4b25-bf26-cf40aa6b1f97) (vm:6219)


8. Synchronizing vdsm metadata

2020-07-13 11:19:06,776+0200 INFO  (merge/720410c3) [vdsm.api] START
imageSyncVolumeChain(sdUUID='33777993-a3a5-4aad-a24c-dfe5e473faca',
imgUUID='d7bd480d-2c51-4141-a386-113abf75219e',
volUUID='6197b30d-0732-4cc7-aef0-12f9f6e9565b',
newChain=['8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8']) from=internal,
task_id=b8f605bd-8549-4983-8fc5-f2ebbe6c4666 (api:48)


We can see the new chain: ['8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8']

2020-07-13 11:19:07,005+0200 INFO  (merge/720410c3) [storage.Image]
Current chain=8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8 <
6197b30d-0732-4cc7-aef0-12f9f6e9565b (top)  (image:1221)

The old chain:
8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8 < 6197b30d-0732-4cc7-aef0-12f9f6e9565b (top)

2020-07-13 11:19:07,006+0200 INFO  (merge/720410c3) [storage.Image]
Unlinking subchain: ['6197b30d-0732-4cc7-aef0-12f9f6e9565b'] (image:1231)

2020-07-13 11:19:07,017+0200 INFO  (merge/720410c3) [storage.Image]
Leaf volume 6197b30d-0732-4cc7-aef0-12f9f6e9565b is being removed from
the chain. Marking it ILLEGAL to prevent data corruption (image:1239)

This matches what we see on storage.


9. Merge job is untracked

2020-07-13 11:19:21,134+0200 INFO  (periodic/1) [virt.vm]
(vmId='b5534254-660f-44b1-bc83-d616c98ba0ba') Cleanup thread
<vdsm.virt.vm.LiveMergeCleanupThread object at 0x7fa1e0370350>
successfully completed, untracking job
720410c3-f1a0-4b25-bf26-cf40aa6b1f97
(base=8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8,
top=6197b30d-0732-4cc7-aef0-12f9f6e9565b) (vm:5752)

This was a successful merge on the vdsm side.


We don't see any more requests for the top volume in this log. The next step to
complete the merge is to delete the volume 6197b30d-0732-4cc7-aef0-12f9f6e9565b,
but this can be done only on the SPM.

To understand why this did not happen, we need the engine log showing this
interaction, and logs from the SPM host from the same time.


Please file a bug about this and attach these logs (and the vdsm log you sent
here). Fixing this VM is important, but preventing this bug for other users is
even more important.



How to fix the volume metadata:

1. Edit 8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8.meta

Change:

VOLTYPE=INTERNAL

To:

VOLTYPE=LEAF

See attached file for reference.
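
If you prefer not to edit the file by hand, a one-liner like this makes the
same change (a sketch, assuming you are working on a local copy of the .meta
file dumped from storage, as above):

$ sed -i 's/^VOLTYPE=INTERNAL$/VOLTYPE=LEAF/' 8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8.meta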


2. Truncate the file to 512 bytes

truncate -s 512 8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8.meta

3. Verify the file size

$ ls -lh 8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8.meta
-rw-r--r--. 1 nsoffer nsoffer 512 Jul 17 18:17 8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8.meta


4. Get the slot number for this volume from the LV using MD_N and compute the offset

(copied from your pdf)

lvs -o vg_name,lv_name,tags | grep d7bd480d-2c51-4141-a386-113abf75219e

33777993-a3a5-4aad-a24c-dfe5e473faca 6197b30d-0732-4cc7-aef0-12f9f6e9565b IU_d7bd480d-2c51-4141-a386-113abf75219e,MD_58,PU_8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8
33777993-a3a5-4aad-a24c-dfe5e473faca 8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8 IU_d7bd480d-2c51-4141-a386-113abf75219e,MD_28,PU_00000000-0000-0000-0000-000000000000
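
For reference, the 1277952 offset used in the dd commands below matches the V5
block storage layout, where volume metadata slots start 1 MiB into the metadata
LV and each slot is 8 KiB; treat these constants as an assumption and verify
them for your storage domain format:

$ echo $((1024*1024 + 28*8192))    # MD_28 -> byte offset
1277952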


5. Get the metadata from the slot to verify that we are changing the right metadata

dd if=/dev/33777993-a3a5-4aad-a24c-dfe5e473faca/metadata bs=512 count=1 skip=1277952 iflag=skip_bytes > /tmp/8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8.meta.bad

Compare 8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8.meta.bad with the fixed file; the
only changes should be the VOLTYPE=LEAF line and the amount of padding.
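
For example (assuming the fixed copy was saved as
/tmp/8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8.meta.fixed, as used in step 6):

$ diff /tmp/8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8.meta.bad /tmp/8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8.meta.fixed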


6. Write new metadata to storage

dd of=/dev/33777993-a3a5-4aad-a24c-dfe5e473faca/metadata bs=512 count=1 seek=1277952 oflag=direct,seek_bytes conv=fsync < /tmp/8e412b5a-85ec-4c53-a5b8-dfb4d6d987b8.meta.fixed


7. Delete the LV 6197b30d-0732-4cc7-aef0-12f9f6e9565b on the SPM host

WARNING: this must be done on the SPM host, otherwise you may corrupt
the VG metadata.

If you selected the wipe-after-delete option for this disk, you want to wipe it
before deleting. If you selected the discard-after-delete option, you want to
discard the LV before deleting it.

Activate the LV on the SPM host:

    lvchange -ay 33777993-a3a5-4aad-a24c-dfe5e473faca/6197b30d-0732-4cc7-aef0-12f9f6e9565b

If needed, wipe it:

    blkdiscard --zeroout --step 32m /dev/33777993-a3a5-4aad-a24c-dfe5e473faca/6197b30d-0732-4cc7-aef0-12f9f6e9565b

If needed, discard it:

    blkdiscard /dev/33777993-a3a5-4aad-a24c-dfe5e473faca/6197b30d-0732-4cc7-aef0-12f9f6e9565b

Deactivate the LV:

    lvchange -an 33777993-a3a5-4aad-a24c-dfe5e473faca/6197b30d-0732-4cc7-aef0-12f9f6e9565b

Remove the LV:

    lvremove 33777993-a3a5-4aad-a24c-dfe5e473faca/6197b30d-0732-4cc7-aef0-12f9f6e9565b
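
To confirm the removal afterwards, this should report that the volume is not found:

    lvs 33777993-a3a5-4aad-a24c-dfe5e473faca/6197b30d-0732-4cc7-aef0-12f9f6e9565b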


8. Fixing the engine db

Benny, Eyal: can you add instructions on how to remove the volume on the
engine side?

After the volume is removed on the engine side, starting the VM will succeed.


--

Arsène Gschwind
Sapify AG on behalf of the University of Basel
IT Services
Klinelbergstr. 70 | CH-4056 Basel | Switzerland
Tel: +41 79 449 25 63 | http://its.unibas.ch
ITS-ServiceDesk: support-...@unibas.ch | +41 61 267 14 11

_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/TC3VAW5HP6A4IYTP2EALX7ZXEBPQQOXF/
