I see

> To be honest, I'm not sure that I remember the reason for detaching from
> the parent.
> I think it was related to compensation but anyway I believe we should
> preserve the execution context, and by that probably preserve the
> correlation id.
> What do you think?

I'm not sure. AFAIK we usually detach in the endAction methods when the
child runs asynchronously, to prevent NPEs when the parent command is
cleared before the child, but this is usually done if the parent's success
does not depend on the child completing successfully. In my first reply I
thought it was executed in an endAction method, but now I see that it
isn't, so I'm not sure that's the case here.

Another option may be to set the correlation id in the parameters, as
suggested in Liran's patch [1].

[1] https://gerrit.ovirt.org/c/ovirt-system-tests/+/112469
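
To make the correlation id point concrete, here is a minimal sketch of why a
wait that filters on correlation id can miss the detached snapshot removal.
This is a toy model, not the OST or engine code: the Job class, the
all_jobs_finished signature, and the id strings are assumptions made up for
illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Job:
    """Hypothetical stand-in for an engine job; not the real SDK type."""
    description: str
    correlation_id: Optional[str]
    finished: bool


def all_jobs_finished(jobs: Iterable[Job], correlation_id: str) -> bool:
    """Return True when every job tagged with correlation_id has finished.

    Jobs carrying a different (or no) correlation id are ignored, which is
    the blind spot discussed in this thread.
    """
    return all(job.finished for job in jobs
               if job.correlation_id == correlation_id)


# Child detached from the parent: it ends up with its own correlation id.
detached = [
    Job("Cloning VM", "clone_vm_test", finished=True),
    Job("Removing snapshot", "some-other-id", finished=False),
]

# Child that received the parent's correlation id through its parameters.
propagated = [
    Job("Cloning VM", "clone_vm_test", finished=True),
    Job("Removing snapshot", "clone_vm_test", finished=False),
]

print(all_jobs_finished(detached, "clone_vm_test"))    # True
print(all_jobs_finished(propagated, "clone_vm_test"))  # False
```

In the detached case the check reports success while the snapshot removal is
still running, so the next test can try to remove the VM too early. If the
child carries the same correlation id, the same check keeps waiting.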

On Wed, Dec 2, 2020 at 6:38 PM Arik Hadas <[email protected]> wrote:

>
>
> On Wed, Dec 2, 2020 at 5:27 PM Benny Zlotnik <[email protected]> wrote:
>
>> The failing test was merged two days ago[1]
>> A snapshot is created when the VM isn't running as well?
>>
>
> Yes, the reason is twofold:
> 1. When you create the snapshot and clone the disks from the snapshot, the
> VM can start during the clone operation (no need to wait for the clone
> operation to complete before starting the VM)
> 2. That way the implementation for both running and non-running VM is the
> same
>
> We have used this approach for exporting a VM to OVA for quite some time
>
>
>> My guess is because snapshot removal runs detached from the parent[2]
>> (which is correct IMHO), it doesn't have the same correlation_id, so
>> all_jobs_finished[3] doesn't wait for it. Perhaps snapshot creation/removal
>> can be skipped if it's a cold clone?
>>
>
> To be honest, I'm not sure that I remember the reason for detaching from
> the parent.
> I think it was related to compensation but anyway I believe we should
> preserve the execution context, and by that probably preserve the
> correlation id.
> What do you think?
>
>
>>
>> [1] https://gerrit.ovirt.org/c/ovirt-system-tests/+/112039
>> [2]
>> https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/CloneVmCommand.java#L273
>> [3]
>> https://github.com/oVirt/ovirt-system-tests/blob/master/basic-suite-master/test-scenarios/test_004_basic_sanity.py#L518
>>
>> On Wed, Dec 2, 2020 at 2:49 PM Liran Rotenberg <[email protected]>
>> wrote:
>>
>>>
>>>
>>> On Wed, Dec 2, 2020 at 2:03 PM Nir Soffer <[email protected]> wrote:
>>>
>>>> On Wed, Dec 2, 2020 at 1:50 PM Steven Rosenberg <[email protected]>
>>>> wrote:
>>>> >
>>>> > Dear Personnel,
>>>> >
>>>> > It seems that the OST is failing due to a storage issue [1]. The
>>>> > failure is in the test_verify_and_remove_cloned_vm test, which is not
>>>> > changed by the patch [2].
>>>> >
>>>> > The engine log prints the error:
>>>> >
>>>> > 2020-12-01 13:07:48,297+01 ERROR
>>>> [org.ovirt.engine.core.bll.storage.domain.AttachStorageDomainToPoolCommand]
>>>> (default task-1) []
>>>> >
>>>> > An error occurred while fetching unregistered disks from Storage
>>>> Domain id '26d55c27-bf61-4f01-b5a3-d9a6a01c3e8a'
>>>> > 2020-12-01 13:07:48,298+01 DEBUG
>>>> [org.ovirt.engine.core.bll.storage.domain.AttachStorageDomainToPoolCommand]
>>>> (default task-1) [] Skipping format update for domain
>>>> '26d55c27-bf61-4f01-b5a3-d9a6a01c3e8a' (type 'ImportExport')
>>>> >
>>>> > and returns the following error:
>>>> >
>>>> > 2020-12-01 13:17:46,784+01 ERROR
>>>> [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default
>>>> task-2) [] Operation Failed:
>>>> >
>>>> > [Cannot remove VM. The VM is performing an operation on a Snapshot.
>>>> Please wait for the operation to finish, and try again.]
>>>>
>>>> Looks like bad synchronization with the previous test: it does not
>>>> wait until the other test that changes the snapshot has completed.
>>>>
>>> +Shmuel Melamud <[email protected]> +Arik Hadas <[email protected]>
>>> That seems correct. We had https://bugzilla.redhat.com/1177156, as part
>>> of clone VM there is a snapshot created and later removed. It was merged
>>> pretty long ago (Oct 19).
>>> Maybe in the current run you had no luck in the timing but that is just
>>> an indicator this test might be flaky and should have better handling.
>>>
>>>>
>>>> > Maybe there is a recent change causing a regression here?
>>>> >
>>>> >
>>>> >
>>>> > [1]
>>>> https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/14137/
>>>> > [2]
>>>> https://github.com/oVirt/ovirt-system-tests/blob/master/basic-suite-master/test-scenarios/test_004_basic_sanity.py#L526
>>>> >
>>>> > With Best Regards.
>>>> >
>>>> > Steven.
>>>> >
>>>> >
>>>> _______________________________________________
>>>> Devel mailing list -- [email protected]
>>>> To unsubscribe send an email to [email protected]
>>>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>>>> oVirt Code of Conduct:
>>>> https://www.ovirt.org/community/about/community-guidelines/
>>>> List Archives:
>>>> https://lists.ovirt.org/archives/list/[email protected]/message/YVUXG2W7G7TIGMGSAQIIFOMCRN63ZDV2/
>>>>
>>>
>>
