On 20.06.2016 10:40, Nikolay Shirokovskiy wrote:
> This patch is not critical but nice to have. The original motivation
> was an error message in the logs when undefining a domain through the
> vz driver. The undefine procedure drops the domain lock while waiting
> for the vz sdk call that detaches disks. Meanwhile a vz sdk
> domain-config-changed event arrives; its handler finds the domain and
> blocks waiting for the job condition. After the undefine API call
> finishes, event processing proceeds and tries to refresh the domain
> config through the existing vz sdk domain handle. The domain does not
> exist anymore and event processing fails. Everything is fine, we just
> don't want to see an error message in the log for this particular
> case.
>
> Fortunately the domain object has a flag indicating that the domain
> has been removed from the list. This also implies that the vz sdk
> domain has been undefined. Thus if we check this flag right after the
> domain is locked again on acquiring the job condition, we handle this
> situation gracefully.
>
> Actually the race can happen in other situations too. Any time we
> wait for the job condition in a mutually exclusive job, the vz sdk
> domain can cease to exist by the time we acquire it. So instead of a
> generic internal error we can return 'domain not found', which is
> easier to handle. We don't need to patch other places in mutually
> exclusive jobs where the domain lock is dropped: once the job has
> started, the domain can't be undefined by the mutually exclusive
> undefine job.
>
> In the case of API calls that are not jobs (load snapshots etc.) we
> had better check whether the domain exists every time the domain lock
> is reacquired. Fortunately these calls drop the domain lock only
> once, to make the appropriate vz sdk API call.
>
> The code of this patch is quite similar to the qemu driver's checks
> for whether the domain is active after acquiring a job. The only
> difference is that a qemu domain is operational while its process is
> active, whereas a vz domain is operational while the domain exists.
>
> Signed-off-by: Nikolay Shirokovskiy <[email protected]>
> ---
> src/vz/vz_driver.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
> src/vz/vz_sdk.c | 32 +++++++++++++++++++++++++++++---
> 2 files changed, 75 insertions(+), 3 deletions(-)
>
[snip]
Another case of hindsight being 20/20 )) I think the changes below are
not good and should be dropped from the patch. We can't rely on the
'removing' flag in cases other than mutually exclusive jobs.
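
For readers without the snipped vz_driver.c hunk in front of them, here is a
minimal, self-contained pthread sketch of the part of the pattern that still
holds, i.e. checking 'removing' right after the mutually exclusive job
condition is acquired. This is not libvirt code; the struct and function
names are invented purely for illustration.

/* Standalone illustration of the pattern, not libvirt code: the
 * 'removing' check must come right after the condition wait returns,
 * because the domain lock is dropped while waiting and a concurrent
 * undefine job may have removed the domain in the meantime. */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct vz_domain {             /* stand-in for virDomainObjPtr */
    pthread_mutex_t lock;
    pthread_cond_t job_cond;
    bool job_active;
    bool removing;             /* set once the domain is removed from the list */
};

/* Returns 0 with the job started, -1 if the domain went away while waiting. */
int
begin_job(struct vz_domain *dom)
{
    pthread_mutex_lock(&dom->lock);

    while (dom->job_active)
        pthread_cond_wait(&dom->job_cond, &dom->lock); /* lock dropped here */

    if (dom->removing) {       /* re-check after the lock is reacquired */
        pthread_mutex_unlock(&dom->lock);
        fprintf(stderr, "no domain with matching uuid\n"); /* VIR_ERR_NO_DOMAIN
                                                              in the real driver */
        return -1;
    }

    dom->job_active = true;
    pthread_mutex_unlock(&dom->lock);
    return 0;
}

In the real driver the same effect is what the commit message describes:
report VIR_ERR_NO_DOMAIN instead of a generic internal error once the job
condition is acquired and dom->removing is set.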
>
> +static void
> +vzDomainObjCheckExist(virDomainObjPtr dom)
> +{
> + char uuidstr[VIR_UUID_STRING_BUFLEN];
> +
> + if (!dom->removing)
> + return;
> +
> + virUUIDFormat(dom->def->uuid, uuidstr);
> + virReportError(VIR_ERR_NO_DOMAIN,
> + _("no domain with matching uuid '%s' (%s)"),
> + uuidstr, dom->def->name);
> +}
> +
> int
> prlsdkDomainManagedSaveRemove(virDomainObjPtr dom)
> {
> @@ -3959,8 +3979,10 @@ prlsdkDomainManagedSaveRemove(virDomainObjPtr dom)
> PRL_HANDLE job;
>
> job = PrlVm_DropSuspendedState(privdom->sdkdom);
> - if (PRL_FAILED(waitDomainJob(job, dom)))
> + if (PRL_FAILED(waitDomainJob(job, dom))) {
> + vzDomainObjCheckExist(dom);
> return -1;
> + }
>
> return 0;
> }
> @@ -4394,8 +4416,10 @@ prlsdkLoadSnapshots(virDomainObjPtr dom)
> char *treexml = NULL;
>
> job = PrlVm_GetSnapshotsTreeEx(privdom->sdkdom, PGST_WITHOUT_SCREENSHOTS);
> - if (PRL_FAILED(getDomainJobResult(job, dom, &result)))
> + if (PRL_FAILED(getDomainJobResult(job, dom, &result))) {
> + vzDomainObjCheckExist(dom);
> goto cleanup;
> + }
>
> if (!(treexml = prlsdkGetStringParamVar(PrlResult_GetParamAsString,
> result)))
> goto cleanup;
> @@ -4427,8 +4451,10 @@ int prlsdkDeleteSnapshot(virDomainObjPtr dom, const char *uuid, bool children)
> PRL_HANDLE job;
>
> job = PrlVm_DeleteSnapshot(privdom->sdkdom, uuid, children);
> - if (PRL_FAILED(waitDomainJob(job, dom)))
> + if (PRL_FAILED(waitDomainJob(job, dom))) {
> + vzDomainObjCheckExist(dom);
> return -1;
> + }
>
> return 0;
> }
>