Change in vdsm[master]: storage: Fix abort race in SDM.copy_data
gerrit-hooks has posted comments on this change. Change subject: storage: Fix abort race in SDM.copy_data .. Patch Set 6: * Update Tracker::IGNORE, no bug url/s found * Check Bug-Url::IGNORE, not relevant for branch: master * Check Public Bug::WARN, no public bug url found * Check merged to previous::IGNORE, Not in stable branch (['ovirt-3.6', 'ovirt-4.0']) -- To view, visit https://gerrit.ovirt.org/65150 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0c7d9aadb981e8b851a70c7ad0fdc0d75b46dc51 Gerrit-PatchSet: 6 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam LitkeGerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: No ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: storage: Fix abort race in SDM.copy_data
Nir Soffer has posted comments on this change. Change subject: storage: Fix abort race in SDM.copy_data .. Patch Set 5: Code-Review-1 (2 comments) https://gerrit.ovirt.org/#/c/65150/5/tests/storage_sdm_copy_data_test.py File tests/storage_sdm_copy_data_test.py: Line 309: job_id = make_uuid() Line 310: job = storage.sdm.api.copy_data.Job(job_id, 0, source, dest) Line 311: # Simulate status changing to aborting right after _run called Line 312: job._run = partial(_fake_run, job, job._run) Line 313: job.run() I think we can simplify - since we are already using job._run, we can simulate the framework: job = Job() job._status = RUNNING job.abort() The job should be in ABORTING state, so: with self.assertRaises(ActionStopped): job._run() Line 314: self.assertEqual(jobs.STATUS.ABORTED, job.status) Line 315: self.assertEqual(sc.LEGAL_VOL, dst_vol.getLegality()) Line 316: self.assertEqual(gen_id, dst_vol.getMetaParam(sc.GENERATION)) Line 317: https://gerrit.ovirt.org/#/c/65150/5/vdsm/storage/sdm/api/copy_data.py File vdsm/storage/sdm/api/copy_data.py: Line 73: Line 74: with self._status_lock: Line 75: # Do not start copying if we have already been aborted Line 76: if self._status == jobs.STATUS.ABORTING: Line 77: raise exception.ActionStopped() This will be needed by every job supporting abort. We probably need to add a helper in the framework, like _check_aborting. We also need to rename the exception to Aborted. Line 78: self._operation = qemuimg.convert( Line 79: self._source.path, Line 80: self._dest.path, Line 81: srcFormat=src_format, -- To view, visit https://gerrit.ovirt.org/65150 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0c7d9aadb981e8b851a70c7ad0fdc0d75b46dc51 Gerrit-PatchSet: 5 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam LitkeGerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: Yes ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: storage: Fix abort race in SDM.copy_data
gerrit-hooks has posted comments on this change. Change subject: storage: Fix abort race in SDM.copy_data .. Patch Set 5: * Update tracker: IGNORE, no Bug-Url found * Check Bug-Url::WARN, no bug url found, make sure header matches 'Bug-Url: ' and is a valid url. * Check merged to previous::IGNORE, Not in stable branch (['ovirt-3.6', 'ovirt-4.0']) -- To view, visit https://gerrit.ovirt.org/65150 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0c7d9aadb981e8b851a70c7ad0fdc0d75b46dc51 Gerrit-PatchSet: 5 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam LitkeGerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: No ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: storage: Fix abort race in SDM.copy_data
gerrit-hooks has posted comments on this change. Change subject: storage: Fix abort race in SDM.copy_data .. Patch Set 4: * Update tracker: IGNORE, no Bug-Url found * Check Bug-Url::WARN, no bug url found, make sure header matches 'Bug-Url: ' and is a valid url. * Check merged to previous::IGNORE, Not in stable branch (['ovirt-3.6', 'ovirt-4.0']) -- To view, visit https://gerrit.ovirt.org/65150 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0c7d9aadb981e8b851a70c7ad0fdc0d75b46dc51 Gerrit-PatchSet: 4 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam LitkeGerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: No ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: storage: Fix abort race in SDM.copy_data
gerrit-hooks has posted comments on this change. Change subject: storage: Fix abort race in SDM.copy_data .. Patch Set 3: * Update tracker: IGNORE, no Bug-Url found * Check Bug-Url::WARN, no bug url found, make sure header matches 'Bug-Url: ' and is a valid url. * Check merged to previous::IGNORE, Not in stable branch (['ovirt-3.6', 'ovirt-4.0']) -- To view, visit https://gerrit.ovirt.org/65150 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0c7d9aadb981e8b851a70c7ad0fdc0d75b46dc51 Gerrit-PatchSet: 3 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam LitkeGerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: No ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: storage: Fix abort race in SDM.copy_data
Nir Soffer has posted comments on this change. Change subject: storage: Fix abort race in SDM.copy_data .. Patch Set 2: Code-Review-1 (2 comments) https://gerrit.ovirt.org/#/c/65150/2/vdsm/storage/sdm/api/copy_data.py File vdsm/storage/sdm/api/copy_data.py: Line 82 Line 83 Line 84 Line 85 Line 86 I wonder if we should merge start and wait_for_completion() to run(): with self._dest.volume_operation(): self._operation.run() We can keep separate start() and wait() methods, or add them later if needed. I like to have generic start(), stop(), wait(), and run() for everything, if possible. Did you consider to create the operation inside the volume operation context? with guarded.context(locks): with prepare(src), prepare(dst): with dst.volume_operation: with self._status_lock: if self._status == ABORTING: raise ActionStopped self._operation = qemuimg.convert(...) self._operation.run() This means that we check the state right before we start the process. If the job was aborted during locking, taking the lease, setting status to illegal, we abort without starting qemu. Line 60: if self._operation: Line 61: self._operation.abort() Line 62: Line 63: def _run(self): Line 64: print "called: status=%r" % self.status leftover? Line 65: with guarded.context(self._source.locks + self._dest.locks): Line 66: with self._source.prepare(), self._dest.prepare(): Line 67: # Workaround for volumes containing VM configuration info that Line 68: # were created with invalid vdsm metadata. -- To view, visit https://gerrit.ovirt.org/65150 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0c7d9aadb981e8b851a70c7ad0fdc0d75b46dc51 Gerrit-PatchSet: 2 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam LitkeGerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: Yes ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: storage: Fix abort race in SDM.copy_data
gerrit-hooks has posted comments on this change. Change subject: storage: Fix abort race in SDM.copy_data .. Patch Set 2: * Update tracker: IGNORE, no Bug-Url found * Check Bug-Url::WARN, no bug url found, make sure header matches 'Bug-Url: ' and is a valid url. * Check merged to previous::IGNORE, Not in stable branch (['ovirt-3.6', 'ovirt-4.0']) -- To view, visit https://gerrit.ovirt.org/65150 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0c7d9aadb981e8b851a70c7ad0fdc0d75b46dc51 Gerrit-PatchSet: 2 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam LitkeGerrit-Reviewer: Jenkins CI Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: No ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: storage: Fix abort race in SDM.copy_data
gerrit-hooks has posted comments on this change. Change subject: storage: Fix abort race in SDM.copy_data .. Patch Set 1: * Update tracker: IGNORE, no Bug-Url found * Check Bug-Url::WARN, no bug url found, make sure header matches 'Bug-Url: ' and is a valid url. * Check merged to previous::IGNORE, Not in stable branch (['ovirt-3.6', 'ovirt-4.0']) -- To view, visit https://gerrit.ovirt.org/65150 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0c7d9aadb981e8b851a70c7ad0fdc0d75b46dc51 Gerrit-PatchSet: 1 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam LitkeGerrit-Reviewer: gerrit-hooks Gerrit-HasComments: No ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: storage: Fix abort race in SDM.copy_data
Adam Litke has uploaded a new change for review. Change subject: storage: Fix abort race in SDM.copy_data .. storage: Fix abort race in SDM.copy_data The copy_data job supports aborting the qemuimg process. If such a process has been started we kill it. If a process hasn't yet been created we do nothing but place the job in aborting state. Later before creating the qemuimg command we want to check if we have been asked to abort. If so, raise ActionStopped instead of continuing. Change-Id: I0c7d9aadb981e8b851a70c7ad0fdc0d75b46dc51 Signed-off-by: Adam Litke--- M tests/storage_sdm_copy_data_test.py M vdsm/storage/sdm/api/copy_data.py 2 files changed, 44 insertions(+), 9 deletions(-) git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/50/65150/1 diff --git a/tests/storage_sdm_copy_data_test.py b/tests/storage_sdm_copy_data_test.py index 6b116d5..0c6589d 100644 --- a/tests/storage_sdm_copy_data_test.py +++ b/tests/storage_sdm_copy_data_test.py @@ -21,6 +21,7 @@ import threading from contextlib import contextmanager +from functools import partial from fakelib import FakeScheduler from monkeypatch import MonkeyPatchScope @@ -285,6 +286,35 @@ self.assertEqual(sc.ILLEGAL_VOL, dst_vol.getLegality()) self.assertEqual(gen_id, dst_vol.getMetaParam(sc.GENERATION)) +@permutations((('file',), ('block',))) +def test_abort_before_copy(self, env_type): +def _fake_run(job_instance, real_run): +job_instance._status = jobs.STATUS.ABORTING +real_run() + +fmt = sc.RAW_FORMAT +with self.get_vols(env_type, fmt, fmt) as (src_chain, dst_chain): +src_vol = src_chain[0] +dst_vol = dst_chain[0] +gen_id = dst_vol.getMetaParam(sc.GENERATION) +source = dict(endpoint_type='div', sd_id=src_vol.sdUUID, + img_id=src_vol.imgUUID, vol_id=src_vol.volUUID, + generation=0) +dest = dict(endpoint_type='div', sd_id=dst_vol.sdUUID, +img_id=dst_vol.imgUUID, vol_id=dst_vol.volUUID, +generation=gen_id) +fake_convert = FakeQemuConvertChecker(src_vol, dst_vol, + error=RuntimeError) +with MonkeyPatchScope([(qemuimg, 'convert', fake_convert)]): +job_id = make_uuid() +job = storage.sdm.api.copy_data.Job(job_id, 0, source, dest) +# Simulate status changing to aborting right after _run called +job._run = partial(_fake_run, job, job._run) +job.run() +self.assertEqual(jobs.STATUS.ABORTED, job.status) +self.assertEqual(sc.LEGAL_VOL, dst_vol.getLegality()) +self.assertEqual(gen_id, dst_vol.getMetaParam(sc.GENERATION)) + def test_wrong_generation(self): fmt = sc.RAW_FORMAT with self.get_vols('block', fmt, fmt) as (src_chain, dst_chain): @@ -318,20 +348,23 @@ self.ready_event = threading.Event() def __call__(self, *args, **kwargs): -assert sc.LEGAL_VOL == self.src_vol.getLegality() -assert sc.ILLEGAL_VOL == self.dst_vol.getLegality() -return FakeQemuImgOperation(self.ready_event, self.wait_for_abort, +return FakeQemuImgOperation(self.src_vol, self.dst_vol, +self.ready_event, self.wait_for_abort, self.error) class FakeQemuImgOperation(object): -def __init__(self, ready_event, wait_for_abort, error): +def __init__(self, src_vol, dst_vol, ready_event, wait_for_abort, error): +self.src_vol = src_vol +self.dst_vol = dst_vol self.ready_event = ready_event self.wait_for_abort = wait_for_abort self.error = error self.abort_event = threading.Event() def start(self): +assert sc.LEGAL_VOL == self.src_vol.getLegality() +assert sc.ILLEGAL_VOL == self.dst_vol.getLegality() self.ready_event.set() def abort(self): diff --git a/vdsm/storage/sdm/api/copy_data.py b/vdsm/storage/sdm/api/copy_data.py index 3dc6fff..7182f49 100644 --- a/vdsm/storage/sdm/api/copy_data.py +++ b/vdsm/storage/sdm/api/copy_data.py @@ -26,6 +26,7 @@ from vdsm import jobs from vdsm import properties from vdsm import qemuimg +from vdsm.common import exception from vdsm.storage import constants as sc from vdsm.storage import guarded from vdsm.storage import workarounds @@ -60,12 +61,9 @@ self._operation.abort() def _run(self): +print "called: status=%r" % self.status with guarded.context(self._source.locks + self._dest.locks): with self._source.prepare(), self._dest.prepare(): -# Do not start copying if we have already been aborted -