Change in vdsm[master]: jobs: Fix abort semantics
gerrit-hooks has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 7: * Update Tracker::IGNORE, no bug url/s found * Set MODIFIED::IGNORE, no Bug-Url found. -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 7 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Adam Litke Gerrit-Reviewer: Dan Kenigsberg Gerrit-Reviewer: Francesco Romani Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: Shahar Havivi Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: No ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: jobs: Fix abort semantics
Nir Soffer has submitted this change and it was merged. Change subject: jobs: Fix abort semantics .. jobs: Fix abort semantics Prior to this commit jobs.abort was treated as an synchronous operation and if it returned successfully the caller could assume that no operations related to the job were still running. Unforatunately this assumption is wrong since most underlying operations are aborted asynchronously (ie. by sending a signal to a process). To fix this we must adopt async abort sementics in the jobs API. If a job is pending and abort is called we can simply move it to aborted state. If the job is running we move it to aborting state, call the abort helper (_abort) and return. When run() finishes it will move an aborting job to aborted. We must not autodelete or otherwise change the state of jobs with aborting status because it could mask the fact that a process is still changing the host or storage. Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Signed-off-by: Adam Litke Reviewed-on: https://gerrit.ovirt.org/65102 Reviewed-by: Nir Soffer Continuous-Integration: Jenkins CI --- M lib/vdsm/jobs.py M tests/jobsTests.py 2 files changed, 105 insertions(+), 46 deletions(-) Approvals: Adam Litke: Verified Nir Soffer: Looks good to me, approved Jenkins CI: Passed CI tests -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: merged Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 7 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Adam Litke Gerrit-Reviewer: Dan Kenigsberg Gerrit-Reviewer: Francesco Romani Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: Shahar Havivi Gerrit-Reviewer: gerrit-hooks ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: jobs: Fix abort semantics
Adam Litke has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 6: Verified+1 Verified with unit tests. -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 6 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Adam Litke Gerrit-Reviewer: Dan Kenigsberg Gerrit-Reviewer: Francesco Romani Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: Shahar Havivi Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: No ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: jobs: Fix abort semantics
Nir Soffer has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 6: Code-Review+2 Lets make some progress. -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 6 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Adam Litke Gerrit-Reviewer: Dan Kenigsberg Gerrit-Reviewer: Francesco Romani Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: Shahar Havivi Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: No ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: jobs: Fix abort semantics
gerrit-hooks has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 6: * Update Tracker::IGNORE, no bug url/s found * Check Bug-Url::IGNORE, not relevant for branch: master * Check Public Bug::WARN, no public bug url found * Check merged to previous::IGNORE, Not in stable branch (['ovirt-3.6', 'ovirt-4.0']) -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 6 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Adam Litke Gerrit-Reviewer: Dan Kenigsberg Gerrit-Reviewer: Francesco Romani Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: Shahar Havivi Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: No ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: jobs: Fix abort semantics
Adam Litke has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 5: (1 comment) https://gerrit.ovirt.org/#/c/65102/5/lib/vdsm/jobs.py File lib/vdsm/jobs.py: Line 237: """ Line 238: raise NotImplementedError() Line 239: Line 240: def _autodelete_if_required(self): Line 241: if self.autodelete: > We need a log here: Done Line 242: timeout = config.getint("jobs", "autodelete_delay") Line 243: if timeout >= 0: Line 244: _scheduler.schedule(timeout, self._delete) Line 245: -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 5 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Adam Litke Gerrit-Reviewer: Dan Kenigsberg Gerrit-Reviewer: Francesco Romani Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: Shahar Havivi Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: Yes ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: jobs: Fix abort semantics
Nir Soffer has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 5: Code-Review+1 (1 comment) Francesco, would take another look? https://gerrit.ovirt.org/#/c/65102/5/lib/vdsm/jobs.py File lib/vdsm/jobs.py: Line 237: """ Line 238: raise NotImplementedError() Line 239: Line 240: def _autodelete_if_required(self): Line 241: if self.autodelete: We need a log here: "Job %r will be deleted in %d seconds" But we can add this later. Line 242: timeout = config.getint("jobs", "autodelete_delay") Line 243: if timeout >= 0: Line 244: _scheduler.schedule(timeout, self._delete) Line 245: -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 5 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Adam Litke Gerrit-Reviewer: Dan Kenigsberg Gerrit-Reviewer: Francesco Romani Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: Shahar Havivi Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: Yes ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: jobs: Fix abort semantics
gerrit-hooks has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 5: * Update tracker: IGNORE, no Bug-Url found * Check Bug-Url::WARN, no bug url found, make sure header matches 'Bug-Url: ' and is a valid url. * Check merged to previous::IGNORE, Not in stable branch (['ovirt-3.6', 'ovirt-4.0']) -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 5 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Adam Litke Gerrit-Reviewer: Dan Kenigsberg Gerrit-Reviewer: Francesco Romani Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: Shahar Havivi Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: No ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: jobs: Fix abort semantics
Adam Litke has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 4: (6 comments) https://gerrit.ovirt.org/#/c/65102/4/lib/vdsm/jobs.py File lib/vdsm/jobs.py: Line 146 Line 147 Line 148 Line 149 Line 150 > We can log here that we are starting the job. Done Line 132: if self.status == STATUS.PENDING: Line 133: # Autodelete should only be handled here for pending state. Line 134: # There is no operation running so we can go straight to Line 135: # aborted state. In all other cases, autodelete is handled as Line 136: # the _run method finishes. > +1 (for consistency purposes from my POV) Done Line 137: self._status = STATUS.ABORTED Line 138: if self.autodelete: Line 139: self._autodelete() Line 140: elif self.status == STATUS.RUNNING: Line 148: raise JobNotActive() Line 149: Line 150: def run(self): Line 151: if not self._may_run(): Line 152: return > We need log info for starting the job - maybe inside _may_run? Done Line 153: try: Line 154: self._run() Line 155: except exception.ActionStopped: Line 156: self._abort_completed() Line 155: except exception.ActionStopped: Line 156: self._abort_completed() Line 157: except Exception as e: Line 158: self._run_failed(e) Line 159: else: > For extra consistency, we can introduce _run_completed() for these lines. Done Line 160: with self._status_lock: Line 161: self._status = STATUS.DONE Line 162: finally: Line 163: if self.autodelete: Line 157: except Exception as e: Line 158: self._run_failed(e) Line 159: else: Line 160: with self._status_lock: Line 161: self._status = STATUS.DONE > What we miss in the successful case ia log info about job completing. Done Line 162: finally: Line 163: if self.autodelete: Line 164: self._autodelete() Line 165: PS4, Line 177: _abort_completed > perhaps _run_aborted(self) could be a bit clearer? I prefer the current name but I'll add some docstrings to these methods to make their scope and function more clear. -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 4 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Adam Litke Gerrit-Reviewer: Dan Kenigsberg Gerrit-Reviewer: Francesco Romani Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: Shahar Havivi Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: Yes ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: jobs: Fix abort semantics
Adam Litke has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 4: (1 comment) https://gerrit.ovirt.org/#/c/65102/4/lib/vdsm/jobs.py File lib/vdsm/jobs.py: Line 162: finally: Line 163: if self.autodelete: Line 164: self._autodelete() Line 165: Line 166: def _may_run(self): Add docstrings for each of these helpers. Line 167: with self._status_lock: Line 168: if self.status == STATUS.ABORTED: Line 169: logging.debug('Refusing to run aborted job %r', self._id) Line 170: return False -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 4 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Adam Litke Gerrit-Reviewer: Dan Kenigsberg Gerrit-Reviewer: Francesco Romani Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: Shahar Havivi Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: Yes ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: jobs: Fix abort semantics
Francesco Romani has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 4: designwise ACK from me. -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 4 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Adam Litke Gerrit-Reviewer: Dan Kenigsberg Gerrit-Reviewer: Francesco Romani Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: Shahar Havivi Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: No ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: jobs: Fix abort semantics
Francesco Romani has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 4: (2 comments) initial review. Looks nice, but I need to doublecheck the state machine. https://gerrit.ovirt.org/#/c/65102/4/lib/vdsm/jobs.py File lib/vdsm/jobs.py: Line 132: if self.status == STATUS.PENDING: Line 133: # Autodelete should only be handled here for pending state. Line 134: # There is no operation running so we can go straight to Line 135: # aborted state. In all other cases, autodelete is handled as Line 136: # the _run method finishes. > We need to log here, something like: +1 (for consistency purposes from my POV) logging.info() should be fine. Line 137: self._status = STATUS.ABORTED Line 138: if self.autodelete: Line 139: self._autodelete() Line 140: elif self.status == STATUS.RUNNING: PS4, Line 177: _abort_completed perhaps _run_aborted(self) could be a bit clearer? -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 4 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Adam Litke Gerrit-Reviewer: Dan Kenigsberg Gerrit-Reviewer: Francesco Romani Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: Shahar Havivi Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: Yes ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: jobs: Fix abort semantics
Nir Soffer has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 4: Code-Review+1 (5 comments) https://gerrit.ovirt.org/#/c/65102/4/lib/vdsm/jobs.py File lib/vdsm/jobs.py: Line 146 Line 147 Line 148 Line 149 Line 150 We can log here that we are starting the job. Line 132: if self.status == STATUS.PENDING: Line 133: # Autodelete should only be handled here for pending state. Line 134: # There is no operation running so we can go straight to Line 135: # aborted state. In all other cases, autodelete is handled as Line 136: # the _run method finishes. We need to log here, something like: Aborted Job xxxyyy Line 137: self._status = STATUS.ABORTED Line 138: if self.autodelete: Line 139: self._autodelete() Line 140: elif self.status == STATUS.RUNNING: Line 148: raise JobNotActive() Line 149: Line 150: def run(self): Line 151: if not self._may_run(): Line 152: return We need log info for starting the job - maybe inside _may_run? Line 153: try: Line 154: self._run() Line 155: except exception.ActionStopped: Line 156: self._abort_completed() Line 155: except exception.ActionStopped: Line 156: self._abort_completed() Line 157: except Exception as e: Line 158: self._run_failed(e) Line 159: else: For extra consistency, we can introduce _run_completed() for these lines. Line 160: with self._status_lock: Line 161: self._status = STATUS.DONE Line 162: finally: Line 163: if self.autodelete: Line 157: except Exception as e: Line 158: self._run_failed(e) Line 159: else: Line 160: with self._status_lock: Line 161: self._status = STATUS.DONE What we miss in the successful case ia log info about job completing. Line 162: finally: Line 163: if self.autodelete: Line 164: self._autodelete() Line 165: -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 4 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Adam Litke Gerrit-Reviewer: Francesco Romani Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: Yes ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: jobs: Fix abort semantics
gerrit-hooks has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 4: * Update tracker: IGNORE, no Bug-Url found * Check Bug-Url::WARN, no bug url found, make sure header matches 'Bug-Url: ' and is a valid url. * Check merged to previous::IGNORE, Not in stable branch (['ovirt-3.6', 'ovirt-4.0']) -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 4 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Adam Litke Gerrit-Reviewer: Francesco Romani Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: No ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: jobs: Fix abort semantics
Adam Litke has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 3: (7 comments) https://gerrit.ovirt.org/#/c/65102/3/lib/vdsm/jobs.py File lib/vdsm/jobs.py: Line 168 Line 169 Line 170 Line 171 Line 172 > Should raise? Done Line 169 Line 170 Line 171 Line 172 Line 173 > We must remove this line, _abort cannot know if the job was aborted, it is Done Line 152: return Line 153: try: Line 154: self._run() Line 155: except exception.ActionStopped: Line 156: self._handle_completed_abort() > Maybe _abort_completed? Done Line 157: except Exception as e: Line 158: self._handle_execution_error(e) Line 159: else: Line 160: # We always mark the job done. Even if we were in aborting state Line 154: self._run() Line 155: except exception.ActionStopped: Line 156: self._handle_completed_abort() Line 157: except Exception as e: Line 158: self._handle_execution_error(e) > Maybe _run_failed? Done Line 159: else: Line 160: # We always mark the job done. Even if we were in aborting state Line 161: # _run might finish normally before the abort action executes. Line 162: with self._status_lock: Line 157: except Exception as e: Line 158: self._handle_execution_error(e) Line 159: else: Line 160: # We always mark the job done. Even if we were in aborting state Line 161: # _run might finish normally before the abort action executes. > I think this comment is confusing, setting status to DONE in the else is pr Done Line 162: with self._status_lock: Line 163: self._status = STATUS.DONE Line 164: finally: Line 165: if self.autodelete: Line 162: with self._status_lock: Line 163: self._status = STATUS.DONE Line 164: finally: Line 165: if self.autodelete: Line 166: self._autodelete() > Cannot be more elegant! Thanks! Line 167: Line 168: def _prepare_to_run(self): Line 169: with self._status_lock: Line 170: if self.status == STATUS.ABORTED: Line 164: finally: Line 165: if self.autodelete: Line 166: self._autodelete() Line 167: Line 168: def _prepare_to_run(self): > May be _may_run? Done Line 169: with self._status_lock: Line 170: if self.status == STATUS.ABORTED: Line 171: logging.debug('Refusing to run aborted job %r', self._id) Line 172: return False -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 3 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Adam Litke Gerrit-Reviewer: Francesco Romani Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: Yes ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: jobs: Fix abort semantics
Nir Soffer has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 3: Code-Review+1 (7 comments) Nice and simple! Please check the comments. https://gerrit.ovirt.org/#/c/65102/3/lib/vdsm/jobs.py File lib/vdsm/jobs.py: Line 168 Line 169 Line 170 Line 171 Line 172 Should raise? Line 169 Line 170 Line 171 Line 172 Line 173 We must remove this line, _abort cannot know if the job was aborted, it is async. Line 152: return Line 153: try: Line 154: self._run() Line 155: except exception.ActionStopped: Line 156: self._handle_completed_abort() Maybe _abort_completed? Line 157: except Exception as e: Line 158: self._handle_execution_error(e) Line 159: else: Line 160: # We always mark the job done. Even if we were in aborting state Line 154: self._run() Line 155: except exception.ActionStopped: Line 156: self._handle_completed_abort() Line 157: except Exception as e: Line 158: self._handle_execution_error(e) Maybe _run_failed? Line 159: else: Line 160: # We always mark the job done. Even if we were in aborting state Line 161: # _run might finish normally before the abort action executes. Line 162: with self._status_lock: Line 157: except Exception as e: Line 158: self._handle_execution_error(e) Line 159: else: Line 160: # We always mark the job done. Even if we were in aborting state Line 161: # _run might finish normally before the abort action executes. I think this comment is confusing, setting status to DONE in the else is pretty clear to me. Line 162: with self._status_lock: Line 163: self._status = STATUS.DONE Line 164: finally: Line 165: if self.autodelete: Line 162: with self._status_lock: Line 163: self._status = STATUS.DONE Line 164: finally: Line 165: if self.autodelete: Line 166: self._autodelete() Cannot be more elegant! Line 167: Line 168: def _prepare_to_run(self): Line 169: with self._status_lock: Line 170: if self.status == STATUS.ABORTED: Line 164: finally: Line 165: if self.autodelete: Line 166: self._autodelete() Line 167: Line 168: def _prepare_to_run(self): May be _may_run? Line 169: with self._status_lock: Line 170: if self.status == STATUS.ABORTED: Line 171: logging.debug('Refusing to run aborted job %r', self._id) Line 172: return False -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 3 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Francesco Romani Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: Yes ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: jobs: Fix abort semantics
gerrit-hooks has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 3: * Update tracker: IGNORE, no Bug-Url found * Check Bug-Url::WARN, no bug url found, make sure header matches 'Bug-Url: ' and is a valid url. * Check merged to previous::IGNORE, Not in stable branch (['ovirt-3.6', 'ovirt-4.0']) -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 3 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Francesco Romani Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: No ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: jobs: Fix abort semantics
Nir Soffer has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 2: Code-Review-1 (2 comments) https://gerrit.ovirt.org/#/c/65102/2/lib/vdsm/jobs.py File lib/vdsm/jobs.py: Line 129: Line 130: def abort(self): Line 131: with self._status_lock: Line 132: if self.status == STATUS.PENDING: Line 133: self._status = STATUS.ABORTED Maybe comment only here: # We autodelete only in pending state, otherwise we autodelete in run() Line 134: if self.autodelete: Line 135: self._autodelete() Line 136: elif self.status == STATUS.RUNNING: Line 137: self._status = STATUS.ABORTING Line 136: elif self.status == STATUS.RUNNING: Line 137: self._status = STATUS.ABORTING Line 138: logging.info("Aborting job %r...", self.id) Line 139: self._abort() Line 140: # Autodelete is handled by run when the operation is terminated Strange that we comment about auto delete only in RUNNING state, and ignore ABORTING state. Looks like an error that someone will try to "fix". Line 141: elif self.status == STATUS.ABORTING: Line 142: logging.info("Retrying abort job %r...", self.id) Line 143: self._abort() Line 144: else: -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 2 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Francesco Romani Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: Yes ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: jobs: Fix abort semantics
gerrit-hooks has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 2: * Update tracker: IGNORE, no Bug-Url found * Check Bug-Url::WARN, no bug url found, make sure header matches 'Bug-Url: ' and is a valid url. * Check merged to previous::IGNORE, Not in stable branch (['ovirt-3.6', 'ovirt-4.0']) -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 2 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Francesco Romani Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: No ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
Change in vdsm[master]: jobs: Fix abort semantics
Nir Soffer has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 1: Code-Review-1 (8 comments) https://gerrit.ovirt.org/#/c/65102/1/lib/vdsm/jobs.py File lib/vdsm/jobs.py: Line 146 Line 147 Line 148 Line 149 Line 150 Note to self: status may change at this point to ABORTING Line 169 Line 170 Line 171 Line 172 Line 173 These do not matter much now, maybe change Must to Should. Line 136: elif self.status == STATUS.RUNNING: Line 137: self._status = STATUS.ABORTING Line 138: logging.info("Aborting job %r...", self.id) Line 139: self._abort() Line 140: # Autodelete is handled by run when the operation is terminated This comment is relevant for both running and aborting state, maybe move to the pending section, or add note on the top of the function. Line 141: elif self.status == STATUS.ABORTING: Line 142: logging.info("Retrying abort job %r...", self.id) Line 143: self._abort() Line 144: else: Line 158: except Exception as e: Line 159: with self._status_lock: Line 160: if self.status == STATUS.ABORTING: Line 161: logging.exception("Failed to abort job (id=%s desc=%s)", Line 162: self.id, self.description) Why this is failure to abort? If run raised, it may be because an opeation was aborted (raised ActionStopped). I think this means that user requested an abort, so we don't care if the job failed or aborted, and we should change state to ABORTED. Line 163: # We do not autodelete failed aborting jobs because it is Line 164: # likely that something is still running on the system. Line 165: else: Line 166: self._status = STATUS.FAILED Line 166: self._status = STATUS.FAILED Line 167: logging.exception("Job (id=%s desc=%s) failed", Line 168: self.id, self.description) Line 169: if self.autodelete: Line 170: self._autodelete() We need autodelete both for job errors and job success, better move it down to the finally block. Line 171: if not isinstance(e, exception.VdsmException): Line 172: e = exception.GeneralException(str(e)) Line 173: self._error = e Line 174: else: Line 169: if self.autodelete: Line 170: self._autodelete() Line 171: if not isinstance(e, exception.VdsmException): Line 172: e = exception.GeneralException(str(e)) Line 173: self._error = e We can wrap the exception first, regardless of the status. Line 174: else: Line 175: with self._status_lock: Line 176: if self.status == STATUS.ABORTING: Line 177: self._status = STATUS.ABORTED Line 175: with self._status_lock: Line 176: if self.status == STATUS.ABORTING: Line 177: self._status = STATUS.ABORTED Line 178: else: Line 179: self._status = STATUS.DONE Checking aborting state twice is little ugly. Maybe we can move this to the common section? something like: try: run... except: wrap error finally: take lock if status is aborting: set status to aborted eif error: set status to failed set error else: self status to done if autodelete: autodelete... Line 180: if self.autodelete: Line 181: self._autodelete() Line 182: Line 183: def _abort(self): Line 177: self._status = STATUS.ABORTED Line 178: else: Line 179: self._status = STATUS.DONE Line 180: if self.autodelete: Line 181: self._autodelete() We want to do autodelete in a finally block. Line 182: Line 183: def _abort(self): Line 184: """ Line 185: May be implemented by child class. This is an asynchronous operation -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 1 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Nir Soffer Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: Yes ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patc
Change in vdsm[master]: jobs: Fix abort semantics
Adam Litke has uploaded a new change for review. Change subject: jobs: Fix abort semantics .. jobs: Fix abort semantics Prior to this commit jobs.abort was treated as an synchronous operation and if it returned successfully the caller could assume that no operations related to the job were still running. Unforatunately this assumption is wrong since most underlying operations are aborted asynchronously (ie. by sending a signal to a process). To fix this we must adopt async abort sementics in the jobs API. If a job is pending and abort is called we can simply move it to aborted state. If the job is running we move it to aborting state, call the abort helper (_abort) and return. When run() finishes it will move an aborting job to aborted. We must not autodelete or otherwise change the state of jobs with aborting status because it could mask the fact that a process is still changing the host or storage. Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Signed-off-by: Adam Litke --- M lib/vdsm/jobs.py M tests/jobsTests.py 2 files changed, 53 insertions(+), 38 deletions(-) git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/02/65102/1 diff --git a/lib/vdsm/jobs.py b/lib/vdsm/jobs.py index b92779d..2c6c27c 100644 --- a/lib/vdsm/jobs.py +++ b/lib/vdsm/jobs.py @@ -33,11 +33,12 @@ class STATUS: -PENDING = 'pending' # Job has not started yet -RUNNING = 'running' # Job is running -DONE = 'done'# Job has finished successfully -ABORTED = 'aborted' # Job was aborted by user request -FAILED = 'failed'# Job has failed +PENDING = 'pending'# Job has not started yet +RUNNING = 'running'# Job is running +DONE = 'done' # Job has finished successfully +ABORTING = 'aborting' # Job is running but abort is in progress +ABORTED = 'aborted'# Job was aborted by user request +FAILED = 'failed' # Job has failed class ClientError(Exception): @@ -128,16 +129,20 @@ def abort(self): with self._status_lock: -if not self.active: +if self.status == STATUS.PENDING: +self._status = STATUS.ABORTED +if self.autodelete: +self._autodelete() +elif self.status == STATUS.RUNNING: +self._status = STATUS.ABORTING +logging.info("Aborting job %r...", self.id) +self._abort() +# Autodelete is handled by run when the operation is terminated +elif self.status == STATUS.ABORTING: +logging.info("Retrying abort job %r...", self.id) +self._abort() +else: raise JobNotActive() -logging.info('Job %r aborting...', self._id) -self._abort() -self._status = STATUS.ABORTED - -# We MUST NOT autodelete a job if abort failed. Otherwise there could -# still be ongoing operations on storage without any associated job. -if self.autodelete: -self._autodelete() def run(self): with self._status_lock: @@ -150,25 +155,37 @@ self._status = STATUS.RUNNING try: self._run() -status = STATUS.DONE except Exception as e: -status = STATUS.FAILED -logging.exception("Job (id=%s desc=%s) failed", - self.id, self.description) -if not isinstance(e, exception.VdsmException): -e = exception.GeneralException(str(e)) -self._error = e -finally: with self._status_lock: -if self.status == STATUS.ABORTED: -return -self._status = status -if self.autodelete: -self._autodelete() +if self.status == STATUS.ABORTING: +logging.exception("Failed to abort job (id=%s desc=%s)", + self.id, self.description) +# We do not autodelete failed aborting jobs because it is +# likely that something is still running on the system. +else: +self._status = STATUS.FAILED +logging.exception("Job (id=%s desc=%s) failed", + self.id, self.description) +if self.autodelete: +self._autodelete() +if not isinstance(e, exception.VdsmException): +e = exception.GeneralException(str(e)) +self._error = e +else: +with self._status_lock: +if self.status == STATUS.ABORTING: +self._status = STATUS.ABORTED +else: +self._status = STATUS.DONE +if self.autodelete: +self._aut
Change in vdsm[master]: jobs: Fix abort semantics
gerrit-hooks has posted comments on this change. Change subject: jobs: Fix abort semantics .. Patch Set 1: * Update tracker: IGNORE, no Bug-Url found * Check Bug-Url::WARN, no bug url found, make sure header matches 'Bug-Url: ' and is a valid url. * Check merged to previous::IGNORE, Not in stable branch (['ovirt-3.6', 'ovirt-4.0']) -- To view, visit https://gerrit.ovirt.org/65102 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801082c50b10cf0571210d65cd3a5cec0d282a5c Gerrit-PatchSet: 1 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Adam Litke Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: gerrit-hooks Gerrit-HasComments: No ___ vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org