Re: [PR] fix(reports): guard SUCCESS-state report execution against duplicate sends and stuck WORKING state [superset]

via GitHub Mon, 01 Jun 2026 18:02:37 -0700


codeant-ai-for-open-source[bot] commented on code in PR #40657:
URL: https://github.com/apache/superset/pull/40657#discussion_r3337992886



##########
superset/commands/report/execute.py:
##########
@@ -1031,17 +1031,45 @@ def next(self) -> None:
                     )
                     return
             except Exception as ex:
-                self.send_error(
-                    f"Error occurred for {self._report_schedule.type}:"
-                    f" {self._report_schedule.name}",
-                    str(ex),
-                )
-                self.update_report_schedule_and_log(
-                    ReportState.ERROR,
-                    error_message=REPORT_SCHEDULE_ERROR_NOTIFICATION_MARKER,
-                )
+                # Ensure the schedule always transitions out of WORKING to
+                # ERROR, even if sending the error notification itself fails —
+                # otherwise the schedule is stuck in WORKING until the working
+                # timeout. Mirrors ReportNotTriggeredErrorState.next().
+                try:
+                    self.send_error(
+                        f"Error occurred for {self._report_schedule.type}:"
+                        f" {self._report_schedule.name}",
+                        str(ex),
+                    )
+                except Exception:  # noqa: BLE001  # pylint: 
disable=broad-except
+                    logger.warning(
+                        "Failed to send error notification for report schedule 
"
+                        "(execution %s)",
+                        self._execution_id,
+                        exc_info=True,
+                    )
+                finally:
+                    try:
+                        self.update_report_schedule_and_log(
+                            ReportState.ERROR,
+                            
error_message=REPORT_SCHEDULE_ERROR_NOTIFICATION_MARKER,
+                        )
+                    except ReportScheduleUnexpectedError:
+                        logger.warning(
+                            "Failed to log ERROR state for report schedule "
+                            "(execution %s) due to database issue",
+                            self._execution_id,
+                            exc_info=True,
+                        )

Review Comment:
   **Suggestion:** `REPORT_SCHEDULE_ERROR_NOTIFICATION_MARKER` is logged 
unconditionally in the alert exception path, even when `send_error()` fails. 
That marker is used by `ReportScheduleDAO.find_last_error_notification()` to 
detect that an error notification was sent, so this will incorrectly suppress 
later error notifications during grace-period checks. Set the logged 
`error_message` based on whether `send_error()` actually succeeded (marker only 
on success; otherwise log the real send failure message). [incorrect condition 
logic]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Error grace logic misfires after failed error notification.
   - ⚠️ Subsequent alert failures during grace send no email.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Configure an alert-type report schedule in the database so that it is 
executed via
   `AsyncExecuteReportScheduleCommand.run()` at 
`superset/commands/report/execute.py:48-77`,
   which constructs and runs `ReportScheduleStateMachine` at `execute.py:62-65`.
   
   2. Ensure the schedule's `last_state` is `SUCCESS` or `GRACE` so the state 
machine enters
   `ReportSuccessState.next()` (class defined at `execute.py:1015-1099`), and 
cause
   `AlertCommand(...).run()` in the ALERT branch at `execute.py:1025-1038` to 
raise an
   exception (e.g., by triggering a runtime/database error).
   
   3. In the same scenario, have `state.send_error(...)` in the ALERT exception 
handler at
   `execute.py:1039-1043` raise (e.g., email backend failure). The `except` 
block logs a
   warning at `execute.py:1045-1050`, and the `finally` block at 
`execute.py:1051-1063` still
   calls `update_report_schedule_and_log(ReportState.ERROR,
   error_message=REPORT_SCHEDULE_ERROR_NOTIFICATION_MARKER)`, recording the 
marker even
   though no error email was actually sent.
   
   4. On a subsequent scheduler run for the same schedule within its error 
grace period, the
   state machine will enter `ReportNotTriggeredErrorState.next()` at 
`execute.py:861-939`,
   which calls `self.is_in_error_grace_period()` at `execute.py:925-926`. That 
method at
   `execute.py:832-848` calls 
`ReportScheduleDAO.find_last_error_notification()` at
   `superset/daos/report.py:15-31`, which filters for logs where `error_message 
==
   REPORT_SCHEDULE_ERROR_NOTIFICATION_MARKER` (line 25-26). Because the marker 
was logged in
   step 3, `is_in_error_grace_period()` returns True and the `if not
   self.is_in_error_grace_period(): self.send_error(...)` block at 
`execute.py:925-933` is
   skipped, suppressing new error notifications even though the previous one 
never
   successfully went out.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=bbc3987b079f4835b9130ca377074f2a&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=bbc3987b079f4835b9130ca377074f2a&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/commands/report/execute.py
   **Line:** 1051:1063
   **Comment:**
        *Incorrect Condition Logic: `REPORT_SCHEDULE_ERROR_NOTIFICATION_MARKER` 
is logged unconditionally in the alert exception path, even when `send_error()` 
fails. That marker is used by 
`ReportScheduleDAO.find_last_error_notification()` to detect that an error 
notification was sent, so this will incorrectly suppress later error 
notifications during grace-period checks. Set the logged `error_message` based 
on whether `send_error()` actually succeeded (marker only on success; otherwise 
log the real send failure message).
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40657&comment_hash=b4c698b4994328f0a8a9077d005b0545687358813fe08d9c688b08819c7c324e&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40657&comment_hash=b4c698b4994328f0a8a9077d005b0545687358813fe08d9c688b08819c7c324e&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] fix(reports): guard SUCCESS-state report execution against duplicate sends and stuck WORKING state [superset]

Reply via email to