Ohashiro commented on code in PR #44696:
URL: https://github.com/apache/airflow/pull/44696#discussion_r1875682993
##########
providers/src/airflow/providers/microsoft/azure/operators/powerbi.py:
##########
@@ -126,7 +124,11 @@ def execute_complete(self, context: Context, event:
dict[str, str]) -> Any:
Relies on trigger to throw an exception, otherwise it assumes
execution was successful.
"""
if event:
- if event["status"] == "error":
+ if (
+ event["status"] == "error"
Review Comment:
We investigated the codebase a little and we would like your opinion on the
best way to handle this "status" field.
Here is what happens when a dataset refresh is triggered:
- The PowerBiTrigger triggers a new dataset refresh and gets a `refreshId`
from the API response. Then it sends several requests through the powerBi hook
to fetch the refresh status
- The Powerbi hook requests the Microsoft API to get the recent refresh
history. The API responds with a 200 status code and a list of refresh
histories ([example of
response](https://learn.microsoft.com/en-us/rest/api/power-bi/datasets/get-refresh-history#failed-refresh-example)),
each corresponding to a refreshId and with different possible "statuses":
Failed, Completed, Disabled, etc. corresponding to the dataset refresh status.
- The hook should return the refresh status for the given refreshId (hence
Failed, Completed, Disabled, ...). But it can instead raise an exception, for
ex if one of the request to the API fails, if the hook gets an empty refresh
history, or if the list does not contain the expected refreshId
- The trigger gets the hook response. If it's an exception, it yields an
event with `"status": "error"`. Otherwise, it yields an event with `status` the
refresh status (ex: Failed, Completed, ...)
Therefore, the operator receives an event with a "status" field
corresponding to 2 different types of status: either "error" (mostly
corresponding to a hook exception) or "Failed"/"Completed"/"Disabled"
(corresponding to the refresh status).
As for now, the task fails only if the status is "error". We would like to
make it also fail when the refresh failed (ex: status is "Failed").
We see different solutions:
- we keep both types of statuses in the `status` field (same as now)
- We can add a field `ERROR` to the class
`PowerBIDatasetRefreshStatus`, or create another class eg
`PowerBIDatasetRefreshOperationStatus` with both the API statuses (Completed,
Failed, etc.) and the "error" status
- We keep only one `status` field, corresponding to `"Completed"` (or
"success" or else) when the refresh was successful, `"error"` in any other
situation.
- When the trigger detects a "failing refresh status" (ex: Failed,
Disabled), instead of yielding an event with status = refresh status, it yields
an event with status = "error" (the event message contains the refresh status).
- We divide the information in 2 fields: a field `dataset_refresh_status`
(with possible values being the statuses returned by Microsoft API: Completed,
Failed, etc.) and a field `status` (or another name, with value "error" if the
trigger yields an error event)
I think the first solution has less impact, but maintains 2 types of
information hidden in the same field.
What do you think is best?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]