JonnyWaffles opened a new issue #21654:
URL: https://github.com/apache/airflow/issues/21654


   ### Description
   
   Hi team,
   
   First I want to thank you for all that you do! This feature is better 
described by a business scenario
   
   > I expect to execute a dag that process an event, N number times during a 
schedule window (quarterly, monthly, etc), I don't know or care when they are 
processed, but I want to be warned if I am nearing the window's interval end 
and the N number of executions have not yet been successful.
   
   Naming this feature request is tricky, as is searching for it, so you may 
have already considered it. If so please point me in the right direction. 
Essentially, we set up a trigger system that executes unscheduled dags 
"manually" for each event type. Whenever the event occurs, an event specific 
dag is triggered to handle the event. But now, we need a way to enforce and 
notify if these "ad hoc" jobs are approaching the end of their generic window. 
For example, "I expect 3 executions for event A in a quarter". I don't think 
such a convention exists. An `ExternalTaskSensor` requires knowing the exact 
`logical_date` for your target dag or task, but we wouldn't know the exact time 
an event would occur.
   
   Therefore, we created a custom `ExternalDagDateRangeSensor` that uses a date 
range instead of specific logical_dates, and polls between the current and next 
`data_interval_end` (ie during the run time of the parent monitoring dag) for a 
success count over threshold.
   
   Then we set up a parent "monitoring" dag for each known interval (quarterly, 
monthly, weekly) and add `ExternalDagDateRangeSensor` tasks for each observed 
target dag with a timeout matching the interval's duration. I'd submit a PR, 
but our implementation described above is pretty wonky, but it kind of works 
for our needs.
   
   - Is there a better way to enforce "loose" or "variable" SLAs like "may 
occur any time in a month but has to occur twice?"
   - Is this something the community would be interested in?
   
   Lastly, as I understand it you can't send out multiple SLA miss 
notifications as you get close and closer to the timeout window. For now we 
just allow users to set a timeout offset, ie enter SLA miss state 2 days before 
the window closes. A more robust solution may be to create additional sensors 
that resume where the previous failed and thus can "renter" the SLA miss state 
with a more severe alarms as the window closes!
   
   
   
   
   
   ### Use case/motivation
   
   _No response_
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to