ferruzzi commented on issue #43989: URL: https://github.com/apache/airflow/issues/43989#issuecomment-2479743865
Yeah, the more I think about this, the more I think I am on the right track. I don't have much experience with db design, so I want to bounce it off a couple of people to verify before I/we commit to it, but yes, based on your replies I think this is the right approach.

- A new table called `deadlines` with three columns: `id`, `id_type`, and `deadline`. It has a primary key on `id` so we can look up and drop by id, and an index on `deadline` so we can easily get `MIN(deadline)`.
- `dag_processor` will calculate the deadline and add a row when the dagrun is created.
- A call in the scheduler loop will query `SELECT MIN(deadline) AS earliest_deadline FROM deadlines;` and compare that to now() to see if any deadline has passed. If one has, then call the DeadlinesHandler.
- The DeadlinesHandler will get all deadlines that have passed, queue their callbacks, and drop them from the table so they are not queued again on the next pass.
- We will need to make sure we can fetch the callback given the `id` and `id_type`, OR store the callback in the table directly and save that lookup. I think the callback will already be stored in the dagrun table, so we should fetch it from there.
- Add logic to the dagrun cleanup/exit code to the effect of "if this dagrun has a Deadline then try to drop it from the table". We "try" because if the deadline passed then it would have been dropped already.

---

I don't understand this part of the question:

```
Also curious on what we imagined the relationship would look like, quick gut check on my understanding of the relationship of dag dagrun task taskinstance would it roughly look something like this?

DeadlineEntry -> DagRun: Optional reference
DeadlineEntry -> TaskInstance: Optional reference
DeadlineEntry -> DAG: Optional reference (via dag_id)
```
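To make the proposed flow concrete, here is a rough sketch of the `deadlines` table, the scheduler-loop check, and the handler. It is only an illustration under several assumptions: it uses an in-memory SQLite database rather than Airflow's metadata DB, stores deadlines as epoch seconds, and every function name (`create_schema`, `add_deadline`, `scheduler_pass`, `handle_missed_deadlines`, `drop_deadline`) as well as the `queue_callback` argument is hypothetical and not existing Airflow code.

```python
import sqlite3


def create_schema(conn):
    # Three columns as proposed: primary key on id, index on deadline.
    conn.execute(
        "CREATE TABLE deadlines ("
        " id TEXT PRIMARY KEY,"
        " id_type TEXT NOT NULL,"
        " deadline REAL NOT NULL)"  # assumption: epoch seconds
    )
    conn.execute("CREATE INDEX idx_deadlines_deadline ON deadlines (deadline)")


def add_deadline(conn, id_, id_type, deadline):
    # Would be called when the dagrun is created.
    conn.execute(
        "INSERT INTO deadlines (id, id_type, deadline) VALUES (?, ?, ?)",
        (id_, id_type, deadline),
    )


def earliest_deadline(conn):
    # MIN() over an empty table yields a single row containing NULL (None).
    row = conn.execute(
        "SELECT MIN(deadline) AS earliest_deadline FROM deadlines"
    ).fetchone()
    return row[0]


def handle_missed_deadlines(conn, now, queue_callback):
    # Fetch every passed deadline, queue its callback, then drop the rows
    # so they are not queued again on the next pass.
    missed = conn.execute(
        "SELECT id, id_type FROM deadlines WHERE deadline <= ? ORDER BY deadline",
        (now,),
    ).fetchall()
    for id_, id_type in missed:
        queue_callback(id_, id_type)
    conn.executemany(
        "DELETE FROM deadlines WHERE id = ?", [(id_,) for id_, _ in missed]
    )
    return missed


def scheduler_pass(conn, now, queue_callback):
    # Cheap check in the scheduler loop; only call the handler if needed.
    earliest = earliest_deadline(conn)
    if earliest is not None and earliest <= now:
        return handle_missed_deadlines(conn, now, queue_callback)
    return []


def drop_deadline(conn, id_):
    # "Try" to drop: returns False if the row was already removed.
    cur = conn.execute("DELETE FROM deadlines WHERE id = ?", (id_,))
    return cur.rowcount > 0
```

In this sketch `drop_deadline` simply reports whether a row was deleted, which matches the "try to drop it" behavior on dagrun cleanup: a deadline that already fired is gone, and that is not treated as an error.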
