potiuk edited a comment on issue #7911:
URL: https://github.com/apache/airflow/issues/7911#issuecomment-898901416


   > If I get some guidance on how/where to start I could try to do it
   
   I think a good start is to take a look at the quite popular maintenance 
DAGs here: https://github.com/teamclairvoyant/airflow-maintenance-dags . This 
is a set of 3rd-party maintenance DAGs that people use for various kinds of 
maintenance (for example `db-cleanup`). We do not know how "correct" they are 
or how well they cope with newer Airflow versions, but they can give an idea 
of how users deal with the problem.
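   
   To make the idea concrete, here is a minimal sketch of the kind of 
retention-based cleanup such a job performs. It is only an illustration: the 
`task_log` table, the `created_at` column, the batch size and the in-memory 
SQLite database are all assumptions chosen to keep the example self-contained. 
It is not Airflow's schema and not the code of the linked DAGs.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def purge_older_than(conn, cutoff_iso, batch_size=1000):
    """Delete rows of the hypothetical `task_log` table older than the cutoff.

    Rows are removed in batches so that each transaction stays short.
    Returns the total number of deleted rows.
    """
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM task_log WHERE id IN ("
            "SELECT id FROM task_log WHERE created_at < ? LIMIT ?)",
            (cutoff_iso, batch_size),
        )
        conn.commit()
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total


def demo():
    # In-memory SQLite stands in for the metadata database (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_log (id INTEGER PRIMARY KEY, created_at TEXT)"
    )
    now = datetime(2021, 8, 14, tzinfo=timezone.utc)
    # One row per day for the last 100 days; ISO strings sort chronologically.
    rows = [((now - timedelta(days=d)).isoformat(),) for d in range(100)]
    conn.executemany("INSERT INTO task_log (created_at) VALUES (?)", rows)

    # Keep the last 30 days, delete everything older.
    cutoff = (now - timedelta(days=30)).isoformat()
    deleted = purge_older_than(conn, cutoff, batch_size=25)
    remaining = conn.execute("SELECT COUNT(*) FROM task_log").fetchone()[0]
    conn.close()
    return deleted, remaining
```

   Calling `demo()` runs the purge against the throwaway database and returns 
the number of deleted and remaining rows. A real implementation would also 
have to consider which tables are safe to prune and how the deletes affect a 
running scheduler.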
   
   I think it might be a good idea to start from that and work out an 
approach (other than DAGs) for implementing something similar in Airflow as a 
periodic job. The long-term plan is to not allow tasks to talk to the DB 
directly, so the DAG approach would not work in that case.
   
   Personally, I think this should start with at least a discussion on the 
devlist or (maybe even better) a new AIP (Airflow Improvement Proposal, 
https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals
 ) as a result of that discussion.
   
   I think there are many ways it can be done, but it needs a proposal and 
quite extensive discussion: the performance consequences, where such a 
cleanup should run, whether it should be a separate process or run within 
the scheduler, how to deal with multiple schedulers if we choose a 
scheduler-embedded solution, and so on. It is actually quite an extensive 
topic.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

