potiuk edited a comment on issue #7911: URL: https://github.com/apache/airflow/issues/7911#issuecomment-898901416
> If I get some guidance on how/where to start I could try to do it

I think a good start is to take a look at the quite popular maintenance DAGs here: https://github.com/teamclairvoyant/airflow-maintenance-dags. This is a set of third-party DAGs that people use for maintenance tasks (such as `db-cleanup`). We do not know how "correct" they are or how well they cope with newer Airflow versions, but they can give an idea of how users deal with the problem.

It might be a good idea to start from that and work out an approach (other than DAGs) that implements something similar in Airflow as a periodic job. The long-term plan is to not allow tasks to talk to the DB directly, so the DAG approach would not work in that case.

Personally, I think this should start with at least a discussion on the devlist or (maybe even better) a new AIP (Airflow Improvement Proposal - https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals ) as a result of this discussion. There are many ways it can be done, but it needs a proposal and quite extensive discussion: the performance consequences, where such a cleanup should run, whether it should be a separate process or run within the scheduler, how to deal with multiple schedulers if we choose a scheduler-embedded solution, and so on. It's actually quite an extensive topic.
