[ https://issues.apache.org/jira/browse/AIRFLOW-108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Imberman updated AIRFLOW-108:
------------------------------------
    Description: 
Airflow's DB currently holds the entire history of all executions for all time. 
This is problematic as the DB grows. The UI starts to get slower, and the DB's 
disk usage grows. There is no bound to how large the DB will grow.

It would be useful to add a feature in Airflow to do two things:
 # Delete old data from the DB
 # Set a low watermark; DAG executions older than it are ignored

For example, (1) would allow Airflow to delete all data prior to January 1, 
2015, and (2) would allow you to tell the scheduler to "ignore all data older 
than one year".

  was:
Airflow's DB currently holds the entire history of all executions for all time. 
This is problematic as the DB grows. The UI starts to get slower, and the DB's 
disk usage grows. There is no bound to how large the DB will grow.

It would be useful to add a feature in Airflow to do two things:

# Delete old data from the DB
# Mark some lower watermark, past which DAG executions are ignored

For example, (2) would allow you to tell the scheduler "ignore all data prior 
to a year ago". And (1) would allow Airflow to delete all data prior to January 
1, 2015.


> Add data retention policy to Airflow
> ------------------------------------
>
>                 Key: AIRFLOW-108
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-108
>             Project: Apache Airflow
>          Issue Type: Wish
>          Components: database
>            Reporter: Chris Riccomini
>            Priority: Major
>
> Airflow's DB currently holds the entire history of all executions for all 
> time. This is problematic as the DB grows. The UI starts to get slower, and 
> the DB's disk usage grows. There is no bound to how large the DB will grow.
> It would be useful to add a feature in Airflow to do two things:
>  # Delete old data from the DB
>  # Set a low watermark; DAG executions older than it are ignored
> For example, (1) would allow Airflow to delete all data prior to January 1, 
> 2015, and (2) would allow you to tell the scheduler to "ignore all data older 
> than one year".
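
As a rough illustration of the two behaviours requested above, here is a minimal sketch using an in-memory SQLite database. It is not Airflow code: the `runs` table, its columns, and the helper names `make_demo_db`, `purge_before`, and `runs_since` are all assumptions made up for this example, and a real implementation would have to cover Airflow's actual metadata tables.

```python
import sqlite3
from datetime import datetime


def make_demo_db():
    """Create an in-memory table standing in for execution history (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE runs (dag_id TEXT, execution_date TEXT)")
    conn.executemany(
        "INSERT INTO runs VALUES (?, ?)",
        [
            ("example", "2014-06-01T00:00:00"),
            ("example", "2015-03-01T00:00:00"),
            ("example", "2016-03-01T00:00:00"),
        ],
    )
    conn.commit()
    return conn


def purge_before(conn, cutoff):
    """(1) Delete rows older than `cutoff` and return how many were removed."""
    # ISO 8601 strings of the same layout compare correctly as plain text.
    cur = conn.execute(
        "DELETE FROM runs WHERE execution_date < ?", (cutoff.isoformat(),)
    )
    conn.commit()
    return cur.rowcount


def runs_since(conn, watermark):
    """(2) Return execution dates at or after `watermark`; older rows stay stored but are ignored."""
    cur = conn.execute(
        "SELECT execution_date FROM runs WHERE execution_date >= ? "
        "ORDER BY execution_date",
        (watermark.isoformat(),),
    )
    return [row[0] for row in cur]
```

In this sketch, `purge_before(conn, datetime(2015, 1, 1))` corresponds to deleting all data prior to January 1, 2015, while `runs_since` only filters what is read and leaves the stored rows untouched. Keeping the two operations separate means the watermark can be changed without losing data, whereas a purge is irreversible.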



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
