Daniel Standish writes:

> I have, however, heard reports of it taking a very long time and 
> sometimes failing to complete when the db table is quite large (and 
> perhaps pushing the capabilities of the db instance), likely due to 
> statement timeout.  When the process fails, it will probably be in 
> the delete step, after the archive table has been created.  In that 
> case the negative consequence is that you now have an archive table 
> (of the records that were to be deleted) while all the records 
> remain in the main table.  You can drop the archive, increase the 
> statement timeout, and try again.  Better, though, to batch the 
> cleanup: start with an older "delete before" timestamp and 
> incrementally make it more recent.  This yields smaller batches.

Thanks Daniel -- this is helpful insight and performing cleanups in
small batches sounds like good advice.
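For anyone else following along, the batching Daniel describes could be
scripted roughly like this. This is only a sketch under assumptions: the
date range and weekly step size are illustrative, and it shells out to the
`airflow db clean` CLI (available in Airflow 2.3+); run it with
`dry_run=True` first to see what each batch would delete.

```python
# Hypothetical batching sketch: advance a "clean before" cutoff in small
# steps so each `airflow db clean` run deletes a bounded slice of rows.
# The dates, step size, and dry-run default are assumptions, not anything
# prescribed by the thread.
from datetime import datetime, timedelta
import subprocess


def batch_cutoffs(oldest, newest, step=timedelta(days=7)):
    """Yield cutoff timestamps from `oldest` up to `newest`, `step` apart."""
    cutoff = oldest
    while cutoff < newest:
        cutoff += step
        yield min(cutoff, newest)


def clean_in_batches(oldest, newest, dry_run=True):
    """Run `airflow db clean` once per cutoff, oldest slice first."""
    for cutoff in batch_cutoffs(oldest, newest):
        cmd = [
            "airflow", "db", "clean",
            "--clean-before-timestamp", cutoff.isoformat(),
            "--yes",  # skip the interactive confirmation prompt
        ]
        if dry_run:
            cmd.append("--dry-run")  # report what would be deleted, delete nothing
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Illustrative range: clean Jan-Feb 2023 history in weekly slices.
    clean_in_batches(datetime(2023, 1, 1), datetime(2023, 3, 1))
```

Because each run only deletes one step's worth of rows beyond the previous
cutoff, a statement-timeout failure loses at most one small batch rather
than the whole cleanup.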

> As for gotchas, the one that comes to mind: say you did this prior 
> to an upgrade, did not realize you had put in the wrong date until 
> after the upgrade, and the upgrade ran migrations on a table that 
> you had cleaned.

Aha, well noted. It sounds like the best approach here is to perform any
upgrades (and associated migrations) /prior/ to doing a cleanup ... and
for any archives we keep around, to note the Airflow version from which
they were exported.

I appreciate your help!

Ben

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@airflow.apache.org
For additional commands, e-mail: users-h...@airflow.apache.org
