Re: [DISCUSS] Managing airflow database size and retention

Jarek Potiuk Mon, 29 Dec 2025 12:20:35 -0800

Wait.. Actually ... Stupid me!

I just realized setting the "unlogged" for the two tables means that you do
not have to do any partitioning at all ....... Because you will not have
all the WAL issues AT ALL.


So it's not really about making things "partitioning friendly" but simply
"mark those two tables as unlogged" and make sure that you handle recovery
when plug is pulled. No partitioning needed at all. Might be connected with
the "send audit logs somewhere".

So - we might not need **any** change to airflow, just a documentation
(maybe even unofficial blog of some sort) - describing how to mark those
two tables as unlogged and how to handle recovery in case of replica
failover.



On Mon, Dec 29, 2025 at 9:15 PM Jarek Potiuk <[email protected]> wrote:

> > But eg lookiups to log table might use ti key attrs .
>
> Yeah - that would be the price to pay. Trade-offs, trade-offs everywhere.
>
> Such queries over a multi-partition table would have to do full table scan
> - they will not be able to use indexes (unless the queries use partition
> key). So they would **work** but would be **slow** if the table gets too
> big.
> However I would argue that if someone would like to partition those
> tables, they will frequently and aggressively prune the old partitions
> (this is the reason why they want to have it) - which will effectively keep
> those tables "small" (relatively). And even if full table scan is used, if
> there is a question about task_id audit log, that would be fine.
>
> Actually - I strongly believe - and we had this discussion in the past -
> that the log table is quite a bit of abomination, because it's not a "true"
> audit log if it is kept in modifiable database, and anyone who want to make
> it a "true" audit log will have to effectively send those log entries to a
> "write-only" storage and query the audit logs there. And that nicely fits
> into "unlogged" pattern -> you could add a trigger in your DB to
> automatically send audit logs somewhere "write-only" and aggressively prune
> the old data (i.e. partitions).
>
> So - from the side of airflow that would mean that those tables are
> "partitioning friendly", but not to implement partitioning itself. Leaving
> "partitioning|" and "handling recovery" and "handling audit logs" up to the
> Deployment Manager. Which IMHO should happen anyway for "log" table and
> would be nice pattern to describe as a way to achieve "true" audit log.
>
> J,.
>
>
>
> On Mon, Dec 29, 2025 at 9:02 PM Daniel Standish via dev <
> [email protected]> wrote:
>
>> The problem with partitioning them is that when querying you would need to
>> know the partition.
>> But eg lookiups to log table might use ti key attrs .
>>
>> I would try just trimming log and job table in small batches frequently
>> using an appropriate index, before thinking about the complexity of
>> partitioning.
>>
>>
>>
>>
>>
>> On Mon, Dec 29, 2025 at 11:24 AM Jarek Potiuk <[email protected]> wrote:
>>
>> > I have a feeling that making those **two** tables partition-friendly
>> should
>> > be easy - and maybe that's all that we need. That would make it possible
>> > for you to use the partitioning, (where it would not be necessary for
>> > everyone).
>> > There might be a few ways of doing it without losing uniqueness. For
>> > example Job ID and audit log ID could  follow certain conventions - and
>> > always start with a partition key) thus providing uniqueness we need.
>> >
>> > Those two tables are pretty specific and neither job id nor log id
>> impact
>> > anything else in our data model.
>> >
>> > I think such a change to make those two tables "partition friendly"
>> could
>> > be accepted, but we could only say it after seeing a POC.
>> >
>> > J.
>> >
>> >
>> > On Mon, Dec 29, 2025 at 8:01 PM <[email protected]> wrote:
>> >
>> > > I do think that the second option is best, it is also what we wanted
>> to
>> > > do, the only reason we did not do that is because from our tests, sql
>> > > alchemy sometimes breaks as it expects certain constraints which are
>> not
>> > > there, mainly for update queries, select works well, if I am not
>> > mistaken,
>> > > there are 3 or 4 large tables, job being the largest (7 times larger
>> than
>> > > the second largest), the question is, will such a pull request be
>> > approved?
>> > >
>> > > As we do lose the unique constraint (as it becomes per partition),
>> though
>> > > it is a sequence that most likely won't repeat until the previous has
>> > been
>> > > deleted, but if not, we might query unrelated job or log data, and so
>> > > changes in the api server are also needed, creating the pr is not a
>> > > problem, the question is how beneficial will it be, as if it is done
>> to
>> > > those problematic tables, it means that the preferred way to manage
>> > > retention is from the db, and can be an optional alembic script.
>> > >
>> > > I do not want to just push a fix that will add more complexity than
>> the
>> > > benefit it will bring.
>> > >
>> > > If we do go the pr way, a separate discussion is probably needed to
>> > decide
>> > > how it should be done (most likely an additional airflow command to
>> turn
>> > on
>> > > or off the partitions and retention)
>> > >
>> > > Doing it as a custom solution has caused problems with sqlalchemy,
>> and we
>> > > do not want to do so as if later an alembic script relies on the
>> primary
>> > > key in some way, we will need to fix it manually and deal with the
>> > problems
>> > > it may cause when we update.
>> > >
>> > > > On 29 Dec 2025, at 21:36, Jarek Potiuk <[email protected]> wrote:
>> > > >
>> > > > Another option is that you ( could make ONLY for those two tables -
>> > > > partition-friendly.  And do not partition anything else.  I think
>> that
>> > > > **could** be possible - both have datetime fields that could be
>> used as
>> > > > partition keys - you would have to assess if you can do it as your
>> > > "custom"
>> > > > solution or whether it would require some changes to airflow models.
>> > But
>> > > I
>> > > > can't see foreign key problems if ONLY those two tables are
>> > partitioned,
>> > > so
>> > > > likely you could do it yourself in your DB.
>> > > >
>> > > > In this case - maybe "let's solve those tables that are
>> problematic" is
>> > > > easier to do than "let's apply partitioning to everything".
>> > > >
>> > > >> On Mon, Dec 29, 2025 at 7:31 PM Jarek Potiuk <[email protected]>
>> > wrote:
>> > > >>
>> > > >> Purely theoretically, you could change the log and job tables to be
>> > > >> unlogged - and thus avoid WAL for them.
>> > > >>
>> > > >> The drawback of this:
>> > > >>
>> > > >> * the data in those tables will be lost if you pull the plug or
>> kill
>> > -9
>> > > >> the primary postgres server
>> > > >> * the tables are not available (at all) in replicas - so in case
>> of a
>> > > >> fallback, you would have to have a manual data import/export for
>> those
>> > > >> tables on fail-over, rather than rely on replicas being "ready to
>> > > fallback
>> > > >> immediately". Or accept data loss.
>> > > >> * the data from those tables will not be present in backups
>> > > >>
>> > > >> I am not 100% sure, but I believe loss of data in both tables is
>> not
>> > > >> really catastrophic for Airflow, so maybe it's acceptable risk (but
>> > > likely
>> > > >> you should do a disaster-recovery test to see what happens and how
>> to
>> > > >> recover in case, indeed, someone pulls the plug on your postgres
>> > server.
>> > > >>
>> > > >> J,
>> > > >>
>> > > >>
>> > > >>
>> > > >>
>> > > >>> On Mon, Dec 29, 2025 at 7:16 PM Natanel <[email protected]>
>> > > wrote:
>> > > >>>
>> > > >>> Yes, the problem is the manual deletions, we have tried it, it
>> > > resulted in
>> > > >>> the same exact issue, as the scheduled db procedures which clean
>> up
>> > the
>> > > >>> rows marked as deleted actually get deleted, and so it takes up
>> > > storage,
>> > > >>> yet it does not solve the WAL problem, the problematic table is
>> > > actually
>> > > >>> not task_instance, it is relatively small, the log and job tables
>> are
>> > > the
>> > > >>> biggest tables (the problem with them is the primary key change
>> > > required),
>> > > >>> by a multiple of 10 (or more, cluster dependant).
>> > > >>>
>> > > >>> The smaller batches might solve the issue, however, it seems to
>> just
>> > > delay
>> > > >>> the problem a little rather than solve it, as deleting data with a
>> > > delete
>> > > >>> query (especially a lot of data) is not a very light operation,
>> and
>> > so
>> > > I
>> > > >>> think that this is the main issue.
>> > > >>>
>> > > >>> It would be nice if we could use partitions instead, as it is a
>> > lighter
>> > > >>> operation, and does not require us to maintain a query and manage
>> our
>> > > db,
>> > > >>> I
>> > > >>> have thought about changing the models, most of the changes are
>> > > relatively
>> > > >>> simple, for some it is just removing the foreign key and relying
>> on
>> > ORM
>> > > >>> level constraints, for others, it requires adding a pre query to
>> have
>> > > the
>> > > >>> same constrains but I do not like that idea, maybe there is
>> another
>> > > way to
>> > > >>> make airflow "partition-friendly"?
>> > > >>>
>> > > >>> I can't think of a nice way to do so, maybe it does not exist, as
>> the
>> > > db
>> > > >>> clean is as simple as a delete query gets, yet when there is a
>> lot of
>> > > >>> data,
>> > > >>> it is all duplicated in WALs.
>> > > >>>
>> > > >>> On Mon, Dec 29, 2025, 19:40 Daniel Standish via dev <
>> > > >>> [email protected]>
>> > > >>> wrote:
>> > > >>>
>> > > >>>> Have you looked at doing manual deletions?  I.e. writing your own
>> > sql?
>> > > >>>>
>> > > >>>> The db clean command is probably not "optimal" for all scenarios.
>> > > >>>>
>> > > >>>> So for example, if the main problem table for you is
>> task_instance,
>> > > you
>> > > >>>> could periodically delete TI records in smaller batches using
>> some
>> > > >>>> appropriate index (whether it exists now or you add it).  Then
>> maybe
>> > > you
>> > > >>>> would not stress the db as hard.
>> > > >>>>
>> > > >>>> Airflow isn't designed to use partitions so, you may not get good
>> > > >>> results
>> > > >>>> with that approach.
>> > > >>>>
>> > > >>>>
>> > > >>>>
>> > > >>>> On Mon, Dec 29, 2025 at 7:32 AM Natanel <[email protected]
>> >
>> > > >>> wrote:
>> > > >>>>
>> > > >>>>> Hello everyone, after having issues with the 'airflow db clean'
>> > > >>> command,
>> > > >>>>> where due to the amount of dags and tasks that are running every
>> > day
>> > > >>> in
>> > > >>>> our
>> > > >>>>> deployments, we get a lot of new data every day, which is
>> stored in
>> > > >>> the
>> > > >>>>> database, and when we delete the data, due to the way PGSQL
>> works,
>> > > the
>> > > >>>>> WAL's get replicated to both the archive storage and the main
>> data
>> > > >>>> storage
>> > > >>>>> of the db instance, which in turn, causes a significant jump in
>> cpu
>> > > >>>> usage,
>> > > >>>>> ram usage and disk usage, whenever we run the command, which
>> causes
>> > > >>> all
>> > > >>>>> kinds of issues, we even had it once fill up the db storage, and
>> > > >>> causing
>> > > >>>>> the database to be unresponsive, forcing us to move to our
>> backup
>> > > >>>> database,
>> > > >>>>> after we haven't ran the command for a few months due to human
>> > error.
>> > > >>>>>
>> > > >>>>> As of now, I know that this is the accepted and widely used way
>> of
>> > > >>>> managing
>> > > >>>>> the airflow database's size, however, we noticed that it may
>> cause
>> > > >>> issues
>> > > >>>>> in certain cases, just like in our case, where if the db has not
>> > been
>> > > >>>>> cleaned up for a while, cleaning it can be problematic.
>> > > >>>>>
>> > > >>>>> We decided to try and partition the table, and use pgsql's
>> built in
>> > > >>>>> retention of partitions, which does not issue a DELETE query,
>> and
>> > is
>> > > >>>>> lighter and faster, while being simpler to use, however, we have
>> > > >>>>> encountered issues due to having Foreign Key constraints in some
>> > > >>> tables,
>> > > >>>>> having to duplicate such keys and other than forcing code
>> changes
>> > (as
>> > > >>> the
>> > > >>>>> foreign key must include the partitioned key, as the partitioned
>> > key
>> > > >>> must
>> > > >>>>> be part of the primary key), while also having the issue of
>> > > sqlalchemy
>> > > >>>>> breaking once we change the primary key, with the addition of
>> the
>> > > >>>>> constraints on the primary key breaking.
>> > > >>>>>
>> > > >>>>> And in Mysql, due to the foreign keys, it is not possible to
>> > > partition
>> > > >>>>> tables which include them, as it is not supported yet
>> (according to
>> > > >>> this
>> > > >>>>> <
>> > > >>>>>
>> > > >>>>
>> > > >>>
>> > >
>> >
>> https://dev.mysql.com/doc/refman/8.4/en/partitioning-limitations-storage-engines.html
>> > > >>>>>>
>> > > >>>>> ).
>> > > >>>>>
>> > > >>>>> Has anyone else tried to use the databases built in partition
>> > > >>> retention
>> > > >>>>> system instead of the 'airflow db clean' command?
>> > > >>>>>
>> > > >>>>> Thanks, Natanel.
>> > > >>>>>
>> > > >>>>
>> > > >>>
>> > > >>
>> > >
>> > > ---------------------------------------------------------------------
>> > > To unsubscribe, e-mail: [email protected]
>> > > For additional commands, e-mail: [email protected]
>> > >
>> > >
>> >
>>
>

Re: [DISCUSS] Managing airflow database size and retention

Reply via email to