Not yet. I was just worried about the future possibility. Our environment is on AWS, so I wanted to keep the database as small as I can.
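
For reference, Max's suggestion to archive by `start_date` could look roughly like the sketch below. It is only an illustration under assumptions: it uses an in-memory SQLite database as a stand-in for our MySQL metadata db, a simplified `task_instance` schema (the real table has more columns), a hypothetical `task_instance_archive` table, and an arbitrary 90-day retention window.

```python
import sqlite3
from datetime import datetime, timedelta


def archive_old_task_instances(conn, cutoff):
    """Move task_instance rows that started before `cutoff` into an archive table.

    Filtering on start_date (not execution_date) leaves recent backfills alone.
    Rows that have not started yet (start_date IS NULL) never match the filter.
    Returns the number of rows moved.
    """
    cutoff_str = cutoff.strftime("%Y-%m-%d %H:%M:%S")
    with conn:  # one transaction: the copy and the delete commit together
        conn.execute(
            "INSERT INTO task_instance_archive "
            "SELECT * FROM task_instance WHERE start_date < ?",
            (cutoff_str,),
        )
        cur = conn.execute(
            "DELETE FROM task_instance WHERE start_date < ?",
            (cutoff_str,),
        )
    return cur.rowcount


def demo():
    """Build a tiny stand-in database and archive rows older than 90 days."""
    conn = sqlite3.connect(":memory:")
    # Simplified, assumed schema; not the full Airflow table definition.
    cols = "(task_id TEXT, dag_id TEXT, execution_date TEXT, start_date TEXT, state TEXT)"
    conn.execute("CREATE TABLE task_instance " + cols)
    conn.execute("CREATE TABLE task_instance_archive " + cols)
    rows = [
        # Old run: started well before the cutoff.
        ("t1", "d1", "2017-01-01 00:00:00", "2017-01-02 00:00:00", "success"),
        # Backfill: old execution_date but recent start_date, so it must stay.
        ("t2", "d1", "2016-01-01 00:00:00", "2017-05-17 00:00:00", "success"),
        # Not started yet: start_date is NULL, so it must stay.
        ("t3", "d1", "2017-05-18 00:00:00", None, "queued"),
    ]
    conn.executemany("INSERT INTO task_instance VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()

    cutoff = datetime(2017, 5, 18) - timedelta(days=90)
    moved = archive_old_task_instances(conn, cutoff)
    remaining = [r[0] for r in conn.execute(
        "SELECT task_id FROM task_instance ORDER BY task_id")]
    archived = [r[0] for r in conn.execute(
        "SELECT task_id FROM task_instance_archive ORDER BY task_id")]
    return moved, remaining, archived
```

Calling `demo()` moves only the row that started before the cutoff; the backfilled row and the not-yet-started row stay in `task_instance`. Against a real MySQL database the same two statements would probably need to run in batches to avoid long locks, which this sketch does not attempt.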

2017-05-18 8:58 GMT+09:00 George Leslie-Waksman <[email protected]>:

> We're sitting at over 2.4M task instances in our metadata db without much
> trouble. Have you seen substantial performance degradation or are you just
> worried about the future possibility?
>
> On Wed, Apr 19, 2017 at 12:23 PM Maxime Beauchemin <[email protected]> wrote:
>
> > You can archive the `job` and `task_instance` tables; the scheduler won't
> > try to backfill them as their respective DagRuns are not in a `running`
> > state. The scheduler only tries to schedule active DagRuns, and only
> > creates new [active] DagRuns forward from the latest one.
> >
> > Note that the criteria to archive `task_instance` should be based on
> > `start_date` and not `execution_date` as you don't want the archiving to
> > interfere with backfills or anything ongoing.
> >
> > Max
> >
> > On Wed, Apr 19, 2017 at 5:41 AM, Yongjun Park <[email protected]> wrote:
> >
> > > Hi folks.
> > >
> > > I have a question about task instances.
> > >
> > > Is it possible to delete old task instances that have run successfully?
> > > Won't the scheduler try to backfill the missing tasks?
> > >
> > > I have about 1,500 DAGs and the number is growing. There are about 300
> > > thousand task instances currently, and about 10,000 new task instances
> > > are created every day. That adds up to about 3.6 million rows in the
> > > MySQL table in a year.
> > >
> > > I am concerned that the table which stores task instances will grow
> > > large enough to cause performance degradation.
> > >
> > > How can I keep the table which stores task instances from becoming
> > > bloated?
> > >
> > > Thanks,
> > > Yongjun
