[GitHub] [airflow] Taragolis commented on issue #33647: Airflow Triggerer facing frequent restarts


Taragolis commented on issue #33647:
URL: https://github.com/apache/airflow/issues/33647#issuecomment-1716320225

First of all, I would recommend considering an upgrade to a newer version of
MySQL: [5.7 is almost EOL](https://endoflife.date/mysql). Even if Amazon keeps
supporting MySQL 5.7 on Aurora, there is a big chance that Airflow will stop
supporting MySQL 5.7 in versions released after **31 Oct 2023**. That would
mean further improvements to the triggerer would not be available to you. In
addition, 8.0 should provide a better query analyser/planner. Just make sure
that you test the migration on a snapshot of the DB before doing this on the
prod database.

---

Anyway, I inspected the data transfers between the Triggerer and TriggerJob; it
might help someone (maybe me) who wants to optimise this:
1. [Load Triggers](https://github.com/apache/airflow/blob/8918b435be8c683bbd6bb2ffa871dbd31d476f48/airflow/jobs/triggerer_job_runner.py#L374-L378)
2. [All associated IDs with current Triggerer](https://github.com/apache/airflow/blob/87b08ad0840a11d8cd5c0b5043d3a341b1a8f258/airflow/models/trigger.py#L200)
3. [Update Triggers](https://github.com/apache/airflow/blob/8918b435be8c683bbd6bb2ffa871dbd31d476f48/airflow/jobs/triggerer_job_runner.py#L641)
4. [Bulk Load Triggers](https://github.com/apache/airflow/blob/87b08ad0840a11d8cd5c0b5043d3a341b1a8f258/airflow/models/trigger.py#L99-L110)
   - This query might become a problem with a huge input dataset
5. Put the data into the different deques
It seems that steps 1-4 could be executed as a single query. That would add
some overhead in the amount of data fetched, but it might reduce execution time
on the DB side; however, it would require additional filtering on the client
(Airflow) side.
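
To make the idea concrete, here is a rough sketch of a single-query approach
with client-side filtering. It is not Airflow code: it uses the standard
library `sqlite3` module, and the table name `demo_trigger`, its columns, and
the function `plan_trigger_updates` are all hypothetical names chosen for
illustration.

```python
import sqlite3
from collections import deque


def plan_trigger_updates(conn, triggerer_id, running_ids):
    """Fetch every trigger row assigned to one triggerer in a single query,
    then decide on the client which triggers to create and which to cancel.

    `demo_trigger` and its columns are assumptions made for this sketch.
    """
    # One round trip: full rows for all assigned triggers, including those
    # that are already running (this is the extra data overhead).
    rows = conn.execute(
        "SELECT id, classpath, kwargs FROM demo_trigger "
        "WHERE triggerer_id = ? ORDER BY id",
        (triggerer_id,),
    ).fetchall()
    assigned = {row[0]: row for row in rows}

    # Client-side filtering: assigned but not yet running -> create.
    to_create = deque(
        row for trigger_id, row in assigned.items()
        if trigger_id not in running_ids
    )
    # Running but no longer assigned -> cancel.
    to_cancel = deque(sorted(set(running_ids) - set(assigned)))
    return to_create, to_cancel
```

The trade-off is the one described above: rows for already-running triggers
are transferred and then discarded, in exchange for fewer round trips to the
database.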

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
