potiuk commented on issue #17127: URL: https://github.com/apache/airflow/issues/17127#issuecomment-884857501
We support Scheduler HA (running more than one scheduler): https://airflow.apache.org/docs/apache-airflow/stable/concepts/scheduler.html?highlight=scheduler%20ha#running-more-than-one-scheduler - our Scheduler runs in Active/Active mode (which means that both schedulers are parsing DAGs at the same time). This is supported on MySQL 8+ and *should* work (of course there might be some edge cases, but we tested it and it generally works).

This is of course very different from Database HA, which is outside the realm of Airflow and is handled by your deployment. From the very beginning, we have developed Airflow 2 with the assumption that the Database runs at most in Active/Passive mode.

The comment from #14788 indicated that someone had a similar problem when running the DB in Active/Active mode, and that switching to talk directly to only one physical DB helped there. So my assumption was that one of the reasons is that you have a similar setup. Also - we've seen similar problems with various proxies that provided a kind of poor man's DB HA, where the proxy had several physical DB clusters behind it. Our Scheduler HA relies heavily on Database locking, and locking is a hard problem to solve in an Active/Active setup. That leads to the suggestion that this might be a similar case for you.

If it is not, and you are 100% sure that you have a single physical DB behind your connection, then the problem needs deeper investigation. It will take quite some time to resolve, and possibly some iterations here to find the reason (because we have not seen it in our tests). So if you are 100% sure that you do not have multiple DBs being accessed at the same time (even if a single proxy is used), then my advice is to switch to Postgres, as it might take quite a lot of time to find the cause (we've seen this in the past - sometimes people used customized versions of the databases with some functionality disabled, for example).
Postgres is much more stable and less configurable (MySQL, for example, can have multiple engines with different capabilities), and there might be many other reasons why MySQL (especially a custom-configured one) creates problems. Unfortunately, we in the community have no capacity to investigate individual users' cases deeply, so unless you have the time and capacity to investigate it yourself and provide more information, I am afraid it might take quite some time to even reproduce the kind of problem you have. Going with Postgres is a much more "certain" route, and if timing matters to you, I'd heartily recommend it.
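
To illustrate the locking point above, here is a minimal toy sketch of why a lock only protects you when every scheduler talks to the same physical DB. It is not Airflow code and it does not use any real database: `PhysicalDB`, `try_lock_row`, `schedulers_claiming`, and the scheduler and row names are all hypothetical, and the model deliberately ignores whatever coordination a real multi-primary cluster or proxy might attempt.

```python
class PhysicalDB:
    """Toy stand-in for one physical database node that tracks row locks."""

    def __init__(self, name):
        self.name = name
        self._row_locks = {}  # row id -> name of the scheduler holding the lock

    def try_lock_row(self, row_id, owner):
        """Return True if `owner` obtained the lock for `row_id` on this node."""
        if row_id in self._row_locks:
            return False
        self._row_locks[row_id] = owner
        return True


def schedulers_claiming(row_id, routes):
    """Return the schedulers that believe they hold the lock on `row_id`.

    `routes` maps a scheduler name to the PhysicalDB its connection ends up on.
    """
    return [name for name, db in routes.items() if db.try_lock_row(row_id, name)]


# Case 1: both schedulers reach the same physical DB, so only one gets the lock.
primary = PhysicalDB("primary")
one_db = schedulers_claiming(
    "dag_run_1", {"scheduler_a": primary, "scheduler_b": primary}
)

# Case 2: a proxy routes each scheduler to a different physical DB.
# Each node grants "its" lock, so both schedulers think they own the row.
node_1, node_2 = PhysicalDB("node_1"), PhysicalDB("node_2")
two_dbs = schedulers_claiming(
    "dag_run_1", {"scheduler_a": node_1, "scheduler_b": node_2}
)

print(one_db)   # ['scheduler_a']
print(two_dbs)  # ['scheduler_a', 'scheduler_b']
```

In the second case, two schedulers would act on the same row at the same time, which is the kind of conflict that talking directly to a single physical DB avoids.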
