100% with Kaxil on this one. On Wed, May 20, 2026 at 1:56 PM Kaxil Naik <[email protected]> wrote:
> Hi Diego, > > Thanks for the detailed proposal, and apologies for the slow response > on the list. Picking it up now because your PR #66711 > <https://github.com/apache/airflow/pull/66711> reviving the > work deserves a substantive reply rather than continued silence. > > I want to be straightforward: I don't think this should land, and I > want to explain why in terms of project history rather than the > merits of the specific diff. > > MariaDB as a metadata backend has been discussed on this list at > least three times in the past five years, and the answer has > consistently been the same. A few of the relevant threads: > > * 2021-07: a user hit Galera/HAProxy connection-drop hangs. > Project position was stated as "we do not have MariaDB > support... we even discourage people from using MariaDB because > we do not run tests with either version in our CI." > https://lists.apache.org/thread/hpxgzzw5mfjgl5b5twj0rjm3xb9r5cx2 > > * 2021-11 "Elephant in the room - MySQL", triggered by an earlier > MariaDB-compat PR (#18506). Explicit -1s from several committers > (Ash and I included) against officially adopting MariaDB. The > net outcome was to keep the status quo. > https://lists.apache.org/thread/dp78j0ssyhx62008lbtblrc856nbmlfb > > * 2022 documentation change (#24556) baked the position into the > prerequisites docs. > https://github.com/apache/airflow/pull/24556 > > * 2023-08 "Drop MsSQL" used MariaDB as the precedent for "we > deliberately don't support this DB", and the same reasoning was > then applied to drop MsSQL. > https://lists.apache.org/thread/r06j306hldg03g2my1pd4nyjxg78b3h4 > > So this isn't a fresh question; it's a recurring one, and the > constraint hasn't moved unfortunately. > > The blocker is mainly structural: we > have already been through removing an under-supported backend > (MsSQL) and we remember what that cost users. Adding another one > without a committed group of maintainers behind it would be a > near-certainty to repeat that. > > Sorry for the unenthusiastic answer after the work you've put in. > I'd rather say this clearly now than have you keep rebasing through > another release cycle while waiting for the silence to break in a > different directions. > > Regards, > Kaxil > > On Tue, 7 Apr 2026 at 14:30, Diego Dupin <[email protected]> wrote: > > > Hi all, > > > > I'd like to start a discussion about adding MariaDB as a supported > database > > backend for Apache Airflow, alongside the existing MySQL and PostgreSQL > > backends. > > > > ## Summary > > > > This proposal shows all the changes needed to support MariaDB. All core > > tests pass on LTS versions of both MySQL and MariaDB. > > > > The compatibility commit ( > > > > > https://github.com/apache/airflow/pull/60133/changes/b9da4fe819503ff1aad67aa0943da3ac33707d8d > > ) > > changes 6 files: 4 production code files and 2 test files. Each change is > > detailed below with the root cause, the fix, and the maintenance burden. > > > > ## 1. Scope of Required Code Changes > > > > ### Overview > > > > | # | File | What changes | MariaDB-specific? | Maintenance burden | > > |---|------|--------------|-------------------|-------------------| > > | 1 | `airflow/utils/sqlalchemy.py` | Bug fix: remove incorrect > > `supports_for_update_of` guard | No - improves MySQL too | None (code > > removal) | > > | 2 | `migrations/0036_...add_name_field_to_dataset_model.py` | MySQL > Error > > 1093 workaround | Shared MySQL/MariaDB | None (one-time migration) | > > | 3 | `migrations/0049_...remove_pickled_data_from_xcom_table.py` | > MariaDB > > REGEXP_REPLACE syntax | MariaDB-specific branch | None (one-time > migration) > > | > > | 4 | `airflow/jobs/scheduler_job_runner.py` | Disable unused RETURNING > > clause | No - benefits all dialects | None (one-line addition) | > > | 5 | `tests/.../test_exceptions.py` | Accept multiple valid SQL formats > | > > Test robustness improvement | None | > > | 6 | `tests/.../test_sqlalchemy.py` | Fix test expectation after #1 | > Test > > correction | None | > > > > Key takeaway: Of the 6 changes, only one (Migration 0049) is truly > > MariaDB-specific. The others are either bug fixes, general improvements, > or > > shared MySQL/MariaDB workarounds. > > > > ## 2. Detailed Analysis of Each Change > > > > ### Change 1: `with_row_locks` - Remove incorrect > `supports_for_update_of` > > guard > > > > **File:** `airflow-core/src/airflow/utils/sqlalchemy.py` > > > > Root cause: The `with_row_locks()` function had a guard that disabled > > row-level locking when `supports_for_update_of` was `False`. This check > > made sense when it was written (MariaDB 10.3 lacked full locking > support), > > but it incorrectly disables row locking on MariaDB 10.6+. > > > > The facts: > > - MariaDB 10.6+ fully supports `FOR UPDATE`, `NOWAIT`, and `SKIP LOCKED` > > - SQLAlchemy already guards the `OF <table>` clause internally when the > > dialect doesn't support it > > - The `supports_for_update_of` attribute tests the wrong capability for > the > > intended purpose > > > > Fix: Remove the 4-line guard entirely. The `USE_ROW_LEVEL_LOCKING` config > > flag is sufficient for users who want to explicitly disable row locking. > > > > Impact: This is a bug fix that benefits all MySQL-family backends. > > > > Maintenance burden: Zero - this is pure code removal. > > > > ### Change 2: Migration 0036 - MariaDB Error 1093 workaround > > > > **File:** > > > > > `airflow-core/src/airflow/migrations/versions/0036_3_0_0_add_name_field_to_dataset_model.py` > > > > Root cause: The migration's `downgrade()` function uses a CTE-based > `DELETE > > ... WHERE id IN (SELECT * FROM cte)` query to remove duplicate datasets. > > MariaDB raises Error 1093 when a `DELETE` statement references the same > > table in a subquery. > > > > Fix: Add a `dialect == "mysql"` branch that rewrites the query using a > > derived table wrapper. > > > > Impact: This is MariaDB-specific in practice, but the original query > works > > fine on MySQL 8.0. > > > > Maintenance burden: Zero - this is a one-time migration that will never > > change. > > > > ### Change 3: Migration 0049 - MariaDB REGEXP_REPLACE syntax > > > > **File:** > > > > > `airflow-core/src/airflow/migrations/versions/0049_3_0_0_remove_pickled_data_from_xcom_table.py` > > > > Root cause: This migration sanitizes non-standard JSON tokens in the XCom > > table during the pickle-to-JSON conversion. The `REGEXP_REPLACE` function > > has different signatures between MySQL 8 and MariaDB: > > > > | Database | Signature | Regex Engine | > > |----------|----------|-------------| > > | MySQL 8 | 6 args | ICU | > > | MariaDB | 3 args | PCRE2 | > > > > Fix: Add an `is_mariadb` check within the existing `dialect == "mysql"` > > branch to use the 3-argument form with PCRE2 backreferences. > > > > Impact: This is the only truly MariaDB-specific production code change. > > > > Maintenance burden: Zero - this is a one-time migration that will never > > change. > > > > ### Change 4: `scheduler_job_runner.py` - Disable unused RETURNING clause > > > > **File:** `airflow-core/src/airflow/jobs/scheduler_job_runner.py` > > > > Root cause: The `_orphan_unreferenced_assets()` method executes a `DELETE > > ... WHERE EXISTS (SELECT ... FROM cte)` query. On MariaDB, SQLAlchemy's > > dialect automatically adds a `RETURNING` clause to DELETE statements, but > > MariaDB doesn't support `RETURNING` when the DELETE references a CTE in > an > > EXISTS subquery. > > > > Fix: Add `.execution_options(return_defaults=False)` to disable the > > unnecessary `RETURNING` clause. > > > > Impact: This is a general improvement for all dialects (prevents syntax > > error on MariaDB, avoids overhead on PostgreSQL). > > > > Maintenance burden: Zero - this is a one-line addition that makes the > code > > more explicit about its intent. > > > > ### Change 5: `test_exceptions.py` - Accept multiple valid SQL formats > > > > **File:** `airflow-core/tests/unit/api_fastapi/common/test_exceptions.py` > > > > Root cause: The unique constraint error handler tests had hardcoded > > expected SQL statements and error messages that were specific to one > MySQL > > connector (mysqlclient with `%(name)s` parameter style). Different > > connectors and database servers produce different output: > > > > | Aspect | mysqlclient + MySQL | PyMySQL + MariaDB | > > |--------|-------------------|-------------------| > > | Parameter style | `%(name)s` (pyformat) | `%s` (format) | > > | Constraint name in error | `slot_pool_pool_uq` | > > `slot_pool.slot_pool_pool_uq` | > > > > Fix: Change expected `statement` and `orig_error` values from single > > strings to lists of acceptable values. > > > > Impact: This is a test robustness improvement that makes the tests > > connector-agnostic. > > > > Maintenance burden: Negligible - only affects test expectations, not > > production code. > > > > ### Change 6: `test_sqlalchemy.py` - Fix test expectation after row-lock > > fix > > > > **File:** `airflow-core/tests/unit/utils/test_sqlalchemy.py` > > > > Root cause: After removing the `supports_for_update_of` guard, the test > > case for `("mysql", False, True, ...)` needed its expected value updated > > from `False` to `True`. > > > > Fix: Update the test case. > > > > Impact: Test-only change. > > > > Maintenance burden: Zero. > > > > ## 3. SQLAlchemy Version Requirement > > > > Airflow already requires SQLAlchemy >= 2.0.46. This version includes two > > relevant fixes: > > > > 1. MariaDB `NOCYCLE` DDL compilation - SQLAlchemy 2.0.46 correctly omits > > the `NO CYCLE` clause for MariaDB when creating sequences with > > `cycle=False` > > 2. aiosqlite thread-hanging fix - unrelated to MariaDB but included in > the > > same version bump > > > > No application code changes were needed for the NOCYCLE fix - it's > handled > > entirely by SQLAlchemy. > > > > ## 4. What Does NOT Need to Change > > > > It's equally important to note what works without any modifications: > > - All SQLAlchemy ORM operations > > - All Alembic migrations (except the two noted above) > > - JSON column type > > - Connection pooling and async > > - All 1900+ core tests pass on MariaDB 11.8 with just these 6 file > changes > > > > ## 5. Maintenance Overhead Assessment > > > > ### One-time changes (zero ongoing maintenance) > > - Migrations 0036 and 0049: These are frozen historical migrations > > - Row-lock fix: Code removal > > - RETURNING fix: One-line addition > > - Test improvements: More robust assertions > > > > ### Potential future attention areas > > - Raw SQL with `REGEXP_REPLACE`: If new migrations use database-specific > > regex, they would need a MariaDB branch (but this is rare - only 1 > > migration in the entire history needed it) > > - New `DELETE ... RETURNING` with CTE patterns: If new code uses this > > pattern, it would need `.execution_options(return_defaults=False)` - but > > this is good practice regardless > > - SQLAlchemy dialect differences: SQLAlchemy's MySQL/MariaDB runtime > > detection handles the vast majority of differences automatically > > > > ### Quantitative comparison > > The total diff for MariaDB compatibility is approximately +80 / -30 lines > > across 6 files, of which the majority are test improvements and migration > > workarounds that benefit MySQL as well. > > > > ## 6. Testing Infrastructure > > > > The second commit ( > > > > > https://github.com/apache/airflow/pull/60133/changes/b47cea389a9ed4f97a02820c992f505da82a4e4c > > ) > > is just a workaround to enable MariaDB testing in the CI pipeline. (When > > setting "mysql" dialect and prefixing the version to "mariadb:11.4" > causes > > a MariaDB Docker image to be spun up instead of a MySQL one) > > > > This is just a temporary mesure, that has permitted us to confirm that > the > > previous compatibility commit works with existing tests and adds MariaDB > > testing to the CI pipeline. > > > > ## 7. Conclusion > > > > Supporting MariaDB requires minimal code changes. > > > > The maintenance overhead is demonstrably low: after months of continuous > > rebasing against the main branch, no additional MariaDB-specific changes > > were required beyond the initial 6-file changeset. The compatibility has > > proven stable across multiple Airflow releases. > > > > Furthermore, the number of required corrections has actually decreased > over > > time, primarily because requiring a recent version of SQLAlchemy (>= > > 2.0.46) handles many of the dialect differences automatically. Early > > MariaDB support attempts needed more workarounds, but SQLAlchemy's > improved > > MariaDB dialect support has eliminated many of those needs. > > > > Looking forward to the community's thoughts on this proposal. > > > > Best regards, > > Diego Dupin > > >
