100% with Kaxil on this one.

On Wed, May 20, 2026 at 1:56 PM Kaxil Naik <[email protected]> wrote:

> Hi Diego,
>
> Thanks for the detailed proposal, and apologies for the slow response
> on the list. Picking it up now because your PR #66711
> <https://github.com/apache/airflow/pull/66711> reviving the
> work deserves a substantive reply rather than continued silence.
>
> I want to be straightforward: I don't think this should land, and I
> want to explain why in terms of project history rather than the
> merits of the specific diff.
>
> MariaDB as a metadata backend has been discussed on this list at
> least three times in the past five years, and the answer has
> consistently been the same. A few of the relevant threads:
>
> * 2021-07: a user hit Galera/HAProxy connection-drop hangs.
>   Project position was stated as "we do not have MariaDB
>   support... we even discourage people from using MariaDB because
>   we do not run tests with either version in our CI."
>   https://lists.apache.org/thread/hpxgzzw5mfjgl5b5twj0rjm3xb9r5cx2
>
> * 2021-11 "Elephant in the room - MySQL", triggered by an earlier
>   MariaDB-compat PR (#18506). Explicit -1s from several committers
>   (Ash and I included) against officially adopting MariaDB. The
>   net outcome was to keep the status quo.
>   https://lists.apache.org/thread/dp78j0ssyhx62008lbtblrc856nbmlfb
>
> * 2022 documentation change (#24556) baked the position into the
>   prerequisites docs.
>   https://github.com/apache/airflow/pull/24556
>
> * 2023-08 "Drop MsSQL" used MariaDB as the precedent for "we
>   deliberately don't support this DB", and the same reasoning was
>   then applied to drop MsSQL.
>   https://lists.apache.org/thread/r06j306hldg03g2my1pd4nyjxg78b3h4
>
> So this isn't a fresh question; it's a recurring one, and the
> constraint hasn't moved unfortunately.
>
> The blocker is mainly structural: we have already been through
> removing an under-supported backend (MsSQL), and we remember what
> that cost users. Adding another one without a committed group of
> maintainers behind it would almost certainly repeat that.
>
> Sorry for the unenthusiastic answer after the work you've put in.
> I'd rather say this clearly now than have you keep rebasing through
> another release cycle while waiting for the silence to break in a
> different direction.
>
> Regards,
> Kaxil
>
> On Tue, 7 Apr 2026 at 14:30, Diego Dupin <[email protected]> wrote:
>
> > Hi all,
> >
> > I'd like to start a discussion about adding MariaDB as a supported
> > database backend for Apache Airflow, alongside the existing MySQL and
> > PostgreSQL backends.
> >
> > ## Summary
> >
> > This proposal shows all the changes needed to support MariaDB. All core
> > tests pass on LTS versions of both MySQL and MariaDB.
> >
> > The compatibility commit
> > (https://github.com/apache/airflow/pull/60133/changes/b9da4fe819503ff1aad67aa0943da3ac33707d8d)
> > changes 6 files: 4 production code files and 2 test files. Each change is
> > detailed below with the root cause, the fix, and the maintenance burden.
> >
> > ## 1. Scope of Required Code Changes
> >
> > ### Overview
> >
> > | # | File | What changes | MariaDB-specific? | Maintenance burden |
> > |---|------|--------------|-------------------|--------------------|
> > | 1 | `airflow/utils/sqlalchemy.py` | Bug fix: remove incorrect `supports_for_update_of` guard | No - improves MySQL too | None (code removal) |
> > | 2 | `migrations/0036_...add_name_field_to_dataset_model.py` | MySQL Error 1093 workaround | Shared MySQL/MariaDB | None (one-time migration) |
> > | 3 | `migrations/0049_...remove_pickled_data_from_xcom_table.py` | MariaDB REGEXP_REPLACE syntax | MariaDB-specific branch | None (one-time migration) |
> > | 4 | `airflow/jobs/scheduler_job_runner.py` | Disable unused RETURNING clause | No - benefits all dialects | None (one-line addition) |
> > | 5 | `tests/.../test_exceptions.py` | Accept multiple valid SQL formats | Test robustness improvement | None |
> > | 6 | `tests/.../test_sqlalchemy.py` | Fix test expectation after #1 | Test correction | None |
> >
> > Key takeaway: Of the 6 changes, only one (Migration 0049) is truly
> > MariaDB-specific. The others are either bug fixes, general improvements,
> > or shared MySQL/MariaDB workarounds.
> >
> > ## 2. Detailed Analysis of Each Change
> >
> > ### Change 1: `with_row_locks` - Remove incorrect `supports_for_update_of` guard
> >
> > **File:** `airflow-core/src/airflow/utils/sqlalchemy.py`
> >
> > Root cause: The `with_row_locks()` function had a guard that disabled
> > row-level locking when `supports_for_update_of` was `False`. This check
> > made sense when it was written (MariaDB 10.3 lacked full locking
> > support), but it incorrectly disables row locking on MariaDB 10.6+.
> >
> > The facts:
> > - MariaDB 10.6+ fully supports `FOR UPDATE`, `NOWAIT`, and `SKIP LOCKED`
> > - SQLAlchemy already guards the `OF <table>` clause internally when the
> > dialect doesn't support it
> > - The `supports_for_update_of` attribute tests the wrong capability for
> > the intended purpose
> >
> > Fix: Remove the 4-line guard entirely. The `USE_ROW_LEVEL_LOCKING` config
> > flag is sufficient for users who want to explicitly disable row locking.
> >
> > Impact: This is a bug fix that benefits all MySQL-family backends.
> >
> > Maintenance burden: Zero - this is pure code removal.
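
For readers following the thread, here is a minimal sketch of the
decision logic described in Change 1. It is a simplified model, not
the actual `with_row_locks()` code: the function names are
hypothetical, and the argument order (dialect name,
`supports_for_update_of`, the row-locking config flag) is an
assumption based on the test tuple quoted under Change 6.

```python
def should_lock_before(dialect_name: str,
                       supports_for_update_of: bool,
                       use_row_level_locking: bool) -> bool:
    """Simplified model of the guard as the proposal describes it."""
    if not use_row_level_locking:
        return False
    # The guard under discussion: MySQL-family dialects that report
    # supports_for_update_of=False never get row locks.
    if dialect_name == "mysql" and not supports_for_update_of:
        return False
    return True


def should_lock_after(dialect_name: str,
                      supports_for_update_of: bool,
                      use_row_level_locking: bool) -> bool:
    """Same decision with the guard removed: only the config flag matters."""
    return bool(use_row_level_locking)


# The case the proposal says flips from False to True.
print(should_lock_before("mysql", False, True))
print(should_lock_after("mysql", False, True))
```

Under this model the only behavioural difference is for a `mysql`
dialect that reports `supports_for_update_of=False` while row
locking is enabled in config.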
> >
> > ### Change 2: Migration 0036 - MariaDB Error 1093 workaround
> >
> > **File:** `airflow-core/src/airflow/migrations/versions/0036_3_0_0_add_name_field_to_dataset_model.py`
> >
> > Root cause: The migration's `downgrade()` function uses a CTE-based
> > `DELETE ... WHERE id IN (SELECT * FROM cte)` query to remove duplicate
> > datasets. MariaDB raises Error 1093 when a `DELETE` statement references
> > the same table in a subquery.
> >
> > Fix: Add a `dialect == "mysql"` branch that rewrites the query using a
> > derived table wrapper.
> >
> > Impact: The branch applies to the whole `mysql` dialect, but it is
> > MariaDB-specific in practice, because the original query works fine on
> > MySQL 8.0.
> >
> > Maintenance burden: Zero - this is a one-time migration that will never
> > change.
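
A rough sketch of the derived-table rewrite described in Change 2.
The table and column names (`dataset`, `id`, `uri`) and the helper
names are hypothetical, since the migration's real query is not
quoted in this thread. The demo runs on SQLite only to show that the
two forms delete the same rows; it does not reproduce Error 1093.

```python
import sqlite3

# Hypothetical duplicate finder: every row except the lowest id per uri.
DUPLICATE_IDS = (
    "SELECT id FROM dataset "
    "WHERE id NOT IN (SELECT MIN(id) FROM dataset GROUP BY uri)"
)


def build_dedup_delete(dialect_name: str) -> str:
    """Return the DELETE statement to use for the given dialect name."""
    if dialect_name == "mysql":
        # Derived-table wrapper: the inner SELECT becomes a named
        # subquery, so the DELETE does not read its own target table
        # directly in the IN (...) clause.
        return (
            "DELETE FROM dataset WHERE id IN "
            f"(SELECT id FROM ({DUPLICATE_IDS}) AS dup)"
        )
    # CTE-based form, in the spirit of the original query.
    return (
        f"WITH dup AS ({DUPLICATE_IDS}) "
        "DELETE FROM dataset WHERE id IN (SELECT id FROM dup)"
    )


def run_demo(dialect_name: str) -> list:
    """Execute the chosen statement on an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dataset (id INTEGER PRIMARY KEY, uri TEXT)")
    conn.executemany(
        "INSERT INTO dataset (id, uri) VALUES (?, ?)",
        [(1, "s3://a"), (2, "s3://a"), (3, "s3://b")],
    )
    conn.execute(build_dedup_delete(dialect_name))
    rows = conn.execute("SELECT id, uri FROM dataset ORDER BY id").fetchall()
    conn.close()
    return rows


print(run_demo("mysql"))
print(run_demo("postgresql"))
```

Whether a given server version actually materialises the derived
table is something to confirm against the real migration and target
database.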
> >
> > ### Change 3: Migration 0049 - MariaDB REGEXP_REPLACE syntax
> >
> > **File:** `airflow-core/src/airflow/migrations/versions/0049_3_0_0_remove_pickled_data_from_xcom_table.py`
> >
> > Root cause: This migration sanitizes non-standard JSON tokens in the XCom
> > table during the pickle-to-JSON conversion. The `REGEXP_REPLACE` function
> > has different signatures between MySQL 8 and MariaDB:
> >
> > | Database | Signature | Regex Engine |
> > |----------|----------|-------------|
> > | MySQL 8 | 6 args | ICU |
> > | MariaDB | 3 args | PCRE2 |
> >
> > Fix: Add an `is_mariadb` check within the existing `dialect == "mysql"`
> > branch to use the 3-argument form with PCRE2 backreferences.
> >
> > Impact: This is the only truly MariaDB-specific production code change.
> >
> > Maintenance burden: Zero - this is a one-time migration that will never
> > change.
> >
> > ### Change 4: `scheduler_job_runner.py` - Disable unused RETURNING clause
> >
> > **File:** `airflow-core/src/airflow/jobs/scheduler_job_runner.py`
> >
> > Root cause: The `_orphan_unreferenced_assets()` method executes a
> > `DELETE ... WHERE EXISTS (SELECT ... FROM cte)` query. On MariaDB,
> > SQLAlchemy's dialect automatically adds a `RETURNING` clause to DELETE
> > statements, but MariaDB doesn't support `RETURNING` when the DELETE
> > references a CTE in an EXISTS subquery.
> >
> > Fix: Add `.execution_options(return_defaults=False)` to disable the
> > unnecessary `RETURNING` clause.
> >
> > Impact: This is a general improvement for all dialects (prevents syntax
> > error on MariaDB, avoids overhead on PostgreSQL).
> >
> > Maintenance burden: Zero - this is a one-line addition that makes the
> > code more explicit about its intent.
> >
> > ### Change 5: `test_exceptions.py` - Accept multiple valid SQL formats
> >
> > **File:** `airflow-core/tests/unit/api_fastapi/common/test_exceptions.py`
> >
> > Root cause: The unique constraint error handler tests had hardcoded
> > expected SQL statements and error messages that were specific to one
> > MySQL connector (mysqlclient with `%(name)s` parameter style). Different
> > connectors and database servers produce different output:
> >
> > | Aspect | mysqlclient + MySQL | PyMySQL + MariaDB |
> > |--------|---------------------|-------------------|
> > | Parameter style | `%(name)s` (pyformat) | `%s` (format) |
> > | Constraint name in error | `slot_pool_pool_uq` | `slot_pool.slot_pool_pool_uq` |
> >
> > Fix: Change expected `statement` and `orig_error` values from single
> > strings to lists of acceptable values.
> >
> > Impact: This is a test robustness improvement that makes the tests
> > connector-agnostic.
> >
> > Maintenance burden: Negligible - only affects test expectations, not
> > production code.
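
A small sketch of the "list of acceptable values" idea from Change 5.
The constraint names come from the table above; the SQL statements
and the `assert_one_of` helper are hypothetical and only stand in
for whatever the real tests compare.

```python
# Constraint names as reported by the two connector/server pairs above.
ACCEPTABLE_CONSTRAINT_NAMES = [
    "slot_pool_pool_uq",            # mysqlclient + MySQL
    "slot_pool.slot_pool_pool_uq",  # PyMySQL + MariaDB
]

# Hypothetical statements illustrating the two parameter styles.
ACCEPTABLE_STATEMENTS = [
    "INSERT INTO slot_pool (pool) VALUES (%(pool)s)",  # pyformat
    "INSERT INTO slot_pool (pool) VALUES (%s)",        # format
]


def assert_one_of(actual: str, acceptable: list) -> None:
    """Fail unless `actual` equals one of the acceptable values."""
    assert actual in acceptable, (
        f"{actual!r} is not one of the accepted values: {acceptable!r}"
    )


# Either connector's output passes the same assertion.
assert_one_of("slot_pool.slot_pool_pool_uq", ACCEPTABLE_CONSTRAINT_NAMES)
assert_one_of("INSERT INTO slot_pool (pool) VALUES (%s)", ACCEPTABLE_STATEMENTS)
```

The trade-off is that the test no longer pins the exact output of
one connector, so an unexpected third format still fails loudly.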
> >
> > ### Change 6: `test_sqlalchemy.py` - Fix test expectation after row-lock fix
> >
> > **File:** `airflow-core/tests/unit/utils/test_sqlalchemy.py`
> >
> > Root cause: After removing the `supports_for_update_of` guard, the test
> > case for `("mysql", False, True, ...)` needed its expected value updated
> > from `False` to `True`.
> >
> > Fix: Update the test case.
> >
> > Impact: Test-only change.
> >
> > Maintenance burden: Zero.
> >
> > ## 3. SQLAlchemy Version Requirement
> >
> > Airflow already requires SQLAlchemy >= 2.0.46. This version includes two
> > relevant fixes:
> >
> > 1. MariaDB `NOCYCLE` DDL compilation - SQLAlchemy 2.0.46 correctly omits
> > the `NO CYCLE` clause for MariaDB when creating sequences with
> > `cycle=False`
> > 2. aiosqlite thread-hanging fix - unrelated to MariaDB but included in
> > the same version bump
> >
> > No application code changes were needed for the NOCYCLE fix - it's
> > handled entirely by SQLAlchemy.
> >
> > ## 4. What Does NOT Need to Change
> >
> > It's equally important to note what works without any modifications:
> > - All SQLAlchemy ORM operations
> > - All Alembic migrations (except the two noted above)
> > - JSON column type
> > - Connection pooling and async
> > - All 1900+ core tests pass on MariaDB 11.8 with just these 6 file
> > changes
> >
> > ## 5. Maintenance Overhead Assessment
> >
> > ### One-time changes (zero ongoing maintenance)
> > - Migrations 0036 and 0049: These are frozen historical migrations
> > - Row-lock fix: Code removal
> > - RETURNING fix: One-line addition
> > - Test improvements: More robust assertions
> >
> > ### Potential future attention areas
> > - Raw SQL with `REGEXP_REPLACE`: If new migrations use database-specific
> > regex, they would need a MariaDB branch (but this is rare - only 1
> > migration in the entire history needed it)
> > - New `DELETE ... RETURNING` with CTE patterns: If new code uses this
> > pattern, it would need `.execution_options(return_defaults=False)` - but
> > this is good practice regardless
> > - SQLAlchemy dialect differences: SQLAlchemy's MySQL/MariaDB runtime
> > detection handles the vast majority of differences automatically
> >
> > ### Quantitative comparison
> > The total diff for MariaDB compatibility is approximately +80 / -30 lines
> > across 6 files, of which the majority are test improvements and migration
> > workarounds that benefit MySQL as well.
> >
> > ## 6. Testing Infrastructure
> >
> > The second commit
> > (https://github.com/apache/airflow/pull/60133/changes/b47cea389a9ed4f97a02820c992f505da82a4e4c)
> > is just a workaround to enable MariaDB testing in the CI pipeline: when the
> > dialect is set to "mysql" and the version is given with a prefix, as in
> > "mariadb:11.4", a MariaDB Docker image is spun up instead of a MySQL one.
> >
> > This is just a temporary measure. It has allowed us to confirm that the
> > previous compatibility commit works with the existing tests, and it adds
> > MariaDB testing to the CI pipeline.
> >
> > ## 7. Conclusion
> >
> > Supporting MariaDB requires minimal code changes.
> >
> > The maintenance overhead is demonstrably low: after months of continuous
> > rebasing against the main branch, no additional MariaDB-specific changes
> > were required beyond the initial 6-file changeset. The compatibility has
> > proven stable across multiple Airflow releases.
> >
> > Furthermore, the number of required corrections has actually decreased
> > over time, primarily because requiring a recent version of SQLAlchemy
> > (>= 2.0.46) handles many of the dialect differences automatically. Early
> > MariaDB support attempts needed more workarounds, but SQLAlchemy's
> > improved MariaDB dialect support has eliminated many of those needs.
> >
> > Looking forward to the community's thoughts on this proposal.
> >
> > Best regards,
> > Diego Dupin
> >
>
