Hi Diego,

Thanks for the detailed proposal, and apologies for the slow response
on the list. Picking it up now because your PR #66711
<https://github.com/apache/airflow/pull/66711> reviving the
work deserves a substantive reply rather than continued silence.

I want to be straightforward: I don't think this should land, and I
want to explain why in terms of project history rather than the
merits of the specific diff.

MariaDB as a metadata backend has been discussed on this list at
least three times in the past five years, and the answer has
consistently been the same. A few of the relevant threads:

* 2021-07: a user hit Galera/HAProxy connection-drop hangs.
  Project position was stated as "we do not have MariaDB
  support... we even discourage people from using MariaDB because
  we do not run tests with either version in our CI."
  https://lists.apache.org/thread/hpxgzzw5mfjgl5b5twj0rjm3xb9r5cx2

* 2021-11 "Elephant in the room - MySQL", triggered by an earlier
  MariaDB-compat PR (#18506). Explicit -1s from several committers
  (Ash and I included) against officially adopting MariaDB. The
  net outcome was to keep the status quo.
  https://lists.apache.org/thread/dp78j0ssyhx62008lbtblrc856nbmlfb

* 2022 documentation change (#24556) baked the position into the
  prerequisites docs.
  https://github.com/apache/airflow/pull/24556

* 2023-08 "Drop MsSQL" used MariaDB as the precedent for "we
  deliberately don't support this DB", and the same reasoning was
  then applied to drop MsSQL.
  https://lists.apache.org/thread/r06j306hldg03g2my1pd4nyjxg78b3h4

So this isn't a fresh question; it's a recurring one, and the
constraint hasn't moved unfortunately.

The blocker is mainly structural: we
have already been through removing an under-supported backend
(MsSQL) and we remember what that cost users. Adding another one
without a committed group of maintainers behind it would be a
near-certainty to repeat that.

Sorry for the unenthusiastic answer after the work you've put in.
I'd rather say this clearly now than have you keep rebasing through
another release cycle while waiting for the silence to break in a
different directions.

Regards,
Kaxil

On Tue, 7 Apr 2026 at 14:30, Diego Dupin <[email protected]> wrote:

> Hi all,
>
> I'd like to start a discussion about adding MariaDB as a supported database
> backend for Apache Airflow, alongside the existing MySQL and PostgreSQL
> backends.
>
> ## Summary
>
> This proposal shows all the changes needed to support MariaDB. All core
> tests pass on LTS versions of both MySQL and MariaDB.
>
> The compatibility commit (
>
> https://github.com/apache/airflow/pull/60133/changes/b9da4fe819503ff1aad67aa0943da3ac33707d8d
> )
> changes 6 files: 4 production code files and 2 test files. Each change is
> detailed below with the root cause, the fix, and the maintenance burden.
>
> ## 1. Scope of Required Code Changes
>
> ### Overview
>
> | # | File | What changes | MariaDB-specific? | Maintenance burden |
> |---|------|--------------|-------------------|-------------------|
> | 1 | `airflow/utils/sqlalchemy.py` | Bug fix: remove incorrect
> `supports_for_update_of` guard | No - improves MySQL too | None (code
> removal) |
> | 2 | `migrations/0036_...add_name_field_to_dataset_model.py` | MySQL Error
> 1093 workaround | Shared MySQL/MariaDB | None (one-time migration) |
> | 3 | `migrations/0049_...remove_pickled_data_from_xcom_table.py` | MariaDB
> REGEXP_REPLACE syntax | MariaDB-specific branch | None (one-time migration)
> |
> | 4 | `airflow/jobs/scheduler_job_runner.py` | Disable unused RETURNING
> clause | No - benefits all dialects | None (one-line addition) |
> | 5 | `tests/.../test_exceptions.py` | Accept multiple valid SQL formats |
> Test robustness improvement | None |
> | 6 | `tests/.../test_sqlalchemy.py` | Fix test expectation after #1 | Test
> correction | None |
>
> Key takeaway: Of the 6 changes, only one (Migration 0049) is truly
> MariaDB-specific. The others are either bug fixes, general improvements, or
> shared MySQL/MariaDB workarounds.
>
> ## 2. Detailed Analysis of Each Change
>
> ### Change 1: `with_row_locks` - Remove incorrect `supports_for_update_of`
> guard
>
> **File:** `airflow-core/src/airflow/utils/sqlalchemy.py`
>
> Root cause: The `with_row_locks()` function had a guard that disabled
> row-level locking when `supports_for_update_of` was `False`. This check
> made sense when it was written (MariaDB 10.3 lacked full locking support),
> but it incorrectly disables row locking on MariaDB 10.6+.
>
> The facts:
> - MariaDB 10.6+ fully supports `FOR UPDATE`, `NOWAIT`, and `SKIP LOCKED`
> - SQLAlchemy already guards the `OF <table>` clause internally when the
> dialect doesn't support it
> - The `supports_for_update_of` attribute tests the wrong capability for the
> intended purpose
>
> Fix: Remove the 4-line guard entirely. The `USE_ROW_LEVEL_LOCKING` config
> flag is sufficient for users who want to explicitly disable row locking.
>
> Impact: This is a bug fix that benefits all MySQL-family backends.
>
> Maintenance burden: Zero - this is pure code removal.
>
> ### Change 2: Migration 0036 - MariaDB Error 1093 workaround
>
> **File:**
>
> `airflow-core/src/airflow/migrations/versions/0036_3_0_0_add_name_field_to_dataset_model.py`
>
> Root cause: The migration's `downgrade()` function uses a CTE-based `DELETE
> ... WHERE id IN (SELECT * FROM cte)` query to remove duplicate datasets.
> MariaDB raises Error 1093 when a `DELETE` statement references the same
> table in a subquery.
>
> Fix: Add a `dialect == "mysql"` branch that rewrites the query using a
> derived table wrapper.
>
> Impact: This is MariaDB-specific in practice, but the original query works
> fine on MySQL 8.0.
>
> Maintenance burden: Zero - this is a one-time migration that will never
> change.
>
> ### Change 3: Migration 0049 - MariaDB REGEXP_REPLACE syntax
>
> **File:**
>
> `airflow-core/src/airflow/migrations/versions/0049_3_0_0_remove_pickled_data_from_xcom_table.py`
>
> Root cause: This migration sanitizes non-standard JSON tokens in the XCom
> table during the pickle-to-JSON conversion. The `REGEXP_REPLACE` function
> has different signatures between MySQL 8 and MariaDB:
>
> | Database | Signature | Regex Engine |
> |----------|----------|-------------|
> | MySQL 8 | 6 args | ICU |
> | MariaDB | 3 args | PCRE2 |
>
> Fix: Add an `is_mariadb` check within the existing `dialect == "mysql"`
> branch to use the 3-argument form with PCRE2 backreferences.
>
> Impact: This is the only truly MariaDB-specific production code change.
>
> Maintenance burden: Zero - this is a one-time migration that will never
> change.
>
> ### Change 4: `scheduler_job_runner.py` - Disable unused RETURNING clause
>
> **File:** `airflow-core/src/airflow/jobs/scheduler_job_runner.py`
>
> Root cause: The `_orphan_unreferenced_assets()` method executes a `DELETE
> ... WHERE EXISTS (SELECT ... FROM cte)` query. On MariaDB, SQLAlchemy's
> dialect automatically adds a `RETURNING` clause to DELETE statements, but
> MariaDB doesn't support `RETURNING` when the DELETE references a CTE in an
> EXISTS subquery.
>
> Fix: Add `.execution_options(return_defaults=False)` to disable the
> unnecessary `RETURNING` clause.
>
> Impact: This is a general improvement for all dialects (prevents syntax
> error on MariaDB, avoids overhead on PostgreSQL).
>
> Maintenance burden: Zero - this is a one-line addition that makes the code
> more explicit about its intent.
>
> ### Change 5: `test_exceptions.py` - Accept multiple valid SQL formats
>
> **File:** `airflow-core/tests/unit/api_fastapi/common/test_exceptions.py`
>
> Root cause: The unique constraint error handler tests had hardcoded
> expected SQL statements and error messages that were specific to one MySQL
> connector (mysqlclient with `%(name)s` parameter style). Different
> connectors and database servers produce different output:
>
> | Aspect | mysqlclient + MySQL | PyMySQL + MariaDB |
> |--------|-------------------|-------------------|
> | Parameter style | `%(name)s` (pyformat) | `%s` (format) |
> | Constraint name in error | `slot_pool_pool_uq` |
> `slot_pool.slot_pool_pool_uq` |
>
> Fix: Change expected `statement` and `orig_error` values from single
> strings to lists of acceptable values.
>
> Impact: This is a test robustness improvement that makes the tests
> connector-agnostic.
>
> Maintenance burden: Negligible - only affects test expectations, not
> production code.
>
> ### Change 6: `test_sqlalchemy.py` - Fix test expectation after row-lock
> fix
>
> **File:** `airflow-core/tests/unit/utils/test_sqlalchemy.py`
>
> Root cause: After removing the `supports_for_update_of` guard, the test
> case for `("mysql", False, True, ...)` needed its expected value updated
> from `False` to `True`.
>
> Fix: Update the test case.
>
> Impact: Test-only change.
>
> Maintenance burden: Zero.
>
> ## 3. SQLAlchemy Version Requirement
>
> Airflow already requires SQLAlchemy >= 2.0.46. This version includes two
> relevant fixes:
>
> 1. MariaDB `NOCYCLE` DDL compilation - SQLAlchemy 2.0.46 correctly omits
> the `NO CYCLE` clause for MariaDB when creating sequences with
> `cycle=False`
> 2. aiosqlite thread-hanging fix - unrelated to MariaDB but included in the
> same version bump
>
> No application code changes were needed for the NOCYCLE fix - it's handled
> entirely by SQLAlchemy.
>
> ## 4. What Does NOT Need to Change
>
> It's equally important to note what works without any modifications:
> - All SQLAlchemy ORM operations
> - All Alembic migrations (except the two noted above)
> - JSON column type
> - Connection pooling and async
> - All 1900+ core tests pass on MariaDB 11.8 with just these 6 file changes
>
> ## 5. Maintenance Overhead Assessment
>
> ### One-time changes (zero ongoing maintenance)
> - Migrations 0036 and 0049: These are frozen historical migrations
> - Row-lock fix: Code removal
> - RETURNING fix: One-line addition
> - Test improvements: More robust assertions
>
> ### Potential future attention areas
> - Raw SQL with `REGEXP_REPLACE`: If new migrations use database-specific
> regex, they would need a MariaDB branch (but this is rare - only 1
> migration in the entire history needed it)
> - New `DELETE ... RETURNING` with CTE patterns: If new code uses this
> pattern, it would need `.execution_options(return_defaults=False)` - but
> this is good practice regardless
> - SQLAlchemy dialect differences: SQLAlchemy's MySQL/MariaDB runtime
> detection handles the vast majority of differences automatically
>
> ### Quantitative comparison
> The total diff for MariaDB compatibility is approximately +80 / -30 lines
> across 6 files, of which the majority are test improvements and migration
> workarounds that benefit MySQL as well.
>
> ## 6. Testing Infrastructure
>
> The second commit (
>
> https://github.com/apache/airflow/pull/60133/changes/b47cea389a9ed4f97a02820c992f505da82a4e4c
> )
> is just a workaround to enable MariaDB testing in the CI pipeline. (When
> setting "mysql" dialect and prefixing the version to "mariadb:11.4" causes
> a MariaDB Docker image to be spun up instead of a MySQL one)
>
> This is just a temporary mesure, that has permitted us to confirm that the
> previous compatibility commit works with existing tests and adds MariaDB
> testing to the CI pipeline.
>
> ## 7. Conclusion
>
> Supporting MariaDB requires minimal code changes.
>
> The maintenance overhead is demonstrably low: after months of continuous
> rebasing against the main branch, no additional MariaDB-specific changes
> were required beyond the initial 6-file changeset. The compatibility has
> proven stable across multiple Airflow releases.
>
> Furthermore, the number of required corrections has actually decreased over
> time, primarily because requiring a recent version of SQLAlchemy (>=
> 2.0.46) handles many of the dialect differences automatically. Early
> MariaDB support attempts needed more workarounds, but SQLAlchemy's improved
> MariaDB dialect support has eliminated many of those needs.
>
> Looking forward to the community's thoughts on this proposal.
>
> Best regards,
> Diego Dupin
>

Reply via email to