Dev-iL commented on code in PR #64972:
URL: https://github.com/apache/airflow/pull/64972#discussion_r3062674213

##########
scripts/ci/prek/check_migration_patterns.py:
##########
@@ -0,0 +1,384 @@
+#!/usr/bin/env python
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Static analysis checks for Alembic migration anti-patterns.
+
+These checks use AST analysis to detect common migration mistakes that can cause
+silent failures, particularly on SQLite where ``PRAGMA foreign_keys`` has specific
+requirements about transaction state.
+
+TODO: These checks are candidates for future Ruff rules (AIR category).
+
+Rules
+=====
+
+MIG001 -- DML before ``disable_sqlite_fkeys``
+----------------------------------------------
+
+**What it does:**
+Detects ``op.execute()`` calls containing DML keywords (UPDATE, INSERT, DELETE) that
+appear before a ``with disable_sqlite_fkeys(op):`` block in the same function.
+
+**Why is this bad:**
+On SQLite, any DML statement triggers *autobegin*, starting a transaction. Once a
+transaction is active, ``PRAGMA foreign_keys=off`` (issued by ``disable_sqlite_fkeys``)
+becomes a no-op. This means foreign key checks remain enabled, and
+``batch_alter_table`` (which recreates tables) may fail with foreign key constraint
+violations.
+
+**Example (bad):**
+
+.. code-block:: python
+
+    def upgrade():
+        op.execute("UPDATE dag SET col = '' WHERE col IS NULL")  # triggers autobegin
+        with disable_sqlite_fkeys(op):  # PRAGMA is now a no-op!
+            with op.batch_alter_table("dag") as batch_op:
+                batch_op.alter_column("col", nullable=False)
+
+**Use instead:**
+
+.. code-block:: python
+
+    def upgrade():
+        with disable_sqlite_fkeys(op):
+            op.execute("UPDATE dag SET col = '' WHERE col IS NULL")
+            with op.batch_alter_table("dag") as batch_op:
+                batch_op.alter_column("col", nullable=False)
+
+
+MIG002 -- DDL before ``disable_sqlite_fkeys``
+----------------------------------------------
+
+**What it does:**
+Detects any Alembic ``op.*`` call (DDL operations like ``op.add_column``,
+``op.drop_column``, ``op.create_table``, ``op.drop_table``, ``op.create_index``,
+``op.batch_alter_table``, etc.) that appears before a
+``with disable_sqlite_fkeys(op):`` block in the same function. Excludes DML calls
+covered by MIG001.
+
+**Why is this bad:**
+DDL operations also trigger *autobegin* on SQLite, making the subsequent
+``PRAGMA foreign_keys=off`` a no-op, exactly like DML.
+
+**Example (bad):**
+
+.. code-block:: python
+
+    def upgrade():
+        op.add_column("dag_run", sa.Column("created_at", ...))  # triggers autobegin
+        with disable_sqlite_fkeys(op):  # PRAGMA is now a no-op!
+            with op.batch_alter_table("backfill") as batch_op:
+                batch_op.alter_column("col", nullable=True)
+
+**Use instead:**
+
+.. code-block:: python
+
+    def upgrade():
+        with disable_sqlite_fkeys(op):
+            op.add_column("dag_run", sa.Column("created_at", ...))
+            with op.batch_alter_table("backfill") as batch_op:
+                batch_op.alter_column("col", nullable=True)
+
+
+MIG003 -- DML without offline-mode guard

Review Comment:
   As mentioned on Slack, I think it is justified to err on the side of false positives when detecting migration issues. If you disagree, I can try to tighten this check or disable it. The question is: does it catch real issues we previously missed?

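For readers following the thread, here is a minimal sketch of the kind of AST check that the MIG001 docstring describes. It is not the code from the PR: the names `find_dml_before_fkeys`, `_is_dml_execute`, and `_is_fkeys_block` are hypothetical, and the sketch assumes the SQL is passed as a plain string literal and that the `with disable_sqlite_fkeys(op):` block sits at the top level of the function body. Matching on any DML keyword inside the string is deliberately loose, which is one way a check like this can lean toward false positives.

```python
import ast
import re

# Assumption: a DML statement is any string literal containing one of these keywords.
DML_PATTERN = re.compile(r"\b(UPDATE|INSERT|DELETE)\b", re.IGNORECASE)


def _is_dml_execute(node: ast.AST) -> bool:
    """True for op.execute("<string literal containing a DML keyword>")."""
    if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
        return False
    func = node.func
    if not (func.attr == "execute" and isinstance(func.value, ast.Name) and func.value.id == "op"):
        return False
    if not node.args or not isinstance(node.args[0], ast.Constant):
        return False
    sql = node.args[0].value
    return isinstance(sql, str) and DML_PATTERN.search(sql) is not None


def _is_fkeys_block(stmt: ast.stmt) -> bool:
    """True for a `with disable_sqlite_fkeys(...):` statement."""
    if not isinstance(stmt, ast.With):
        return False
    for item in stmt.items:
        call = item.context_expr
        if (
            isinstance(call, ast.Call)
            and isinstance(call.func, ast.Name)
            and call.func.id == "disable_sqlite_fkeys"
        ):
            return True
    return False


def find_dml_before_fkeys(source: str) -> list[int]:
    """Return line numbers of DML op.execute() calls that precede a
    disable_sqlite_fkeys block in the same function."""
    flagged: list[int] = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        pending: list[int] = []  # DML calls seen so far, not yet followed by a block
        for stmt in func.body:
            if _is_fkeys_block(stmt):
                flagged.extend(pending)
                pending = []
                continue  # calls inside the block are fine
            pending.extend(n.lineno for n in ast.walk(stmt) if _is_dml_execute(n))
    return sorted(flagged)


# The two docstring examples, used here as inputs.
BAD_MIGRATION = """\
def upgrade():
    op.execute("UPDATE dag SET col = '' WHERE col IS NULL")
    with disable_sqlite_fkeys(op):
        with op.batch_alter_table("dag") as batch_op:
            batch_op.alter_column("col", nullable=False)
"""

GOOD_MIGRATION = """\
def upgrade():
    with disable_sqlite_fkeys(op):
        op.execute("UPDATE dag SET col = '' WHERE col IS NULL")
        with op.batch_alter_table("dag") as batch_op:
            batch_op.alter_column("col", nullable=False)
"""

print(find_dml_before_fkeys(BAD_MIGRATION))
print(find_dml_before_fkeys(GOOD_MIGRATION))
```

Running a sketch like this over the existing migrations directory and listing the flagged lines would be one way to answer the question above about whether the rule catches real issues.
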
