This is an automated email from the ASF dual-hosted git repository.
kaxilnaik pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git
The following commit(s) were added to refs/heads/main by this push:
new a06453823a1 Strengthen Copilot review instructions to use imperative
"Flag any" phrasing (#62584)
a06453823a1 is described below
commit a06453823a13960387d73210253fc729504396ca
Author: Kaxil Naik <[email protected]>
AuthorDate: Fri Feb 27 20:38:48 2026 +0000
Strengthen Copilot review instructions to use imperative "Flag any"
phrasing (#62584)
Copilot's review agent converts rules phrased as "Flag any X" into
active code searches but treats passive phrasing as informational
context. Rewrite all critical rules (imports, assert, time.time,
lru_cache, session.commit, N+1 queries, unittest, unspec'd mocks)
to use imperative directives so the agent reliably flags violations.
---
.github/instructions/code-review.instructions.md | 52 ++++++++++++------------
1 file changed, 27 insertions(+), 25 deletions(-)
diff --git a/.github/instructions/code-review.instructions.md
b/.github/instructions/code-review.instructions.md
index 9e9a019f1a6..0d4ce8a8791 100644
--- a/.github/instructions/code-review.instructions.md
+++ b/.github/instructions/code-review.instructions.md
@@ -10,45 +10,49 @@ Use these rules when reviewing pull requests to the Apache
Airflow repository.
## Architecture Boundaries
- **Scheduler must never run user code.** It only processes serialized Dags.
Flag any scheduler-path code that deserializes or executes Dag/task code.
-- **Workers must not access the metadata DB directly.** Task execution
communicates with the API server through the Execution API (`/execution`
endpoints) only.
-- **Dag Processor and Triggerer run user code in isolated processes.** Code in
these components should maintain that isolation.
-- **Providers must not import core internals** like `SUPERVISOR_COMMS` or
task-runner plumbing. Providers interact through the public SDK and execution
API only.
+- **Flag any task execution code that accesses the metadata DB directly**
instead of through the Execution API (`/execution` endpoints).
+- **Flag any code in Dag Processor or Triggerer that breaks process
isolation** — these components run user code in isolated processes.
+- **Flag any provider importing core internals** like `SUPERVISOR_COMMS` or
task-runner plumbing. Providers interact through the public SDK and execution
API only.
## Database and Query Correctness
-- **N+1 queries**: Flag SQLAlchemy queries that access relationships inside
loops without `joinedload()` or `selectinload()`.
-- **`run_id` is only unique per Dag.** Queries that group, partition, or join
on `run_id` alone (without `dag_id`) will collide across Dags. Always require
`(dag_id, run_id)` together.
-- **Cross-database compatibility**: SQL changes must work on PostgreSQL,
MySQL, and SQLite. Flag database-specific features (lateral joins, window
functions) without cross-DB handling.
-- **Session discipline**: In `airflow-core`, functions receiving a `session`
parameter must not call `session.commit()`. Use keyword-only `session`
parameters.
+- **Flag any SQLAlchemy relationship access inside a loop** without
`joinedload()` or `selectinload()` — this is an N+1 query.
+- **Flag any query on `run_id` without `dag_id`.** `run_id` is only unique per
Dag. Queries that filter, group, partition, or join on `run_id` alone will
silently collide across Dags.
+- **Flag any `session.commit()` call in `airflow-core`** code that receives a
`session` parameter. Session lifecycle is managed by the caller, not the callee.
+- **Flag any `session` parameter that is not keyword-only** (`*, session`) in
`airflow-core`.
+- **Flag any database-specific SQL** (e.g., `LATERAL` joins, PostgreSQL-only
functions, MySQL-only syntax) without cross-DB handling. SQL must work on
PostgreSQL, MySQL, and SQLite.
## Code Quality Rules
-- No `assert` in production code (stripped in optimized Python).
-- `time.monotonic()` for durations, not `time.time()`.
-- Imports at top of file. Valid exceptions: circular imports, lazy loading for
worker isolation, `TYPE_CHECKING` blocks.
-- Guard heavy type-only imports (e.g., `kubernetes.client`) with
`TYPE_CHECKING` in multi-process code paths.
-- Unbounded caches are bugs: all `@lru_cache` must have `maxsize`.
-- Resources (files, connections, sessions) must use context managers or
`try/finally`.
+- **Flag any `assert` in non-test code.** `assert` is stripped in optimized
Python (`python -O`), making it a silent no-op in production.
+- **Flag any `time.time()` used for measuring durations.** Use
`time.monotonic()` instead — `time.time()` is affected by system clock
adjustments.
+- **Flag any `from` or `import` statement inside a function or method body.**
Imports must be at the top of the file. The only valid exceptions are: (1)
circular import avoidance, (2) lazy loading for worker isolation, (3)
`TYPE_CHECKING` blocks. If the import is inside a function, ask the author to
justify why it cannot be at module level.
+- **Flag any `@lru_cache(maxsize=None)`.** This creates an unbounded cache —
every unique argument set is cached forever. Note: `@lru_cache()` without
arguments defaults to `maxsize=128` and is fine.
+- **Flag any heavy import** (e.g., `kubernetes.client`) in multi-process code
paths that is not behind a `TYPE_CHECKING` guard.
+- **Flag any file, connection, or session opened without a context manager or
`try/finally`.**
## Testing Requirements
-- New behavior requires tests covering success, failure, and edge cases.
-- Use pytest patterns, not `unittest.TestCase`.
-- Use `spec`/`autospec` when mocking.
-- Use `time_machine` for time-dependent tests.
-- Imports belong at the top of test files, not inside test functions.
-- Issue numbers do not belong in test docstrings.
+- **Flag any new public method or behavior without corresponding tests.**
Tests must cover success, failure, and edge cases.
+- **Flag any `unittest.TestCase` subclass.** Use pytest patterns instead.
+- **Flag any `mock.Mock()` or `mock.MagicMock()` without `spec` or
`autospec`.** Unspec'd mocks silently accept any attribute access, hiding real
bugs.
+- **Flag any `time.sleep` or `datetime.now()` in tests.** Use `time_machine`
for time-dependent tests.
+- **Flag any issue number in test docstrings** (e.g., `"""Fix for #12345"""`)
— test names should describe behavior, not track tickets.
## API Correctness
-- `map_index` must be handled correctly for mapped tasks. Queries without
`map_index` filtering may return arbitrary task instances.
-- Execution API changes must follow Cadwyn versioning (CalVer format).
+- **Flag any query on mapped task instances that does not filter on
`map_index`.** Without it, queries may return arbitrary instances from the
mapped set.
+- **Flag any Execution API change without a Cadwyn version migration** (CalVer
format).
## UI Code (React/TypeScript)
- Avoid `useState + useEffect` to sync derived state. Use nullish coalescing
or nullable override patterns instead.
- Extract shared logic into custom hooks rather than copy-pasting across
components.
+## Generated Files
+
+- **Flag any manual edits to files in `openapi-gen/`** or Task SDK generated
models. These must be regenerated, not hand-edited.
+
## AI-Generated Code Signals
Flag these patterns that indicate low-quality AI-generated contributions:
@@ -63,7 +67,5 @@ Flag these patterns that indicate low-quality AI-generated
contributions:
## Quality Signals to Check
-The absence of these signals in a "fix" or "optimization" PR is itself a red
flag:
-
-- **Bug fixes need regression tests**: A test that fails without the fix and
passes with it.
-- **Existing tests must still pass without modification**: If existing tests
need changes to pass, the PR may introduce a behavioral regression.
+- **For bug-fix PRs, flag if there is no regression test** — a test that fails
without the fix and passes with it.
+- **Flag any existing test modified to accommodate new behavior** — this may
indicate a behavioral regression rather than a genuine fix.