kaxil opened a new pull request, #68735:
URL: https://github.com/apache/airflow/pull/68735

On Airflow < 3.2, `common.compat`'s `get_hook_lineage_collector()` polyfills `add_extra` by
monkeypatching the `collected_assets` and `has_collected` **class** properties of the lineage
collector. `_add_extra_polyfill` was not idempotent: the trigger gate (`_lacks_add_extra_method`)
checks **instance-level** `_extra`, so a fresh collector instance of an already-patched class
re-enters the polyfill and re-wraps the class property, capturing the previously installed wrapper
as the "original". Each call stacks another layer onto the getter chain, so once enough collectors
are created the next `collected_assets` / `has_collected` access exceeds the recursion limit and
raises `RecursionError`.
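
The stacking can be reproduced outside Airflow with a toy class. The sketch below is a simplified illustration, not the provider code: `ToyCollector`, `naive_polyfill`, and `trigger_recursion_error` are hypothetical names, and only `has_collected` is wrapped.

```python
import sys


class ToyCollector:
    """Hypothetical stand-in for a lineage collector; not Airflow's class."""

    @property
    def has_collected(self) -> bool:
        return False


def naive_polyfill(collector: ToyCollector) -> None:
    """Non-idempotent patch: wraps whatever property is currently installed."""
    cls = type(collector)
    # On the second and later calls this is already a wrapper from an earlier call.
    original = cls.has_collected

    def wrapped(self) -> bool:
        return original.fget(self) or bool(getattr(self, "_extra", None))

    cls.has_collected = property(wrapped)
    # Instance-level state: a fresh instance never has it, so a gate that
    # checks only the instance would let the class be patched again.
    collector._extra = {}


def trigger_recursion_error() -> bool:
    """Patch via more fresh instances than the recursion limit, then read once."""
    for _ in range(sys.getrecursionlimit() + 50):
        naive_polyfill(ToyCollector())
    try:
        ToyCollector().has_collected
    except RecursionError:
        return True
    return False
```

Each call adds one Python frame to the getter chain, so a single property read has to descend through every wrapper installed so far.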
   
## Where it shows up

In production the global collector is a process singleton (the accessor is `@cache`-decorated) and
the polyfill only runs on Airflow < 3.2, so it is applied exactly once -- no impact. The failure
surfaces on the **Compat 3.1.x provider test matrix**, where a fresh `HookLineageCollector` is built
per test: the class accumulates one wrapper per test until the lineage edge-case tests blow the
recursion limit. This is a test-suite/robustness fix, not a production incident.
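
A rough way to see the difference between the two settings is to count how often a non-idempotent patch touches the class. This is a sketch with hypothetical names (`ToyCollector`, `naive_polyfill`, `get_collector`, `singleton_access`, `per_test_instances`); it only assumes that the real accessor is cached, as described above.

```python
from functools import cache


class ToyCollector:
    """Hypothetical stand-in for a collector class; counts patch applications."""

    patch_calls = 0


def naive_polyfill(collector: ToyCollector) -> ToyCollector:
    # Stand-in for a non-idempotent patch: every call touches the class again.
    type(collector).patch_calls += 1
    return collector


@cache
def get_collector() -> ToyCollector:
    # Cached accessor: the body runs once per process, so the patch runs once.
    return naive_polyfill(ToyCollector())


def singleton_access(accesses: int) -> int:
    """Production-like path: every access goes through the cached accessor."""
    get_collector.cache_clear()
    ToyCollector.patch_calls = 0
    for _ in range(accesses):
        get_collector()
    return ToyCollector.patch_calls


def per_test_instances(tests: int) -> int:
    """Test-matrix-like path: a fresh collector is built and patched per test."""
    ToyCollector.patch_calls = 0
    for _ in range(tests):
        naive_polyfill(ToyCollector())
    return ToyCollector.patch_calls
```

With the cached accessor the class is patched once regardless of how many times it is fetched; with one fresh instance per test the patch count grows linearly with the number of tests.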
   
## Fix

Make `_add_extra_polyfill` idempotent -- patch each collector class exactly once:

- Skip re-patching when the class's own `__dict__` already carries a `_compat_extra_polyfilled`
  marker (checked on the exact class, not via inheritance, so a subclass that overrides these
  properties is still patched in its own right).
- Set the marker only after all three patches (`collected_assets`, `has_collected`, `add_extra`)
  are installed, so a failure mid-patch leaves the class unmarked and retryable rather than
  half-patched.
- Initialize the instance `_extra` / `_extra_counts` only when missing, so re-applying never clears
  a collector that already accumulated extras.
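
The three points above can be sketched on a toy class as follows. This is an illustration under stated assumptions, not the code in this PR: `ToyCollector` and `idempotent_polyfill` are hypothetical, only `has_collected` and `add_extra` are patched, and the `add_extra(key, value)` signature is simplified for the example.

```python
_MARKER = "_compat_extra_polyfilled"


class ToyCollector:
    """Hypothetical stand-in for a lineage collector; not Airflow's class."""

    @property
    def has_collected(self) -> bool:
        return False


def idempotent_polyfill(collector):
    cls = type(collector)
    # Check the class's own __dict__ so a marker inherited from a parent is ignored.
    if _MARKER not in cls.__dict__:
        original = cls.has_collected  # property object, resolved through the MRO

        def has_collected(self) -> bool:
            return original.fget(self) or bool(getattr(self, "_extra", None))

        def add_extra(self, key, value) -> None:  # simplified, assumed signature
            self._extra[key] = value

        cls.has_collected = property(has_collected)
        cls.add_extra = add_extra
        # Mark last: an exception above leaves the class unmarked and retryable.
        setattr(cls, _MARKER, True)

    # Create instance state only when missing so existing extras survive.
    if not hasattr(collector, "_extra"):
        collector._extra = {}
    return collector
```

Because the gate is on the class rather than the instance, building any number of fresh collectors leaves a single wrapper in place, and re-applying the polyfill to a collector that already holds extras does not reset them.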
   
Behavior is otherwise unchanged: every collector still gets `add_extra` and the extended
`collected_assets` / `has_collected`.
   

