alamashir opened a new pull request, #64938:
URL: https://github.com/apache/airflow/pull/64938

   ## Summary
   
   Fixes #62373
   
   - Moves heavy imports (`google.cloud.dataproc_v1`, `google.cloud.bigquery`, 
hooks, triggers) from module level into method bodies in the `dataproc.py` and 
`bigquery.py` operators
   - Uses `_UNSET` sentinel pattern for `DEFAULT`/`DEFAULT_RETRY` default 
parameter values to avoid importing `google.api_core` at class definition time
   - Moves `_BigQueryHookWithFlexibleProjectId` class definition inside 
`get_db_hook()` method to avoid importing `BigQueryHook` at module level
   - Updates test mock paths to point to actual source modules since symbols 
are no longer re-exported at the operator module level
   
   This prevents 15-26 s import delays on small workers (1 vCPU, 2 GiB RAM) that 
can exceed the 30 s DagBag import timeout during DAG parsing. Heavy imports now 
happen only at task execution time.
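   The delay figures can be reproduced by timing the import in a fresh interpreter. The `import_wall_time` helper below is a hypothetical sketch, and the Airflow module path in the comment is only an example of what to pass:

```python
import subprocess
import sys
import time


def import_wall_time(module: str) -> float:
    """Wall-clock seconds to `import <module>` in a clean interpreter."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
    return time.perf_counter() - start


# Stdlib demo; swap in
# "airflow.providers.google.cloud.operators.dataproc" on a worker to
# compare timings before and after this change.
print(f"{import_wall_time('json'):.3f}s")
```

   For a per-module breakdown, `python -X importtime -c "import <module>"` prints cumulative import times to stderr.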
   
   ## Test plan
   
   - [x] `pytest 
providers/google/tests/unit/google/cloud/operators/test_dataproc.py` — 108 
passed
   - [x] `pytest 
providers/google/tests/unit/google/cloud/operators/test_bigquery.py` — 128 
passed
   - [x] `ruff check` passes on all changed files
   - [x] Import smoke test: `python -c "from 
airflow.providers.google.cloud.operators.dataproc import 
DataprocCreateBatchOperator"` completes without loading `google.cloud.dataproc_v1`
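   The smoke test can be made assertive by inspecting `sys.modules` after the import. `assert_not_imported` below is a hypothetical helper sketching the idea; run it in a fresh interpreter for a reliable result:

```python
import sys


def assert_not_imported(heavy_module: str) -> None:
    """Fail if `heavy_module` or any of its submodules has been loaded."""
    loaded = [
        name
        for name in sys.modules
        if name == heavy_module or name.startswith(heavy_module + ".")
    ]
    assert not loaded, f"heavy modules unexpectedly imported: {loaded}"


# In a fresh interpreter, import the operator module first, e.g.:
#   from airflow.providers.google.cloud.operators.dataproc import (
#       DataprocCreateBatchOperator,
#   )
assert_not_imported("google.cloud.dataproc_v1")
```

   Checking submodule prefixes matters because importing `google.cloud.dataproc_v1.types` would register several entries in `sys.modules`, not just the top-level name.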


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
