GitHub user uplsh580 edited a discussion: [Question] Bundle-specific Python 
Path Isolation in Airflow 3.x Git Bundles

### [Question] Bundle-specific Python Path Isolation in Airflow 3.x Git Bundles

**Environment:**
- Airflow Version: 3.1.7
- Deployment: Git Bundles
- Setup: Multi-tenant environment where each Git Repository (Bundle) belongs to 
a specific team.

---

**Context (Path Architecture):**
Our infrastructure deploys code into different paths depending on the Airflow 
component:

1. **DAG Processor:** 
`{BUNDLE_ROOT}/{bundle_name}/tracking_repo/{user_code_root}/`
2. **Worker:** 
`{BUNDLE_ROOT}/{bundle_name}/version/{commit_id}/{user_code_root}/`

**The Directory Structure (Inside `{user_code_root}`):**
```text
{user_code_root}/
├── airflow_lib/       # Internal library, shared across DAGs in the same bundle
│   ├── __init__.py
│   ├── constants/
│   └── util/
├── dags/              # Actual DAG files
│   ├── my_dag.py
│   └── sub_dir/
│       └── nested_dag.py
└── README.md
```
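
For reference, a representative DAG file looks roughly like the sketch below (simplified, using the Airflow 3 TaskFlow imports; the `airflow_lib.util.helpers` module and the task body are illustrative placeholders, not our real code):

```python
# dags/my_dag.py (illustrative)
from airflow.sdk import dag, task

# This is the import that fails on both the DAG Processor and the Worker,
# because {user_code_root} is never added to sys.path:
from airflow_lib.util import helpers  # hypothetical shared-library module


@dag(schedule=None, catchup=False)
def my_dag():
    @task
    def use_shared_lib():
        # Hypothetical call into the shared library.
        return helpers.do_something()

    use_shared_lib()


my_dag()
```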

**The Problem:**
When a DAG (e.g., `my_dag.py`) is parsed by the DAG Processor or executed by 
the Worker, it fails with `ModuleNotFoundError: No module named 'airflow_lib'`.

This is because the `{user_code_root}` directory—which contains the 
`airflow_lib` package—is not automatically added to `sys.path`. Since the path 
is dynamic (varies by `commit_id` on Workers) and inconsistent (differs between 
Processor and Worker), we cannot use a static global `PYTHONPATH`.

**Key Challenges:**

1. **Component Path Discrepancy**: Any solution must work for both the 
`tracking_repo` path on the Processor and the `version/{commit_id}` path on the 
Worker.

2. **Namespace Collisions**: Multiple bundles (teams) may each ship their own 
`airflow_lib` package. Adding every bundle root to `sys.path` would cause name 
collisions and version "cross-talk" between teams.

3. **No Manual Code Changes**: We want to avoid forcing hundreds of developers 
to add `sys.path.append` boilerplate to every DAG file (see the sketch after 
this list for the kind of code we mean).
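
To make challenge 3 concrete, the sketch below is the kind of per-file boilerplate we are trying to avoid. It does work on both components, because it resolves `{user_code_root}` relative to the DAG file itself, but it would have to be copy-pasted into every DAG file (the directory names follow the structure shown above; the final import is a hypothetical example):

```python
# Top of every DAG file -- the boilerplate we want to avoid.
import sys
from pathlib import Path

# The DAG file lives somewhere under {user_code_root}/dags/, so walk upwards
# until we find the directory that contains both "dags" and "airflow_lib".
_here = Path(__file__).resolve()
for _parent in _here.parents:
    if (_parent / "airflow_lib").is_dir() and (_parent / "dags").is_dir():
        if str(_parent) not in sys.path:
            # Prepend so this bundle's airflow_lib wins over any other
            # bundle root that happens to be on sys.path already.
            sys.path.insert(0, str(_parent))
        break

from airflow_lib.constants import settings  # hypothetical shared-library module
```

Because the lookup is relative to `__file__`, the same snippet resolves to `.../tracking_repo/{user_code_root}` on the DAG Processor and `.../version/{commit_id}/{user_code_root}` on the Worker, which is exactly the per-file logic we would rather push down into the infrastructure.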

**Questions:**

1. Is there a way to configure Airflow 3.x to automatically recognize the 
Bundle's specific root (`{user_code_root}`) as a Python source root during 
parsing and execution?

2. Are there any hooks or listeners (like `on_task_instance_running` or DAG 
policies) that are recommended for injecting component-specific paths 
dynamically? A rough sketch of what we have in mind is included after this list.

3. How should we handle the fact that the import root changes between the DAG 
Processor (tracking) and Worker (versioned) while maintaining the same import 
statement `from airflow_lib import ...`?
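
To make question 2 concrete, the sketch below is the rough shape we had in mind: a `dag_policy` cluster policy in `airflow_local_settings.py` that injects the owning bundle's `{user_code_root}`. We are unsure it is the right mechanism, since the policy runs only after the DAG module has already been imported, and it assumes `dag.fileloc` points at the file under `{user_code_root}/dags/` on whichever component parsed it:

```python
# airflow_local_settings.py -- sketch only, not a verified fix for parse-time imports.
import sys
from pathlib import Path


def dag_policy(dag):
    """Cluster policy: add the owning bundle's {user_code_root} to sys.path."""
    fileloc = Path(dag.fileloc).resolve()
    for parent in fileloc.parents:
        # {user_code_root} is the directory that holds both dags/ and airflow_lib/.
        if (parent / "airflow_lib").is_dir() and (parent / "dags").is_dir():
            if str(parent) not in sys.path:
                sys.path.insert(0, str(parent))
            break
```

Even if the timing worked out, this still leaves the collision problem from challenge 2, which is why we would prefer something scoped to the bundle rather than to the whole interpreter.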

We are looking for a clean, infrastructure-level solution that aligns with the 
AIP-66 design philosophy. Thank you!

GitHub link: https://github.com/apache/airflow/discussions/61901
