GitHub user uplsh580 edited a discussion: [Question] Bundle-specific Python 
Path Isolation in Airflow 3.x Git Bundles

### [Question] Bundle-specific Python Path Isolation in Airflow 3.x Git Bundles

**Environment:**
- Airflow Version: 3.1.7
- Deployment: Git Bundles
- Setup: Multi-tenant environment where each Git Repository (Bundle) belongs to 
a specific team.

---

**Context:**
We are operating a multi-tenant Airflow environment. Each team manages their 
own Git repository, which is deployed as a Git Bundle. The directory structure 
is versioned by commit IDs: 
`{BUNDLE_STORAGE_PATH}/{bundle_name}/version/{commit_id}/`

**Assumptions:**
- **One Bundle = One Team:** All DAGs and libraries within a specific 
`{bundle_name}` are maintained by the same team.
- **Internal Shared Library:** Each bundle has an internal library folder 
(e.g., `airflow_lib/`) at the bundle root, intended to be shared across all 
DAGs within that specific bundle and version.

**The Directory Structure:**
```text
{BUNDLE_STORAGE_PATH}/{bundle_name}/version/{commit_id}/
├── airflow_lib/       # Team-specific internal library (Sibling to dags/)
│   ├── __init__.py
│   ├── constants/
│   └── util/
├── dags/              # DAGs folder
│   ├── team_dag_1.py
│   └── sub_dir/       # Nested DAGs
│       └── team_dag_2.py
└── README.md
```

**The Problem:**
When the DAG Processor parses these DAGs or the Worker executes tasks, we face 
a `ModuleNotFoundError: No module named 'airflow_lib'`.
Currently, Airflow only includes the `dags/` directory in `sys.path`. Since 
`airflow_lib` sits at the bundle root, one level above `dags/`, it is not 
importable by default.
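
To make the failure concrete, here is a minimal sketch that reproduces it outside Airflow with a throwaway directory tree. It uses only the standard library; `make_bundle` and `can_import` are hypothetical helpers written for this illustration, the `team_a` / `abc123` names are placeholders, and nothing here reflects how Airflow itself manages `sys.path`.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_bundle(root: Path) -> Path:
    """Create a throwaway bundle version directory (hypothetical helper)."""
    (root / "airflow_lib").mkdir(parents=True)
    (root / "airflow_lib" / "__init__.py").write_text("TEAM = 'team_a'\n")
    (root / "dags").mkdir()
    return root


def can_import(name: str) -> bool:
    """Return True if `name` is importable with the current sys.path."""
    importlib.invalidate_caches()
    sys.modules.pop(name, None)  # force a fresh lookup
    try:
        importlib.import_module(name)
        return True
    except ModuleNotFoundError:
        return False


with tempfile.TemporaryDirectory() as tmp:
    bundle_root = make_bundle(Path(tmp) / "team_a" / "version" / "abc123")
    original_path = list(sys.path)
    try:
        # Only dags/ on the path: the sibling package is not reachable.
        sys.path.insert(0, str(bundle_root / "dags"))
        only_dags = can_import("airflow_lib")

        # Bundle root on the path as well: the import resolves.
        sys.path.insert(0, str(bundle_root))
        with_root = can_import("airflow_lib")
    finally:
        # Restore global state so the demo leaves no trace.
        sys.path[:] = original_path
        sys.modules.pop("airflow_lib", None)

print(only_dags, with_root)
```

The second lookup succeeds only because the bundle root was put on the path by hand, which is exactly the step we would like to happen automatically and per bundle.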

**Key Challenges:**

1. Namespace Collisions across Teams: While one team owns a bundle, different 
teams might use identical names for their internal libraries (e.g., 
`airflow_lib`, `utils`, `common`). Therefore, we cannot add all bundle roots to 
a global `PYTHONPATH`.

2. Strict Versioning: A DAG must only import the `airflow_lib` that belongs to 
its specific `{commit_id}` to ensure consistency.

3. Automated Path Injection: We have hundreds of DAGs across various teams. 
Manually adding `sys.path.append` to every DAG file is not a scalable or clean 
solution.
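
The collision in challenge 1 can be illustrated with plain Python. The sketch below is an assumption-laden demo, not an Airflow mechanism: `make_bundle` and `bundle_scope` are hypothetical helpers, and the team names and commit IDs are placeholders. It shows that with two bundle roots on a shared path the first `airflow_lib` wins for everyone, and that scoping requires both a temporary `sys.path` entry and purging the package from `sys.modules`, because Python caches imported modules by name.

```python
import contextlib
import importlib
import sys
import tempfile
from pathlib import Path


def make_bundle(root: Path, team: str) -> Path:
    """Create a bundle root with its own airflow_lib (hypothetical helper)."""
    pkg = root / "airflow_lib"
    pkg.mkdir(parents=True)
    (pkg / "__init__.py").write_text(f"TEAM = {team!r}\n")
    return root


@contextlib.contextmanager
def bundle_scope(bundle_root: Path, package: str = "airflow_lib"):
    """Expose one bundle root temporarily and drop its package afterwards."""

    def purge() -> None:
        # Remove the package and its submodules from the module cache.
        for name in list(sys.modules):
            if name == package or name.startswith(package + "."):
                del sys.modules[name]

    saved_path = list(sys.path)
    purge()
    sys.path.insert(0, str(bundle_root))
    importlib.invalidate_caches()
    try:
        yield
    finally:
        sys.path[:] = saved_path
        purge()


with tempfile.TemporaryDirectory() as tmp:
    team_a = make_bundle(Path(tmp) / "team_a" / "version" / "abc123", "team_a")
    team_b = make_bundle(Path(tmp) / "team_b" / "version" / "def456", "team_b")

    # Naive approach: both roots on one shared path, so the first one wins.
    saved = list(sys.path)
    sys.path[:0] = [str(team_a), str(team_b)]
    importlib.invalidate_caches()
    naive = importlib.import_module("airflow_lib").TEAM
    sys.path[:] = saved
    sys.modules.pop("airflow_lib", None)

    # Scoped approach: each bundle sees only its own airflow_lib.
    seen = []
    for root in (team_a, team_b):
        with bundle_scope(root):
            seen.append(importlib.import_module("airflow_lib").TEAM)

print(naive, seen)
```

This kind of scoping only holds while a process handles one bundle at a time; it does nothing for two bundles imported concurrently in the same interpreter, which is part of why we are asking for a supported mechanism.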

**Questions:**

1. Does Airflow 3.x/AIP-66 provide a **native mechanism** to automatically add 
the **Bundle Root** to the Python path, scoped only to the DAGs within that 
specific bundle?

2. Is there a way to configure a **"Base Path" for imports** at the Bundle 
level so that the DAG Processor and Workers can resolve sibling packages 
correctly?

3. If this is not yet supported natively, what is the recommended 
"Airflow-native" way to inject this path dynamically without polluting the 
global environment or modifying every DAG file?

We are looking for an architectural pattern that respects bundle isolation 
while allowing for shared internal code. Any advice would be greatly 
appreciated!

GitHub link: https://github.com/apache/airflow/discussions/61901
