uplsh580 opened a new issue, #61037:
URL: https://github.com/apache/airflow/issues/61037
### Description
## Summary
Currently, the Helm chart processes all DAG bundles in a single DAG
Processor deployment. This proposal adds an option to create separate
deployments for each DAG bundle defined in `dagBundleConfigList`, enabling
independent resource isolation and scaling per bundle.
## Motivation
### Problems
1. **Resource Contention**: When multiple DAG bundles run in a single DAG
Processor deployment, if one bundle consumes excessive CPU/memory, it impacts
the parsing performance of other bundles.
2. **Parsing Time Delays**: When a specific bundle has many or complex DAG
files, parsing that bundle takes longer and can delay parsing of other bundles.
3. **Scaling Limitations**: Even when bundles have different resource
requirements or priorities, all bundles must currently be handled by the same
deployment.
4. **Lack of Failure Isolation**: Issues occurring in one bundle can affect
the entire DAG Processor.
### Use Cases
- Loading DAGs from multiple Git repositories (a different bundle per
repository)
- Giving specific bundles higher priority or more resources than others
## Proposed Solution
### Feature Overview
Add a new `deployPerBundle` section to the `dagProcessor` configuration to
create separate Kubernetes deployments for each bundle. When
`deployPerBundle.enabled` is set to `true`, a separate deployment will be
created for each bundle in `dagBundleConfigList`.
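
As a rough illustration of the toggle described above, the following Python sketch lists which deployments would exist for a given `dagProcessor` values block. The deployment names (`dag-processor`, `dag-processor-<bundle>`) and the second bundle `team-a` in the usage note are assumptions made for illustration only; the proposal does not specify a naming scheme.

```python
from typing import Any


def planned_deployments(dag_processor: dict[str, Any]) -> list[str]:
    """Return the deployment names that would be created for these values.

    The base name and the "<base>-<bundle>" pattern are hypothetical;
    the proposal does not define how per-bundle deployments are named.
    """
    base = "dag-processor"  # hypothetical base name
    per_bundle = dag_processor.get("deployPerBundle", {})
    if not per_bundle.get("enabled", False):
        # Default behaviour: one shared deployment handles every bundle.
        return [base]
    bundles = dag_processor.get("dagBundleConfigList", [])
    # One deployment per entry in dagBundleConfigList.
    return [f"{base}-{bundle['name']}" for bundle in bundles]
```

With two bundles named `dags-folder` and `team-a`, enabling the option would yield two entries in this sketch, while leaving it disabled yields the single shared deployment.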
### Implementation Details
#### 1. values.yaml Changes
```yaml
dagProcessor:
  enabled: ~
  dagBundleConfigList:
    - name: dags-folder
      classpath: "airflow.dag_processing.bundles.local.LocalDagBundle"
      kwargs: {}
    # ... more bundles
  # Per-bundle deployment option
  # When enabled, creates a separate deployment for each bundle in dagBundleConfigList
  deployPerBundle:
    enabled: false
    # Command args template for per-bundle deployments
    # {{ bundleName }} will be replaced with the actual bundle name
    args: ["bash", "-c", "exec airflow dag-processor -B {{ bundleName }}"]
    # Per-bundle specific overrides (optional)
    bundleOverrides:
      dags-folder:
        replicas: 2
        resources:
          requests:
            memory: "2Gi"
            cpu: "1000m"
```
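
To make the `{{ bundleName }}` substitution concrete, here is a minimal Python sketch of how the `args` template could be expanded for one bundle. The helper `render_args` is hypothetical and only mimics the intended result; the chart itself would do this in Helm templates, and the exact mechanism is not defined in this proposal.

```python
import re

# Matches "{{ bundleName }}" with optional whitespace inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*bundleName\s*\}\}")


def render_args(args_template: list[str], bundle_name: str) -> list[str]:
    """Replace the bundleName placeholder in every element of the args list."""
    # A function replacement avoids treating backslashes in the name as escapes.
    return [_PLACEHOLDER.sub(lambda _match: bundle_name, arg) for arg in args_template]


template = ["bash", "-c", "exec airflow dag-processor -B {{ bundleName }}"]
print(render_args(template, "dags-folder"))
```

Each per-bundle deployment would then run a DAG Processor restricted to its own bundle through the `-B` argument.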
### Backward Compatibility
- When `deployPerBundle.enabled: false` (default), existing behavior is
maintained
- Per-bundle deployments are only created when `deployPerBundle.enabled: true`
- Existing `dagProcessor` settings (like `replicas`, `resources`, etc.) are
used as defaults for per-bundle deployments
- When `deployPerBundle.enabled` is `true`, `deployPerBundle.args` replaces
the default `args`
- Bundle-specific configuration overrides are possible via
`deployPerBundle.bundleOverrides`
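
The defaults-plus-overrides behaviour in the list above can be sketched as follows. This assumes a recursive (deep) merge, so that overriding `resources.requests` leaves other `resources` keys untouched; the proposal does not state whether the merge is deep or shallow, and `effective_settings` is a hypothetical helper, not part of the chart.

```python
from copy import deepcopy
from typing import Any


def deep_merge(defaults: dict[str, Any], overrides: dict[str, Any]) -> dict[str, Any]:
    """Return defaults with overrides applied recursively; inputs are not mutated."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def effective_settings(dag_processor: dict[str, Any], bundle_name: str) -> dict[str, Any]:
    """Compute the settings for one bundle's deployment (assumed deep-merge semantics)."""
    # Only the settings named in the proposal are treated as defaults here.
    defaults = {k: dag_processor[k] for k in ("replicas", "resources") if k in dag_processor}
    overrides = (
        dag_processor.get("deployPerBundle", {})
        .get("bundleOverrides", {})
        .get(bundle_name, {})
    )
    return deep_merge(defaults, overrides)
```

A bundle with no entry in `bundleOverrides` simply inherits the existing `dagProcessor` settings unchanged.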
## Benefits
1. **Resource Isolation**: Each bundle runs in independent pods, preventing
resource contention
2. **Independent Scaling**: Different replica counts can be set per bundle
3. **Failure Isolation**: A failure while processing one bundle does not affect the DAG Processors of other bundles
4. **Flexible Resource Allocation**: Different resource requests/limits can
be configured per bundle
5. **Easier Monitoring**: Metrics and logs can be separated and tracked per
bundle
### Are you willing to submit a PR?
- [x] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)