jason810496 opened a new pull request, #59805:
URL: https://github.com/apache/airflow/pull/59805
## Why
Before this refactor, no matter which `airflow` command is executed,
`cli_parser` imports the configured AuthManager and Executor classes just to
call `get_cli_commands` and collect their optional commands.
This means that every CLI invocation, even `airflow --help`, imports heavy
modules such as kubernetes and flask_appbuilder
(depending on the `AIRFLOW__CORE__AUTH_MANAGER` and `AIRFLOW__CORE__EXECUTOR` config).
In the worst case (FabAuthManager + CeleryKubernetesExecutor), it takes
approximately 5 seconds just to show the `airflow --help` output, according to
the benchmark.
## How
The current `cli_parser` works as follows:
```python
# Every configured executor class is imported eagerly, which pulls in its dependencies.
for executor_name in ExecutorLoader.get_executor_names(validate_teams=False):
    try:
        executor, _ = ExecutorLoader.import_executor_cls(executor_name)
        airflow_commands.extend(executor.get_cli_commands())
    except Exception:
        # ....
# The configured auth manager class is imported eagerly as well.
try:
    auth_mgr = get_auth_manager_cls()
    airflow_commands.extend(auth_mgr.get_cli_commands())
except Exception:
    # ....
```
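For contrast with the eager imports above, here is a minimal sketch of the general deferred-import idea: command names are listed from declarative data, and a module is imported only when its command is actually requested. This is not the code in this PR. `LAZY_COMMANDS`, `list_commands`, and `load_command` are hypothetical names, and the standard-library targets merely stand in for provider CLI entry points.

```python
import importlib
from typing import Callable

# Hypothetical registry: command name -> "module:attribute" path.
# Standard-library targets stand in for heavy provider modules.
LAZY_COMMANDS: dict[str, str] = {
    "dumps": "json:dumps",
    "sqrt": "math:sqrt",
}


def list_commands() -> list[str]:
    """Return the known command names without importing any target module."""
    return sorted(LAZY_COMMANDS)


def load_command(name: str) -> Callable:
    """Import the target module only when the command is actually needed."""
    module_path, _, attr = LAZY_COMMANDS[name].partition(":")
    module = importlib.import_module(module_path)
    return getattr(module, attr)
```

With this shape, building a help listing only needs `list_commands()`, so the cost of importing a command's dependencies is paid when `load_command(name)` is called for that command.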
## What
- Introduce `cli` section in provider metadata
- The CLI definition of
## Benchmark Result
**Summary:**
- Overall average: 3.117s, down 0.931s from 4.048s (22.999% improvement).
- Fastest time: 3.092s, down 0.474s from 3.566s (13.292% improvement).
- Slowest time: 3.155s, down 1.851s from 5.006s (36.976% improvement).
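The percentages in this summary are the relative reduction `(before - after) / before`. The short sketch below recomputes them from the figures quoted above; the helper name `improvement` is only illustrative.

```python
def improvement(before: float, after: float) -> tuple[float, float]:
    """Return (seconds saved, relative improvement in percent)."""
    saved = before - after
    return saved, saved / before * 100


# (label, before refactor, after refactor), in seconds, taken from the summary.
rows = [
    ("Overall average", 4.048, 3.117),
    ("Fastest time", 3.566, 3.092),
    ("Slowest time", 5.006, 3.155),
]

for label, before, after in rows:
    saved, pct = improvement(before, after)
    print(f"{label}: {after:.3f}s, down {saved:.3f}s from {before:.3f}s ({pct:.3f}% improvement)")
```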
<details>
<summary> Full Airflow CLI Latency Benchmark - After Refactor </summary>
Benchmark results for `airflow --help` command with different Auth Manager
and Executor combinations.
Total combinations tested: 32
## Results Table
| Auth Manager | Executor | Avg Time (s) | Min Time (s) | Status |
|--------------|----------|--------------|--------------|--------|
| Default | Default | 3.133 | 3.072 | ✅ |
| Default | LocalExecutor | 3.112 | 3.075 | ✅ |
| Default | SequentialExecutor | 3.116 | 3.084 | ✅ |
| Default | AwsEcsExecutor | 3.151 | 3.141 | ✅ |
| Default | CeleryExecutor | 3.119 | 3.072 | ✅ |
| Default | CeleryKubernetesExecutor | 3.111 | 3.074 | ✅ |
| Default | KubernetesExecutor | 3.115 | 3.083 | ✅ |
| Default | EdgeExecutor | 3.096 | 3.077 | ✅ |
| AwsAuthManager | Default | 3.107 | 3.085 | ✅ |
| AwsAuthManager | LocalExecutor | 3.112 | 3.071 | ✅ |
| AwsAuthManager | SequentialExecutor | 3.135 | 3.123 | ✅ |
| AwsAuthManager | AwsEcsExecutor | 3.111 | 3.073 | ✅ |
| AwsAuthManager | CeleryExecutor | 3.104 | 3.082 | ✅ |
| AwsAuthManager | CeleryKubernetesExecutor | 3.099 | 3.076 | ✅ |
| AwsAuthManager | KubernetesExecutor | 3.124 | 3.105 | ✅ |
| AwsAuthManager | EdgeExecutor | 3.122 | 3.107 | ✅ |
| FabAuthManager | Default | 3.130 | 3.114 | ✅ |
| FabAuthManager | LocalExecutor | 3.115 | 3.076 | ✅ |
| FabAuthManager | SequentialExecutor | 3.102 | 3.082 | ✅ |
| FabAuthManager | AwsEcsExecutor | 3.108 | 3.082 | ✅ |
| FabAuthManager | CeleryExecutor | 3.110 | 3.073 | ✅ |
| FabAuthManager | CeleryKubernetesExecutor | 3.111 | 3.080 | ✅ |
| FabAuthManager | KubernetesExecutor | 3.101 | 3.071 | ✅ |
| FabAuthManager | EdgeExecutor | 3.128 | 3.086 | ✅ |
| KeycloakAuthManager | Default | 3.139 | 3.082 | ✅ |
| KeycloakAuthManager | LocalExecutor | 3.110 | 3.076 | ✅ |
| KeycloakAuthManager | SequentialExecutor | 3.092 | 3.073 | ✅ |
| KeycloakAuthManager | AwsEcsExecutor | 3.113 | 3.081 | ✅ |
| KeycloakAuthManager | CeleryExecutor | 3.112 | 3.079 | ✅ |
| KeycloakAuthManager | CeleryKubernetesExecutor | 3.120 | 3.106 | ✅ |
| KeycloakAuthManager | KubernetesExecutor | 3.121 | 3.074 | ✅ |
| KeycloakAuthManager | EdgeExecutor | 3.155 | 3.123 | ✅ |
## Summary Statistics
- **Successful combinations**: 32/32
- **Overall average time**: 3.117s
- **Fastest time**: 3.092s
- **Slowest time**: 3.155s
---
*Note: Each combination was run 3 times and averaged.*
</details>
<details>
<summary> Full Airflow CLI Latency Benchmark - Before Refactor </summary>
Benchmark results for `airflow --help` command with different Auth Manager
and Executor combinations.
Total combinations tested: 32
## Results Table
| Auth Manager | Executor | Avg Time (s) | Min Time (s) | Status |
|--------------|----------|--------------|--------------|--------|
| Default | Default | 3.610 | 3.570 | ✅ |
| Default | LocalExecutor | 3.566 | 3.556 | ✅ |
| Default | SequentialExecutor | 3.617 | 3.561 | ✅ |
| Default | AwsEcsExecutor | 3.746 | 3.741 | ✅ |
| Default | CeleryExecutor | 3.578 | 3.567 | ✅ |
| Default | CeleryKubernetesExecutor | 4.761 | 4.748 | ✅ |
| Default | KubernetesExecutor | 4.715 | 4.687 | ✅ |
| Default | EdgeExecutor | 3.968 | 3.919 | ✅ |
| AwsAuthManager | Default | 3.760 | 3.727 | ✅ |
| AwsAuthManager | LocalExecutor | 3.721 | 3.718 | ✅ |
| AwsAuthManager | SequentialExecutor | 3.717 | 3.712 | ✅ |
| AwsAuthManager | AwsEcsExecutor | 3.762 | 3.739 | ✅ |
| AwsAuthManager | CeleryExecutor | 3.785 | 3.729 | ✅ |
| AwsAuthManager | CeleryKubernetesExecutor | 4.954 | 4.923 | ✅ |
| AwsAuthManager | KubernetesExecutor | 4.915 | 4.890 | ✅ |
| AwsAuthManager | EdgeExecutor | 4.067 | 4.042 | ✅ |
| FabAuthManager | Default | 3.813 | 3.783 | ✅ |
| FabAuthManager | LocalExecutor | 3.796 | 3.790 | ✅ |
| FabAuthManager | SequentialExecutor | 3.784 | 3.774 | ✅ |
| FabAuthManager | AwsEcsExecutor | 3.960 | 3.952 | ✅ |
| FabAuthManager | CeleryExecutor | 3.813 | 3.804 | ✅ |
| FabAuthManager | CeleryKubernetesExecutor | 5.006 | 4.982 | ✅ |
| FabAuthManager | KubernetesExecutor | 4.981 | 4.965 | ✅ |
| FabAuthManager | EdgeExecutor | 4.108 | 4.095 | ✅ |
| KeycloakAuthManager | Default | 3.646 | 3.626 | ✅ |
| KeycloakAuthManager | LocalExecutor | 3.654 | 3.625 | ✅ |
| KeycloakAuthManager | SequentialExecutor | 3.632 | 3.617 | ✅ |
| KeycloakAuthManager | AwsEcsExecutor | 3.802 | 3.799 | ✅ |
| KeycloakAuthManager | CeleryExecutor | 3.637 | 3.625 | ✅ |
| KeycloakAuthManager | CeleryKubernetesExecutor | 4.853 | 4.820 | ✅ |
| KeycloakAuthManager | KubernetesExecutor | 4.829 | 4.802 | ✅ |
| KeycloakAuthManager | EdgeExecutor | 3.994 | 3.962 | ✅ |
## Summary Statistics
- **Successful combinations**: 32/32
- **Overall average time**: 4.048s
- **Fastest time**: 3.566s
- **Slowest time**: 5.006s
---
*Note: Each combination was run 3 times and averaged.*
</details>
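A measurement of this kind could be reproduced roughly as sketched below. This is not the script used to produce the tables above. It assumes that wall-clock time of a subprocess is an acceptable proxy for CLI latency, and `time_command` is a hypothetical helper name. The default of 3 runs mirrors the note in the benchmark output.

```python
import os
import statistics
import subprocess
import sys
import time


def time_command(
    cmd: list[str], env_overrides: dict[str, str], runs: int = 3
) -> tuple[float, float]:
    """Run `cmd` `runs` times and return (average, minimum) wall-clock seconds."""
    # Layer the overrides on top of the current environment.
    env = {**os.environ, **env_overrides}
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            env=env,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=True,  # raise if the command exits with a non-zero status
        )
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings), min(timings)
```

For example, `time_command(["airflow", "--help"], {"AIRFLOW__CORE__EXECUTOR": "LocalExecutor"})` would time one row of the table, provided Airflow is installed in the active environment. The auth manager can be varied the same way through `AIRFLOW__CORE__AUTH_MANAGER`.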
## Output Difference
<details>
<summary> `airflow --help` output after refactor </summary>
```
Usage: airflow [-h] GROUP_OR_COMMAND ...
Positional Arguments:
GROUP_OR_COMMAND
Groups
assets Manage assets
aws-auth-manager Manage resources used by AWS auth manager
backfill Manage backfills
celery Celery components
config View configuration
connections Manage connections
dags Manage DAGs
db Database operations
db-manager Manage externally connected database managers
edge Edge Worker components
fab-db Manage FAB
jobs Manage jobs
keycloak-auth-manager
Manage resources used by Keycloak auth manager
kubernetes Tools to help run the KubernetesExecutor
pools Manage pools
providers Display providers
roles Manage roles
tasks Manage tasks
teams Manage teams
users Manage users
variables Manage variables
Commands:
api-server Start an Airflow API server instance
cheat-sheet Display cheat sheet
dag-processor Start a dag processor instance
info Show information about current Airflow and
environment
kerberos Start a kerberos ticket renewer
permissions-cleanup
Clean up DAG permissions in Flask-AppBuilder tables
plugins Dump information about loaded plugins
rotate-fernet-key
Rotate encrypted connection credentials and variables
scheduler Start a scheduler instance
standalone Run an all-in-one copy of Airflow
sync-perm Update permissions for existing roles and optionally
DAGs
triggerer Start a triggerer instance
version Show the version
Options:
-h, --help show this help message and exit
```
</details>
<details>
<summary> `airflow --help` output before refactor </summary>
```
Usage: airflow [-h] GROUP_OR_COMMAND ...
Positional Arguments:
GROUP_OR_COMMAND
Groups
assets Manage assets
backfill Manage backfills
config View configuration
connections Manage connections
dags Manage DAGs
db Database operations
db-manager Manage externally connected database managers
jobs Manage jobs
pools Manage pools
providers Display providers
tasks Manage tasks
teams Manage teams
variables Manage variables
Commands:
api-server Start an Airflow API server instance
cheat-sheet Display cheat sheet
dag-processor Start a dag processor instance
info Show information about current Airflow and environment
kerberos Start a kerberos ticket renewer
plugins Dump information about loaded plugins
rotate-fernet-key
Rotate encrypted connection credentials and variables
scheduler Start a scheduler instance
standalone Run an all-in-one copy of Airflow
triggerer Start a triggerer instance
version Show the version
Options:
-h, --help show this help message and exit
```
</details>