softyoungha commented on code in PR #40942:
URL: https://github.com/apache/airflow/pull/40942#discussion_r1688248481
##########
airflow/utils/process_utils.py:
##########
@@ -162,14 +162,17 @@ def signal_procs(sig):
return returncodes
-def execute_in_subprocess(cmd: list[str], cwd: str | None = None) -> None:
+def execute_in_subprocess(cmd: list[str], cwd: str | None = None, env_vars:
dict | None = None) -> None:
"""
Execute a process and stream output to logger.
:param cmd: command and arguments to run
:param cwd: Current working directory passed to the Popen constructor
+ :param env_vars: Additional environment variables to set for the
subprocess. If None,
+ the subprocess will inherit the current environment variables of the
parent process
+ (``os.environ``)
"""
- execute_in_subprocess_with_kwargs(cmd, cwd=cwd)
+ execute_in_subprocess_with_kwargs(cmd, cwd=cwd, env=dict(os.environ,
**env_vars) if env_vars else None)
Review Comment:
As you suggested, it seems better to add it.
Simply assigning `ArgNotSet` to `env_vars` value may not be sufficient for
handling all cases.
I will modify it by adding flags such as `inherit_env_vars` or
`override_env_vars` with default True
(until now, the code always fetched from `os.environ`)
> I excluded cases you mentioned because I thought the
`--system-site-package` option worked based on `PATH`. Ah, I just found out
that it is controlled by site now :)
<details>
<summary>Used sample codes</summary>
```python
import os
import sys
print(sys.executable)
##### output
#
$HOME/.local/share/hatch/env/virtual/apache-airflow/4VQ7OEnp/airflow-310/bin/python
from airflow.utils.process_utils import execute_in_subprocess_with_kwargs
execute_in_subprocess_with_kwargs(
["bash", "-c", "env"], env={"MY_ENV": "VALUE"}
)
execute_in_subprocess_with_kwargs(
["bash", "-c", "echo $PATH"],
env={} # empty environment
)
# [2024-07-23T23:58:07.097+0900] {process_utils.py:190} INFO - Output:
# [2024-07-23T23:58:07.098+0900] {process_utils.py:194} INFO -
PWD=$HOME/projects/airflow-main
# [2024-07-23T23:58:07.098+0900] {process_utils.py:194} INFO - MY_ENV=VALUE
# [2024-07-23T23:58:07.098+0900] {process_utils.py:194} INFO - SHLVL=0
# create venv
execute_in_subprocess_with_kwargs(
[sys.executable, "-m", "venv", "myenv", "--system-site-packages"]
)
# check site packages
execute_in_subprocess_with_kwargs(
[f"{os.getcwd()}/myenv/bin/python", "-c", "import site;
print(site.getsitepackages())"],
env={} # empty environment
)
# [2024-07-23T23:58:08.020+0900] {process_utils.py:190} INFO - Output:
# [2024-07-23T23:58:08.030+0900] {process_utils.py:194} INFO - [
# '$HOME/projects/airflow-main/myenv/lib/python3.10/site-packages',
# '$HOME/projects/airflow-main/myenv/local/lib/python3.10/dist-packages',
# '$HOME/projects/airflow-main/myenv/lib/python3/dist-packages',
# '$HOME/projects/airflow-main/myenv/lib/python3.10/dist-packages',
# '/usr/lib/python3.10/site-packages',
# '/usr/local/lib/python3.10/dist-packages',
# '/usr/lib/python3/dist-packages',
# '/usr/lib/python3.10/dist-packages'
# ]
# pip show apache-airflow
execute_in_subprocess_with_kwargs(
[f"{os.getcwd()}/myenv/bin/python", "-m", "pip", "show",
"apache-airflow"],
env={} # empty environment
)
# [2024-07-24T00:08:34.668+0900] {process_utils.py:190} INFO - Output:
# [2024-07-24T00:08:36.171+0900] {process_utils.py:194} INFO - Name:
apache-airflow
# [2024-07-24T00:08:36.171+0900] {process_utils.py:194} INFO - Version:
2.10.0.dev0
# [2024-07-24T00:08:36.171+0900] {process_utils.py:194} INFO - Summary:
Programmatically author, schedule and monitor data pipelines
# [2024-07-24T00:08:36.171+0900] {process_utils.py:194} INFO - Home-page:
# [2024-07-24T00:08:36.172+0900] {process_utils.py:194} INFO - Author:
# [2024-07-24T00:08:36.172+0900] {process_utils.py:194} INFO - Author-email:
Apache Software Foundation <[email protected]>
# [2024-07-24T00:08:36.172+0900] {process_utils.py:194} INFO - License:
# [2024-07-24T00:08:36.172+0900] {process_utils.py:194} INFO - Location:
$HOME/.local/lib/python3.10/site-packages
# [2024-07-24T00:08:36.172+0900] {process_utils.py:194} INFO - Requires: ....
```
</details>
##########
airflow/utils/process_utils.py:
##########
@@ -162,14 +162,17 @@ def signal_procs(sig):
return returncodes
-def execute_in_subprocess(cmd: list[str], cwd: str | None = None) -> None:
+def execute_in_subprocess(cmd: list[str], cwd: str | None = None, env_vars:
dict | None = None) -> None:
"""
Execute a process and stream output to logger.
:param cmd: command and arguments to run
:param cwd: Current working directory passed to the Popen constructor
+ :param env_vars: Additional environment variables to set for the
subprocess. If None,
+ the subprocess will inherit the current environment variables of the
parent process
+ (``os.environ``)
"""
- execute_in_subprocess_with_kwargs(cmd, cwd=cwd)
+ execute_in_subprocess_with_kwargs(cmd, cwd=cwd, env=dict(os.environ,
**env_vars) if env_vars else None)
Review Comment:
@uranusjr
As you suggested, it seems better to add it.
Simply assigning `ArgNotSet` to `env_vars` value may not be sufficient for
handling all cases.
I will modify it by adding flags such as `inherit_env_vars` or
`override_env_vars` with default True
(until now, the code always fetched from `os.environ`)
> I excluded cases you mentioned because I thought the
`--system-site-package` option worked based on `PATH`. Ah, I just found out
that it is controlled by site now :)
<details>
<summary>Used sample codes</summary>
```python
import os
import sys
print(sys.executable)
##### output
#
$HOME/.local/share/hatch/env/virtual/apache-airflow/4VQ7OEnp/airflow-310/bin/python
from airflow.utils.process_utils import execute_in_subprocess_with_kwargs
execute_in_subprocess_with_kwargs(
["bash", "-c", "env"], env={"MY_ENV": "VALUE"}
)
execute_in_subprocess_with_kwargs(
["bash", "-c", "echo $PATH"],
env={} # empty environment
)
# [2024-07-23T23:58:07.097+0900] {process_utils.py:190} INFO - Output:
# [2024-07-23T23:58:07.098+0900] {process_utils.py:194} INFO -
PWD=$HOME/projects/airflow-main
# [2024-07-23T23:58:07.098+0900] {process_utils.py:194} INFO - MY_ENV=VALUE
# [2024-07-23T23:58:07.098+0900] {process_utils.py:194} INFO - SHLVL=0
# create venv
execute_in_subprocess_with_kwargs(
[sys.executable, "-m", "venv", "myenv", "--system-site-packages"]
)
# check site packages
execute_in_subprocess_with_kwargs(
[f"{os.getcwd()}/myenv/bin/python", "-c", "import site;
print(site.getsitepackages())"],
env={} # empty environment
)
# [2024-07-23T23:58:08.020+0900] {process_utils.py:190} INFO - Output:
# [2024-07-23T23:58:08.030+0900] {process_utils.py:194} INFO - [
# '$HOME/projects/airflow-main/myenv/lib/python3.10/site-packages',
# '$HOME/projects/airflow-main/myenv/local/lib/python3.10/dist-packages',
# '$HOME/projects/airflow-main/myenv/lib/python3/dist-packages',
# '$HOME/projects/airflow-main/myenv/lib/python3.10/dist-packages',
# '/usr/lib/python3.10/site-packages',
# '/usr/local/lib/python3.10/dist-packages',
# '/usr/lib/python3/dist-packages',
# '/usr/lib/python3.10/dist-packages'
# ]
# pip show apache-airflow
execute_in_subprocess_with_kwargs(
[f"{os.getcwd()}/myenv/bin/python", "-m", "pip", "show",
"apache-airflow"],
env={} # empty environment
)
# [2024-07-24T00:08:34.668+0900] {process_utils.py:190} INFO - Output:
# [2024-07-24T00:08:36.171+0900] {process_utils.py:194} INFO - Name:
apache-airflow
# [2024-07-24T00:08:36.171+0900] {process_utils.py:194} INFO - Version:
2.10.0.dev0
# [2024-07-24T00:08:36.171+0900] {process_utils.py:194} INFO - Summary:
Programmatically author, schedule and monitor data pipelines
# [2024-07-24T00:08:36.171+0900] {process_utils.py:194} INFO - Home-page:
# [2024-07-24T00:08:36.172+0900] {process_utils.py:194} INFO - Author:
# [2024-07-24T00:08:36.172+0900] {process_utils.py:194} INFO - Author-email:
Apache Software Foundation <[email protected]>
# [2024-07-24T00:08:36.172+0900] {process_utils.py:194} INFO - License:
# [2024-07-24T00:08:36.172+0900] {process_utils.py:194} INFO - Location:
$HOME/.local/lib/python3.10/site-packages
# [2024-07-24T00:08:36.172+0900] {process_utils.py:194} INFO - Requires: ....
```
</details>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]