mrkaye97 opened a new issue, #23176:
URL: https://github.com/apache/airflow/issues/23176
### What do you see as an issue?
(Title)
Arguments passed as `op_kwargs` seem to be eagerly evaluated: they are
evaluated at _parse_ time rather than at _runtime_.
In practice, this means that if one of those arguments is (e.g.) a call to a
function that does some heavy lifting, connects to a database, etc., all of
that work happens on every _parse_ of the DAG.
See repro:
```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

with DAG("bar") as dag:

    def do_a_thing(x):
        print(x + 1)  # side effect, which should error at runtime
        return int(x)

    repro = PythonOperator(
        task_id="do-error-thing",
        python_callable=lambda x: x + 1,
        op_kwargs={
            "x": do_a_thing("2")
        },
    )

    repro
```
Parsing this DAG using `astrocloud dev parse` throws the following:
```text
061460-test-1 | ============================= test session starts ==============================
061460-test-1 | platform linux -- Python 3.9.7, pytest-7.1.1, pluggy-1.0.0
061460-test-1 | rootdir: /usr/local/airflow
061460-test-1 | plugins: anyio-3.5.0
061460-test-1 | collected 2 items
061460-test-1 |
061460-test-1 | .astrocloud/test_dag_integrity_default.py .F [100%]
061460-test-1 |
061460-test-1 | =================================== FAILURES ===================================
061460-test-1 | _______________________ test_file_imports[dags/test3.py] _______________________
061460-test-1 |
061460-test-1 | rel_path = 'dags/test3.py'
061460-test-1 | rv = 'Traceback (most recent call last):\n File "/usr/local/airflow/dags/test3.py", line 18, in <module>\n "x": do_a_th...irflow/dags/test3.py", line 8, in do_a_thing\n print(x + 1)\nTypeError: can only concatenate str (not "int") to str'
061460-test-1 |
061460-test-1 | @pytest.mark.parametrize("rel_path,rv", get_import_errors(), ids=[x[0] for x in get_import_errors()])
061460-test-1 | def test_file_imports(rel_path,rv):
061460-test-1 | """ Test for import errors on a file """
061460-test-1 | if rel_path and rv : #Make sure our no op test doesn't raise an error
061460-test-1 | > raise Exception(f"{rel_path} failed to import with message \n {rv}")
061460-test-1 | E Exception: dags/test3.py failed to import with message
061460-test-1 | E Traceback (most recent call last):
061460-test-1 | E File "/usr/local/airflow/dags/test3.py", line 18, in <module>
061460-test-1 | E "x": do_a_thing("2")
061460-test-1 | E File "/usr/local/airflow/dags/test3.py", line 8, in do_a_thing
061460-test-1 | E print(x + 1)
061460-test-1 | E TypeError: can only concatenate str (not "int") to str
061460-test-1 |
061460-test-1 | .astrocloud/test_dag_integrity_default.py:64: Exception
061460-test-1 | =========================== short test summary info ============================
061460-test-1 | FAILED .astrocloud/test_dag_integrity_default.py::test_file_imports[dags/test3.py]
061460-test-1 | ========================= 1 failed, 1 passed in 3.44s ==========================
```
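The traceback points at the dictionary literal itself, which suggests the same thing can be reproduced without Airflow. Here is a minimal sketch in plain Python; the `calls` list is a hypothetical helper added only to record when `do_a_thing` runs, and `do_a_thing` is simplified so that it does not raise:

```python
calls = []  # hypothetical helper: records each invocation of do_a_thing


def do_a_thing(x):
    calls.append(x)  # stands in for a side effect such as a database call
    return int(x)


# Building the dict literal evaluates every value expression immediately,
# so do_a_thing has already run by the time op_kwargs exists.
op_kwargs = {"x": do_a_thing("2")}

print(calls)
print(op_kwargs)
```

Nothing here has "executed a task", yet the function has already been called once, simply because the module body was run.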
### Solving the problem
I'm unsure how the problem can be "solved".
I wasn't aware of this behavior, and I'm not sure whether this is something
Airflow can do anything about, or whether it's just a deeply buried aspect of
Python as a language. It would still be good to document, as a best practice,
that function calls like this should not be made inside `op_kwargs`. This
seems to be in line with not putting things like database connections at the
top level: if this dictionary is instantiated at parse time, then its contents
are effectively living at the top level even if they appear to be nested
inside an operator.
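
One possible shape for such a best practice, sketched in plain Python so it runs without Airflow: keep only cheap literals in `op_kwargs` and move the expensive call inside the callable. The names `expensive_lookup`, `task_callable`, `calls`, and `calls_at_parse_time` are hypothetical, and the final call only simulates what would happen when the task runs.

```python
calls = []  # hypothetical helper: records when the expensive function runs


def expensive_lookup(x):
    calls.append(x)  # stands in for heavy work such as a database query
    return int(x)


def task_callable(x):
    # The expensive call lives inside the callable, so it is deferred
    # until the callable itself is invoked.
    return expensive_lookup(x) + 1


# Only a plain literal is evaluated while the module is being parsed.
op_kwargs = {"x": "2"}
calls_at_parse_time = list(calls)  # snapshot taken before any "run"

# Simulates the task running later with the stored keyword arguments.
result = task_callable(**op_kwargs)
```

With `PythonOperator`, the equivalent would presumably be passing `python_callable=task_callable` and `op_kwargs={"x": "2"}`, so that parsing the DAG file never triggers `expensive_lookup`.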
### Anything else
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)