uranusjr opened a new pull request #22359:
URL: https://github.com/apache/airflow/pull/22359
Previously, XComArg uses hasattr duck-typing to determine if it needs to
recurse into an object's `template_fields`. However, this errornously lets it
calls `__getattr__` on certain objects that raises unexpected errors.
Instead of relying on duck-typing, this patch makes the check strictly uses
inheritance, making XComArg only recurse into `template_fields` if the object
is an operator. This avoids `__getattr__` being unexpectedly called, and thus
making the logic work better with other parts of the code base.
The bug can be triggered by the following DAG:
```python
@task
def a_list():
return [3, 6, 9]
@task
def ref_context(num, **context):
pass
with DAG(dag_id="ref_context", start_date=datetime(1970, 1, 1)) as dag:
ref_context.expand(num=a_list())
```
The following exception is raised:
```
Traceback (most recent call last):
File "/Users/matt/src/airflow/airflow/decorators/base.py", line 172, in
execute
return_value = super().execute(context)
File "/Users/matt/src/airflow/airflow/operators/python.py", line 169, in
execute
self.op_kwargs = self.determine_kwargs(context)
File "/Users/matt/src/airflow/airflow/models/baseoperator.py", line 1005,
in __setattr__
self.set_xcomargs_dependencies()
File "/Users/matt/src/airflow/airflow/models/baseoperator.py", line 1112,
in set_xcomargs_dependencies
XComArg.apply_upstream_relationship(self, arg)
File "/Users/matt/src/airflow/airflow/models/xcom_arg.py", line 169, in
apply_upstream_relationship
XComArg.apply_upstream_relationship(op, elem)
File "/Users/matt/src/airflow/airflow/models/xcom_arg.py", line 169, in
apply_upstream_relationship
XComArg.apply_upstream_relationship(op, elem)
File "/Users/matt/src/airflow/airflow/models/xcom_arg.py", line 170, in
apply_upstream_relationship
elif hasattr(arg, "template_fields"):
File "/Users/matt/src/airflow/airflow/utils/context.py", line 100, in
__getattr__
self.var = Variable.get(key, deserialize_json=self._deserialize_json)
File "/Users/matt/src/airflow/airflow/models/variable.py", line 138, in get
raise KeyError(f'Variable {key} does not exist')
KeyError: 'Variable template_fields does not exist'
```
because the `hasattr` call triggers
https://github.com/apache/airflow/blob/ecc5b74528ed7e4ecf05c526feb2c0c85f463429/airflow/utils/context.py#L97-L101
But `Variable.get()` raises an unexpected `KeyError`.
Note: The same problem also exists for `ConnectionAccessor`;
`Connection.get_connection_from_secrets()` raises `AirflowNotFoundException`
when a connection is not found. An alternative fix is to catch these exceptions
and re-raise `AttributeError` in those `__getattr__` implementations (which is
caught by `hasattr` to return False), but IMO the current fix proposed in this
PR is better since it guards against future custom objects.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]