uranusjr opened a new pull request #22359:
URL: https://github.com/apache/airflow/pull/22359


   Previously, XComArg uses hasattr duck-typing to determine if it needs to 
recurse into an object's `template_fields`. However, this errornously lets it 
calls `__getattr__` on certain objects that raises unexpected errors.
   
   Instead of relying on duck-typing, this patch makes the check strictly uses 
inheritance, making XComArg only recurse into `template_fields` if the object 
is an operator. This avoids `__getattr__` being unexpectedly called, and thus 
making the logic work better with other parts of the code base.
   
   The bug can be triggered by the following DAG:
   
   ```python
   @task
   def a_list():
       return [3, 6, 9]
   ​
   @task
   def ref_context(num, **context):
       pass
   ​
   with DAG(dag_id="ref_context", start_date=datetime(1970, 1, 1)) as dag:
       ref_context.expand(num=a_list())
   ```
   
   The following exception is raised:
   
   ```
   Traceback (most recent call last):
     File "/Users/matt/src/airflow/airflow/decorators/base.py", line 172, in 
execute
       return_value = super().execute(context)
     File "/Users/matt/src/airflow/airflow/operators/python.py", line 169, in 
execute
       self.op_kwargs = self.determine_kwargs(context)
     File "/Users/matt/src/airflow/airflow/models/baseoperator.py", line 1005, 
in __setattr__
       self.set_xcomargs_dependencies()
     File "/Users/matt/src/airflow/airflow/models/baseoperator.py", line 1112, 
in set_xcomargs_dependencies
       XComArg.apply_upstream_relationship(self, arg)
     File "/Users/matt/src/airflow/airflow/models/xcom_arg.py", line 169, in 
apply_upstream_relationship
       XComArg.apply_upstream_relationship(op, elem)
     File "/Users/matt/src/airflow/airflow/models/xcom_arg.py", line 169, in 
apply_upstream_relationship
       XComArg.apply_upstream_relationship(op, elem)
     File "/Users/matt/src/airflow/airflow/models/xcom_arg.py", line 170, in 
apply_upstream_relationship
       elif hasattr(arg, "template_fields"):
     File "/Users/matt/src/airflow/airflow/utils/context.py", line 100, in 
__getattr__
       self.var = Variable.get(key, deserialize_json=self._deserialize_json)
     File "/Users/matt/src/airflow/airflow/models/variable.py", line 138, in get
       raise KeyError(f'Variable {key} does not exist')
   KeyError: 'Variable template_fields does not exist'
   ```
   
   because the `hasattr` call triggers
   
   
https://github.com/apache/airflow/blob/ecc5b74528ed7e4ecf05c526feb2c0c85f463429/airflow/utils/context.py#L97-L101
   
   But `Variable.get()` raises an unexpected `KeyError`.
   
   Note: The same problem also exists for `ConnectionAccessor`; 
`Connection.get_connection_from_secrets()` raises `AirflowNotFoundException` 
when a connection is not found. An alternative fix is to catch these exceptions 
and re-raise `AttributeError` in those `__getattr__` implementations (which is 
caught by `hasattr` to return False), but IMO the current fix proposed in this 
PR is better since it guards against future custom objects.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to