KadeRyu opened a new issue, #33993:
URL: https://github.com/apache/airflow/issues/33993

   ### What do you see as an issue?
   
   In the [Airflow Best Practices 
article](https://airflow.apache.org/docs/apache-airflow/2.5.3/best-practices.html#top-level-python-code),
 I found a paragraph that was confusing to readers:
   
   ---
   Bad example:
   
   ```python
   from airflow.models import Variable
   
   foo_var = Variable.get("foo")  # DON'T DO THAT
   bash_use_variable_bad_1 = BashOperator(
       task_id="bash_use_variable_bad_1", bash_command="echo variable 
foo=${foo_env}", env={"foo_env": foo_var}
   )
   
   bash_use_variable_bad_2 = BashOperator(
       task_id="bash_use_variable_bad_2",
       bash_command=f"echo variable foo=${Variable.get('foo')}",  # DON'T DO 
THAT
   )
   
   bash_use_variable_bad_3 = BashOperator(
       task_id="bash_use_variable_bad_3",
       bash_command="echo variable foo=${foo_env}",
       env={"foo_env": Variable.get("foo")},  # DON'T DO THAT
   )
   ```
   
   Good example:
   ```python
   bash_use_variable_good = BashOperator(
       task_id="bash_use_variable_good",
       bash_command="echo variable foo=${foo_env}",
       env={"foo_env": "{{ var.value.get('foo') }}"},
   )
   ```
   ```python
   @task
   def my_task():
       var = Variable.get("foo")  # this is fine, because func my_task called 
only run task, not scan DAGs.
       print(var)
   ```
   ---
   
   In the "Bad Example" section, it states that `Variable.get("foo")` should 
not be specified in an Operator. Instead, it recommends calling it using a 
Jinja template, as described in the "Good Example" section.
   However, it explains that it is possible to call it in the `@task` operator. 
This might make it seem like users who want to call the `Variable.get()` method 
shouldn't do so in a traditional Operator.
   
   Secondly, to simplify the code, the Operators are not in the DAG and are not 
indented, so there is room for interpretation that the `BashOperators` in the 
bad example are outside the DAG (top-level code), and therefore the 
`Variable.get()` method is also top-level.
   
   
   ### Solving the problem
   
   I think some sentences should be added to explain why `Variable.get()` 
methods called from inside an Operator are a bad example.
   
   I don't know enough about the logic that the scheduler is parsing to make a 
precise suggestion, but I think there are cases like this
   
   - If there is no case where you need to use the `Variable.get()` method 
inside an Operator, and using a Jinja template is absolutely fine: add to the 
documentation why calling `Variable.get()` inside an Operator is recognized as 
top-level code, and calling `Variable.get()` inside a `@task` operator is fine.
   - When it's okay to use the `Variable.get()` method inside an Operator: add 
appropriate examples to the documentation
   
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to