Dev-iL commented on issue #43176:
URL: https://github.com/apache/airflow/issues/43176#issuecomment-3394039819

   > #### 1. Use `task.short_circuit` when `task.branch` has multiple `return`s 
and exactly one of them is not `[]`
   >   * Is it a typical case?
   
   I ran into this myself. I have no idea how common it is, but I bet I'm not 
the only one. It can be a low-priority check.
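
A rough sketch of how such a check could look, using only the standard library `ast` module. The function names and the `SAMPLE` Dag snippet are hypothetical, and treating a bare `return` like `[]` is an assumption of this sketch, not something the proposal specifies:

```python
import ast


def is_empty_list(node):
    """True for a literal `[]`."""
    return isinstance(node, ast.List) and not node.elts


def is_task_branch(decorator):
    """True for `@task.branch` and `@task.branch(...)`."""
    target = decorator.func if isinstance(decorator, ast.Call) else decorator
    return (
        isinstance(target, ast.Attribute)
        and target.attr == "branch"
        and isinstance(target.value, ast.Name)
        and target.value.id == "task"
    )


def find_short_circuit_candidates(source):
    """Names of @task.branch functions with several returns, exactly one non-empty."""
    candidates = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.FunctionDef):
            continue
        if not any(is_task_branch(d) for d in node.decorator_list):
            continue
        # Note: this also picks up returns of nested functions (kept simple here).
        returns = [n for n in ast.walk(node) if isinstance(n, ast.Return)]
        # Assumption: a bare `return` counts as "empty", like `return []`.
        non_empty = [
            r for r in returns
            if r.value is not None and not is_empty_list(r.value)
        ]
        if len(returns) > 1 and len(non_empty) == 1:
            candidates.append(node.name)
    return candidates


SAMPLE = '''
@task.branch
def choose(x):
    if x:
        return ["downstream"]
    return []

@task.branch
def real_branch(x):
    if x:
        return ["a"]
    return ["b"]
'''

print(find_short_circuit_candidates(SAMPLE))
```

Only `choose` is reported, since `real_branch` has two non-empty returns and is a genuine branch.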
   
   > #### 2. If `SomeOtherOperator(some_template_field="{{ 
ti.xcom_pull(task_ids=['task1']) }}")` is used, replace it with `...... 
some_template_field=task1.output` instead
   >   * Need to define the exact list of template_field
   
   Did you ask for a list to make this more general? What I had in mind applies 
only to the output of a task, when the xcom key is not provided (and thus 
defaults to `return_value`). In other words, this check should detect templates 
of the form:
   
   ```
   {{ <task instance object>.xcom_pull(task_ids=<iterable of length 1>[, key="return_value"]) }}
   ```
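
For illustration, a regex-based sketch of detecting that template form. The pattern and the helper name `single_task_output_reference` are hypothetical. It assumes the task instance is referenced as `ti` or `task_instance` and that the iterable is a literal list or tuple holding one quoted task id. A real rule would more likely parse the Jinja expression than rely on a regex:

```python
import re

# Matches: {{ ti.xcom_pull(task_ids=['task1'][, key="return_value"]) }}
XCOM_PULL_TEMPLATE = re.compile(
    r"""
    \{\{\s*
    (?:ti|task_instance)\.xcom_pull\(\s*
    task_ids\s*=\s*
    [\[\(]\s*(?P<quote>['"])(?P<task_id>[\w.-]+)(?P=quote)\s*,?\s*[\]\)]
    (?:\s*,\s*key\s*=\s*['"]return_value['"])?
    \s*\)\s*
    \}\}
    """,
    re.VERBOSE,
)


def single_task_output_reference(template):
    """Return the task id if the template could be replaced by `<task>.output`, else None."""
    match = XCOM_PULL_TEMPLATE.fullmatch(template.strip())
    return match.group("task_id") if match else None


print(single_task_output_reference("{{ ti.xcom_pull(task_ids=['task1']) }}"))
print(single_task_output_reference("{{ ti.xcom_pull(task_ids=['a', 'b']) }}"))
```

Templates that pull from more than one task, or that use a key other than `return_value`, are deliberately left alone.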
   
   > #### 3. The Dag must not have top-level calls to expensive 
classes/services.
   >   * More detail needed. What are "expensive classes/services"? We need an 
exact list to check, and it needs to be generalized enough
   
   Expensive = takes either absolutely or relatively long to load. The main 
example that comes to mind is pandas. However, I think this is something that's 
more reasonable to live inside a unit test (like the dagbag one), or a 
threshold used by the processor to emit warnings. Re generalization - it's 
simpler to define what *is* allowed (e.g. airflow or standard library imports) 
rather than what isn't - but this approach will flag much of the existing code.
   
   > #### 4. The DAG object should be instantiated using a context manager.
   >   * Why? taskflow also works fine?
   
   Taskflow is fine and context manager is fine too. But perhaps `dag = 
DAG(....)` should be discouraged.
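
A small sketch of how the discouraged pattern could be spotted with `ast`. The function name `find_assigned_dags` is hypothetical, and it assumes the class is referenced as `DAG` or as an attribute ending in `.DAG`:

```python
import ast


def find_assigned_dags(source):
    """Return variable names bound via a plain `name = DAG(...)` assignment."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign) or not isinstance(node.value, ast.Call):
            continue
        func = node.value.func
        # Accept both `DAG(...)` and `something.DAG(...)`.
        called = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if called == "DAG":
            hits.extend(t.id for t in node.targets if isinstance(t, ast.Name))
    return hits


SAMPLE = '''
dag = DAG("legacy")

with DAG("preferred") as managed:
    pass
'''

print(find_assigned_dags(SAMPLE))
```

The `with DAG(...) as managed:` form is an `ast.With` node rather than an assignment, so it passes.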
   
   > #### 5. The Dag file name should match its dag_id.
   > #### 6. All `dag_id`s should be unique
   
   Can be checked by a unit test (dagbag).
   
   > #### 8. it would be really useful if the Airflow linter could validate the 
user's DAG code against any user-defined task policy or dag policy
   
   Potentially possible via a dagbag test, assuming it processes dag policies 
(might require dev effort to support).
   
   > #### 10. tmp path not returned by tasks 
   >   * not sure whether it's a common case
   
   This is something users might run into when scaling up from a single-worker 
setup, where DAGs suddenly start failing. If we can add this easily, I say we 
should.
   
   > #### 11. Don't allow implicit python task decorators, `@task`, and instead 
require specifying `@task.python`.
   >   * neutral
   
   Like every other rule, it could be opt-in.
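
As a sketch of what an opt-in version of this rule might check, the snippet below reports functions decorated with a bare `@task` or `@task(...)`. The function name `find_implicit_task_decorators` is hypothetical, and it assumes the decorator is imported under the name `task`:

```python
import ast


def find_implicit_task_decorators(source):
    """Return names of functions decorated with bare `@task` / `@task(...)`."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for dec in node.decorator_list:
            target = dec.func if isinstance(dec, ast.Call) else dec
            # `@task.python` is an ast.Attribute, so only the bare name matches.
            if isinstance(target, ast.Name) and target.id == "task":
                names.append(node.name)
    return names


SAMPLE = '''
@task
def a():
    pass

@task.python
def b():
    pass

@task(retries=2)
def c():
    pass
'''

print(find_implicit_task_decorators(SAMPLE))
```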

