SamWheating commented on a change in pull request #17421:
URL: https://github.com/apache/airflow/pull/17421#discussion_r699568515



##########
File path: airflow/operators/python.py
##########
@@ -203,13 +216,26 @@ def execute(self, context: Dict):
             self.log.info('Proceeding with downstream tasks...')
             return
 
-        self.log.info('Skipping downstream tasks...')
-
         downstream_tasks = context['task'].get_flat_relatives(upstream=False)
-        self.log.debug("Downstream task_ids %s", downstream_tasks)
+        self.log.debug("Downstream tasks %s", downstream_tasks)
 
         if downstream_tasks:
-            self.skip(context['dag_run'], context['ti'].execution_date, 
downstream_tasks)
+            dag_run = context["dag_run"]
+            execution_date = context["ti"].execution_date
+
+            if self.mode == "hard":
+                self.log.info("Skipping downstream tasks using a hard 
short...")
+                self.skip(dag_run, execution_date, downstream_tasks)
+            elif self.mode == "soft":
+                self.log.info("Skipping downstream tasks using a soft 
short...")
+                # Explicitly setting the state of the direct, downstream 
task(s) to "skipped" and letting the
+                # Scheduler handle the remaining downstream task(s) 
appropriately.
+                self.skip(dag_run, execution_date, 
context["task"].get_direct_relatives(upstream=False))
+            else:

Review comment:
       Would it be preferable to perform this validation on initialization 
rather than on execution?
   
   In its current state, if and invalid value is provided for `mode`, this task 
will be submitted to the executor and fail, which could then lead to unintended 
execution of downstream tasks.
   
   If we were validating this exception and raising in the `__init__` method, 
it would raise on import which would prevent the broken code from ever reaching 
the executor. 

##########
File path: airflow/operators/python.py
##########
@@ -203,13 +216,26 @@ def execute(self, context: Dict):
             self.log.info('Proceeding with downstream tasks...')
             return
 
-        self.log.info('Skipping downstream tasks...')
-
         downstream_tasks = context['task'].get_flat_relatives(upstream=False)
-        self.log.debug("Downstream task_ids %s", downstream_tasks)
+        self.log.debug("Downstream tasks %s", downstream_tasks)
 
         if downstream_tasks:
-            self.skip(context['dag_run'], context['ti'].execution_date, 
downstream_tasks)
+            dag_run = context["dag_run"]
+            execution_date = context["ti"].execution_date
+
+            if self.mode == "hard":
+                self.log.info("Skipping downstream tasks using a hard 
short...")
+                self.skip(dag_run, execution_date, downstream_tasks)
+            elif self.mode == "soft":
+                self.log.info("Skipping downstream tasks using a soft 
short...")
+                # Explicitly setting the state of the direct, downstream 
task(s) to "skipped" and letting the
+                # Scheduler handle the remaining downstream task(s) 
appropriately.
+                self.skip(dag_run, execution_date, 
context["task"].get_direct_relatives(upstream=False))
+            else:

Review comment:
       Would it be preferable to perform this validation on initialization 
rather than on execution?
   
   In its current state, if an invalid value is provided for `mode`, this task 
will be submitted to the executor and fail, which could then lead to unintended 
execution of downstream tasks.
   
   If we were validating this exception and raising in the `__init__` method, 
it would raise on import which would prevent the broken code from ever reaching 
the executor. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to