Do you have a DAG definition that exhibits the issue? If I can reproduce it,
I'll do my best to get a fix into 1.8.2.

Sent from my iPhone

> On 14 Jun 2017, at 13:42, Daniel Huang <dxhu...@gmail.com> wrote:
> 
> I think this is the same issue I've been hitting with ShortCircuitOperator
> and LatestOnlyOperator. I filed
> https://issues.apache.org/jira/browse/AIRFLOW-1296 a few days ago. It
> includes a DAG I can consistently reproduce this with on 1.8.1 and master.
> I get the "This should not happen" log message as well and the DAG fails.
> 
>> On Wed, Jun 14, 2017 at 3:27 AM, Bolke de Bruin <bdbr...@gmail.com> wrote:
>> 
>> Please provide the full logs (you are cutting out too much info), the DAG
>> definition (sanitized), and the Airflow version.
>> 
>> Bolke
>> 
>> Sent from my iPhone
>> 
>>> On 13 Jun 2017, at 23:51, Rajesh Chamarthi <rajesh.chamar...@gmail.com>
>> wrote:
>>> 
>>> I currently have a DAG that follows this pattern:
>>> 
>>> short_circuit_operator -> s3_sensor -> downstream_task_1 -> downstream_task_2
>>> 
>>> When the short circuit evaluates to false, s3_sensor is skipped, the other
>>> downstream task states remain at None, and the DAG run fails.
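>>> 
>>> Roughly, the (sanitized) definition looks like the sketch below; the dag id
>>> and sensor task id match the logs further down, while the schedule, bucket,
>>> key, and condition callable are placeholders:
>>> 
>>> ```
>>> from datetime import datetime
>>> 
>>> from airflow import DAG
>>> from airflow.operators.dummy_operator import DummyOperator
>>> from airflow.operators.python_operator import ShortCircuitOperator
>>> from airflow.operators.sensors import S3KeySensor
>>> 
>>> dag = DAG('test_inbound',
>>>           start_date=datetime(2017, 6, 1),
>>>           schedule_interval='0 9 * * *')  # placeholder schedule
>>> 
>>> # Placeholder condition; the real callable decides whether the run proceeds.
>>> short_circuit = ShortCircuitOperator(task_id='short_circuit_operator',
>>>                                      python_callable=lambda: False,
>>>                                      dag=dag)
>>> 
>>> s3_sensor = S3KeySensor(task_id='on_s3_xyz',
>>>                         bucket_name='my-bucket',        # placeholder
>>>                         bucket_key='incoming/xyz.csv',  # placeholder
>>>                         dag=dag)
>>> 
>>> downstream_task_1 = DummyOperator(task_id='downstream_task_1', dag=dag)
>>> downstream_task_2 = DummyOperator(task_id='downstream_task_2', dag=dag)
>>> 
>>> short_circuit.set_downstream(s3_sensor)
>>> s3_sensor.set_downstream(downstream_task_1)
>>> downstream_task_1.set_downstream(downstream_task_2)
>>> ```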
>>> 
>>> A couple of questions:
>>> 
>>> 1) Which part/component of the application (scheduler/operator/?) takes
>>> care of cascading the skipped status to downstream tasks? The
>>> ShortCircuitOperator only seems to update the immediate downstream tasks.
>>> 
>>> 2) Using the CeleryExecutor seems to cause this. Are there any other logs
>>> or processes I can run to figure out the root of the problem?
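>>> 
>>> For context, the task instance states can also be read straight from the
>>> metadata DB; a rough sketch of the kind of query I mean (the dag_id and
>>> execution date are taken from the logs below, the rest is illustrative):
>>> 
>>> ```
>>> from datetime import datetime
>>> 
>>> from airflow import settings
>>> from airflow.models import TaskInstance
>>> 
>>> session = settings.Session()
>>> tis = (session.query(TaskInstance)
>>>        .filter(TaskInstance.dag_id == 'test_inbound',
>>>                TaskInstance.execution_date == datetime(2017, 6, 5, 9, 0))
>>>        .all())
>>> for ti in tis:
>>>     # Expecting: short circuit succeeded, sensor skipped, the rest None
>>>     print(ti.task_id, ti.state)
>>> session.close()
>>> ```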
>>> 
>>> More details below
>>> 
>>> * ShortCircuitOperator log (the first downstream task is set to skipped,
>>> although the log shows a warning)
>>> 
>>> ```
>>> [2017-06-12 09:00:24,552] {base_task_runner.py:95} INFO - Subtask:
>>> [2017-06-12 09:00:24,552] {python_operator.py:177} INFO - Skipping task:
>>> on_s3_xyz
>>> [2017-06-12 09:00:24,553] {base_task_runner.py:95} INFO - Subtask:
>>> [2017-06-12 09:00:24,553] {python_operator.py:188} WARNING - Task
>>> <Task(S3KeySensor): on_s3_xyz> was not part of a dag run. This should not
>>> happen.
>>> ```
>>> 
>>> * Scheduler log (marks the DAG run as failed)
>>> 
>>> ```
>>> [2017-06-13 17:57:20,983] {models.py:4184} DagFileProcessor43 INFO -
>>> Deadlock; marking run <DagRun test_inbound @ 2017-06-05 09:00:00:
>>> scheduled__2017-06-05T09:00:00, externally triggered: False> failed
>>> ```
>>> 
>>> When I check the DAG run and step through the code, it looks like the
>>> trigger rule evaluates to false because the upstream task is "skipped":
>>> 
>>> ```
>>> Previous Dagrun State  True   The task did not have depends_on_past set.
>>> Not In Retry Period    True   The task instance was not marked for retrying.
>>> Trigger Rule           False  Task's trigger rule 'all_success' requires all
>>>   upstream tasks to have succeeded, but found 1 non-success(es).
>>>   upstream_tasks_state={'failed': 0, 'successes': 0, 'skipped': 1, 'done': 1,
>>>   'upstream_failed': 0}, upstream_task_ids=['on_s3_xyz']
>>> ```
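>>> 
>>> If I am reading that output right, the check boils down to something like
>>> the following (a simplified illustration using the numbers above, not the
>>> actual Airflow code):
>>> 
>>> ```
>>> # Values from the dependency check output above
>>> upstream_tasks_state = {'failed': 0, 'successes': 0, 'skipped': 1,
>>>                         'done': 1, 'upstream_failed': 0}
>>> upstream_count = 1  # upstream_task_ids=['on_s3_xyz']
>>> 
>>> # 'all_success' effectively requires every upstream task to be in the
>>> # 'success' state, so the skipped sensor counts as a non-success here
>>> # and the downstream task never becomes runnable.
>>> all_success = upstream_tasks_state['successes'] == upstream_count
>>> print(all_success)  # False
>>> ```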
>> 
