I hope this is an alright place to ask the following.
Some of my inputs will irregularly be missing, and that's okay.
I was reading
https://airflow.incubator.apache.org/concepts.html#trigger-rules
and I thought I needed `all_done` for a final task, but a skip is not a
done state, nor does it (seem to) propagate.
Is there a way to trigger something after all upstreams are either
successful or skipped?

My case looks a little like:
sensor_dataA >> preprocess_and_stage_dataA >> process_stagesABC >> clean_up
sensor_dataB >> preprocess_and_stage_dataB >> process_stagesABC
sensor_dataC >> preprocess_and_stage_dataC >> process_stagesABC

I don't want the preprocess to fail because the data isn't there and there
will be side-effects, but if the sensor skips, its associated
preprocess_and_stage is not queued. The task doesn't seem to have any state
(like `upstream_skipped`?) so process_stagesABC won't be triggered by
`all_done`. `one_success` seems like it would be perfect except that it
would start before all preprocess tasks have been either run or skipped.
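
To make the rule I'm after concrete, here is a minimal pure-Python sketch. It is not Airflow code: the state strings and both function names are made up for illustration, and `None` stands in for an upstream that has no state yet. The `one_success` function only mirrors the behaviour I described above, not Airflow's actual implementation.

```python
# Hypothetical state names for illustration; not Airflow's internal constants.
SUCCESS = "success"
SKIPPED = "skipped"
FAILED = "failed"


def all_success_or_skipped(upstream_states):
    """True only when every upstream has ended as success or skipped."""
    return all(state in (SUCCESS, SKIPPED) for state in upstream_states)


def one_success(upstream_states):
    """True as soon as any single upstream has succeeded."""
    return any(state == SUCCESS for state in upstream_states)


# dataB is missing, so its preprocess is skipped; dataC has no state yet (None).
in_progress = [SUCCESS, SKIPPED, None]
# Later, dataC's preprocess finishes as well.
finished = [SUCCESS, SKIPPED, SUCCESS]

print(one_success(in_progress), all_success_or_skipped(in_progress))
print(one_success(finished), all_success_or_skipped(finished))
```

With `in_progress`, the `one_success`-style check already fires while the rule I want still waits; with `finished`, both fire. A failed upstream would keep the wanted rule from firing at all.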

Am I missing a way that this can be done? Is there some general guide to
changing the DAG structure that would handle completing the process? Am I
supposed to be using XCom here?

If all these answers are "no/maybe" then is there some opportunity to
introduce an `upstream_skipped` state or a different `trigger_rule`... a
kludgy `SkipAheadOperator`, or something?

Thanks,
-Daniel Lamblin
