It sounds like you want a background daemon that continuously monitors the
status of an external system and triggers work when a condition is met.
That is not really an ETL job, so Airflow is not a great fit for this type
of problem. That said, there are workarounds like the one you mentioned.

One easy workaround, if you can tolerate a delay between `condition happens
-> dag triggers`, is to give your controller DAG a recurring schedule
(i.e. not None). Each time the controller DAG runs, you perform your sensor
check once and then trigger, or don't trigger, the other DAG depending on
the condition.

The thing I'd be worried about with your `trigger dagrun` approach is that
if the trigger dagrun operator fails for any reason, you'll stop monitoring
the external system. With the scheduled approach, the next scheduled run
resumes monitoring on its own, so you don't have to worry about the failure
modes of retrying failed DAGs, etc.
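
To make the "check once per scheduled run" idea concrete, here is a minimal
sketch of the decision logic in plain Python. The names `should_trigger`,
`count_files` and `state` are made up for illustration and are not Airflow
APIs. The sketch assumes you can persist the last seen file count between
runs somewhere (an Airflow Variable might be one option) and that the first
run should only record a baseline.

```python
from typing import Callable, Dict


def should_trigger(count_files: Callable[[], int], state: Dict[str, int]) -> bool:
    """Run one check and report whether the file count grew since the last run.

    count_files: hypothetical callable returning the current number of files.
    state: hypothetical mutable store that survives between scheduled runs.
    """
    current = count_files()
    previous = state.get("last_count")
    # Remember the current count so the next scheduled run compares against it.
    state["last_count"] = current
    # The first run has nothing to compare against, so it never triggers.
    return previous is not None and current > previous


if __name__ == "__main__":
    # Simulate three scheduled runs with file counts 3, 3 and 5.
    state: Dict[str, int] = {}
    counts = iter([3, 3, 5])
    for _ in range(3):
        print(should_trigger(lambda: next(counts), state))
```

In a controller DAG with a recurring schedule, a function like this could
serve as the callable of a ShortCircuitOperator placed in front of a
TriggerDagRunOperator, so the target DAG is only triggered on runs where
the count increased.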

On Mon, Oct 23, 2017 at 2:30 AM, Niels Zeilemaker <[email protected]>
wrote:

> Hi Guys,
>
> I've created a Sensor which is monitoring the number of files in an
> Azure Blobstore. If the number of files increases, then I would like
> to trigger another dag. This is more or less similar to the
> example_trigger_controller_dag.py and example_trigger_target_dag.py
> setup.
>
> However, after triggering the target DAG I would want my controller
> DAG to start monitoring the Blobstore again. But since the schedule of
> the controller DAG is set to None, it doesn't continue monitoring. I
> "fixed" this by adding a TriggerDAG which schedules a new run of the
> Controller DAG. But this feels a bit like a hack.
>
> Does someone have any experience with such a continuous monitoring
> sensor? Or know of a better way to achieve this?
>
> Thanks,
> Niels
>
