Do you mean something like this:

    file = download_from_s3()
    dag = DAG(...)
    for line in file.lines():
        MyTask(dag=dag, ...)

If so, then the answer is hundreds to thousands of times per day: each time the scheduler parses the DAG file looking for tasks, _all_ code at the top level is run. So if you unconditionally download the file, it will do so A LOT.

-ash

> On 7 Jun 2019, at 18:45, Satya Tumati <satya.tum...@rubrik.com> wrote:
>
> Hi,
>
> Our team uses Airflow extensively and we stumbled on an issue that might
> need some help understanding the interplay between scheduler and worker.
>
> Let's say I wrote a test_dag.py in the dags sub dir and it simply downloads
> a file from S3 which contains a list of strings. Now, I create a linear DAG
> where each task is simply printing a string. These tasks are in the order
> they appear in the file.
>
> How many times exactly is the file downloaded for a DAG run?
>
>
> Thanks,
> Satya
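
To make the point about top-level code concrete, here is a minimal sketch that does not use Airflow at all. It only imitates the behaviour: a fake download_from_s3() counts its calls, and a hypothetical parse_dag_file() re-executes the DAG file source the way a repeated parse would. The file contents, the task naming, and the figure of 5 parses are assumptions for illustration, not how the scheduler is implemented.

```python
# Simulation only: no Airflow, no S3. All names here are illustrative.
download_count = 0


def download_from_s3():
    """Fake download: counts calls and returns the lines of the file."""
    global download_count
    download_count += 1
    return ["alpha", "beta", "gamma"]


# Stand-in for test_dag.py. The download sits at the top level of the file,
# so it runs every time the file is executed.
DAG_FILE_SOURCE = """
lines = download_from_s3()
tasks = ["print_" + line for line in lines]
"""


def parse_dag_file():
    """Re-execute the DAG file source, as a repeated parse would."""
    namespace = {"download_from_s3": download_from_s3}
    exec(compile(DAG_FILE_SOURCE, "test_dag.py", "exec"), namespace)
    return namespace["tasks"]


# Assume the file gets parsed 5 times; every parse triggers a download.
for _ in range(5):
    tasks = parse_dag_file()

print(download_count)  # 5
print(tasks)           # ['print_alpha', 'print_beta', 'print_gamma']
```

The download count tracks the number of parses, not the number of DAG runs, which is why an unconditional top-level download gets so expensive.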