Ah... I completely missed the question in my haste to do too many things. Assume you have a DAG named process_my_data with 3 tasks: read_from_source_table --> transform --> write_to_new_table. This DAG should have no schedule (`schedule_interval=None`), so it only runs when triggered externally.
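Something like this, as a minimal sketch (assuming Airflow 1.x; the callables and the `source_table` conf key are illustrative, not from your setup):

# dags/process_my_data.py -- a minimal sketch, assuming Airflow 1.x.
# The callables are placeholders; the source table name arrives via the
# dag_run conf passed at trigger time (see the trigger script further down).
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator


def read_from_source_table(**context):
    # Supplied with: airflow trigger_dag -c '{"source_table": "..."}'
    source_table = context["dag_run"].conf["source_table"]
    print("reading from %s" % source_table)


def transform(**context):
    pass  # placeholder for the actual calculation


def write_to_new_table(**context):
    pass  # placeholder for the write step


dag = DAG(
    dag_id="process_my_data",
    start_date=datetime(2017, 1, 1),
    schedule_interval=None,  # externally triggered only
)

read = PythonOperator(
    task_id="read_from_source_table",
    python_callable=read_from_source_table,
    provide_context=True,  # exposes dag_run (and its conf) to the callable
    dag=dag,
)
xform = PythonOperator(
    task_id="transform",
    python_callable=transform,
    provide_context=True,
    dag=dag,
)
write = PythonOperator(
    task_id="write_to_new_table",
    python_callable=write_to_new_table,
    provide_context=True,
    dag=dag,
)

read >> xform >> write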
You could write a script to read your list of source tables and call `airflow trigger_dag -c <a JSON string with the params you want to pass to your first task> -e <execution date>` once per table. This will launch a DAG run for each input that you pass. I believe the execution dates need to differ by at least 1 second (timestamp granularity in the db), so avoid a tight loop: sleep for 1 second between triggers. You will then see N DAG runs, one for each of the N source tables that you pass in. I've pasted a sketch of such a trigger script below the quoted thread, along with examples of the factory and Variable approaches discussed there.

-s

On Tue, Jun 20, 2017 at 12:22 PM, Maxime Beauchemin <[email protected]> wrote:

> One DAG cannot have multiple shapes at one time, by design. You cannot
> parameterize things that will affect the shape of your DAG (though note
> that you can fully parameterize what happens within individual task
> instances). Think about it: a DAG is one (and only one) graph. It's NOT a
> shapeshifting thing.
>
> As a workaround, and this may or may not be the right thing to do, you can
> write a DAG factory function that will return a DAG object given
> parameters, but any given DAG instance (with a unique dag_id) has a single
> shape. If you do want to go that route, you may want to use
> `schedule_interval='@once'`.
>
> If you think the shape of your DAG needs to change from one DAG run to the
> next, you may want to re-think what is static and what is dynamic. Are
> your database tables' schemas changing from one DAG run to the next? No,
> right? That'd be crazy! Most likely you want to think about the shape of
> your DAG in a similar way as you think about the schema of your tables:
> static or slowly changing.
>
> Max
>
> On Mon, Jun 19, 2017 at 4:11 AM, Rob Harrison <[email protected]> wrote:
>
> > Hi,
> >
> > I would like to pass a variable to my Airflow DAG and would like to
> > know if there is a recommended method for doing this.
> >
> > I am hoping to create a DAG with Python operators and tasks that read
> > data from a Parquet table, perform a calculation, then write the
> > results into a new table. I'd like to pass the source table name in
> > along with the task when calling the DAG from the command line.
> >
> > From what I have read, the following can be used to set a variable from
> > the command line:
> >
> > airflow variables -s myvar="value"
> >
> > Does anyone have an example of this they can share?
> >
> > Thank you,
> > Rob
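Here is the promised sketch of the trigger loop, assuming Airflow 1.x with the `airflow` CLI on PATH (the table list and conf key are illustrative; a plain shell loop would work just as well):

# trigger_all.py -- a minimal sketch of the trigger loop described above.
import json
import subprocess
import time
from datetime import datetime, timedelta

SOURCE_TABLES = ["table_a", "table_b", "table_c"]  # illustrative

base = datetime.utcnow().replace(microsecond=0)
for i, table in enumerate(SOURCE_TABLES):
    conf = json.dumps({"source_table": table})
    # Offset each execution_date by one second so the dag runs get
    # distinct timestamps in the metadata db.
    exec_date = (base + timedelta(seconds=i)).isoformat()
    subprocess.check_call([
        "airflow", "trigger_dag", "process_my_data",
        "-c", conf,
        "-e", exec_date,
    ])
    time.sleep(1)  # stay clear of the 1-second timestamp granularity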
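On Max's factory suggestion, a minimal sketch of a DAG factory, again assuming Airflow 1.x (the dag_id pattern and table list are illustrative):

# dag_factory.py -- a minimal sketch of the factory approach Max describes.
# Each call returns a distinct DAG with its own dag_id; anything that
# changes the shape is fixed at definition time, not at run time.
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator


def make_dag(source_table):
    dag = DAG(
        dag_id="process_%s" % source_table,
        start_date=datetime(2017, 1, 1),
        schedule_interval="@once",  # each generated DAG runs exactly once
    )
    read = DummyOperator(task_id="read_from_source_table", dag=dag)
    write = DummyOperator(task_id="write_to_new_table", dag=dag)
    read >> write
    return dag


# Register one DAG per table in the module's global namespace so the
# scheduler discovers all of them.
for table in ["table_a", "table_b"]:
    globals()["process_%s" % table] = make_dag(table)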
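Finally, for Rob's `airflow variables` question: note that Variables are global key/value pairs, not per-run parameters (the trigger_dag conf above covers the per-run case). A minimal sketch of reading one inside a DAG file, assuming Airflow 1.x:

# read_variable.py -- a minimal sketch of reading an Airflow Variable,
# e.g. one set with `airflow variables -s myvar value` (the 1.x CLI takes
# the key and value as separate arguments).
from airflow.models import Variable

# Returns the stored string, or the default if the key is unset.
myvar = Variable.get("myvar", default_var=None)
print("myvar = %r" % myvar)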
