Hello,
I'm currently using Airflow for some ETL tasks where I submit a Spark job to a cluster and poll until it completes. This workflow is nice because it typically fits in a single DAG. I'm now starting to do more machine learning work and need to build a model per client, for 1,000+ clients. My Spark cluster can handle the workload, but writing 1,000+ DAGs, one per client, doesn't seem scalable. I want each client to have its own task instance so that if it fails it can be retried without rerunning all 1,000+ tasks. How do I handle this type of workflow in Airflow? Roughly what I have in mind is sketched below.
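To make the question concrete, here is a minimal sketch of the shape I'm imagining: a single DAG that loops over the client list and creates one task per client, so each task can retry independently. `get_client_ids()` and `fit_model_for_client()` are just placeholders for however the client list is fetched and however the per-client Spark job is submitted and polled.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def get_client_ids():
    # Placeholder: in reality this would come from a database, config, etc.
    return ["client_a", "client_b", "client_c"]


def fit_model_for_client(client_id, **kwargs):
    # Placeholder: submit the per-client Spark job and poll until it finishes.
    print(f"Fitting model for {client_id}")


with DAG(
    dag_id="fit_client_models",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # One task per client, generated in a loop rather than 1,000+ separate DAGs.
    for client_id in get_client_ids():
        PythonOperator(
            task_id=f"fit_model_{client_id}",
            python_callable=fit_model_for_client,
            op_kwargs={"client_id": client_id},
            retries=3,  # each client's task retries on its own if it fails
        )
```

Is generating tasks in a loop like this the recommended pattern, or is there a better way to structure it?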
