Hi, I'm interested in integrating some traditional batch systems with Airflow so I can run against whatever batch resources are available. My use case is that I'd like to run a single Airflow instance as a multi-tenant service that can dispatch to heterogeneous batch systems distributed around the globe. A system I maintain does this today, and I know HTCondor+DAGMan can do it by treating the batch systems as "grid resources". I'm trying to understand whether this even makes sense to attempt with Airflow, so I have a few questions.
1. Has anyone looked into or tried this before? I've searched for several hours and was unable to find much on it.

2. I have a rough idea of how Airflow works, but I haven't dug deep into the code. If I were to implement something like this, should it be done as an operator (i.e. extend BashOperator?), as an executor (like the Mesos executor), or maybe both?

3. I've built this kind of thing in the past, and typically you end up with a daemon/microservice running for each batch system. That microservice may be local to the batch system (which works best for LSF/Torque/etc.), it may be local to the workflow engine but use some sort of exported remote API (e.g. grid-connected resources, often via Globus APIs and X.509 certs), or there may be another layer of abstraction involved (as in DIRAC). You then have a wrapper/pilot script that traps a few signals and reports back to the microservice or to a message queue (usually over HTTP or email, because some batch systems sit behind restrictive firewalls) when a job actually starts or finishes.

Thanks,
Brian
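P.S. To make point 3 concrete, here is a minimal sketch of what I mean by a wrapper/pilot script, in Python. Everything here is illustrative rather than an existing API: the callback URL, the job id argument, and the JSON status payload are all assumptions about how the microservice end would look.

```python
#!/usr/bin/env python
# Hypothetical pilot/wrapper sketch: runs the real payload inside the batch
# system and reports start/finish back to a monitoring microservice over
# HTTP, since outbound HTTP is often all a firewalled batch node can do.
import json
import signal
import subprocess
import sys
import urllib.request


def build_status(job_id, state):
    """Build the (hypothetical) JSON status document sent to the service."""
    return {"job_id": job_id, "state": state}


def report(callback_url, job_id, state):
    """POST a small JSON status update to the monitoring microservice."""
    body = json.dumps(build_status(job_id, state)).encode("utf-8")
    req = urllib.request.Request(
        callback_url,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req, timeout=30)


def main(callback_url, job_id, argv):
    # Trap eviction/preemption signals and phone home before dying, so the
    # workflow engine can reschedule instead of waiting for a timeout.
    def on_signal(signum, frame):
        report(callback_url, job_id, "evicted:%d" % signum)
        sys.exit(128 + signum)

    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, on_signal)

    report(callback_url, job_id, "started")
    rc = subprocess.call(argv)  # run the actual payload command
    report(callback_url, job_id, "finished:%d" % rc)
    return rc


if __name__ == "__main__":
    # usage: pilot.py <callback_url> <job_id> <command> [args...]
    sys.exit(main(sys.argv[1], sys.argv[2], sys.argv[3:]))
```

The same pattern works whether the script is submitted directly (LSF/Torque) or staged in through a grid layer; only the submission side changes.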
