A quick side note to say that it's common to deploy one or more Airflow sandboxes, which are effectively machines with the same configuration as a worker but without an actual worker process running on them. It's similar to the concept of a "gateway node" in Hadoop.
Users typically work in user space with a modified `airflow.cfg` that may point to an alternate metadata database (to insulate production) and, depending on policy, may or may not have alternate connections registered to staging/dev counterparts where those exist. You'll typically find the same Airflow package and Python environment as the one used in production, with similar connectivity to other systems and databases. From there you can run any CLI commands and even fire up your own Airflow webserver that you can tunnel into if need be.

For example, at Lyft there's a simple CLI application that will prepare your remote home and hook things up (provide a working airflow.cfg, sync/clone the pipeline repo for you, ...) so that it all works and feels similar to other development workflows specific to Lyft. It basically automates the whole "setting up a dev env" process with the proper policies. At Airbnb, the "data sandboxes" act as Airflow sandboxes that you can ssh into, AND JupyterHub nodes where you find the same home whether you ssh in or access Jupyter.

In the Kubernetes world, it seems like there should be an easy way to order or "lease" an Airflow sandbox that would have your home directory persisted and mounted on that pod just for the time that you need it.

Max

On Wed, May 23, 2018 at 3:12 PM Luke Diment <[email protected]> wrote:

> Fabric looks perfect for this.
> ________________________________________
> From: Kyle Hamlin <[email protected]>
> Sent: Thursday, May 24, 2018 6:22 AM
> To: [email protected]
> Subject: Re: Airflow cli to remote host
>
> I'd suggest using something like Fabric <http://www.fabfile.org/> for
> this. This is how I accomplish the same task.
>
> On Wed, May 23, 2018 at 2:19 PM Frank Maritato <[email protected]>
> wrote:
>
> > Hi All,
> >
> > I need to be able to run backfill for my jobs against our production
> > airflow server. Is there a way to run
> >
> > airflow backfill job_name -s 2018-05-01
> >
> > against a remote server? I didn’t see a -h option to specify a hostname.
> >
> > If not, is there a way through the ui to do this? I'd rather not have to
> > ssh into the production server to run these jobs.
> >
> > Thanks!
> > --
> > Frank Maritato
>
> --
> Kyle Hamlin
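
For reference, a minimal sketch of the Fabric approach suggested above, assuming Fabric 2's `Connection` API (a Fabric 1 fabfile using `fabric.api.run` would look much the same). The hostname, user, and `AIRFLOW_HOME` path are hypothetical placeholders, not anything from this thread:

```python
# fabfile.py -- sketch of running a backfill on a remote Airflow host
# over SSH with Fabric (http://www.fabfile.org/), per the thread above.
from fabric import Connection


def remote_backfill(dag_id, start_date, end_date):
    """Invoke `airflow backfill` on the production box without an
    interactive ssh session."""
    # Host, user, and AIRFLOW_HOME are placeholder assumptions;
    # substitute your own environment's values.
    with Connection("airflow-prod.example.com", user="deploy") as c:
        c.run(
            "AIRFLOW_HOME=/srv/airflow "
            f"airflow backfill {dag_id} -s {start_date} -e {end_date}"
        )


if __name__ == "__main__":
    remote_backfill("job_name", "2018-05-01", "2018-05-07")
```

Setting `AIRFLOW_HOME` on the remote command makes the CLI pick up that host's `airflow.cfg` (and therefore its metadata database); the same pattern works for any other Airflow CLI command you'd otherwise ssh in to run.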
