Yeah, I'm referring specifically to the idea that requiring Docker for a
hello-world Airflow setup will make things better. I don't think it
will.

On Wednesday, May 4, 2016, Lance Norskog <[email protected]> wrote:

> We use Docker at Edmodo and it really helped for Airflow.
>
> It's easy to say "pip install airflow" itself, but some of the database
> drivers require pip installs that then require dev versions of host .rpm or
> .deb packages because they want a .h file to compile against.
>
> We are porting a large complex Hadoop-based ETL to Airflow and used Docker
> to package web services that we call from Airflow.
>
> Another part of our system is that we want to set up Amazon "Auto Scaling
> Groups" to launch more Airflow executor servers when our main server
> becomes overloaded. We run a few large-memory Java jobs and this will be a
> problem soon. Our tooling lets us easily set this up with Docker. (We wrote
> something just like Docker Compose that talks to ASG. It's incredibly
> useful.)
>
> So, yeah, "pip install airflow" is fine for kicking the tires but we needed
> binary management rather quickly after that.
>
> Cheers,
>
> Lance
>
> On Wed, May 4, 2016 at 1:28 PM, Chris Riccomini <[email protected]>
> wrote:
>
> > > As far as ease of use, while Docker is definitely getting more popular,
> > > it is hard to beat the current pip install flow for people not quite up
> > > to date on how to set up Docker. It seems like one more hurdle if you
> > > just want to get started.
> >
> > Strongly agree. We tried to use Vagrant and then Docker with a prior
> > project, and it was a pain. Another project that I'm working with now uses
> > Docker for its hello-world stuff, and it's really troublesome. You will get
> > WAY more questions if you go this route than the current simple pip/sqlite
> > route.
> >
> > On Wed, May 4, 2016 at 12:27 PM, Maxime Beauchemin <[email protected]>
> > wrote:
> >
> > > Yeah, I'd be curious to see how the Docker setup instructions (from
> > > scratch) would compare to the current ones.
> > >
> > > On Wed, May 4, 2016 at 11:05 AM, Arthur Wiedmer <[email protected]>
> > > wrote:
> > >
> > > > +1, but it feels like just piling on.
> > > >
> > > > One thing we could consider is which part we would like to fix.
> > > >
> > > > - If it is having a serious, production-ready DB that is still a
> > > > local DB/client, we could try something like Firebird.
> > > > It has a relatively small footprint and can do multithreading, and it
> > > > is supported by SQLAlchemy, though it is not as easy to install as
> > > > sqlite on most *nixes.
> > > > We could spend some cycles baking this into containers as well.
> > > >
> > > > - As far as ease of use, while Docker is definitely getting more
> > > > popular, it is hard to beat the current pip install flow for people not
> > > > quite up to date on how to set up Docker. It seems like one more hurdle
> > > > if you just want to get started.
> > > >
> > > > Best,
> > > > Arthur
> > > >
> > > >
> > > > On Wed, May 4, 2016 at 9:35 AM, Maxime Beauchemin <[email protected]>
> > > > wrote:
> > > >
> > > > > Making it frictionless for people to get their feet wet is extremely
> > > > > important. It's been a requirement since the early prototypes and I
> > > > > feel strongly about keeping it that way. It's hard to test this
> > > > > hypothesis, but it could be a defining factor in the success of this
> > > > > project (to-date and future).
> > > > >
> > > > > Docker may allow for more batteries to be included and offer even
> > > > > less friction than the `pip install` path for folks who are familiar
> > > > > with it. I'd have to look to see if the community contributed Docker
> > > > > images are up to date. We may want to make that "the way to go" and
> > > > > change the tutorial / quick start instructions to reflect that if it
> > > > > makes sense. That may require integrating the burning of images as
> > > > > part of the build and/or release process.
> > > > >
> > > > > Max
> > > > >
> > > > > On Wed, May 4, 2016 at 6:33 AM, Jeremiah Lowin <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > +1, shipping Airflow "batteries included" is very important in my
> > > > > > opinion. There is a lot to grok and the easiest way to learn is by
> > > > > > letting folks spin up a working installation right away.
> > > > > > Unfortunately I don't think there's a viable alternative to SQLite
> > > > > > that is also supported by SQLAlchemy.
> > > > > >
> > > > > > On Wed, May 4, 2016 at 2:57 AM Prateek Rungta <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > It's documented pretty well that it's only for people to get their
> > > > > > > feet wet with. From the quickstart
> > > > > > > <http://pythonhosted.org/airflow/start.html?highlight=sqlite>:
> > > > > > >
> > > > > > > Out of the box, Airflow uses a sqlite database, which you should
> > > > > > > outgrow fairly quickly since no parallelization is possible using
> > > > > > > this database backend. It works in conjunction with the
> > > > > > > SequentialExecutor which will only run task instances sequentially.
> > > > > > > While this is very limiting, it allows you to get up and running
> > > > > > > quickly and take a tour of the UI and the command line utilities.
> > > > > > >
> > > > > > > FWIW, I'm now on day 2 of using Airflow. And while I wouldn't dream
> > > > > > > of deploying Airflow using SQLite beyond my laptop, I quite
> > > > > > > appreciated being able to mess with Airflow without any of the
> > > > > > > infrastructural constraints.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Tue, May 3, 2016 at 11:18 PM, Siddharth Anand <[email protected]>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > From time to time, we run into bugs with the SQLite dialect in
> > > > > > > > SQLAlchemy and close the bugs as "wont-fix" because we don't want
> > > > > > > > to be in the business of fixing such bugs. We deem SQLite as a
> > > > > > > > "non-serious" database that no one [in his/her right mind] would
> > > > > > > > run in his/her staging, qa, or production environments. However,
> > > > > > > > we rely on the SequentialExecutor and on the SQLite DB for our
> > > > > > > > tests.
> > > > > > > > What should we do with SQLite? Should we lift up the hood and fix
> > > > > > > > it for our needs or find either a different ORM or a different
> > > > > > > > option for DB backend?
> > > > > > > > Examples of bugs we encounter and close as won't fix:
> > > > > > > > 1. Deleting a task instance: https://github.com/airbnb/airflow/issues/955
> > > > > > > > 2. Weird pickle issue: https://issues.apache.org/jira/browse/AIRFLOW-46
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
>
> --
> Lance Norskog
> [email protected]
> Redwood City, CA
>
