Yeah, I'm referring specifically to the idea that making Docker a requirement for a hello-world Airflow setup will make things better. I don't think it will.
On Wednesday, May 4, 2016, Lance Norskog <[email protected]> wrote:
> We use Docker at Edmodo and it really helped for Airflow.
>
> It's easy to say "pip install airflow" itself, but some of the database
> drivers require pip installs that then require dev versions of host .rpm or
> .deb packages because they want a .h file to compile against.
>
> We are porting a large complex Hadoop-based ETL to Airflow and used Docker
> to package web services that we call from Airflow.
>
> Another part of our system is that we want to set up Amazon "AutoStart
> Groups" to launch more Airflow executor servers when our main server
> becomes overloaded. We run a few large-memory Java jobs and this will be a
> problem soon. Our tooling lets us easily set this up with Docker. (We wrote
> something just like Docker Compose that talks to ASG. It's incredibly
> useful.)
>
> So, yeah, "pip install airflow" is fine for kicking the tires but we needed
> binary management rather quickly after that.
>
> Cheers,
>
> Lance
>
> On Wed, May 4, 2016 at 1:28 PM, Chris Riccomini <[email protected]> wrote:
>
> > > As far as ease of use, while docker is definitely getting more popular,
> > > it is hard to beat the current pip install flow for people not quite up
> > > to date on how to setup docker. It seems like one more hurdle if you just
> > > want to get started.
> >
> > Strongly agree. We tried to use Vagrant and then Docker with a prior
> > project, and it was a pain. Another project that I'm working with now uses
> > Docker for its hello-world stuff, and it's really troublesome. You will get
> > WAY more questions if you go this route than the current simple pip/sqlite
> > route.
> >
> > On Wed, May 4, 2016 at 12:27 PM, Maxime Beauchemin <[email protected]> wrote:
> >
> > > Yeah I'd be curious to see how the Docker setup instructions (from
> > > scratch) would compare to the current ones.
> > >
> > > On Wed, May 4, 2016 at 11:05 AM, Arthur Wiedmer <[email protected]> wrote:
> > >
> > > > +1, but it feels like just piling on.
> > > >
> > > > One thing we could consider is which part we would like to fix.
> > > >
> > > > - If it is the seriousness/production ready db, but that is still a
> > > > local db/client, we could try something like firebird.
> > > > Relatively small footprint and can do multithreading, it is supported
> > > > by SQLAlchemy, though it is not as easy to install as sqlite on most
> > > > *nixes.
> > > > We could spend some cycles baking this into containers as well.
> > > >
> > > > - As far as ease of use, while docker is definitely getting more
> > > > popular, it is hard to beat the current pip install flow for people not
> > > > quite up to date on how to setup docker. It seems like one more hurdle
> > > > if you just want to get started.
> > > >
> > > > Best,
> > > > Arthur
> > > >
> > > > On Wed, May 4, 2016 at 9:35 AM, Maxime Beauchemin <[email protected]> wrote:
> > > >
> > > > > Making it frictionless for people to get their feet wet is extremely
> > > > > important. It's been a requirement since the early prototypes and I
> > > > > feel strongly about keeping it that way. It's hard to test this
> > > > > hypothesis, but it could be a defining factor in the success of this
> > > > > project (to-date and future).
> > > > >
> > > > > Docker may allow for more batteries to be included and offer even
> > > > > less friction than the `pip install` path for folks who are familiar
> > > > > with it. I'd have to look to see if the community contributed Docker
> > > > > images are up to date. We may want to make that "the way to go" and
> > > > > change the tutorial / quick start instructions to reflect that if it
> > > > > makes sense.
> > > > > That may require integrating the burning of images as part of the
> > > > > build and/or release process.
> > > > >
> > > > > Max
> > > > >
> > > > > On Wed, May 4, 2016 at 6:33 AM, Jeremiah Lowin <[email protected]> wrote:
> > > > >
> > > > > > +1, shipping Airflow "batteries included" is very important in my
> > > > > > opinion. There is a lot to grok and the easiest way to learn is by
> > > > > > letting folks spin up a working installation right away.
> > > > > > Unfortunately I don't think there's a viable alternative to SQLite
> > > > > > that is also supported by SQLAlchemy.
> > > > > >
> > > > > > On Wed, May 4, 2016 at 2:57 AM Prateek Rungta <[email protected]> wrote:
> > > > > >
> > > > > > > It's documented pretty well that it's only for people to get
> > > > > > > their feet wet with. From the quickstart
> > > > > > > <http://pythonhosted.org/airflow/start.html?highlight=sqlite>:
> > > > > > >
> > > > > > > Out of the box, Airflow uses a sqlite database, which you should
> > > > > > > outgrow fairly quickly since no parallelization is possible using
> > > > > > > this database backend. It works in conjunction with the
> > > > > > > SequentialExecutor which will only run task instances
> > > > > > > sequentially. While this is very limiting, it allows you to get
> > > > > > > up and running quickly and take a tour of the UI and the command
> > > > > > > line utilities.
> > > > > > >
> > > > > > > FWIW, I'm now on day 2 of using Airflow. And while I wouldn't
> > > > > > > dream of deploying Airflow using SQLite beyond my laptop, I quite
> > > > > > > appreciated being able to mess with Airflow without any of the
> > > > > > > infrastructural constraints.
> > > > > > >
> > > > > > > On Tue, May 3, 2016 at 11:18 PM, Siddharth Anand <[email protected]> wrote:
> > > > > > >
> > > > > > > > From time to time, we run into bugs with the SQLite dialect in
> > > > > > > > SQLAlchemy and close the bugs as "wont-fix" because we don't
> > > > > > > > want to be in the business of fixing such bug. We deem SQLite
> > > > > > > > as a "non-serious" database that no one [in his/her right mind]
> > > > > > > > would run in his/her staging, qa, or production environments.
> > > > > > > > However, we rely on the SequentialExecutor and one the SQLite
> > > > > > > > DB for our tests.
> > > > > > > > What should we do with SQLite? Should we lift up the hood and
> > > > > > > > fix it for our needs or find either a different ORM or a
> > > > > > > > different option for DB backend?
> > > > > > > > Example of bugs we encounter and close as won't fix : 1.
> > > > > > > > Deleting a task instance :
> > > > > > > > https://github.com/airbnb/airflow/issues/9552. Weird pickle
> > > > > > > > issue : https://issues.apache.org/jira/browse/AIRFLOW-46
>
> --
> Lance Norskog
> [email protected]
> Redwood City, CA
