Totally agree on all your points, Sid.

My feeling is that at the moment the most critical thing for the project is
to get a release out and get to a steady pace of high-quality releases.

Breaking down the package seems to me like it would really help with
the release process. One idea is to make 1.9 a release made of smaller
packages where airflow = `(airflow-core + airflow-operators +
airflow-webserver)` or something like that. I'm thinking it would allow us to
release often on airflow-operators & airflow-webserver.
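
As a rough illustration of the meta-package idea, the sketch below builds the
keyword arguments a top-level `airflow` setup.py might pass to setuptools. The
sub-package names come from the paragraph above; the helper name
`meta_package_kwargs`, the version number, and the pinning scheme are
assumptions made for the example, not an agreed layout.

```python
# Hypothetical sketch: package names and version pins are illustrative only.
def meta_package_kwargs(version="1.9.0"):
    """Build setup() keyword arguments for an `airflow` meta-package."""
    major, minor = version.split(".")[:2]
    # Pin each sub-package to the same minor series as the meta-package,
    # so sub-packages can still ship patch releases on their own schedule.
    pin = ">={0}.{1},<{0}.{2}".format(major, minor, int(minor) + 1)
    return {
        "name": "airflow",
        "version": version,
        "packages": [],  # the meta-package ships no code of its own
        "install_requires": [
            "airflow-core" + pin,
            "airflow-operators" + pin,
            "airflow-webserver" + pin,
        ],
    }


# In a real setup.py this dict would be passed on: setuptools.setup(**kwargs)
kwargs = meta_package_kwargs()
```

With something along these lines, `pip install airflow` would pull in all
three pieces, while a fix to airflow-operators or airflow-webserver could be
released without cutting a new airflow-core.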

Max

On Fri, Nov 18, 2016 at 5:34 PM, siddharth anand <[email protected]> wrote:

> David,
> https://issues.apache.org/jira/browse/AIRFLOW-558 (i.e.
> https://github.com/apache/incubator-airflow/pull/1830 ) is on my plate. It
> has already gone through many rounds of reviews, testing, and fixes with the
> submitter and does not need to wait until 2.0. We should be able to merge it
> soon. BTW, you are encouraged to vote on these PRs so maintainers can
> prioritize their time.
>
> Max,
>
> Thanks for kicking off this thread.
>
> Regarding 2.0, we've associated feature deprecation and non-backward
> compatible changes with 2.0. Some of this work might be pretty
> earth-shaking to Airflow users. IMHO, changes that increase user pain at
> upgrade time need to be carefully balanced against value.
>
> Watching both Gitter and the email list, there are a variety of stumbling
> points (for new users) that many of us who have been using the product for
> 1-2 years have forgotten. A fair number of people still mention that
> getting Airflow up and running is no simple task - e.g. Alex mentioned this
> in his talk at the last meet-up. The recent BlueYonder talk referenced
> https://cwiki.apache.org/confluence/display/AIRFLOW/Common+Pitfalls
>
> Though we may be numerically near 2.0 in terms of release numbers, I'd
> prefer to prioritize a few things higher than releasing 2.0. We need to
> build and exercise a few necessary muscles : timely PR processing & timely
> Apache releases (i.e. quarterly). Beyond that, I'd like to prioritize the
> "common pitfall" problems to ease on-boarding. Some of these don't need to
> wait for a major release. The ones that do can be developed on a separate
> 2.0 branch and baked, reviewed, and voted on by the community before we
> consider dropping it into master.
>
> That way, we can keep master healthy to support the increasing rate of
> community-submitted PRs that we are seeing and reduce the cycle time of
> cutting stable releases, all while working on big-bang changes for 2.0
> independently.
>
> Just my $0.02
> -s
>
> On Fri, Nov 18, 2016 at 3:57 PM, Chris Riccomini <[email protected]>
> wrote:
>
> > > RIP out the charting application and the data profiler
> >
> > Yes please! +1
> >
> > On Fri, Nov 18, 2016 at 2:41 PM, Maxime Beauchemin
> > <[email protected]> wrote:
> > > Another point that may be controversial for Airflow 2.0: RIP out the
> > > charting application and the data profiler. Even though it's nice to
> > > have it there, it's just out of scope and has major security
> > > issues/implications.
> > >
> > > I'm not sure how popular it actually is. We may need to run a survey at
> > > some point around this kind of question.
> > >
> > > Max
> > >
> > > On Fri, Nov 18, 2016 at 2:39 PM, Maxime Beauchemin <
> > > [email protected]> wrote:
> > >
> > >> Using FAB's Model, we get pretty much all of that (REST API,
> > >> auth/perms, CRUD) for free:
> > >> http://flask-appbuilder.readthedocs.io/en/latest/quickhowto.html?highlight=rest#exposed-methods
> > >>
> > >> I'm pretty intimate with FAB since I use it (and contributed to it)
> > >> for Superset/Caravel.
> > >>
> > >> All that's needed is to derive from FAB's model class instead of
> > >> SqlAlchemy's model class (which FAB's model wraps and adds
> > >> functionality to and is 100% compatible AFAICT).
> > >>
> > >> Max
> > >>
> > >> On Fri, Nov 18, 2016 at 2:07 PM, Chris Riccomini <[email protected]>
> > >> wrote:
> > >>
> > >>> > It may be doable to run this as a different package
> > >>> > `airflow-webserver`, an alternate UI at first, and to eventually
> > >>> > rip out the old UI off of the main package.
> > >>>
> > >>> This is the same strategy that I was thinking of for AIRFLOW-85. You
> > >>> can build the new UI in parallel, and then delete the old one later.
> > >>> I really think that a REST interface should be a pre-req to any
> > >>> large/new UI changes, though. Getting unified so that everything is
> > >>> driven through REST will be a big win.
> > >>>
> > >>> On Fri, Nov 18, 2016 at 1:51 PM, Maxime Beauchemin
> > >>> <[email protected]> wrote:
> > >>> > A multi-tenant UI with composable roles on top of granular
> > >>> > permissions.
> > >>> >
> > >>> > Migrating from Flask-Admin to Flask App Builder would be an
> > >>> > easy-ish win (since they're both Flask). FAB provides a good
> > >>> > authentication and permission model that ships out-of-the-box with
> > >>> > a REST API. It suffices to define FAB models (derivative of
> > >>> > SQLAlchemy's model) and you get a set of perms for the model
> > >>> > (can_show, can_list, can_add, can_change, can_delete, ...) and a
> > >>> > set of CRUD REST endpoints. It would also allow us to rip the
> > >>> > authentication backend code out of Airflow and rely on FAB for
> > >>> > that. Also every single view gets permissions auto-created for it,
> > >>> > and there are easy ways to define row-level type filters based on
> > >>> > user permissions.
> > >>> >
> > >>> > It may be doable to run this as a different package
> > >>> > `airflow-webserver`, an alternate UI at first, and to eventually
> > >>> > rip out the old UI off of the main package.
> > >>> >
> > >>> > https://flask-appbuilder.readthedocs.io/en/latest/
> > >>> >
> > >>> > I'd love to carve some time and lead this.
> > >>> >
> > >>> > On Fri, Nov 18, 2016 at 1:32 PM, Chris Riccomini <[email protected]>
> > >>> > wrote:
> > >>> >
> > >>> >> Full-fledged REST API (that the UI also uses) would be great in
> > >>> >> 2.0.
> > >>> >>
> > >>> >> On Fri, Nov 18, 2016 at 6:26 AM, David Kegley <[email protected]>
> > >>> >> wrote:
> > >>> >> > Hi All,
> > >>> >> >
> > >>> >> > We have been using Airflow heavily for the last couple months
> > >>> >> > and it’s been great so far. Here are a few things we’d like to
> > >>> >> > see prioritized in 2.0.
> > >>> >> >
> > >>> >> > 1) Role based access to DAGs:
> > >>> >> > We would like to see better role based access through the UI.
> > >>> >> > There’s a related ticket out there but it hasn’t seen any action
> > >>> >> > in a few months
> > >>> >> > https://issues.apache.org/jira/browse/AIRFLOW-85
> > >>> >> >
> > >>> >> > We use a templating system to create/deploy DAGs dynamically
> > >>> >> > based on some directory/file structure. This allows analysts to
> > >>> >> > quickly deploy and schedule their ETL code without having to
> > >>> >> > interact with the Airflow installation directly. It would be
> > >>> >> > great if those same analysts could access their own DAGs in the
> > >>> >> > UI so that they can clear DAG runs, mark success, etc. while
> > >>> >> > keeping them away from our core ETL and other
> > >>> >> > people's/organization's DAGs. Some of this can be accomplished
> > >>> >> > with ‘filter by owner’ but it doesn’t address the use case where
> > >>> >> > a DAG can be maintained by multiple users in the same
> > >>> >> > organization when they have separate Airflow user accounts.
> > >>> >> >
> > >>> >> > 2) An option to turn off backfill:
> > >>> >> > https://issues.apache.org/jira/browse/AIRFLOW-558
> > >>> >> > For cases where a DAG does an insert overwrite on a table every
> > >>> >> > day. This might be a realistic option for the current version
> > >>> >> > but I just wanted to call attention to this feature request.
> > >>> >> >
> > >>> >> > Best,
> > >>> >> > David
> > >>> >> >
> > >>> >> > On Nov 17, 2016, at 6:19 PM, Maxime Beauchemin
> > >>> >> > <[email protected]> wrote:
> > >>> >> >
> > >>> >> > *This is a brainstorm email thread about Airflow 2.0!*
> > >>> >> >
> > >>> >> > I wanted to share some ideas around what I would like to do in
> > >>> >> > Airflow 2.0 and would love to hear what others are thinking. I'll
> > >>> >> > compile the ideas that are shared in this thread in a Wiki once
> > >>> >> > the conversation fades.
> > >>> >> >
> > >>> >> > -------------------------------------------
> > >>> >> >
> > >>> >> > First idea, to get the conversation started:
> > >>> >> >
> > >>> >> > *Breaking down the package*
> > >>> >> > `pip install airflow-common airflow-scheduler airflow-webserver
> > >>> >> > airflow-operators-googlecloud ...`
> > >>> >> >
> > >>> >> > It seems to me like we're getting to a point where having
> > >>> >> > different repositories and different packages would make things
> > >>> >> > much easier in all sorts of ways. For instance the web server is
> > >>> >> > a lot less sensitive than the scheduler, and changes to operators
> > >>> >> > should/could be deployed at will, independently from the main
> > >>> >> > package. People in their environment could upgrade only certain
> > >>> >> > packages when needed. Travis builds would be more targeted, and
> > >>> >> > take less time, ...
> > >>> >> >
> > >>> >> > Also, the whole current "extras_require" approach to optional
> > >>> >> > dependencies (in setup.py) is kind of getting out-of-hand.
> > >>> >> >
> > >>> >> > Of course `pip install airflow` would bring in a collection of
> > >>> >> > sub-packages similar in functionality to what it does now,
> > >>> >> > perhaps without so many operators you probably don't need in
> > >>> >> > your environment.
> > >>> >> >
> > >>> >> > The release process is the main pain-point and the biggest risk
> > >>> >> > for the project, and I feel like this is a solid solution to
> > >>> >> > address it.
> > >>> >> >
> > >>> >> > Max
> > >>> >> >
> > >>> >>
> > >>>
> > >>
> > >>
> >
>
