Others? WDYT? Shall we start voting on it? Any more comments?

I would like to propose an interim solution: all the backported packages
for 1.10 are released as a single big package with CalVer versioning,
together with a compatibility matrix where we mark which of the providers
were tested (semi-automatically at first, possibly automatically over time
using system tests, following the AIP-4 proposal).

Eventually - maybe even for 2.0 - we will be able to split the packages on
a per-provider basis and release them independently, but that is something
we can test and agree on later, when we discuss the overall release
approach (including possibly semantic or calendar versioning for 2.*
releases).

Let me know if you have any objections, if not, I will call a vote on that
in a day or so.

J.


On Fri, Feb 14, 2020 at 9:46 PM Jarek Potiuk <[email protected]>
wrote:

> How about going both routes?
>
> 1) Provide one big "backport" package for 1.10
> 2) Once we release 2.0, split providers into micro-packages
>
> J.
>
> On Fri, Feb 14, 2020 at 9:30 PM Ash Berlin-Taylor <[email protected]> wrote:
>
>> I think before we take this discussion any further we should work out
>> what our plan is for AIP-8
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100827303&src=contextnavpagetreemode
>> (though it likely needs updating, as it still talks about contrib, which
>> isn't relevant anymore)
>>
>> AIP-8 talks about "One hook or operator per package, following the "micro
>> package" philosophy." as its long-term goal, and I think I broadly agree
>> with that.
>> Given we have almost all the things in place to have this, I would rather
>> we didn't release a single large "backport" package, only to have
>> users then switch over to using new packages.
>>
>> > We can follow the same process/keys etc as for releasing the main airflow
>> > package, but I think it can be a bit more relaxed in terms of testing - and
>> > we can release it more often (as long as there will be new changes in
>> > providers). Those packages might be released on "as-is" basis - without
>> > guarantee that they work for all operators/hooks/sensors - and without
>> > guarantee that they will work for all 1.10.* versions.
>>
>> I'm in favour of this as a general idea.
>> My preferred way is to have each "provider" be its own package. This is
>> a slightly fuzzy concept, as for instance airflow.providers.google
>> probably makes sense as a single package rather than a .google.cloud and
>> .google.marketing etc packages (as per Kamil's comment on GitHub), but
>> apache.airflow.providers.apache should _not_ be one package. So there's no
>> easily expressible rule here, but (to me) there is an obvious way for each
>> case.
>> Anyway, to provide smaller releases of providers as per Terraform, or to
>> backport to make 2.0 adoption easier?
>> -a
>> On Feb 11 2020, at 3:43 pm, Jarek Potiuk <[email protected]>
>> wrote:
>> > Any more opinions?
>> >
>> > I gave some thought to that and I think we should:
>> > 1) release one big providers* package with CalVer versioning -
>> > apache-airflow-providers-backport-2020.02.11 if we were to release it today
>> > (we can always break them into smaller packages when we decide in 2.0). And
>> > then we could change the package names.
>> > 2) scheduled or regular releases. We should release them as needed - i.e.
>> > if we have a large change in one or a few of the providers or a serious
>> > bugfix, we can release it again.
>> > 3) it should be a manual effort involving voting and PMC approvals.
>> >
>> > What do you think?
>> > J.
>> >
>> > On Mon, Feb 10, 2020 at 2:43 PM Tomasz Urbaszek <[email protected]> wrote:
>> >
>> > > I am ok with users building their own packages.
>> > > T.
>> > > On Mon, Feb 10, 2020 at 1:47 PM Jarek Potiuk <[email protected]> wrote:
>> > >
>> > > > I think it should be a deliberate effort for releasing - with voting. We
>> > > > are releasing the source code and IMHO it should follow the same rules as
>> > > > releasing airflow itself.
>> > > > With this change - anyone will be able to build and prepare their own .whl
>> > > > packages and install them locally, so I do not think there is a need to
>> > > > automatically release those packages?
>> > > >
>> > > > However, releasing them on PyPI should be quite an important event, as PyPI
>> > > > releases are supposed to be used by users, not developers.
>> > > >
>> > > > J.
>> > > > On Mon, Feb 10, 2020 at 11:16 AM Tomasz Urbaszek <[email protected]> wrote:
>> > > >
>> > > > > I think as long as we follow:
>> > > > > > The only people who are supposed to know about such developer resources
>> > > > > > are individuals actively participating in development or following the dev
>> > > > > > list and thus aware of the conditions placed on unreleased materials.
>> > > > >
>> > > > > we should be ok. My impression is that people are usually aware of
>> > > > > what "nightly build" means and what the risks are. But it's just a
>> > > > > suggestion that I made thinking about all those people who contribute
>> > > > > an integration and can't use it "officially" for, let's say, the following 2
>> > > > > months. I was also thinking about this result
>> > > > > https://www.digitalocean.com/currents/december-2019/#generational-expectations-for-open-source-maintenance
>> > > > > :)
>> > > > >
>> > > > > T.
>> > > > > On Mon, Feb 10, 2020 at 10:52 AM Ash Berlin-Taylor <[email protected]> wrote:
>> > > > > >
>> > > > > > That might be a grey area according to my reading of the Apache release
>> > > > > > policies:
>> > > > > >
>> > > > > > https://apache.org/legal/release-policy.html#publication
>> > > > > > > During the process of developing software and preparing a release,
>> > > > > > > various packages are made available to the development community for
>> > > > > > > testing purposes. Projects MUST direct outsiders towards official releases
>> > > > > > > rather than raw source repositories, nightly builds, snapshots, release
>> > > > > > > candidates, or any other similar packages. The only people who are supposed
>> > > > > > > to know about such developer resources are individuals actively
>> > > > > > > participating in development or following the dev list and thus aware of
>> > > > > > > the conditions placed on unreleased materials.
>> > > > > > On Feb 10 2020, at 9:49 am, Tomasz Urbaszek <[email protected]> wrote:
>> > > > > > > As for the frequency of releases, maybe we can consider "nightly
>> > > > > > > builds" for providers? In this way any contributed hook/operator will
>> > > > > > > be pip-installable in 24h, so users can start to use it = test it.
>> > > > > > > This can help us reduce the number of releases with non-working
>> > > > > > > integrations.
>> > > > > > >
>> > > > > > > Tomek
>> > > > > > > On Mon, Feb 10, 2020 at 12:11 AM Jarek Potiuk <[email protected]> wrote:
>> > > > > > > >
>> > > > > > > > TL;DR; I wanted to discuss the approach we are going to take for backported
>> > > > > > > > providers packages. This is important for the PMC to decide how we are
>> > > > > > > > going to run the release process for it, but I wanted to make it a public
>> > > > > > > > discussion so that anyone else can chime in and we can discuss it as a
>> > > > > > > > community.
>> > > > > > > >
>> > > > > > > > *Context*
>> > > > > > > > As explained in the other thread - we are close to having releasable/tested
>> > > > > > > > backport packages for Airflow 1.10.* series for "providers"
>> > > > > > > > operators/hooks/packages. The main purpose of those backport packages is to
>> > > > > > > > let users migrate to the new operators before they migrate to the 2.0.*
>> > > > > > > > version of Airflow.
>> > > > > > > >
>> > > > > > > > The 2.0 version is still some time in the future, and we have a number of
>> > > > > > > > operators/hooks/sensors implemented that are not actively used/tested
>> > > > > > > > because they are in the master version. There are a number of changes and
>> > > > > > > > fixes only implemented in master/2.0 so it would be great to use them in
>> > > > > > > > 1.10 - to use the new features but also to test the master versions as
>> > > > > > > > early as possible.
>> > > > > > > >
>> > > > > > > > Another great property of the backport packages is that they can be used to
>> > > > > > > > ease the migration process - users can install the "apache-airflow-providers"
>> > > > > > > > package and start using the new operators without migrating to a new
>> > > > > > > > Airflow. They can incrementally move all their DAGs to use the new
>> > > > > > > > "providers" package and only after all that is migrated they can migrate
>> > > > > > > > Airflow to 2.0 when they are ready. That gives those users a smooth
>> > > > > > > > migration path.
>> > > > > > > >
>> > > > > > > > *Testing*
>> > > > > > > > The issue we have with those packages is that we are not 100% sure if the
>> > > > > > > > "providers" operators will work with any 1.10.* airflow version. There were
>> > > > > > > > no fundamental changes and they SHOULD work - but we never know until we
>> > > > > > > > test.
>> > > > > > > >
>> > > > > > > > Some preliminary tests with a subset of GCP operators show that the
>> > > > > > > > operators work out of the box. We have a big set of "system" tests for "GCP"
>> > > > > > > > operators that we will run semi-automatically to make sure that all GCP
>> > > > > > > > operators are working fine. This is already a great compatibility test (GCP
>> > > > > > > > operators are about 1/3 of all operators for Airflow). But also the
>> > > > > > > > approach used in GCP system tests can be applied to other operators.
>> > > > > > > >
>> > > > > > > > I plan to have a matrix of "compatibilities" in
>> > > > > > > > https://cwiki.apache.org/confluence/display/AIRFLOW/Backported+providers+packages+for+Airflow+1.10.*+series
>> > > > > > > > and ask the community to add/run tests with other packages as well. It
>> > > > > > > > should be rather easy to add system tests for other systems - following
>> > > > > > > > the way it is implemented for GCP.
>> > > > > > > >
>> > > > > > > > *Releases*
>> > > > > > > > I think the most important decision is how we are going to release the
>> > > > > > > > packages. This is where the PMC has to decide, I think, as we have legal
>> > > > > > > > responsibility for releasing official Apache Airflow software.
>> > > > > > > >
>> > > > > > > > What we have now (after the PRs get merged) - wheel and source packages
>> > > > > > > > built automatically in Travis CI and uploaded to file.io ephemeral storage.
>> > > > > > > > The builds upload all the packages there - one big "providers" package and
>> > > > > > > > separate packages for each "provider".
>> > > > > > > >
>> > > > > > > > It would be great if we could officially publish packages for backporting
>> > > > > > > > on PyPI, however, and here is where we have to agree on the
>> > > > > > > > process/versioning/cadence.
>> > > > > > > >
>> > > > > > > > We can follow the same process/keys etc as for releasing the main airflow
>> > > > > > > > package, but I think it can be a bit more relaxed in terms of testing - and
>> > > > > > > > we can release it more often (as long as there will be new changes in
>> > > > > > > > providers). Those packages might be released on "as-is" basis - without
>> > > > > > > > guarantee that they work for all operators/hooks/sensors - and without
>> > > > > > > > guarantee that they will work for all 1.10.* versions. We can have the
>> > > > > > > > "compatibility" statement/matrix in our wiki where people who tested some
>> > > > > > > > package might simply state that it works for them. At Polidea we can assume
>> > > > > > > > stewardship on the GCP packages and test them using our automated system
>> > > > > > > > tests for every release for example - maybe others can assume
>> > > > > > > > stewardship for other providers.
>> > > > > > > >
>> > > > > > > > For that - we will need some versioning/release policy. I would say a CalVer
>> > > > > > > > <https://calver.org/> approach might work best (YYYY.MM.DD). And to make it
>> > > > > > > > simple we should release one "big" providers package with all providers in.
>> > > > > > > > We can have a roughly monthly cadence for it.
>> > > > > > > >
>> > > > > > > > But I am also open to any suggestions here.
>> > > > > > > > Please let me know what you think.
>> > > > > > > > J.
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > --
>> > > > > > > > Jarek Potiuk
>> > > > > > > > Polidea <https://www.polidea.com/> | Principal Software
>> Engineer
>> > > > > > > >
>> > > > > > > > M: +48 660 796 129 <+48660796129>
>> > > > > > > > [image: Polidea] <https://www.polidea.com/>
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > --
>> > > > > > > Tomasz Urbaszek
>> > > > > > > Polidea | Software Engineer
>> > > > > > >
>> > > > > > > M: +48 505 628 493
>> > > > > > > E: [email protected]
>> > > > > > >
>> > > > > > > Unique Tech
>> > > > > > > Check out our projects!
>> > > > > > >

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>
