+1 on voting. I read the discussion very carefully and I agree that it is worth creating an interim solution. It is a very good idea, as long as it does not remain a temporary and experimental solution indefinitely. We have a clear deadline, so we can go ahead with this.
On Mon, Feb 17, 2020 at 8:35 AM Jarek Potiuk <[email protected]> wrote:
>
> Others? WDYT? Shall we start voting on it? Any more comments?
>
> I think I would like to propose an interim solution where all the backported packages for 1.10 will be released as a single big package with CalVer versioning and with some compatibility matrix where we will mark which of the providers were tested (semi-automatically?), possibly over time automatically using system tests (following the AIP-4 proposal).
>
> Eventually - maybe even for 2.0 - we will be able to split the packages on a per-provider basis and release them independently - but that is something that we can test and agree on later - when we will be discussing the overall release approach (including possibly semantic or calendar versioning for 2.* releases).
>
> Let me know if you have any objections; if not, I will call a vote on that in a day or so.
>
> J.
>
> On Fri, Feb 14, 2020 at 9:46 PM Jarek Potiuk <[email protected]> wrote:
>
> > How about going both routes?
> >
> > 1) Provide one big "backport" package for 1.10
> > 2) Once we release 2.0, split providers into micro-packages
> >
> > J.
> >
> > On Fri, Feb 14, 2020 at 9:30 PM Ash Berlin-Taylor <[email protected]> wrote:
> >
> >> I think before we take this discussion any further we should work out what our plan is for AIP-8
> >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100827303&src=contextnavpagetreemode
> >> (though it likely needs updating as it still talks about contrib, which isn't relevant anymore)
> >>
> >> AIP-8 talks about "One hook or operator per package, following the "micro package" philosophy." as its long-term goal, and I think I broadly agree with that.
> >> Given we have almost all the things in place to have this, I would rather we didn't release a single large "backport" package, only to have users then switch over to using new packages.
> >> > We can follow the same process/keys etc. as for releasing the main airflow package, but I think it can be a bit more relaxed in terms of testing - and we can release it more often (as long as there will be new changes in providers). Those packages might be released on "as-is" basis - without guarantee that they work for all operators/hooks/sensors - and without guarantee that they will work for all 1.10.* versions.
> >>
> >> I'm in favour of this as a general idea.
> >> My preferred way is to have each "provider" be its own package. This is a slightly fuzzy concept, as for instance airflow.providers.google probably makes sense as a single package rather than .google.cloud and .google.marketing etc. packages (as per Kamil's comment on GitHub), but apache.airflow.providers.apache should _not_ be one package. So there's no easily expressible rule here, but (to me) there is an obvious way for each case.
> >> Anyway, to provide smaller releases of providers as per terraform, or to backport to make 2.0 adoption easier?
> >> -a
> >> On Feb 11 2020, at 3:43 pm, Jarek Potiuk <[email protected]> wrote:
> >> > Any more opinions?
> >> >
> >> > I gave some thought to that and I think we should:
> >> > 1) release one big providers* package with CalVer versioning - apache-airflow-providers-backport-2020.02.11 if we were to release it today (we can always break them into smaller packages when we decide in 2.0). And then we could change the package names.
> >> > 2) scheduled or regular releases. We should release them as needed - i.e. if we have a large change in one or a few of the providers or a serious bugfix, we can release it again.
> >> > 3) it should be a manual effort involving voting and PMC approvals.
> >> >
> >> > What do you think?
> >> > J.
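To make the naming in point 1 above concrete, here is a minimal Python sketch of how such a CalVer release name could be derived from a release date. The package name is taken from the example in the proposal; the helper names `calver` and `release_name` are hypothetical and not part of any Airflow tooling. Note that Python packaging tools may normalize the zero-padded segments (2020.02.11 becoming 2020.2.11) as part of PEP 440 version normalization, so the published version string might differ slightly from the one built here.

```python
from datetime import date

# Distribution name as written in the proposal above; treat it as an
# illustrative assumption, not a final package name.
PACKAGE_NAME = "apache-airflow-providers-backport"


def calver(release_date: date) -> str:
    """Return a zero-padded YYYY.MM.DD CalVer string for the given date."""
    return release_date.strftime("%Y.%m.%d")


def release_name(release_date: date) -> str:
    """Return the package name joined with its CalVer version (hypothetical helper)."""
    return f"{PACKAGE_NAME}-{calver(release_date)}"


if __name__ == "__main__":
    # The date used as the example in the thread.
    print(release_name(date(2020, 2, 11)))
```

Because the version is derived only from the release date, releasing "as needed" works without any extra bookkeeping, as long as there is at most one release per day.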
> >> >
> >> > On Mon, Feb 10, 2020 at 2:43 PM Tomasz Urbaszek <[email protected]> wrote:
> >> >
> >> > > I am ok with users building their own packages.
> >> > > T.
> >> > > On Mon, Feb 10, 2020 at 1:47 PM Jarek Potiuk <[email protected]> wrote:
> >> > >
> >> > > > I think it should be a deliberate effort for releasing - with voting. We are releasing the source code and IMHO it should follow the same rules as releasing airflow itself.
> >> > > > With this change - anyone will be able to build and prepare their own .whl packages and install them locally, so I do not think there is a need to automatically release those packages?
> >> > > >
> >> > > > However, releasing them on PyPI should be quite an important event, as PyPI releases are supposed to be used by users, not developers.
> >> > > >
> >> > > > J.
> >> > > > On Mon, Feb 10, 2020 at 11:16 AM Tomasz Urbaszek <[email protected]> wrote:
> >> > > >
> >> > > > > I think as long as we follow:
> >> > > > > > The only people who are supposed to know about such developer resources are individuals actively participating in development or following the dev list and thus aware of the conditions placed on unreleased materials.
> >> > > > >
> >> > > > > we should be ok. My impression is that people are usually aware of what "nightly build" means and what the risks are. But it's just a suggestion that I made thinking about all those people who contribute an integration and can't use it "officially" for, let's say, the following 2 months. I was also thinking about this result
> >> > > > > https://www.digitalocean.com/currents/december-2019/#generational-expectations-for-open-source-maintenance
> >> > > > > :)
> >> > > > >
> >> > > > > T.
> >> > > > > On Mon, Feb 10, 2020 at 10:52 AM Ash Berlin-Taylor <[email protected]> wrote:
> >> > > > > >
> >> > > > > > That might be a grey area according to my reading of the Apache release policies:
> >> > > > > >
> >> > > > > > https://apache.org/legal/release-policy.html#publication
> >> > > > > > > During the process of developing software and preparing a release, various packages are made available to the development community for testing purposes. Projects MUST direct outsiders towards official releases rather than raw source repositories, nightly builds, snapshots, release candidates, or any other similar packages. The only people who are supposed to know about such developer resources are individuals actively participating in development or following the dev list and thus aware of the conditions placed on unreleased materials.
> >> > > > > > On Feb 10 2020, at 9:49 am, Tomasz Urbaszek <[email protected]> wrote:
> >> > > > > > > As per the frequency of releases maybe we can consider "nightly builds" for providers? In this way any contributed hook/operator will be pip-installable in 24h, so users can start to use it = test it. This can help us reduce the number of releases with non-working integrations.
> >> > > > > > >
> >> > > > > > > Tomek
> >> > > > > > > On Mon, Feb 10, 2020 at 12:11 AM Jarek Potiuk <[email protected]> wrote:
> >> > > > > > > >
> >> > > > > > > > TL;DR; I wanted to discuss the approach we are going to take for backported providers packages. This is important for PMCs to decide how we are going to run the release process for it, but I wanted to make it a public discussion so that anyone else can chime in and we can discuss it as a community.
> >> > > > > > > >
> >> > > > > > > > *Context*
> >> > > > > > > > As explained in the other thread - we are close to having releasable/tested backport packages for the Airflow 1.10.* series for "providers" operators/hooks/packages. The main purpose of those backport packages is to let users migrate to the new operators before they migrate to the 2.0.* version of Airflow.
> >> > > > > > > >
> >> > > > > > > > The 2.0 version is still some time in the future, and we have a number of operators/hooks/sensors implemented that are not actively used/tested because they are in the master version. There are a number of changes and fixes only implemented in master/2.0 so it would be great to use them in 1.10 - to use the new features but also to test the master versions as early as possible.
> >> > > > > > > >
> >> > > > > > > > Another great property of the backport packages is that they can be used to ease the migration process - users can install the "apache-airflow-providers" package and start using the new operators without migrating to a new Airflow. They can incrementally move all their DAGs to use the new "providers" package and only after all that is migrated they can migrate Airflow to 2.0 when they are ready. That allows a smooth migration path for those users.
> >> > > > > > > >
> >> > > > > > > > *Testing*
> >> > > > > > > > The issue we have with those packages is that we are not 100% sure if the "providers" operators will work with any 1.10.* airflow version. There were no fundamental changes and they SHOULD work - but we never know until we test.
> >> > > > > > > >
> >> > > > > > > > Some preliminary tests with a subset of GCP operators show that the operators work out of the box. We have a big set of "system" tests for "GCP" operators that we will run semi-automatically and make sure that all GCP operators are working fine. This is already a great compatibility test (GCP operators are about 1/3 of all operators for Airflow). But also the approach used in GCP system tests can be applied to other operators.
> >> > > > > > > >
> >> > > > > > > > I plan to have a matrix of "compatibilities" in
> >> > > > > > > > https://cwiki.apache.org/confluence/display/AIRFLOW/Backported+providers+packages+for+Airflow+1.10.*+series
> >> > > > > > > > and ask the community to add/run tests with other packages as well. It should be rather easy to add system tests for other systems - following the way it is implemented for GCP.
> >> > > > > > > >
> >> > > > > > > > *Releases*
> >> > > > > > > > I think the most important decision is how we are going to release the packages. This is where PMCs have to decide, I think, as we have legal responsibility for releasing Apache Airflow official software.
> >> > > > > > > >
> >> > > > > > > > What we have now (after the PRs get merged) - wheel and source packages built automatically in Travis CI and uploaded to file.io ephemeral storage. The builds upload all the packages there - one big "providers" package and separate packages for each "provider".
> >> > > > > > > >
> >> > > > > > > > It would be great if we could officially publish packages for backporting on PyPI, however, and here is where we have to agree on the process/versioning/cadence.
> >> > > > > > > >
> >> > > > > > > > We can follow the same process/keys etc. as for releasing the main airflow package, but I think it can be a bit more relaxed in terms of testing - and we can release it more often (as long as there will be new changes in providers). Those packages might be released on "as-is" basis - without guarantee that they work for all operators/hooks/sensors - and without guarantee that they will work for all 1.10.* versions. We can have the "compatibility" statement/matrix in our wiki where people who tested some package might simply state that it works for them. At Polidea we can assume stewardship on the GCP packages and test them using our automated system tests for every release for example - maybe others can assume stewardship for other providers.
> >> > > > > > > >
> >> > > > > > > > For that - we will need some versioning/release policy. I would say a CalVer <https://calver.org/> approach might work best (YYYY.MM.DD). And to make it simple we should release one "big" providers package with all providers in. We can have a roughly monthly cadence for it.
> >> > > > > > > >
> >> > > > > > > > But I am also open to any suggestions here.
> >> > > > > > > > Please let me know what you think.
> >> > > > > > > > J.
> >> > > > > > > >
> >> > > > > > > > --
> >> > > > > > > > Jarek Potiuk
> >> > > > > > > > Polidea <https://www.polidea.com/> | Principal Software Engineer
> >> > > > > > > >
> >> > > > > > > > M: +48 660 796 129 <+48660796129>
> >> > > > > > >
> >> > > > > > > --
> >> > > > > > > Tomasz Urbaszek
> >> > > > > > > Polidea | Software Engineer
> >> > > > > > >
> >> > > > > > > M: +48 505 628 493
> >> > > > > > > E: [email protected]
