A few points from my side to discuss today, to plan the work for the upcoming week:

* For the 1st beta, I think it will be important to release the PyPI packages.
* I think the automated provider registration might be partially ready for the
  beta as well, though probably not all of its features; I plan those for beta2.
* Even if this is not yet needed for the betas, we need to discuss how to
  approach separating the documentation per provider. It would be great to
  separate it fully, especially the per-provider AutoAPI build (also on the CI).

J.

On Sun, Oct 18, 2020 at 10:58 PM Jarek Potiuk <[email protected]> wrote:

> The usual pre-meeting summary from my side.
>
> In preparation for the Airflow 2.0 dev call tomorrow, I have prepared some
> bug fixes and improvements to the providers approach. We are making steady
> progress in the mini-project https://github.com/apache/airflow/projects/5 .
> Some of them are already merged (thanks to those who reviewed), but I have
> 2 PRs in progress to get to the place where it is ready to release alpha2:
>
> https://github.com/apache/airflow/pull/11630 - The .tar.gz provider
> packages are installable now.
> https://github.com/apache/airflow/pull/11586 - Fixes versioning for
> pre-release provider packages
>
> The small thing that I've stumbled upon while preparing the provider
> package was the Licences/Notice review -
> https://github.com/apache/airflow/issues/11632 . As Ash mentioned in the
> comment, the current "dupli/tripli-cation of reporting the licences" is
> likely OK (as we've done that at incubator graduation). Ry Walker
> already had some comments / results of licence checks recently, so maybe we
> can talk about it and agree on some common approach (or agree that there is
> nothing to discuss) tomorrow.
>
> For the provider packages, I reviewed and removed all the license deps
> as not needed, but it would be great to talk about it tomorrow anyway.
>
> J.
>
>
> On Wed, Oct 14, 2020 at 9:00 AM Jarek Potiuk <[email protected]> wrote:
>
>> A small follow up: The 2.0.0a1 release is the "core" release only.
>> It has no "providers" installed. Airflow 2.0 will be distributed as a
>> number of separate packages: "core" will be released separately and each
>> of the providers has its own package to install.
>> Once we release it in PyPI, the right provider packages will be installed
>> automatically when you install the right extra (so pip install
>> apache-airflow[google] will also pull in the latest
>> apache-airflow-providers-google package), but for now you need to install
>> those packages manually.
>> The 0.0.1a1 versions of all provider packages are available at
>> https://dist.apache.org/repos/dist/dev/airflow/providers/0.0.1a1/
>>
>> And big congrats to the whole team for pulling this together! That is a
>> huge milestone!
>>
>> J.
>>
>>
>> On Tue, Oct 13, 2020 at 9:47 PM Ash Berlin-Taylor <[email protected]>
>> wrote:
>>
>>> I'm proud to announce the availability of Apache Airflow 2.0.0.alpha1
>>> for testing!
>>>
>>> First the caveat: this is an alpha release. Do not run it in production;
>>> it may have serious problems, and in the extreme case you may
>>> have to reset your database between this and the beta or release
>>> candidates. (This is extremely unlikely, but don't say we didn't warn you.)
>>>
>>> This "snapshot" is intended for members of the Airflow developer
>>> community to test the build and get an early start on testing 2.0.0. For
>>> clarity, this is not an official release of Apache Airflow either - that
>>> doesn't happen until we make a release candidate and then vote on it, and
>>> based on the expected timelines on the Airflow 2.0 planning page
>>> <https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+2.0+-+Planning>,
>>> we expect that to happen the week of 30th Nov, 2020.
>>>
>>> This is quite a big change, so for this alpha release you shouldn't
>>> necessarily expect your DAGs to work unchanged -- please read
>>> https://github.com/apache/airflow/blob/2.0.0a1/UPDATING.md#airflow-200a1 for
>>> updating notes.
>>> Before we release 2.0.0 fully we will have a 1.10.13 release that
>>> provides an automated tool to identify many of the changes that you will
>>> need to make before upgrading to 2.0.
>>>
>>> The alpha snapshot is available at:
>>>
>>> https://dist.apache.org/repos/dist/dev/airflow/2.0.0a1/
>>>
>>> *apache-airflow-2.0.0a1-source.tar.gz* is a source release that comes
>>> with INSTALL instructions.
>>>
>>> *apache-airflow-2.0.0a1-bin.tar.gz* is the binary Python "sdist"
>>> snapshot.
>>>
>>> *apache_airflow-2.0.0a1-py3-none-any.whl* is the binary Python wheel
>>> snapshot.
>>>
>>> This snapshot has *not* been pushed to PyPI.
>>>
>>> Public keys are available at: https://www.apache.org/dist/airflow/KEYS
>>>
>>> The full changelog is about 2,000 lines long (already excluding anything
>>> backported to 1.10), so for now there is no full change log *yet*, but
>>> the major features in 2.0.0alpha1 compared to 1.10.12 are:
>>>
>>> - Decorated Flows (AIP-31)
>>>
>>>   (Used to be called Functional DAGs.)
>>>
>>>   DAGs are now much, much nicer to author, especially when using
>>>   PythonOperator; deps are handled more clearly and XCom is nicer to use.
>>>
>>>   Read more here:
>>>
>>>   Decorated Flow Documentation
>>>   <https://airflow.readthedocs.io/en/latest/concepts.html#decorated-flows>
>>>
>>> - Fully specified REST API (AIP-32)
>>>
>>>   We now have a fully supported, no-longer-experimental API with a
>>>   fully published OpenAPI specification.
>>>
>>>   Read more here:
>>>
>>>   REST API Documentation
>>>   <https://airflow.readthedocs.io/en/latest/stable-rest-api-ref.html>
>>>
>>> - Massive Scheduler performance improvements
>>>
>>>   As part of AIP-15 (Scheduler HA+performance) and other work Kamil
>>>   did, we have made significant performance improvements to the Airflow
>>>   Scheduler, and it now starts tasks much, MUCH quicker.
>>>
>>>   We will follow up with exact benchmark figures (we want to triple
>>>   check them as we don't quite believe the numbers!)
>>>
>>> - Scheduler is now HA compatible (AIP-15)
>>>
>>>   It's now possible and supported to run more than a single scheduler
>>>   instance, either for resiliency in case one goes down, or to get higher
>>>   scheduling performance.
>>>
>>>   To fully use this feature you need Postgres 9.6+ or MySQL 8+ (MySQL
>>>   5 won't work with more than one scheduler, I'm afraid).
>>>
>>>   There's no config or other setup required to run more than one
>>>   scheduler; just start up a second scheduler somewhere else (ensuring it
>>>   has access to the DAG files) and they will all cooperate through the
>>>   database.
>>>
>>>   Docs PR here: Scheduler HA documentation PR
>>>   <https://github.com/apache/airflow/pull/11467/files>
>>>
>>> - Task Groups (AIP-34, Docs)
>>>
>>>   SubDAGs are useful for grouping tasks in the UI but have many
>>>   drawbacks in their execution behaviour (such as only executing a single
>>>   task in parallel!), so we've introduced a new concept called "Task
>>>   Groups", which provide the same grouping behaviour as SubDAGs but don't
>>>   have any of the execution-time drawbacks.
>>>
>>>   Read more here: Task Grouping Documentation
>>>   <https://airflow.readthedocs.io/en/latest/concepts.html#taskgroup>
>>>
>>> - Refreshed UI
>>>
>>>   We've given the Airflow UI a visual refresh
>>>   <https://github.com/apache/airflow/pull/11195> and updated some of
>>>   the styling. Check out the screenshots in the docs
>>>   <https://airflow.readthedocs.io/en/latest/ui.html>.
>>>
>>> - Smart Sensors for reduced load from sensors (AIP-17)
>>>
>>>   If you make heavy use of sensors in your Airflow cluster, you may
>>>   find that sensor execution starts to take up a significant
>>>   proportion of your cluster, even with "reschedule" mode. So we've added
>>>   a new mode called "Smart Sensors".
>>>
>>>   This feature is in "early-access" - it's been well tested by Airbnb,
>>>   so is "stable"/usable but we reserve the right to make backwards
>>>   incompatible changes in a future release (if we have to.
>>>   We'll try very hard not to!)
>>>
>>>   Docs on: Smart Sensors
>>>   <https://airflow.readthedocs.io/en/latest/smart-sensor.html?highlight=smartsensors>
>>>
>>> - Simplified KubernetesExecutor
>>>
>>>   For Airflow 2.0, we have re-architected the KubernetesExecutor in a
>>>   fashion that is simultaneously faster, simpler to understand, and offers
>>>   far more flexibility to Airflow users. Users will now be able to access
>>>   the full Kubernetes API to create a YAML `pod_template_file` instead of
>>>   filling in parameters in their airflow.cfg.
>>>
>>>   We have also replaced the `executor_config` dictionary with the
>>>   `pod_override` parameter, which takes a Kubernetes V1Pod object for a
>>>   clear 1:1 override setting. These changes have removed over three
>>>   thousand lines of code for the KubernetesExecutor, which simultaneously
>>>   makes it run faster and creates fewer potential errors.
>>>
>>>   Read more here:
>>>
>>>   Docs on pod_template_file
>>>   <https://airflow.readthedocs.io/en/latest/executor/kubernetes.html?highlight=pod_override#pod-template-file>
>>>   Docs on pod_override
>>>   <https://airflow.readthedocs.io/en/latest/executor/kubernetes.html?highlight=pod_override#pod-override>
>>>
>>> We've tried to make as few breaking changes as possible, and to provide
>>> a deprecation path in the code, especially in the case of anything called
>>> in the DAG, but please read through the UPDATING.md to check what might
>>> affect you - for instance we have re-organized the layout of operators
>>> (they now all live under airflow.providers.*) but the old names should
>>> continue to work; you'll just notice a lot of DeprecationWarnings that
>>> you should fix up.
>>>
>>> Thank you so much to all the contributors who got us to this point,
>>> in no particular order: Kaxil Naik, Daniel Imberman, Jarek Potiuk, Tomek
>>> Urbaszek, Kamil Breguła, Gerard Casas Saez, Kevin Yang, James Timmins,
>>> Yingbo Wang, Qian Yu, Ryan Hamilton and the 100s of others who keep making
>>> Airflow better for everyone.
>>>
>>
>> --
>> Jarek Potiuk
>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>> M: +48 660 796 129

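To illustrate the naming mentioned in the 2.0.0a1 follow-up above (the
`google` extra pulling in `apache-airflow-providers-google`), here is a
minimal Python sketch. Only the `google` pair comes from the thread; the
helper name `provider_package_for_extra`, the prefix constant, and the
assumption that other extras follow the same pattern (with dots replaced by
dashes) are illustrative guesses, and some extras may have no provider
package at all.

```python
# Hypothetical helper for illustration only; it is not part of Airflow's API.
PROVIDER_PREFIX = "apache-airflow-providers-"


def provider_package_for_extra(extra: str) -> str:
    """Guess the provider package name for an apache-airflow extra.

    Assumption: the package name is the prefix plus the extra name,
    lower-cased, with "." and "_" replaced by "-".
    """
    normalized = extra.strip().lower().replace(".", "-").replace("_", "-")
    if not normalized:
        raise ValueError("extra name must not be empty")
    return PROVIDER_PREFIX + normalized


if __name__ == "__main__":
    # The one pair named in the thread: apache-airflow[google].
    print(provider_package_for_extra("google"))
```

Until the packages are released on PyPI, the resulting provider packages
still have to be installed manually, as described in the follow-up.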