The usual pre-meeting summary from my side. In preparation for the Airflow 2.0 Dev call tomorrow, I have prepared some bug fixes and improvements to the providers approach. We are making steady progress in the mini-project https://github.com/apache/airflow/projects/5 . Some of the fixes are already merged (thanks to those who reviewed), but I have 2 PRs in progress that we need before we are ready to release alpha2:
https://github.com/apache/airflow/pull/11630 - The .tar.gz provider packages are installable now.
https://github.com/apache/airflow/pull/11586 - Fixes versioning for pre-release provider packages.

One small thing I stumbled upon while preparing the provider packages was the Licences/Notice review - https://github.com/apache/airflow/issues/11632 . As Ash mentioned in the comment, the current "dupli/tripli-cation of reporting the licences" is likely OK (as we did that at incubator graduation). Ry Walker recently had some comments on / results from licence checks, so maybe we can talk about it tomorrow and agree on a common approach (or agree that there is nothing to discuss).

For the provider packages, I reviewed and removed all the license deps as not needed, but it would be great to talk about it tomorrow anyway.

J.

On Wed, Oct 14, 2020 at 9:00 AM Jarek Potiuk wrote:

> A small follow up: The 2.0.0a1 release is the "core" release only. It has
> no "providers" installed. Airflow 2.0 will be distributed as a number of
> separate packages: "core" will be released separately and each of the
> providers has its own package to install.
>
> Once we release it in PyPI, the right provider packages will be installed
> automatically when you install the right extra (so pip install
> apache-airflow[google] will also pull in the latest
> apache-airflow-providers-google package), but for now you need to install
> those packages manually.
>
> The 0.0.1a versions of all provider packages are available at
> https://dist.apache.org/repos/dist/dev/airflow/providers/0.0.1a1/
>
> And big congrats to the whole team for pulling this together! That is a
> huge milestone!
>
> J.
>
> On Tue, Oct 13, 2020 at 9:47 PM Ash Berlin-Taylor wrote:
>
>> I'm proud to announce the availability of Apache Airflow 2.0.0.alpha1 for
>> testing!
>>
>> First the caveat: this is an alpha release.
>> Do not run it in production: it may have serious problems, and in the
>> extreme case you may have to reset your database between this and the
>> beta or release candidates. (This is extremely unlikely, but don't say we
>> didn't warn you.)
>>
>> This "snapshot" is intended for members of the Airflow developer
>> community to test the build and get an early start on testing 2.0.0. For
>> clarity, this is not an official release of Apache Airflow either - that
>> doesn't happen until we make a release candidate and then vote on it, and
>> based on the expected timelines on the Airflow 2.0 planning page
>> <https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+2.0+-+Planning>,
>> we expect that to happen the week of 30th Nov, 2020.
>>
>> This is quite a big change, so for this alpha release you shouldn't
>> necessarily expect your DAGs to work unchanged -- please read
>> https://github.com/apache/airflow/blob/2.0.0a1/UPDATING.md#airflow-200a1
>> for updating notes. Before we release 2.0.0 fully we will release a
>> 1.10.13 that provides an automated tool to identify many of the changes
>> that you will need to make before upgrading to 2.0.
>>
>> The alpha snapshot is available at:
>>
>> https://dist.apache.org/repos/dist/dev/airflow/2.0.0a1/
>>
>> *apache-airflow-2.0.0a1-source.tar.gz* is a source release that comes
>> with INSTALL instructions.
>>
>> *apache-airflow-2.0.0a1-bin.tar.gz* is the binary Python "sdist" snapshot.
>>
>> *apache_airflow-2.0.0a1-py3-none-any.whl* is the binary Python wheel
>> snapshot.
>>
>> This snapshot has *not* been pushed to PyPI.
>>
>> Public keys are available at: https://www.apache.org/dist/airflow/KEYS
>>
>> The full changelog is about 2,000 lines long (already excluding anything
>> backported to 1.10), so there is no full change log *yet*, but the major
>> features in 2.0.0alpha1 compared to 1.10.12 are:
>>
>> - Decorated Flows (AIP-31)
>>
>>   (Used to be called Functional DAGs.)
>>
>>   DAGs are now much nicer to author, especially when using
>>   PythonOperator: deps are handled more clearly and XCom is nicer to use.
>>
>>   Read more here:
>>
>>   Decorated Flow Documentation
>>   <https://airflow.readthedocs.io/en/latest/concepts.html#decorated-flows>
>>
>> - Fully specified REST API (AIP-32)
>>
>>   We now have a fully supported, no-longer-experimental API with a
>>   fully published OpenAPI specification.
>>
>>   Read more here:
>>
>>   REST API Documentation
>>   <https://airflow.readthedocs.io/en/latest/stable-rest-api-ref.html>
>>
>> - Massive Scheduler performance improvements
>>
>>   As part of AIP-15 (Scheduler HA+performance) and other work Kamil did,
>>   we have made significant performance improvements to the Airflow
>>   Scheduler, and it now starts tasks much, MUCH quicker.
>>
>>   We will follow up with exact benchmark figures (we want to triple
>>   check them as we don't quite believe the numbers!)
>>
>> - Scheduler is now HA compatible (AIP-15)
>>
>>   It's now possible and supported to run more than a single scheduler
>>   instance, either for resiliency in case one goes down, or to get higher
>>   scheduling performance.
>>
>>   To fully use this feature you need Postgres 9.6+ or MySQL 8+ (MySQL 5
>>   won't work with more than one scheduler I'm afraid).
>>
>>   There's no config or other set up required to run more than one
>>   scheduler: just start up a second scheduler somewhere else (ensuring it
>>   has access to the DAG files) and they will all cooperate through the
>>   database.
>>
>>   Docs PR here: Scheduler HA documentation PR
>>   <https://github.com/apache/airflow/pull/11467/files>
>>
>> - Task Groups (AIP-34, Docs)
>>
>>   SubDAGs are useful for grouping tasks in the UI but have many
>>   drawbacks in their execution behaviour (such as only executing a single
>>   task in parallel!) so we've introduced a new concept called "Task
>>   Groups" which provide the same grouping behaviour as subdags, but don't
>>   have any of the execution-time drawbacks.
>>
>>   Read more here: Task Grouping Documentation
>>   <https://airflow.readthedocs.io/en/latest/concepts.html#taskgroup>
>>
>> - Refreshed UI
>>
>>   We've given the Airflow UI a visual refresh
>>   <https://github.com/apache/airflow/pull/11195> and updated some of
>>   the styling. Check out the screenshots in the docs
>>   <https://airflow.readthedocs.io/en/latest/ui.html>.
>>
>> - Smart Sensors for reduced load from sensors (AIP-17)
>>
>>   If you make heavy use of sensors in your Airflow cluster, you may
>>   find that sensor execution takes up a significant proportion of your
>>   cluster, even with "reschedule" mode. So we've added a new mode called
>>   "Smart Sensors".
>>
>>   This feature is in "early-access" - it's been well tested by AirBnB,
>>   so it is "stable"/usable, but we reserve the right to make backwards
>>   incompatible changes in a future release (if we have to. We'll try very
>>   hard not to!)
>>
>>   Docs on: Smart Sensors
>>   <https://airflow.readthedocs.io/en/latest/smart-sensor.html?highlight=smartsensors>
>>
>> - Simplified KubernetesExecutor
>>
>>   For Airflow 2.0, we have re-architected the KubernetesExecutor in a
>>   fashion that is simultaneously faster, simpler to understand, and offers
>>   far more flexibility to Airflow users. Users will now be able to access
>>   the full Kubernetes API to create a yaml `pod_template_file` instead of
>>   filling in parameters in their airflow.cfg.
>>
>>   We have also replaced the `executor_config` dictionary with the
>>   `pod_override` parameter, which takes a Kubernetes V1Pod object for a
>>   clear 1:1 override setting. These changes have removed over three
>>   thousand lines of code for the KubernetesExecutor, which simultaneously
>>   makes it run faster and creates fewer potential errors.
>>
>>   Read more here:
>>
>>   Docs on pod_template_file
>>   <https://airflow.readthedocs.io/en/latest/executor/kubernetes.html?highlight=pod_override#pod-template-file>
>>
>>   Docs on pod_override
>>   <https://airflow.readthedocs.io/en/latest/executor/kubernetes.html?highlight=pod_override#pod-override>
>>
>> We've tried to make as few breaking changes as possible, and to provide
>> a deprecation path in the code, especially for anything called in the
>> DAG, but please read through UPDATING.md to check what might affect you -
>> for instance we have re-organized the layout of operators (they now all
>> live under airflow.providers.*) but the old names should continue to
>> work; you'll just notice a lot of DeprecationWarnings that you should
>> fix up.
>>
>> Thank you so much to all the contributors who got us to this point,
>> in no particular order: Kaxil Naik, Daniel Imberman, Jarek Potiuk, Tomek
>> Urbaszek, Kamil Breguła, Gerard Casas Saez, Kevin Yang, James Timmins,
>> Yingbo Wang, Qian Yu, Ryan Hamilton and the 100s of others who keep making
>> Airflow better for everyone.
>
> --
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
>
> M: +48 660 796 129
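
For reference, the extras-to-provider naming mentioned in the thread (pip install apache-airflow[google] pulling in apache-airflow-providers-google) can be sketched roughly as below. This is only an illustration, not Airflow code: the helper names are hypothetical, the "amazon" and "apache.hive" extras are used purely as examples, and the rule that dots in an extra become hyphens in the package name is an assumption. Not every extra necessarily corresponds to a provider package.

```python
import re
from typing import List

# Naming pattern taken from the thread: extra "google" maps to
# "apache-airflow-providers-google". Everything else here is illustrative.
PROVIDER_PREFIX = "apache-airflow-providers-"


def provider_package_for_extra(extra: str) -> str:
    """Return the provider package name assumed for a given extra (hypothetical helper)."""
    # Assumption: dots and underscores in the extra become hyphens in the package name.
    normalized = re.sub(r"[._]+", "-", extra.strip().lower())
    if not normalized:
        raise ValueError("extra name must not be empty")
    return PROVIDER_PREFIX + normalized


def providers_for_requirement(requirement: str) -> List[str]:
    """List the provider packages implied by e.g. 'apache-airflow[google,amazon]'."""
    match = re.fullmatch(r"\s*apache-airflow\s*(?:\[([^\]]*)\])?\s*", requirement)
    if match is None:
        raise ValueError("not an apache-airflow requirement: %r" % requirement)
    # No brackets means no extras, hence no provider packages.
    extras = [e.strip() for e in (match.group(1) or "").split(",") if e.strip()]
    return [provider_package_for_extra(e) for e in extras]


if __name__ == "__main__":
    # Prints one assumed provider package name per extra.
    for name in providers_for_requirement("apache-airflow[google,amazon]"):
        print(name)
```

Until the automatic installation via extras is in place, a list like this is what would have to be installed by hand next to the core package.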
