Cool!

On Tue, Oct 20, 2020 at 3:55 PM Ry Walker <[email protected]> wrote:
> FYI we've also made it easy to run Airflow 2 alpha locally using the astro
> CLI.
>
> 1. Install the CLI:
>
> curl -sSL https://install.astronomer.io | sudo bash -s -- v0.21.0
>
> 2. Change Dockerfile to:
>
> FROM astronomerio/ap-airflow:2.0.0-1.dev3-buster-onbuild
>
> 3. You can install packages in requirements.txt using this format:
>
> apache-airflow-providers-google
> apache-airflow-providers-snowflake
> apache-airflow-providers-http
> apache-airflow-providers-postgres
>
> If you have some dependency conflicts you can designate what version to
> use with this format:
>
> requests<2.24.0
> idna<2.10
>
> I had to do this for the snowflake-connector-python package to work, for
> example...
>
> 4. Then run astro dev start to fire everything up locally w/ Docker
> Compose
>
> *Note: you need to have Docker installed on your computer as well for this
> to work.*
>
> -Ry
>
> On Tue, Oct 20, 2020 at 8:05 AM Jarek Potiuk <[email protected]>
> wrote:
>
>> Sure. You should install the .whl packages directly via `pip install
>> <....>.whl`. For now we do not yet have a PIP-released version, so you have
>> to manually choose the right extras when you install Airflow and then
>> install the provider:
>>
>> `pip install 'apache_airflow-2.0.0a1-py3-none-any.whl[google]'`
>>
>> And then the corresponding google provider:
>>
>> `pip install
>> 0.0.1a1/apache_airflow_providers_google-0.0.1a1-py3-none-any.whl --no-deps`
>>
>> The --no-deps switch is important because we have no PyPI release and pip
>> has a problem with installing alpha versions from files (>2.0.0 works for
>> 2.0.0a1 from PyPI but not from wheels).
>>
>> This version problem will be fixed in the upcoming a2 version. Issue -
>> merged: https://github.com/apache/airflow/issues/11577
>> As soon as we get beta versions and release to PyPI it will be enough to
>> run 'pip install apache-airflow[google]==2.0.0b1' and the provider package
>> will be installed automatically.
>> Corresponding issue and PR
>> https://github.com/apache/airflow/issues/11464
>>
>> I hope it helps! I will make sure to add more information about
>> installation with the next release! Thanks for pointing it out!
>>
>> J.
>>
>> On Tue, Oct 20, 2020 at 12:30 PM Julian De Ruiter
>> <[email protected]> wrote:
>>
>>> Hi Jarek,
>>>
>>> Can you maybe provide some guidelines on how to install these provider
>>> packages in the current alpha? Tried some things on my own, but seem to be
>>> running into issues.
>>>
>>> Best,
>>> Julian
>>>
>>> On 2020/10/14 07:00:08, Jarek Potiuk <[email protected]> wrote:
>>> > A small follow up: The 2.0.0a1 release is the "core" release only. It has
>>> > no "providers" installed. Airflow 2.0 will be distributed as a number of
>>> > separate packages: "core" will be released separately and each of the
>>> > providers has its own package to install.
>>> > Once we release it in PyPI, the right provider packages will be installed
>>> > automatically when you install the right extra (so pip install
>>> > apache-airflow[google] will also pull in the latest
>>> > apache-airflow-providers-google package), but for now you need to install
>>> > those packages manually.
>>> > The 0.0.1a versions of all provider packages are available at
>>> > https://dist.apache.org/repos/dist/dev/airflow/providers/0.0.1a1/
>>> >
>>> > And big congrats to the whole team for pulling this together! That is a
>>> > huge milestone!
>>> >
>>> > J.
>>> >
>>> > On Tue, Oct 13, 2020 at 9:47 PM Ash Berlin-Taylor <[email protected]>
>>> > wrote:
>>> >
>>> > > I'm proud to announce the availability of Apache Airflow 2.0.0.alpha1 for
>>> > > testing!
>>> > >
>>> > > First the caveat: this is an alpha release.
>>> > > Do not run it in production,
>>> > > it might not be without serious problems, and in the extreme case you may
>>> > > have to reset your database between this and the beta or release
>>> > > candidates. (This is extremely unlikely, but don't say we didn't warn you.)
>>> > >
>>> > > This "snapshot" is intended for members of the Airflow developer community
>>> > > to test the build and get an early start on testing 2.0.0. For clarity,
>>> > > this is not an official release of Apache Airflow either - that doesn't
>>> > > happen until we make a release candidate and then vote on it, and based on
>>> > > the expected timelines on the Airflow 2.0 planning page
>>> > > <https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+2.0+-+Planning>,
>>> > > we expect that to happen the week of 30th Nov, 2020.
>>> > >
>>> > > This is quite a big change, so for this alpha release you shouldn't
>>> > > necessarily expect your DAGs to work unchanged -- please read
>>> > > https://github.com/apache/airflow/blob/2.0.0a1/UPDATING.md#airflow-200a1 for
>>> > > updating notes.
>>> > > Before we release 2.0.0 fully we will have a 1.10.13
>>> > > released that provides an automated tool to identify many of the changes
>>> > > that you will need to make before upgrading to 2.0.
>>> > >
>>> > > The alpha snapshot is available at:
>>> > >
>>> > > https://dist.apache.org/repos/dist/dev/airflow/2.0.0a1/
>>> > >
>>> > > *apache-airflow-2.0.0a1-source.tar.gz* is a source release that comes with
>>> > > INSTALL instructions.
>>> > >
>>> > > *apache-airflow-2.0.0a1-bin.tar.gz* is the binary Python "sdist" snapshot.
>>> > >
>>> > > *apache_airflow-2.0.0a1-py3-none-any.whl* is the binary Python wheel
>>> > > snapshot.
>>> > >
>>> > > This snapshot has *not* been pushed to PyPI.
>>> > >
>>> > > Public keys are available at: https://www.apache.org/dist/airflow/KEYS
>>> > >
>>> > > The full changelog is about 2,000 lines long (already excluding anything
>>> > > backported to 1.10), so for now there is no full change log *yet*, but
>>> > > the major features in 2.0.0alpha1 compared to 1.10.12 are:
>>> > >
>>> > > - Decorated Flows (AIP-31)
>>> > >
>>> > >   (Used to be called Functional DAGs.)
>>> > >
>>> > >   DAGs are now much much nicer to author especially when using
>>> > >   PythonOperator, deps are handled more clearly and XCom is nicer to use
>>> > >
>>> > >   Read more here:
>>> > >
>>> > >   Decorated Flow Documentation
>>> > >   <https://airflow.readthedocs.io/en/latest/concepts.html#decorated-flows>
>>> > >
>>> > > - Fully specified REST API (AIP-32)
>>> > >
>>> > >   We now have a fully supported, and no-longer-experimental API with a
>>> > >   fully published OpenAPI specification.
>>> > >
>>> > >   Read more here:
>>> > >
>>> > >   REST API Documentation
>>> > >   <https://airflow.readthedocs.io/en/latest/stable-rest-api-ref.html>
>>> > >
>>> > > - Massive Scheduler performance improvements
>>> > >
>>> > >   As part of AIP-15 (Scheduler HA+performance) and other
>>> > >   work Kamil did
>>> > >   we have made significant performance improvements to the Airflow Scheduler
>>> > >   and it now starts tasks much, MUCH quicker.
>>> > >
>>> > >   We will follow up with exact benchmark figures (we want to triple
>>> > >   check them as we don't quite believe the numbers!)
>>> > >
>>> > > - Scheduler is now HA compatible (AIP-15)
>>> > >
>>> > >   It's now possible and supported to run more than a single scheduler
>>> > >   instance, either for resiliency in case one goes down, or to get higher
>>> > >   scheduling performance.
>>> > >
>>> > >   To fully use this feature you need Postgres 9.6+ or MySQL 8+ (MySQL 5
>>> > >   won't work with more than one scheduler I'm afraid).
>>> > >
>>> > >   There's no config or other set up required to run more than one
>>> > >   scheduler; just start up a second scheduler somewhere else (ensuring it has
>>> > >   access to the DAG files) and they will all cooperate through the database.
>>> > >
>>> > >   Docs PR here: Scheduler HA documentation PR
>>> > >   <https://github.com/apache/airflow/pull/11467/files>
>>> > >
>>> > > - Task Groups (AIP-34, Docs)
>>> > >
>>> > >   SubDAGs are useful for grouping tasks in the UI but have many
>>> > >   drawbacks in their execution behaviour (such as only executing a single
>>> > >   task in parallel!) so we've introduced a new concept called "Task Groups"
>>> > >   which provide the same grouping behaviour as subdags, but don't have any of
>>> > >   the execution-time drawbacks.
>>> > >
>>> > >   Read more here: Task Grouping Documentation
>>> > >   <https://airflow.readthedocs.io/en/latest/concepts.html#taskgroup>
>>> > >
>>> > > - Refreshed UI
>>> > >
>>> > >   We've given the Airflow UI a visual refresh
>>> > >   <https://github.com/apache/airflow/pull/11195> and updated some of the
>>> > >   styling.
>>> > >   Check out the screenshots in the docs
>>> > >   <https://airflow.readthedocs.io/en/latest/ui.html>.
>>> > >
>>> > > - Smart Sensors for reduced load from sensors (AIP-17)
>>> > >
>>> > >   If you make heavy use of sensors in your Airflow cluster you can start
>>> > >   to find that sensor execution starts to take up a significant proportion of
>>> > >   your cluster, even with "reschedule" mode. So we've added a new mode called
>>> > >   "Smart Sensors".
>>> > >
>>> > >   This feature is in "early-access" - it's been well tested by Airbnb,
>>> > >   so is "stable"/usable but we reserve the right to make backwards
>>> > >   incompatible changes in a future release (if we have to. We'll try very
>>> > >   hard not to!)
>>> > >
>>> > >   Docs on: Smart Sensors
>>> > >   <https://airflow.readthedocs.io/en/latest/smart-sensor.html?highlight=smartsensors>
>>> > >
>>> > > - Simplified KubernetesExecutor
>>> > >
>>> > >   For Airflow 2.0, we have re-architected the KubernetesExecutor in a
>>> > >   fashion that is simultaneously faster, simpler to understand, and offers
>>> > >   far more flexibility to Airflow users. Users will now be able to access the
>>> > >   full Kubernetes API to create a yaml `pod_template_file` instead of filling
>>> > >   in parameters in their airflow.cfg.
>>> > >
>>> > >   We have also replaced the `executor_config` dictionary with the
>>> > >   `pod_override` parameter, which takes a Kubernetes V1Pod object for a clear
>>> > >   1:1 override setting.
>>> > >   These changes have removed over three thousand lines
>>> > >   of code for the KubernetesExecutor, which simultaneously makes it run
>>> > >   faster and creates fewer potential errors.
>>> > >
>>> > >   Read more here:
>>> > >
>>> > >   Docs on pod_template_file
>>> > >   <https://airflow.readthedocs.io/en/latest/executor/kubernetes.html?highlight=pod_override#pod-template-file>
>>> > >   Docs on pod_override
>>> > >   <https://airflow.readthedocs.io/en/latest/executor/kubernetes.html?highlight=pod_override#pod-override>
>>> > >
>>> > > We've tried where possible to make as few breaking changes as possible,
>>> > > and to provide a deprecation path in the code, especially in the case of
>>> > > anything called in the DAG, but please read through the UPDATING.md to
>>> > > check what might affect you - for instance we have re-organized the layout
>>> > > of operators (they now all live under airflow.providers.*) but the old
>>> > > names should continue to work, you'll just notice a lot of
>>> > > DeprecationWarnings that you should fix up.
>>> > >
>>> > > Thank you so much to all the contributors who got us to this point, in
>>> > > no particular order: Kaxil Naik, Daniel Imberman, Jarek Potiuk, Tomek
>>> > > Urbaszek, Kamil Breguła, Gerard Casas Saez, Kevin Yang, James Timmins,
>>> > > Yingbo Wang, Qian Yu, Ryan Hamilton and the 100s of others who keep making
>>> > > Airflow better for everyone.
>>> >
>>> > --
>>> >
>>> > Jarek Potiuk
>>> > Polidea <https://www.polidea.com/> | Principal Software Engineer
>>> >
>>> > M: +48 660 796 129
>>>
>>> Best regards / met vriendelijke groet,
>>>
>>> Julian de Ruiter
>>> Machine learning engineer
>>>
>>> GoDataDriven
>>> Proudly part of the Xebia group
>>>
>>> M: +31 6 30 61 26 24
>>> W: http://www.godatadriven.com
>>
>> --
>>
>> Jarek Potiuk
>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>
>> M: +48 660 796 129

--
Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer
M: +48 660 796 129
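
For anyone scripting the manual wheel install quoted above, here is a minimal Python sketch that assembles the two pip invocations from the thread (core wheel with an extra, then the provider wheel with --no-deps). The helper names `provider_wheel` and `install_commands` are hypothetical, and the filename pattern for providers other than google is an assumption extrapolated from the one example given; check the actual filenames in the dist directory before relying on it.

```python
import shlex
from typing import List

# Filenames and versions copied from the thread.
CORE_WHEEL = "apache_airflow-2.0.0a1-py3-none-any.whl"
PROVIDER_VERSION = "0.0.1a1"


def provider_wheel(provider: str, version: str = PROVIDER_VERSION) -> str:
    """Build a provider wheel path following the 'google' example.

    Assumption: other providers use the same naming pattern and live in a
    local directory named after the version, as in the quoted command.
    """
    name = f"apache_airflow_providers_{provider}-{version}-py3-none-any.whl"
    return f"{version}/{name}"


def install_commands(provider: str) -> List[List[str]]:
    """Return the two pip invocations as argv lists (nothing is executed)."""
    # Step 1: core wheel, with the matching extra selected by hand.
    core = ["pip", "install", f"{CORE_WHEEL}[{provider}]"]
    # Step 2: provider wheel; --no-deps avoids the alpha-version resolution
    # problem described in the thread.
    extra = ["pip", "install", provider_wheel(provider), "--no-deps"]
    return [core, extra]


if __name__ == "__main__":
    for argv in install_commands("google"):
        # shlex.join quotes arguments such as the "[google]" extra for the shell.
        print(shlex.join(argv))
```

The sketch only prints the commands; passing each argv list to `subprocess.run(argv, check=True)` would execute them, assuming the wheels have already been downloaded to those relative paths.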
