The usual pre-meeting summary from my side.

In preparation for the Airflow 2.0 dev call tomorrow, I have prepared some bug
fixes and improvements to the providers approach. We are making steady
progress on the mini-project https://github.com/apache/airflow/projects/5 .
Some of them are already merged (thanks to those who reviewed), but I have
2 PRs in progress to get to the point where alpha2 is ready to release:

https://github.com/apache/airflow/pull/11630 - The .tar.gz provider
packages are installable now.
https://github.com/apache/airflow/pull/11586 - Fixes versioning for
pre-release provider packages
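
To illustrate why pre-release versioning matters for the provider packages, here
is a minimal sketch of PEP 440 ordering for simple version strings such as
0.0.1a1. It is not the code from the PR above: version_key and is_prerelease
are hypothetical helpers written only for this example, and they handle just
the plain "N.N.N" and "N.N.N{a|b|rc}N" shapes.

```python
import re
from typing import Tuple

# Hypothetical helpers for illustration only; not part of Airflow or pip.
# Only the simple "N.N.N" and "N.N.N{a|b|rc}N" shapes are supported.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:(a|b|rc)(\d+))?$")

# Pre-release stages sort before the final release with the same number.
_STAGE_ORDER = {"a": 0, "b": 1, "rc": 2, None: 3}


def version_key(version: str) -> Tuple[int, int, int, int, int]:
    """Return a sort key that follows PEP 440 ordering (simplified)."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    major, minor, patch, stage, stage_number = match.groups()
    return (
        int(major),
        int(minor),
        int(patch),
        _STAGE_ORDER[stage],
        int(stage_number or 0),
    )


def is_prerelease(version: str) -> bool:
    """True for alpha, beta and release-candidate versions."""
    return version_key(version)[3] < _STAGE_ORDER[None]


versions = ["0.0.1", "0.0.1a2", "0.0.1rc1", "0.0.1a1", "0.0.1b1"]
print(sorted(versions, key=version_key))
```

Because an alpha such as 0.0.1a1 sorts before 0.0.1, and pip skips
pre-releases by default unless --pre is passed or the version is pinned
explicitly, the version string of a pre-release package has to be right for it
to install as intended.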

One small thing I stumbled upon while preparing the provider
packages was the Licences/Notice review:
https://github.com/apache/airflow/issues/11632 . As Ash mentioned in the
comment, the current "dupli/tripli-cation of reporting the licences" is
likely OK (as we did that at incubator graduation). Ry Walker
already had some comments / results of licence checks recently, so maybe we
can talk about it and agree on a common approach (or agree that there is
nothing to discuss) tomorrow.

For the provider packages, I reviewed and removed all the license deps as
not needed, but it would be great to talk about it tomorrow anyway.

J.


On Wed, Oct 14, 2020 at 9:00 AM Jarek Potiuk wrote:

> A small follow up: The 2.0.0a1 release is the "core" release only. It has
> no "providers" installed. Airflow 2.0 will be distributed as a number of
> separate packages: "core" will be released separately and each of the
> providers has its own package to install.
> Once we release it on PyPI, the right provider packages will be installed
> automatically when you install the right extra (so pip install
> apache-airflow[google] will also pull in the latest
> apache-airflow-providers-google package), but for now you need to install
> those packages manually.
> The 0.0.1a1 versions of all provider packages are available at
> https://dist.apache.org/repos/dist/dev/airflow/providers/0.0.1a1/
>
> And big congrats to the whole team for pulling this together! That is a
> huge milestone!
>
> J.
>
>
> On Tue, Oct 13, 2020 at 9:47 PM Ash Berlin-Taylor wrote:
>
>> I'm proud to announce the availability of Apache Airflow 2.0.0.alpha1 for
>> testing!
>>
>> First the caveat: this is an alpha release. Do not run it in production,
>> it might not be without serious problems, and in the extreme case you may
>> have to reset your database between this and the beta or release
>> candidates. (This is extremely unlikely, but don't say we didn't warn you.)
>>
>> This "snapshot" is intended for members of the Airflow developer
>> community to test the build and get an early start on testing 2.0.0. For
>> clarity, this is not an official release of Apache Airflow either - that
>> doesn't happen until we make a release candidate and then vote on it, and
>> based on the expected timelines on the Airflow 2.0 planning page
>> <https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+2.0+-+Planning>,
>> we expect that to happen the week of 30th Nov, 2020.
>>
>> This is quite a big change, so for this alpha release you shouldn't
>> necessarily expect your DAGs to work unchanged -- please read
>> https://github.com/apache/airflow/blob/2.0.0a1/UPDATING.md#airflow-200a1 for
>> updating notes. Before we release 2.0.0 fully we will have a 1.10.13
>> released that provides an automated tool to identify many of the changes
>> that you will need to make before upgrading to 2.0
>>
>> The alpha snapshot is available at:
>>
>> https://dist.apache.org/repos/dist/dev/airflow/2.0.0a1/
>>
>> *apache-airflow-2.0.0a1-source.tar.gz* is a source release that comes
>> with INSTALL instructions.
>>
>> *apache-airflow-2.0.0a1-bin.tar.gz* is the binary Python "sdist" snapshot.
>>
>> *apache_airflow-2.0.0a1-py3-none-any.whl* is the binary Python wheel
>> snapshot.
>>
>> This snapshot has *not* been pushed to PyPI.
>>
>> Public keys are available at: https://www.apache.org/dist/airflow/KEYS
>>
>> The full changelog is about 2,000 lines long (already excluding anything
>> backported to 1.10), so for now there is no full change log *yet*, but
>> the major features in 2.0.0alpha1 compared to 1.10.12 are:
>>
>>
>>    - Decorated Flows (AIP-31)
>>
>>    (Used to be called Functional DAGs.)
>>
>>    DAGs are now much much nicer to author especially when using
>>    PythonOperator, deps are handled more clearly and XCom is nicer to use
>>
>>    Read more here:
>>
>>    Decorated Flow Documentation
>>    <https://airflow.readthedocs.io/en/latest/concepts.html#decorated-flows>
>>
>>    - Fully specified REST API (AIP-32)
>>
>>    We now have a fully supported, and no-longer-experimental API with a
>>    fully published OpenAPI specification.
>>
>>    Read more here:
>>
>>    REST API Documentation
>>    <https://airflow.readthedocs.io/en/latest/stable-rest-api-ref.html>
>>
>>    - Massive Scheduler performance improvements
>>
>>    As part of AIP-15 (Scheduler HA+performance) and other work Kamil did
>>    we have made significant performance improvements to the Airflow Scheduler
>>    and it now starts tasks much, MUCH quicker.
>>
>>    We will follow up with exact benchmark figures (we want to triple
>>    check them as we don't quite believe the numbers!)
>>
>>    - Scheduler is now HA compatible (AIP-15)
>>
>>    It's now possible and supported to run more than a single scheduler
>>    instance, either for resiliency in case one goes down, or to get higher
>>    scheduling performance.
>>
>>    To fully use this feature you need Postgres 9.6+ or MySQL 8+ (MySQL 5
>>    won't work with more than one scheduler I'm afraid).
>>
>>    There's no config or other set up required to run more than one
>>    scheduler—just start up a second scheduler somewhere else (ensuring it has
>>    access to the DAG files) and they will all cooperate through the database.
>>
>>    Docs PR here: Scheduler HA documentation PR
>>    <https://github.com/apache/airflow/pull/11467/files>
>>
>>    - Task Groups (AIP-34, Docs)
>>
>>    SubDAGs are useful for grouping tasks in the UI but have many
>>    drawbacks in their execution behaviour (such as only executing a single
>>    task in parallel!) so we've introduced a new concept called "Task Groups"
>>    which provide the same grouping behaviour as subdags, but don't have any
>>    of the execution-time drawbacks.
>>
>>    Read more here: Task Grouping Documentation
>>    <https://airflow.readthedocs.io/en/latest/concepts.html#taskgroup>
>>
>>    - Refreshed UI
>>
>>    We've given the Airflow UI a visual refresh
>>    <https://github.com/apache/airflow/pull/11195> and updated some of
>>    the styling. Check out the screenshots in the docs
>>    <https://airflow.readthedocs.io/en/latest/ui.html>.
>>
>>    - Smart Sensors for reduced load from sensors (AIP-17)
>>
>>    If you make heavy use of sensors in your Airflow cluster you can
>>    start to find that sensor execution starts to take up a significant
>>    proportion of your cluster, even with "reschedule" mode. So we've added a
>>    new mode called "Smart Sensors".
>>
>>    This feature is in "early-access" - it's been well tested by Airbnb,
>>    so is "stable"/usable but we reserve the right to make backwards
>>    incompatible changes in a future release (if we have to. We'll try very
>>    hard not to!)
>>
>>    Docs on: Smart Sensors
>>    <https://airflow.readthedocs.io/en/latest/smart-sensor.html?highlight=smartsensors>
>>
>>    - Simplified KubernetesExecutor
>>
>>    For Airflow 2.0, we have re-architected the KubernetesExecutor in a
>>    fashion that is simultaneously faster, simpler to understand, and offers
>>    far more flexibility to Airflow users. Users will now be able to access
>>    the full Kubernetes API to create a yaml `pod_template_file` instead of
>>    filling in parameters in their airflow.cfg.
>>
>>    We have also replaced the `executor_config` dictionary with the
>>    `pod_override` parameter, which takes a Kubernetes V1Pod object for a
>>    clear 1:1 override setting. These changes have removed over three
>>    thousand lines of code for the KubernetesExecutor, which simultaneously
>>    makes it run faster and creates fewer potential errors.
>>
>>    Read more here:
>>
>>    Docs on pod_template_file
>>    <https://airflow.readthedocs.io/en/latest/executor/kubernetes.html?highlight=pod_override#pod-template-file>
>>    Docs on pod_override
>>    <https://airflow.readthedocs.io/en/latest/executor/kubernetes.html?highlight=pod_override#pod-override>
>>
>> We've tried where possible to make as few breaking changes as possible,
>> and to provide a deprecation path in the code, especially in the case of
>> anything called in the DAG, but please read through the UPDATING.md to
>> check what might affect you - for instance we have re-organized the layout
>> of operators (they now all live under airflow.providers.*) but the old
>> names should continue to work, you'll just notice a lot of
>> DeprecationWarnings that you should fix up.
>>
>> Thank you so much to all the contributors who got us to this point,
>> in no particular order: Kaxil Naik, Daniel Imberman, Jarek Potiuk, Tomek
>> Urbaszek, Kamil Breguła, Gerard Casas Saez, Kevin Yang, James Timmins,
>> Yingbo Wang, Qian Yu, Ryan Hamilton and the 100s of others who keep making
>> Airflow better for everyone.
>>
>
>
> --
>
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
>
> M: +48 660 796 129 <+48660796129>
> [image: Polidea] <https://www.polidea.com/>
>
>

