yangrong688 commented on a change in pull request #11842: URL: https://github.com/apache/airflow/pull/11842#discussion_r511939025
########## File path: UPDATING_TO_2.0.md ##########
@@ -0,0 +1,889 @@

<!--
  Licensed to the Apache Software Foundation (ASF) under one
  or more contributor license agreements. See the NOTICE file
  distributed with this work for additional information
  regarding copyright ownership. The ASF licenses this file
  to you under the Apache License, Version 2.0 (the
  "License"); you may not use this file except in compliance
  with the License. You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing,
  software distributed under the License is distributed on an
  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
  KIND, either express or implied. See the License for the
  specific language governing permissions and limitations
  under the License.
-->

# Updating Airflow

This file documents any backwards-incompatible changes in Airflow and
assists users migrating to a new version.

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)*

- [Step 1: Upgrade to Airflow 1.10.13 (a.k.a. our "bridge" release)](#step-1-upgrade-to-airflow-11013-aka-our-bridge-release)
- [Step 2: Upgrade to Python 3](#step-2-upgrade-to-python-3)
- [Step 3: Upgrade Airflow DAGs](#step-3-upgrade-airflow-dags)
- [Changes to the KubernetesPodOperator](#changes-to-the-kubernetespodoperator)
- [Step 4: Update system configurations](#step-4-update-system-configurations)
- [Step 5: Upgrade KubernetesExecutor settings](#step-5-upgrade-kubernetesexecutor-settings)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->


## Step 1: Upgrade to Airflow 1.10.13 (a.k.a. our "bridge" release)

To minimize upgrade friction for Airflow 1.10, we are releasing a "bridge" release that has
multiple critical features that can prepare users to upgrade without disrupting existing workflows.
These features include:

1. All breaking DAG and architecture changes of Airflow 2.0 have been backported to Airflow 1.10.13.
This backward compatibility does not mean that 1.10.13 will process these DAGs the same way as Airflow 2.0.
What it does mean is that all Airflow 2.0-compatible DAGs will work in Airflow 1.10.13, giving users time to
modify their DAGs gradually without any service disruption.
2. We have backported the `pod_template_file` capability for the KubernetesExecutor, as well as a script that
will generate a `pod_template_file` based on your `airflow.cfg` settings. To generate this file, simply run
the following command:

        airflow generate_pod_template -o <output file path>

    Once you have performed this step, simply write the path to this file in the `pod_template_file` option
    of the `kubernetes` section of your `airflow.cfg`.
3. Airflow 1.10.13 will contain our "upgrade check" scripts. These scripts will read through your `airflow.cfg`
and all of your DAGs and will give a detailed report of all changes required before upgrading. We are testing
this script diligently, and our goal is that any Airflow setup that passes these checks will be able to
upgrade to 2.0 without any issues.

        airflow upgrade_check

## Step 2: Upgrade to Python 3

Airflow 1.10 will be the last release series to support Python 2. Airflow 2.0.0 will only support Python 3.6 and up.

If you have a specific task that still requires Python 2, you can use the `PythonVirtualenvOperator` for it.

For a list of breaking changes between Python 2 and Python 3, please refer to this
[handy blog](https://blog.couchbase.com/tips-and-tricks-for-upgrading-from-python-2-to-python-3/) by the Couchbase team.
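As a rough illustration, a Python 2 task wrapped this way might look like the following sketch. This is not from the guide: it assumes an Airflow 1.10.x deployment with a `python2.7` interpreter available on the worker, and the task name, callable, and dependency pin are placeholders.

```python
# Hedged sketch: keeping one Python 2 task alive on a Python 3 deployment.
# Assumes a "python2.7" binary is installed on the worker; the requirement
# pin below is a placeholder, not a real project dependency.
from airflow.operators.python_operator import PythonVirtualenvOperator


def legacy_callable():
    # Keep all imports inside the callable: its source is serialized and
    # re-executed inside the freshly created virtualenv, not in the
    # scheduler's own interpreter.
    import sys
    print("running under %s" % sys.version)


legacy_task = PythonVirtualenvOperator(
    task_id="legacy_py2_task",        # placeholder task id
    python_callable=legacy_callable,
    python_version="2.7",             # build the virtualenv with Python 2.7
    requirements=["six==1.15.0"],     # placeholder dependency pin
    dag=dag,                          # assumes a DAG object defined elsewhere
)
```

Note that the DAG file itself is still parsed by the scheduler's Python 3 interpreter, so only the body of the callable runs under Python 2.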
## Step 3: Upgrade Airflow DAGs

### Change to undefined variable handling in templates

Prior to Airflow 2.0, Jinja templates permitted the use of undefined variables. They would render as an
empty string, with no indication to the user that an undefined variable was used. With this release, any template
rendering involving undefined variables will fail the task, as well as display an error in the UI when
rendering.

This behavior can be reverted when instantiating a DAG:

```python
import jinja2

dag = DAG('simple_dag', template_undefined=jinja2.Undefined)
```


## Changes to the KubernetesPodOperator

Much like the `KubernetesExecutor`, the `KubernetesPodOperator` will no longer take Airflow custom classes and will
instead expect either a pod template yaml file or `kubernetes.client.models` objects.

The one notable exception is that we will continue to support the `airflow.kubernetes.secret.Secret` class.

Previously, a user would import each individual class to build the pod like so:

```python
from airflow.kubernetes.pod import Port
from airflow.kubernetes.volume import Volume
from airflow.kubernetes.secret import Secret
from airflow.kubernetes.volume_mount import VolumeMount


volume_config = {
    'persistentVolumeClaim': {
        'claimName': 'test-volume'
    }
}
volume = Volume(name='test-volume', configs=volume_config)
volume_mount = VolumeMount('test-volume',
                           mount_path='/root/mount_file',
                           sub_path=None,
                           read_only=True)

port = Port('http', 80)
secret_file = Secret('volume', '/etc/sql_conn', 'airflow-secrets', 'sql_alchemy_conn')
secret_env = Secret('env', 'SQL_CONN', 'airflow-secrets', 'sql_alchemy_conn')

# affinity, tolerations, configmaps and init_container are assumed to be
# defined elsewhere in the DAG file.
k = KubernetesPodOperator(
    namespace='default',
    image="ubuntu:16.04",
    cmds=["bash", "-cx"],
    arguments=["echo", "10"],
    labels={"foo": "bar"},
    secrets=[secret_file, secret_env],
    ports=[port],
    volumes=[volume],
    volume_mounts=[volume_mount],
    name="airflow-test-pod",
    task_id="task",
    affinity=affinity,
    is_delete_operator_pod=True,
    hostnetwork=False,
    tolerations=tolerations,
    configmaps=configmaps,
    init_containers=[init_container],
    priority_class_name="medium",
)
```

Now the user can use the `kubernetes.client.models` classes as a single point of entry for creating all Kubernetes objects:

```python
from kubernetes.client import models as k8s
from airflow.kubernetes.secret import Secret


configmaps = ['test-configmap-1', 'test-configmap-2']

volume = k8s.V1Volume(
    name='test-volume',
    persistent_volume_claim=k8s.V1PersistentVolumeClaimVolumeSource(claim_name='test-volume'),
)

port = k8s.V1ContainerPort(name='http', container_port=80)
secret_file = Secret('volume', '/etc/sql_conn', 'airflow-secrets', 'sql_alchemy_conn')
secret_env = Secret('env', 'SQL_CONN', 'airflow-secrets', 'sql_alchemy_conn')
secret_all_keys = Secret('env', None, 'airflow-secrets-2')
volume_mount = k8s.V1VolumeMount(
    name='test-volume', mount_path='/root/mount_file', sub_path=None, read_only=True
)

k = KubernetesPodOperator(
    namespace='default',
    image="ubuntu:16.04",
    cmds=["bash", "-cx"],
    arguments=["echo", "10"],
    labels={"foo": "bar"},
    secrets=[secret_file, secret_env],
    ports=[port],
    volumes=[volume],
    volume_mounts=[volume_mount],
    name="airflow-test-pod",
    task_id="task",
    is_delete_operator_pod=True,
    hostnetwork=False)
```

We decided to keep the `Secret` class as users seem to really like that it simplifies the complexity of mounting
Kubernetes secrets into workers.

For a more detailed list of changes to the KubernetesPodOperator API, please read the section
[Changed Parameters for the KubernetesPodOperator](#changed-parameters-for-the-kubernetespodoperator).

## Step 4: Update system configurations

### Drop legacy UI in favor of FAB RBAC UI

> WARNING: Breaking change

Previously we were using two versions of the UI, which were hard to maintain as we needed to implement/update the same feature
in both versions. With this release we've removed the older UI in favor of the Flask App Builder RBAC UI,
avoiding the huge maintenance burden of two independent user interfaces. There is no need to set the
RBAC UI explicitly in the configuration now, as it is the only default UI.

Please note that custom auth backends will need rewriting to target the new FAB-based UI.

As part of this change, a few configuration items in the `[webserver]` section are removed and no longer applicable,
including `authenticate`, `filter_by_owner`, `owner_mode`, and `rbac`.

Before upgrading to this release, we recommend activating the new FAB RBAC UI. To do so, set
the `rbac` option in the `[webserver]` section of the `airflow.cfg` file to `true`:

```ini
[webserver]
rbac = true
```

In order to log in to the interface, you need to create an administrator account:

```
airflow create_user \
    --role Admin \
    --username admin \
    --firstname FIRST_NAME \
    --lastname LAST_NAME \
    --email [email protected]
```

If you have already installed Airflow 2.0, you can create a user with the command `airflow users create`.
You don't need to make changes to the configuration file, as the FAB RBAC UI is
the only supported UI:

```
airflow users create \
    --role Admin \
    --username admin \
    --firstname FIRST_NAME \
    --lastname LAST_NAME \
    --email [email protected]
```

### Breaking Change in OAuth

flask-oauthlib has been replaced with authlib, because flask-oauthlib has
been deprecated in favour of authlib.
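As a rough illustration of what this migration involves, the sketch below mechanically renames a provider's old-style configuration keys to their authlib-style equivalents. The helper name `migrate_remote_app` and the sample endpoint values are illustrative assumptions, not part of Airflow or Flask App Builder:

```python
# Hedged sketch: renaming flask-oauthlib-style provider keys to their
# authlib-style equivalents. The helper and sample values are illustrative
# placeholders, not part of Airflow or Flask App Builder.

OLD_TO_NEW_KEYS = {
    "consumer_key": "client_id",
    "consumer_secret": "client_secret",
    "base_url": "api_base_url",
    "request_token_params": "client_kwargs",
}


def migrate_remote_app(old_remote_app):
    """Return a copy of a provider config dict with the renamed keys applied."""
    return {OLD_TO_NEW_KEYS.get(key, key): value
            for key, value in old_remote_app.items()}


old_remote_app = {
    "consumer_key": "my-client-id",                       # becomes client_id
    "consumer_secret": "my-client-secret",                # becomes client_secret
    "base_url": "https://provider.example.com/api/",      # becomes api_base_url
    "request_token_params": {"scope": "email profile"},   # becomes client_kwargs
    "access_token_url": "https://provider.example.com/oauth/token",  # unchanged
}

new_remote_app = migrate_remote_app(old_remote_app)
```

Keys that are not in the rename table (such as `access_token_url` above) pass through unchanged.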
The old and new provider configuration keys that have changed are as follows:

| Old keys               | New keys        |
|------------------------|-----------------|
| `consumer_key`         | `client_id`     |
| `consumer_secret`      | `client_secret` |
| `base_url`             | `api_base_url`  |
| `request_token_params` | `client_kwargs` |

For more information, visit
https://flask-appbuilder.readthedocs.io/en/latest/security.html#authentication-oauth


## Step 5: Upgrade KubernetesExecutor settings

#### The KubernetesExecutor Will No Longer Read from the airflow.cfg for Base Pod Configurations

In Airflow 2.0, the KubernetesExecutor will require a base pod template written in yaml. This file can exist
anywhere on the host machine and will be linked using the `pod_template_file` configuration in the `airflow.cfg`.

The `airflow.cfg` will still accept values for `worker_container_repository`, `worker_container_tag`, and
the default namespace.

The following `airflow.cfg` values will be deprecated:

```
worker_container_image_pull_policy
airflow_configmap
airflow_local_settings_configmap
dags_in_image
dags_volume_subpath
dags_volume_mount_point
dags_volume_claim
logs_volume_subpath
logs_volume_claim
dags_volume_host
logs_volume_host
env_from_configmap_ref
env_from_secret_ref
git_repo
git_branch
git_sync_depth
git_subpath
git_sync_rev
git_user
git_password
git_sync_root
git_sync_dest
git_dags_folder_mount_point
git_ssh_key_secret_name
git_ssh_known_hosts_configmap_name
git_sync_credentials_secret
git_sync_container_repository
git_sync_container_tag
git_sync_init_container_name
git_sync_run_as_user
worker_service_account_name
image_pull_secrets
gcp_service_account_keys
affinity
tolerations
run_as_user
fs_group
[kubernetes_node_selectors]
[kubernetes_annotations]
[kubernetes_environment_variables]
[kubernetes_secrets]
[kubernetes_labels]
```

#### The `executor_config` Will Now Expect a `kubernetes.client.models.V1Pod` Class When Launching Tasks

In
Airflow 1.10.x, users could modify task pods at runtime by passing a dictionary to the `executor_config` variable.
Users will now have full access to the Kubernetes API via `kubernetes.client.models.V1Pod`.

In the deprecated version, a user would mount a volume using the following dictionary:

```python
second_task = PythonOperator(
    task_id="four_task",
    python_callable=test_volume_mount,
    executor_config={
        "KubernetesExecutor": {
            "volumes": [
                {
                    "name": "example-kubernetes-test-volume",
                    "hostPath": {"path": "/tmp/"},
                },
            ],
            "volume_mounts": [
                {
                    "mountPath": "/foo/",
                    "name": "example-kubernetes-test-volume",
                },
            ],
        }
    },
)
```

In the new model, a user can accomplish the same thing using the following code under the `pod_override` key:

```python
from kubernetes.client import models as k8s

second_task = PythonOperator(
    task_id="four_task",
    python_callable=test_volume_mount,
    executor_config={
        "pod_override": k8s.V1Pod(
            spec=k8s.V1PodSpec(
                containers=[
                    k8s.V1Container(
                        name="base",
                        volume_mounts=[
                            k8s.V1VolumeMount(
                                mount_path="/foo/",
                                name="example-kubernetes-test-volume",
                            )
                        ],
                    )
                ],
                volumes=[
                    k8s.V1Volume(
                        name="example-kubernetes-test-volume",
                        host_path=k8s.V1HostPathVolumeSource(path="/tmp/"),
                    )
                ],
            )
        )
    },
)
```

For Airflow 2.0, the traditional `executor_config` will continue operating with a deprecation warning,
but will be removed in a future version.


###Changed Parameters for the KubernetesPodOperator

Review comment:

Need to add a space to make this line a title.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
