This is an automated email from the ASF dual-hosted git repository. dimberman pushed a commit to branch v1-10-test in repository https://gitbox.apache.org/repos/asf/airflow.git
commit 89b06150754f3ee81ddf84ed8b04316d323a23a6
Author: Kamil Breguła <[email protected]>
AuthorDate: Thu Oct 3 08:23:57 2019 +0200

    [AIRFLOW-XXX] Extract operators and hooks to separate page (#6213)

    (cherry picked from commit bd822dd8c27f4f7c584da4a5d0524140e64d7613)
---
 docs/index.rst                   |    1 +
 docs/integration.rst             |  470 +--------------
 docs/operators-and-hooks-ref.rst | 1234 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 1247 insertions(+), 458 deletions(-)

diff --git a/docs/index.rst b/docs/index.rst
index 44717ac..65329d6 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -104,6 +104,7 @@ Content
     :maxdepth: 1
     :caption: References
 
+    Operators and hooks <operators-and-hooks-ref>
     CLI <cli-ref>
     Macros <macros-ref>
     Python API <_api/index>
diff --git a/docs/integration.rst b/docs/integration.rst
index 74a8207..ae252a6 100644
--- a/docs/integration.rst
+++ b/docs/integration.rst
@@ -18,461 +18,15 @@
 Integration
 ===========
 
-.. contents:: Content
-  :local:
-  :depth: 1
-
-.. _Azure:
-
-Azure: Microsoft Azure
-----------------------
-
-Airflow has limited support for Microsoft Azure: interfaces exist only for Azure Blob
-Storage and Azure Data Lake. Hook, Sensor and Operator for Blob Storage and
-Azure Data Lake Hook are in contrib section.
-
-Logging
-'''''''
-
-Airflow can be configured to read and write task logs in Azure Blob Storage.
-See :ref:`write-logs-azure`.
-
-
-Azure Blob Storage
-''''''''''''''''''
-
-All classes communicate via the Window Azure Storage Blob protocol. Make sure that a
-Airflow connection of type ``wasb`` exists. Authorization can be done by supplying a
-login (=Storage account name) and password (=KEY), or login and SAS token in the extra
-field (see connection ``wasb_default`` for an example).
- -The operators are defined in the following module: - -* :mod:`airflow.contrib.sensors.wasb_sensor` -* :mod:`airflow.contrib.operators.wasb_delete_blob_operator` -* :mod:`airflow.contrib.operators.file_to_wasb` - -They use :class:`airflow.contrib.hooks.wasb_hook.WasbHook` to communicate with Microsoft Azure. - -Azure File Share -'''''''''''''''' - -Cloud variant of a SMB file share. Make sure that a Airflow connection of -type ``wasb`` exists. Authorization can be done by supplying a login (=Storage account name) -and password (=Storage account key), or login and SAS token in the extra field -(see connection ``wasb_default`` for an example). - -It uses :class:`airflow.contrib.hooks.azure_fileshare_hook.AzureFileShareHook` to communicate with Microsoft Azure. - -Azure CosmosDB -'''''''''''''' - -AzureCosmosDBHook communicates via the Azure Cosmos library. Make sure that a -Airflow connection of type ``azure_cosmos`` exists. Authorization can be done by supplying a -login (=Endpoint uri), password (=secret key) and extra fields database_name and collection_name to specify the -default database and collection to use (see connection ``azure_cosmos_default`` for an example). - -The operators are defined in the following modules: - -* :mod:`airflow.contrib.operators.azure_cosmos_operator` -* :mod:`airflow.contrib.sensors.azure_cosmos_sensor` - -They also use :class:`airflow.contrib.hooks.azure_cosmos_hook.AzureCosmosDBHook` to communicate with Microsoft Azure. - -Azure Data Lake -''''''''''''''' - -AzureDataLakeHook communicates via a REST API compatible with WebHDFS. Make sure that a -Airflow connection of type ``azure_data_lake`` exists. Authorization can be done by supplying a -login (=Client ID), password (=Client Secret) and extra fields tenant (Tenant) and account_name (Account Name) -(see connection ``azure_data_lake_default`` for an example). 
- -The operators are defined in the following modules: - -* :mod:`airflow.contrib.operators.adls_list_operator` -* :mod:`airflow.contrib.operators.adls_to_gcs` - -They also use :class:`airflow.contrib.hooks.azure_data_lake_hook.AzureDataLakeHook` to communicate with Microsoft Azure. - - -Azure Container Instances -''''''''''''''''''''''''' - -Azure Container Instances provides a method to run a docker container without having to worry -about managing infrastructure. The AzureContainerInstanceHook requires a service principal. The -credentials for this principal can either be defined in the extra field ``key_path``, as an -environment variable named ``AZURE_AUTH_LOCATION``, -or by providing a login/password and tenantId in extras. - -The operator is defined in the :mod:`airflow.contrib.operators.azure_container_instances_operator` module. - -They also use :class:`airflow.contrib.hooks.azure_container_volume_hook.AzureContainerVolumeHook`, -:class:`airflow.contrib.hooks.azure_container_registry_hook.AzureContainerRegistryHook` and -:class:`airflow.contrib.hooks.azure_container_instance_hook.AzureContainerInstanceHook` to communicate with Microsoft Azure. - -The AzureContainerRegistryHook requires a host/login/password to be defined in the connection. - - -.. _AWS: - -AWS: Amazon Web Services ------------------------- - -Airflow has extensive support for Amazon Web Services. But note that the Hooks, Sensors and -Operators are in the contrib section. - -Logging -''''''' - -Airflow can be configured to read and write task logs in Amazon Simple Storage Service (Amazon S3). -See :ref:`write-logs-amazon`. 
- - -AWS EMR -''''''' - -The operators are defined in the following modules: - -* :mod:`airflow.contrib.operators.emr_add_steps_operator` -* :mod:`airflow.contrib.operators.emr_create_job_flow_operator` -* :mod:`airflow.contrib.operators.emr_terminate_job_flow_operator` - -They also use :class:`airflow.contrib.hooks.emr_hook.EmrHook` to communicate with Amazon Web Service. - -AWS S3 -'''''' - -The operators are defined in the following modules: - -* :mod:`airflow.operators.s3_file_transform_operator` -* :mod:`airflow.contrib.operators.s3_list_operator` -* :mod:`airflow.contrib.operators.s3_to_gcs_operator` -* :mod:`airflow.contrib.operators.s3_to_gcs_transfer_operator` -* :mod:`airflow.operators.s3_to_hive_operator` - -They also use :class:`airflow.hooks.S3_hook.S3Hook` to communicate with Amazon Web Service. - -AWS Batch Service -''''''''''''''''' - -The operator is defined in the :class:`airflow.contrib.operators.awsbatch_operator.AWSBatchOperator` module. - -AWS RedShift -'''''''''''' - -The operators are defined in the following modules: - -* :mod:`airflow.contrib.sensors.aws_redshift_cluster_sensor` -* :mod:`airflow.operators.redshift_to_s3_operator` -* :mod:`airflow.operators.s3_to_redshift_operator` - -They also use :class:`airflow.contrib.hooks.redshift_hook.RedshiftHook` to communicate with Amazon Web Service. - - -AWS DynamoDB -'''''''''''' - -The operator is defined in the :class:`airflow.contrib.operators.hive_to_dynamodb` module. - -It uses :class:`airflow.contrib.hooks.aws_dynamodb_hook.AwsDynamoDBHook` to communicate with Amazon Web Service. - - -AWS Lambda -'''''''''' - -It uses :class:`airflow.contrib.hooks.aws_lambda_hook.AwsLambdaHook` to communicate with Amazon Web Service. - -AWS Kinesis -''''''''''' - -It uses :class:`airflow.contrib.hooks.aws_firehose_hook.AwsFirehoseHook` to communicate with Amazon Web Service. 
- - -Amazon SageMaker -'''''''''''''''' - -For more instructions on using Amazon SageMaker in Airflow, please see `the SageMaker Python SDK README`_. - -.. _the SageMaker Python SDK README: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/workflow/README.rst - -The operators are defined in the following modules: - -:mod:`airflow.contrib.operators.sagemaker_training_operator` -:mod:`airflow.contrib.operators.sagemaker_tuning_operator` -:mod:`airflow.contrib.operators.sagemaker_model_operator` -:mod:`airflow.contrib.operators.sagemaker_transform_operator` -:mod:`airflow.contrib.operators.sagemaker_endpoint_config_operator` -:mod:`airflow.contrib.operators.sagemaker_endpoint_operator` - -They uses :class:`airflow.contrib.hooks.sagemaker_hook.SageMakerHook` to communicate with Amazon Web Service. - -.. _Databricks: - -Databricks ----------- - -With contributions from `Databricks <https://databricks.com/>`__, Airflow has several operators -which enable the submitting and running of jobs to the Databricks platform. Internally the -operators talk to the ``api/2.0/jobs/runs/submit`` `endpoint <https://docs.databricks.com/api/latest/jobs.html#runs-submit>`_. - -The operators are defined in the :class:`airflow.contrib.operators.databricks_operator` module. - -.. _GCP: - -GCP: Google Cloud Platform --------------------------- - -Airflow has extensive support for the Google Cloud Platform. But note that most Hooks and -Operators are in the contrib section. Meaning that they have a *beta* status, meaning that -they can have breaking changes between minor releases. - -See the :doc:`GCP connection type <howto/connection/gcp>` documentation to -configure connections to GCP. - -Logging -''''''' - -Airflow can be configured to read and write task logs in Google Cloud Storage. -See :ref:`write-logs-gcp`. - - -GoogleCloudBaseHook -''''''''''''''''''' - -All hooks is based on :class:`airflow.contrib.hooks.gcp_api_base_hook.GoogleCloudBaseHook`. 
- - -BigQuery -'''''''' - -The operators are defined in the following module: - - * :mod:`airflow.contrib.operators.bigquery_check_operator` - * :mod:`airflow.contrib.operators.bigquery_get_data` - * :mod:`airflow.contrib.operators.bigquery_table_delete_operator` - * :mod:`airflow.contrib.operators.bigquery_to_bigquery` - * :mod:`airflow.contrib.operators.bigquery_to_gcs` - -They also use :class:`airflow.contrib.hooks.bigquery_hook.BigQueryHook` to communicate with Google Cloud Platform. - - -Cloud Spanner -''''''''''''' - -The operator is defined in the :class:`airflow.contrib.operators.gcp_spanner_operator` package. - -They also use :class:`airflow.contrib.hooks.gcp_spanner_hook.CloudSpannerHook` to communicate with Google Cloud Platform. - - -Cloud SQL -''''''''' - -The operator is defined in the :class:`airflow.contrib.operators.gcp_sql_operator` package. - -They also use :class:`airflow.contrib.hooks.gcp_sql_hook.CloudSqlDatabaseHook` and :class:`airflow.contrib.hooks.gcp_sql_hook.CloudSqlHook` to communicate with Google Cloud Platform. - - -Cloud Bigtable -'''''''''''''' - -The operator is defined in the :class:`airflow.contrib.operators.gcp_bigtable_operator` package. - - -They also use :class:`airflow.contrib.hooks.gcp_bigtable_hook.BigtableHook` to communicate with Google Cloud Platform. - -Cloud Build -''''''''''' - -The operator is defined in the :class:`airflow.contrib.operators.gcp_cloud_build_operator` package. - -They also use :class:`airflow.contrib.hooks.gcp_cloud_build_hook.CloudBuildHook` to communicate with Google Cloud Platform. - - -Compute Engine -'''''''''''''' - -The operators are defined in the :class:`airflow.contrib.operators.gcp_compute_operator` package. - -They also use :class:`airflow.contrib.hooks.gcp_compute_hook.GceHook` to communicate with Google Cloud Platform. - - -Cloud Functions -''''''''''''''' - -The operators are defined in the :class:`airflow.contrib.operators.gcp_function_operator` package. 
- -They also use :class:`airflow.contrib.hooks.gcp_function_hook.GcfHook` to communicate with Google Cloud Platform. - - -Cloud DataFlow -'''''''''''''' - -The operators are defined in the :class:`airflow.contrib.operators.dataflow_operator` package. - -They also use :class:`airflow.contrib.hooks.gcp_dataflow_hook.DataFlowHook` to communicate with Google Cloud Platform. - - -Cloud DataProc -'''''''''''''' - -The operators are defined in the :class:`airflow.contrib.operators.dataproc_operator` package. - - -Cloud Datastore -''''''''''''''' - -:class:`airflow.contrib.operators.datastore_export_operator.DatastoreExportOperator` - Export entities from Google Cloud Datastore to Cloud Storage. - -:class:`airflow.contrib.operators.datastore_import_operator.DatastoreImportOperator` - Import entities from Cloud Storage to Google Cloud Datastore. - -They also use :class:`airflow.contrib.hooks.datastore_hook.DatastoreHook` to communicate with Google Cloud Platform. - - -Cloud ML Engine -''''''''''''''' - -:class:`airflow.contrib.operators.mlengine_operator.MLEngineBatchPredictionOperator` - Start a Cloud ML Engine batch prediction job. - -:class:`airflow.contrib.operators.mlengine_operator.MLEngineModelOperator` - Manages a Cloud ML Engine model. - -:class:`airflow.contrib.operators.mlengine_operator.MLEngineTrainingOperator` - Start a Cloud ML Engine training job. - -:class:`airflow.contrib.operators.mlengine_operator.MLEngineVersionOperator` - Manages a Cloud ML Engine model version. - -The operators are defined in the :class:`airflow.contrib.operators.mlengine_operator` package. - -They also use :class:`airflow.contrib.hooks.gcp_mlengine_hook.MLEngineHook` to communicate with Google Cloud Platform. 
- -Cloud Storage -''''''''''''' - -The operators are defined in the following module: - - * :mod:`airflow.contrib.operators.file_to_gcs` - * :mod:`airflow.contrib.operators.gcs_acl_operator` - * :mod:`airflow.contrib.operators.gcs_download_operator` - * :mod:`airflow.contrib.operators.gcs_list_operator` - * :mod:`airflow.contrib.operators.gcs_operator` - * :mod:`airflow.contrib.operators.gcs_to_bq` - * :mod:`airflow.contrib.operators.gcs_to_gcs` - * :mod:`airflow.contrib.operators.mysql_to_gcs` - * :mod:`airflow.contrib.operators.mssql_to_gcs` - * :mod:`airflow.contrib.sensors.gcs_sensor` - * :mod:`airflow.contrib.operators.gcs_delete_operator` - -They also use :class:`airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook` to communicate with Google Cloud Platform. - - -Transfer Service -'''''''''''''''' - - -The operators are defined in the following module: - - * :mod:`airflow.contrib.operators.gcp_transfer_operator` - * :mod:`airflow.contrib.sensors.gcp_transfer_operator` - -They also use :class:`airflow.contrib.hooks.gcp_transfer_hook.GCPTransferServiceHook` to communicate with Google Cloud Platform. - - -Cloud Vision -'''''''''''' - - -The operator is defined in the :class:`airflow.contrib.operators.gcp_vision_operator` package. - -They also use :class:`airflow.contrib.hooks.gcp_vision_hook.CloudVisionHook` to communicate with Google Cloud Platform. - -Cloud Text to Speech -'''''''''''''''''''' - -The operator is defined in the :class:`airflow.contrib.operators.gcp_text_to_speech_operator` package. - -They also use :class:`airflow.contrib.hooks.gcp_text_to_speech_hook.GCPTextToSpeechHook` to communicate with Google Cloud Platform. - -Cloud Speech to Text -'''''''''''''''''''' - -The operator is defined in the :class:`airflow.contrib.operators.gcp_speech_to_text_operator` package. - -They also use :class:`airflow.contrib.hooks.gcp_speech_to_text_hook.GCPSpeechToTextHook` to communicate with Google Cloud Platform. 
- -Cloud Speech Translate Operators --------------------------------- - -The operator is defined in the :class:`airflow.contrib.operators.gcp_translate_speech_operator` package. - -They also use :class:`airflow.contrib.hooks.gcp_speech_to_text_hook.GCPSpeechToTextHook` and - :class:`airflow.contrib.hooks.gcp_translate_hook.CloudTranslateHook` to communicate with Google Cloud Platform. - -Cloud Translate -''''''''''''''' - -Cloud Translate Text Operators -"""""""""""""""""""""""""""""" - -:class:`airflow.contrib.operators.gcp_translate_operator.CloudTranslateTextOperator` - Translate a string or list of strings. - -The operator is defined in the :class:`airflow.contrib.operators.gcp_translate_operator` package. - -Cloud Video Intelligence -'''''''''''''''''''''''' - -The operators are defined in the :class:`airflow.contrib.operators.gcp_video_intelligence_operator` package. - -They also use :class:`airflow.contrib.hooks.gcp_video_intelligence_hook.CloudVideoIntelligenceHook` to communicate with Google Cloud Platform. - -Google Kubernetes Engine -'''''''''''''''''''''''' - -The operators are defined in the :class:`airflow.contrib.operators.gcp_container_operator` package. - - -They also use :class:`airflow.contrib.hooks.gcp_container_hook.GKEClusterHook` to communicate with Google Cloud Platform. - - -Google Natural Language -''''''''''''''''''''''' - -The operators are defined in the :class:`airflow.contrib.operators.gcp_natural_language_operator` package. - -They also use :class:`airflow.contrib.hooks.gcp_natural_language_operator.CloudNaturalLanguageHook` to communicate with Google Cloud Platform. - - -Google Cloud Data Loss Prevention (DLP) -''''''''''''''''''''''''''''''''''''''' - -The operators are defined in the :class:`airflow.contrib.operators.gcp_dlp_operator` package. - -They also use :class:`airflow.contrib.hooks.gcp_dlp_hook.CloudDLPHook` to communicate with Google Cloud Platform. 
- - -Google Cloud Tasks -'''''''''''''''''' - -The operators are defined in the :class:`airflow.contrib.operators.gcp_tasks_operator` package. - -They also use :class:`airflow.contrib.hooks.gcp_tasks_hook.CloudTasksHook` to communicate with Google Cloud Platform. - - -.. _Qubole: - -Qubole ------- - -Apache Airflow has a native operator and hooks to talk to `Qubole <https://qubole.com/>`__, -which lets you submit your big data jobs directly to Qubole from Apache Airflow. - -The operators are defined in the following module: - - * :mod:`airflow.contrib.operators.qubole_operator` - * :mod:`airflow.contrib.sensors.qubole_sensor` - * :mod:`airflow.contrib.sensors.qubole_sensor` - * :mod:`airflow.contrib.operators.qubole_check_operator` +Airflow has a mechanism that allows you to expand its functionality and integrate with other systems. + +* :doc:`Operators and hooks </operators-and-hooks-ref>` +* :doc:`Executor </executor/index>` +* :doc:`Plugins </plugins>` +* :doc:`Metrics (statsd) </metrics>` +* :doc:`Authentication backends </security>` +* :doc:`Logging </howto/write-logs>` +* :doc:`Trakcing systems </howto/tracking-user-activity>` + +It also has integration with :doc:`Sentry <errors>` service for error tracking. Other applications can also integrate using +the :doc:`REST API <rest-api-ref>`. diff --git a/docs/operators-and-hooks-ref.rst b/docs/operators-and-hooks-ref.rst new file mode 100644 index 0000000..6c80858 --- /dev/null +++ b/docs/operators-and-hooks-ref.rst @@ -0,0 +1,1234 @@ + .. Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. 
Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + +Operators and Hooks Reference +============================= + +.. contents:: Content + :local: + :depth: 1 + +.. _Apache: + +ASF: Apache Software Foundation +------------------------------- + +Airflow supports various software created by `Apache Software Foundation <https://www.apache.org/foundation/>`__. + +Software operators and hooks +'''''''''''''''''''''''''''' + +These integrations allow you to perform various operations within software developed by Apache Software +Foundation. + +.. list-table:: + :header-rows: 1 + + * - Service name + - Guides + - Hook + - Operators + - Sensors + + * - `Apache Cassandra <http://cassandra.apache.org/>`__ + - + - :mod:`airflow.contrib.hooks.cassandra_hook` + - + - :mod:`airflow.contrib.sensors.cassandra_record_sensor`, + :mod:`airflow.contrib.sensors.cassandra_table_sensor` + + * - `Apache Druid <https://druid.apache.org/>`__ + - + - :mod:`airflow.hooks.druid_hook` + - :mod:`airflow.contrib.operators.druid_operator`, + :mod:`airflow.operators.druid_check_operator` + - + * - `Hadoop Distributed File System (HDFS) <https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html>`__ + - + - :mod:`airflow.hooks.hdfs_hook` + - + - :mod:`airflow.sensors.hdfs_sensor`, + :mod:`airflow.contrib.sensors.hdfs_sensor` + + * - `Apache Hive <https://hive.apache.org/>`__ + - + - :mod:`airflow.hooks.hive_hooks` + - :mod:`airflow.operators.hive_operator`, + :mod:`airflow.operators.hive_stats_operator` + - :mod:`airflow.sensors.named_hive_partition_sensor`, + :mod:`airflow.sensors.hive_partition_sensor`, + :mod:`airflow.sensors.metastore_partition_sensor` + + * - `Apache Pig <https://pig.apache.org/>`__ + - + - 
:mod:`airflow.hooks.pig_hook` + - :mod:`airflow.operators.pig_operator` + - + + * - `Apache Pinot <https://pinot.apache.org/>`__ + - + - :mod:`airflow.contrib.hooks.pinot_hook` + - + - + + * - `Apache Spark <https://spark.apache.org/>`__ + - + - :mod:`airflow.contrib.hooks.spark_jdbc_hook`, + :mod:`airflow.contrib.hooks.spark_jdbc_script`, + :mod:`airflow.contrib.hooks.spark_sql_hook`, + :mod:`airflow.contrib.hooks.spark_submit_hook` + - :mod:`airflow.contrib.operators.spark_jdbc_operator`, + :mod:`airflow.contrib.operators.spark_sql_operator`, + :mod:`airflow.contrib.operators.spark_submit_operator` + - + + * - `Apache Sqoop <https://sqoop.apache.org/>`__ + - + - :mod:`airflow.contrib.hooks.sqoop_hook` + - :mod:`airflow.contrib.operators.sqoop_operator` + - + + * - `WebHDFS <https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/WebHDFS.html>`__ + - + - :mod:`airflow.hooks.webhdfs_hook` + - + - :mod:`airflow.sensors.web_hdfs_sensor` + + +Transfer operators and hooks +'''''''''''''''''''''''''''' + +These integrations allow you to copy data from/to software developed by Apache Software +Foundation. + +.. 
list-table:: + :header-rows: 1 + + * - Source + - Destination + - Guide + - Operators + + * - `Apache Hive <https://hive.apache.org/>`__ + - `Amazon DynamoDB <https://aws.amazon.com/dynamodb/>`__ + - + - :mod:`airflow.contrib.operators.hive_to_dynamodb` + + * - `Apache Hive <https://hive.apache.org/>`__ + - `Apache Druid <https://druid.apache.org/>`__ + - + - :mod:`airflow.operators.hive_to_druid` + + * - `Apache Hive <https://hive.apache.org/>`__ + - `MySQL <https://www.mysql.com/>`__ + - + - :mod:`airflow.operators.hive_to_mysql` + + * - `Apache Hive <https://hive.apache.org/>`__ + - `Samba <https://www.samba.org/>`__ + - + - :mod:`airflow.operators.hive_to_samba_operator` + + * - `Microsoft SQL Server (MSSQL) <https://www.microsoft.com/pl-pl/sql-server/sql-server-downloads>`__ + - `Apache Hive <https://hive.apache.org/>`__ + - + - :mod:`airflow.operators.mssql_to_hive` + + * - `MySQL <https://www.mysql.com/>`__ + - `Apache Hive <https://hive.apache.org/>`__ + - + - :mod:`airflow.operators.mysql_to_hive` + + * - `Vertica <https://www.vertica.com/>`__ + - `Apache Hive <https://hive.apache.org/>`__ + - + - :mod:`airflow.contrib.operators.vertica_to_hive` + + * - `Apache Cassandra <http://cassandra.apache.org/>`__ + - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - + - :mod:`airflow.operators.cassandra_to_gcs` + + * - `Amazon Simple Storage Service (S3) <https://aws.amazon.com/s3/>`_ + - `Apache Hive <https://hive.apache.org/>`__ + - + - :mod:`airflow.operators.s3_to_hive_operator` + + * - `Apache Hive <https://hive.apache.org/>`__ + - `MySQL <https://www.mysql.com/>`__ + - + - :mod:`airflow.operators.hive_to_mysql` + +.. _Azure: + +Azure: Microsoft Azure +---------------------- + +Airflow has limited support for `Microsoft Azure <https://azure.microsoft.com/>`__. + +Service operators and hooks +''''''''''''''''''''''''''' + +These integrations allow you to perform various operations within the Microsoft Azure. + + +.. 
list-table:: + :header-rows: 1 + + * - Service name + - Hook + - Operators + - Sensors + + * - `Azure Blob Storage <https://azure.microsoft.com/en-us/services/storage/blobs/>`__ + - :mod:`airflow.contrib.hooks.wasb_hook` + - :mod:`airflow.contrib.operators.wasb_delete_blob_operator` + - :mod:`airflow.contrib.sensors.wasb_sensor` + + * - `Azure Container Instances <https://azure.microsoft.com/en-us/services/container-instances/>`__ + - :mod:`airflow.contrib.hooks.azure_container_instance_hook`, + :mod:`airflow.contrib.hooks.azure_container_registry_hook`, + :mod:`airflow.contrib.hooks.azure_container_volume_hook` + - :mod:`airflow.contrib.operators.azure_container_instances_operator` + - + + * - `Azure Cosmos DB <https://azure.microsoft.com/en-us/services/cosmos-db/>`__ + - :mod:`airflow.contrib.hooks.azure_cosmos_hook` + - :mod:`airflow.contrib.operators.azure_cosmos_operator` + - :mod:`airflow.contrib.sensors.azure_cosmos_sensor` + + * - `Azure Data Lake Storage <https://azure.microsoft.com/en-us/services/storage/data-lake-storage/>`__ + - :mod:`airflow.contrib.hooks.azure_data_lake_hook` + - :mod:`airflow.contrib.operators.adls_list_operator` + - + + * - `Azure Files <https://azure.microsoft.com/en-us/services/storage/files/>`__ + - :mod:`airflow.contrib.hooks.azure_fileshare_hook` + - + - + + +Transfer operators and hooks +"""""""""""""""""""""""""""" + +These integrations allow you to copy data from/to Microsoft Azure. + +.. 
list-table:: + :header-rows: 1 + + * - Source + - Destination + - Guide + - Operators + + * - `Azure Data Lake Storage <https://azure.microsoft.com/en-us/services/storage/data-lake-storage/>`__ + - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - + - :mod:`airflow.operators.adls_to_gcs` + + * - Local + - `Azure Blob Storage <https://azure.microsoft.com/en-us/services/storage/blobs/>`__ + - + - :mod:`airflow.contrib.operators.file_to_wasb` + + * - `Oracle <https://www.oracle.com/pl/database/>`__ + - `Azure Data Lake Storage <https://azure.microsoft.com/en-us/services/storage/data-lake-storage/>`__ + - + - :mod:`airflow.contrib.operators.oracle_to_azure_data_lake_transfer` + + +.. _AWS: + +AWS: Amazon Web Services +------------------------ + +Airflow has support for `Amazon Web Services <https://aws.amazon.com/>`__. + +All hooks are based on :mod:`airflow.contrib.hooks.aws_hook`. + +Service operators and hooks +''''''''''''''''''''''''''' + +These integrations allow you to perform various operations within the Amazon Web Services. + +.. 
list-table:: + :header-rows: 1 + + * - Service name + - Hook + - Operators + - Sensors + + * - `Amazon Athena <https://aws.amazon.com/athena/>`__ + - :mod:`airflow.contrib.hooks.aws_athena_hook` + - :mod:`airflow.contrib.operators.aws_athena_operator` + - :mod:`airflow.contrib.sensors.aws_athena_sensor` + + * - `AWS Batch <https://aws.amazon.com/athena/>`__ + - + - :mod:`airflow.contrib.operators.awsbatch_operator` + - + + * - `Amazon CloudWatch Logs <https://aws.amazon.com/cloudwatch/>`__ + - :mod:`airflow.contrib.hooks.aws_logs_hook` + - + - + + * - `Amazon DynamoDB <https://aws.amazon.com/dynamodb/>`__ + - :mod:`airflow.contrib.hooks.aws_dynamodb_hook` + - + - + + * - `Amazon EC2 <https://aws.amazon.com/ec2/>`__ + - + - :mod:`airflow.contrib.operators.ecs_operator` + - + + * - `Amazon EMR <https://aws.amazon.com/emr/>`__ + - :mod:`airflow.contrib.hooks.emr_hook` + - :mod:`airflow.contrib.operators.emr_add_steps_operator`, + :mod:`airflow.contrib.operators.emr_create_job_flow_operator`, + :mod:`airflow.contrib.operators.emr_terminate_job_flow_operator` + - :mod:`airflow.contrib.sensors.emr_base_sensor`, + :mod:`airflow.contrib.sensors.emr_job_flow_sensor`, + :mod:`airflow.contrib.sensors.emr_step_sensor` + + * - `AWS Glue Catalog <https://aws.amazon.com/glue/>`__ + - :mod:`airflow.contrib.hooks.aws_glue_catalog_hook` + - + - :mod:`airflow.contrib.sensors.aws_glue_catalog_partition_sensor` + + * - `Amazon Kinesis Data Firehose <https://aws.amazon.com/kinesis/data-firehose/>`__ + - :mod:`airflow.contrib.hooks.aws_firehose_hook` + - + - + + * - `AWS Lambda <https://aws.amazon.com/kinesis/>`__ + - :mod:`airflow.contrib.hooks.aws_lambda_hook` + - + - + + * - `Amazon Redshift <https://aws.amazon.com/redshift/>`__ + - :mod:`airflow.contrib.hooks.redshift_hook` + - + - :mod:`airflow.contrib.sensors.aws_redshift_cluster_sensor` + + * - `Amazon Simple Storage Service (S3) <https://aws.amazon.com/s3/>`__ + - :mod:`airflow.hooks.S3_hook` + - 
:mod:`airflow.operators.s3_file_transform_operator`, + :mod:`airflow.contrib.operators.s3_copy_object_operator`, + :mod:`airflow.contrib.operators.s3_delete_objects_operator`, + :mod:`airflow.contrib.operators.s3_list_operator` + - :mod:`airflow.sensors.s3_key_sensor`, + :mod:`airflow.sensors.s3_prefix_sensor` + + * - `Amazon SageMaker <https://aws.amazon.com/sagemaker/>`__ + - :mod:`airflow.contrib.hooks.sagemaker_hook` + - :mod:`airflow.contrib.operators.sagemaker_base_operator`, + :mod:`airflow.contrib.operators.sagemaker_endpoint_config_operator`, + :mod:`airflow.contrib.operators.sagemaker_endpoint_operator`, + :mod:`airflow.contrib.operators.sagemaker_model_operator`, + :mod:`airflow.contrib.operators.sagemaker_training_operator`, + :mod:`airflow.contrib.operators.sagemaker_transform_operator`, + :mod:`airflow.contrib.operators.sagemaker_tuning_operator` + - :mod:`airflow.contrib.sensors.sagemaker_base_sensor`, + :mod:`airflow.contrib.sensors.sagemaker_endpoint_sensor`, + :mod:`airflow.contrib.sensors.sagemaker_training_sensor`, + :mod:`airflow.contrib.sensors.sagemaker_transform_sensor`, + :mod:`airflow.contrib.sensors.sagemaker_tuning_sensor` + + * - `Amazon Simple Notification Service (SNS) <https://aws.amazon.com/sns/>`__ + - :mod:`airflow.contrib.hooks.aws_sns_hook` + - :mod:`airflow.contrib.operators.sns_publish_operator` + - + + * - `Amazon Simple Queue Service (SQS) <https://aws.amazon.com/sns/>`__ + - :mod:`airflow.contrib.hooks.aws_sqs_hook` + - :mod:`airflow.contrib.operators.aws_sqs_publish_operator` + - :mod:`airflow.contrib.sensors.aws_sqs_sensor` + +Transfer operators and hooks +"""""""""""""""""""""""""""" + +These integrations allow you to copy data from/to Amazon Web Services. + +.. list-table:: + :header-rows: 1 + + * - Source + - Destination + - Guide + - Operators + + * - + .. 
_integration:AWS-Discovery-ref: + + All GCP services :ref:`[1] <integration:GCP-Discovery>` + - `Amazon Simple Storage Service (S3) <https://aws.amazon.com/s3/>`__ + - + - :mod:`airflow.operators.google_api_to_s3_transfer` + + * - `Apache Hive <https://hive.apache.org/>`__ + - `Amazon DynamoDB <https://aws.amazon.com/dynamodb/>`__ + - + - :mod:`airflow.contrib.operators.hive_to_dynamodb` + + * - `Amazon Simple Storage Service (S3) <https://aws.amazon.com/s3/>`__ + - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - :doc:`How to use <howto/operator/gcp/cloud_storage_transfer_service>` + - :mod:`airflow.contrib.operators.s3_to_gcs_operator`, + :mod:`airflow.gcp.operators.cloud_storage_transfer_service` + + * - `Amazon Redshift <https://aws.amazon.com/redshift/>`__ + - `Amazon Simple Storage Service (S3) <https://aws.amazon.com/s3/>`_ + - + - :mod:`airflow.operators.redshift_to_s3_operator` + + * - `Amazon Simple Storage Service (S3) <https://aws.amazon.com/s3/>`_ + - `Apache Hive <https://hive.apache.org/>`__ + - + - :mod:`airflow.operators.s3_to_hive_operator` + + * - `Amazon Simple Storage Service (S3) <https://aws.amazon.com/s3/>`_ + - `Amazon Redshift <https://aws.amazon.com/redshift/>`__ + - + - :mod:`airflow.operators.s3_to_redshift_operator` + + * - `Amazon DynamoDB <https://aws.amazon.com/dynamodb/>`__ + - `Amazon Simple Storage Service (S3) <https://aws.amazon.com/s3/>`_ + - + - :mod:`airflow.contrib.operators.dynamodb_to_s3` + + * - `Amazon Simple Storage Service (S3) <https://aws.amazon.com/s3/>`_ + - `SSH File Transfer Protocol (SFTP) <https://tools.ietf.org/wg/secsh/draft-ietf-secsh-filexfer/>`__ + - + - :mod:`airflow.contrib.operators.s3_to_sftp_operator` + + * - `SSH File Transfer Protocol (SFTP) <https://tools.ietf.org/wg/secsh/draft-ietf-secsh-filexfer/>`__ + - `Amazon Simple Storage Service (S3) <https://aws.amazon.com/s3/>`_ + - + - :mod:`airflow.contrib.operators.sftp_to_s3_operator` + + * - `Google Cloud Storage (GCS) 
<https://cloud.google.com/gcs/>`__ + - `Amazon Simple Storage Service (S3) <https://aws.amazon.com/s3/>`__ + - + - :mod:`airflow.operators.gcs_to_s3` + + * - `Internet Message Access Protocol (IMAP) <https://tools.ietf.org/html/rfc3501>`__ + - `Amazon Simple Storage Service (S3) <https://aws.amazon.com/s3/>`__ + - + - :mod:`airflow.contrib.operators.imap_attachment_to_s3_operator` + +:ref:`[1] <integration:AWS-Discovery-ref>` These discovery-based operators use +:class:`airflow.gcp.hooks.discovery_api.GoogleDiscoveryApiHook` to communicate with Google +Services via the `Google API Python Client <https://github.com/googleapis/google-api-python-client>`__. +Please note that this library is in maintenance mode, so it won't fully support GCP in the future. +Therefore, it is recommended that you use the custom GCP Service Operators for working with the Google +Cloud Platform. + +.. _GCP: + +GCP: Google Cloud Platform +-------------------------- + +Airflow has extensive support for the `Google Cloud Platform <https://cloud.google.com/>`__. + +See the :doc:`GCP connection type <howto/connection/gcp>` documentation to +configure connections to GCP. + +All hooks are based on :class:`airflow.contrib.hooks.gcp_api_base_hook.GoogleCloudBaseHook`. + +Service operators and hooks +''''''''''''''''''''''''''' + +These integrations allow you to perform various operations within the Google Cloud Platform. + +.. + PLEASE KEEP THE ALPHABETICAL ORDER OF THE LIST BELOW, BUT OMIT THE "Cloud" PREFIX + +..
list-table:: + :header-rows: 1 + + * - Service name + - Guide + - Hook + - Operators + - Sensors + + * - `AutoML <https://cloud.google.com/automl/>`__ + - :doc:`How to use <howto/operator/gcp/automl>` + - :mod:`airflow.gcp.hooks.automl` + - :mod:`airflow.gcp.operators.automl` + - + + * - `BigQuery <https://cloud.google.com/bigquery/>`__ + - + - :mod:`airflow.gcp.hooks.bigquery` + - :mod:`airflow.gcp.operators.bigquery` + - :mod:`airflow.gcp.sensors.bigquery` + + * - `BigQuery Data Transfer Service <https://cloud.google.com/bigquery/transfer/>`__ + - :doc:`How to use <howto/operator/gcp/bigquery_dts>` + - :mod:`airflow.gcp.hooks.bigquery_dts` + - :mod:`airflow.gcp.operators.bigquery_dts` + - :mod:`airflow.gcp.sensors.bigquery_dts` + + * - `Bigtable <https://cloud.google.com/bigtable/>`__ + - :doc:`How to use <howto/operator/gcp/bigtable>` + - :mod:`airflow.gcp.hooks.bigtable` + - :mod:`airflow.gcp.operators.bigtable` + - :mod:`airflow.gcp.sensors.bigtable` + + * - `Cloud Build <https://cloud.google.com/cloud-build/>`__ + - :doc:`How to use <howto/operator/gcp/cloud_build>` + - :mod:`airflow.gcp.hooks.cloud_build` + - :mod:`airflow.gcp.operators.cloud_build` + - + + * - `Compute Engine <https://cloud.google.com/compute/>`__ + - :doc:`How to use <howto/operator/gcp/compute>` + - :mod:`airflow.gcp.hooks.compute` + - :mod:`airflow.gcp.operators.compute` + - + + * - `Cloud Data Loss Prevention (DLP) <https://cloud.google.com/dlp/>`__ + - + - :mod:`airflow.gcp.hooks.dlp` + - :mod:`airflow.gcp.operators.dlp` + - + + * - `Dataflow <https://cloud.google.com/dataflow/>`__ + - + - :mod:`airflow.gcp.hooks.dataflow` + - :mod:`airflow.gcp.operators.dataflow` + - + + * - `Dataproc <https://cloud.google.com/dataproc/>`__ + - + - :mod:`airflow.gcp.hooks.dataproc` + - :mod:`airflow.gcp.operators.dataproc` + - + + * - `Datastore <https://cloud.google.com/datastore/>`__ + - + - :mod:`airflow.gcp.hooks.datastore` + - :mod:`airflow.gcp.operators.datastore` + - + + * - `Cloud Functions 
<https://cloud.google.com/functions/>`__ + - :doc:`How to use <howto/operator/gcp/functions>` + - :mod:`airflow.gcp.hooks.functions` + - :mod:`airflow.gcp.operators.functions` + - + + * - `Cloud Key Management Service (KMS) <https://cloud.google.com/kms/>`__ + - + - :mod:`airflow.gcp.hooks.kms` + - + - + + * - `Kubernetes Engine <https://cloud.google.com/kubernetes_engine/>`__ + - + - :mod:`airflow.gcp.hooks.kubernetes_engine` + - :mod:`airflow.gcp.operators.kubernetes_engine` + - + + * - `Machine Learning Engine <https://cloud.google.com/ml-engine/>`__ + - + - :mod:`airflow.gcp.hooks.mlengine` + - :mod:`airflow.gcp.operators.mlengine` + - + + * - `Cloud Memorystore <https://cloud.google.com/memorystore/>`__ + - :doc:`How to use <howto/operator/gcp/cloud_memorystore>` + - :mod:`airflow.gcp.hooks.cloud_memorystore` + - :mod:`airflow.gcp.operators.cloud_memorystore` + - + + * - `Natural Language <https://cloud.google.com/natural-language/>`__ + - :doc:`How to use <howto/operator/gcp/natural_language>` + - :mod:`airflow.gcp.hooks.natural_language` + - :mod:`airflow.gcp.operators.natural_language` + - + + * - `Cloud Pub/Sub <https://cloud.google.com/pubsub/>`__ + - + - :mod:`airflow.gcp.hooks.pubsub` + - :mod:`airflow.gcp.operators.pubsub` + - :mod:`airflow.gcp.sensors.pubsub` + + * - `Cloud Spanner <https://cloud.google.com/spanner/>`__ + - :doc:`How to use <howto/operator/gcp/spanner>` + - :mod:`airflow.gcp.hooks.spanner` + - :mod:`airflow.gcp.operators.spanner` + - + + * - `Cloud Speech-to-Text <https://cloud.google.com/speech-to-text/>`__ + - :doc:`How to use <howto/operator/gcp/speech>` + - :mod:`airflow.gcp.hooks.speech_to_text` + - :mod:`airflow.gcp.operators.speech_to_text` + - + + * - `Cloud SQL <https://cloud.google.com/sql/>`__ + - :doc:`How to use <howto/operator/gcp/sql>` + - :mod:`airflow.gcp.hooks.cloud_sql` + - :mod:`airflow.gcp.operators.cloud_sql` + - + + * - `Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - :doc:`How to use 
<howto/operator/gcp/gcs>` + - :mod:`airflow.gcp.hooks.gcs` + - :mod:`airflow.gcp.operators.gcs` + - :mod:`airflow.gcp.sensors.gcs` + + * - `Storage Transfer Service <https://cloud.google.com/storage/transfer/>`__ + - :doc:`How to use <howto/operator/gcp/cloud_storage_transfer_service>` + - :mod:`airflow.gcp.hooks.cloud_storage_transfer_service` + - :mod:`airflow.gcp.operators.cloud_storage_transfer_service` + - :mod:`airflow.gcp.sensors.cloud_storage_transfer_service` + + * - `Cloud Tasks <https://cloud.google.com/tasks/>`__ + - + - :mod:`airflow.gcp.hooks.tasks` + - :mod:`airflow.gcp.operators.tasks` + - + + * - `Cloud Text-to-Speech <https://cloud.google.com/text-to-speech/>`__ + - :doc:`How to use <howto/operator/gcp/speech>` + - :mod:`airflow.gcp.hooks.text_to_speech` + - :mod:`airflow.gcp.operators.text_to_speech` + - + + * - `Cloud Translation <https://cloud.google.com/translate/>`__ + - :doc:`How to use <howto/operator/gcp/translate>` + - :mod:`airflow.gcp.hooks.translate` + - :mod:`airflow.gcp.operators.translate` + - + + * - `Cloud Video Intelligence <https://cloud.google.com/video_intelligence/>`__ + - :doc:`How to use <howto/operator/gcp/video_intelligence>` + - :mod:`airflow.gcp.hooks.video_intelligence` + - :mod:`airflow.gcp.operators.video_intelligence` + - + + * - `Cloud Vision <https://cloud.google.com/vision/>`__ + - :doc:`How to use <howto/operator/gcp/vision>` + - :mod:`airflow.gcp.hooks.vision` + - :mod:`airflow.gcp.operators.vision` + - + + +Transfer operators and hooks +'''''''''''''''''''''''''''' + +These integrations allow you to copy data from/to Google Cloud Platform. + +.. list-table:: + :header-rows: 1 + + * - Source + - Destination + - Guide + - Operators + + * - + .. 
_integration:GCP-Discovery-ref: + + All services :ref:`[1] <integration:GCP-Discovery>` + - `Amazon Simple Storage Service (S3) <https://aws.amazon.com/s3/>`__ + - + - :mod:`airflow.operators.google_api_to_s3_transfer` + + * - `Azure Data Lake Storage <https://azure.microsoft.com/pl-pl/services/storage/data-lake-storage/>`__ + - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - + - :mod:`airflow.operators.adls_to_gcs` + + * - `Amazon Simple Storage Service (S3) <https://aws.amazon.com/s3/>`__ + - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - :doc:`How to use <howto/operator/gcp/cloud_storage_transfer_service>` + - :mod:`airflow.contrib.operators.s3_to_gcs_operator`, + :mod:`airflow.gcp.operators.cloud_storage_transfer_service` + + * - `Google BigQuery <https://cloud.google.com/bigquery/>`__ + - `Google BigQuery <https://cloud.google.com/bigquery/>`__ + - + - :mod:`airflow.operators.bigquery_to_bigquery` + + * - `Google BigQuery <https://cloud.google.com/bigquery/>`__ + - `Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - + - :mod:`airflow.operators.bigquery_to_gcs` + + * - `BigQuery <https://cloud.google.com/bigquery/>`__ + - `MySQL <https://www.mysql.com/>`__ + - + - :mod:`airflow.operators.bigquery_to_mysql` + + * - `Apache Cassandra <http://cassandra.apache.org/>`__ + - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - + - :mod:`airflow.operators.cassandra_to_gcs` + + * - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - `Google BigQuery <https://cloud.google.com/bigquery/>`__ + - + - :mod:`airflow.operators.gcs_to_bq` + + * - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - :doc:`How to use <howto/operator/gcp/gcs_to_gcs>`, + :doc:`How to use <howto/operator/gcp/cloud_storage_transfer_service>` + - :mod:`airflow.operators.gcs_to_gcs`, + :mod:`airflow.gcp.operators.cloud_storage_transfer_service` + 
+ * - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - `Amazon Simple Storage Service (S3) <https://aws.amazon.com/s3/>`__ + - + - :mod:`airflow.operators.gcs_to_s3` + + * - Local + - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - + - :mod:`airflow.operators.local_to_gcs` + + * - `Microsoft SQL Server (MSSQL) <https://www.microsoft.com/pl-pl/sql-server/sql-server-downloads>`__ + - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - + - :mod:`airflow.operators.mssql_to_gcs` + + * - `MySQL <https://www.mysql.com/>`__ + - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - + - :mod:`airflow.operators.mysql_to_gcs` + + * - `PostgreSQL <https://www.postgresql.org/>`__ + - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - + - :mod:`airflow.operators.postgres_to_gcs` + + * - SQL + - `Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - + - :mod:`airflow.operators.sql_to_gcs` + + * - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - `Google Drive <https://www.google.com/drive/>`__ + - + - :mod:`airflow.contrib.operators.gcs_to_gdrive_operator` + + +.. _integration:GCP-Discovery: + +:ref:`[1] <integration:GCP-Discovery-ref>` These discovery-based operators use +:class:`airflow.gcp.hooks.discovery_api.GoogleDiscoveryApiHook` to communicate with Google +Services via the `Google API Python Client <https://github.com/googleapis/google-api-python-client>`__. +Please note that this library is in maintenance mode, so it won't fully support GCP in the future. +Therefore, it is recommended that you use the custom GCP Service Operators for working with the Google +Cloud Platform. + +.. note:: + You can learn how to use GCP integrations by analyzing the + `source code <https://github.com/apache/airflow/tree/master/airflow/gcp/example_dags/>`_ of the particular example DAGs. + +Other operators and hooks +''''''''''''''''''''''''' + +..
list-table:: + :header-rows: 1 + + * - Guide + - Operators + - Hooks + + * - :doc:`How to use <howto/operator/gcp/translate-speech>` + - :mod:`airflow.gcp.operators.translate_speech` + - + + * - + - + - :mod:`airflow.gcp.hooks.discovery_api` + +.. _service: + +Service integrations +-------------------- + +Service operators and hooks +''''''''''''''''''''''''''' + +These integrations allow you to perform various operations within various services. + +.. list-table:: + :header-rows: 1 + + * - Service name + - Guide + - Hook + - Operators + - Sensors + + * - `Atlassian Jira <https://www.atlassian.com/pl/software/jira>`__ + - + - :mod:`airflow.contrib.hooks.jira_hook` + - :mod:`airflow.contrib.operators.jira_operator` + - :mod:`airflow.contrib.sensors.jira_sensor` + + * - `Databricks <https://databricks.com/>`__ + - + - :mod:`airflow.contrib.hooks.databricks_hook` + - :mod:`airflow.contrib.operators.databricks_operator` + - + + * - `Datadog <https://www.datadoghq.com/>`__ + - + - :mod:`airflow.contrib.hooks.datadog_hook` + - + - :mod:`airflow.contrib.sensors.datadog_sensor` + + + * - `Dingding <https://oapi.dingtalk.com>`__ + - :doc:`How to use <howto/operator/dingding>` + - :mod:`airflow.contrib.hooks.dingding_hook` + - :mod:`airflow.contrib.operators.dingding_operator` + - + + * - `Discord <https://discordapp.com>`__ + - + - :mod:`airflow.contrib.hooks.discord_webhook_hook` + - :mod:`airflow.contrib.operators.discord_webhook_operator` + - + + * - `Google Drive <https://www.google.com/drive/>`__ + - + - :mod:`airflow.contrib.hooks.gdrive_hook` + - + - + + * - `Google Spreadsheet <https://www.google.com/intl/en/sheets/about/>`__ + - + - :mod:`airflow.gcp.hooks.gsheets` + - + - + + * - `IBM Cloudant <https://www.ibm.com/cloud/cloudant>`__ + - + - :mod:`airflow.contrib.hooks.cloudant_hook` + - + - + + * - `Jenkins <https://jenkins.io/>`__ + - + - :mod:`airflow.contrib.hooks.jenkins_hook` + - :mod:`airflow.contrib.operators.jenkins_job_trigger_operator` + - + + * - 
`Opsgenie <https://www.opsgenie.com/>`__ + - + - :mod:`airflow.contrib.hooks.opsgenie_alert_hook` + - :mod:`airflow.contrib.operators.opsgenie_alert_operator` + - + + * - `Qubole <https://www.qubole.com/>`__ + - + - :mod:`airflow.contrib.hooks.qubole_hook`, + :mod:`airflow.contrib.hooks.qubole_check_hook` + - :mod:`airflow.contrib.operators.qubole_operator`, + :mod:`airflow.contrib.operators.qubole_check_operator` + - :mod:`airflow.contrib.sensors.qubole_sensor` + + * - `Salesforce <https://www.salesforce.com/>`__ + - + - :mod:`airflow.contrib.hooks.salesforce_hook` + - + - + + * - `Segment <https://segment.com/>`__ + - + - :mod:`airflow.contrib.hooks.segment_hook` + - :mod:`airflow.contrib.operators.segment_track_event_operator` + - + + * - `Slack <https://slack.com/>`__ + - + - :mod:`airflow.hooks.slack_hook`, + :mod:`airflow.contrib.hooks.slack_webhook_hook` + - :mod:`airflow.operators.slack_operator`, + :mod:`airflow.contrib.operators.slack_webhook_operator` + - + + * - `Snowflake <https://www.snowflake.com/>`__ + - + - :mod:`airflow.contrib.hooks.snowflake_hook` + - :mod:`airflow.contrib.operators.snowflake_operator` + - + + * - `Vertica <https://www.vertica.com/>`__ + - + - :mod:`airflow.contrib.hooks.vertica_hook` + - :mod:`airflow.contrib.operators.vertica_operator` + - + + * - `Zendesk <https://www.zendesk.com/>`__ + - + - :mod:`airflow.hooks.zendesk_hook` + - + - + +Transfer operators and hooks +'''''''''''''''''''''''''''' + +These integrations allow you to copy data. + +..
list-table:: + :header-rows: 1 + + * - Source + - Destination + - Guide + - Operators + + * - `Vertica <https://www.vertica.com/>`__ + - `MySQL <https://www.mysql.com/>`__ + - + - :mod:`airflow.contrib.operators.vertica_to_mysql` + + * - `Vertica <https://www.vertica.com/>`__ + - `Apache Hive <https://hive.apache.org/>`__ + - + - :mod:`airflow.contrib.operators.vertica_to_hive` + + * - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - `Google Drive <https://www.google.com/drive/>`__ + - + - :mod:`airflow.contrib.operators.gcs_to_gdrive_operator` + +.. _software: + +Software integrations +--------------------- + +Software operators and hooks +'''''''''''''''''''''''''''' + +These integrations allow you to perform various operations using various software. + +.. list-table:: + :header-rows: 1 + + * - Service name + - Guide + - Hook + - Operators + - Sensors + + * - `Celery <http://www.celeryproject.org/>`__ + - + - + - + - :mod:`airflow.contrib.sensors.celery_queue_sensor` + + * - `Docker <https://docs.docker.com/install/>`__ + - + - :mod:`airflow.hooks.docker_hook` + - :mod:`airflow.operators.docker_operator`, + :mod:`airflow.contrib.operators.docker_swarm_operator` + - + + * - `GNU Bash <https://www.gnu.org/software/bash/>`__ + - :doc:`How to use <howto/operator/bash>` + - + - :mod:`airflow.operators.bash_operator` + - :mod:`airflow.contrib.sensors.bash_sensor` + + * - `Kubernetes <https://kubernetes.io/>`__ + - :doc:`How to use <howto/operator/kubernetes>` + - + - :mod:`airflow.contrib.operators.kubernetes_pod_operator` + - + + * - `Microsoft SQL Server (MSSQL) <https://www.microsoft.com/pl-pl/sql-server/sql-server-downloads>`__ + - + - :mod:`airflow.hooks.mssql_hook` + - :mod:`airflow.operators.mssql_operator` + - + + * - `MongoDB <https://www.mongodb.com/what-is-mongodb>`__ + - + - :mod:`airflow.contrib.hooks.mongo_hook` + - + - :mod:`airflow.contrib.sensors.mongo_sensor` + + + * - `MySQL <https://www.mysql.com/products/>`__ + - + - 
:mod:`airflow.hooks.mysql_hook` + - :mod:`airflow.operators.mysql_operator` + - + + * - `OpenFaaS <https://www.openfaas.com/>`__ + - + - :mod:`airflow.contrib.hooks.openfaas_hook` + - + - + + * - `Oracle <https://www.oracle.com/pl/database/>`__ + - + - :mod:`airflow.hooks.oracle_hook` + - :mod:`airflow.operators.oracle_operator` + - + + * - `Papermill <https://github.com/nteract/papermill>`__ + - :doc:`How to use <howto/operator/papermill>` + - + - :mod:`airflow.operators.papermill_operator` + - + + * - `PostgreSQL <https://www.postgresql.org/>`__ + - + - :mod:`airflow.hooks.postgres_hook` + - :mod:`airflow.operators.postgres_operator` + - + + * - `Presto <http://prestodb.github.io/>`__ + - + - :mod:`airflow.hooks.presto_hook` + - :mod:`airflow.operators.presto_check_operator` + - + + * - `Python <https://www.python.org>`__ + - + - + - :mod:`airflow.operators.python_operator` + - :mod:`airflow.contrib.sensors.python_sensor` + + * - `Redis <https://redis.io/>`__ + - + - :mod:`airflow.contrib.hooks.redis_hook` + - :mod:`airflow.contrib.operators.redis_publish_operator` + - :mod:`airflow.contrib.sensors.redis_pub_sub_sensor`, + :mod:`airflow.contrib.sensors.redis_key_sensor` + + * - `Samba <https://www.samba.org/>`__ + - + - :mod:`airflow.hooks.samba_hook` + - + - + + * - `SQLite <https://www.sqlite.org/index.html>`__ + - + - :mod:`airflow.hooks.sqlite_hook` + - :mod:`airflow.operators.sqlite_operator` + - + + +Transfer operators and hooks +'''''''''''''''''''''''''''' + +These integrations allow you to copy data. + +..
list-table:: + :header-rows: 1 + + * - Source + - Destination + - Guide + - Operators + + * - `Oracle <https://www.oracle.com/pl/database/>`__ + - `Azure Data Lake Storage <https://azure.microsoft.com/en-us/services/storage/data-lake-storage/>`__ + - + - :mod:`airflow.contrib.operators.oracle_to_azure_data_lake_transfer` + + * - `Oracle <https://www.oracle.com/pl/database/>`__ + - `Oracle <https://www.oracle.com/pl/database/>`__ + - + - :mod:`airflow.contrib.operators.oracle_to_oracle_transfer` + + * - `BigQuery <https://cloud.google.com/bigquery/>`__ + - `MySQL <https://www.mysql.com/>`__ + - + - :mod:`airflow.operators.bigquery_to_mysql` + + * - `Microsoft SQL Server (MSSQL) <https://www.microsoft.com/pl-pl/sql-server/sql-server-downloads>`__ + - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - + - :mod:`airflow.operators.mssql_to_gcs` + + * - `Microsoft SQL Server (MSSQL) <https://www.microsoft.com/pl-pl/sql-server/sql-server-downloads>`__ + - `Apache Hive <https://hive.apache.org/>`__ + - + - :mod:`airflow.operators.mssql_to_hive` + + * - `MySQL <https://www.mysql.com/>`__ + - `Apache Hive <https://hive.apache.org/>`__ + - + - :mod:`airflow.operators.mysql_to_hive` + + * - `MySQL <https://www.mysql.com/>`__ + - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - + - :mod:`airflow.operators.mysql_to_gcs` + + * - `PostgreSQL <https://www.postgresql.org/>`__ + - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - + - :mod:`airflow.operators.postgres_to_gcs` + + * - SQL + - `Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - + - :mod:`airflow.operators.sql_to_gcs` + + * - `Vertica <https://www.vertica.com/>`__ + - `Apache Hive <https://hive.apache.org/>`__ + - + - :mod:`airflow.contrib.operators.vertica_to_hive` + + * - `Vertica <https://www.vertica.com/>`__ + - `MySQL <https://www.mysql.com/>`__ + - + - :mod:`airflow.contrib.operators.vertica_to_mysql` + + * - `Presto <https://prestodb.github.io/>`__ + -
`MySQL <https://www.mysql.com/>`__ + - + - :mod:`airflow.operators.presto_to_mysql` + + * - `Apache Hive <https://hive.apache.org/>`__ + - `Samba <https://www.samba.org/>`__ + - + - :mod:`airflow.operators.hive_to_samba_operator` + + +.. _protocol: + +Protocol integrations +--------------------- + +Protocol operators and hooks +'''''''''''''''''''''''''''' + +These integrations allow you to perform various operations within various services using standardized +communication protocols or interfaces. + +.. list-table:: + :header-rows: 1 + + * - Service name + - Guide + - Hook + - Operators + - Sensors + + * - `Internet Message Access Protocol (IMAP) <https://tools.ietf.org/html/rfc3501>`__ + - + - :mod:`airflow.contrib.hooks.imap_hook` + - + - :mod:`airflow.contrib.sensors.imap_attachment_sensor` + + * - `Secure Shell (SSH) <https://tools.ietf.org/html/rfc4251>`__ + - + - :mod:`airflow.contrib.hooks.ssh_hook` + - :mod:`airflow.contrib.operators.ssh_operator` + - + + * - Filesystem + - + - :mod:`airflow.contrib.hooks.fs_hook` + - + - :mod:`airflow.contrib.sensors.file_sensor` + + * - `SSH File Transfer Protocol (SFTP) <https://tools.ietf.org/wg/secsh/draft-ietf-secsh-filexfer/>`__ + - + - :mod:`airflow.contrib.hooks.sftp_hook` + - :mod:`airflow.contrib.operators.sftp_operator` + - :mod:`airflow.contrib.sensors.sftp_sensor` + + * - `File Transfer Protocol (FTP) <https://tools.ietf.org/html/rfc114>`__ + - + - :mod:`airflow.contrib.hooks.ftp_hook` + - + - :mod:`airflow.contrib.sensors.ftp_sensor` + + * - `Hypertext Transfer Protocol (HTTP) <https://www.w3.org/Protocols/>`__ + - + - :mod:`airflow.hooks.http_hook` + - :mod:`airflow.operators.http_operator` + - :mod:`airflow.sensors.http_sensor` + + * - `gRPC <https://grpc.io/>`__ + - + - :mod:`airflow.contrib.hooks.grpc_hook` + - :mod:`airflow.contrib.operators.grpc_operator` + - + + * - `Simple Mail Transfer Protocol (SMTP) <https://tools.ietf.org/html/rfc821>`__ + - + - + - :mod:`airflow.operators.email_operator` + - + +
* - `Java Database Connectivity (JDBC) <https://docs.oracle.com/javase/8/docs/technotes/guides/jdbc/>`__ + - + - :mod:`airflow.hooks.jdbc_hook` + - :mod:`airflow.operators.jdbc_operator` + - + + * - `Windows Remote Management (WinRM) <https://docs.microsoft.com/en-gb/windows/win32/winrm/portal>`__ + - + - :mod:`airflow.contrib.hooks.winrm_hook` + - :mod:`airflow.contrib.operators.winrm_operator` + - + +Transfer operators and hooks +'''''''''''''''''''''''''''' + +These integrations allow you to copy data. + +.. list-table:: + :header-rows: 1 + + * - Source + - Destination + - Guide + - Operators + + * - Filesystem + - `Azure Blob Storage <https://azure.microsoft.com/en-us/services/storage/blobs/>`__ + - + - :mod:`airflow.contrib.operators.file_to_wasb` + + * - Filesystem + - `Google Cloud Storage (GCS) <https://cloud.google.com/gcs/>`__ + - + - :mod:`airflow.operators.local_to_gcs` + + * - `Internet Message Access Protocol (IMAP) <https://tools.ietf.org/html/rfc3501>`__ + - `Amazon Simple Storage Service (S3) <https://aws.amazon.com/s3/>`__ + - + - :mod:`airflow.contrib.operators.imap_attachment_to_s3_operator` + + * - `Amazon Simple Storage Service (S3) <https://aws.amazon.com/s3/>`_ + - `SSH File Transfer Protocol (SFTP) <https://tools.ietf.org/wg/secsh/draft-ietf-secsh-filexfer/>`__ + - + - :mod:`airflow.contrib.operators.s3_to_sftp_operator` + + * - `SSH File Transfer Protocol (SFTP) <https://tools.ietf.org/wg/secsh/draft-ietf-secsh-filexfer/>`__ + - `Amazon Simple Storage Service (S3) <https://aws.amazon.com/s3/>`_ + - + - :mod:`airflow.contrib.operators.sftp_to_s3_operator`
