alekseiloginov opened a new pull request #20882:
URL: https://github.com/apache/airflow/pull/20882


   ### Description
   
   This PR adds new Airflow operators that use 
[Looker](https://cloud.google.com/looker)'s new [Start/Stop 
API](https://docs.looker.com/reference/api-and-integration/api-reference/v4.0/derived-table#start_a_pdt_materialization)
 for Persistent Derived Tables 
([PDTs](https://docs.looker.com/data-modeling/learning-lookml/derived-tables)). 
These operators make it possible to manage PDT materialization via the [Looker 
API](https://docs.looker.com/reference/api-and-integration/api-reference/v4.0).
   
   Currently supported operations are:
   - Start PDT build
   - Check PDT build status
   - Stop PDT build
   
   Under the hood, these new operators use the [Looker 
SDK](https://pypi.org/project/looker-sdk/), a Python client for the Looker API.
   
   _Note_: Looker PDT operators are being released as part of the [Google Cloud 
operators](https://airflow.apache.org/docs/apache-airflow-providers-google/1.0.0/operators/cloud/index.html).
 Although Looker is now part of GCP, it is still in the process of migrating to 
the GCP infrastructure. For that reason, the [Google Cloud 
connection](https://airflow.apache.org/docs/apache-airflow-providers-google/stable/connections/gcp.html)
 can't be used for now, and a custom _Looker-specific_ Airflow connection needs to be created.
   
   ### Changes
   
   This PR adds a new Looker operator, sensor, and hook, along with example 
DAGs, tests, docs, and a logo. 
   It also updates the provider and setup files.
   
   New classes:
   - `LookerStartPdtBuildOperator`
     - submits a job to start a PDT build, either synchronously or asynchronously
     - in _synchronous_ mode, also checks the state of the submitted PDT build
   - `LookerCheckPdtBuildSensor`
     - checks the state of a PDT build submitted in _asynchronous_ mode
   - `LookerHook`
     - configures a connection to Looker SDK using Airflow connection settings
     - handles communication between the operator/sensor and Looker SDK
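
The difference between the two modes can be sketched without Airflow or the Looker SDK. The following is a minimal illustration only, not the code added in this PR: `FakeLookerClient`, its `"running"` status, the id format, and the function signatures are all hypothetical stand-ins chosen for the example.

```python
import time
from typing import Optional


class FakeLookerClient:
    """Hypothetical stand-in for a Looker API client (not the real SDK)."""

    def __init__(self, polls_until_done: int = 2) -> None:
        # Number of status checks that report "running" before "done".
        self.remaining_polls = polls_until_done

    def start_pdt_build(self, model: str, view: str) -> str:
        # Returns an id for the build job; the format is made up here.
        return f"{model}.{view}:1"

    def check_pdt_build(self, materialization_id: str) -> str:
        if self.remaining_polls > 0:
            self.remaining_polls -= 1
            return "running"
        return "done"


def wait_for_job(
    client: FakeLookerClient,
    materialization_id: str,
    wait_time: float = 0.0,
    timeout: Optional[float] = None,
) -> None:
    """Poll the build status until it finishes, fails, or times out."""
    started = time.monotonic()
    while True:
        status = client.check_pdt_build(materialization_id)
        if status == "done":
            return
        if status in ("error", "cancelled"):
            raise RuntimeError(f"PDT build {materialization_id}: {status}")
        if timeout is not None and time.monotonic() - started > timeout:
            raise TimeoutError(f"PDT build {materialization_id} timed out")
        time.sleep(wait_time)


def start_pdt_build(
    client: FakeLookerClient, model: str, view: str, asynchronous: bool = False
) -> str:
    """Start a build; block until it completes unless asynchronous is set."""
    materialization_id = client.start_pdt_build(model, view)
    if not asynchronous:
        wait_for_job(client, materialization_id)
    return materialization_id
```

In this sketch the asynchronous call returns the id immediately, so a separate sensor-like task can poll the status later, while the synchronous call returns only once the build has reached `done`.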
   
   ### Tests
   
   This PR adds the following unit tests:
   - `TestLookerStartPdtBuildOperator`
     - tests synchronous/asynchronous execution of `LookerStartPdtBuildOperator`
     - tests the `cancel_on_kill` flag, which specifies whether to cancel the 
PDT build when a task is manually killed
   - `TestLookerCheckPdtBuildSensor`
     - tests possible build states: `done`, `error`, `wait`, `cancelled`
   - `TestLookerHook`
     - tests `wait_for_job` used by `LookerStartPdtBuildOperator` in 
_synchronous_ mode
     - tests `check_pdt_build` used to get a build status from Looker SDK
     - tests `start_pdt_build` used to start a build via Looker SDK
     - tests `stop_pdt_build` used to stop a build via Looker SDK
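
As a rough illustration of what the sensor state tests exercise, the sketch below maps a build status to a sensor outcome. It is an assumption-based example, not the sensor code from this PR: the `poke` helper, the use of `RuntimeError`, and the treatment of any other status as "keep waiting" are hypothetical.

```python
def poke(status: str) -> bool:
    """Return True when the build is finished, False to keep waiting."""
    if status == "done":
        return True
    if status in ("error", "cancelled"):
        # A failed or cancelled build should fail the task, not be retried.
        raise RuntimeError(f"PDT build ended with status: {status}")
    # Any other status is treated as "still in progress": poke again later.
    return False
```

A unit test for each state then only needs to feed a mocked status into the check and assert on the return value or the raised exception.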
   
   _Note_: There are no system tests since Looker doesn't have an automated 
process to create a test environment.
   
   ### Documentation
   
   The main documentation is in the `looker.rst` file as well as in docstrings.
   
   `looker.rst` contains:
   - High-level information about Looker and the Looker API
   - Prerequisite tasks with examples
   - How to start a PDT materialization job
   
   ### Screenshots
   
   Example of using the Looker PDT operator in _synchronous_ mode (start and 
status are combined in one task):
   
![image](https://user-images.githubusercontent.com/11273622/151217636-dbb46ad8-877f-4838-9f0a-aa6e716a56ec.png)
   
   Example of using the Looker PDT operator in _asynchronous_ mode (start and 
status are separate tasks):
   
![image](https://user-images.githubusercontent.com/11273622/151217521-c776ec4d-4363-4f83-9b8a-ff031ee3397d.png)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
