Zhengliang Zhu created AIRFLOW-4894:
---------------------------------------
Summary: Add hook and operator for GCP Data Loss Prevention API
Key: AIRFLOW-4894
URL: https://issues.apache.org/jira/browse/AIRFLOW-4894
Project: Apache Airflow
Issue Type: New Feature
Components: api, gcp, hooks, operators, tests
Affects Versions: 1.10.3
Reporter: Zhengliang Zhu
Assignee: Zhengliang Zhu
Add a hook and an operator for working with the Google Cloud Data Loss
Prevention (DLP) API. The DLP API allows users to inspect or redact sensitive
data in text content or GCP storage locations.
The hook wraps the following API methods, implemented through the Google API
discovery service:
* inspect/deidentify/reidentify for content:
[https://cloud.google.com/dlp/docs/reference/rest/v2/projects.content]
* create/delete/get/list/patch for inspectTemplates:
[https://cloud.google.com/dlp/docs/reference/rest/v2/organizations.inspectTemplates],
[https://cloud.google.com/dlp/docs/reference/rest/v2/projects.inspectTemplates]
* create/delete/get/list/patch for storedInfoTypes:
[https://cloud.google.com/dlp/docs/reference/rest/v2/organizations.storedInfoTypes],
[https://cloud.google.com/dlp/docs/reference/rest/v2/projects.storedInfoTypes]
* create/list/get/delete/cancel for dlpJobs:
[https://cloud.google.com/dlp/docs/reference/rest/v2/projects.dlpJobs]
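As a rough illustration of how one of the content methods listed above maps onto a discovery-based client, here is a minimal sketch. The function name inspect_content and its parameters are hypothetical and are not taken from the PR; service is assumed to be a client object built with googleapiclient.discovery.build("dlp", "v2").

```python
from typing import Any, Dict, List


def inspect_content(service: Any, project_id: str, text: str,
                    info_types: List[str]) -> Dict[str, Any]:
    """Call projects.content.inspect through a discovery-built DLP client.

    `service` is assumed to come from googleapiclient.discovery.build("dlp", "v2");
    this helper is illustrative only, not the hook's actual interface.
    """
    # Request body follows the projects.content.inspect REST reference:
    # the text to scan goes in "item", the detectors in "inspectConfig".
    body = {
        "item": {"value": text},
        "inspectConfig": {"infoTypes": [{"name": name} for name in info_types]},
    }
    request = service.projects().content().inspect(
        parent="projects/{}".format(project_id), body=body
    )
    return request.execute()
```

The other resources (inspectTemplates, storedInfoTypes, dlpJobs) would presumably follow the same pattern, with the resource chain and method name swapped.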
The operator creates a long-running DLP job (for storage inspection or risk
analysis), then polls its status until the job is done, canceled, or
deleted.
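The polling behavior described above could look roughly like the sketch below. It is not the operator's actual code: wait_for_dlp_job, get_job, poll_interval, and timeout are hypothetical names, get_job stands in for a hook call that fetches the job resource, and a return value of None is assumed to mean the job was deleted. Treating FAILED as a terminal state is also an assumption added here.

```python
import time
from typing import Any, Callable, Dict, Optional

# Terminal values of the DlpJob "state" field; FAILED is included as an assumption.
TERMINAL_STATES = {"DONE", "CANCELED", "FAILED"}


def wait_for_dlp_job(
    get_job: Callable[[str], Optional[Dict[str, Any]]],
    job_name: str,
    poll_interval: float = 10.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Dict[str, Any]]:
    """Poll a DLP job until it reaches a terminal state or disappears.

    `get_job` is a hypothetical callable returning the job resource as a dict,
    or None if the job no longer exists (assumed to mean it was deleted).
    """
    deadline = time.monotonic() + timeout
    while True:
        job = get_job(job_name)
        if job is None:
            return None  # job was deleted while we were waiting
        if job.get("state") in TERMINAL_STATES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError("DLP job {} did not finish in time".format(job_name))
        sleep(poll_interval)
```

Passing sleep as a parameter keeps the loop easy to unit test without real delays.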
In addition to unit tests, the change was also tested locally at the DAG level
(the test DAG is not included in the PR).
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)