spbail commented on a change in pull request #11113:
URL: https://github.com/apache/airflow/pull/11113#discussion_r504902263
##########
File path: airflow/providers/google/cloud/operators/bigquery.py
##########
@@ -2109,3 +2122,287 @@ def execute(self, context: Any):
     def on_kill(self):
         if self.job_id and self.cancel_on_kill:
             self.hook.cancel_job(job_id=self.job_id, project_id=self.project_id, location=self.location)
+
+
+class GreatExpectationsValidations(enum.Enum):
+    SQL = "SQL"
+    TABLE = "TABLE"
+
+
+class GreatExpectationsBigQueryOperator(GreatExpectationsBaseOperator):
+    """
+    Use Great Expectations to validate data expectations against a BigQuery table or the result of a SQL query.
+    The expectations need to be stored in a JSON file sitting in an accessible GCS bucket. The validation results
+    are output to GCS in both JSON and HTML formats.
+    Here's the current list of expectations types:
+    https://docs.greatexpectations.io/en/latest/reference/glossary_of_expectations.html
+    Here's how to create expectations files:
+    https://docs.greatexpectations.io/en/latest/guides/tutorials/how_to_create_expectations.html
+    :param gcp_project: The GCP project which houses the GCS buckets where the expectations files are stored
+        and where the validation files & data docs will be output (e.g. HTML docs showing if the data matches
+        expectations).
+    :type gcp_project: str
+    :param expectations_file_name: The name of the JSON file containing the expectations for the data.
Review comment:
Just noticed this here: is there a specific reason for using the file name
rather than the expectation suite name? Using the suite name would be more in
line with how we've been referring to expectation suites, and you should be
able to load the expectation suite from the context by name.
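
To make the suggestion concrete, here is a minimal sketch of a name-based lookup. `load_suite_by_name` and `InMemoryContext` are hypothetical names used only for illustration; the stub stands in for a Great Expectations data context, which as far as I know exposes `get_expectation_suite(expectation_suite_name)`. It is not the code in this PR, and the suite name shown is made up.

```python
from typing import Any, Dict


class InMemoryContext:
    """Hypothetical stand-in for a Great Expectations data context.

    It mimics only the single method used below, so the sketch runs
    without Great Expectations installed.
    """

    def __init__(self, suites: Dict[str, Any]) -> None:
        self._suites = dict(suites)

    def get_expectation_suite(self, expectation_suite_name: str) -> Any:
        # Look the suite up by name; fail loudly if it is not registered.
        if expectation_suite_name not in self._suites:
            raise ValueError(
                f"No expectation suite named {expectation_suite_name!r}"
            )
        return self._suites[expectation_suite_name]


def load_suite_by_name(context: Any, expectation_suite_name: str) -> Any:
    """Load a suite from the context by name instead of by JSON file name.

    Hypothetical helper: assumes `context` exposes
    get_expectation_suite(expectation_suite_name).
    """
    return context.get_expectation_suite(expectation_suite_name)


# Usage with the stub context and a made-up suite name.
context = InMemoryContext({"my_table.warning": {"expectations": []}})
suite = load_suite_by_name(context, "my_table.warning")
```

With this shape the operator would take something like an `expectation_suite_name` parameter (again an assumed name) and leave locating the underlying JSON to the context's expectation store.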