[GitHub] potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators
potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r245064704 ## File path: docs/integration.rst ## @@ -687,11 +687,40 @@ Cloud Spanner Cloud Spanner Operators """ +- :ref:`CloudSpannerInstanceDatabaseDeleteOperator` : deletes an existing database from + a Google Cloud Spanner instance or returns success if the database is missing. +- :ref:`CloudSpannerInstanceDatabaseDeployOperator` : creates a new database in a Google + Cloud instance or returns success if the database already exists. +- :ref:`CloudSpannerInstanceDatabaseUpdateOperator` : updates the structure of a + Google Cloud Spanner database. - :ref:`CloudSpannerInstanceDatabaseQueryOperator` : executes an arbitrary DML query (INSERT, UPDATE, DELETE). -- :ref:`CloudSpannerInstanceDeployOperator` : creates a new Cloud Spanner instance or, - if an instance with the same name exists, updates it. -- :ref:`CloudSpannerInstanceDeleteOperator` : deletes a Cloud Spanner instance. +- :ref:`CloudSpannerInstanceDeployOperator` : creates a new Google Cloud Spanner instance, + or if an instance with the same name exists, updates the instance. +- :ref:`CloudSpannerInstanceDeleteOperator` : deletes a Google Cloud Spanner instance. + + + Review comment: Right. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
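The deploy-style operators listed in this diff share one contract: create the resource if it is missing, otherwise report success without changing anything. A rough sketch of that idempotent pattern is below. This is an illustration, not the actual Airflow implementation: `SpannerStubHook` and `deploy_database` are hypothetical stand-ins for `CloudSpannerHook` and the operator's `execute` method.

```python
class SpannerStubHook:
    """In-memory stand-in for the real CloudSpannerHook (illustration only)."""

    def __init__(self):
        self._databases = {}

    def get_database(self, project_id, instance_id, database_id):
        # Returns None when the database does not exist, mirroring the hook contract.
        return self._databases.get((project_id, instance_id, database_id))

    def create_database(self, project_id, instance_id, database_id, ddl_statements):
        self._databases[(project_id, instance_id, database_id)] = list(ddl_statements)
        return True


def deploy_database(hook, project_id, instance_id, database_id, ddl_statements):
    """Create the database only if it is missing; succeed either way (idempotent)."""
    if hook.get_database(project_id, instance_id, database_id) is None:
        return hook.create_database(project_id, instance_id, database_id,
                                    ddl_statements)
    # Already exists: nothing to do, report success so retries are safe.
    return True
```

Calling `deploy_database` twice with the same arguments succeeds both times, which is what makes a deploy-style operator safe to retry or re-run.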
[GitHub] potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators
potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r245064308 ## File path: airflow/contrib/operators/gcp_spanner_operator.py ## @@ -197,3 +200,201 @@ def execute(self, context): def sanitize_queries(queries): if len(queries) and queries[-1] == '': del queries[-1] + + +class CloudSpannerInstanceDatabaseDeployOperator(BaseOperator): +""" +Creates a new Cloud Spanner database, or if database exists, +the operator does nothing. + + +:param project_id: The ID of the project that owns the Cloud Spanner Database. +:type project_id: str +:param instance_id: The Cloud Spanner instance ID. +:type instance_id: str +:param database_id: The Cloud Spanner database ID. +:type database_id: str +:param ddl_statements: The string list containing DDL for the new database. +:type ddl_statements: [str] +:param gcp_conn_id: The connection ID used to connect to Google Cloud Platform. 
+:type gcp_conn_id: str +""" +# [START gcp_spanner_database_deploy_template_fields] +template_fields = ('project_id', 'instance_id', 'database_id', 'ddl_statements', + 'gcp_conn_id') +template_ext = ('.sql', ) +# [END gcp_spanner_database_deploy_template_fields] + +@apply_defaults +def __init__(self, + project_id, + instance_id, + database_id, + ddl_statements, + gcp_conn_id='google_cloud_default', + *args, **kwargs): +# type: (str, str, str, [str], str, object, object) -> None +self.instance_id = instance_id +self.project_id = project_id +self.database_id = database_id +self.ddl_statements = ddl_statements +self.gcp_conn_id = gcp_conn_id +self._validate_inputs() +self._hook = CloudSpannerHook(gcp_conn_id=gcp_conn_id) +super(CloudSpannerInstanceDatabaseDeployOperator, self).__init__(*args, **kwargs) + +def _validate_inputs(self): +if not self.project_id: +raise AirflowException("The required parameter 'project_id' is empty") +if not self.instance_id: +raise AirflowException("The required parameter 'instance_id' is empty") +if not self.database_id: +raise AirflowException("The required parameter 'database_id' is empty") +if not self.ddl_statements: +raise AirflowException("The required parameter 'ddl_statements' is empty") + +def execute(self, context): +if not self._hook.get_database(self.project_id, + self.instance_id, + self.database_id): +self.log.info("Creating Cloud Spanner database " + "'{}' in project '{}' and instance '{}'". + format(self.database_id, self.project_id, self.instance_id)) +return self._hook.create_database(project_id=self.project_id, + instance_id=self.instance_id, + database_id=self.database_id, + ddl_statements=self.ddl_statements) +else: +self.log.info("The database '{}' in project '{}' and instance '{}'" + " already exists. Nothing to do. Exiting.". 
+ format(self.database_id, self.project_id, self.instance_id)) +return True + + +class CloudSpannerInstanceDatabaseUpdateOperator(BaseOperator): +""" +Updates a Cloud Spanner database with the specified DDL statement. + +:param project_id: The ID of the project that owns the the Cloud Spanner Database. +:type project_id: str +:param instance_id: The Cloud Spanner instance ID. +:type instance_id: str +:param database_id: The Cloud Spanner database ID. +:type database_id: str +:param ddl_statements: The string list containing DDL to apply to the database. +:type ddl_statements: [str] +:param operation_id: (Optional) Unique per database operation id that can + be specified to implement idempotency check. +:type operation_id: [str] +:param gcp_conn_id: The connection ID used to connect to Google Cloud Platform. +:type gcp_conn_id: str +""" +# [START gcp_spanner_database_update_template_fields] +template_fields = ('project_id', 'instance_id', 'database_id', 'ddl_statements', + 'gcp_conn_id') +template_ext = ('.sql', ) +# [END gcp_spanner_database_update_template_fields] + +@apply_defaults +def __init__(self, + project_id, + instance_id, + database_id, + ddl_statements, +
[GitHub] potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators
potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r245064252 ## File path: airflow/contrib/operators/gcp_spanner_operator.py ## (quotes the same gcp_spanner_operator.py diff hunk as the previous comment, ending at the `project_id` parameter of CloudSpannerInstanceDatabaseUpdateOperator's `__init__`) Review comment: Again - if you insist, I am happy to make it, but I think doing it for all operators at once makes
[GitHub] potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators
potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r245064016 ## File path: airflow/contrib/operators/gcp_spanner_operator.py ## (quotes the same diff hunk as the previous comments, ending at the `self.log.info("The database '{}' in project '{}' and instance '{}'"` line) Review comment: Fixed.
[GitHub] potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators
potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r245063993 ## File path: airflow/contrib/operators/gcp_spanner_operator.py ## (quotes the same diff hunk as the previous comments)
[GitHub] potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators
potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r245063552 ## File path: airflow/contrib/operators/gcp_spanner_operator.py ## (quotes the same diff hunk as the previous comments, ending at the `# type: (str, str, str, [str], str, object, object) -> None` comment) Review comment: True. Docstrings work as well here. Removed. While [str] worked for type hints, it did not work with docstrings for Intellij autocomplete and inspections, but with list[str] in docstring it started to work :)
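The exchange above is about where the type information lives: a PEP 484 type-hint comment on the function, versus `:type:` fields in the docstring, where `list[str]` is recognized by IDE inspections while the bare `[str]` form is not. A minimal sketch of the two styles side by side follows; the function body is a trivial placeholder for illustration, not the real operator logic.

```python
def create_database(project_id, instance_id, database_id, ddl_statements):
    # type: (str, str, str, list) -> bool
    """Placeholder showing docstring type fields alongside a type-hint comment.

    :param project_id: The ID of the GCP project that owns the database.
    :type project_id: str
    :param instance_id: The Cloud Spanner instance ID.
    :type instance_id: str
    :param database_id: The Cloud Spanner database ID.
    :type database_id: str
    :param ddl_statements: The DDL statements for the new database.
    :type ddl_statements: list[str]
    :rtype: bool
    """
    # Tools such as PyCharm parse ``list[str]`` in the ``:type:`` field above;
    # the ``[str]`` spelling works in type-hint comments but not in docstrings.
    return all([project_id, instance_id, database_id]) and isinstance(
        ddl_statements, list)
```

Both annotations describe the same signature; keeping them consistent is what the review thread is after.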
[GitHub] potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators
potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r245062825 ## File path: airflow/contrib/operators/gcp_spanner_operator.py ## (quotes the same diff hunk as the previous comments, ending at the `self.log.info("The database '{}' in project '{}' and instance '{}'"` line) Review comment: Make sense.
[GitHub] potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators
potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r245062261 ## File path: airflow/contrib/operators/gcp_spanner_operator.py ## @@ -197,3 +200,201 @@ def execute(self, context): def sanitize_queries(queries): if len(queries) and queries[-1] == '': del queries[-1] + + +class CloudSpannerInstanceDatabaseDeployOperator(BaseOperator): +""" +Creates a new Cloud Spanner database, or if database exists, +the operator does nothing. + + Review comment: Done.
[GitHub] potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators
potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r245062137 ## File path: airflow/contrib/hooks/gcp_spanner_hook.py ## @@ -168,39 +168,175 @@ def delete_instance(self, project_id, instance_id): """ Deletes an existing Cloud Spanner instance. -:param project_id: The ID of the project which owns the instances, tables and data. +:param project_id: The ID of the GCP project that owns the Cloud Spanner database. :type project_id: str -:param instance_id: The ID of the instance. +:param instance_id: The ID of the Cloud Spanner instance. :type instance_id: str """ -client = self.get_client(project_id) -instance = client.instance(instance_id) +instance = self.get_client(project_id).instance(instance_id) try: instance.delete() return True except GoogleAPICallError as e: -self.log.error('An error occurred: %s. Aborting.', e.message) +self.log.error('An error occurred: %s. Exiting.', e.message) +raise e + +def get_database(self, project_id, instance_id, database_id): +# type: (str, str, str) -> Optional[Database] +""" +Retrieves a database in Cloud Spanner. If the database does not exist +in the specified instance, it returns None. + +:param project_id: The ID of the GCP project that owns the Cloud Spanner database. +:type project_id: str +:param instance_id: The ID of the Cloud Spanner instance. +:type instance_id: str +:param database_id: The ID of the database in Cloud Spanner. +:type database_id: str +:return: Database object or None if database does not exist +:rtype: Union[Database, None] +""" + +instance = self.get_client(project_id=project_id).instance( +instance_id=instance_id) +if not instance.exists(): +raise AirflowException("The instance {} does not exist in project {} !". 
+ format(instance_id, project_id)) +database = instance.database(database_id=database_id) +if not database.exists(): +return None +else: +return database + +def create_database(self, project_id, instance_id, database_id, ddl_statements): +# type: (str, str, str, [str]) -> bool +""" +Creates a new database in Cloud Spanner. + +:param project_id: The ID of the GCP project that owns the Cloud Spanner database. +:type project_id: str +:param instance_id: The ID of the Cloud Spanner instance. +:type instance_id: str +:param database_id: The ID of the database to create in Cloud Spanner. +:type database_id: str +:param ddl_statements: The string list containing DDL for the new database. +:type ddl_statements: [str] +:return: True if everything succeeded +:rtype: bool +""" + +instance = self.get_client(project_id=project_id).instance( +instance_id=instance_id) +if not instance.exists(): +raise AirflowException("The instance {} does not exist in project {} !". + format(instance_id, project_id)) +database = instance.database(database_id=database_id, + ddl_statements=ddl_statements) +try: +operation = database.create() # type: Operation +except GoogleAPICallError as e: +self.log.error('An error occurred: %s. Exiting.', e.message) raise e +if operation: +result = operation.result() +self.log.info(result) +return True + +def update_database(self, project_id, instance_id, database_id, ddl_statements, +operation_id=None): +# type: (str, str, str, [str], str) -> bool +""" +Updates DDL of a database in Cloud Spanner. + +:param project_id: The ID of the GCP project that owns the Cloud Spanner database. +:type project_id: str +:param instance_id: The ID of the Cloud Spanner instance. +:type instance_id: str +:param database_id: The ID of the database in Cloud Spanner. +:type database_id: str +:param ddl_statements: The string list containing DDL for the new database. +:type ddl_statements: [str] Review comment: I actually tried to find a standard for that and could not. 
In fact this is not parsed. According to this link, in PyCharm there is no real standard and I see different variants used across Airflow's code: https://www.jetbrains.com/help/pycharm/type-syntax-for-docstrings.html Note that in Python 2 type hint comments
[GitHub] potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators
potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r244109948 ## File path: airflow/contrib/hooks/gcp_spanner_hook.py ## @@ -41,11 +42,12 @@ def __init__(self, def get_client(self, project_id): # type: (str) -> Client """ -Provides a client for interacting with Cloud Spanner API. +Provides a client for interacting with the Cloud Spanner API. -:param project_id: The ID of the project which owns the instances, tables and data. +:param project_id: TThe ID of the GCP project that owns the Cloud Spanner Review comment: OK. I will wait with further rebases until you review all :)
[GitHub] potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators
potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r244038642 ## File path: airflow/contrib/hooks/gcp_spanner_hook.py ## (quotes a near-identical delete_instance/get_database hunk to the one above, ending at get_database's bare `:return:` docstring line) Review comment: Fixed.
[GitHub] potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators
potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r244038274 ## File path: airflow/contrib/hooks/gcp_spanner_hook.py ## @@ -147,15 +149,13 @@ def _apply_to_instance(self, project_id, instance_id, configuration_name, node_c :param func: Method of the instance to be called. :type func: Callable """ -client = self.get_client(project_id) -instance = client.instance(instance_id, - configuration_name=configuration_name, - node_count=node_count, - display_name=display_name) +instance = self.get_client(project_id).instance( Review comment: I'd rather defer hook tests to subsequent PRs (and maybe we solve it in a bit different way). I think it is rather difficult to test the GCP hooks automatically using mocking/classic Airflow unit tests. Please let me know what you think about the approach we use - maybe we are missing something and we can add some unit tests for hooks which could add value :)? We chose a slightly different strategy for testing GCP operators (and hooks): * First of all we try to make hooks as straightforward as possible - basically a 1-1 mapping to an existing library method, except we make it synchronous, waiting for the operation to complete. Then we put pretty much all logic into operators. * What we realised is that all we can mostly test with mocking and unit tests in this case is whether this particular library method has been called, which is a bit redundant - that's why we do not have those hook tests. * Instead we run automated system tests ("we" means the team from my company Polidea - 3 people who work on those operators). We use example DAGs (`example_gcp_spanner.py` in this case) to run them automatically with a real GCP_PROJECT. We even have a way to run them as (skipped by default) unit tests (see for example CloudSpannerExampleDagsTest). We have a separate and quite sophisticated environment for that - we run it automatically in GCP Cloud Build on every push, and this way we test end-to-end whether the operators (and thus hooks) work with real GCP. I am going to try to contribute it very soon to the community (it's already open-sourced and I am polishing it) so that others will be able to do the same with their own GCP projects. You can see it here https://github.com/PolideaInternal/airflow-breeze and I will be happy to involve you for comments/review when we are ready to share it (I think in a couple of days). What do you think?
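The "skipped by default" system-test pattern described above can be sketched with plain `unittest`: the case is collected in every run but only executes when an environment variable points it at a real project. The variable name and class below are assumptions for illustration, not the exact names used by Airflow's CloudSpannerExampleDagsTest.

```python
import os
import unittest

@unittest.skipUnless(
    os.environ.get("GCP_SPANNER_SYSTEM_TEST_PROJECT"),
    "Set GCP_SPANNER_SYSTEM_TEST_PROJECT to run against a real GCP project")
class ExampleSpannerDagSystemTest(unittest.TestCase):
    """Runs only when pointed at a real project; a plain test run skips it."""

    def test_example_dag(self):
        # In the real setup this would execute example_gcp_spanner.py end to end.
        self.assertTrue(os.environ["GCP_SPANNER_SYSTEM_TEST_PROJECT"])

# A default run reports the case as skipped rather than failed:
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleSpannerDagSystemTest)
result = unittest.TestResult()
suite.run(result)
```

The point of the pattern is that CI without GCP credentials stays green (the case is skipped, not failed), while the dedicated environment with a real project exercises the same test end to end.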
[GitHub] potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators
potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r244035912 ## File path: setup.py ## @@ -245,6 +245,7 @@ def write_version(filename=os.path.join(*['airflow', devel = [ 'click==6.7', 'freezegun', +'freezegun', Review comment: indeed. Fixed.