TobiasHammarstrom opened a new issue, #35795:
URL: https://github.com/apache/airflow/issues/35795

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### What happened
   
   The function [run_grant_dataset_view_access](https://airflow.apache.org/docs/apache-airflow-providers-google/10.10.1/_api/airflow/providers/google/cloud/hooks/bigquery/index.html#airflow.providers.google.cloud.hooks.bigquery.BigQueryHook.run_grant_dataset_view_access) does not correctly detect that the view has already been granted access to the dataset, which contradicts what the docs specify:
   
   > If this view has already been granted access to the dataset, do nothing
   
   ### What you think should happen instead
   
   The function should skip granting access if the view has already been granted access to the dataset.
   
   ### How to reproduce
   
   Use GCP Composer and create a DAG with a BigQueryHook. Call run_grant_dataset_view_access to grant a view in dataset A access to dataset B when the view already has access to dataset B.
   
   ### Operating System
   
   composer-2.5.1-airflow-2.6.3
   
   ### Versions of Apache Airflow Providers
   
   The ones included in the above installation (the relevant client library is google-cloud-bigquery==3.12.0)
   
   ### Deployment
   
   Google Cloud Composer
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   Investigation:
   
   The logs include "Granting table xxx authorized view access to xxx dataset.", which indicates that [this if-statement](https://github.com/apache/airflow/blob/b71c14c74a2715009bbd2134a81d32d3f41f7e1e/airflow/providers/google/cloud/hooks/bigquery.py#L1129) evaluates to True.
   
   When running a Python script locally with the same version of **google-cloud-bigquery**, I found that the AccessEntry object [fetched from the dataset](https://github.com/apache/airflow/blob/b71c14c74a2715009bbd2134a81d32d3f41f7e1e/airflow/providers/google/cloud/hooks/bigquery.py#L1126) does not match the [created AccessEntry object](https://github.com/apache/airflow/blob/b71c14c74a2715009bbd2134a81d32d3f41f7e1e/airflow/providers/google/cloud/hooks/bigquery.py#L1120). The mismatch occurs in the `_properties` dict: the fetched AccessEntry contains only one entry, _view_, whereas the created one contains two entries, _view_ and _role_. The following script shows the problem:
   
   ```python
   from copy import deepcopy

   from google.cloud import bigquery
   from google.cloud.bigquery.dataset import AccessEntry

   # Placeholders: replace with your own values before running.
   PROJECT = "<project>"
   LOCATION = "<location>"
   DATASET = "<dataset>"
   TABLE = "<table>"
   EXISTING_VIEW_INDEX = 0  # placeholder: index of the existing view in access_entries


   def make_view_access():
       # AccessEntry for the view, created with role=None.
       return AccessEntry(
           role=None,
           entity_type="view",
           entity_id={"projectId": PROJECT, "datasetId": DATASET, "tableId": TABLE},
       )


   def main():
       client = bigquery.Client(project=PROJECT, location=LOCATION)
       dataset = client.get_dataset(DATASET)
       access_entries = dataset.access_entries

       # Entry as created: _properties holds both "view" and "role" (None).
       view_access = make_view_access()

       # Same entry with the "role" key removed from _properties.
       view_access_no_role = make_view_access()
       del view_access_no_role._properties["role"]

       is_in_access_entries_v1 = view_access in access_entries  # False
       is_in_access_entries_v2 = view_access_no_role in access_entries  # True

       # Add "role": None to a copy of the fetched entry so the created entry matches it.
       existing_view_access = access_entries[EXISTING_VIEW_INDEX]
       existing_view_access_fixed = deepcopy(existing_view_access)
       existing_view_access_fixed._properties["role"] = None
       access_entries.append(existing_view_access_fixed)

       is_in_access_entries_v3 = view_access in access_entries  # True


   if __name__ == "__main__":
       main()
   ```
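
One possible way to make the membership check insensitive to the mismatch described above is to compare entries after dropping keys whose value is None. This is only a minimal sketch: plain dicts stand in for the `_properties` of the created and fetched AccessEntry objects, and the helper names `normalize` and `already_granted` as well as the sample values are hypothetical, not part of Airflow or google-cloud-bigquery.

```python
def normalize(properties):
    """Drop None-valued keys so {'role': None} and a missing 'role' compare equal."""
    return {key: value for key, value in properties.items() if value is not None}


def already_granted(candidate, existing_entries):
    """Return True if candidate matches any existing entry after normalization."""
    target = normalize(candidate)
    return any(normalize(entry) == target for entry in existing_entries)


# Hypothetical sample data modelling the two shapes reported in this issue.
view = {"projectId": "my-project", "datasetId": "dataset_a", "tableId": "my_view"}
created = {"view": view, "role": None}  # shape of the created entry
fetched = [
    {"role": "OWNER", "userByEmail": "owner@example.com"},
    {"view": dict(view)},  # shape of the fetched entry: no "role" key
]

print(created in fetched)                 # False: plain equality sees the extra key
print(already_granted(created, fetched))  # True: None-valued keys are ignored
```

Whether such normalization belongs in the hook or in the client library is left open here.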
   
   
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

