Kamil Gałuszka created BEAM-10647:
-------------------------------------

             Summary: BigQueryIO BigQueryWrapper.get_query_location can end up 
in permission issue
                 Key: BEAM-10647
                 URL: https://issues.apache.org/jira/browse/BEAM-10647
             Project: Beam
          Issue Type: Bug
          Components: io-py-gcp
            Reporter: Kamil Gałuszka


This bug is non-deterministic because of how the Google BigQuery API behaves, 
but let me try to describe the problem, since we spent two full days hunting 
it down.

Imagine that you have a dataset containing table XYZ. You add to that dataset 
an Authorized View that references a table in a project you don't have access 
to. You can query that table only via the Authorized View.

Unfortunately, get_query_location runs the following to determine the location 
in which to write temp_dataset:
{code:python}
referenced_tables = response.statistics.query.referencedTables
if referenced_tables:  # Guards against both non-empty and non-None
    table = referenced_tables[0]
    location = self.get_table_location(
        table.projectId, table.datasetId, table.tableId)
{code}
The issue with that code is that referenced_tables does not point to where the 
view lives; it gives you the underlying table behind that Authorized View.

So if that underlying table comes first in the result (and the implementation 
of get_query_location only looks at the first entry), you get a permission 
error saying that you cannot retrieve the dataset, which is correct! The user 
has access to the Authorized View, which they can query, but not to the 
underlying table.
Therefore, the implementation should be changed to loop through the referenced 
tables until it finds a location.
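
A minimal sketch of that change, assuming the lookup is passed in as a callable 
and that a denied lookup raises an exception. The names find_query_location 
and PermissionDenied are hypothetical and not part of Beam; the real code 
would catch whatever error the BigQuery client raises on a 403.

```python
from types import SimpleNamespace


class PermissionDenied(Exception):
    """Hypothetical stand-in for the error raised on a 403 response."""


def find_query_location(referenced_tables, get_table_location):
    """Return the location of the first referenced table we can read.

    `get_table_location` is a callable taking (project_id, dataset_id,
    table_id). Tables whose lookup raises PermissionDenied are skipped.
    """
    for table in referenced_tables or []:
        try:
            return get_table_location(
                table.projectId, table.datasetId, table.tableId)
        except PermissionDenied:
            # Likely the table behind an Authorized View; try the next one.
            continue
    return None  # No accessible table; the caller must pick a fallback.


# Usage with fake data: the first table is inaccessible, the second is not.
def fake_get_table_location(project_id, dataset_id, table_id):
    if project_id == "restricted-project":
        raise PermissionDenied(project_id)
    return "EU"


tables = [
    SimpleNamespace(projectId="restricted-project", datasetId="d", tableId="t"),
    SimpleNamespace(projectId="my-project", datasetId="d", tableId="XYZ"),
]
location = find_query_location(tables, fake_get_table_location)
```

Returning None when every lookup is denied leaves the fallback decision to the 
caller instead of failing the whole query.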

My main point boils down to:
* Via Authorized Views, you can be handed a table (and its dataset) that you 
don't have access to.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
