[ 
https://issues.apache.org/jira/browse/BEAM-9804?focusedWorklogId=549264&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-549264
 ]

ASF GitHub Bot logged work on BEAM-9804:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 07/Feb/21 13:00
            Start Date: 07/Feb/21 13:00
    Worklog Time Spent: 10m 
      Work Description: frankzhao commented on pull request #12960:
URL: https://github.com/apache/beam/pull/12960#issuecomment-774670479


   Regarding (1): BigQuerySource has been deprecated since 2.25.0 (see 
https://github.com/apache/beam/blob/b74fcf7b30d956fb42830d652a57b265a1546973/sdks/python/apache_beam/io/gcp/bigquery.py#L479
 ); users should use ReadFromBigQuery instead, which handles the `temp_dataset` kwarg.
   
   Regarding (2): Yes, we should probably have the same doc comment for 
`temp_dataset` in ReadFromBigQuery.
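
   To illustrate the `temp_dataset` kwarg mentioned in (1), here is a minimal sketch of a query read that reuses an existing dataset for the temporary results. The helper names (`split_dataset_spec`, `make_query_read`) and the "project.dataset" spec format are hypothetical, introduced only for this example. It assumes `apache_beam[gcp]` is installed and that the pipeline has a GCS temp location configured.

```python
from typing import Tuple


def split_dataset_spec(spec: str) -> Tuple[str, str]:
    """Split a hypothetical "project.dataset" string into (project, dataset).

    Dataset IDs cannot contain dots, so we split on the last dot.
    """
    project, sep, dataset = spec.rpartition(".")
    if not sep or not project or not dataset:
        raise ValueError(f"expected 'project.dataset', got {spec!r}")
    return project, dataset


def make_query_read(query: str, temp_dataset_spec: str):
    """Build a ReadFromBigQuery transform that writes temporary query
    results into an existing dataset instead of creating a new one."""
    # Imported lazily so the helper above works without apache_beam installed.
    import apache_beam as beam
    from apache_beam.io.gcp.internal.clients import bigquery

    project, dataset = split_dataset_spec(temp_dataset_spec)
    temp_dataset = bigquery.DatasetReference(
        projectId=project, datasetId=dataset)
    return beam.io.ReadFromBigQuery(
        query=query,
        use_standard_sql=True,
        temp_dataset=temp_dataset,
    )
```

   The transform would then be applied as `p | make_query_read("SELECT ...", "my-project.scratch")`, where the dataset named in the spec must already exist.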


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 549264)
    Time Spent: 3h 20m  (was: 3h 10m)

> beam.io.BigQuerySource needs permissions to create datasets to be able to run 
> queries
> -------------------------------------------------------------------------------------
>
>                 Key: BEAM-9804
>                 URL: https://issues.apache.org/jira/browse/BEAM-9804
>             Project: Beam
>          Issue Type: Bug
>          Components: io-py-gcp
>            Reporter: Jonathan Sulman
>            Priority: P3
>             Fix For: 2.26.0
>
>          Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Based on BEAM-8458, which was closed with a Java fix in 2.20.0. However, the 
> bug still exists in the Python SDK.
> When using BigQuerySource with the query option, BigQueryReader creates a 
> temporary dataset to store the results of the query.
> Therefore, Beam requires permission to create datasets just to be able to 
> run a query. In practice, this means that Beam requires the role 
> bigQuery.User just to run queries, whereas if you use {{from}} (to read from 
> a table), the role bigQuery.jobUser suffices.
> BigQuerySource should have an option to specify an existing dataset in which 
> to write the temporary results of a query, so that the role bigQuery.jobUser 
> would be sufficient.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
