Thanks Cham!
Yes, that solved both problems!!
... now I can actually start writing the data pipeline itself :-)
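
For anyone who finds this thread later, here is a rough sketch along the lines of Cham's suggestion below: read the view through a query rather than as a table, and set the billing project through the pipeline options. The project, dataset, view, and bucket names are placeholders, the helper names are made up for illustration, and the need for --temp_location assumes the default export-based read; treat it as a starting point rather than a tested recipe.

```python
def build_view_query(project, dataset, view):
    """Return a standard-SQL query that selects every row from a view."""
    return f"SELECT * FROM `{project}.{dataset}.{view}`"


def run(query, pipeline_args):
    # Imported lazily so the helper above works without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # Assumption: pipeline_args carries --project=<your own project>, so the
    # query job is created (and billed) there, plus --temp_location (a GCS
    # path) for the temporary files of an export-based read.
    options = PipelineOptions(pipeline_args)
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadView" >> beam.io.ReadFromBigQuery(
                query=query, use_standard_sql=True)
            | "Print" >> beam.Map(print)
        )


# Placeholder names; substitute the vendor's real project, dataset and view.
query = build_view_query("vendor-project", "vendor_dataset", "vendor_view")
# run(query, ["--project=my-project", "--temp_location=gs://my-bucket/tmp"])
```

The account running the pipeline still needs read access to the view in the vendor project; only the query job lives in your own project.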
Thanks!
Mark
On Tue, Oct 12, 2021 at 5:18 PM Chamikara Jayalath <[email protected]>
wrote:
> For #1, I think that to read from a view you need to use a query
> ("SELECT * FROM <view>") via [1] instead of reading it directly as a table.
> For #2, the BQ query job will be billed to the project that is used to
> execute the Dataflow job, which is set through the PipelineOption in [2].
>
> Thanks,
> Cham
>
> [1]
> https://github.com/apache/beam/blob/1ce290bab031192c22f643cac92bd6470788798d/sdks/python/apache_beam/io/gcp/bigquery.py#L495
> [2]
> https://github.com/apache/beam/blob/1ce290bab031192c22f643cac92bd6470788798d/sdks/python/apache_beam/options/pipeline_options.py#L613
>
>
> On Tue, Oct 12, 2021 at 5:03 PM Mark Striebeck <[email protected]>
> wrote:
>
>> The problem with that solution is that I would effectively use the vendor's
>> resources every time I read the data (and they would incur GCP costs). I
>> know that GCP supports a setup where I access the data in the vendor
>> project, but the execution happens in my project.
>>
>> Thanks
>> Mark
>>
>> On Tue, Oct 12, 2021 at 3:27 PM Luke Cwik <[email protected]> wrote:
>>
>>> For #2, I believe you'll want to create an IAM role in your project that
>>> has the appropriate access to the vendor project. You should then launch
>>> your pipeline using the IAM role.
>>>
>>> On Tue, Oct 12, 2021 at 12:06 PM Ahmet Altay <[email protected]> wrote:
>>>
>>>> /cc +Pablo Estrada <[email protected]>
>>>>
>>>> On Tue, Oct 12, 2021 at 11:45 AM Mark Striebeck <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> We need to read from a BigQuery view in a data provider's project. I
>>>>> ran into two issues:
>>>>>
>>>>> 1. Is it possible to use beam.io.ReadFromBigQuery with a view (not a
>>>>> table)?
>>>>> 2. In order to read, the user needs the bigquery.jobs.create permission.
>>>>> But we want to create the job in our project, not in the vendor project
>>>>> (otherwise they will get charged every time we read data).
>>>>>
>>>>> Is that possible with the beam API?
>>>>>
>>>>> Thanks
>>>>> Mark
>>>>>
>>>>