Hey Taher,

Regarding the first question, which API Beam uses depends on the BigQuery
write method you set in the connector's configuration. There are four
different write methods, and a high-level description of each can be found
in the documentation:
https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.Write.Method.html.
At this time, we discourage using the streaming inserts API and recommend
file loads or the Storage Write API instead.
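
For example, here is a minimal sketch of selecting the Storage Write API
method in a Java pipeline (the table spec, stream count, and triggering
frequency below are placeholders you'd tune for your own job; "rows" is
assumed to be a PCollection<TableRow>):

    import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
    import org.joda.time.Duration;

    // Write a PCollection<TableRow> to BigQuery via the Storage Write API.
    // "my-project:my_dataset.my_table" is a placeholder table spec.
    rows.apply(
        BigQueryIO.writeTableRows()
            .to("my-project:my_dataset.my_table")
            .withMethod(BigQueryIO.Write.Method.STORAGE_WRITE_API)
            // In exactly-once mode on an unbounded source, the Storage Write
            // API needs a triggering frequency and a number of write streams.
            .withTriggeringFrequency(Duration.standardSeconds(10))
            .withNumStorageWriteApiStreams(4));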

For the second question: yes, there is a chance you can hit the maximum
quota. When that happens, Beam will just wait a little and then retry the
write operation. FYI, the Storage Write API quota [1] is 3 GB/s per
project, compared to streaming inserts' 1 GB/s [2].

[1] https://cloud.google.com/bigquery/quotas#write-api-limits
[2] https://cloud.google.com/bigquery/quotas#streaming_inserts

On Thu, Feb 22, 2024 at 8:57 AM Taher Koitawala <taher...@gmail.com> wrote:

> Hi All,
>           I want to ask questions regarding sinking a very high volume
> stream to BigQuery.
>
> I will read messages from a Pub/Sub topic and write to BigQuery. In this
> streaming job I am worried about hitting the BigQuery streaming inserts
> limit of 1 GB per second on streaming API writes.
>
> I am firstly unsure whether Beam uses that API or writes files to a temp
> directory and commits at intervals, which brings me to another question: do
> I have to use windowing to save myself from hitting the 1 GB per second
> limit?
>
> Please advise. Thanks
>
