[
https://issues.apache.org/jira/browse/BEAM-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877014#comment-16877014
]
Tural Neymanov commented on BEAM-2801:
--------------------------------------
Hello,
We are still encountering the "Too many sources provided: XXXXX. Limit is
10000." error when attempting to write a large amount of data to BQ (using
around 3.5k concurrent cores in Dataflow). Our assumption was that the new
sink would resolve this issue on its own and that we wouldn't need to
partition the output ourselves (similar to this
[hack|https://stackoverflow.com/a/44401828]).
Were we incorrect in assuming that the new sink would fix this error, or is
there a bug of some sort? For reference, a minimal sketch of how we invoke
the write is below.
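(The `--experiments=use_beam_bq_sink` opt-in and the `method` parameter are
our assumptions about how the new sink is selected; table, schema, and bucket
names are placeholders.)

{code:python}
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Other required Dataflow options (project, region, temp_location, ...)
# are omitted here for brevity.
options = PipelineOptions(
    runner='DataflowRunner',
    experiments=['use_beam_bq_sink'],  # assumed opt-in for the new sink
)

with beam.Pipeline(options=options) as p:
    lines = p | 'Read' >> beam.io.ReadFromText('gs://my-bucket/input/*')
    rows = lines | 'Parse' >> beam.Map(lambda ln: {'name': ln, 'score': 0})
    rows | 'WriteToBQ' >> beam.io.WriteToBigQuery(
        'my-project:my_dataset.my_table',
        schema='name:STRING,score:INTEGER',
        create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
        write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        # Batch load jobs rather than streaming inserts, which is where we
        # expected the per-load-job quota handling to apply.
        method=beam.io.WriteToBigQuery.Method.FILE_LOADS)
{code}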
Thanks,
Tural
> Implement a BigQuery custom sink
> --------------------------------
>
> Key: BEAM-2801
> URL: https://issues.apache.org/jira/browse/BEAM-2801
> Project: Beam
> Issue Type: New Feature
> Components: sdk-py-core
> Reporter: Chamikara Jayalath
> Assignee: Pablo Estrada
> Priority: Major
> Fix For: 2.12.0
>
>
> Currently the Python SDK has a native (Dataflow) BigQuery sink. We need to
> implement a custom BigQuery sink to support the following:
> * overcome BigQuery per-load-job quotas by executing multiple load jobs
> * support SDK-level features such as data-dependent writes
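For illustration of the data-dependent writes mentioned above (not part of
the original description): a minimal sketch, assuming the custom sink's
WriteToBigQuery accepts a callable that maps each element to its destination
table. Project, dataset, and field names are placeholders.

{code:python}
import apache_beam as beam

def route_to_table(row):
    # Data-dependent destination: derive the table from the element itself.
    return 'my-project:events.%s' % row['event_type']

with beam.Pipeline() as p:
    events = p | beam.Create([
        {'event_type': 'click', 'user': 'a'},
        {'event_type': 'view', 'user': 'b'},
    ])
    events | beam.io.WriteToBigQuery(
        table=route_to_table,  # callable instead of a fixed table spec
        schema='event_type:STRING,user:STRING',
        method=beam.io.WriteToBigQuery.Method.FILE_LOADS)
{code}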
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)