[
https://issues.apache.org/jira/browse/BEAM-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877014#comment-16877014
]
Tural Neymanov commented on BEAM-2801:
--------------------------------------
Hello,
We are still encountering the "Too many sources provided: XXXXX. Limit is
10000." error when attempting to write a large amount of data to BQ (using
around 3.5k concurrent cores in Dataflow). Our assumption was that the new
sink would resolve this issue on its own and that we wouldn't need to
partition the output ourselves (similar to this
[hack|https://stackoverflow.com/a/44401828]).
Were we incorrect in assuming that the new sink would fix this error, or is
there a bug of some sort? For reference, a minimal sketch of how we invoke
the write is below.
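(The `--experiments=use_beam_bq_sink` opt-in and the `method` parameter are
our assumptions about how the new sink is selected; table, schema, and bucket
names are placeholders.)

{code:python}
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Other required Dataflow options (project, region, temp_location, ...)
# are omitted here for brevity.
options = PipelineOptions(
    runner='DataflowRunner',
    experiments=['use_beam_bq_sink'],  # assumed opt-in for the new sink
)

with beam.Pipeline(options=options) as p:
    lines = p | 'Read' >> beam.io.ReadFromText('gs://my-bucket/input/*')
    rows = lines | 'Parse' >> beam.Map(lambda ln: {'name': ln, 'score': 0})
    rows | 'WriteToBQ' >> beam.io.WriteToBigQuery(
        'my-project:my_dataset.my_table',
        schema='name:STRING,score:INTEGER',
        create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
        write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        # Batch load jobs rather than streaming inserts, which is where we
        # expected the per-load-job quota handling to apply.
        method=beam.io.WriteToBigQuery.Method.FILE_LOADS)
{code}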
Thanks,
Tural
> Implement a BigQuery custom sink
> --------------------------------
>
> Key: BEAM-2801
> URL: https://issues.apache.org/jira/browse/BEAM-2801
> Project: Beam
> Issue Type: New Feature
> Components: sdk-py-core
> Reporter: Chamikara Jayalath
> Assignee: Pablo Estrada
> Priority: Major
> Fix For: 2.12.0
>
>
> Currently the Python SDK has a native (Dataflow) BigQuery sink. We need to
> implement a custom BigQuery sink to support the following:
> * overcome BigQuery per-load-job quotas by executing multiple load jobs
> * support SDK-level features such as data-dependent writes
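For illustration of the data-dependent writes mentioned above (not part of
the original description): a minimal sketch, assuming the custom sink's
WriteToBigQuery accepts a callable that maps each element to its destination
table. Project, dataset, and field names are placeholders.

{code:python}
import apache_beam as beam

def route_to_table(row):
    # Data-dependent destination: derive the table from the element itself.
    return 'my-project:events.%s' % row['event_type']

with beam.Pipeline() as p:
    events = p | beam.Create([
        {'event_type': 'click', 'user': 'a'},
        {'event_type': 'view', 'user': 'b'},
    ])
    events | beam.io.WriteToBigQuery(
        table=route_to_table,  # callable instead of a fixed table spec
        schema='event_type:STRING,user:STRING',
        method=beam.io.WriteToBigQuery.Method.FILE_LOADS)
{code}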
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)