[ https://issues.apache.org/jira/browse/BEAM-7742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pablo Estrada resolved BEAM-7742.
---------------------------------
    Fix Version/s: 2.16.0
       Resolution: Fixed

> BigQuery File Loads to work well with load job size limits
> ----------------------------------------------------------
>
>                 Key: BEAM-7742
>                 URL: https://issues.apache.org/jira/browse/BEAM-7742
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-py-gcp
>            Reporter: Pablo Estrada
>            Assignee: Tanay Tummalapalli
>            Priority: Major
>             Fix For: 2.16.0
>
>          Time Spent: 5h
>  Remaining Estimate: 0h
>
> Load jobs into BigQuery have a number of limitations:
> [https://cloud.google.com/bigquery/quotas#load_jobs]
>
> Currently, the Python BQ sink implemented in `bigquery_file_loads.py` does
> not handle these limitations well. Improvements need to be made to the
> implementation, to:
> * Decide whether to use temp_tables dynamically at pipeline execution
> * Add code to determine when a load job to a single destination needs to be
> partitioned into multiple jobs.
> * When this happens, we definitely need to use temp_tables, in case one
> of the load jobs fails and the pipeline is rerun.
> Tanay, would you be able to look at this?



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
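The partitioning step described in the issue could look roughly like the sketch below: greedily split a destination's files into groups that each stay under per-job limits, so each group becomes its own load job. This is an illustration, not the actual Beam implementation; the function name `partition_files` and the default limit values are assumptions for the example (BigQuery's real quotas are documented at the link above). If more than one partition comes back, the caller knows it must load into temp tables and copy to the final destination afterwards.

```python
def partition_files(files, max_uris=10_000, max_bytes=15 * 2**40):
    """Greedily split (path, size_bytes) pairs into load-job partitions.

    Illustrative sketch: each returned partition respects an assumed cap
    on the number of source URIs per load job and on total bytes per job.
    """
    partitions = []
    current, current_bytes = [], 0
    for path, size in files:
        # Start a new partition if adding this file would exceed a limit.
        if current and (len(current) >= max_uris
                        or current_bytes + size > max_bytes):
            partitions.append(current)
            current, current_bytes = [], 0
        current.append(path)
        current_bytes += size
    if current:
        partitions.append(current)
    return partitions


# Example with tiny limits to show the splitting behaviour:
parts = partition_files([("a", 5), ("b", 5), ("c", 5)],
                        max_uris=2, max_bytes=100)
# parts is [["a", "b"], ["c"]] -- two load jobs, so temp tables are needed.
```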