[
https://issues.apache.org/jira/browse/BEAM-8960?focusedWorklogId=363217&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363217
]
ASF GitHub Bot logged work on BEAM-8960:
----------------------------------------
Author: ASF GitHub Bot
Created on: 25/Dec/19 07:52
Start Date: 25/Dec/19 07:52
Worklog Time Spent: 10m
Work Description: reuvenlax commented on pull request #10427:
[BEAM-8960]: Add an option for user to opt out of using insert id for BigQuery
streaming insert.
URL: https://github.com/apache/beam/pull/10427#discussion_r361274528
##########
File path:
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
##########
@@ -2241,6 +2246,14 @@ static String getExtractDestinationUri(String extractDestinationDir) {
return toBuilder().setIgnoreUnknownValues(true).build();
}
+ /**
+ * Performs streaming insert without insert id. Insert id is used to offer best effort insert
+ * deduplication. Default is false, which always inserts with insert id.
+ */
+ public Write<T> ignoreInsertIds() {
Review comment:
Maybe rename this to withIgnoreInsertId for consistency?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 363217)
Remaining Estimate: 22h 10m (was: 22h 20m)
Time Spent: 1h 50m (was: 1h 40m)
> Add an option for user to be able to opt out of using insert id for BigQuery
> streaming insert.
> ----------------------------------------------------------------------------------------------
>
> Key: BEAM-8960
> URL: https://issues.apache.org/jira/browse/BEAM-8960
> Project: Beam
> Issue Type: New Feature
> Components: io-java-gcp
> Reporter: Yiru Tang
> Priority: Minor
> Original Estimate: 24h
> Time Spent: 1h 50m
> Remaining Estimate: 22h 10m
>
> BigQuery streaming insert ids offer best-effort insert deduplication. If users
> choose to opt out of using insert ids, they could potentially be opted into
> our new streaming backend, which gives higher speed and more quota. Insert id
> deduplication is best effort and does not provide a strict exactly-once
> guarantee.
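
To make the opt-out described above concrete, here is a minimal, self-contained sketch of how such a flag might behave. It is not Beam's BigQueryIO code: the names InsertIdSketch, WriteConfig, Row and buildRows are hypothetical and exist only for illustration, and generating the id with a random UUID is an assumption. Only the method name ignoreInsertIds and the default of false come from the diff quoted in this message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.UUID;

/** Hypothetical sketch of an insert-id opt-out; not Beam's implementation. */
final class InsertIdSketch {

  /** Immutable write configuration carrying the opt-out flag (hypothetical). */
  static final class WriteConfig {
    final boolean ignoreInsertIds;

    private WriteConfig(boolean ignoreInsertIds) {
      this.ignoreInsertIds = ignoreInsertIds;
    }

    /** Default configuration: every row gets an insert id. */
    static WriteConfig create() {
      return new WriteConfig(false);
    }

    /** Returns a copy of the configuration that omits insert ids. */
    WriteConfig ignoreInsertIds() {
      return new WriteConfig(true);
    }
  }

  /** One row of a streaming insert request; insertId is null when omitted. */
  static final class Row {
    final String insertId;
    final String payload;

    Row(String insertId, String payload) {
      this.insertId = insertId;
      this.payload = payload;
    }
  }

  /** Builds request rows, attaching an insert id only when the user has not opted out. */
  static List<Row> buildRows(WriteConfig config, List<String> payloads) {
    List<Row> rows = new ArrayList<>();
    for (String payload : payloads) {
      // Assumption: a random UUID stands in for whatever id a real sink would generate.
      String insertId = config.ignoreInsertIds ? null : UUID.randomUUID().toString();
      rows.add(new Row(insertId, payload));
    }
    return rows;
  }

  public static void main(String[] args) {
    List<String> payloads = Arrays.asList("{\"a\":1}", "{\"a\":2}");
    for (Row row : buildRows(WriteConfig.create().ignoreInsertIds(), payloads)) {
      System.out.println("insertId=" + row.insertId + " payload=" + row.payload);
    }
  }
}
```

With the default configuration each row carries a distinct id that a backend could use for best-effort deduplication. After calling ignoreInsertIds() the rows carry none, so duplicates caused by retries would not be filtered.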
--
This message was sent by Atlassian Jira
(v8.3.4#803005)