[
https://issues.apache.org/jira/browse/BEAM-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16619823#comment-16619823
]
Chamikara Jayalath commented on BEAM-5426:
------------------------------------------
In that case, how about keeping track of load jobs for different destinations
and failing the job if we detect two load jobs for the same destination? We
should find a way to actively fail in this case, since it currently results in
silent data loss.
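
To make the suggestion above concrete, here is a minimal in-memory sketch of the
check. LoadJobDestinationTracker and register are hypothetical names, not
existing Beam classes or methods, and destinations and tables are reduced to
plain strings. It assumes a single place sees every (destination, table) pair
before load jobs are issued, which a distributed pipeline would have to arrange
separately.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of Beam): remembers which destination claimed each table. */
final class LoadJobDestinationTracker {
  // Table spec (for example "project:dataset.table") -> destination key that first claimed it.
  private final Map<String, String> ownerByTable = new HashMap<>();

  /**
   * Records that destinationKey loads into tableSpec.
   * Throws if a different destination key has already claimed the same table.
   */
  void register(String destinationKey, String tableSpec) {
    String previous = ownerByTable.putIfAbsent(tableSpec, destinationKey);
    if (previous != null && !previous.equals(destinationKey)) {
      throw new IllegalStateException(
          "Destinations '" + previous + "' and '" + destinationKey
              + "' both resolve to table '" + tableSpec
              + "'; failing instead of silently dropping data");
    }
  }
}
```

Registering the same destination twice is accepted, so retries of one
destination do not trip the check; only a second, different destination for the
same table fails.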
> Use both destination and TableDestination for BQ load job IDs
> -------------------------------------------------------------
>
> Key: BEAM-5426
> URL: https://issues.apache.org/jira/browse/BEAM-5426
> Project: Beam
> Issue Type: Improvement
> Components: io-java-gcp
> Reporter: Chamikara Jayalath
> Priority: Major
>
> Currently we use TableDestination when creating a unique load job ID for a
> destination:
> [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java#L359]
>
> This can cause data loss if a user returns the same TableDestination for
> different destination IDs. I think we can prevent this if we include both
> the destination and the TableDestination in the BQ load job ID.
>
> CC: [~reuvenlax]
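
The idea in the quoted description, deriving the load job ID from both values,
could look roughly like the sketch below. LoadJobIds, jobId, and the string
parameters are hypothetical and are not the code in BigQueryHelpers; the sketch
only assumes that the destination and the table can each be rendered as a
stable string.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper (not Beam's BigQueryHelpers): job IDs derived from both identifiers. */
final class LoadJobIds {
  private LoadJobIds() {}

  /** Returns prefix + "_" + SHA-256 hex digest over the destination key and the table spec. */
  static String jobId(String prefix, String destinationKey, String tableSpec) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      update(digest, destinationKey);
      update(digest, tableSpec);
      StringBuilder hex = new StringBuilder();
      for (byte b : digest.digest()) {
        hex.append(String.format("%02x", b));
      }
      return prefix + "_" + hex;
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-256 is not available", e);
    }
  }

  // Length-prefix each part so that ("ab", "c") and ("a", "bc") hash differently.
  private static void update(MessageDigest digest, String part) {
    byte[] bytes = part.getBytes(StandardCharsets.UTF_8);
    digest.update(ByteBuffer.allocate(4).putInt(bytes.length).array());
    digest.update(bytes);
  }
}
```

With this scheme two different destinations that return the same table get
different job IDs, while a retry for the same destination and table reproduces
the same ID. The hex digest keeps the suffix to letters and digits.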