reuvenlax commented on a change in pull request #16906:
URL: https://github.com/apache/beam/pull/16906#discussion_r816328786
##########
File path:
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/StorageApiWritesShardedRecords.java
##########
@@ -404,22 +428,27 @@ public String toString() {
"Got error " + failedContext.getError() + " closing " +
failedContext.streamName);
clearClients.accept(contexts);
appendFailures.inc();
- if (statusCode.equals(Code.OUT_OF_RANGE) ||
statusCode.equals(Code.ALREADY_EXISTS)) {
+ // This means that the offset we have stored does not match the
current end of
+ // the stream in the Storage API. Usually this happens because a
crash or a bundle
+ // failure
+ // happened after an append but before the worker could
checkpoint it's
+ // state. The records that were appended in a failed bundle will
be retried,
+ // meaning that the unflushed tail of the stream must be
discarded to prevent
+ // duplicates.
+ boolean offsetMismatch =
+ statusCode.equals(Code.OUT_OF_RANGE) ||
statusCode.equals(Code.ALREADY_EXISTS);
+ // This implies that the stream doesn't exist or has already
been finalized. In this
+ // case we have no
+ // choice but to create a new stream.
+ boolean streamDoesntExist =
statusCode.equals(Code.INVALID_ARGUMENT);
Review comment:
@yirutang is it true that FAILED_PRECONDITION can also be returned?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]