[
https://issues.apache.org/jira/browse/BEAM-13606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17470831#comment-17470831
]
Ning commented on BEAM-13606:
-----------------------------
# The Beam side never retries or otherwise handles a row mutation whose
returned Status is valid but not OK.
# The `_do_mutate_retryable_rows` function raises _BigtableRetryableError when
running into RETRYABLE_MUTATION_ERRORS. In that case, responses are not
populated at all and statuses remain None. [1]
Both sides need a fix to make this right:
Beam should retry retryable errors and handle non-retryable errors.
Bigtable should tell the client about retryable errors instead of suppressing them.
[1]
https://github.com/googleapis/python-bigtable/blob/fec06fcd28c36d0d3b347b43d1f3d264e5f5aa39/google/cloud/bigtable/table.py#L1139
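To make the two cases above concrete, here is a minimal sketch of how a caller could classify per-row statuses. The `Status` tuple and the `classify` helper are hypothetical stand-ins, not part of Beam or python-bigtable, and the retryable code set (DEADLINE_EXCEEDED, ABORTED, UNAVAILABLE) is an assumption about what RETRYABLE_MUTATION_ERRORS covers. The point is that a None status means the write is unconfirmed and must not be read as success.

```python
from collections import namedtuple

# Hypothetical stand-in for the per-row status object; only `.code` is used.
Status = namedtuple("Status", ["code", "message"])

OK = 0
# Assumed retryable gRPC codes: DEADLINE_EXCEEDED (4), ABORTED (10),
# UNAVAILABLE (14).
RETRYABLE_CODES = frozenset({4, 10, 14})


def classify(status):
    """Return 'ok', 'retry', or 'fail' for one row's status."""
    if status is None:
        # No status was populated (the suppressed-error case): the write
        # is unconfirmed, so treat it as retryable rather than as success.
        return "retry"
    if status.code == OK:
        return "ok"
    if status.code in RETRYABLE_CODES:
        return "retry"
    return "fail"


statuses = [
    Status(0, ""),
    None,
    Status(14, "unavailable"),
    Status(3, "invalid argument"),
]
outcomes = [classify(s) for s in statuses]
print(outcomes)
```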
> bigtable io doesn't handle non-ok row mutations
> -----------------------------------------------
>
> Key: BEAM-13606
> URL: https://issues.apache.org/jira/browse/BEAM-13606
> Project: Beam
> Issue Type: Bug
> Components: io-py-gcp
> Reporter: Ning
> Assignee: Ning
> Priority: P1
>
> bigtable io has no logic to retry row mutations for rows with a non-ok return
> status (this includes the None return value when bigtable suppresses retryable
> errors; see BEAM-13602 for details).
>
> To avoid data loss, the solution should be:
> # Retry the row mutations that failed with retryable errors;
> # Emit the row mutations that failed with non-retryable errors to a tagged output.
> Or clarify in the docstrings that the I/O doesn't handle failed row mutations.
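The proposed solution could look roughly like the following sketch, which is not the actual Beam implementation. `write_rows`, `mutate_fn`, `FakeStatus`, and `flaky_mutate` are hypothetical names; `mutate_fn` stands in for a batch mutate call that returns one status per row, and the retryable codes (4, 10, 14) are an assumption. Rows with a None or retryable status are retried with backoff, while everything that cannot be written is returned to the caller instead of being dropped.

```python
import time


def write_rows(rows, mutate_fn, max_attempts=3, backoff_secs=0.0,
               retryable_codes=(4, 10, 14)):
    """Write rows, retrying unconfirmed ones; return (written, failed).

    mutate_fn(rows) is a hypothetical stand-in for a batch mutate call. It
    must return one status per row: None, or an object with a `.code`.
    """
    written, failed = [], []
    pending = [(row, None) for row in rows]
    for attempt in range(max_attempts):
        if not pending:
            break
        if attempt:
            # Exponential backoff between attempts.
            time.sleep(backoff_secs * 2 ** (attempt - 1))
        statuses = mutate_fn([row for row, _ in pending])
        retry = []
        for (row, _), status in zip(pending, statuses):
            if status is None or status.code in retryable_codes:
                retry.append((row, status))
            elif status.code == 0:
                written.append(row)
            else:
                failed.append((row, status))  # non-retryable failure
        pending = retry
    # Rows still unconfirmed after the last attempt are surfaced too.
    failed.extend(pending)
    return written, failed


class FakeStatus:
    def __init__(self, code):
        self.code = code


calls = {"n": 0}


def flaky_mutate(rows):
    """Fake backend: 'bad' always fails, 'flaky' is unconfirmed once."""
    calls["n"] += 1
    out = []
    for row in rows:
        if row == "bad":
            out.append(FakeStatus(3))
        elif row == "flaky" and calls["n"] == 1:
            out.append(None)
        else:
            out.append(FakeStatus(0))
    return out


written, failed = write_rows(["good", "flaky", "bad"], flaky_mutate)
print(written, [(row, status.code) for row, status in failed])
```

In a Beam DoFn, the `failed` pairs would be the natural candidates for a tagged output (for example via `apache_beam.pvalue.TaggedOutput`), so that downstream steps can log or reprocess them.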