[
https://issues.apache.org/jira/browse/BEAM-13376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17491183#comment-17491183
]
Ning edited comment on BEAM-13376 at 2/11/22, 10:30 PM:
--------------------------------------------------------
[~wangrobert] In that case, this ticket doesn't sound like a bug in the
bigtable client. The client is just not being used as intended.
The bigtable client bug in BEAM-13606 is very generic and could affect any
code that invokes row mutation, including their own status for-loop code
snippet.
The problems on the bigtable client side are:
* the client suppresses retryable errors, causing data loss because the invoker
(Beam I/O) doesn't know which rows failed.
* when retryable errors are suppressed, None is returned instead of a Status. So
the for-loop in the [sample
code](https://github.com/googleapis/python-bigtable/blob/main/samples/snippets/writes/write_batch.py#L40)
will fail because it tries to access `None.code`.
* the client's worker raises the retryable errors that are later suppressed. Note
that a retryable error (such as deadline_exceeded or unavailable) for one row in
a batch pollutes the other rows in that batch: those rows could have succeeded
or run into non-retryable errors. But once a retryable error is raised, the
worker try-except-breaks at that point and retries the whole batch; eventually
the worker gives up, bubbles up the error, suppresses it, and returns None for
all rows in that batch, leaving the whole system in a corrupted state.
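To illustrate the second point, here is a minimal sketch of a status loop that tolerates a None entry instead of crashing on `None.code`. It does not call the real client: `FakeStatus` and `classify_statuses` are hypothetical names invented for this example, and the only assumptions carried over are that each per-row result has `code` and `message` attributes, that code 0 means success, and that None may appear in place of a Status. Note this only avoids the crash; it does not recover the information the client has already suppressed.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class FakeStatus:
    """Hypothetical stand-in for a per-row status with .code and .message."""
    code: int
    message: str = ""


def classify_statuses(
    statuses: Sequence[Optional[FakeStatus]],
) -> Tuple[List[int], List[int], List[int]]:
    """Group row indices as (succeeded, failed, unknown)."""
    succeeded: List[int] = []
    failed: List[int] = []
    unknown: List[int] = []
    for i, status in enumerate(statuses):
        if status is None:
            # No status came back for this row, so its outcome is unknown.
            # Treat it as "not confirmed written" rather than as a success.
            unknown.append(i)
        elif status.code == 0:
            succeeded.append(i)
        else:
            failed.append(i)
    return succeeded, failed, unknown


if __name__ == "__main__":
    # Simulated response: one success, one suppressed error, one failure.
    statuses = [FakeStatus(0), None, FakeStatus(5, "example failure")]
    ok, bad, unknown = classify_statuses(statuses)
    print("succeeded:", ok, "failed:", bad, "unknown:", unknown)
```

A caller such as Beam I/O could then surface the rows in the "unknown" group as errors instead of silently dropping them.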
> Missing error for nonexistent column family BigTable
> ----------------------------------------------------
>
> Key: BEAM-13376
> URL: https://issues.apache.org/jira/browse/BEAM-13376
> Project: Beam
> Issue Type: Bug
> Components: io-java-gcp
> Reporter: PierreOberholzer
> Priority: P1
>
> Currently, BigTable throws no error when the Column Families are not
> defined at write time. This behavior is misleading, because the user
> believes the job has completed even though the table is empty.
> A bug was raised on BigTable:
> [https://issuetracker.google.com/issues/186053077?pli=1]
> But we should also make sure that the Beam IO logs this error appropriately.