damccorm opened a new issue, #21080:
URL: https://github.com/apache/beam/issues/21080

   `insertAll` retries forever in a streaming pipeline running on `2.31.0` with `insert_retry_strategy=RetryStrategy.RETRY_NEVER` and `create_disposition=BigQueryDisposition.CREATE_NEVER`.
   
   Found while testing a pipeline's error handling by writing to a table that doesn't exist: no elements arrived in `BigQueryWriteFn.FAILED_ROWS`, and these errors repeated in the logs:
   ```
   
   Error message from worker: generic::unknown: Traceback (most recent call last):
     File "apache_beam/runners/common.py", line 1257, in apache_beam.runners.common.DoFnRunner._invoke_bundle_method
     File "apache_beam/runners/common.py", line 510, in apache_beam.runners.common.DoFnInvoker.invoke_finish_bundle
     File "apache_beam/runners/common.py", line 516, in apache_beam.runners.common.DoFnInvoker.invoke_finish_bundle
     File "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/bigquery.py", line 1268, in finish_bundle
       return self._flush_all_batches()
     File "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/bigquery.py", line 1278, in _flush_all_batches
       for destination in list(self._rows_buffer.keys())
     File "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/bigquery.py", line 1279, in <listcomp>
       if self._rows_buffer[destination]
     File "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/bigquery.py", line 1312, in _flush_batch
       skip_invalid_rows=True)
     File "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/bigquery_tools.py", line 1125, in insert_rows
       project_id, dataset_id, table_id, final_rows, skip_invalid_rows)
     File "/usr/local/lib/python3.7/site-packages/apache_beam/utils/retry.py", line 253, in wrapper
       return fun(*args, **kwargs)
     File "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/bigquery_tools.py", line 637, in _insert_all_rows
       response = self.client.tabledata.InsertAll(request)
     File "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/internal/clients/bigquery/bigquery_v2_client.py", line 795, in InsertAll
       config, request, global_params=global_params)
     File "/usr/local/lib/python3.7/site-packages/apitools/base/py/base_api.py", line 731, in _RunMethod
       return self.ProcessHttpResponse(method_config, http_response, request)
     File "/usr/local/lib/python3.7/site-packages/apitools/base/py/base_api.py", line 737, in ProcessHttpResponse
       self.__ProcessHttpResponse(method_config, http_response, request))
     File "/usr/local/lib/python3.7/site-packages/apitools/base/py/base_api.py", line 604, in __ProcessHttpResponse
       http_response, method_config=method_config, request=request)
   apitools.base.py.exceptions.HttpNotFoundError: HttpError accessing <https://bigquery.googleapis.com/bigquery/v2/projects/<REDACTED>/datasets/testdb__dbo__raw/tables/customers/insertAll?alt=json>:
   response: <{'vary': 'Origin, X-Origin, Referer', 'content-type': 'application/json; charset=UTF-8', 'date': 'Sat, 21 Aug 2021 10:00:13 GMT', 'server': 'ESF', 'cache-control': 'private', 'x-xss-protection': '0', 'x-frame-options': 'SAMEORIGIN', 'transfer-encoding': 'chunked', 'status': '404', 'content-length': '344', '-content-encoding': 'gzip'}>, content <{
     "error": {
       "code": 404,
       "message": "Not found: Table <REDACTED>:testdb__dbo__raw.customers",
       "errors": [
         {
           "message": "Not found: Table <REDACTED>:testdb__dbo__raw.customers",
           "domain": "global",
           "reason": "notFound"
         }
       ],
       "status": "NOT_FOUND"
     }
   }
   ...
   
   ```
   
   Possibly related to BEAM-12362. The pipeline had previously been running on `2.29.0`, which repeatedly logged the following message with no error details:
   ```
   
   There were errors inserting to BigQuery. Will not retry. Errors were []
   
   ```
   
   `2.31.0` logs the errors but ignores the retry strategy, which prevents failed rows from being handled through the `FailedRows` tag.
   
   Imported from Jira 
[BEAM-12783](https://issues.apache.org/jira/browse/BEAM-12783). Original Jira 
may contain additional context.
   Reported by: ajdub980a.

