ahmedabu98 commented on code in PR #17589:
URL: https://github.com/apache/beam/pull/17589#discussion_r869675478


##########
sdks/python/apache_beam/io/gcp/bigquery_test.py:
##########
@@ -482,6 +486,113 @@ def test_temp_dataset_is_configurable(
     delete_table.assert_called_with(
         temp_dataset.projectId, temp_dataset.datasetId, mock.ANY)
 
+  @parameterized.expand([
+      param(exception_type=exceptions.Conflict, error_message='duplicate'),
+      param(
+          exception_type=exceptions.InternalServerError,
+          error_message='internalError'),
+      param(exception_type=exceptions.Forbidden, error_message='accessDenied'),
+      param(
+          exception_type=exceptions.ServiceUnavailable,
+          error_message='backendError'),
+  ])
+  @mock.patch('time.sleep')
+  @mock.patch.object(bigquery_v2_client.BigqueryV2.JobsService, 'Insert')
+  @mock.patch.object(bigquery_v2_client.BigqueryV2.DatasetsService, 'Insert')
+  def test_create_temp_dataset_exception(
+      self,
+      mock_api,
+      unused_mock_query_job,
+      unused_mock,
+      exception_type,
+      error_message):
+    mock_api.side_effect = exception_type(error_message)
+
+    with self.assertRaises(Exception) as exc:
+      with beam.Pipeline() as p:
+        _ = p | ReadFromBigQuery(
+            project='apache-beam-testing',
+            query='SELECT * FROM `project.dataset.table`',
+            gcs_location='gs://temp_location')
+
+    self.assertEqual(16, mock_api.call_count)

Review Comment:
   We expect 16 calls here because [get_or_create_dataset](https://github.com/apache/beam/blob/228fd1a00215aa2e05e74916d5e9beebea9a0206/sdks/python/apache_beam/io/gcp/bigquery_tools.py#L785) (where the dataset Insert API is called) has the `@retry...` decorator, and [create_temporary_dataset](https://github.com/apache/beam/blob/228fd1a00215aa2e05e74916d5e9beebea9a0206/sdks/python/apache_beam/io/gcp/bigquery_tools.py#L874), which calls `get_or_create_dataset`, also has the `@retry...` decorator.
   
   Each `@retry...` decorator allows four attempts, so for each of the four `create_temporary_dataset` attempts we get four `get_or_create_dataset` attempts, i.e. 4 x 4 = 16 dataset Insert calls in total. A sketch of how the nesting multiplies is below.
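   
   A rough standalone sketch of the multiplication (this is a toy retry decorator, not Beam's actual `@retry...` implementation, and the 4-attempts-per-decorator count is an assumption chosen to match the 16 observed calls):

```python
# Sketch only: a toy retry decorator standing in for Beam's @retry... decorator.
# Attempt counts are assumptions chosen to match the 16 calls seen in the test.
calls = {"insert": 0}

def retry(attempts):
    def decorator(fn):
        def wrapper(*args, **kwargs):
            last_exc = None
            for _ in range(attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception as exc:
                    last_exc = exc
            raise last_exc
        return wrapper
    return decorator

@retry(attempts=4)  # stands in for the @retry... on get_or_create_dataset
def get_or_create_dataset():
    calls["insert"] += 1  # stands in for the datasets Insert API call
    raise RuntimeError("backendError")

@retry(attempts=4)  # stands in for the @retry... on create_temporary_dataset
def create_temporary_dataset():
    get_or_create_dataset()

try:
    create_temporary_dataset()
except RuntimeError:
    pass

print(calls["insert"])  # 16: 4 outer attempts x 4 inner attempts each
```

   Running this prints 16, matching `mock_api.call_count` in the test above.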
   Having 16 attempts seems odd, but @pabloem and I thought the nested retries might be by design, to emulate an exponential backoff, and might benefit users. Thoughts on this?
   
   @johnjcasey 
   @chamikaramj 
   @ihji 


