[
https://issues.apache.org/jira/browse/BEAM-14422?focusedWorklogId=768715&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768715
]
ASF GitHub Bot logged work on BEAM-14422:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 10/May/22 21:00
Start Date: 10/May/22 21:00
Worklog Time Spent: 10m
Work Description: ahmedabu98 commented on code in PR #17589:
URL: https://github.com/apache/beam/pull/17589#discussion_r869675478
##########
sdks/python/apache_beam/io/gcp/bigquery_test.py:
##########
@@ -482,6 +486,113 @@ def test_temp_dataset_is_configurable(
delete_table.assert_called_with(
temp_dataset.projectId, temp_dataset.datasetId, mock.ANY)
+  @parameterized.expand([
+      param(exception_type=exceptions.Conflict, error_message='duplicate'),
+      param(
+          exception_type=exceptions.InternalServerError,
+          error_message='internalError'),
+      param(exception_type=exceptions.Forbidden, error_message='accessDenied'),
+      param(
+          exception_type=exceptions.ServiceUnavailable,
+          error_message='backendError'),
+  ])
+  @mock.patch('time.sleep')
+  @mock.patch.object(bigquery_v2_client.BigqueryV2.JobsService, 'Insert')
+  @mock.patch.object(bigquery_v2_client.BigqueryV2.DatasetsService, 'Insert')
+  def test_create_temp_dataset_exception(
+      self,
+      mock_api,
+      unused_mock_query_job,
+      unused_mock,
+      exception_type,
+      error_message):
+    mock_api.side_effect = exception_type(error_message)
+
+    with self.assertRaises(Exception) as exc:
+      with beam.Pipeline() as p:
+        _ = p | ReadFromBigQuery(
+            project='apache-beam-testing',
+            query='SELECT * FROM `project.dataset.table`',
+            gcs_location='gs://temp_location')
+
+    self.assertEqual(16, mock_api.call_count)
Review Comment:
We expect 16 calls here because
[get_or_create_dataset](https://github.com/apache/beam/blob/228fd1a00215aa2e05e74916d5e9beebea9a0206/sdks/python/apache_beam/io/gcp/bigquery_tools.py#L785),
which calls the dataset Insert API, has the `@retry...` decorator, and
[create_temporary_dataset](https://github.com/apache/beam/blob/228fd1a00215aa2e05e74916d5e9beebea9a0206/sdks/python/apache_beam/io/gcp/bigquery_tools.py#L874),
which calls `get_or_create_dataset`, also has the `@retry...` decorator.
Each of the four `create_temporary_dataset` attempts triggers four
`get_or_create_dataset` attempts, giving us a total of 4 x 4 = 16 Insert calls.
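To illustrate how nested retries multiply, here is a minimal sketch. It uses a hypothetical `retry` decorator, not Beam's actual retry utility, and omits sleeping and retry filters. Only the names `get_or_create_dataset` and `create_temporary_dataset` come from the code discussed above; `insert_dataset`, `count_insert_calls`, and the function bodies are stand-ins assumed for illustration.

```python
import functools


def retry(max_attempts):
    """Hypothetical retry decorator: call fn up to max_attempts times."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    # Re-raise only once the final attempt has failed.
                    if attempt == max_attempts - 1:
                        raise
        return wrapper
    return decorator


calls = {"insert": 0}


def insert_dataset():
    """Stand-in for the mocked dataset Insert API; it always fails."""
    calls["insert"] += 1
    raise RuntimeError("backendError")


@retry(max_attempts=4)
def get_or_create_dataset():
    insert_dataset()


@retry(max_attempts=4)
def create_temporary_dataset():
    get_or_create_dataset()


def count_insert_calls():
    """Run the nested retries once and report how often Insert was hit."""
    calls["insert"] = 0
    try:
        create_temporary_dataset()
    except RuntimeError:
        pass
    return calls["insert"]
```

Under these assumptions, each outer attempt exhausts all four inner attempts before failing, so the innermost function is called once per (outer, inner) pair.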
Making 16 calls seems excessive, but @pabloem and I thought this might be by
design to emulate an exponential backoff, and it may benefit users.
Thoughts on this?
@johnjcasey
@chamikaramj
@ihji
Issue Time Tracking
-------------------
Worklog Id: (was: 768715)
Remaining Estimate: 0h
Time Spent: 10m
> ReadFromBigQuery with query requires exception handling tests
> -------------------------------------------------------------
>
> Key: BEAM-14422
> URL: https://issues.apache.org/jira/browse/BEAM-14422
> Project: Beam
> Issue Type: Bug
> Components: io-py-gcp
> Reporter: Ahmed Abualsaud
> Assignee: Ahmed Abualsaud
> Priority: P2
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Tests need to be created to test the behavior of our code when expected
> exceptions come up when querying from BigQuery.