[jira] [Work logged] (BEAM-7856) BigQuery table creation race condition error when executing pipeline on multiple workers
[ https://issues.apache.org/jira/browse/BEAM-7856?focusedWorklogId=299675&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-299675 ]

ASF GitHub Bot logged work on BEAM-7856:

Author: ASF GitHub Bot
Created on: 22/Aug/19 19:38
Start Date: 22/Aug/19 19:38
Worklog Time Spent: 10m
Work Description: angoenka commented on pull request #9396: [BEAM-7856] Re Raise exception for code other than 409
URL: https://github.com/apache/beam/pull/9396

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 299675)
Time Spent: 2h (was: 1h 50m)

> BigQuery table creation race condition error when executing pipeline on multiple workers
>
> Key: BEAM-7856
> URL: https://issues.apache.org/jira/browse/BEAM-7856
> Project: Beam
> Issue Type: Bug
> Components: io-py-gcp
> Reporter: Ankur Goenka
> Assignee: Ankur Goenka
> Priority: Major
> Time Spent: 2h
> Remaining Estimate: 0h
>
> This is a non-fatal issue and, as far as I can tell, it only prints an error in the logs.
> The issue occurs when multiple workers check for and create the BigQuery table at
> the same time, which causes a race condition.
>
> {noformat}
> File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 157, in _execute
>   response = task()
> File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 190, in
>   self._execute(lambda: worker.do_instruction(work), work)
> File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 342, in do_instruction
>   request.instruction_id)
> File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 368, in process_bundle
>   bundle_processor.process_bundle(instruction_id))
> File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/bundle_processor.py", line 593, in process_bundle
>   data.ptransform_id].process_encoded(data.data)
> File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/bundle_processor.py", line 143, in process_encoded
>   self.output(decoded_value)
> File "apache_beam/runners/worker/operations.py", line 255, in apache_beam.runners.worker.operations.Operation.output
>   def output(self, windowed_value, output_index=0):
> File "apache_beam/runners/worker/operations.py", line 256, in apache_beam.runners.worker.operations.Operation.output
>   cython.cast(Receiver, self.receivers[output_index]).receive(windowed_value)
> File "apache_beam/runners/worker/operations.py", line 143, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
>   self.consumer.process(windowed_value)
> File "apache_beam/runners/worker/operations.py", line 593, in apache_beam.runners.worker.operations.DoOperation.process
>   with self.scoped_process_state:
> File "apache_beam/runners/worker/operations.py", line 594, in apache_beam.runners.worker.operations.DoOperation.process
>   delayed_application = self.dofn_receiver.receive(o)
> File "apache_beam/runners/common.py", line 799, in apache_beam.runners.common.DoFnRunner.receive
>   self.process(windowed_value)
> File "apache_beam/runners/common.py", line 805, in apache_beam.runners.common.DoFnRunner.process
>   self._reraise_augmented(exn)
> File "apache_beam/runners/common.py", line 857, in apache_beam.runners.common.DoFnRunner._reraise_augmented
>   raise
> File "apache_beam/runners/common.py", line 803, in apache_beam.runners.common.DoFnRunner.process
>   return self.do_fn_invoker.invoke_process(windowed_value)
> File "apache_beam/runners/common.py", line 610, in apache_beam.runners.common.PerWindowInvoker.invoke_process
>   self._invoke_process_per_window(
> File "apache_beam/runners/common.py", line 682, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
>   output_processor.process_outputs(
> File "apache_beam/runners/common.py", line 903, in apache_beam.runners.common._OutputProcessor.process_outputs
>   def process_outputs(self, windowed_input_element, results):
> File "apache_beam/runners/common.py", line 942, in apache_beam.runners.common._OutputProcessor.process_outputs
>   self.main_receivers.receive(windowed_value)
> File "apache_beam/runners/worker/operations.py", line 143, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
[jira] [Work logged] (BEAM-7856) BigQuery table creation race condition error when executing pipeline on multiple workers
[ https://issues.apache.org/jira/browse/BEAM-7856?focusedWorklogId=299060&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-299060 ]

ASF GitHub Bot logged work on BEAM-7856:

Author: ASF GitHub Bot
Created on: 21/Aug/19 22:57
Start Date: 21/Aug/19 22:57
Worklog Time Spent: 10m
Work Description: chamikaramj commented on issue #9396: [BEAM-7856] Re Raise exception for code other than 409
URL: https://github.com/apache/beam/pull/9396#issuecomment-523680815

LGTM. Thanks.

Issue Time Tracking
---
Worklog Id: (was: 299060)
Time Spent: 1h 50m (was: 1h 40m)
[jira] [Work logged] (BEAM-7856) BigQuery table creation race condition error when executing pipeline on multiple workers
[ https://issues.apache.org/jira/browse/BEAM-7856?focusedWorklogId=299054&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-299054 ]

ASF GitHub Bot logged work on BEAM-7856:

Author: ASF GitHub Bot
Created on: 21/Aug/19 22:55
Start Date: 21/Aug/19 22:55
Worklog Time Spent: 10m
Work Description: angoenka commented on issue #9396: [BEAM-7856] Re Raise exception for code other than 409
URL: https://github.com/apache/beam/pull/9396#issuecomment-523680327

R: @chamikaramj

Issue Time Tracking
---
Worklog Id: (was: 299054)
Time Spent: 1h 40m (was: 1.5h)
[jira] [Work logged] (BEAM-7856) BigQuery table creation race condition error when executing pipeline on multiple workers
[ https://issues.apache.org/jira/browse/BEAM-7856?focusedWorklogId=299053&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-299053 ]

ASF GitHub Bot logged work on BEAM-7856:

Author: ASF GitHub Bot
Created on: 21/Aug/19 22:54
Start Date: 21/Aug/19 22:54
Worklog Time Spent: 10m
Work Description: angoenka commented on pull request #9396: [BEAM-7856] Re Raise exception for code other than 409
URL: https://github.com/apache/beam/pull/9396
[jira] [Work logged] (BEAM-7856) BigQuery table creation race condition error when executing pipeline on multiple workers
[ https://issues.apache.org/jira/browse/BEAM-7856?focusedWorklogId=294340&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-294340 ]

ASF GitHub Bot logged work on BEAM-7856:

Author: ASF GitHub Bot
Created on: 14/Aug/19 02:05
Start Date: 14/Aug/19 02:05
Worklog Time Spent: 10m
Work Description: angoenka commented on pull request #9204: [BEAM-7856] Suppress error on table bigquery table already exists
URL: https://github.com/apache/beam/pull/9204#discussion_r313679387

## File path: sdks/python/apache_beam/io/gcp/bigquery_tools.py ##
@@ -659,12 +659,19 @@ def get_or_create_table(
     if found_table and write_disposition != BigQueryDisposition.WRITE_TRUNCATE:
       return found_table
     else:
-      created_table = self._create_table(
-          project_id=project_id,
-          dataset_id=dataset_id,
-          table_id=table_id,
-          schema=schema or found_table.schema,
-          additional_parameters=additional_create_parameters)
+      created_table = None
+      try:
+        created_table = self._create_table(
+            project_id=project_id,
+            dataset_id=dataset_id,
+            table_id=table_id,
+            schema=schema or found_table.schema,
+            additional_parameters=additional_create_parameters)
+      except HttpError as exn:
+        if exn.status_code == 409:

Review comment: I agree. Using a single-element transform to create the table would be ideal.

Issue Time Tracking
---
Worklog Id: (was: 294340)
Time Spent: 1h 10m (was: 1h)
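The hunk quoted in this review wraps table creation in a try/except and checks for HTTP status 409, and the follow-up PR title asks that any other status be re-raised. A minimal sketch of that pattern is given here, using an in-memory stand-in instead of the BigQuery client. The names `FakeTableService`, its methods, and this `HttpError` class are hypothetical, and re-reading the table after a 409 is an assumption, since the thread does not show what the patch does after the status check.

```python
import threading


class HttpError(Exception):
    """Hypothetical stand-in for an HTTP client error with a status code."""

    def __init__(self, status_code, message=""):
        super().__init__(message or "HTTP %d" % status_code)
        self.status_code = status_code


class FakeTableService:
    """In-memory stand-in for a table service (not the BigQuery client)."""

    def __init__(self):
        self.tables = {}
        self._lock = threading.Lock()

    def get(self, table_id):
        return self.tables.get(table_id)

    def create(self, table_id, schema):
        # The service itself is consistent: a second create is rejected with 409.
        with self._lock:
            if table_id in self.tables:
                raise HttpError(409, "Already Exists: %s" % table_id)
            self.tables[table_id] = {"id": table_id, "schema": schema}
            return self.tables[table_id]


def get_or_create_table(service, table_id, schema):
    """Check-then-create that tolerates losing the creation race."""
    found = service.get(table_id)
    if found is not None:
        return found
    try:
        return service.create(table_id, schema)
    except HttpError as exn:
        if exn.status_code == 409:
            # Another worker created the table between our check and our
            # create call. Assumption: re-reading the table is acceptable.
            return service.get(table_id)
        # Any other status is a real failure and must not be swallowed.
        raise


if __name__ == "__main__":
    service = FakeTableService()
    results = []

    def worker():
        results.append(get_or_create_table(service, "dataset.table", ["field"]))

    threads = [threading.Thread(target=worker) for _ in range(8)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(len(results), "workers returned a table;", len(service.tables), "table exists")
```

The check and the create are still two separate calls, so the race itself remains. The sketch only makes losing it harmless, which matches the "short term fix" framing in the review.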
[jira] [Work logged] (BEAM-7856) BigQuery table creation race condition error when executing pipeline on multiple workers
[ https://issues.apache.org/jira/browse/BEAM-7856?focusedWorklogId=294341&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-294341 ]

ASF GitHub Bot logged work on BEAM-7856:

Author: ASF GitHub Bot
Created on: 14/Aug/19 02:05
Start Date: 14/Aug/19 02:05
Worklog Time Spent: 10m
Work Description: angoenka commented on pull request #9204: [BEAM-7856] Suppress error on table bigquery table already exists
URL: https://github.com/apache/beam/pull/9204

Issue Time Tracking
---
Worklog Id: (was: 294341)
Time Spent: 1h 20m (was: 1h 10m)
[jira] [Work logged] (BEAM-7856) BigQuery table creation race condition error when executing pipeline on multiple workers
[ https://issues.apache.org/jira/browse/BEAM-7856?focusedWorklogId=294164&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-294164 ]

ASF GitHub Bot logged work on BEAM-7856:

Author: ASF GitHub Bot
Created on: 13/Aug/19 20:24
Start Date: 13/Aug/19 20:24
Worklog Time Spent: 10m
Work Description: chamikaramj commented on issue #9204: [BEAM-7856] Suppress error on table bigquery table already exists
URL: https://github.com/apache/beam/pull/9204#issuecomment-520994785

LGTM for getting this in as a short-term fix, but we should probably file a JIRA for a proper fix.

Issue Time Tracking
---
Worklog Id: (was: 294164)
Time Spent: 1h (was: 50m)
[jira] [Work logged] (BEAM-7856) BigQuery table creation race condition error when executing pipeline on multiple workers
[ https://issues.apache.org/jira/browse/BEAM-7856?focusedWorklogId=290019=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-290019 ] ASF GitHub Bot logged work on BEAM-7856: Author: ASF GitHub Bot Created on: 06/Aug/19 21:52 Start Date: 06/Aug/19 21:52 Worklog Time Spent: 10m Work Description: chamikaramj commented on pull request #9204: [BEAM-7856] Suppress error on table bigquery table already exists URL: https://github.com/apache/beam/pull/9204#discussion_r311291293

## File path: sdks/python/apache_beam/io/gcp/bigquery_tools.py ##

@@ -659,12 +659,19 @@ def get_or_create_table(
     if found_table and write_disposition != BigQueryDisposition.WRITE_TRUNCATE:
       return found_table
     else:
-      created_table = self._create_table(
-          project_id=project_id,
-          dataset_id=dataset_id,
-          table_id=table_id,
-          schema=schema or found_table.schema,
-          additional_parameters=additional_create_parameters)
+      created_table = None
+      try:
+        created_table = self._create_table(
+            project_id=project_id,
+            dataset_id=dataset_id,
+            table_id=table_id,
+            schema=schema or found_table.schema,
+            additional_parameters=additional_create_parameters)
+      except HttpError as exn:
+        if exn.status_code == 409:

Review comment:
Instead of suppressing the error, can we move table creation to a step that precedes writing? That seems cleaner. Java SDK seems to be doing something like this based on a quick look:
https://github.com/apache/beam/blob/08d0146791e38be4641ff80ffb2539cdc81f5b6d/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/StreamingInserts.java#L178

Issue Time Tracking --- Worklog Id: (was: 290019) Time Spent: 50m (was: 40m)
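
For readers following the thread, the check-then-create race and the "treat 409 as already exists, re-raise everything else" handling discussed above can be sketched in isolation. This is only an illustration under stated assumptions: `FakeHttpError`, `FakeTableService`, `get_or_create_table` and `run_workers` are hypothetical names invented here, the service is an in-memory stand-in and not the BigQuery client used in `bigquery_tools.py`, and re-fetching the table after a 409 is an assumed recovery step, since the quoted diff is cut off before that branch.

```python
import threading


class FakeHttpError(Exception):
    """Hypothetical stand-in for an HTTP client error that carries a status code."""

    def __init__(self, status_code, message):
        super().__init__(message)
        self.status_code = status_code


class FakeTableService:
    """In-memory stand-in for a table API (not a real BigQuery client)."""

    def __init__(self):
        self._tables = {}
        self._lock = threading.Lock()

    def get(self, table_id):
        with self._lock:
            return self._tables.get(table_id)

    def create(self, table_id, schema):
        with self._lock:
            if table_id in self._tables:
                # Mimics the "already exists" conflict response.
                raise FakeHttpError(409, "Already Exists: %s" % table_id)
            self._tables[table_id] = {"id": table_id, "schema": schema}
            return self._tables[table_id]


def get_or_create_table(service, table_id, schema, barrier=None):
    """Check for the table, create it if missing, and tolerate losing the race."""
    found = service.get(table_id)
    if found is not None:
        return found
    if barrier is not None:
        # Test hook: hold every worker here so all of them have passed the
        # existence check before any of them tries to create the table.
        barrier.wait()
    try:
        return service.create(table_id, schema)
    except FakeHttpError as exn:
        if exn.status_code == 409:
            # Another worker created the table first; reuse it (assumed recovery).
            return service.get(table_id)
        raise  # Any other status is a real failure and must propagate.


def run_workers(num_workers=4):
    """Run several workers that all race to create the same table."""
    service = FakeTableService()
    barrier = threading.Barrier(num_workers)
    results = [None] * num_workers

    def worker(index):
        results[index] = get_or_create_table(
            service, "events", ["ts", "value"], barrier)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Calling `run_workers(4)` forces every worker past the existence check before any create call, so exactly one create succeeds and the others hit the 409 branch. The alternative raised in the review comment, creating the table in a step that precedes writing, would avoid the conflict altogether instead of handling it.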
[jira] [Work logged] (BEAM-7856) BigQuery table creation race condition error when executing pipeline on multiple workers
[ https://issues.apache.org/jira/browse/BEAM-7856?focusedWorklogId=287045=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-287045 ] ASF GitHub Bot logged work on BEAM-7856: Author: ASF GitHub Bot Created on: 01/Aug/19 21:28 Start Date: 01/Aug/19 21:28 Worklog Time Spent: 10m Work Description: chamikaramj commented on issue #9204: [BEAM-7856] Suppress error on table bigquery table already exists URL: https://github.com/apache/beam/pull/9204#issuecomment-517464716 Passing the review to Pablo. Issue Time Tracking --- Worklog Id: (was: 287045) Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (BEAM-7856) BigQuery table creation race condition error when executing pipeline on multiple workers
[ https://issues.apache.org/jira/browse/BEAM-7856?focusedWorklogId=286979=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-286979 ] ASF GitHub Bot logged work on BEAM-7856: Author: ASF GitHub Bot Created on: 01/Aug/19 19:42 Start Date: 01/Aug/19 19:42 Worklog Time Spent: 10m Work Description: angoenka commented on issue #9204: [BEAM-7856] Suppress error on table bigquery table already exists URL: https://github.com/apache/beam/pull/9204#issuecomment-517429586 Ping for the review Issue Time Tracking --- Worklog Id: (was: 286979) Time Spent: 0.5h (was: 20m)
[jira] [Work logged] (BEAM-7856) BigQuery table creation race condition error when executing pipeline on multiple workers
[ https://issues.apache.org/jira/browse/BEAM-7856?focusedWorklogId=285500=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-285500 ] ASF GitHub Bot logged work on BEAM-7856: Author: ASF GitHub Bot Created on: 31/Jul/19 04:37 Start Date: 31/Jul/19 04:37 Worklog Time Spent: 10m Work Description: angoenka commented on pull request #9204: [BEAM-7856] Suppress error on table bigquery table already exists URL: https://github.com/apache/beam/pull/9204
[jira] [Work logged] (BEAM-7856) BigQuery table creation race condition error when executing pipeline on multiple workers
[ https://issues.apache.org/jira/browse/BEAM-7856?focusedWorklogId=285501=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-285501 ] ASF GitHub Bot logged work on BEAM-7856: Author: ASF GitHub Bot Created on: 31/Jul/19 04:37 Start Date: 31/Jul/19 04:37 Worklog Time Spent: 10m Work Description: angoenka commented on issue #9204: [BEAM-7856] Suppress error on table bigquery table already exists URL: https://github.com/apache/beam/pull/9204#issuecomment-516691858 R: @chamikaramj Issue Time Tracking --- Worklog Id: (was: 285501) Time Spent: 20m (was: 10m)