TheNeuralBit opened a new issue, #22440:
URL: https://github.com/apache/beam/issues/22440

   ### What happened?
   
   See https://ci-beam.apache.org/job/beam_LoadTests_Python_SideInput_Dataflow_Batch/654/console
   
   The job is failing with:
   ```
   13:00:34 INFO:apache_beam.runners.dataflow.dataflow_runner:2022-07-25T20:00:32.836Z: JOB_MESSAGE_BASIC: Stopping **** pool...
   13:00:34 INFO:apache_beam.runners.dataflow.dataflow_runner:Job 2022-07-25_09_00_14-1969075893707144978 is in state JOB_STATE_CANCELLING
   13:00:35 INFO:apache_beam.runners.dataflow.dataflow_runner:2022-07-25T20:01:07.531Z: JOB_MESSAGE_DETAILED: Autoscaling: Resized **** pool from 10 to 0.
   13:01:08 INFO:apache_beam.runners.dataflow.dataflow_runner:2022-07-25T20:01:07.588Z: JOB_MESSAGE_BASIC: Worker pool stopped.
   13:01:08 INFO:apache_beam.runners.dataflow.dataflow_runner:2022-07-25T20:01:07.612Z: JOB_MESSAGE_DEBUG: Tearing down pending resources...
   13:01:08 INFO:apache_beam.runners.dataflow.dataflow_runner:Job 2022-07-25_09_00_14-1969075893707144978 is in state JOB_STATE_CANCELLED
   13:01:15 ERROR:apache_beam.runners.dataflow.dataflow_runner:Console URL: https://console.cloud.google.com/dataflow/jobs/<RegionId>/2022-07-25_09_00_14-1969075893707144978?project=<ProjectId>
   13:01:16 Traceback (most recent call last):
   13:01:16   File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
   13:01:16     "__main__", mod_spec)
   13:01:16   File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
   13:01:16     exec(code, run_globals)
   13:01:16   File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_SideInput_Dataflow_Batch/src/sdks/python/apache_beam/testing/load_tests/sideinput_test.py", line 216, in <module>
   13:01:16     SideInputTest().run()
   13:01:16   File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_SideInput_Dataflow_Batch/src/sdks/python/apache_beam/testing/load_tests/load_test.py", line 151, in run
   13:01:16     self.result.wait_until_finish(duration=self.timeout_ms)
   13:01:16   File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_SideInput_Dataflow_Batch/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py", line 1676, in wait_until_finish
   13:01:16     self)
   13:01:16 apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: CANCELLED, Error:
   13:01:16 None
   13:01:16 
   13:01:17 > Task :sdks:python:apache_beam:testing:load_tests:run FAILED
   ```
   
   Opening the Dataflow console, we see the following error in the worker logs:
   ```
   An exception was raised when trying to execute the workitem 8003945814083367836 : Traceback (most recent call last):
     File "apache_beam/runners/common.py", line 1417, in apache_beam.runners.common.DoFnRunner.process
     File "apache_beam/runners/common.py", line 837, in apache_beam.runners.common.PerWindowInvoker.invoke_process
     File "apache_beam/runners/common.py", line 983, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
     File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_SideInput_Dataflow_Batch/src/sdks/python/apache_beam/testing/load_tests/sideinput_test.py", line 123, in process
     File "/usr/local/lib/python3.7/site-packages/apache_beam/transforms/sideinputs.py", line 114, in __iter__
       for wv in self._iterable:
     File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sideinputs.py", line 180, in __iter__
       raise self.reader_exceptions.get()
     File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sideinputs.py", line 130, in _reader_thread
       for value in reader:
     File "/usr/local/lib/python3.7/site-packages/dataflow_worker/nativefileio.py", line 204, in __iter__
       for record in self.read_next_block():
     File "/usr/local/lib/python3.7/site-packages/dataflow_worker/nativeavroio.py", line 362, in read_next_block
       fastavro_block = next(self._block_iterator)
     File "fastavro/_read.pyx", line 1051, in fastavro._read.file_reader.__next__
     File "fastavro/_read.pyx", line 953, in _iter_avro_blocks
     File "fastavro/_read.pyx", line 854, in fastavro._read.snappy_read_block
     File "fastavro/_read.pyx", line 856, in fastavro._read.snappy_read_block
     File "/usr/local/lib/python3.7/site-packages/apache_beam/io/filesystemio.py", line 112, in readinto
       data = self._downloader.get_range(start, end)
     File "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/gcsio.py", line 701, in get_range
       self._downloader.GetRange(start, end - 1)
     File "/usr/local/lib/python3.7/site-packages/apitools/base/py/transfer.py", line 486, in GetRange
       response = self.__ProcessResponse(response)
     File "/usr/local/lib/python3.7/site-packages/apitools/base/py/transfer.py", line 424, in __ProcessResponse
       raise exceptions.HttpError.FromResponse(response)
   apitools.base.py.exceptions.HttpNotFoundError: HttpError accessing <https://www.googleapis.com/storage/v1/b/temp-storage-for-perf-tests/o/loadtests%2Fload-tests-python-dataflow-batch-sideinput-9-0725100319.1658764814.369494%2Ftmp-e6377a3786602f76-00003-of-00028.avro?alt=media&generation=1658765290656871>: response: <{'x-guploader-uploadid': 'ADPycds9lcszewL5vWbRPedbhn7ewfjlPPHltdw2rk8IwZa4aEcwdveOb1sNgE5JHOXCGd166jZ-q0raGSyH1mAAOj9ZEA', 'content-type': 'text/html; charset=UTF-8', 'date': 'Mon, 25 Jul 2022 20:00:32 GMT', 'vary': 'Origin, X-Origin', 'expires': 'Mon, 25 Jul 2022 20:00:32 GMT', 'cache-control': 'private, max-age=0', 'content-length': '168', 'server': 'UploadServer', 'status': '404'}>, content <No such object: temp-storage-for-perf-tests/loadtests/load-tests-python-dataflow-batch-sideinput-9-0725100319.1658764814.369494/tmp-e6377a3786602f76-00003-of-00028.avro>
   ```
   
   ### Issue Priority
   
   Priority: 1
   
   ### Issue Component
   
   Component: test-failures

