kennknowles opened a new issue, #19156: URL: https://github.com/apache/beam/issues/19156
The Python [load test of ParDo operation for SyntheticSources](https://github.com/apache/beam/blob/faff82860c66e4050f0cfa5e874ffe6035ed0c1c/sdks/python/apache_beam/testing/load_tests/par_do_test.py#L133) that I created contains parametrized loop of ParDo with no operation inside besides metrics (this issue). With setting the number of iterations to \>~200 and running the test on DirectRunner I was encountering test failures. The test outputs whole (really long) pipeline logs. Some test runs raised the following exception: ``` Traceback (most recent call last): File "/Users/kasia/Repos/beam/sdks/python/apache_beam/testing/load_tests/par_do_test.py", line 144, in testParDo result = p.run() File "/Users/kasia/Repos/beam/sdks/python/apache_beam/testing/test_pipeline.py", line 104, in run result = super(TestPipeline, self).run(test_runner_api) File "/Users/kasia/Repos/beam/sdks/python/apache_beam/pipeline.py", line 403, in run self.to_runner_api(), self.runner, self._options).run(False) File "/Users/kasia/Repos/beam/sdks/python/apache_beam/pipeline.py", line 416, in run return self.runner.run_pipeline(self) File "/Users/kasia/Repos/beam/sdks/python/apache_beam/runners/direct/direct_runner.py", line 139, in run_pipeline return runner.run_pipeline(pipeline) File "/Users/kasia/Repos/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 229, in run_pipeline return self.run_via_runner_api(pipeline.to_runner_api()) File "/Users/kasia/Repos/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 232, in run_via_runner_api return self.run_stages(*self.create_stages(pipeline_proto)) File "/Users/kasia/Repos/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 1015, in run_stages pcoll_buffers, safe_coders).process_bundle.metrics File "/Users/kasia/Repos/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 1132, in run_stage self._progress_frequency).process_bundle(data_input, data_output) File "/Users/kasia/Repos/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 1388, in process_bundle result_future = self._controller.control_handler.push(process_bundle) File "/Users/kasia/Repos/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 1260, in push response = self.worker.do_instruction(request) File "/Users/kasia/Repos/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", line 212, in do_instruction request.instruction_id) File "/Users/kasia/Repos/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", line 231, in process_bundle self.data_channel_factory) File "/Users/kasia/Repos/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 343, in __init__ self.ops = self.create_execution_tree(self.process_bundle_descriptor) File "/Users/kasia/Repos/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 385, in create_execution_tree descriptor.transforms, key=topological_height, reverse=True)]) File "/Users/kasia/Repos/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 320, in wrapper result = cache[args] = func(*args) File "/Users/kasia/Repos/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 368, in get_operation in descriptor.transforms[transform_id].outputs.items() File "/Users/kasia/Repos/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 367, in <dictcomp> for tag, pcoll_id ... (3 last lines repeated for long period) RuntimeError: maximum recursion depth exceeded ``` From my observation, I can say the problem appeared with various iteration number depending on computer resources. On my weaker computer started failing on ~150 iterations. The test succeeds on DataFlow with 1000 iterations (I didn't check higher number). I provide whole test output in Attachements. Imported from Jira [BEAM-5777](https://issues.apache.org/jira/browse/BEAM-5777). Original Jira may contain additional context. Reported by: kasiak. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
