rafalpotempa opened a new issue, #27501:
URL: https://github.com/apache/beam/issues/27501
### What happened?
Apache Beam: 2.46.0
`Direct Runner`
We run some parts of our pipelines on the Direct Runner, since they only
process small batches of data. The rest run on the Dataflow runner,
which has some auto-recovery. On the Direct Runner, the error fails the run
and triggers alerts for our SRE team.
I couldn't reproduce the issue myself; it is non-deterministic.
Part of our company's proprietary system occasionally (not very often,
about once in ~100 runs) fails with:
> apache_beam/runners/worker/sdk_worker.py in shutdown_inactive_bundle_processors at line 585
```log
Exception in thread Thread-37:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.9/site-packages/sentry_sdk/integrations/threading.py", line 72, in run
    reraise(*_capture_exception())
  File "/usr/local/lib/python3.9/site-packages/sentry_sdk/_compat.py", line 60, in reraise
    raise value
  File "/usr/local/lib/python3.9/site-packages/sentry_sdk/integrations/threading.py", line 70, in run
    return old_run_func(self, *a, **kw)
  File "/usr/local/lib/python3.9/site-packages/apache_beam/runners/worker/data_plane.py", line 228, in run
    self._function(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.9/site-packages/apache_beam/runners/worker/sdk_worker.py", line 585, in shutdown_inactive_bundle_processors
    for descriptor_id, last_access_time in self.last_access_times.items():
RuntimeError: dictionary changed size during iteration
```
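For context, the error in the traceback is what Python raises when a dict gains or loses keys while it is being iterated. The sketch below is not Beam's code; it is a minimal standalone illustration, with hypothetical names (`unsafe_scan`, `find_inactive`, `times`), of how the error can arise and how iterating over a snapshot taken under a lock would avoid it. It assumes the dict is mutated by another thread during the scan, which the traceback suggests but which I have not confirmed.

```python
import threading
import time


def unsafe_scan(last_access_times):
    """Iterate the live dict while it grows (reproduces the RuntimeError)."""
    for descriptor_id, _ in last_access_times.items():
        # Stand-in for another thread registering a descriptor mid-iteration.
        last_access_times[descriptor_id + "-new"] = time.time()


def find_inactive(last_access_times, lock, timeout_s, now=None):
    """Return ids idle for longer than timeout_s, using a snapshot of the dict."""
    now = time.time() if now is None else now
    with lock:
        # Copy the items while holding the lock; writers are assumed
        # (hypothetically) to take the same lock before mutating the dict.
        snapshot = list(last_access_times.items())
    return [d for d, t in snapshot if now - t > timeout_s]


times = {"a": 0.0, "b": 100.0}
lock = threading.Lock()

try:
    unsafe_scan(dict(times))  # operate on a copy so `times` stays intact
    error = None
except RuntimeError as exc:
    error = str(exc)

inactive = find_inactive(times, lock, timeout_s=60, now=120.0)
print(error)
print(inactive)
```

The snapshot costs one extra list per scan, which should be negligible for a periodic cleanup thread, and the lock only helps if every writer also takes it.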
### Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
### Issue Components
- [X] Component: Python SDK
- [ ] Component: Java SDK
- [ ] Component: Go SDK
- [ ] Component: Typescript SDK
- [ ] Component: IO connector
- [ ] Component: Beam examples
- [ ] Component: Beam playground
- [ ] Component: Beam katas
- [ ] Component: Website
- [ ] Component: Spark Runner
- [ ] Component: Flink Runner
- [ ] Component: Samza Runner
- [ ] Component: Twister2 Runner
- [ ] Component: Hazelcast Jet Runner
- [ ] Component: Google Cloud Dataflow Runner