rafalpotempa opened a new issue, #27501:
URL: https://github.com/apache/beam/issues/27501

   ### What happened?
   
   Apache Beam: 2.46.0
   `Direct Runner`
   
   We run some parts of our pipelines on the Direct Runner, since they only process small batches of data. The rest run on the Dataflow runner, which has some auto-recovery. On the Direct Runner, a failure aborts the run and triggers alerts for our SRE team.
   
   I couldn't reproduce the issue myself because it is non-deterministic.
   Part of our company's proprietary system occasionally (not very often, about once in 100 runs) fails with:
   > apache_beam/runners/worker/sdk_worker.py in shutdown_inactive_bundle_processors at line 585
   ```log
   Exception in thread Thread-37:
   Traceback (most recent call last):
     File "/usr/local/lib/python3.9/threading.py", line 980, in _bootstrap_inner
       self.run()
     File "/usr/local/lib/python3.9/site-packages/sentry_sdk/integrations/threading.py", line 72, in run
       reraise(*_capture_exception())
     File "/usr/local/lib/python3.9/site-packages/sentry_sdk/_compat.py", line 60, in reraise
       raise value
     File "/usr/local/lib/python3.9/site-packages/sentry_sdk/integrations/threading.py", line 70, in run
       return old_run_func(self, *a, **kw)
     File "/usr/local/lib/python3.9/site-packages/apache_beam/runners/worker/data_plane.py", line 228, in run
       self._function(*self._args, **self._kwargs)
     File "/usr/local/lib/python3.9/site-packages/apache_beam/runners/worker/sdk_worker.py", line 585, in shutdown_inactive_bundle_processors
       for descriptor_id, last_access_time in self.last_access_times.items():
   RuntimeError: dictionary changed size during iteration
   ```
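
For reference, the `RuntimeError` in the traceback is what Python raises when a dict gains or loses keys while its `.items()` view is being iterated. The sketch below reproduces that deterministically in a single thread and shows one common mitigation, iterating over a snapshot made with `list(...)`. It is not Beam's code: `unsafe_scan`, `safe_scan`, `run`, and the descriptor names are hypothetical stand-ins, and the insert from "another thread" is simulated with a callback.

```python
def unsafe_scan(access_times, on_each):
    """Iterate the live dict view; raises if the dict is resized mid-loop."""
    for descriptor_id, last_access_time in access_times.items():
        on_each(descriptor_id, last_access_time)


def safe_scan(access_times, on_each):
    """Iterate a snapshot, so inserts during the loop do not affect it."""
    for descriptor_id, last_access_time in list(access_times.items()):
        on_each(descriptor_id, last_access_time)


def run(scan):
    """Run `scan` while a callback inserts into the dict being scanned."""
    # Hypothetical stand-in for the worker's `last_access_times` mapping.
    table = {"descriptor-a": 0.0, "descriptor-b": 1.0}

    def insert_during_scan(descriptor_id, _last_access_time):
        # Simulates another thread registering a descriptor mid-scan.
        table["new-" + descriptor_id] = 2.0

    try:
        scan(table, insert_during_scan)
        return None, table
    except RuntimeError as exc:
        return str(exc), table


unsafe_error, unsafe_table = run(unsafe_scan)
safe_error, safe_table = run(safe_scan)
print(unsafe_error)
print(safe_error, sorted(safe_table))
```

Whether a snapshot alone is enough for the worker depends on how the dict is mutated elsewhere; guarding both the scan and the writers with a `threading.Lock` would be the stricter alternative.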
   
   ### Issue Priority
   
   Priority: 2 (default / most bugs should be filed as P2)
   
   ### Issue Components
   
   - [X] Component: Python SDK
   - [ ] Component: Java SDK
   - [ ] Component: Go SDK
   - [ ] Component: Typescript SDK
   - [ ] Component: IO connector
   - [ ] Component: Beam examples
   - [ ] Component: Beam playground
   - [ ] Component: Beam katas
   - [ ] Component: Website
   - [ ] Component: Spark Runner
   - [ ] Component: Flink Runner
   - [ ] Component: Samza Runner
   - [ ] Component: Twister2 Runner
   - [ ] Component: Hazelcast Jet Runner
   - [ ] Component: Google Cloud Dataflow Runner

