gemini-code-assist[bot] commented on code in PR #38851:
URL: https://github.com/apache/beam/pull/38851#discussion_r3377400387
##########
sdks/python/apache_beam/runners/worker/statesampler_test.py:
##########
@@ -312,6 +313,46 @@ def test_do_operation_process_timer_with_exception(self, mock_get_dofn_specs):
         actual_value, state_duration_ms * (1.0 - margin_of_error))
     _LOGGER.info("Exception test finished successfully.")
+  def test_concurrent_nsecs_reads(self):
+    """Verify that concurrent reads of nsecs behave correctly under thread contention.
+
+    This test runs state transitions on the main thread and reads `nsecs` properties
+    from a secondary Python thread, while the background sampler thread is concurrently
+    updating counter states.
+    """
+    if not statesampler.FAST_SAMPLER:
+      self.skipTest('Requires FAST_SAMPLER')
+
+    counter_factory = CounterFactory()
+    sampler = statesampler.StateSampler(
+        'concurrent', counter_factory, sampling_period_ms=1)
+
+    sampler.start()
+    state_a = sampler.scoped_state('step1', 'statea')
+    state_b = sampler.scoped_state('step1', 'stateb')
+
+    stop_signal = False
+
+    def read_nsecs_loop():
+      while not stop_signal:
+        _ = state_a.nsecs
+        _ = state_b.nsecs
+        time.sleep(0.001)
+
+    reader_thread = threading.Thread(target=read_nsecs_loop)
+    reader_thread.start()
+
+    try:
+      for _ in range(100):
+        with state_a:
+          time.sleep(0.001)
+        with state_b:
+          time.sleep(0.001)
+    finally:
+      stop_signal = True
+      reader_thread.join()
+      sampler.stop()
Review Comment:

If `sampler.scoped_state` raises, or any other exception occurs between
`sampler.start()` and the `try` block, `sampler.stop()` is never called and the
background sampler thread keeps running. This can leak resources and make
subsequent tests flaky.

Wrap everything after `sampler.start()` in a single `try...finally` block so
that the sampler is always stopped and the reader thread is joined whenever it
was started.
```python
    sampler.start()
    reader_thread = None
    try:
      state_a = sampler.scoped_state('step1', 'statea')
      state_b = sampler.scoped_state('step1', 'stateb')
      stop_signal = False
      def read_nsecs_loop():
        while not stop_signal:
          _ = state_a.nsecs
          _ = state_b.nsecs
          time.sleep(0.001)
      reader_thread = threading.Thread(target=read_nsecs_loop)
      reader_thread.start()
      for _ in range(100):
        with state_a:
          time.sleep(0.001)
        with state_b:
          time.sleep(0.001)
    finally:
      if reader_thread is not None:
        stop_signal = True
        reader_thread.join()
      sampler.stop()
```
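
The cleanup ordering suggested above can be tried outside Beam with a minimal, self-contained sketch. `FakeSampler` and `run_with_cleanup` are hypothetical names introduced only for illustration and are not part of the Beam API. The sketch also swaps the boolean `stop_signal` for a `threading.Event`, which is an assumption about style rather than something this PR requires.

```python
import threading
import time


class FakeSampler:
  """Hypothetical stand-in for an object that owns a background thread."""
  def __init__(self):
    self.running = False

  def start(self):
    self.running = True

  def stop(self):
    self.running = False


def run_with_cleanup(sampler, fail_during_setup=False):
  """Starts the sampler and guarantees cleanup even if setup raises."""
  stop_event = threading.Event()
  reader_thread = None
  sampler.start()
  try:
    if fail_during_setup:
      # Simulates a failure before the reader thread has been created.
      raise RuntimeError('setup failed')

    def read_loop():
      # Polls until the main thread asks the reader to stop.
      while not stop_event.is_set():
        time.sleep(0.001)

    reader_thread = threading.Thread(target=read_loop)
    reader_thread.start()
    time.sleep(0.01)  # Placeholder for the real work on the main thread.
  finally:
    # Join the reader only if it was actually started.
    if reader_thread is not None:
      stop_event.set()
      reader_thread.join()
    # Always stop the sampler, whatever happened above.
    sampler.stop()
  return reader_thread
```

With this shape, `sampler.stop()` runs both on the normal path and when setup fails, and `join()` is never called on a thread that was not started.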
##########
sdks/python/apache_beam/runners/worker/statesampler_fast.pyx:
##########
@@ -217,7 +217,12 @@ cdef class ScopedState(object):
   @property
   def nsecs(self):
-    return self._nsecs
+    cdef int64_t val
+    with nogil:
+      pythread.PyThread_acquire_lock(self.sampler.lock, pythread.WAIT_LOCK)
+      val = self._nsecs
+      pythread.PyThread_release_lock(self.sampler.lock)
+    return val
Review Comment:

Accessing nested attributes of a `cdef` class (like `self.sampler.lock`)
inside a `with nogil` block can be unsafe. If `self.sampler` is `None`,
dereferencing it there can crash the process with a segmentation fault
instead of raising an error, because Cython cannot safely raise a Python
exception without the GIL.

To ensure safety and avoid any implicit GIL acquisition or compilation
issues, copy the lock into a local variable before entering the `with nogil`
block.
```cython
    cdef pythread.PyThread_type_lock lock = self.sampler.lock
    cdef int64_t val
    with nogil:
      pythread.PyThread_acquire_lock(lock, pythread.WAIT_LOCK)
      val = self._nsecs
      pythread.PyThread_release_lock(lock)
    return val
```
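
For readers less familiar with Cython, the same idea can be sketched as a pure-Python analogue: bind the shared lock to a local name first, then do the guarded read. This is only an illustration under stated assumptions. `LockedCounter` and `hammer` are hypothetical names, `threading.Lock` stands in for the `PyThread` lock, and nothing here involves `nogil`.

```python
import threading


class LockedCounter:
  """Hypothetical analogue of a state object sharing its owner's lock."""
  def __init__(self, lock):
    self._lock = lock
    self._nsecs = 0

  def add(self, delta):
    # Writers update the counter only while holding the shared lock.
    with self._lock:
      self._nsecs += delta

  @property
  def nsecs(self):
    # Bind the lock to a local before the critical section, so the
    # section itself performs no attribute lookups on the owner.
    lock = self._lock
    with lock:
      return self._nsecs


def hammer(counter, n):
  """Adds 1 to the counter n times."""
  for _ in range(n):
    counter.add(1)


shared_lock = threading.Lock()
counter = LockedCounter(shared_lock)
threads = [
    threading.Thread(target=hammer, args=(counter, 1000)) for _ in range(4)
]
for t in threads:
  t.start()
for t in threads:
  t.join()
total = counter.nsecs
```

Because every write and every read goes through the same lock, a reader never observes a partially applied update.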