[
https://issues.apache.org/jira/browse/BEAM-8403?focusedWorklogId=329005&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329005
]
ASF GitHub Bot logged work on BEAM-8403:
----------------------------------------
Author: ASF GitHub Bot
Created on: 16/Oct/19 07:19
Start Date: 16/Oct/19 07:19
Worklog Time Spent: 10m
Work Description: mxm commented on issue #9800: [BEAM-8403] Guard request
id generation to prevent concurrent worker access
URL: https://github.com/apache/beam/pull/9800#issuecomment-542559827
Thanks for the reviews!
I doubt that using the atomic long (native) library will make any measurable
difference. The reason is that in Python only one thread can ever be active at
a time (due to the Global Interpreter Lock (GIL)), so the lock basically comes
for free. Only in the very rare case of a thread having to wait for the lock
(which was hard to reproduce even in tests) will there be a minimal delay
while execution switches to the thread holding the lock so that it can
release it.
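To make the locking approach concrete, here is a minimal sketch of a lock-guarded id generator. It is not the code from the pull request: `RequestIdGenerator`, `next_id`, and `collect_ids` are hypothetical names chosen for illustration, and the id format (a stringified counter) is an assumption.

```python
import threading


class RequestIdGenerator:
    """Hypothetical helper (not the actual Beam class): hands out unique ids."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last_id = 0

    def next_id(self):
        # The read-modify-write on the counter happens entirely under the
        # lock, so two threads can never observe the same value.
        with self._lock:
            self._last_id += 1
            return str(self._last_id)


def collect_ids(generator, num_threads=8, ids_per_thread=1000):
    """Request ids from several threads at once and return all of them."""
    buckets = [[] for _ in range(num_threads)]

    def worker(bucket):
        for _ in range(ids_per_thread):
            bucket.append(generator.next_id())

    threads = [
        threading.Thread(target=worker, args=(bucket,)) for bucket in buckets
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return [request_id for bucket in buckets for request_id in bucket]
```

With the lock in place, `collect_ids(RequestIdGenerator())` should return as many distinct ids as were requested, regardless of how the threads interleave.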
Generally speaking, we should be more concerned about the parallelism model
of the Python harness. Effectively, we are sacrificing performance by letting
multiple Runner workers process data with a GIL-threaded Python harness. I'm
not sure if you have done performance measurements, but the difference in
throughput between a single SDK Harness with N workers and N instances of an
SDK Harness should be huge: the former is effectively single-threaded, while
the latter has N "real" threads.
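One rough way to check this claim is to time the same CPU-bound work on a thread pool and on a process pool. The sketch below is only an illustration under stated assumptions: `cpu_bound` and `timed_map` are hypothetical names, the workload is a stand-in for real bundle processing, and it is not a benchmark of the Beam SDK Harness.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def cpu_bound(n):
    """Stand-in for CPU-heavy work: sum of squares below n, in pure Python."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_map(executor_cls, work_items, max_workers):
    """Run cpu_bound over work_items on the given executor type.

    Returns the results and the elapsed wall-clock time in seconds.
    """
    start = time.perf_counter()
    with executor_cls(max_workers=max_workers) as executor:
        results = list(executor.map(cpu_bound, work_items))
    return results, time.perf_counter() - start
```

Calling `timed_map(ThreadPoolExecutor, [2_000_000] * 4, 4)` and `timed_map(ProcessPoolExecutor, [2_000_000] * 4, 4)` from a script (the process variant under an `if __name__ == "__main__":` guard) gives two timings to compare. On a standard CPython build with the GIL and a multi-core machine, the process pool would be expected to finish sooner, but the actual numbers depend on the hardware and workload.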
Issue Time Tracking
-------------------
Worklog Id: (was: 329005)
Time Spent: 1h 20m (was: 1h 10m)
> Race condition in request id generation of GrpcStateRequestHandler
> ------------------------------------------------------------------
>
> Key: BEAM-8403
> URL: https://issues.apache.org/jira/browse/BEAM-8403
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-harness
> Reporter: Maximilian Michels
> Assignee: Maximilian Michels
> Priority: Major
> Fix For: 2.17.0
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> There is a race condition in {{GrpcStateRequestHandler}} which surfaced after
> the recent changes to process append/clear state requests asynchronously. The
> race condition can occur if multiple Runner workers process a transform with
> state requests with the same SDK Harness. For example, this setup occurs with
> Flink when a TaskManager has multiple task slots and two or more of those
> slots process the same stateful stage against an SDK Harness.
> CC [~robertwb]