[jira] [Work logged] (BEAM-8403) Race condition in request id generation of GrpcStateRequestHandler

ASF GitHub Bot (Jira) Wed, 16 Oct 2019 00:20:16 -0700


     [ 
https://issues.apache.org/jira/browse/BEAM-8403?focusedWorklogId=329005&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329005
 ]


ASF GitHub Bot logged work on BEAM-8403:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/Oct/19 07:19
            Start Date: 16/Oct/19 07:19
    Worklog Time Spent: 10m 
      Work Description: mxm commented on issue #9800: [BEAM-8403] Guard request 
id generation to prevent concurrent worker access
URL: https://github.com/apache/beam/pull/9800#issuecomment-542559827
 
 
   Thanks for the reviews!
   
   I doubt that using the atomic long (native) library will make any measurable 
difference. The reason is that in Python only one thread can be active at a 
time (due to the Global Interpreter Lock (GIL)), so the lock basically comes 
for free. Only in the very rare case of a thread having to wait for the lock 
(which was hard to reproduce even in tests) will there be a minimal delay 
while another thread is scheduled to release the lock.
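
As a rough illustration of the kind of guard discussed here, the following is a minimal sketch and not the code from the PR. The names `RequestIdGenerator` and `generate_concurrently` are hypothetical; the sketch only assumes that request ids come from a shared counter that several worker threads increment.

```python
import threading


class RequestIdGenerator:
    """Hypothetical stand-in for a shared request id counter."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last_id = 0

    def next_id(self):
        # The increment is a read-modify-write sequence, so it is guarded
        # by a lock to keep two threads from handing out the same id.
        with self._lock:
            self._last_id += 1
            return str(self._last_id)


def generate_concurrently(num_threads=8, ids_per_thread=1000):
    """Request ids from several threads and return all of them."""
    generator = RequestIdGenerator()
    buckets = [[] for _ in range(num_threads)]

    def worker(bucket):
        for _ in range(ids_per_thread):
            bucket.append(generator.next_id())

    threads = [threading.Thread(target=worker, args=(b,)) for b in buckets]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return [request_id for bucket in buckets for request_id in bucket]
```

With the lock in place, every id returned by `generate_concurrently` should be distinct, regardless of how the threads interleave.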
   
   Generally speaking, we should be more concerned about the parallelism model 
of the Python harness. Effectively, we are sacrificing performance by letting 
multiple Runner workers process data with a GIL-threaded Python harness. I'm 
not sure if you have done performance measurements, but the difference in 
throughput between a single SDK Harness with N workers and N instances of an 
SDK Harness should be huge: the former is effectively single-threaded, while 
the latter has N "real" threads.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 329005)
    Time Spent: 1h 20m  (was: 1h 10m)

> Race condition in request id generation of GrpcStateRequestHandler
> ------------------------------------------------------------------
>
>                 Key: BEAM-8403
>                 URL: https://issues.apache.org/jira/browse/BEAM-8403
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-harness
>            Reporter: Maximilian Michels
>            Assignee: Maximilian Michels
>            Priority: Major
>             Fix For: 2.17.0
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> There is a race condition in {{GrpcStateRequestHandler}} which surfaced after 
> the recent changes to process append/clear state requests asynchronously. The 
> race condition can occur if multiple Runner workers process a transform with 
> state requests with the same SDK Harness. For example, this setup occurs with 
> Flink when a TaskManager has multiple task slots and two or more of those 
> slots process the same stateful stage against an SDK Harness.
> CC [~robertwb]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
