[ 
https://issues.apache.org/jira/browse/BEAM-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Burke updated BEAM-9039:
-------------------------------
    Status: Open  (was: Triage Needed)

> Fix Datachannel stuckness on errors
> -----------------------------------
>
>                 Key: BEAM-9039
>                 URL: https://issues.apache.org/jira/browse/BEAM-9039
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-go
>    Affects Versions: Not applicable
>            Reporter: Robert Burke
>            Assignee: Robert Burke
>            Priority: Major
>             Fix For: Not applicable
>
>
> Catch-all task for any data channel stuckness issues, in particular any that 
> happen on errors.
> The last known one is a race condition on DataChannel.readErr.
> Close off a race condition where a closing DataChannel might have new readers 
> created for it while it is failing, causing stuckness in the bundles.
> In particular, c.readErr must only be read or written while c.mu is held.
> Otherwise, something like the following can happen.
> Given a channel C and goroutines G1 and G2:
>  # G1: A request for a new reader on C arrives, checks C.readErr, and finds it nil.
>  # G2: An error occurs on reading. The lock is acquired, and C.readErr is set. 
> Readers are closed. The channel is officially closed with A.forceRecreate, 
> removing it from the DataManager cache.
>  # G1: Calls A.makeReader, and creates a new reader on the already-failed channel.
> There could be an arbitrary number of G1s.
>  
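The interleaving above can be closed off by doing the readErr check and the reader creation inside one critical section. The following is only a minimal sketch under assumptions, not the SDK's actual code: the dataChannel and reader types, the openRead and fail methods, and the errReadFailed sentinel are hypothetical stand-ins, and only the mu and readErr field names come from the description.

```go
package main

import (
	"errors"
	"sync"
)

// errReadFailed is a hypothetical sentinel error used for illustration.
var errReadFailed = errors.New("data channel read failed")

// reader is a hypothetical stand-in for a per-instruction data reader.
type reader struct {
	closed bool
}

// dataChannel is a simplified model of channel C from the description.
// Only the names mu and readErr are taken from the issue text.
type dataChannel struct {
	mu      sync.Mutex
	readErr error              // guarded by mu
	readers map[string]*reader // guarded by mu
}

func newDataChannel() *dataChannel {
	return &dataChannel{readers: make(map[string]*reader)}
}

// openRead returns the reader for id, creating it if needed.
// The readErr check and the creation share one critical section,
// so fail cannot run between them.
func (c *dataChannel) openRead(id string) (*reader, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.readErr != nil {
		return nil, c.readErr
	}
	r, ok := c.readers[id]
	if !ok {
		r = &reader{}
		c.readers[id] = r
	}
	return r, nil
}

// fail records the first read error and closes every existing reader,
// all while holding mu.
func (c *dataChannel) fail(err error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.readErr == nil {
		c.readErr = err
	}
	for _, r := range c.readers {
		r.closed = true
	}
}
```

With this shape, a G1 that takes the lock before G2 gets a reader that G2 then closes, and a G1 that takes it after G2 gets the error back instead of a fresh reader, so no reader is left waiting on a failed channel.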



--
This message was sent by Atlassian Jira
(v8.3.4#803005)