[ https://issues.apache.org/jira/browse/BEAM-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Burke updated BEAM-9039:
-------------------------------
    Status: Open  (was: Triage Needed)

> Fix Datachannel stuckness on errors
> -----------------------------------
>
>                 Key: BEAM-9039
>                 URL: https://issues.apache.org/jira/browse/BEAM-9039
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-go
>    Affects Versions: Not applicable
>            Reporter: Robert Burke
>            Assignee: Robert Burke
>            Priority: Major
>             Fix For: Not applicable
>
>
> Catch-all task for any data channel stuckness issues, in particular any that
> happen on errors.
> The last known one I have is a race condition on DataChannel.readErr.
> Close off a race condition where a closing DataChannel might have new readers
> created for it while it is failing, causing stuckness in the bundles.
> In particular, c.readErr must only be accessed while c.mu is held.
> Otherwise something like the following happens.
> Given a channel C, and goroutines G1, G2:
> # G1: A request for a new reader on C arrives, checks C.readErr, and finds it nil.
> # G2: An error occurs on reading. The lock is acquired, and C.readErr is set.
> Readers are closed. The channel is officially closed with A.forceRecreate,
> removing it from the DataManager cache.
> # G1 calls A.makeReader, and creates a new reader there.
> There could be an arbitrary number of G1s.