[jira] [Updated] (GEODE-5307) Hang with servers all in waitForPrimaryMember and one server in NO_PRIMARY_HOSTING state
[ https://issues.apache.org/jira/browse/GEODE-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nabarun updated GEODE-5307:
---
    Fix Version/s:     (was: 1.8.0)
                   1.7.0

> Hang with servers all in waitForPrimaryMember and one server in NO_PRIMARY_HOSTING state
>
>                 Key: GEODE-5307
>                 URL: https://issues.apache.org/jira/browse/GEODE-5307
>             Project: Geode
>          Issue Type: Bug
>          Components: regions
>    Affects Versions: 1.1.0, 1.2.0, 1.3.0, 1.2.1, 1.4.0, 1.5.0, 1.6.0
>            Reporter: Bruce Schuchardt
>            Assignee: Bruce Schuchardt
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.7.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> I've run into a hang in a test where servers are continuously creating PRs,
> doing putAll ops on them and closing/local-destroying the PR. Sometimes the
> servers hang with any thread needing a particular bucket in
> waitingForPrimaryMember().
>
> This seems to happen because of this sequence of events:
> 1. two servers create a partitioned region
> 2. one server initiates a putAll and requests the other server manage a bucket
> 3. the putAll server closes or locally-destroys its region
> 4. the close() operation completes
> 5. the other server initializes its bucket and still uses the requesting
>    server as a primaryElector. This keeps it from deciding to volunteer to
>    become primary.
>
> The problem is that the server that closed its region caused exceptions to be
> thrown in the putAll thread and abandon creation of the bucket. No-one will
> ever trip the switch that makes the other server become the primary for the
> bucket.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
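The five-step sequence in the description above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: the `Bucket` class, its `primaryElector` field, and the `electorDeparted` hook are hypothetical names invented for illustration. They are not Geode's actual classes and not the fix applied for this issue. The sketch only shows why a stale elector reference can leave every waiter without a primary, and how clearing that reference would let the remaining server volunteer.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class PrimaryElectorSketch {

    /** Hypothetical stand-in for one server's view of a bucket (not a Geode class). */
    static final class Bucket {
        private final String localServer;
        private volatile String primaryElector; // member expected to choose the primary
        private volatile String primary;        // null while no primary is known
        private final CountDownLatch primaryKnown = new CountDownLatch(1);

        Bucket(String localServer, String primaryElector) {
            this.localServer = localServer;
            this.primaryElector = primaryElector;
        }

        /** The local server volunteers only if no other member is expected to elect a primary. */
        boolean volunteerForPrimary() {
            if (primaryElector != null) {
                return false; // defer to the elector, even if it is already gone
            }
            primary = localServer;
            primaryKnown.countDown();
            return true;
        }

        /** Hypothetical hook: called when a member is known to have closed its region. */
        void electorDeparted(String member) {
            if (member.equals(primaryElector)) {
                primaryElector = null;
            }
        }

        /** Waits for a primary, with a timeout so the sketch terminates instead of hanging. */
        boolean waitForPrimary(long timeoutMillis) throws InterruptedException {
            return primaryKnown.await(timeoutMillis, TimeUnit.MILLISECONDS);
        }
    }

    /** Replays the sequence; returns true if a primary was chosen before the timeout. */
    static boolean run(boolean notifyOnClose) throws InterruptedException {
        // Steps 1-2: serverA starts a putAll and asks serverB to manage the bucket.
        Bucket bucketOnB = new Bucket("serverB", "serverA");
        // Steps 3-4: serverA closes its region; its putAll thread abandons bucket creation.
        if (notifyOnClose) {
            bucketOnB.electorDeparted("serverA");
        }
        // Step 5: serverB initializes the bucket and decides whether to volunteer.
        bucketOnB.volunteerForPrimary();
        return bucketOnB.waitForPrimary(100);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("stale elector, primary chosen: " + run(false));
        System.out.println("elector cleared, primary chosen: " + run(true));
    }
}
```

With the stale elector left in place, `run(false)` times out without a primary, which corresponds to threads stuck waiting for a primary member. In `run(true)` the elector reference is cleared before the bucket is initialized, so the hosting server volunteers.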
[jira] [Updated] (GEODE-5307) Hang with servers all in waitForPrimaryMember and one server in NO_PRIMARY_HOSTING state
[ https://issues.apache.org/jira/browse/GEODE-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated GEODE-5307:
--
    Labels: pull-request-available  (was: )

> Hang with servers all in waitForPrimaryMember and one server in NO_PRIMARY_HOSTING state
>
>                 Key: GEODE-5307
>                 URL: https://issues.apache.org/jira/browse/GEODE-5307
[jira] [Updated] (GEODE-5307) Hang with servers all in waitForPrimaryMember and one server in NO_PRIMARY_HOSTING state
[ https://issues.apache.org/jira/browse/GEODE-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce Schuchardt updated GEODE-5307:

    Affects Version/s: 1.1.0
                       1.2.0
                       1.3.0
                       1.2.1
                       1.4.0
                       1.5.0
                       1.6.0

> Hang with servers all in waitForPrimaryMember and one server in NO_PRIMARY_HOSTING state
>
>                 Key: GEODE-5307
>                 URL: https://issues.apache.org/jira/browse/GEODE-5307