[jira] [Updated] (GEODE-8200) Rebalance operations stuck in "IN_PROGRESS" state forever
     [ https://issues.apache.org/jira/browse/GEODE-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated GEODE-8200:
----------------------------------
    Labels: GeodeOperationAPI blocks-1.15.0 pull-request-available  (was: GeodeOperationAPI blocks-1.15.0)

> Rebalance operations stuck in "IN_PROGRESS" state forever
> ----------------------------------------------------------
>
>                 Key: GEODE-8200
>                 URL: https://issues.apache.org/jira/browse/GEODE-8200
>             Project: Geode
>          Issue Type: Bug
>          Components: management
>    Affects Versions: 1.14.0, 1.15.0
>            Reporter: Aaron Lindsey
>            Assignee: Anilkumar Gingade
>            Priority: Major
>              Labels: GeodeOperationAPI, blocks-1.15.0, pull-request-available
>         Attachments: GEODE-8200-exportedLogs.zip
>
>
> We use the management REST API to call rebalance immediately before stopping
> a server to limit the possibility of data loss. In a cluster with 3 locators,
> 3 servers, and no regions, we noticed that sometimes the rebalance operation
> never ends if one of the locators is restarting concurrently with the
> rebalance operation.
>
> More specifically, the scenario where we see this issue crop up is during an
> automated "rolling restart" operation in a Kubernetes environment which
> proceeds as follows:
> * At most one locator and one server are restarting at any point in time
> * Each locator/server waits until the previous locator/server is fully online
>   before restarting
> * Immediately before stopping a server, a rebalance operation is performed
>   and the server is not stopped until the rebalance operation is completed
>
> The impact of this issue is that the "rolling restart" operation will never
> complete, because it cannot proceed with stopping a server until the
> rebalance operation is completed. A human is then required to intervene and
> manually trigger a rebalance and stop the server. This type of "rolling
> restart" operation is triggered fairly often in Kubernetes: any time part of
> the configuration of the locators or servers changes.
>
> The following JSON is a sample response from the management REST API that
> shows the rebalance operation stuck in "IN_PROGRESS".
> {code}
> {
>   "statusCode": "IN_PROGRESS",
>   "links": {
>     "self": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances/a47f23c8-02b3-443c-a367-636fd6921ea7",
>     "list": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances"
>   },
>   "operationStart": "2020-05-27T22:38:30.619Z",
>   "operationId": "a47f23c8-02b3-443c-a367-636fd6921ea7",
>   "operation": {
>     "simulate": false
>   }
> }
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
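For readers automating the rolling restart described in this issue, the sketch below shows one way a client could avoid waiting forever on an operation that never leaves "IN_PROGRESS". It is not part of Geode and not the fix for this bug; it only bounds the wait on the caller's side. The function names, the default deadline, and the poll interval are all hypothetical, and any authentication the management REST API may require is omitted. The only things taken from the report are the "statusCode" field and the "self" link of the operation resource.

```python
import json
import time
import urllib.request
from typing import Callable, Optional


def fetch_operation(url: str, timeout_s: float = 10.0) -> dict:
    """GET an operation resource (its "self" link) and decode the JSON body.

    Hypothetical helper: assumes the endpoint is reachable without credentials.
    """
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return json.load(resp)


def wait_for_rebalance(
    fetch: Callable[[], dict],
    deadline_s: float = 300.0,        # assumed value, tune per cluster
    poll_interval_s: float = 5.0,     # assumed value
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Poll until statusCode is no longer IN_PROGRESS.

    Returns the final operation document, or None if the deadline passes
    while the operation still reports IN_PROGRESS.
    """
    give_up_at = clock() + deadline_s
    while True:
        op = fetch()
        if op.get("statusCode") != "IN_PROGRESS":
            return op
        if clock() >= give_up_at:
            return None
        sleep(poll_interval_s)
```

A caller would pass something like `lambda: fetch_operation(self_url)` as `fetch`, where `self_url` is the "self" link returned when the rebalance was started. A `None` result means the deadline passed while the operation was still reported as in progress; what to do next (alert an operator, start a new rebalance, or abort the restart) is left to the caller.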
[jira] [Updated] (GEODE-8200) Rebalance operations stuck in "IN_PROGRESS" state forever
     [ https://issues.apache.org/jira/browse/GEODE-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nabarun Nag updated GEODE-8200:
-------------------------------
    Affects Version/s: 1.15.0
                       1.14.0

> Rebalance operations stuck in "IN_PROGRESS" state forever
> ----------------------------------------------------------
>
>                 Key: GEODE-8200
>                 URL: https://issues.apache.org/jira/browse/GEODE-8200
>             Project: Geode
>          Issue Type: Bug
>          Components: management
>    Affects Versions: 1.14.0, 1.15.0
>            Reporter: Aaron Lindsey
>            Assignee: Jianxia Chen
>            Priority: Major
>              Labels: GeodeOperationAPI, blocks-1.15.0
>         Attachments: GEODE-8200-exportedLogs.zip
[jira] [Updated] (GEODE-8200) Rebalance operations stuck in "IN_PROGRESS" state forever
     [ https://issues.apache.org/jira/browse/GEODE-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nabarun Nag updated GEODE-8200:
-------------------------------
    Fix Version/s:     (was: 1.13.1)
                       (was: 1.14.0)

> Rebalance operations stuck in "IN_PROGRESS" state forever
> ----------------------------------------------------------
>
>                 Key: GEODE-8200
>                 URL: https://issues.apache.org/jira/browse/GEODE-8200
>             Project: Geode
>          Issue Type: Bug
>          Components: management
>            Reporter: Aaron Lindsey
>            Assignee: Jianxia Chen
>            Priority: Major
>              Labels: GeodeOperationAPI, blocks-1.15.0
>         Attachments: GEODE-8200-exportedLogs.zip
[jira] [Updated] (GEODE-8200) Rebalance operations stuck in "IN_PROGRESS" state forever
     [ https://issues.apache.org/jira/browse/GEODE-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anilkumar Gingade updated GEODE-8200:
-------------------------------------
    Labels: GeodeOperationAPI blocks-1.15.0  (was: GeodeOperationAPI)

> Rebalance operations stuck in "IN_PROGRESS" state forever
> ----------------------------------------------------------
>
>                 Key: GEODE-8200
>                 URL: https://issues.apache.org/jira/browse/GEODE-8200
>             Project: Geode
>          Issue Type: Bug
>          Components: management
>            Reporter: Aaron Lindsey
>            Assignee: Jianxia Chen
>            Priority: Major
>              Labels: GeodeOperationAPI, blocks-1.15.0
>             Fix For: 1.13.1, 1.14.0
>
>         Attachments: GEODE-8200-exportedLogs.zip
[jira] [Updated] (GEODE-8200) Rebalance operations stuck in "IN_PROGRESS" state forever
     [ https://issues.apache.org/jira/browse/GEODE-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen Nichols updated GEODE-8200:
--------------------------------
    Fix Version/s: 1.13.1

> Rebalance operations stuck in "IN_PROGRESS" state forever
> ----------------------------------------------------------
>
>                 Key: GEODE-8200
>                 URL: https://issues.apache.org/jira/browse/GEODE-8200
>             Project: Geode
>          Issue Type: Bug
>          Components: management
>            Reporter: Aaron Lindsey
>            Assignee: Jianxia Chen
>            Priority: Major
>              Labels: GeodeOperationAPI
>             Fix For: 1.14.0, 1.13.1
>
>         Attachments: GEODE-8200-exportedLogs.zip
[jira] [Updated] (GEODE-8200) Rebalance operations stuck in "IN_PROGRESS" state forever
     [ https://issues.apache.org/jira/browse/GEODE-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jinmei Liao updated GEODE-8200:
-------------------------------
    Fix Version/s: 1.14.0

> Rebalance operations stuck in "IN_PROGRESS" state forever
> ----------------------------------------------------------
>
>                 Key: GEODE-8200
>                 URL: https://issues.apache.org/jira/browse/GEODE-8200
>             Project: Geode
>          Issue Type: Bug
>          Components: management
>            Reporter: Aaron Lindsey
>            Assignee: Jianxia Chen
>            Priority: Major
>              Labels: GeodeOperationAPI
>             Fix For: 1.14.0
>
>         Attachments: GEODE-8200-exportedLogs.zip
[jira] [Updated] (GEODE-8200) Rebalance operations stuck in "IN_PROGRESS" state forever
     [ https://issues.apache.org/jira/browse/GEODE-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Lindsey updated GEODE-8200:
---------------------------------
    Attachment: GEODE-8200-exportedLogs.zip

> Rebalance operations stuck in "IN_PROGRESS" state forever
> ----------------------------------------------------------
>
>                 Key: GEODE-8200
>                 URL: https://issues.apache.org/jira/browse/GEODE-8200
>             Project: Geode
>          Issue Type: Bug
>          Components: management
>            Reporter: Aaron Lindsey
>            Assignee: Jianxia Chen
>            Priority: Major
>              Labels: GeodeOperationAPI
>         Attachments: GEODE-8200-exportedLogs.zip
[jira] [Updated] (GEODE-8200) Rebalance operations stuck in "IN_PROGRESS" state forever
     [ https://issues.apache.org/jira/browse/GEODE-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anilkumar Gingade updated GEODE-8200:
-------------------------------------
    Labels: GeodeOperationAPI  (was: )

> Rebalance operations stuck in "IN_PROGRESS" state forever
> ----------------------------------------------------------
>
>                 Key: GEODE-8200
>                 URL: https://issues.apache.org/jira/browse/GEODE-8200
>             Project: Geode
>          Issue Type: Bug
>          Components: management
>            Reporter: Aaron Lindsey
>            Priority: Major
>              Labels: GeodeOperationAPI