[ https://issues.apache.org/jira/browse/CASSSIDECAR-266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18005216#comment-18005216 ]
Yifan Cai commented on CASSSIDECAR-266:
---------------------------------------

{quote}In the scenarios above the intent would not be cleared right ?
{quote}
Right. The intent field won't be cleared if the intent is not fulfilled.

In the case of an operation failure, it would be helpful if the operation status response included the result and asked for the operator's intervention. The message field sort of indicates the result, but it is arbitrary text that is hard to parse programmatically. A dedicated field would help.

A sample failure response could look like this:
{code:java}
{
  "state": "up",
  "intent": "down",
  "result": "failed",
  "message": "Unsafe to stop currently because there are not enough replicas available"
}
{code}
WDYT?

> Implement lifecycle APIs for safely stopping, starting, and restarting local
> Cassandra instances
> -------------------------------------------------------------------------------------------------
>
>                 Key: CASSSIDECAR-266
>                 URL: https://issues.apache.org/jira/browse/CASSSIDECAR-266
>             Project: Sidecar for Apache Cassandra
>          Issue Type: New Feature
>          Components: Rest API
>            Reporter: Andres Beck-Ruiz
>            Assignee: Andres Beck-Ruiz
>            Priority: Normal
>
> We would like to implement APIs to safely stop, start, and restart local
> connected Cassandra instances through Cassandra Sidecar in a generic way.
> This could lead to future work to implement Cassandra native rolling restarts
> in Sidecar and automate the Cassandra upgrade process.
> We propose implementing an {{AbstractLifecycleOperationsHandler}} interface
> that defines start, stop, restart, and status endpoints to allow Sidecar
> operators to implement their own lifecycle handlers, depending on how they
> host their Cassandra processes. To provide a default implementation, we would
> create a {{LocalProcessLifecycleOperationsHandler}} to implement this
> interface and provide lifecycle operations for OS native Cassandra processes.
> This could be defined as the default lifecycle manager in
> {{sidecar.yaml}}, disabled by default.
> We propose the following APIs, leveraging the {{OperationalJob}} interface to
> provide support for async non-blocking jobs. We will use the existing
> implemented {{OperationalJobRoute}},
> {{/api/v1/cassandra/operational-jobs/:operationId}}, to track the status of
> these jobs. These endpoints will live under a {{/node}} path to specify
> operations on the local connected Cassandra instance, allowing for future
> development of lifecycle endpoints for an entire Cassandra cluster:
> h5. *GET /api/v1/cassandra/operations/lifecycle/node/status*
> Gets the status of whether the local Cassandra process is running.
> h6. Response
> * 200 OK
> ** {{cassandra_running :: bool}}
> * 500 Internal Server Error
> ** {{error :: string}}
> h5. *POST /api/v1/cassandra/operations/lifecycle/node/start*
> Start the connected Cassandra process. This request will succeed if the
> process is already started to ensure idempotency.
> h6. Parameters
> * {{block :: boolean (default False)}}
> h6. Response
> * 202 Accepted
> ** {{operationId :: string}}
> * 500 Internal Server Error
> ** {{error :: string}}
> h5. *POST /api/v1/cassandra/operations/lifecycle/node/stop*
> Stop the connected Cassandra process after a pluggable health check passes.
> This request will succeed if the process is already stopped to ensure
> idempotency.
> h6. Parameters
> * {{block :: boolean (default False)}}
> * {{skipHealthCheck :: boolean (default False)}}
> h6. Response
> * 202 Accepted
> ** {{operationId :: string}}
> * 412 Precondition Failed
> ** {{error :: string (health check fails)}}
> * 500 Internal Server Error
> ** {{error :: string}}
> h5. *POST /api/v1/cassandra/operations/lifecycle/node/restart*
> Restart the connected Cassandra process after a pluggable health check
> passes.
> h6. Parameters
> * {{block :: boolean (default False)}}
> * {{skipHealthCheck :: boolean (default False)}}
> h6. Response
> * 202 Accepted
> ** {{operationId :: string}}
> * 412 Precondition Failed
> ** {{error :: string (health check fails)}}
> * 500 Internal Server Error
> ** {{error :: string}}
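
To illustrate why a dedicated {{result}} field is easier to consume than free-text {{message}}, here is a minimal Java sketch of how a client might act on the sample failure response from the comment. Everything in it is an assumption made for illustration: the class names {{LifecycleStatusSketch}} and {{LifecycleStatus}}, the {{Result}} enum and its {{PENDING}} and {{SUCCEEDED}} values (only "failed" appears in the comment), and the rule that an unfulfilled intent with a failed result means the operator has to step in. It is not the Sidecar implementation.

```java
public class LifecycleStatusSketch {

    /** Hypothetical values for the proposed "result" field; the comment only shows "failed". */
    public enum Result { PENDING, SUCCEEDED, FAILED }

    /** Mirrors the four fields of the sample response: state, intent, result, message. */
    public static final class LifecycleStatus {
        final String state;
        final String intent;
        final Result result;
        final String message;

        public LifecycleStatus(String state, String intent, Result result, String message) {
            this.state = state;
            this.intent = intent;
            this.result = result;
            this.message = message;
        }

        /** The intent is fulfilled once the observed state matches it. */
        public boolean intentFulfilled() {
            return state.equals(intent);
        }

        /**
         * Assumed rule: the operator must act when the intent is still
         * outstanding and the last attempt to fulfil it failed.
         * The decision uses only structured fields, never the message text.
         */
        public boolean needsOperator() {
            return !intentFulfilled() && result == Result.FAILED;
        }
    }

    public static void main(String[] args) {
        // Values taken from the sample failure response in the comment.
        LifecycleStatus status = new LifecycleStatus(
                "up", "down", Result.FAILED,
                "Unsafe to stop currently because there are not enough replicas available");

        if (status.needsOperator()) {
            // The message is shown to a human; it is not parsed.
            System.out.println("Operator action required: " + status.message);
        }
    }
}
```

With this shape, the free-text message stays purely informational, and a client can branch on {{result}} without pattern-matching on wording that may change.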