[ 
https://issues.apache.org/jira/browse/CASSSIDECAR-266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18003903#comment-18003903
 ] 

Andres Beck-Ruiz commented on CASSSIDECAR-266:
----------------------------------------------

After discussing with my team, we would like to propose declarative, RESTful 
lifecycle APIs that follow the principles of existing Sidecar APIs and allow 
this functionality to be asynchronous. This also aligns with the original 
proposal outlined in 
[CEP-1|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652224].
The API would be intent-based: lifecycle handlers update and retrieve intent 
from an {{IntentManager}}, which sends stop and start actions to the 
{{AbstractLifecycleManager}} whenever intent is updated, and periodically for 
as long as intent does not match state. We will limit the scope of this 
ticket to two APIs:
h5. *POST /api/v1/cassandra/lifecycle/<nodeId>*

Specifies the intent for a Cassandra node to be up or down. The request 
succeeds as soon as the intent for the node has been recorded.
h6. Parameters
 * {{state :: string <up|down>}}

h6. Response
 * 202 Accepted
 * 412 Precondition Failed (for example, lifecycle is not enabled or the 
health check fails)
 ** {{error :: string}}

 * 500 Internal Server Error
 ** {{error :: string}}
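
As a rough illustration of the POST semantics above, a handler might map outcomes to status codes along these lines. This is only a sketch in Python, not Sidecar code: {{IntentStore}}, {{post_lifecycle}}, the {{healthy}} flag, and the 400 response for an invalid {{state}} are all assumptions made for the example and are not part of the proposal.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

VALID_STATES = {"up", "down"}


@dataclass
class IntentStore:
    """Hypothetical in-memory stand-in for the proposed IntentManager."""
    lifecycle_enabled: bool = True
    intents: Dict[str, str] = field(default_factory=dict)

    def record(self, node_id: str, state: str) -> None:
        self.intents[node_id] = state


def post_lifecycle(store: IntentStore, node_id: str, state: str,
                   healthy: bool = True) -> Tuple[int, dict]:
    """Return (status_code, body) for POST /api/v1/cassandra/lifecycle/<nodeId>."""
    if state not in VALID_STATES:
        # Assumption: the proposal lists no status for malformed input; 400 is a guess.
        return 400, {"error": "state must be 'up' or 'down'"}
    if not store.lifecycle_enabled:
        return 412, {"error": "lifecycle management is not enabled"}
    if not healthy:
        return 412, {"error": "health check failed"}
    try:
        store.record(node_id, state)
    except Exception as exc:  # any failure to record intent surfaces as a 500
        return 500, {"error": str(exc)}
    # 202: the intent was recorded; the node may not have reached that state yet.
    return 202, {}
```

Note that a 202 here only confirms that the intent was stored, which is what keeps the call asynchronous.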

h5. *GET /api/v1/cassandra/lifecycle/<nodeId>*

Gets the state and intent of a Cassandra node, including an optional message 
explaining why these do not match (e.g., why a stop or start action failed).
h6. Response
 * 200 OK
 ** {{state :: string <up|down>}}
 ** {{intent :: string <up|down>}}
 ** {{message :: string (optional)}}

 * 404 Not Found
 ** {{error :: string (Node <nodeId> does not exist)}}
 * 500 Internal Server Error
 ** {{error :: string}}
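
To make the intent/state split concrete, here is a minimal Python sketch of how an intent manager could reconcile the two and build the GET response. It is an illustration under assumptions, not the proposed implementation: the {{NodeRecord}} type, the {{start}}/{{stop}} callables standing in for the lifecycle manager, and the message wording are all hypothetical, and state is updated synchronously for simplicity rather than observed from a real process.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class NodeRecord:
    state: str                     # observed state: "up" or "down"
    intent: str                    # desired state: "up" or "down"
    message: Optional[str] = None  # set only while intent and state differ


@dataclass
class IntentManager:
    """Hypothetical sketch; the real IntentManager may look quite different."""
    start: Callable[[str], None]   # stand-in for the lifecycle manager's start action
    stop: Callable[[str], None]    # stand-in for the lifecycle manager's stop action
    nodes: Dict[str, NodeRecord] = field(default_factory=dict)

    def set_intent(self, node_id: str, intent: str) -> None:
        self.nodes[node_id].intent = intent
        self.reconcile(node_id)    # act immediately on update

    def reconcile(self, node_id: str) -> None:
        """Meant to also run periodically while intent does not match state."""
        node = self.nodes[node_id]
        if node.state == node.intent:
            node.message = None
            return
        verb = "start" if node.intent == "up" else "stop"
        action = self.start if node.intent == "up" else self.stop
        try:
            action(node_id)
            node.state = node.intent  # simplification: assume the action took effect
            node.message = None
        except Exception as exc:
            node.message = f"{verb} failed: {exc}"

    def get_lifecycle(self, node_id: str) -> Tuple[int, dict]:
        """Return (status_code, body) for GET /api/v1/cassandra/lifecycle/<nodeId>."""
        node = self.nodes.get(node_id)
        if node is None:
            return 404, {"error": f"Node {node_id} does not exist"}
        body = {"state": node.state, "intent": node.intent}
        if node.message:
            body["message"] = node.message
        return 200, body
```

In this sketch a failed stop leaves {{state}} as {{up}} and {{intent}} as {{down}} with a {{message}} attached, and a later {{reconcile}} call retries the action.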

In the future, we are planning to build a restart API on top of this basic 
start/stop lifecycle API, which would allow restarting anywhere from a single 
node up to an entire cluster. 

Please let me know your thoughts; any feedback is appreciated.

> Implement lifecycle APIs for safely stopping, starting, and restarting local 
> Cassandra instances 
> -------------------------------------------------------------------------------------------------
>
>                 Key: CASSSIDECAR-266
>                 URL: https://issues.apache.org/jira/browse/CASSSIDECAR-266
>             Project: Sidecar for Apache Cassandra
>          Issue Type: New Feature
>          Components: Rest API
>            Reporter: Andres Beck-Ruiz
>            Priority: Normal
>
> We would like to implement APIs to safely stop, start, and restart local 
> connected Cassandra instances through Cassandra Sidecar in a generic way. 
> This could lead to future work to implement Cassandra native rolling restarts 
> in Sidecar and automate the Cassandra upgrade process. 
> We propose implementing an {{AbstractLifecycleOperationsHandler}} interface 
> that defines start, stop, restart, and status endpoints to allow Sidecar 
> operators to implement their own lifecycle handlers, depending on how they 
> host their Cassandra processes. To provide a default implementation, we would 
> create a {{LocalProcessLifecycleOperationsHandler}} to implement this 
> interface and provide lifecycle operations for OS native Cassandra processes. 
> This could be defined as the default lifecycle manager in 
> {{sidecar.yaml}}, disabled by default.
> We propose the following APIs, leveraging the {{OperationalJob}} interface to 
> provide support for async non-blocking jobs. We will use the existing 
> implemented {{OperationalJobRoute}}, 
> {{/api/v1/cassandra/operational-jobs/:operationId}}, to track the status of 
> these jobs. These endpoints will live under a {{/node}} path to specify 
> operations on the local connected Cassandra instance, allowing for future 
> development of lifecycle endpoints for an entire Cassandra cluster:
> h5. *GET /api/v1/cassandra/operations/lifecycle/node/status*
> Gets the status of whether the local Cassandra process is running. 
> h6. Response
>  * 200 OK
>  ** {{cassandra_running :: bool}}
>  * 500 Internal Server Error
>  ** {{error :: string}}
> h5. *POST /api/v1/cassandra/operations/lifecycle/node/start*
> Start the connected Cassandra process. This request will succeed if the 
> process is already started to ensure idempotency.
> h6. Parameters
>  * {{block :: boolean (default False)}}
> h6. Response
>  * 202 Accepted
>  ** {{operationId :: string}}
>  * 500 Internal Server Error
>  ** {{error :: string}}
> h5. *POST /api/v1/cassandra/operations/lifecycle/node/stop*
> Stop the connected Cassandra process after a pluggable health check passes. 
> This request will succeed if the process is already stopped to ensure 
> idempotency. 
> h6. Parameters
>  * {{block :: boolean (default False)}}
>  * {{skipHealthCheck :: boolean (default False)}}
> h6. Response
>  * 202 Accepted
>  ** {{operationId :: string}}
>  * 412 Precondition Failed
>  ** {{error :: string (health check fails)}}
>  * 500 Internal Server Error
>  ** {{error :: string}}
> h5. *POST /api/v1/cassandra/operations/lifecycle/node/restart*
> Restart the connected Cassandra process after a pluggable health check 
> passes. 
> h6. Parameters
>  * {{block :: boolean (default False)}}
>  * {{skipHealthCheck :: boolean (default False)}}
> h6. Response
>  * 202 Accepted
>  ** {{operationId :: string}}
>  * 412 Precondition Failed
>  ** {{error :: string (health check fails)}}
>  * 500 Internal Server Error
>  ** {{error :: string}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
