[ 
https://issues.apache.org/jira/browse/CASSSIDECAR-266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18016644#comment-18016644
 ] 

Andres Beck-Ruiz commented on CASSSIDECAR-266:
----------------------------------------------

I just opened up a patch, which is available for review here: 
https://github.com/apache/cassandra-sidecar/pull/256

This patch implements the {{InJVMDLifecycleProvider}} for JVM dtests and 
provides a skeleton of a {{ProccessLifecycleProvider}} for managing Cassandra 
instances that are hosted as OS processes. To keep the review manageable, the 
process-based provider will be implemented and tested in a separate patch. 
This patch also adds a new {{lifecycle}} section to {{sidecar.yaml}} to 
allow for pluggable lifecycle providers, which implement the generic 
{{LifecycleProvider}} interface.
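
To illustrate what "pluggable" could mean here, the following is a minimal sketch and not the interface from the patch: the method names, the {{instanceId}} parameter, and the {{InMemoryLifecycleProvider}} class are all hypothetical, and the real {{LifecycleProvider}} may look quite different.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical shape of a pluggable provider. The real LifecycleProvider
// interface in the patch may declare different methods and parameters.
interface LifecycleProvider {
    void start(String instanceId);
    void stop(String instanceId);
    boolean isRunning(String instanceId);
}

// Toy provider that only records state in memory. It stands in for a real
// implementation that would manage in-JVM or OS-process instances.
class InMemoryLifecycleProvider implements LifecycleProvider {
    private final Map<String, Boolean> running = new ConcurrentHashMap<>();

    @Override
    public void start(String instanceId) {
        running.put(instanceId, true);
    }

    @Override
    public void stop(String instanceId) {
        running.put(instanceId, false);
    }

    @Override
    public boolean isRunning(String instanceId) {
        // Instances that were never started are treated as not running.
        return running.getOrDefault(instanceId, false);
    }
}
```

Callers would depend only on the interface, so swapping one provider for another would not change the calling code.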

The API was updated as follows:  

1) Some cosmetic changes to lifecycle information returned:
 * {{"state"}} -> {{"current_state"}}
 * {{"intent"}} -> {{"desired_state"}}
 * {{"result"}} -> {{"status"}}
 * {{"message"}} -> {{"last_update"}}

Lifecycle states are {{RUNNING}} or {{STOPPED}}. These names avoid 
confusion with the UP and DOWN states from nodetool status, as a DOWN node 
does not necessarily mean that a Cassandra process is not running.

Lifecycle status (formerly result) can now be {{CONVERGED}}, 
{{CONVERGING}}, {{DIVERGED}}, or {{UNDEFINED}} (the default value when a 
desired state has not been submitted yet).
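
As a rough sketch of how the renamed fields and the new state and status values fit together, the model below serializes a lifecycle record with the new keys. The {{State}}, {{Status}}, and {{LifecycleInfo}} types are illustrative names, not classes from the patch.

```java
// Hypothetical model of the lifecycle response; type names are illustrative.
enum State { RUNNING, STOPPED }

enum Status { CONVERGED, CONVERGING, DIVERGED, UNDEFINED }

record LifecycleInfo(State currentState, State desiredState, Status status, String lastUpdate) {
    // Serializes with the renamed keys. JSON escaping of lastUpdate and
    // null handling are omitted to keep the sketch short.
    String toJson() {
        return String.format(
            "{\"current_state\":\"%s\",\"desired_state\":\"%s\",\"status\":\"%s\",\"last_update\":\"%s\"}",
            currentState, desiredState, status, lastUpdate);
    }
}
```

For example, {{new LifecycleInfo(State.STOPPED, State.RUNNING, Status.CONVERGING, "Starting instance").toJson()}} produces a body with the keys {{current_state}}, {{desired_state}}, {{status}}, and {{last_update}}.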

2) {{PUT /api/v1/cassandra/lifecycle}} now also returns the current lifecycle 
information to a user when the desired state has been updated. Furthermore, if 
the desired state submitted is the same as the one that exists, a 200 OK will 
be returned instead of 202 ACCEPTED:
{code:java}
curl -v -XPUT http://sidecar:9043/api/v1/cassandra/lifecycle -d '{"state": "start"}'

{"current_state":"STOPPED","desired_state":"RUNNING","status":"CONVERGING","last_update":"Starting instance"}
{code}
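
The 200 versus 202 rule from point 2 can be sketched as a small decision function. This is an assumption-laden illustration: {{PutLifecycleSketch}} and its methods are hypothetical, and the {{"stop"}} request value is an assumed counterpart to the {{"start"}} value shown in the example.

```java
// Hypothetical helper; not code from the patch.
enum State { RUNNING, STOPPED }

final class PutLifecycleSketch {
    // Maps the request body's "state" value to a desired state.
    // Only "start" appears in the example; "stop" is an assumed counterpart.
    static State parseDesired(String requested) {
        switch (requested) {
            case "start":
                return State.RUNNING;
            case "stop":
                return State.STOPPED;
            default:
                throw new IllegalArgumentException("Unknown state: " + requested);
        }
    }

    // Returns 200 when the submitted desired state matches the existing one,
    // and 202 when a new desired state is accepted. existingDesired is null
    // when no desired state has been submitted yet.
    static int statusCode(State existingDesired, State submittedDesired) {
        return submittedDesired == existingDesired ? 200 : 202;
    }
}
```

Under these assumptions, a first {{"start"}} request yields 202, and repeating the same request yields 200.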
3) Finally, as we discussed, the result and message (now {{"status"}} and 
{{"last_update"}}) are changed when the state of a Cassandra node diverges 
from the desired state (e.g., the node dies). Here's an example:

C* node running:
{code:java}
curl http://sidecar:9043/api/v1/cassandra/lifecycle

{"current_state":"RUNNING","desired_state":"RUNNING","status":"CONVERGED","last_update":"Instance has started"}
{code}
C* node dies:
{code:java}
curl http://sidecar:9043/api/v1/cassandra/lifecycle

{"current_state":"RUNNING","desired_state":"STOPPED","status":"DIVERGED","last_update":"Instance cassandra123 has unexpectedly diverged from the desired state RUNNING to STOPPED"}
{code}
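
One way to think about how the status follows from the current and desired states is sketched below. This is not the logic from the patch: {{StatusSketch}} and the {{transitionInProgress}} flag are assumptions introduced to separate an in-flight start or stop from an unexpected change.

```java
// Hypothetical derivation of the lifecycle status; not code from the patch.
enum State { RUNNING, STOPPED }

enum Status { CONVERGED, CONVERGING, DIVERGED, UNDEFINED }

final class StatusSketch {
    // desired == null means no desired state has been submitted yet.
    // transitionInProgress is an assumed flag that is true while a requested
    // start or stop is still being carried out.
    static Status derive(State current, State desired, boolean transitionInProgress) {
        if (desired == null) {
            return Status.UNDEFINED;
        }
        if (current == desired) {
            return Status.CONVERGED;
        }
        // The states differ: either the requested change is still under way,
        // or the instance changed state without being asked to.
        return transitionInProgress ? Status.CONVERGING : Status.DIVERGED;
    }
}
```

With this rule, a node that dies while no transition is in flight is reported as {{DIVERGED}} because its current state no longer matches the desired one.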

> Add lifecycle APIs for starting and stopping Cassandra
> ------------------------------------------------------
>
>                 Key: CASSSIDECAR-266
>                 URL: https://issues.apache.org/jira/browse/CASSSIDECAR-266
>             Project: Sidecar for Apache Cassandra
>          Issue Type: New Feature
>          Components: Rest API
>            Reporter: Andres Beck-Ruiz
>            Assignee: Andres Beck-Ruiz
>            Priority: Normal
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We would like to implement APIs to safely stop, start, and restart local 
> connected Cassandra instances through Cassandra Sidecar in a generic way. 
> This could lead to future work to implement Cassandra native rolling restarts 
> in Sidecar and automate the Cassandra upgrade process. 
> We propose implementing an {{AbstractLifecycleOperationsHandler}} interface 
> that defines start, stop, restart, and status endpoints to allow Sidecar 
> operators to implement their own lifecycle handlers, depending on how they 
> host their Cassandra processes. To provide a default implementation, we would 
> create a {{LocalProcessLifecycleOperationsHandler}} to implement this 
> interface and provide lifecycle operations for OS native Cassandra processes. 
> This could be defined as the default lifecycle manager in 
> {{sidecar.yaml}}, disabled by default.
> We propose the following APIs, leveraging the {{OperationalJob}} interface to 
> provide support for async non-blocking jobs. We will use the existing 
> implemented {{OperationalJobRoute}}, 
> {{/api/v1/cassandra/operational-jobs/:operationId}}, to track the status of 
> these jobs. These endpoints will live under a {{/node}} path to specify 
> operations on the local connected Cassandra instance, allowing for future 
> development of lifecycle endpoints for an entire Cassandra cluster:
> h5. *GET /api/v1/cassandra/operations/lifecycle/node/status*
> Gets the status of whether the local Cassandra process is running. 
> h6. Response
>  * 200 OK
>  ** {{cassandra_running :: bool}}
>  * 500 Internal Server Error
>  ** {{error :: string}}
> h5. *POST /api/v1/cassandra/operations/lifecycle/node/start*
> Start the connected Cassandra process. This request will succeed if the 
> process is already started to ensure idempotency.
> h6. Parameters
>  * {{block :: boolean (default False)}}
> h6. Response
>  * 202 Accepted
>  ** {{operationId :: string}}
>  * 500 Internal Server Error
>  ** {{error :: string}}
> h5. *POST /api/v1/cassandra/operations/lifecycle/node/stop*
> Stop the connected Cassandra process after a pluggable health check passes. 
> This request will succeed if the process is already stopped to ensure 
> idempotency. 
> h6. Parameters
>  * {{block :: boolean (default False)}}
>  * {{skipHealthCheck :: boolean (default False)}}
> h6. Response
>  * 202 Accepted
>  ** {{operationId :: string}}
>  * 412 Precondition Failed
>  ** {{error :: string (health check fails)}}
>  * 500 Internal Server Error
>  ** {{error :: string}}
> h5. *POST /api/v1/cassandra/operations/lifecycle/node/restart*
> Restart the connected Cassandra process after a pluggable health check 
> passes. 
> h6. Parameters
>  * {{block :: boolean (default False)}}
>  * {{skipHealthCheck :: boolean (default False)}}
> h6. Response
>  * 202 Accepted
>  ** {{operationId :: string}}
>  * 412 Precondition Failed
>  ** {{error :: string (health check fails)}}
>  * 500 Internal Server Error
>  ** {{error :: string}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
