[jira] [Commented] (CASSSIDECAR-266) Implement lifecycle APIs for safely stopping, starting, and restarting local Cassandra instances

Paulo Motta (Jira) Fri, 11 Jul 2025 07:27:34 -0700


    [ 
https://issues.apache.org/jira/browse/CASSSIDECAR-266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18004710#comment-18004710
 ]


Paulo Motta commented on CASSSIDECAR-266:
-----------------------------------------

{quote}If you were suggesting that the intent field can be left empty, I would 
be against the idea. The intent should be always specified explicitly.
{quote}
I understood Andres to be suggesting that the initial intent would be "UP" when 
the node is initialized, but I think it would actually be +empty+ if the API was 
never called? Otherwise it would be fetched from storage, since we would persist 
the last intent. Andres can probably confirm what he meant.
{quote}Do you mean a scenario where the operator just places the binary somewhere 
and calls the API to start the Cassandra node? In this case, if the API demands 
an explicit intent, the autostart configuration becomes unnecessary.
{quote}
I meant the operator could use this setting to override the initial intent to 
"UP" instead of empty, so that sidecar would start the Cassandra process 
automatically during initialization. However, if the persisted intent is "DOWN", 
this configuration would override the expressed intent, which I think would be 
wrong. Based on this, I think we shouldn't add this configuration: the intent 
should be empty if there's no persisted intent, and the persisted intent should 
be the one honored during sidecar initialization.
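To make the rule above concrete, here is a minimal sketch of how the startup decision could look. The names {{IntentResolver}}, {{Intent}} and {{shouldStartOnInit}} are hypothetical and only illustrate the idea; an empty {{Optional}} stands for "the lifecycle API was never called, so nothing was persisted".

```java
import java.util.Optional;

/** Hypothetical helper; not part of Sidecar. */
final class IntentResolver {
    enum Intent { UP, DOWN }

    /**
     * Returns true only when a persisted intent exists, it is UP,
     * and the Cassandra process is not already running.
     * With no persisted intent, sidecar takes no action.
     */
    static boolean shouldStartOnInit(Optional<Intent> persistedIntent,
                                     boolean cassandraRunning) {
        return persistedIntent.isPresent()
                && persistedIntent.get() == Intent.UP
                && !cassandraRunning;
    }
}
```

With this shape there is no autostart setting to consult, so a persisted "DOWN" can never be overridden by configuration.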
{quote}For starting a C* node, how does sidecar know where to look up the 
binary and all the relevant configurations (maybe even supporting configuration 
overrides too)?
{quote}
I have been collaborating with Andres on this, and the idea is to have an 
{{AbstractLifecycleProvider}} with start/stop/is_running methods, leaving these 
details to the provider. For example, a Docker provider would use the Docker API 
and create mounts for the Cassandra configuration, an HTTP provider would call 
an API to do this, and so on.
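As a rough sketch of that contract (the method signatures and the {{instanceId}} parameter are assumptions, and {{InMemoryLifecycleProvider}} is a made-up stand-in used only to show how a provider plugs in):

```java
import java.util.HashSet;
import java.util.Set;

/** Sketch of the provider contract: start/stop/is_running, details left to subclasses. */
abstract class AbstractLifecycleProvider {
    abstract void start(String instanceId);
    abstract void stop(String instanceId);
    abstract boolean isRunning(String instanceId);
}

/**
 * Hypothetical in-memory provider. A real provider would talk to the OS,
 * the Docker API, or an HTTP endpoint instead of updating a set.
 */
final class InMemoryLifecycleProvider extends AbstractLifecycleProvider {
    private final Set<String> running = new HashSet<>();

    @Override
    void start(String instanceId) {
        running.add(instanceId); // no-op if already running
    }

    @Override
    void stop(String instanceId) {
        running.remove(instanceId); // no-op if already stopped
    }

    @Override
    boolean isRunning(String instanceId) {
        return running.contains(instanceId);
    }
}
```

Callers only depend on {{AbstractLifecycleProvider}}, so where the binary lives or how a process is located stays inside each implementation.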

For the default {{ProcessLifecycleProvider}}, we've been thinking of requiring 
a {{cassandra_home}} setting to locate the Cassandra binary, and a 
{{cassandra_conf}} directory for each managed instance that would be supplied 
as an environment variable during startup.
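A minimal sketch of how such a start command could be assembled, under two assumptions that are not settled in this thread: that the launcher is {{bin/cassandra}} under {{cassandra_home}}, and that the configuration directory is passed through the {{CASSANDRA_CONF}} environment variable. {{ProcessStartCommand}} is a hypothetical name.

```java
import java.nio.file.Path;

/** Hypothetical helper that only builds the command; the caller decides when to start it. */
final class ProcessStartCommand {

    /**
     * @param cassandraHome value of the cassandra_home setting
     * @param cassandraConf per-instance cassandra_conf directory
     */
    static ProcessBuilder build(Path cassandraHome, Path cassandraConf) {
        // Assumption: the launcher script lives at <cassandra_home>/bin/cassandra.
        String launcher = cassandraHome.resolve("bin").resolve("cassandra").toString();
        ProcessBuilder builder = new ProcessBuilder(launcher);
        // Assumption: the conf directory is handed over as CASSANDRA_CONF.
        builder.environment().put("CASSANDRA_CONF", cassandraConf.toString());
        // Merge stderr into stdout so one stream captures the startup output.
        builder.redirectErrorStream(true);
        return builder;
    }
}
```

Calling {{build(home, conf).start()}} would then launch one instance; a second instance reuses the same {{cassandra_home}} with a different conf directory.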
{quote}Stopping a C* node is more straightforward. But sidecar still needs to 
know where to look up the pid of the stopping
{quote}
PID-based stopping is an implementation detail of the 
{{ProcessLifecycleProvider}}; other providers may stop by other means, e.g. an 
HTTP or Docker interface. For the {{ProcessLifecycleProvider}}, the idea is to 
save a pidfile locally somewhere and perhaps retrieve the PID by process name 
if the pidfile is not found.
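The pidfile-then-process-name lookup could be sketched roughly as below, using the JDK's {{ProcessHandle}} API (Java 9+; {{Files.readString}} needs Java 11). {{PidFileLookup}}, its method names, and the {{marker}} string are hypothetical; what to match in the command line is an open choice.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

/** Hypothetical helper for locating and stopping a locally managed process. */
final class PidFileLookup {

    /** Reads the PID from the pidfile; empty if the file is missing or not a number. */
    static Optional<Long> readPid(Path pidFile) {
        try {
            return Optional.of(Long.parseLong(Files.readString(pidFile).trim()));
        } catch (IOException | NumberFormatException e) {
            return Optional.empty();
        }
    }

    /** Fallback: first live process whose command line contains the marker text. */
    static Optional<ProcessHandle> findByName(String marker) {
        return ProcessHandle.allProcesses()
                .filter(ph -> ph.info().commandLine()
                        .map(cmd -> cmd.contains(marker))
                        .orElse(false))
                .findFirst();
    }

    /** Prefers the pidfile; falls back to the name scan if no live process is found. */
    static Optional<ProcessHandle> locate(Path pidFile, String marker) {
        Optional<ProcessHandle> fromFile = readPid(pidFile)
                .flatMap(ProcessHandle::of)
                .filter(ProcessHandle::isAlive);
        return fromFile.isPresent() ? fromFile : findByName(marker);
    }

    /** Requests normal termination; returns false if nothing was found to stop. */
    static boolean requestStop(Path pidFile, String marker) {
        return locate(pidFile, marker).map(ProcessHandle::destroy).orElse(false);
    }
}
```

The name scan is best effort: the command line is not always visible to the calling user, and a loose marker could match an unrelated process, so the pidfile stays the primary source.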

Let me know what you think.

> Implement lifecycle APIs for safely stopping, starting, and restarting local 
> Cassandra instances 
> -------------------------------------------------------------------------------------------------
>
>                 Key: CASSSIDECAR-266
>                 URL: https://issues.apache.org/jira/browse/CASSSIDECAR-266
>             Project: Sidecar for Apache Cassandra
>          Issue Type: New Feature
>          Components: Rest API
>            Reporter: Andres Beck-Ruiz
>            Priority: Normal
>
> We would like to implement APIs to safely stop, start, and restart local 
> connected Cassandra instances through Cassandra Sidecar in a generic way. 
> This could lead to future work to implement Cassandra native rolling restarts 
> in Sidecar and automate the Cassandra upgrade process. 
> We propose implementing an {{AbstractLifecycleOperationsHandler}} interface 
> that defines start, stop, restart, and status endpoints to allow Sidecar 
> operators to implement their own lifecycle handlers, depending on how they 
> host their Cassandra processes. To provide a default implementation, we would 
> create a {{LocalProcessLifecycleOperationsHandler}} to implement this 
> interface and provide lifecycle operations for OS native Cassandra processes. 
> This could be defined as the default lifecycle manager in 
> {{sidecar.yaml}}, disabled by default.
> We propose the following APIs, leveraging the {{OperationalJob}} interface to 
> provide support for async non-blocking jobs. We will use the existing 
> implemented {{OperationalJobRoute}}, 
> {{/api/v1/cassandra/operational-jobs/:operationId}}, to track the status of 
> these jobs. These endpoints will live under a {{/node}} path to specify 
> operations on the local connected Cassandra instance, allowing for future 
> development of lifecycle endpoints for an entire Cassandra cluster:
> h5. *GET /api/v1/cassandra/operations/lifecycle/node/status*
> Gets the status of whether the local Cassandra process is running. 
> h6. Response
>  * 200 OK
>  ** {{cassandra_running :: bool}}
>  * 500 Internal Server Error
>  ** {{error :: string}}
> h5. *POST /api/v1/cassandra/operations/lifecycle/node/start*
> Start the connected Cassandra process. This request will succeed if the 
> process is already started to ensure idempotency.
> h6. Parameters
>  * {{block :: boolean (default False)}}
> h6. Response
>  * 202 Accepted
>  ** {{operationId :: string}}
>  * 500 Internal Server Error
>  ** {{error :: string}}
> h5. *POST /api/v1/cassandra/operations/lifecycle/node/stop*
> Stop the connected Cassandra process after a pluggable health check passes. 
> This request will succeed if the process is already stopped to ensure 
> idempotency. 
> h6. Parameters
>  * {{block :: boolean (default False)}}
>  * {{skipHealthCheck :: boolean (default False)}}
> h6. Response
>  * 202 Accepted
>  ** {{operationId :: string}}
>  * 412 Precondition Failed
>  ** {{error :: string (health check fails)}}
>  * 500 Internal Server Error
>  ** {{error :: string}}
> h5. *POST /api/v1/cassandra/operations/lifecycle/node/restart*
> Restart the connected Cassandra process after a pluggable health check 
> passes. 
> h6. Parameters
>  * {{block :: boolean (default False)}}
>  * {{skipHealthCheck :: boolean (default False)}}
> h6. Response
>  * 202 Accepted
>  ** {{operationId :: string}}
>  * 412 Precondition Failed
>  ** {{error :: string (health check fails)}}
>  * 500 Internal Server Error
>  ** {{error :: string}}
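For the stop and restart endpoints in the description above, the interaction between the health check and {{skipHealthCheck}} could be sketched as follows. {{StopRequestPolicy}} and {{statusFor}} are hypothetical names, and the sketch only covers the 202/412 decision, not job scheduling or the 500 path.

```java
/** Hypothetical helper; mirrors the 202/412 responses listed in the description. */
final class StopRequestPolicy {
    static final int ACCEPTED = 202;
    static final int PRECONDITION_FAILED = 412;

    /**
     * @param healthCheckPasses result of the pluggable health check
     * @param skipHealthCheck   request parameter, false by default
     * @return the HTTP status the endpoint would answer with
     */
    static int statusFor(boolean healthCheckPasses, boolean skipHealthCheck) {
        // A failing health check rejects the request unless the caller opted out.
        if (!skipHealthCheck && !healthCheckPasses) {
            return PRECONDITION_FAILED;
        }
        // Otherwise the job is accepted and tracked through its operationId.
        return ACCEPTED;
    }
}
```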



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

