[jira] [Commented] (CASSANDRA-18555) A new nodetool/JMX command that tells whether node's decommission failed or not

Stefan Miklosovic (Jira) Tue, 06 Jun 2023 01:47:05 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-18555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17729662#comment-17729662
 ]


Stefan Miklosovic commented on CASSANDRA-18555:
-----------------------------------------------

I have also added this to nodetool info

{code}
$ ./bin/nodetool info
ID                     : 0ae1c2be-5a7b-477d-9835-a84c7af7a950
Gossip active          : true
Native Transport active: true
Load                   : 208.67 KiB
Uncompressed load      : 300.5 KiB
Generation No          : 1686041148
Uptime (seconds)       : 7
Heap Memory (MB)       : 285.82 / 973.00
Off Heap Memory (MB)   : 0.00
Data Center            : datacenter1
Rack                   : rack1
Exceptions             : 0
Key Cache              : entries 20, size 1.66 KiB, capacity 46 MiB, 143 hits, 
167 requests, 0.856 recent hit rate, 14400 save period in seconds
Row Cache              : entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 
requests, NaN recent hit rate, 0 save period in seconds
Counter Cache          : entries 0, size 0 bytes, capacity 23 MiB, 0 hits, 0 
requests, NaN recent hit rate, 7200 save period in seconds
Network Cache          : size 0 bytes, overflow size: 0 bytes, capacity 58 MiB
Percent Repaired       : 100.0%
Token                  : (invoke with -T/--tokens to see all 16 tokens)
Bootstrap state        : COMPLETED
{code}

Check the last line.

> A new nodetool/JMX command that tells whether node's decommission failed or 
> not
> -------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18555
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18555
>             Project: Cassandra
>          Issue Type: Task
>          Components: Observability/JMX
>            Reporter: Jaydeepkumar Chovatia
>            Assignee: Jaydeepkumar Chovatia
>            Priority: Normal
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, when a node is being decommissioned and if any failure happens, 
> then an exception is thrown back to the caller.
> But Cassandra's decommission takes considerable time ranging from minutes to 
> hours to days. There are various scenarios in that the caller may need to 
> probe the status again:
>  * The caller times out
>  * It is not possible to keep the caller hanging for such a long time
> And If the caller does not know what happened internally, then it cannot 
> retry, etc., leading to other issues.
> So, in this ticket, I am going to add a new nodetool/JMX command that can be 
> invoked by the caller anytime, and it will return the correct status.
> It might look like a smaller change, but when we need to operate Cassandra at 
> scale in a large-scale fleet, then this becomes a bottleneck and require 
> constant operator intervention.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (CASSANDRA-18555) A new nodetool/JMX command that tells whether node's decommission failed or not

Reply via email to