[ 
https://issues.apache.org/jira/browse/CASSANDRA-19902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18009797#comment-18009797
 ] 

Paulo Motta edited comment on CASSANDRA-19902 at 7/24/25 10:59 PM:
-------------------------------------------------------------------

{quote}Of course, there may well have been things we missed, but I'd be quite 
confident that if this is what trunk does now it's what trunk/5.0 did at the 
time, which was before CASSANDRA-11537 landed.
{quote}
When CASSANDRA-11537 landed, the mode was set to {{JOINING}} before 
{{initServer}} had finished initialization, in [StorageService.Init > 
StorageService.joinTokenRing > 
StorageService.prepareForBootstrap|https://github.com/apache/cassandra/blob/035705f49464a4854482e1f1280a7af45f7f0203/src/java/org/apache/cassandra/service/StorageService.java#L1837],
 while currently the actual state is only returned after 
[StorageService.init|#L879] completes, which I think is after bootstrap 
completes. I wouldn't be surprised if this is not caught by a dtest, since it's 
pretty specific: you'd need to query the operation mode via nodetool netstats 
during bootstrap and parse the output to check that it's {{JOINING}}.
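
A check along those lines could look roughly like the sketch below. This is only an illustration, not an existing dtest: the helper names {{parse_operation_mode}} and {{is_joining}} are hypothetical, and it assumes netstats prints a {{Mode: <MODE>}} line as in the netstats output quoted in this comment.

```python
import re

# Hypothetical helper, not part of any existing dtest.
# Assumes `nodetool netstats` output contains a line of the form "Mode: <MODE>".
_MODE_RE = re.compile(r"^Mode:\s*(\S+)\s*$", re.MULTILINE)


def parse_operation_mode(netstats_output: str) -> str:
    """Return the operation mode reported in netstats output."""
    match = _MODE_RE.search(netstats_output)
    if match is None:
        raise ValueError("no 'Mode:' line found in netstats output")
    return match.group(1)


def is_joining(netstats_output: str) -> bool:
    """True if the node reports JOINING as its operation mode."""
    return parse_operation_mode(netstats_output) == "JOINING"


if __name__ == "__main__":
    sample = "Mode: STARTING\nNot sending any streams.\n"
    print(parse_operation_mode(sample))
```

In a real test the string would come from running netstats against the bootstrapping node while bootstrap is artificially slowed down, and the assertion would be that {{is_joining}} returns true.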

I did a quick local test with two Docker instances and added a sleep 
[here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/dht/BootStrapper.java#L120]
 simulating a long bootstrap, and confirmed that even though {{node1}} sees 
{{node2}} as UJ, {{node2}} sees its own operation mode as {{STARTING}} instead 
of the expected {{JOINING}} as in 4.1, so I believe this is legit. I believe 
this mostly affects bootstrap/replace, since these are done within 
StorageService.init.
{code:none}
$ ./nodetool.sh node1 status -r
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens  Owns (effective)  Host ID                               Rack
UN  node1           73.91 KiB  16      100.0%            6d194555-f6eb-41d0-c000-000000000001  rack1
UJ  node2.casstest  ?          16      ?                 6d194555-f6eb-41d0-c000-000000000002  rack1

$ ./nodetool.sh node2 netstats
Mode: STARTING
Not sending any streams.
{code}
{quote}So in that sense "if (!isInitialized())" serves as "insurance" that by 
the time myNodeState() is called in that method, initServer() finished so it 
returns something meaningful. If that happens (myNodeState is null) I would 
return STARTING.
{quote}
I think this is the proper behavior, since CMS may not have finished 
initializing when operationMode() is queried, even though this is pretty 
unlikely.
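
One reading of that fallback can be modeled as below. This is a language-neutral sketch written in Python, not the actual StorageService code: {{operation_mode}} and {{my_node_state}} are stand-ins for {{operationMode()}} and {{myNodeState()}}, and returning the state name unchanged is an assumption made for illustration (the real method maps node states to modes in ways not shown here).

```python
from typing import Optional

STARTING = "STARTING"


def operation_mode(my_node_state: Optional[str]) -> str:
    """Illustrative model: fall back to STARTING when no node state is known yet."""
    # Assumption: None stands for "myNodeState() returned null",
    # e.g. because CMS has not finished initializing.
    if my_node_state is None:
        return STARTING
    # Assumption: the state name is reported as-is once it is available.
    return my_node_state


if __name__ == "__main__":
    print(operation_mode(None))
    print(operation_mode("JOINING"))
```

The point of the sketch is only that the fallback is keyed on the missing state rather than on whether {{initServer}} has finished, so a node that is still bootstrapping can report {{JOINING}}.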


> StorageService JMX mbean is not available during bootstrap
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-19902
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19902
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Tool/nodetool
>            Reporter: Paulo Motta
>            Assignee: Paulo Motta
>            Priority: Normal
>             Fix For: 5.0.x, 5.x
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> Looks like the seemingly harmless cosmetic patch from CASSANDRA-11537 causes 
> the StorageServiceMBean to not be available during bootstrap. This causes 
> commands like "nodetool netstats/status/etc" to not be available on the 
> bootstrapping node with the following error:
> {code:none}
> - StackTrace --
> javax.management.InstanceNotFoundException: 
> org.apache.cassandra.db:type=StorageService
>         at 
> java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1083)
>         at 
> java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:637)
> {code}
> This ticket is just to revert CASSANDRA-11537; we can re-add the improvement 
> of that ticket later.


