Sergio and Abe, thanks so much for responding to me so quickly!

I managed to figure out the problem and the solution. In the Terraform
scripts we used to stand up the EC2 instances, we have a template file for
the jmxremote.access file with the content:

monitorRole   readonly
controlRole   readwrite \
              create javax.management.monitor.*,javax.management.timer.* \
              unregister
${USERNAME}   ${JMX_USER_ACCESSTYPE}

When the template is rendered, ${JMX_USER_ACCESSTYPE} is replaced by
readwrite and ${USERNAME} by the user we run nodetool as.
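
To illustrate the substitution described above, here is a minimal sketch.
It uses Python's string.Template as a stand-in for Terraform's own template
rendering, which is not shown in this thread. The function name and the
user name cassandra_ops are hypothetical.

```python
from string import Template

# Template text as quoted in this message. The placeholder names come from
# the thread; everything else in this sketch is an assumption.
ACCESS_TEMPLATE = """\
monitorRole   readonly
controlRole   readwrite \\
              create javax.management.monitor.*,javax.management.timer.* \\
              unregister
${USERNAME}   ${JMX_USER_ACCESSTYPE}
"""


def render_access_file(username, access_type="readwrite"):
    """Fill in the two placeholders and return the rendered file content."""
    return Template(ACCESS_TEMPLATE).substitute(
        USERNAME=username,
        JMX_USER_ACCESSTYPE=access_type,
    )


# "cassandra_ops" is a hypothetical user name, not one from the thread.
rendered = render_access_file("cassandra_ops")
print(rendered)
```

The last line of the rendered output is the entry that grants the nodetool
user its access level.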

What I'm seeing now is that the jmxremote.access file just contains the
default content and that, at least in one case, it has a fairly recent
timestamp. This indicates that some security upgrades were indeed performed
without consulting with the Cassandra DBAs. If I restore the desired
content, and in particular the last line, nodetool works again.
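
A check along these lines could flag nodes whose jmxremote.access has lost
the user's entry. This is only a sketch: it assumes the default content
matches the first four lines of the template above, the helper names are
hypothetical, and cassandra_ops is a made-up user name.

```python
def parse_access_entries(text):
    """Map each role name to its access level, joining backslash-continued lines."""
    entries = {}
    pending = ""
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments unless we are inside a continuation.
        if not pending and (not line or line.startswith("#")):
            continue
        if line.endswith("\\"):
            pending += line[:-1] + " "
            continue
        fields = (pending + line).split()
        pending = ""
        if len(fields) >= 2:
            entries[fields[0]] = fields[1]
    return entries


def has_readwrite(text, username):
    """Return True if the access file grants readwrite to the given user."""
    return parse_access_entries(text).get(username) == "readwrite"


# Assumed default content: the template without its last line.
default_content = (
    "monitorRole   readonly\n"
    "controlRole   readwrite \\\n"
    "              create javax.management.monitor.*,javax.management.timer.* \\\n"
    "              unregister\n"
)
# Restored content: the default plus the entry for a hypothetical user.
restored_content = default_content + "cassandra_ops   readwrite\n"

print(has_readwrite(default_content, "cassandra_ops"))
print(has_readwrite(restored_content, "cassandra_ops"))
```

Run against the file on each node, a check like this would separate nodes
that still carry the last line from those that were reset to the default.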

The mystery is why this problem is occurring on only some nodes when the
security upgrade was performed across a much broader cross-section of
nodes. But I might just need to leave that as a mystery.

On Sun, Feb 26, 2023 at 11:46 AM Sergio <lapostadiser...@gmail.com> wrote:

> Hey!
> I would try to spin up a new node and see if the problem occurs on it.
> If it does, I would check the history of changes to the cookbook recipe.
> If you don't find any problem on the new node, you might replace the
> affected nodes one by one with new ones and decommission the old ones.
> It would cost some time and money, but that is better than having
> nodetool not working.
>
> Best,
>
> Sergio
>
> On Sun, Feb 26, 2023 at 10:51 AM Abe Ratnofsky <a...@aber.io> wrote:
>
>> Hey Mitch,
>>
>> The security upgrade schedule that your colleague is working on may well
>> be relevant. Is your entire cluster on 3.11.6 or are the failing hosts
>> possibly on a newer version?
>>
>> Abe
>>
>> On Feb 26, 2023, at 10:38, Mitch Gitman <mgit...@gmail.com> wrote:
>>
>> We're running Cassandra 3.11.6 on AWS EC2 instances. These clusters have
>> been running for a few years.
>>
>>
>> We're suddenly noticing now that on one of our clusters the nodetool
>> command is failing on certain nodes but not on others.
>>
>>
>> The failure:
>>
>> nodetool: Failed to connect to '...:7199' - SecurityException: 'Access
>> denied! Invalid access level for requested MBeanServer operation.'.
>>
>>
>> I suspect that this stems from a colleague I'm not in coordination with
>> recently performing some security upgrades, but that's a bit of an
>> academic matter for now.
>>
>>
>> I've compared the jmxremote.access and jvm.options files on a host where
>> nodetool is not working vs. a host where nodetool is working, and found
>> no meaningful differences.
>>
>>
>> Any ideas? The interesting aspect of this problem is that it is occurring
>> on some nodes in the one cluster but not others.
>>
>>
>> I'll update on this thread if I find any solutions on my end.
>>
>>
