Re: cassandra 4.0 java 11 support

2021-07-27 Thread Erick Ramirez
There's been some discussion around removing the "experimental" tag for C*
4.0 + Java 11, so by all means, we encourage everyone to try it and report
back to the community if you run into issues. Java 11 support was added two
years ago, so I think most of the issues have been ironed out. Now that 4.0
is out, we're hoping more users will test it out. Cheers!
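
For anyone who wants to give it a go, a minimal sketch of running a 4.0
tarball under Java 11 and confirming which JVM it actually started with
could look like this (all paths below are placeholders for your own install):

    # Sketch only: paths are examples, adjust to your environment.
    export JAVA_HOME=/path/to/jdk-11
    export PATH="$JAVA_HOME/bin:$PATH"
    java -version                      # should report 11.x

    cd /path/to/apache-cassandra-4.0.0
    bin/cassandra -p cassandra.pid     # start the node

    # The startup log records the JVM in use (default tarball log location):
    grep -i "jvm vendor" logs/system.log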


Re: cassandra 4.0 java 11 support

2021-07-27 Thread Bowen Song

Experimental means anything can happen - dragons, unicorns, ...

On 27/07/2021 21:32, CPC wrote:

Hi ,

At the Cassandra site
https://cassandra.apache.org/doc/latest/cassandra/new/java11.html,
it says Java 11 support is experimental and not recommended for
production. What is the reason for that? I mean performance or bugs?


Thank you...


Re: Permission/Role Cache causing timeouts in apps.

2021-07-27 Thread Chahat Bhatia
Okay. Sure. Thanks a lot for all the information. Really helped. :)

On Tue, 27 Jul 2021 at 21:05, Bowen Song  wrote:

> Based on the information I know, I'd say that you don't have any specific
> issue with the authentication-related tables, but you do have a general
> overloading problem during peak load. I think it's fairly likely that your
> 7-node cluster (6 nodes in one DC) is not able to keep up with the peak
> load, and you will need to either scale up for the peak load or tune the
> application to avoid the bursty behaviour.

Re: Permission/Role Cache causing timeouts in apps.

2021-07-27 Thread Bowen Song
Based on the information I know, I'd say that you don't have any
specific issue with the authentication-related tables, but you do have a
general overloading problem during peak load. I think it's fairly likely
that your 7-node cluster (6 nodes in one DC) is not able to keep up
with the peak load, and you will need to either scale up for the peak
load or tune the application to avoid the bursty behaviour.
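
If it helps, a rough sketch of what could be checked on each node while the
peak is actually happening (this assumes nodetool is on the PATH; adjust to
your install):

    nodetool status             # any nodes down or joining?
    nodetool compactionstats    # compactions piling up during the peak?
    nodetool tpstats | grep -Ei 'pool name|native-transport|read'   # pending/blocked/dropped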


On 27/07/2021 16:27, Chahat Bhatia wrote:
Yes, the application is quite read heavy and the request pattern is 
bursty too. Hence that many request failures in such a short time.

Also, nothing out of the ordinary in cfstats and proxyhistograms.

But there are Native-Transport-Requests dropped messages (similar 
stats on all the nodes):


 /opt/evive/pkgs/apache-cassandra-3.0.13/bin/nodetool tpstats

Pool Name                  Active  Pending  Completed  Blocked  All time blocked
...
Native-Transport-Requests       6        0       2476        0              8262

Message type           Dropped
READ                     24260


*We have the below values for the following params:*
concurrent_reads: 96
native_transport_max_threads: 320
JVM_OPTS="$JVM_OPTS -Dcassandra.max_queued_native_transport_requests=3072"

Do you think it would be advisable to tune the numbers in the above 
params to reduce the load on the node?




Re: Permission/Role Cache causing timeouts in apps.

2021-07-27 Thread Chahat Bhatia
Yes, the application is quite read heavy and the request pattern is bursty
too. Hence that many request failures in such a short time.
Also, nothing out of the ordinary in cfstats and proxyhistograms.

But there are Native-Transport-Requests dropped messages (similar
stats on all the nodes):

 /opt/evive/pkgs/apache-cassandra-3.0.13/bin/nodetool tpstats
>
> Pool Name                  Active  Pending  Completed  Blocked  All time blocked
> ...
> Native-Transport-Requests       6        0       2476        0              8262
>
> Message type           Dropped
> READ                     24260
>


*We have the below values for the following params:*
concurrent_reads: 96
native_transport_max_threads: 320
JVM_OPTS="$JVM_OPTS -Dcassandra.max_queued_native_transport_requests=3072"

Do you think it would be advisable to tune the numbers in the above params
to reduce the load on the node?
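
For reference, these are the values as they appear on the nodes; a quick way
to pull them on a given node is something like this (the install path is the
one from the nodetool command above, the conf location is an assumption):

    CASS_HOME=/opt/evive/pkgs/apache-cassandra-3.0.13

    grep -E '^(concurrent_reads|native_transport_max_threads)' "$CASS_HOME/conf/cassandra.yaml"
    grep -R 'max_queued_native_transport_requests' "$CASS_HOME/conf/"

    # snapshot of the pool in question on this node
    "$CASS_HOME/bin/nodetool" tpstats | grep -E 'Pool Name|Native-Transport-Requests|^READ '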


On Tue, 27 Jul 2021 at 20:13, Bowen Song  wrote:

> Wow, a 15-second timeout? That's pretty long... You may want to check the
> nodetool tpstats output and make sure the Native-Transport-Requests (NTR)
> thread pool isn't blocking things.
>
> 16k read requests dropped in 5 seconds, or over 3k requests per second on
> a single node, is a bit suspicious. Do your read requests tend to be
> bursty?

Re: Permission/Role Cache causing timeouts in apps.

2021-07-27 Thread Bowen Song
Wow, a 15-second timeout? That's pretty long... You may want to check the
nodetool tpstats output and make sure the Native-Transport-Requests (NTR)
thread pool isn't blocking things.


16k read requests dropped in 5 seconds, or over 3k requests per second
on a single node, is a bit suspicious. Do your read requests tend to
be bursty?
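
One way to see how those drops line up in time is to pull the dropped-message
lines straight out of the log, e.g. (the log path below assumes the tarball
layout of your install; adjust as needed):

    CASS_HOME=/opt/evive/pkgs/apache-cassandra-3.0.13

    # timestamps of the READ drop bursts
    grep 'READ messages were dropped' "$CASS_HOME/logs/system.log" | tail -n 20

    # is the native transport pool queueing or blocking?
    "$CASS_HOME/bin/nodetool" tpstats | grep -E 'Pool Name|Native-Transport-Requests'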


On 27/07/2021 15:32, Chahat Bhatia wrote:

Yes, RF=6 for system_auth. Sorry, my bad.


No, we are not using cassandra user for the application. We have a 
custom super user for our operational and administrative tasks and a 
separate role with needed perms for the application.


 role           | super | login | options
----------------+-------+-------+---------
 EviveCassandra |  True |  True |      {}
          ccusr | False |  True |      {}


  * gc_warn_threshold_in_ms: 1000
  * read_request_timeout_in_ms: 15000  // been the same since the start
  * cross_node_timeout: false          // but we have ntp on each server
    and the clocks are synced.

  * On checking the logs a little more, I found these:

INFO  [ScheduledTasks:1] 2021-07-26 07:21:33,279
MessagingService.java:1013 - READ messages were dropped in
last 5000 ms: 289 for internal timeout and 0 for cross node
timeout

INFO  [ScheduledTasks:1] 2021-07-26 07:22:33,894
MessagingService.java:1013 - READ messages were dropped in
last 5000 ms: 273 for internal timeout and 0 for cross node
timeout

INFO  [ScheduledTasks:1] 2021-07-27 06:12:03,988
MessagingService.java:1013 - READ messages were dropped in
last 5000 ms: 160 for internal timeout and 0 for cross node
timeout

INFO  [ScheduledTasks:1] 2021-07-27 06:13:09,125
MessagingService.java:1013 - READ messages were dropped in
last 5000 ms: 223 for internal timeout and 0 for cross node
timeout

INFO  [ScheduledTasks:1] 2021-07-27 06:13:44,212
MessagingService.java:1013 - READ messages were dropped in
last 5000 ms: 72 for internal timeout and 0 for cross node timeout

INFO  [ScheduledTasks:1] 2021-07-27 06:14:14,496
MessagingService.java:1013 - READ messages were dropped in
last 5000 ms: 16036 for internal timeout and 0 for cross node
timeout

INFO  [ScheduledTasks:1] 2021-07-27 06:15:59,744
MessagingService.java:1013 - READ messages were dropped in
last 5000 ms: 287 for internal timeout and 0 for cross node
timeout


  * Also, checking cfstats and proxyhistograms is in progress; I will
    update in case anything is suspicious.


Re: Permission/Role Cache causing timeouts in apps.

2021-07-27 Thread Chahat Bhatia
Yes, RF=6 for system_auth. Sorry, my bad.


No, we are not using cassandra user for the application. We have a custom
super user for our operational and administrative tasks and a separate role
with needed perms for the application.

>  role           | super | login | options
> ----------------+-------+-------+---------
>  EviveCassandra |  True |  True |      {}
>           ccusr | False |  True |      {}




   - gc_warn_threshold_in_ms: 1000
   - read_request_timeout_in_ms: 15000  // been the same since the start
   - cross_node_timeout: false          // but we have ntp on each server
     and the clocks are synced.

   - On checking the logs a little more, I found these:

INFO  [ScheduledTasks:1] 2021-07-26 07:21:33,279 MessagingService.java:1013
> - READ messages were dropped in last 5000 ms: 289 for internal timeout and
> 0 for cross node timeout

INFO  [ScheduledTasks:1] 2021-07-26 07:22:33,894 MessagingService.java:1013
> - READ messages were dropped in last 5000 ms: 273 for internal timeout and
> 0 for cross node timeout

INFO  [ScheduledTasks:1] 2021-07-27 06:12:03,988 MessagingService.java:1013
> - READ messages were dropped in last 5000 ms: 160 for internal timeout and
> 0 for cross node timeout

INFO  [ScheduledTasks:1] 2021-07-27 06:13:09,125 MessagingService.java:1013
> - READ messages were dropped in last 5000 ms: 223 for internal timeout and
> 0 for cross node timeout

INFO  [ScheduledTasks:1] 2021-07-27 06:13:44,212 MessagingService.java:1013
> - READ messages were dropped in last 5000 ms: 72 for internal timeout and 0
> for cross node timeout

INFO  [ScheduledTasks:1] 2021-07-27 06:14:14,496 MessagingService.java:1013
> - READ messages were dropped in last 5000 ms: 16036 for internal timeout
> and 0 for cross node timeout

INFO  [ScheduledTasks:1] 2021-07-27 06:15:59,744 MessagingService.java:1013
> - READ messages were dropped in last 5000 ms: 287 for internal timeout and
> 0 for cross node timeout



   - Also, checking cfstats and proxyhistograms is in progress; I will
     update in case anything is suspicious.
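
For the record, those checks look roughly like this on each node (the table
names below are the usual system_auth ones; adjust if yours differ):

    CASS_HOME=/opt/evive/pkgs/apache-cassandra-3.0.13

    "$CASS_HOME/bin/nodetool" proxyhistograms                      # coordinator read/write latencies
    "$CASS_HOME/bin/nodetool" cfstats system_auth                  # per-table stats for the keyspace
    "$CASS_HOME/bin/nodetool" cfhistograms system_auth roles       # local latencies, roles table
    "$CASS_HOME/bin/nodetool" cfhistograms system_auth role_permissions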


On Tue, 27 Jul 2021 at 18:09, Bowen Song  wrote:

> Hello Chahat,
>
>
> You haven't replied to the first point, are you using the "cassandra" user?
>
>
> The schema and your description don't quite match. When you said:
>
> *the system_auth for 2 DCs: us-east with 6 nodes (and RF=3) and ...*
>
> I assume you meant to say 6 nodes and RF=6? That sounds totally fine.
>
>
> As for the GC logs, do you see any WARN level logs? You may also want to
> double-check that your choice of the gc_warn_threshold_in_ms value is
> reasonable. I'm a bit surprised to see the GC logs full of ~1.6ms GC
> pauses. AFAIK, by default Cassandra only logs GC pauses longer than 200ms.
> I have to assume that's the result of your custom script. You may want to
> double-check that the script actually does what it says on the tin, just in
> case it misses something.
>
>
> Then, the other relevant questions.
>
> Do you use cross node timeout? If so, are the clocks in sync between all
> servers?
>
> What are the timeout values in the cassandra.yaml file
> (read_request_timeout_in_ms, etc.)? Are they reasonably long enough for the
> corresponding request type to complete?
>
> Since you've only got 7 nodes, I'd also recommend you check the nodetool
> cfstats & nodetool cfhistograms output for the tables in the system_auth
> keyspace on each node, and see if you can spot anything out of the ordinary.
>
>
> Regards,
>
> Bowen

Re: Permission/Role Cache causing timeouts in apps.

2021-07-27 Thread Bowen Song

Hello Chahat,


You haven't replied to the first point, are you using the "cassandra" user?


The schema and your description don't quite match. When you said:

   *the system_auth for 2 DCs: us-east with 6 nodes (and RF=3) and ...*

I assume you meant to say 6 nodes and RF=6? That sounds totally fine.


As for the GC logs, do you see any WARN level logs? You may also want to
double-check that your choice of the gc_warn_threshold_in_ms value is
reasonable. I'm a bit surprised to see the GC logs full of ~1.6ms GC
pauses. AFAIK, by default Cassandra only logs GC pauses longer than
200ms. I have to assume that's the result of your custom script. You may
want to double-check that the script actually does what it says on the
tin, just in case it misses something.



Then, the other relevant questions.

Do you use cross node timeout? If so, are the clocks in sync between all 
servers?


What are the timeout values in the cassandra.yaml file
(read_request_timeout_in_ms, etc.)? Are they reasonably long enough for
the corresponding request type to complete?


Since you've only got 7 nodes, I'd also recommend you check the
nodetool cfstats & nodetool cfhistograms output for the tables in the
system_auth keyspace on each node, and see if you can spot anything out
of the ordinary.



Regards,

Bowen
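
A quick way to pull those answers together on a node would be something like
this (the conf path is an assumption based on the install dir mentioned in
this thread, and the last command assumes ntpd is the time daemon in use):

    CASS_HOME=/opt/evive/pkgs/apache-cassandra-3.0.13

    # timeout settings and the cross-node timeout flag
    grep -E '_timeout_in_ms|cross_node_timeout' "$CASS_HOME/conf/cassandra.yaml"

    # clock sync status on this server
    ntpq -p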

On 27/07/2021 09:23, Chahat Bhatia wrote:

Thanks for the prompt response.

*Here is the system_schema.keyspaces entry:*

system_auth | True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'us-east': '6', 'us-east-backup': '1'}
     census | True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'us-east': '3', 'us-east-backup': '1'}


So, the system_auth for 2 DCs: *us-east* with 6 nodes (and RF=3)
and *us-east-backup* with a single node and 100% of the data.
*Keyspace census* shows RF=3 for the main DC and RF=1 for the backup
site.



And for running the application, we have a user specifically
created for that application itself and that user also has its
appropriate permissions in cassandra to select, modify and delete
from concerned tables. And it uses LOCAL_QUORUM for querying the
data and the local-dc is set to 'us-east'. 




Also, there is no excessive GC on any of the nodes; we run a custom
script to track the GC stats (from the cassandra log itself) and output
it. Below is the output from the currently running script, and it's
similar for all the servers:


2021-07-27 03:04:45,072 INFO gcstats:58 - Application Thread
stop time 0.001656 seconds.
2021-07-27 03:04:45,080 INFO gcstats:58 - Application Thread
stop time 0.001669 seconds.
2021-07-27 03:04:45,087 INFO gcstats:58 - Application Thread
stop time 0.001601 seconds.
2021-07-27 03:04:45,095 INFO gcstats:58 - Application Thread
stop time 0.001713 seconds.
2021-07-27 03:04:45,103 INFO gcstats:58 - Application Thread
stop time 0.001586 seconds.
2021-07-27 03:04:45,110 INFO gcstats:58 - Application Thread
stop time 0.001671 seconds.
2021-07-27 03:04:45,118 INFO gcstats:58 - Application Thread
stop time 0.001691 seconds.
2021-07-27 03:04:45,127 INFO gcstats:58 - Application Thread
stop time 0.001860 seconds.
2021-07-27 03:04:45,134 INFO gcstats:58 - Application Thread
stop time 0.001630 seconds.
2021-07-27 03:04:45,141 INFO gcstats:58 - Application Thread
stop time 0.001515 seconds.
2021-07-27 03:04:45,148 INFO gcstats:58 - Application Thread
stop time 0.001533 seconds.
2021-07-27 03:04:45,156 INFO gcstats:58 - Application Thread
stop time 0.001630 seconds.
2021-07-27 03:04:45,163 INFO gcstats:58 - Application Thread
stop time 0.001577 seconds.
2021-07-27 03:04:45,170 INFO gcstats:58 - Application Thread
stop time 0.001538 seconds.
2021-07-27 03:04:45,177 INFO gcstats:58 - Application Thread
stop time 0.001615 seconds.
2021-07-27 03:04:45,186 INFO gcstats:58 - Application Thread
stop time 0.001584 seconds.
2021-07-27 03:04:45,193 INFO gcstats:58 - Application Thread
stop time 0.001558 seconds.
2021-07-27 03:04:45,200 INFO gcstats:58 - Application Thread
stop time 0.001696 seconds.
2021-07-27 03:04:45,208 INFO gcstats:58 - Application Thread
stop time 0.001658 seconds.
2021-07-27 03:04:45,215 INFO gcstats:58 - Application Thread
stop time 0.001592 seconds.
2021-07-27 03:04:45,222 INFO gcstats:58 - Application Thread
stop time 0.001618 seconds.
2021-07-27 03:05:08,907 INFO gcstats:58 - Application Thread
stop time 0.001624 seconds.
2021-07-27 03:06:34,436 INFO gcstats:58 - Application Thread
stop time 0.297773 seconds.







Re: Permission/Role Cache causing timeouts in apps.

2021-07-27 Thread Chahat Bhatia
>
> Also, It's interesting that you've set validity to over 3 days but you
> update them every 6 hours. Is that intentional?

 We set that earlier when we were in the process of adding new roles
(creating new roles for the new apps we set up), but we never changed it
after that, hence it's been the same since.

On Tue, 27 Jul 2021 at 13:53, Chahat Bhatia 
wrote:

> Thanks for the prompt response.
>
> *Here is the system_schema.keyspaces entry:*
>
>> system_auth | True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'us-east': '6', 'us-east-backup': '1'}
>>      census | True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'us-east': '3', 'us-east-backup': '1'}
>
>
>
> So, the system_auth for 2 DCs: *us-east with 6 nodes (and RF=3) and
> us-east-backup with a single node and 100% of the data.*
> *Keyspace census* shows RF=3 for the main DC and RF=1 for the backup
> site.
>
>
> And for running the application, we have a user specifically created for
>> that application itself and that user also has its appropriate permissions
>> in cassandra to select, modify and delete from concerned tables. And it
>> uses LOCAL_QUORUM for querying the data and the local-dc is set to
>> 'us-east'.
>
>
>
> Also, there is no excessive GC on any of the nodes; we run a custom
> script to track the GC stats (from the cassandra log itself) and output it.
> Below is the output from the currently running script, and it's similar for
> all the servers:
>
> 2021-07-27 03:04:45,072 INFO gcstats:58 - Application Thread stop time
>> 0.001656 seconds.
>> 2021-07-27 03:04:45,080 INFO gcstats:58 - Application Thread stop time
>> 0.001669 seconds.
>> 2021-07-27 03:04:45,087 INFO gcstats:58 - Application Thread stop time
>> 0.001601 seconds.
>> 2021-07-27 03:04:45,095 INFO gcstats:58 - Application Thread stop time
>> 0.001713 seconds.
>> 2021-07-27 03:04:45,103 INFO gcstats:58 - Application Thread stop time
>> 0.001586 seconds.
>> 2021-07-27 03:04:45,110 INFO gcstats:58 - Application Thread stop time
>> 0.001671 seconds.
>> 2021-07-27 03:04:45,118 INFO gcstats:58 - Application Thread stop time
>> 0.001691 seconds.
>> 2021-07-27 03:04:45,127 INFO gcstats:58 - Application Thread stop time
>> 0.001860 seconds.
>> 2021-07-27 03:04:45,134 INFO gcstats:58 - Application Thread stop time
>> 0.001630 seconds.
>> 2021-07-27 03:04:45,141 INFO gcstats:58 - Application Thread stop time
>> 0.001515 seconds.
>> 2021-07-27 03:04:45,148 INFO gcstats:58 - Application Thread stop time
>> 0.001533 seconds.
>> 2021-07-27 03:04:45,156 INFO gcstats:58 - Application Thread stop time
>> 0.001630 seconds.
>> 2021-07-27 03:04:45,163 INFO gcstats:58 - Application Thread stop time
>> 0.001577 seconds.
>> 2021-07-27 03:04:45,170 INFO gcstats:58 - Application Thread stop time
>> 0.001538 seconds.
>> 2021-07-27 03:04:45,177 INFO gcstats:58 - Application Thread stop time
>> 0.001615 seconds.
>> 2021-07-27 03:04:45,186 INFO gcstats:58 - Application Thread stop time
>> 0.001584 seconds.
>> 2021-07-27 03:04:45,193 INFO gcstats:58 - Application Thread stop time
>> 0.001558 seconds.
>> 2021-07-27 03:04:45,200 INFO gcstats:58 - Application Thread stop time
>> 0.001696 seconds.
>> 2021-07-27 03:04:45,208 INFO gcstats:58 - Application Thread stop time
>> 0.001658 seconds.
>> 2021-07-27 03:04:45,215 INFO gcstats:58 - Application Thread stop time
>> 0.001592 seconds.
>> 2021-07-27 03:04:45,222 INFO gcstats:58 - Application Thread stop time
>> 0.001618 seconds.
>> 2021-07-27 03:05:08,907 INFO gcstats:58 - Application Thread stop time
>> 0.001624 seconds.
>> 2021-07-27 03:06:34,436 INFO gcstats:58 - Application Thread stop time
>> 0.297773 seconds.
>>
>
>
>
>
>
Re: Permission/Role Cache causing timeouts in apps.

2021-07-27 Thread Chahat Bhatia
Thanks for the prompt response.

*Here is the system_schema.keyspaces entry:*

system_auth | True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'us-east': '6', 'us-east-backup': '1'}
>      census | True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'us-east': '3', 'us-east-backup': '1'}



So, the system_auth for 2 DCs: *us-east with 6 nodes (and RF=3) and
us-east-backup with a single node and 100% of the data.*
*Keyspace census* shows RF=3 for the main DC and RF=1 for the backup site.


And for running the application, we have a user specifically created for
> that application itself and that user also has its appropriate permissions
> in cassandra to select, modify and delete from concerned tables. And it
> uses LOCAL_QUORUM for querying the data and the local-dc is set to
> 'us-east'.



Also, there is no excessive GC on any of the nodes; we run a custom script
to track the GC stats (from the cassandra log itself) and output it. Below
is the output from the currently running script, and it's similar for all
the servers:

2021-07-27 03:04:45,072 INFO gcstats:58 - Application Thread stop time
> 0.001656 seconds.
> 2021-07-27 03:04:45,080 INFO gcstats:58 - Application Thread stop time
> 0.001669 seconds.
> 2021-07-27 03:04:45,087 INFO gcstats:58 - Application Thread stop time
> 0.001601 seconds.
> 2021-07-27 03:04:45,095 INFO gcstats:58 - Application Thread stop time
> 0.001713 seconds.
> 2021-07-27 03:04:45,103 INFO gcstats:58 - Application Thread stop time
> 0.001586 seconds.
> 2021-07-27 03:04:45,110 INFO gcstats:58 - Application Thread stop time
> 0.001671 seconds.
> 2021-07-27 03:04:45,118 INFO gcstats:58 - Application Thread stop time
> 0.001691 seconds.
> 2021-07-27 03:04:45,127 INFO gcstats:58 - Application Thread stop time
> 0.001860 seconds.
> 2021-07-27 03:04:45,134 INFO gcstats:58 - Application Thread stop time
> 0.001630 seconds.
> 2021-07-27 03:04:45,141 INFO gcstats:58 - Application Thread stop time
> 0.001515 seconds.
> 2021-07-27 03:04:45,148 INFO gcstats:58 - Application Thread stop time
> 0.001533 seconds.
> 2021-07-27 03:04:45,156 INFO gcstats:58 - Application Thread stop time
> 0.001630 seconds.
> 2021-07-27 03:04:45,163 INFO gcstats:58 - Application Thread stop time
> 0.001577 seconds.
> 2021-07-27 03:04:45,170 INFO gcstats:58 - Application Thread stop time
> 0.001538 seconds.
> 2021-07-27 03:04:45,177 INFO gcstats:58 - Application Thread stop time
> 0.001615 seconds.
> 2021-07-27 03:04:45,186 INFO gcstats:58 - Application Thread stop time
> 0.001584 seconds.
> 2021-07-27 03:04:45,193 INFO gcstats:58 - Application Thread stop time
> 0.001558 seconds.
> 2021-07-27 03:04:45,200 INFO gcstats:58 - Application Thread stop time
> 0.001696 seconds.
> 2021-07-27 03:04:45,208 INFO gcstats:58 - Application Thread stop time
> 0.001658 seconds.
> 2021-07-27 03:04:45,215 INFO gcstats:58 - Application Thread stop time
> 0.001592 seconds.
> 2021-07-27 03:04:45,222 INFO gcstats:58 - Application Thread stop time
> 0.001618 seconds.
> 2021-07-27 03:05:08,907 INFO gcstats:58 - Application Thread stop time
> 0.001624 seconds.
> 2021-07-27 03:06:34,436 INFO gcstats:58 - Application Thread stop time
> 0.297773 seconds.
>
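
In case it's useful, the same can be cross-checked against Cassandra's own
GCInspector lines (the log path below assumes the tarball layout of our
install):

    CASS_HOME=/opt/evive/pkgs/apache-cassandra-3.0.13

    grep GCInspector "$CASS_HOME/logs/system.log" | tail -n 20      # recent GC pause reports
    grep -c 'WARN.*GCInspector' "$CASS_HOME/logs/system.log"        # count of warn-level (long) pauses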





On Tue, 27 Jul 2021 at 13:23, 'Bowen Song' via Infra Updates <
infra-upda...@goevive.com> wrote:

> Hello Chahat,
>
>
> First, can you please make sure the Cassandra user used by the application
> is not "cassandra"? Because the "cassandra" user uses QUORUM consistency
> level to read the auth tables.
>
> Then, can you please make sure the replication strategy is set correctly
> for the system_auth keyspace? I.e.: ensure the old DC is not present, and
> the new DC has a sufficient number of replicas for fault tolerance.
>
> Finally, can you please check the GC logs, and make sure there aren't JVM
> GC issues, especially long STW pauses?
>
>
> Regards,
>
> Bowen
>
>

Re: Permission/Role Cache causing timeouts in apps.

2021-07-27 Thread Bowen Song

Hello Chahat,


First, can you please make sure the Cassandra user used by the 
application is not "cassandra"? Because the "cassandra" user uses QUORUM 
consistency level to read the auth tables.


Then, can you please make sure the replication strategy is set correctly
for the system_auth keyspace? I.e.: ensure the old DC is not present,
and the new DC has a sufficient number of replicas for fault tolerance.


Finally, can you please check the GC logs, and make sure there aren't JVM
GC issues, especially long STW pauses?



Regards,

Bowen
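
For what it's worth, the first two points can be checked from cqlsh with
something like the following (host and credentials are placeholders):

    cqlsh <node_ip> -u <admin_role> -p '<password>' -e "
      LIST ROLES;
      SELECT keyspace_name, replication FROM system_schema.keyspaces
       WHERE keyspace_name = 'system_auth';"

    # If the replication still references a removed DC, it can be corrected with
    # something along these lines (DC names and RFs are examples), followed by a
    # repair of system_auth on every node:
    #   ALTER KEYSPACE system_auth WITH replication =
    #     {'class': 'NetworkTopologyStrategy', 'us-east': '6', 'us-east-backup': '1'};
    #   nodetool repair system_auth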


On 27/07/2021 08:34, Chahat Bhatia wrote:

Hi Community,

Context: We are running a cluster of 6 nodes in production with a RF=3 
in AWS.
We recently moved from physical servers to cloud by adding a new DC 
and then removing the old one. Everything is working fine in all the 
other applications except this one.


*We recently started experiencing read timeouts in one of our
production applications, where the client threw:*

Error An unexpected error occurred server side on ip-IP.ec2.internal:
com.google.common.util.concurrent.UncheckedExecutionException:
com.google.common.util.concurrent.UncheckedExecutionException:
java.lang.RuntimeException:
org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.

com.datastax.driver.core.exceptions.ServerError: An unexpected error occurred server side:
com.google.common.util.concurrent.UncheckedExecutionException:
com.google.common.util.concurrent.UncheckedExecutionException:
java.lang.RuntimeException:
org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
        at com.datastax.driver.core.exceptions.ServerError.copy(ServerError.java:63) ~[cassandra-driver-core-3.3.0-shaded.jar!/:?]
        at com.datastax.driver.core.exceptions.ServerError.copy(ServerError.java:25) ~[cassandra-driver-core-3.3.0-shaded.jar!/:?]
        at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37) ~[cassandra-driver-core-3.3.0-shaded.jar!/:?]
        at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245) ~[cassandra-driver-core-3.3.0-shaded.jar!/:?]
        at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:68) ~[cassandra-driver-core-3.3.0-shaded.jar!/:?]


  (contd.)

com.datastax.driver.core.exceptions.ReadTimeoutException:
Cassandra timeout during read query at consistency
LOCAL_QUORUM (2 responses were required but only 1 replica
responded)



*And around the same time these were the errors on the server side
(from cassandra logs):*

*ERROR [RolesCacheRefresh:1] 2021-07-26 06:32:43,094
CassandraDaemon.java:207 - Exception in thread
Thread[RolesCacheRefresh:1,5,main]
java.lang.RuntimeException:
org.apache.cassandra.exceptions.ReadTimeoutException:
Operation timed out - received only 0 responses.*
        at

org.apache.cassandra.auth.CassandraRoleManager.getRole(CassandraRoleManager.java:512)
~[apache-cassandra-3.0.13.jar:3.0.13]
        at

org.apache.cassandra.auth.CassandraRoleManager.getRoles(CassandraRoleManager.java:280)
~[apache-cassandra-3.0.13.jar:3.0.13]
        at
org.apache.cassandra.auth.RolesCache$1$1.call(RolesCache.java:135)
~[apache-cassandra-3.0.13.jar:3.0.13]
        at
org.apache.cassandra.auth.RolesCache$1$1.call(RolesCache.java:130)
~[apache-cassandra-3.0.13.jar:3.0.13]
        at
java.util.concurrent.FutureTask.run(FutureTask.java:266)
~[na:1.8.0_131]
        at

java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
~[na:1.8.0_131]
        at

java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[na:1.8.0_131]
        at

org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
[apache-cassandra-3.0.13.jar:3.0.13]
        at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]
Caused by:
org.apache.cassandra.exceptions.ReadTimeoutException:
Operation timed out - received only 0 responses.
        at

org.apache.cassandra.service.ReadCallback.awaitResults(ReadCallback.java:132)
~[apache-cassandra-3.0.13.jar:3.0.13]
        at
org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:137)
~[apache-cassandra-3.0.13.jar:3.0.13]
        at

org.apache.cassandra.service.AbstractReadExecutor.get(AbstractReadExecutor.java:145)
~[apache-cassandra-3.0.13.jar:3.0.13]
        at


Re: Permission/Role Cache causing timeouts in apps.

2021-07-27 Thread Erick Ramirez
Are you using the default `cassandra` superuser role? Because that would be
expensive. Also confirm if you've set the replication for the `system_auth`
keyspace to NTS because if you have multiple DCs, the request could be
going to another DC.

It's interesting that you've set validity to over 3 days but you update
them every 6 hours. Is that intentional? Cheers!
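
For reference, these are the cassandra.yaml settings in question; a quick way
to see what a node is actually running with (the conf path below is a guess
based on the install dir mentioned in this thread):

    grep -E '(roles|permissions)_(validity|update_interval)_in_ms' \
        /opt/evive/pkgs/apache-cassandra-3.0.13/conf/cassandra.yaml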


Permission/Role Cache causing timeouts in apps.

2021-07-27 Thread Chahat Bhatia
Hi Community,

Context: We are running a cluster of 6 nodes in production with a RF=3 in
AWS.
We recently moved from physical servers to cloud by adding a new DC and
then removing the old one. Everything is working fine in all the other
applications except this one.


*We recently started experiencing read timeouts in one of our production
applications, where the client threw:*

Error An unexpected error occurred server side on ip-IP.ec2.internal:
> com.google.common.util.concurrent.*UncheckedExecutionException*:
> *com.google.common.util.concurrent.UncheckedExecutionException:**
> java.lang.RuntimeException:
> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out -
> received only 0 responses.*
> com.datastax.driver.core.exceptions.ServerError: An unexpected error
> occurred server side  :
> com.google.common.util.concurrent.UncheckedExecutionException:
> com.google.common.util.concurrent.UncheckedExecutionException:
> java.lang.RuntimeException:
> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out
> - received only 0 responses.

at com.datastax.driver.core.exceptions.ServerError.copy(ServerError.java:63) ~[cassandra-driver-core-3.3.0-shaded.jar!/:?]
> at com.datastax.driver.core.exceptions.ServerError.copy(ServerError.java:25) ~[cassandra-driver-core-3.3.0-shaded.jar!/:?]
> at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37) ~[cassandra-driver-core-3.3.0-shaded.jar!/:?]
> at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245) ~[cassandra-driver-core-3.3.0-shaded.jar!/:?]
> at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:68) ~[cassandra-driver-core-3.3.0-shaded.jar!/:?]

  (contd.)



com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout
> during read query at consistency LOCAL_QUORUM (2 responses were required
> but only 1 replica responded)
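
A direct way to check whether reads of the auth tables themselves time out at
that consistency level is something like the following from cqlsh (host and
credentials are placeholders; it needs a role allowed to read system_auth):

    cqlsh <node_ip> -u <admin_role> -p '<password>' -e "
      CONSISTENCY LOCAL_QUORUM;
      SELECT role, can_login, is_superuser FROM system_auth.roles;"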





*And around the same time these were the errors on the server side (from
cassandra logs):*


>
> *ERROR [RolesCacheRefresh:1] 2021-07-26 06:32:43,094
> CassandraDaemon.java:207 - Exception in thread Thread[RolesCacheRefresh:1,5,main]
> java.lang.RuntimeException:
> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out -
> received only 0 responses.*
> at
> org.apache.cassandra.auth.CassandraRoleManager.getRole(CassandraRoleManager.java:512)
> ~[apache-cassandra-3.0.13.jar:3.0.13]
> at
> org.apache.cassandra.auth.CassandraRoleManager.getRoles(CassandraRoleManager.java:280)
> ~[apache-cassandra-3.0.13.jar:3.0.13]
> at
> org.apache.cassandra.auth.RolesCache$1$1.call(RolesCache.java:135)
> ~[apache-cassandra-3.0.13.jar:3.0.13]
> at
> org.apache.cassandra.auth.RolesCache$1$1.call(RolesCache.java:130)
> ~[apache-cassandra-3.0.13.jar:3.0.13]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> ~[na:1.8.0_131]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> ~[na:1.8.0_131]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [na:1.8.0_131]
> at
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
> [apache-cassandra-3.0.13.jar:3.0.13]
> at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]
> Caused by: org.apache.cassandra.exceptions.ReadTimeoutException: Operation
> timed out - received only 0 responses.
> at
> org.apache.cassandra.service.ReadCallback.awaitResults(ReadCallback.java:132)
> ~[apache-cassandra-3.0.13.jar:3.0.13]
> at
> org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:137)
> ~[apache-cassandra-3.0.13.jar:3.0.13]
> at
> org.apache.cassandra.service.AbstractReadExecutor.get(AbstractReadExecutor.java:145)
> ~[apache-cassandra-3.0.13.jar:3.0.13]
> at
> org.apache.cassandra.service.StorageProxy$SinglePartitionReadLifecycle.awaitResultsAndRetryOnDigestMismatch(StorageProxy.java:1715)
> ~[apache-cassandra-3.0.13.jar:3.0.13]
> at
> org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1664)
> ~[apache-cassandra-3.0.13.jar:3.0.13]
> at
> org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1605)
> ~[apache-cassandra-3.0.13.jar:3.0.13]
> at
> org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1524)
> ~[apache-cassandra-3.0.13.jar:3.0.13]
> at
> org.apache.cassandra.db.SinglePartitionReadCommand$Group.execute(SinglePartitionReadCommand.java:955)
> ~[apache-cassandra-3.0.13.jar:3.0.13]
> at
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:263)
> ~[apache-cassandra-3.0.13.jar:3.0.13]
> at
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:224)
> ~[apache-cassandra-3.0.13.jar:3.0.13]
> at
>