Okay. Sure. Thanks a lot for all the information. Really helped. :)
On Tue, 27 Jul 2021 at 21:05, Bowen Song wrote:
> Based on the information I know, I'd say that you don't have any specific
> issue with the authentication related tables, but you do have a general
> overloading problem during
Based on the information I know, I'd say that you don't have any
specific issue with the authentication related tables, but you do have a
general overloading problem during peak load. I think it's fairly likely
that your 7-node cluster (6 nodes in one DC) is not able to keep up
with the peak
Yes, the application is quite read heavy and the request pattern is bursty
too. Hence such a large number of request failures in so short a time.
Also, nothing out of the ordinary in cfstats and proxyhistograms.
But there are dropped Native-Transport-Requests messages (almost identical
stats on all the nodes):
Wow, a 15-second timeout? That's pretty long... You may want to check the
nodetool tpstats output and make sure the Native-Transport-Requests thread pool isn't blocking things.
16k read requests dropped in 5 seconds, or over 3k requests per second
on a single node, is a bit suspicious. Do your read requests tend to
be
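The per-second figure quoted above can be checked with a short calculation. This is only a sketch: the function name is made up for illustration, and "16k" is assumed to mean 16,000.

```python
def drop_rate_per_second(dropped: int, window_seconds: float) -> float:
    """Average number of dropped requests per second over a time window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return dropped / window_seconds


# Figures quoted in the thread; "16k" is taken as 16,000 (an assumption).
rate = drop_rate_per_second(16_000, 5)
print(f"{rate:.0f} dropped reads per second")  # prints "3200 dropped reads per second"
```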
Yes, RF=6 for system_auth. Sorry, my bad.
No, we are not using cassandra user for the application. We have a custom
super user for our operational and administrative tasks and a separate role
with needed perms for the application.
> role | super | login | options
>
Hello Chahat,
You haven't replied to the first point, are you using the "cassandra" user?
The schema and your description don't quite match. When you said:
> the system_auth for 2 DCs: us-east with 6 nodes (and RF=3) and ...
I assume you meant to say 6 nodes and RF=6?
>
> Also, it's interesting that you've set validity to over 3 days but you
> update them every 6 hours. Is that intentional?
We set that earlier when we were in the process of adding new roles (creating
new roles for the new apps we set up), but we never changed it after that, so
it has stayed the same
Thanks for the prompt response.
*Here is the system_schema.keyspaces entry:*
system_auth | True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'us-east': '6', 'us-east-backup': '1'}
census | True | {'class':
Hello Chahat,
First, can you please make sure the Cassandra user used by the
application is not "cassandra"? Because the "cassandra" user uses QUORUM
consistency level to read the auth tables.
Then, can you please make sure the replication strategy is set correctly
for the system_auth
Are you using the default `cassandra` superuser role? Because that would be
expensive. Also confirm if you've set the replication for the `system_auth`
keyspace to NTS because if you have multiple DCs, the request could be
going to another DC.
It's interesting that you've set validity to over 3
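The two checks suggested in this thread (replication strategy for system_auth, and the cost of QUORUM reads for the default "cassandra" role) can be sketched offline against the replication map posted earlier. This is a rough sketch, not a Cassandra API: the function names are hypothetical, the map is copied by hand from the system_schema.keyspaces output in the thread, and the quorum formula assumed here is (sum of all replication factors) // 2 + 1.

```python
def check_auth_replication(replication: dict, datacenters: list) -> list:
    """Return a list of problems found in a keyspace replication map."""
    problems = []
    # Accept both the short and the fully qualified class name.
    if not replication.get("class", "").endswith("NetworkTopologyStrategy"):
        problems.append("replication class is not NetworkTopologyStrategy")
    for dc in datacenters:
        if int(replication.get(dc, 0)) < 1:
            problems.append(f"no replicas configured in datacenter {dc}")
    return problems


def quorum_size(replication: dict) -> int:
    """Replicas that must respond to a QUORUM read across all datacenters."""
    total_rf = sum(int(rf) for key, rf in replication.items() if key != "class")
    return total_rf // 2 + 1


# Replication map as posted in the thread for system_auth.
system_auth = {
    "class": "org.apache.cassandra.locator.NetworkTopologyStrategy",
    "us-east": "6",
    "us-east-backup": "1",
}

print(check_auth_replication(system_auth, ["us-east", "us-east-backup"]))
print(quorum_size(system_auth))
```

With these values the check reports no problems, and a QUORUM read of the auth tables would need 4 of the 7 replicas to respond, which is why logging in as the default "cassandra" role tends to be more expensive than using a custom role.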