Hi Community,

Context: We are running a 6-node cluster in production on AWS with RF=3. We recently moved from physical servers to the cloud by adding a new DC and then removing the old one. Every other application is working fine; only this one is affected.
We recently started experiencing read timeouts in one of our production applications, where the client threw:

Error An unexpected error occurred server side on ip-IP.ec2.internal: com.google.common.util.concurrent.UncheckedExecutionException: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.RuntimeException: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.

com.datastax.driver.core.exceptions.ServerError: An unexpected error occurred server side : com.google.common.util.concurrent.UncheckedExecutionException: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.RuntimeException: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
    at com.datastax.driver.core.exceptions.ServerError.copy(ServerError.java:63) ~[cassandra-driver-core-3.3.0-shaded.jar!/:?]
    at com.datastax.driver.core.exceptions.ServerError.copy(ServerError.java:25) ~[cassandra-driver-core-3.3.0-shaded.jar!/:?]
    at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37) ~[cassandra-driver-core-3.3.0-shaded.jar!/:?]
    at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245) ~[cassandra-driver-core-3.3.0-shaded.jar!/:?]
    at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:68) ~[cassandra-driver-core-3.3.0-shaded.jar!/:?]
    ............
contd.

com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency LOCAL_QUORUM (2 responses were required but only 1 replica responded)

Around the same time, these were the errors on the server side (from the Cassandra logs):

ERROR [RolesCacheRefresh:1] 2021-07-26 06:32:43,094 CassandraDaemon.java:207 - Exception in thread Thread[RolesCacheRefresh:1,5,main]
java.lang.RuntimeException: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
    at org.apache.cassandra.auth.CassandraRoleManager.getRole(CassandraRoleManager.java:512) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.auth.CassandraRoleManager.getRoles(CassandraRoleManager.java:280) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.auth.RolesCache$1$1.call(RolesCache.java:135) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.auth.RolesCache$1$1.call(RolesCache.java:130) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_131]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_131]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_131]
    at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) [apache-cassandra-3.0.13.jar:3.0.13]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]
Caused by: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
    at org.apache.cassandra.service.ReadCallback.awaitResults(ReadCallback.java:132) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:137) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.service.AbstractReadExecutor.get(AbstractReadExecutor.java:145) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.service.StorageProxy$SinglePartitionReadLifecycle.awaitResultsAndRetryOnDigestMismatch(StorageProxy.java:1715) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1664) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1605) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1524) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.db.SinglePartitionReadCommand$Group.execute(SinglePartitionReadCommand.java:955) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:263) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:224) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.auth.CassandraRoleManager.getRoleFromTable(CassandraRoleManager.java:520) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.auth.CassandraRoleManager.getRole(CassandraRoleManager.java:502) ~[apache-cassandra-3.0.13.jar:3.0.13]

ERROR [PermissionsCacheRefresh:1] 2021-07-26 07:11:25,804 CassandraDaemon.java:207 - Exception in thread Thread[PermissionsCacheRefresh:1,5,main]
java.lang.RuntimeException: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
    at org.apache.cassandra.auth.CassandraRoleManager.getRole(CassandraRoleManager.java:512) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.auth.CassandraRoleManager.isSuper(CassandraRoleManager.java:304) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.auth.Roles.hasSuperuserStatus(Roles.java:52) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.auth.AuthenticatedUser.isSuper(AuthenticatedUser.java:71) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.auth.CassandraAuthorizer.authorize(CassandraAuthorizer.java:76) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.auth.PermissionsCache$1$1.call(PermissionsCache.java:136) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.auth.PermissionsCache$1$1.call(PermissionsCache.java:131) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_131]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_131]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_131]
    at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) [apache-cassandra-3.0.13.jar:3.0.13]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]
Caused by: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
    at org.apache.cassandra.service.ReadCallback.awaitResults(ReadCallback.java:132) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:137) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.service.AbstractReadExecutor.get(AbstractReadExecutor.java:145) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.service.StorageProxy$SinglePartitionReadLifecycle.awaitResultsAndRetryOnDigestMismatch(StorageProxy.java:1715) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1664) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1605) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1524) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.db.SinglePartitionReadCommand$Group.execute(SinglePartitionReadCommand.java:955) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:263) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:224) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.auth.CassandraRoleManager.getRoleFromTable(CassandraRoleManager.java:520) ~[apache-cassandra-3.0.13.jar:3.0.13]
    at org.apache.cassandra.auth.CassandraRoleManager.getRole(CassandraRoleManager.java:502) ~[apache-cassandra-3.0.13.jar:3.0.13]

These are the values of the relevant params in my configuration file:

permissions_validity_in_ms: 300000
permissions_update_interval_in_ms: 20000
roles_validity_in_ms: 300000
roles_update_interval_in_ms: 20000

We did not see these errors before, and since they come from a single app alone, we are not sure whether these settings are actually the issue.
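To make the timing implied by these four values concrete, here is a rough Python sketch. It assumes the semantics described in the cassandra.yaml comments as I understand them: a cached entry becomes eligible for an asynchronous refresh once the update interval has passed, and expires once the validity period has passed. The class and property names are made up for illustration and are not part of Cassandra.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthCacheTiming:
    """Timing implied by one *_validity_in_ms / *_update_interval_in_ms pair.

    Hypothetical helper for illustration only; not a Cassandra API.
    """
    validity_ms: int
    update_interval_ms: int

    @property
    def refresh_eligible_after_s(self) -> float:
        # Assumption: after this long, the next access schedules a background reload.
        return self.update_interval_ms / 1000

    @property
    def expires_after_s(self) -> float:
        # Assumption: after this long, the entry is dropped and must be reloaded.
        return self.validity_ms / 1000

    @property
    def stale_window_s(self) -> float:
        # Time during which the old value can still be served from the cache
        # even if every background refresh attempt is failing.
        return max(self.validity_ms - self.update_interval_ms, 0) / 1000


# Values copied from the configuration above.
roles = AuthCacheTiming(validity_ms=300_000, update_interval_ms=20_000)
permissions = AuthCacheTiming(validity_ms=300_000, update_interval_ms=20_000)

for name, timing in (("roles", roles), ("permissions", permissions)):
    print(
        f"{name}: refresh eligible after {timing.refresh_eligible_after_s:.0f}s, "
        f"expires after {timing.expires_after_s:.0f}s, "
        f"stale window {timing.stale_window_s:.0f}s"
    )
```

If those assumptions hold, both caches attempt a background reload on access once an entry is 20 seconds old and keep serving the old value for up to 300 seconds, which would fit the RolesCacheRefresh and PermissionsCacheRefresh threads being the ones that log the timeouts.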
Can anyone please point out whether these values are misconfigured and causing the issue, or whether we should be looking somewhere else? Any help would be appreciated.

Thanks & Regards,
Chahat.