[
https://issues.apache.org/jira/browse/CASSANDRA-8194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266056#comment-14266056
]
Sam Tunnicliffe commented on CASSANDRA-8194:
--------------------------------------------
While there is a window during which a stale set of permissions is used, under
normal operation I don't think this *should* present too many practical
problems.
Refresh is triggered by the first lookup after permissions_validity_in_ms, so
we'll continue to use the stale set between that point and when that refresh
actually completes. Outside of tests though, clients have no
visibility/expectation about the precise load or expiry timings, so this
shouldn't usually matter. My concern is that performing every
IAuthorizer.authorize call on a single thread via StorageService.tasks,
instead of distributing them across client request threads, could cause a
backlog and allow the window to grow unacceptably (plus, these tasks would also
be contending with other users of the shared executor).
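To make the stale window concrete, here is a minimal, JDK-only sketch of the
refresh-after-expiry behaviour described above. It is not the patch and does
not use Guava; StaleWhileRefreshCache, its fields, and the injected clock are
hypothetical names for illustration, and the loader stands in for
IAuthorizer.authorize.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical illustration only: serves the cached value while a refresh is pending.
final class StaleWhileRefreshCache<K, V>
{
    private static final class Entry<V>
    {
        final V value;
        final long loadedAtMillis;
        // Ensures only the first lookup after expiry schedules a refresh.
        final AtomicBoolean refreshing = new AtomicBoolean(false);

        Entry(V value, long loadedAtMillis)
        {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final ConcurrentHashMap<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;       // stand-in for IAuthorizer.authorize
    private final Executor refreshExecutor;    // where background refreshes run
    private final LongSupplier clockMillis;    // injectable clock, for testing
    private final long validityMillis;         // plays the role of permissions_validity_in_ms

    StaleWhileRefreshCache(Function<K, V> loader, Executor refreshExecutor,
                           LongSupplier clockMillis, long validityMillis)
    {
        this.loader = loader;
        this.refreshExecutor = refreshExecutor;
        this.clockMillis = clockMillis;
        this.validityMillis = validityMillis;
    }

    V get(K key)
    {
        // The very first load still happens on the caller's thread.
        Entry<V> entry = entries.computeIfAbsent(
            key, k -> new Entry<>(loader.apply(k), clockMillis.getAsLong()));

        boolean expired = clockMillis.getAsLong() - entry.loadedAtMillis >= validityMillis;
        if (expired && entry.refreshing.compareAndSet(false, true))
        {
            // No failure handling here: if the loader throws, the stale entry stays.
            refreshExecutor.execute(
                () -> entries.put(key, new Entry<>(loader.apply(key), clockMillis.getAsLong())));
        }
        // Until the refresh task has run, callers keep seeing the old value.
        return entry.value;
    }
}
```

The length of the window is simply how long the task waits in, and runs on,
refreshExecutor, which is why a backlog on a shared single thread matters.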
The point about the proliferation of threads and executors is valid, but maybe
there's a case for a dedicated executor here. We could make it a TPE with a
default poolsize of 1 but allow that to be increased via a system property if
necessary.
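Roughly what I have in mind for the dedicated executor, as a sketch rather than
the actual patch. The class name, thread name, and system property name below
are placeholders I made up for illustration; nothing here is an existing
Cassandra setting.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a dedicated pool for permissions cache refresh.
final class PermissionsRefreshExecutor
{
    // Placeholder property name, chosen for this sketch only.
    static final String POOL_SIZE_PROPERTY = "cassandra.permissions_refresh_threads";

    static int poolSize()
    {
        // Integer.getInteger falls back to the default if the property is unset or not a number.
        int configured = Integer.getInteger(POOL_SIZE_PROPERTY, 1);
        return Math.max(1, configured);
    }

    static ThreadPoolExecutor create()
    {
        final AtomicInteger counter = new AtomicInteger();
        ThreadFactory factory = new ThreadFactory()
        {
            public Thread newThread(Runnable task)
            {
                Thread thread = new Thread(task, "PermissionsCacheRefresh:" + counter.incrementAndGet());
                thread.setDaemon(true); // do not keep the JVM alive for cache refreshes
                return thread;
            }
        };

        int size = poolSize();
        // Fixed-size pool with an unbounded queue: refreshes wait rather than get rejected.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(size, size,
                                                             60L, TimeUnit.SECONDS,
                                                             new LinkedBlockingQueue<Runnable>(),
                                                             factory);
        executor.allowCoreThreadTimeOut(true); // idle refresh threads can go away
        return executor;
    }
}
```

With the placeholder name above, starting the JVM with
-Dcassandra.permissions_refresh_threads=4 would give four refresh threads;
leaving it unset keeps the single-thread default.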
What may be more of an issue is that we'll continue to serve the stale perms for
as long as the refresh keeps failing because IAuthorizer.authorize throws an
exception. This shouldn't really happen with CassandraAuthorizer, but other
IAuthorizer impls could well encounter errors when fetching perms. To guard
against that, we can force an invalidation if the ListenableFutureTask
encounters an exception. That would pretty much maintain current behaviour,
with the client receiving an error response while the refresh fails (actually,
the authorize calls after an error would serve stale perms until the exception
is thrown & caught, but all subsequent calls would fail as per current
behaviour).
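The invalidate-on-failure idea can be sketched with JDK classes only. This is
an illustration, not the v3 patch: it uses CompletableFuture in place of
Guava's ListenableFutureTask, and InvalidateOnFailedRefresh and its methods are
hypothetical names. The loader again stands in for IAuthorizer.authorize.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.Function;

// Hypothetical sketch: drop the cached entry when a background refresh fails.
final class InvalidateOnFailedRefresh<K, V>
{
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // stand-in for IAuthorizer.authorize; must not return null
    private final Executor executor;       // where refreshes run

    InvalidateOnFailedRefresh(Function<K, V> loader, Executor executor)
    {
        this.loader = loader;
        this.executor = executor;
    }

    V get(K key)
    {
        V cached = cache.get(key);
        if (cached != null)
            return cached;
        // A miss loads on the caller's thread, so a loader failure reaches the client.
        V loaded = loader.apply(key);
        cache.put(key, loaded);
        return loaded;
    }

    boolean isCached(K key)
    {
        return cache.containsKey(key);
    }

    CompletableFuture<Void> refresh(K key)
    {
        return CompletableFuture
            .supplyAsync(() -> loader.apply(key), executor)
            .<Void>handle((value, error) -> {
                if (error != null)
                    cache.remove(key);     // force invalidation: stop serving stale perms
                else
                    cache.put(key, value); // successful refresh replaces the old value
                return null;
            });
    }
}
```

Once the failed refresh has removed the entry, the next get() goes back through
the loader on the request thread and surfaces the error to the client, which is
the existing behaviour.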
I've attached a v3 with this second change. What are your thoughts on reverting
to a dedicated executor for cache refresh?
Also, as I mentioned, tests do have concrete expectations about expiry of
permissions and so this breaks auth_test.py:TestAuth.permissions_caching_test.
I've pushed a fix [here|https://github.com/beobal/cassandra-dtest/tree/8194]
and I'll open a PR shortly.
> Reading from Auth table should not be in the request path
> ---------------------------------------------------------
>
> Key: CASSANDRA-8194
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8194
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Vishy Kasar
> Assignee: Vishy Kasar
> Priority: Minor
> Fix For: 2.0.12, 3.0
>
> Attachments: 8194-V2.patch, 8194.patch, CacheTest2.java
>
>
> We use PasswordAuthenticator and PasswordAuthorizer. The system_auth has a RF
> of 10 per DC over 2 DCs. The permissions_validity_in_ms is 5 minutes.
> We still have a few thousand requests failing each day with the trace below.
> The reason is that a read request finds the cached entry has expired and
> performs a blocking request to refresh the cache.
> The cache should be refreshed periodically, in the background only. The user
> request should simply look at the cache and not try to refresh it.
> com.google.common.util.concurrent.UncheckedExecutionException:
> java.lang.RuntimeException:
> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out -
> received only 0 responses.
> at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2258)
> at com.google.common.cache.LocalCache.get(LocalCache.java:3990)
> at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3994)
> at
> com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4878)
> at
> org.apache.cassandra.service.ClientState.authorize(ClientState.java:292)
> at
> org.apache.cassandra.service.ClientState.ensureHasPermission(ClientState.java:172)
> at
> org.apache.cassandra.service.ClientState.hasAccess(ClientState.java:165)
> at
> org.apache.cassandra.service.ClientState.hasColumnFamilyAccess(ClientState.java:149)
> at
> org.apache.cassandra.cql3.statements.ModificationStatement.checkAccess(ModificationStatement.java:75)
> at
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:102)
> at
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:113)
> at
> org.apache.cassandra.thrift.CassandraServer.execute_cql3_query(CassandraServer.java:1735)
> at
> org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4162)
> at
> org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4150)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
> at
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:722)
> Caused by: java.lang.RuntimeException:
> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out -
> received only 0 responses.
> at org.apache.cassandra.auth.Auth.selectUser(Auth.java:256)
> at org.apache.cassandra.auth.Auth.isSuperuser(Auth.java:84)
> at
> org.apache.cassandra.auth.AuthenticatedUser.isSuper(AuthenticatedUser.java:50)
> at
> org.apache.cassandra.auth.CassandraAuthorizer.authorize(CassandraAuthorizer.java:68)
> at org.apache.cassandra.service.ClientState$1.load(ClientState.java:278)
> at org.apache.cassandra.service.ClientState$1.load(ClientState.java:275)
> at
> com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3589)
> at
> com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2374)
> at
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2337)
> at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2252)
> ... 19 more
> Caused by: org.apache.cassandra.exceptions.ReadTimeoutException: Operation
> timed out - received only 0 responses.
> at org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:105)
> at
> org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:943)
> at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:828)
> at
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:140)
> at org.apache.cassandra.auth.Auth.selectUser(Auth.java:245)
> ... 28 more
> ERROR [Thrift:17232] 2014-10-24 05:06:51,004 CustomTThreadPoolServer.java
> (line 224) Error occurred during processing of message.
> com.google.common.util.concurrent.UncheckedExecutionException:
> java.lang.RuntimeException:
> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out -
> received only 0 responses.
> at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2258)
> at com.google.common.cache.LocalCache.get(LocalCache.java:3990)
> at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3994)
> at
> com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4878)
> at
> org.apache.cassandra.service.ClientState.authorize(ClientState.java:292)
> at
> org.apache.cassandra.service.ClientState.ensureHasPermission(ClientState.java:172)
> at
> org.apache.cassandra.service.ClientState.hasAccess(ClientState.java:165)
> at
> org.apache.cassandra.service.ClientState.hasColumnFamilyAccess(ClientState.java:149)
> at
> org.apache.cassandra.cql3.statements.SelectStatement.checkAccess(SelectStatement.java:116)
> at
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:102)
> at
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:113)
> at
> org.apache.cassandra.thrift.CassandraServer.execute_cql3_query(CassandraServer.java:1735)
> at
> org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4162)
> at
> org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4150)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
> at
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:722)
> Caused by: java.lang.RuntimeException:
> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out -
> received only 0 responses.
> at org.apache.cassandra.auth.Auth.selectUser(Auth.java:256)
> at org.apache.cassandra.auth.Auth.isSuperuser(Auth.java:84)
> at
> org.apache.cassandra.auth.AuthenticatedUser.isSuper(AuthenticatedUser.java:50)
> at
> org.apache.cassandra.auth.CassandraAuthorizer.authorize(CassandraAuthorizer.java:68)
> at org.apache.cassandra.service.ClientState$1.load(ClientState.java:278)
> at org.apache.cassandra.service.ClientState$1.load(ClientState.java:275)
> at
> com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3589)
> at
> com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2374)
> at
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2337)
> at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2252)
> ... 19 more
> Caused by: org.apache.cassandra.exceptions.ReadTimeoutException: Operation
> timed out - received only 0 responses.
> at org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:105)
> at
> org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:943)
> at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:828)
> at
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:140)
> at org.apache.cassandra.auth.Auth.selectUser(Auth.java:245)
> ... 28 more