Re: Replication factor, LOCAL_QUORUM write consistency and materialized views
Hi,

On Fri, May 17, 2024 at 6:18 PM Jon Haddad wrote:
> I strongly suggest you don't use materialized views at all. There are
> edge cases that in my opinion make them unsuitable for production, both in
> terms of cluster stability as well as data integrity.

Oh, there is already a fresh, open Jira ticket about it: https://issues.apache.org/jira/browse/CASSANDRA-19383

Bye,
Gábor AUTH
Re: Replication factor, LOCAL_QUORUM write consistency and materialized views
Hi,

On Fri, May 17, 2024 at 6:18 PM Jon Haddad wrote:
> I strongly suggest you don't use materialized views at all. There are
> edge cases that in my opinion make them unsuitable for production, both in
> terms of cluster stability as well as data integrity.

I totally agree with you about that. But it still looks like a strange and interesting issue... the affected table has only ~1300 rows and less than 200 kB of data. :)

Also, I found the same issue reported here: https://dba.stackexchange.com/questions/325140/single-node-failure-in-cassandra-4-0-7-causes-cluster-to-run-into-high-cpu

Bye,
Gábor AUTH
Re: Replication factor, LOCAL_QUORUM write consistency and materialized views
I strongly suggest you don't use materialized views at all. There are edge cases that in my opinion make them unsuitable for production, both in terms of cluster stability as well as data integrity.

Jon
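In practice, Jon's advice means denormalizing by hand: instead of a materialized view, the application maintains a plain query table and writes both tables together. A minimal sketch (all keyspace, table, and column names here are illustrative assumptions, not the actual schema from this thread):

```cql
-- Hypothetical base table; names are assumptions for illustration only.
CREATE TABLE app.unit (
    id uuid PRIMARY KEY,
    name text,
    owner text
);

-- The "view" becomes an ordinary table keyed by the query you need.
CREATE TABLE app.unit_by_owner (
    owner text,
    id uuid,
    name text,
    PRIMARY KEY (owner, id)
);

-- The application updates both tables in a logged batch, so a partially
-- failed write is eventually completed and the copies cannot silently diverge.
BEGIN BATCH
    INSERT INTO app.unit (id, name, owner) VALUES (?, ?, ?);
    INSERT INTO app.unit_by_owner (owner, id, name) VALUES (?, ?, ?);
APPLY BATCH;
```

Note that a logged batch guarantees eventual application of all statements, not isolation: readers may briefly see one table updated before the other.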
Replication factor, LOCAL_QUORUM write consistency and materialized views
Hi,

I know, I know, materialized views are experimental... :)

So, I ran into a strange error. Among others, I have a very small 4-node cluster with very little data (~100 MB in total), the keyspace's replication factor is 3, and everything works fine... except: if I restart a node, I get a lot of errors about materialized views and consistency level ONE, but only for those tables that have more than one materialized view.

Tables without a materialized view work fine.
Tables with exactly one materialized view also work fine.
But with a table that has more than one materialized view, whoops, the cluster temporarily breaks down, and I can also see on the calling side (Java backend) that no nodes are responding:

Caused by: com.datastax.driver.core.exceptions.WriteFailureException:
Cassandra failure during write query at consistency LOCAL_QUORUM (2
responses were required but only 1 replica responded, 2 failed)

I am surprised by this behavior because there is so little data involved, and it only occurs when there is more than one materialized view, so it might be a concurrency issue under the hood.

Have you seen an issue like this?

Here is the stack trace on the Cassandra side:

[cassandra-dc03-1] ERROR [MutationStage-1] 2024-05-17 08:51:47,425 Keyspace.java:652 - Unknown exception caught while attempting to update MaterializedView! pope.unit
[cassandra-dc03-1] org.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level ONE
[cassandra-dc03-1] at org.apache.cassandra.exceptions.UnavailableException.create(UnavailableException.java:37)
[cassandra-dc03-1] at org.apache.cassandra.locator.ReplicaPlans.assureSufficientLiveReplicas(ReplicaPlans.java:170)
[cassandra-dc03-1] at org.apache.cassandra.locator.ReplicaPlans.assureSufficientLiveReplicasForWrite(ReplicaPlans.java:113)
[cassandra-dc03-1] at org.apache.cassandra.locator.ReplicaPlans.forWrite(ReplicaPlans.java:354)
[cassandra-dc03-1] at org.apache.cassandra.locator.ReplicaPlans.forWrite(ReplicaPlans.java:345)
[cassandra-dc03-1] at org.apache.cassandra.locator.ReplicaPlans.forWrite(ReplicaPlans.java:339)
[cassandra-dc03-1] at org.apache.cassandra.service.StorageProxy.wrapViewBatchResponseHandler(StorageProxy.java:1312)
[cassandra-dc03-1] at org.apache.cassandra.service.StorageProxy.mutateMV(StorageProxy.java:1004)
[cassandra-dc03-1] at org.apache.cassandra.db.view.TableViews.pushViewReplicaUpdates(TableViews.java:167)
[cassandra-dc03-1] at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:647)
[cassandra-dc03-1] at org.apache.cassandra.db.Keyspace.applyFuture(Keyspace.java:477)
[cassandra-dc03-1] at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:210)
[cassandra-dc03-1] at org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:58)
[cassandra-dc03-1] at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
[cassandra-dc03-1] at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
[cassandra-dc03-1] at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
[cassandra-dc03-1] at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:432)
[cassandra-dc03-1] at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
[cassandra-dc03-1] at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:165)
[cassandra-dc03-1] at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:137)
[cassandra-dc03-1] at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:119)
[cassandra-dc03-1] at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
[cassandra-dc03-1] at java.base/java.lang.Thread.run(Unknown Source)

--
Bye,
Gábor AUTH
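For reference, a minimal setup matching the description above would be a keyspace with replication factor 3 and one base table carrying two materialized views (all names below are illustrative assumptions, not the actual schema). With RF 3, LOCAL_QUORUM requires floor(3/2) + 1 = 2 live replicas, which is exactly the "2 responses were required" in the WriteFailureException quoted above:

```cql
-- Illustrative repro sketch; keyspace/table/view names are assumptions.
CREATE KEYSPACE demo
    WITH replication = {'class': 'NetworkTopologyStrategy', 'dc03': 3};

CREATE TABLE demo.unit (
    id uuid PRIMARY KEY,
    name text,
    owner text
);

-- Two materialized views on the same base table: the combination reported
-- to trigger the UnavailableException while a node is restarting.
CREATE MATERIALIZED VIEW demo.unit_by_name AS
    SELECT id, name, owner FROM demo.unit
    WHERE name IS NOT NULL AND id IS NOT NULL
    PRIMARY KEY (name, id);

CREATE MATERIALIZED VIEW demo.unit_by_owner AS
    SELECT id, name, owner FROM demo.unit
    WHERE owner IS NOT NULL AND id IS NOT NULL
    PRIMARY KEY (owner, id);
```

Restarting one of the four nodes and writing to demo.unit at LOCAL_QUORUM should then exercise the same internal view-update path shown in the stack trace.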