Re: Replication factor, LOCAL_QUORUM write consistency and materialized views

2024-05-17 Thread Gábor Auth
Hi,

On Fri, May 17, 2024 at 6:18 PM Jon Haddad  wrote:

> I strongly suggest you don't use materialized views at all. There are
> edge cases that, in my opinion, make them unsuitable for production, both
> in terms of cluster stability and data integrity.
>

Oh, there is already a fresh, open Jira ticket about it:
https://issues.apache.org/jira/browse/CASSANDRA-19383

Bye,
Gábor AUTH


Re: Replication factor, LOCAL_QUORUM write consistency and materialized views

2024-05-17 Thread Gábor Auth
Hi,

On Fri, May 17, 2024 at 6:18 PM Jon Haddad  wrote:

> I strongly suggest you don't use materialized views at all. There are
> edge cases that, in my opinion, make them unsuitable for production, both
> in terms of cluster stability and data integrity.
>

I totally agree with you about that. But it looks like a strange and
interesting issue... the affected table has only ~1,300 rows and less than
200 kB of data. :)

Also, I found a similar issue:
https://dba.stackexchange.com/questions/325140/single-node-failure-in-cassandra-4-0-7-causes-cluster-to-run-into-high-cpu

Bye,
Gábor AUTH


Re: Replication factor, LOCAL_QUORUM write consistency and materialized views

2024-05-17 Thread Jon Haddad
I strongly suggest you don't use materialized views at all. There are edge
cases that, in my opinion, make them unsuitable for production, both in
terms of cluster stability and data integrity.

Jon


Replication factor, LOCAL_QUORUM write consistency and materialized views

2024-05-17 Thread Gábor Auth
Hi,

I know, I know, materialized views are experimental... :)

So, I ran into a strange error. Among others, I have a very small 4-node
cluster with very little data (~100 MB in total); the keyspace's
replication factor is 3 and everything works fine... except: if I restart
a node, I get a lot of errors about materialized views and consistency
level ONE, but only for those tables that have more than one materialized
view.
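
For reference, the schema is shaped roughly like this (the keyspace and
base table are the real pope.unit from the log below, but the columns and
the view definitions are simplified stand-ins, not the exact production
DDL):

CREATE KEYSPACE pope
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc03': 3};

CREATE TABLE pope.unit (
  id uuid,
  name text,
  owner text,
  PRIMARY KEY (id)
);

-- a single view on a base table is fine...
CREATE MATERIALIZED VIEW pope.unit_by_name AS
  SELECT * FROM pope.unit
  WHERE name IS NOT NULL AND id IS NOT NULL
  PRIMARY KEY (name, id);

-- ...the trouble starts when a second one exists
CREATE MATERIALIZED VIEW pope.unit_by_owner AS
  SELECT * FROM pope.unit
  WHERE owner IS NOT NULL AND id IS NOT NULL
  PRIMARY KEY (owner, id);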

Tables without a materialized view don't show the problem and work fine.
Tables with exactly one materialized view also work fine.
But with a table that has more than one materialized view, whoops, the
cluster crashes temporarily, and I can also see on the calling side (Java
backend) that no nodes are responding:

Caused by: com.datastax.driver.core.exceptions.WriteFailureException:
Cassandra failure during write query at consistency LOCAL_QUORUM (2
responses were required but only 1 replica responded, 2 failed)
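
(While one node is down, the failure should also be reproducible straight
from cqlsh with something like the following; the values, and the columns
from the sketch above, are made up:)

CONSISTENCY LOCAL_QUORUM;
INSERT INTO pope.unit (id, name, owner)
VALUES (uuid(), 'unit-0001', 'owner-0001');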

I am surprised by this behavior because there is so little data involved,
and it occurs only when there is more than one materialized view, so it
might be a concurrency issue under the hood. (As far as I understand, each
base replica forwards its view updates to a paired view replica at
consistency level ONE, which would explain why the node-side error mentions
ONE while the client writes at LOCAL_QUORUM.)

Have you seen an issue like this?

Here is a stack trace from the Cassandra side:

[cassandra-dc03-1] ERROR [MutationStage-1] 2024-05-17 08:51:47,425
Keyspace.java:652 - Unknown exception caught while attempting to update
MaterializedView! pope.unit
[cassandra-dc03-1] org.apache.cassandra.exceptions.UnavailableException:
Cannot achieve consistency level ONE
[cassandra-dc03-1]  at
org.apache.cassandra.exceptions.UnavailableException.create(UnavailableException.java:37)
[cassandra-dc03-1]  at
org.apache.cassandra.locator.ReplicaPlans.assureSufficientLiveReplicas(ReplicaPlans.java:170)
[cassandra-dc03-1]  at
org.apache.cassandra.locator.ReplicaPlans.assureSufficientLiveReplicasForWrite(ReplicaPlans.java:113)
[cassandra-dc03-1]  at
org.apache.cassandra.locator.ReplicaPlans.forWrite(ReplicaPlans.java:354)
[cassandra-dc03-1]  at
org.apache.cassandra.locator.ReplicaPlans.forWrite(ReplicaPlans.java:345)
[cassandra-dc03-1]  at
org.apache.cassandra.locator.ReplicaPlans.forWrite(ReplicaPlans.java:339)
[cassandra-dc03-1]  at
org.apache.cassandra.service.StorageProxy.wrapViewBatchResponseHandler(StorageProxy.java:1312)
[cassandra-dc03-1]  at
org.apache.cassandra.service.StorageProxy.mutateMV(StorageProxy.java:1004)
[cassandra-dc03-1]  at
org.apache.cassandra.db.view.TableViews.pushViewReplicaUpdates(TableViews.java:167)
[cassandra-dc03-1]  at
org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:647)
[cassandra-dc03-1]  at
org.apache.cassandra.db.Keyspace.applyFuture(Keyspace.java:477)
[cassandra-dc03-1]  at
org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:210)
[cassandra-dc03-1]  at
org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:58)
[cassandra-dc03-1]  at
org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
[cassandra-dc03-1]  at
org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
[cassandra-dc03-1]  at
org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
[cassandra-dc03-1]  at
org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:432)
[cassandra-dc03-1]  at
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown
Source)
[cassandra-dc03-1]  at
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:165)
[cassandra-dc03-1]  at
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:137)
[cassandra-dc03-1]  at
org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:119)
[cassandra-dc03-1]  at
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
[cassandra-dc03-1]  at java.base/java.lang.Thread.run(Unknown Source)

-- 
Bye,
Gábor AUTH