Re: CAS operation result is unknown - proposal accepted by 1 but not a quorum

Ralph Boehme Tue, 11 Apr 2023 12:14:44 -0700

On 4/11/23 19:53, Bowen Song via user wrote:

That error message sounds like one of the nodes timed out in the Paxos propose stage. You can check the system.log and gc.log and see if you can find anything unusual in them, such as network errors, out-of-sync clocks or long stop-the-world GC pauses.

Hm, I'll check the logs, but I can reproduce this 100% of the time on an otherwise idle test cluster just by running a simple test client that generates a smallish workload: just 2 processes on a single host hammering the Cassandra cluster with LWTs.
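
For illustration only, here is a minimal sketch of how a client could cope with an "unknown" CAS outcome: instead of blindly retrying, it reads the value back and checks whether its own write landed. Everything here is a hypothetical stand-in (`FakeRegister`, `CasResultUnknown`, `cas_resolving_unknown`); it is an in-memory simulation, not the Cassandra driver API and not the test client described above. It assumes each writer proposes a value unique to itself, so that the read-back is unambiguous, and that against a real cluster the read-back would be done at SERIAL consistency.

```python
import random


class CasResultUnknown(Exception):
    """Hypothetical error: the CAS may or may not have been applied."""


class FakeRegister:
    """In-memory stand-in for one LWT-guarded row (NOT a Cassandra client)."""

    def __init__(self, value, unknown_rate=0.3, seed=0):
        self.value = value
        self._unknown_rate = unknown_rate
        self._rng = random.Random(seed)

    def cas(self, expected, new):
        """Compare-and-set that sometimes reports an unknown outcome."""
        applied = self.value == expected
        if applied:
            self.value = new
        if self._rng.random() < self._unknown_rate:
            # The write may have been applied anyway; the caller cannot tell.
            raise CasResultUnknown()
        return applied

    def read_serial(self):
        """Stand-in for a read at SERIAL consistency."""
        return self.value


def cas_resolving_unknown(reg, expected, new):
    """Return True if `new` is known to be stored, False if the CAS lost."""
    try:
        return reg.cas(expected, new)
    except CasResultUnknown:
        # Outcome unknown: read back rather than retrying blindly.
        # Only unambiguous because `new` is unique to this writer.
        return reg.read_serial() == new


if __name__ == "__main__":
    reg = FakeRegister(value=("init", 0), unknown_rate=1.0)
    print(cas_resolving_unknown(reg, ("init", 0), ("writer-1", 1)))
    print(cas_resolving_unknown(reg, ("init", 0), ("writer-2", 1)))
```

The read-back only resolves the ambiguity; it does not make the failure less likely under contention.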


Maybe LWTs are not meant to be used this way?

BTW, since you said you want it to be fast, I think it's worth mentioning that LWT comes with additional cost and is much slower than a straightforward INSERT/UPDATE.


Sure, but we have to swallow that pill as we need linearizability.

You should avoid using it if possible. For example, if all of the Cassandra clients (samba servers) are running on the same machine, it may be far more efficient to use a lock than LWT.
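
As a rough sketch of the single-machine alternative mentioned above, the following shows a compare-and-set guarded by an exclusive file lock shared by all processes on one host. The function name `locked_compare_and_set`, the JSON state file and the `.lock`/`.tmp` suffixes are assumptions made for illustration; the local file merely stands in for whatever state would otherwise be guarded by an LWT. It relies on `fcntl.flock`, so it is Unix-only, and it offers nothing once clients run on more than one machine.

```python
import fcntl
import json
import os


def locked_compare_and_set(path, key, expected, new):
    """Set state[key] = new if it currently equals expected.

    All processes on this host are serialised by an exclusive flock
    on a sidecar lock file (path + ".lock").
    """
    with open(path + ".lock", "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until we hold the lock
        try:
            state = {}
            if os.path.exists(path):
                with open(path) as f:
                    state = json.load(f)
            if state.get(key) != expected:
                return False  # someone else changed it first
            state[key] = new
            tmp = path + ".tmp"
            with open(tmp, "w") as f:
                json.dump(state, f)
            os.replace(tmp, path)  # atomic rename: no partially written file
            return True
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

A missing key compares equal to `None`, so `locked_compare_and_set(path, "owner", None, "smbd-1")` acts as "insert if not exists".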

No, the goal is to design a huge scale-out SMB cluster spanning hundreds of nodes, used as a multi-tenant cloud SMB frontend, much like Microsoft Azure SMB.


Thanks!
-slow

