Andy Klages created CASSANDRA-12683:
---------------------------------------

             Summary: Batch statement fails with consistency ONE when only 1 node up in DC
                 Key: CASSANDRA-12683
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12683
             Project: Cassandra
          Issue Type: Bug
          Components: Coordination
         Environment: 3 Cassandra nodes (N1, N2, N3)
2 Data Centers (DC1, DC2)
N1 and N2 members of DC1
N3 a member of DC2
1 keyspace using SimpleStrategy with RF=3 (can also be 2):
CREATE KEYSPACE ks WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3};
1 table (column family) in the keyspace. For simplicity, a bigint partition key and one other column of any type; no clustering key is needed:
CREATE TABLE ks.test (
        id      bigint,
        flag    boolean,
        PRIMARY KEY(id)
);

            Reporter: Andy Klages
            Priority: Minor


If Cassandra node N2 is stopped, only one node (N1) is left running in DC1. Output from "nodetool status" is:

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load   Tokens   Owns   Host ID   Rack
DN  N2        ...
UN  N1        ...

Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load   Tokens   Owns   Host ID   Rack
UN  N3        ...


The following batch statement fails when executed on N1 at consistency level ONE using cqlsh (it also fails from Java using the DataStax driver):

CONSISTENCY ONE
BEGIN BATCH
UPDATE ks.test SET flag=true where id=1;
UPDATE ks.test SET flag=true where id=2;
APPLY BATCH;

The failure is:

  File "apache-cassandra-2.1.15/bin/../lib/cassandra-driver-internal-only-2.7.2.zip/cassandra-driver-2.7.2/cassandra/cluster.py", line 3347, in result
    raise self._final_exception
Unavailable: code=1000 [Unavailable exception] message="Cannot achieve consistency level ONE" info={'required_replicas': 1, 'alive_replicas': 0, 'consistency': 'ONE'}

Two replicas (N1 and N3) are alive, but for some reason Cassandra reports that there are none.

If each statement is executed individually (i.e. no batch), each one succeeds.
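For reference, the same updates run one at a time from cqlsh on N1 (no batch) complete without error:

CONSISTENCY ONE
UPDATE ks.test SET flag=true WHERE id=1;
UPDATE ks.test SET flag=true WHERE id=2;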

This same batch succeeds on N3, the other node in the cluster that is still running. My analysis shows the failure occurs when the batch is executed on a node in a DC where all other nodes in that DC are down, and the batch contains 2 or more statements with different partition keys. If all of the partition keys have the same value, the batch succeeds. Since the replication factor is set to the number of nodes in the cluster (full replication) and the consistency level is ONE, the batch statement should succeed.
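For example, a variant of the batch in which every statement targets the same partition key succeeds on N1 under the same conditions:

CONSISTENCY ONE
BEGIN BATCH
UPDATE ks.test SET flag=true WHERE id=1;
UPDATE ks.test SET flag=false WHERE id=1;
APPLY BATCH;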

2 workarounds for this (examples below) are:

1. Set the consistency level to ANY.
2. Use an unlogged batch.
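
Both workarounds, as run from cqlsh on N1:

Workaround 1 - consistency ANY with a logged batch:

CONSISTENCY ANY
BEGIN BATCH
UPDATE ks.test SET flag=true WHERE id=1;
UPDATE ks.test SET flag=true WHERE id=2;
APPLY BATCH;

Workaround 2 - unlogged batch at consistency ONE:

CONSISTENCY ONE
BEGIN UNLOGGED BATCH
UPDATE ks.test SET flag=true WHERE id=1;
UPDATE ks.test SET flag=true WHERE id=2;
APPLY BATCH;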




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
