James Lee created CASSANDRA-5789:
------------------------------------

             Summary: Data not fully replicated with 2 nodes and replication factor 2
                 Key: CASSANDRA-5789
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5789
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 1.2.6, 1.2.2
         Environment: Official DataStax Cassandra 1.2.6, running on Linux RHEL 6.2.
I've seen the same behavior with Cassandra 1.2.2.
Sun Java 1.7.0_10-b18 64-bit
Java heap settings: -Xms8192M -Xmx8192M -Xmn2048M
            Reporter: James Lee


I'm seeing a problem with a 2-node Cassandra test deployment, where it seems 
that data isn't being replicated among the nodes as I would expect.

The setup and test is as follows:
- Two Cassandra nodes in the cluster (they each have themselves and the other 
node as seeds in cassandra.yaml).
- Create 40 keyspaces, each with simple replication strategy and 
replication factor 2.
- Populate 125,000 rows into each keyspace, using a pycassa client with a 
connection pool pointed at both nodes.  These are populated with writes using 
consistency level of 1.
- Wait until nodetool on each node reports that there are no hinted handoffs 
outstanding (see output below).
- Do random reads of the rows in the keyspaces, again using a pycassa client 
with a connection pool pointed at both nodes.  These are read using consistency 
level 1.

I'm finding that the vast majority of reads are successful, but a small 
proportion (~0.1%) are returned as Not Found.  If I manually try to look up 
those keys using cassandra-cli, I see that they are returned when querying one 
of the nodes, but not when querying the other.  So it seems like some of the 
rows have simply not been replicated, even though the write for these rows was 
reported to the client as successful.

If I reduce the rate at which the test tool initially writes data into the 
database then I don't see any failed reads, so this seems like a load-related 
issue.  My understanding is that if all writes were successful and there are no 
pending hinted handoffs, then the data should be fully replicated and reads 
should return it (even with read and write consistency of 1).
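
The expectation above can be checked against a toy model. The sketch below is not Cassandra's write path; it is a minimal illustration with two in-memory replicas, assuming (hypothetically, and not confirmed by this report) that a replica can drop a mutation without a hint being stored for it. The names `Replica`, `write_cl_one` and `read_cl_one` are invented for the example. With replication factor N=2 and read and write consistency of 1, R + W equals N rather than exceeding it, so a read is not guaranteed to hit a replica that acknowledged the write.

```python
class Replica:
    """One node's local storage (hypothetical stand-in for a Cassandra node)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}

    def apply(self, key, value, drop=False):
        """Apply a mutation; return True if acknowledged, False if dropped."""
        if drop:
            return False
        self.rows[key] = value
        return True


def write_cl_one(replicas, key, value, dropped_on=()):
    """Send the mutation to every replica; succeed if at least one acks."""
    acks = sum(r.apply(key, value, drop=(r.name in dropped_on)) for r in replicas)
    return acks >= 1


def read_cl_one(replica, key):
    """Read from a single replica; None models a Not Found response."""
    return replica.rows.get(key)


node1, node2 = Replica("node1"), Replica("node2")
replicas = [node1, node2]

# Row "a" reaches both replicas; row "b" is dropped on node2 only (assumption).
ok_a = write_cl_one(replicas, "a", "v1")
ok_b = write_cl_one(replicas, "b", "v2", dropped_on={"node2"})

print(ok_a, ok_b)               # True True: both writes are reported as successful
print(read_cl_one(node1, "b"))  # v2
print(read_cl_one(node2, "b"))  # None, i.e. Not Found on the other node
```

In this model the client sees every write succeed, yet a read routed to node2 misses row "b", which matches the symptom described (found on one node, Not Found on the other).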

Here's the output from nodetool on the two nodes:

comet-mvs01:/dsc-cassandra-1.2.6# ./bin/nodetool tpstats
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                         0         0              2         0                 0
RequestResponseStage              0         0         878494         0                 0
MutationStage                     0         0        2869107         0                 0
ReadRepairStage                   0         0              0         0                 0
ReplicateOnWriteStage             0         0              0         0                 0
GossipStage                       0         0           2208         0                 0
AntiEntropyStage                  0         0              0         0                 0
MigrationStage                    0         0            994         0                 0
MemtablePostFlusher               0         0           4399         0                 0
FlushWriter                       0         0           2264         0               556
MiscStage                         0         0              0         0                 0
commitlog_archiver                0         0              0         0                 0
InternalResponseStage             0         0            153         0                 0
HintedHandoff                     0         0              2         0                 0

Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR                  0
BINARY                       0
READ                         0
MUTATION                 87655
_TRACE                       0
REQUEST_RESPONSE             0


comet-mvs02:/dsc-cassandra-1.2.6# ./bin/nodetool tpstats
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                         0         0            868         0                 0
RequestResponseStage              0         0        3919665         0                 0
MutationStage                     0         0        8177325         0                 0
ReadRepairStage                   0         0            113         0                 0
ReplicateOnWriteStage             0         0              0         0                 0
GossipStage                       0         0           9624         0                 0
AntiEntropyStage                  0         0              0         0                 0
MigrationStage                    0         0           2666         0                 0
MemtablePostFlusher               0         0           7869         0                 0
FlushWriter                       0         0           4273         0              1179
MiscStage                         0         0              0         0                 0
commitlog_archiver                0         0              0         0                 0
InternalResponseStage             0         0            215         0                 0
HintedHandoff                     0         0              8         0                 0

Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR                  0
BINARY                       0
READ                         0
MUTATION                531988
_TRACE                       0
REQUEST_RESPONSE             0
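
Both nodes report a non-zero Dropped count for MUTATION messages, which may be relevant to the missing rows. As a rough sketch, the Dropped section of the tpstats output can be pulled out and totalled with a small script. The helper name `dropped_counts` is made up for this example, and the sample text is copied from the two outputs above; the parser assumes the "Message type / Dropped" layout shown there.

```python
def dropped_counts(tpstats_text):
    """Return {message_type: dropped_count} parsed from `nodetool tpstats` text."""
    counts = {}
    in_dropped = False
    for line in tpstats_text.splitlines():
        fields = line.split()
        if fields[:2] == ["Message", "type"]:
            in_dropped = True       # header of the Dropped section
            continue
        if in_dropped:
            if len(fields) == 2 and fields[1].isdigit():
                counts[fields[0]] = int(fields[1])
            elif fields:
                in_dropped = False  # any other non-blank line ends the section
    return counts


# Dropped sections as reported for comet-mvs01 and comet-mvs02.
NODE1 = """\
Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR                  0
BINARY                       0
READ                         0
MUTATION                 87655
_TRACE                       0
REQUEST_RESPONSE             0
"""

NODE2 = """\
Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR                  0
BINARY                       0
READ                         0
MUTATION                531988
_TRACE                       0
REQUEST_RESPONSE             0
"""

total = dropped_counts(NODE1)["MUTATION"] + dropped_counts(NODE2)["MUTATION"]
print(total)  # 619643 dropped mutations across both nodes
```

Running the same function on output captured before and after the write phase would show whether the counters grow only under the higher write rate.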


