Russell Alexander Spitzer created CASSANDRA-7240:
----------------------------------------------------
Summary: Altering Keyspace Replication On Large Cluster With
vnodes Leads to Warns on All nodes
Key: CASSANDRA-7240
URL: https://issues.apache.org/jira/browse/CASSANDRA-7240
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: 1000 Nodes M1.large ubuntu 12.04
Reporter: Russell Alexander Spitzer
A 1000-node cluster was started with vnodes enabled (256). 25 separate nodes
then ran an all-write stress workload against those 1000 nodes. During the test
I attempted to alter the keyspace from SimpleStrategy to NetworkTopologyStrategy.
{code}
cqlsh> ALTER KEYSPACE "Keyspace1" WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2':'3'} AND durable_writes = true;
errors={}, last_host=127.0.0.1
cqlsh> ALTER KEYSPACE "Keyspace1" WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2':'3'} AND durable_writes = true;
('Unable to complete the operation against any hosts', {<Host: 127.0.0.1 DC1>: ConnectionShutdown('Connection to 127.0.0.1 is defunct',)})
{code}
All 1000 nodes then began repeating the following warning in their respective
logs:
{code}
WARN [Thread-50131] 2014-05-14 23:34:07,631 IncomingTcpConnection.java:91 - UnknownColumnFamilyException reading from socket; closing
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=46b7b090-dbaf-11e3-8413-fffd4403e7d2
    at org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164) ~[apache-cassandra-2.1.0-beta2.jar:2.1.0-beta2]
    at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97) ~[apache-cassandra-2.1.0-beta2.jar:2.1.0-beta2]
    at org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:318) ~[apache-cassandra-2.1.0-beta2.jar:2.1.0-beta2]
    at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:298) ~[apache-cassandra-2.1.0-beta2.jar:2.1.0-beta2]
    at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:326) ~[apache-cassandra-2.1.0-beta2.jar:2.1.0-beta2]
    at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:268) ~[apache-cassandra-2.1.0-beta2.jar:2.1.0-beta2]
    at org.apache.cassandra.net.MessageIn.read(MessageIn.java:99) ~[apache-cassandra-2.1.0-beta2.jar:2.1.0-beta2]
    at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:165) ~[apache-cassandra-2.1.0-beta2.jar:2.1.0-beta2]
    at org.apache.cassandra.net.IncomingTcpConnection.handleModernVersion(IncomingTcpConnection.java:147) ~[apache-cassandra-2.1.0-beta2.jar:2.1.0-beta2]
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:82) ~[apache-cassandra-2.1.0-beta2.jar:2.1.0-beta2]
{code}
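To check whether every node is complaining about the same table, the warnings can be tallied by cfId. The following is only a rough sketch: the helper name `count_missing_cfids` is hypothetical, and the regular expression assumes the message format shown in the log excerpt above (it tolerates the line wrapping introduced by mail clients).

```python
import re
from collections import Counter

# Matches the exception message from the log excerpt; \s+ allows the cfId
# to sit on the next line, as it does in wrapped mail output.
CFID_RE = re.compile(
    r"Couldn't find\s+cfId=([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
)


def count_missing_cfids(log_text):
    """Return a Counter mapping each unknown cfId to how often it was reported."""
    return Counter(CFID_RE.findall(log_text))


# Sample taken from the warning quoted in this report.
sample = """\
WARN [Thread-50131] 2014-05-14 23:34:07,631 IncomingTcpConnection.java:91 -
UnknownColumnFamilyException reading from socket; closing
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find
cfId=46b7b090-dbaf-11e3-8413-fffd4403e7d2
"""

counts = count_missing_cfids(sample)
for cfid, n in counts.most_common():
    print(cfid, n)
```

Run against the full system log of several nodes, a single dominant cfId would suggest that all of the rejected mutations target one table.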
Stress continued, but at a reduced rate:
{code}
Excerpt from one of the 25 Stress Nodes
83222847 , 14602, 14602, 6.7, 2.1, 23.1, 132.1, 292.3, 531.3, 5216.5, 0.00188
83239512 , 13888, 13888, 7.3, 2.1, 31.3, 129.9, 267.9, 555.8, 5217.7, 0.00188
83258520 , 14301, 14301, 7.0, 2.1, 28.8, 125.4, 297.2, 758.1, 5219.0, 0.00188
83277750 , 14023, 14023, 7.1, 2.1, 28.4, 132.8, 292.3, 703.6, 5220.4, 0.00188
83301413 , 14410, 14410, 6.9, 2.1, 24.5, 124.8, 391.4, 1010.1, 5222.0, 0.00188
83316846 , 12313, 12313, 8.1, 2.1, 35.1, 168.2, 275.3, 467.9, 5223.3, 0.00188
83332883 , 13753, 13753, 6.9, 2.1, 28.1, 132.2, 276.1, 498.9, 5224.4, 0.00188
#ALTER REQUEST HERE
83351413 , 9981, 9981, 9.9, 2.1, 46.7, 172.0, 447.8, 1327.9, 5226.3, 0.00188
83358381 , 4464, 4464, 22.7, 2.2, 125.9, 257.8, 594.6, 1650.6, 5227.8, 0.00188
83363153 , 3186, 3186, 31.7, 2.5, 153.0, 300.3, 477.0, 566.1, 5229.3, 0.00189
83367341 , 2967, 2967, 33.7, 2.4, 173.9, 311.5, 465.8, 761.9, 5230.7, 0.00190
83370738 , 2392, 2392, 41.4, 2.9, 208.0, 308.1, 434.8, 839.6, 5232.2, 0.00191
83373651 , 2283, 2283, 43.0, 2.5, 213.9, 310.5, 409.3, 503.3, 5233.4, 0.00192
{code}
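The size of the slowdown can be estimated directly from the excerpt above. This is a minimal sketch, not part of the original test harness: the function name `rates_around_marker` is hypothetical, and it assumes the second comma-separated column is the per-interval operation rate (the excerpt carries no column header, so that reading is a guess).

```python
from statistics import mean

ALTER_MARKER = "#ALTER REQUEST HERE"


def rates_around_marker(text, marker=ALTER_MARKER, rate_column=1):
    """Split stress rows at the marker line and return (before, after) rates.

    rate_column=1 assumes the second field is the per-interval op rate.
    """
    before, after = [], []
    bucket = before
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(marker):
            bucket = after  # every later row belongs to the "after" group
            continue
        fields = [f.strip() for f in line.split(",")]
        bucket.append(float(fields[rate_column]))
    return before, after


# Rows copied from the stress excerpt in this report, one row per line.
sample = """\
83222847 , 14602, 14602, 6.7, 2.1, 23.1, 132.1, 292.3, 531.3, 5216.5, 0.00188
83239512 , 13888, 13888, 7.3, 2.1, 31.3, 129.9, 267.9, 555.8, 5217.7, 0.00188
83258520 , 14301, 14301, 7.0, 2.1, 28.8, 125.4, 297.2, 758.1, 5219.0, 0.00188
83277750 , 14023, 14023, 7.1, 2.1, 28.4, 132.8, 292.3, 703.6, 5220.4, 0.00188
83301413 , 14410, 14410, 6.9, 2.1, 24.5, 124.8, 391.4, 1010.1, 5222.0, 0.00188
83316846 , 12313, 12313, 8.1, 2.1, 35.1, 168.2, 275.3, 467.9, 5223.3, 0.00188
83332883 , 13753, 13753, 6.9, 2.1, 28.1, 132.2, 276.1, 498.9, 5224.4, 0.00188
#ALTER REQUEST HERE
83351413 , 9981, 9981, 9.9, 2.1, 46.7, 172.0, 447.8, 1327.9, 5226.3, 0.00188
83358381 , 4464, 4464, 22.7, 2.2, 125.9, 257.8, 594.6, 1650.6, 5227.8, 0.00188
83363153 , 3186, 3186, 31.7, 2.5, 153.0, 300.3, 477.0, 566.1, 5229.3, 0.00189
83367341 , 2967, 2967, 33.7, 2.4, 173.9, 311.5, 465.8, 761.9, 5230.7, 0.00190
83370738 , 2392, 2392, 41.4, 2.9, 208.0, 308.1, 434.8, 839.6, 5232.2, 0.00191
83373651 , 2283, 2283, 43.0, 2.5, 213.9, 310.5, 409.3, 503.3, 5233.4, 0.00192
"""

before, after = rates_around_marker(sample)
drop = 1 - mean(after) / mean(before)
print(f"mean before: {mean(before):.0f}, mean after: {mean(after):.0f}, drop: {drop:.0%}")
```

Under that column assumption, the mean rate on this stress node falls from roughly 13,900 to roughly 4,200 per interval after the ALTER, a drop of about 70%, and the rate was still declining at the end of the excerpt.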
--
This message was sent by Atlassian JIRA
(v6.2#6252)