[
https://issues.apache.org/jira/browse/CASSANDRA-13277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15940277#comment-15940277
]
Andrés de la Peña commented on CASSANDRA-13277:
-----------------------------------------------
The underlying problem can be reproduced with a single node:
{code}
CREATE KEYSPACE k WITH replication = {'class': 'SimpleStrategy',
'replication_factor': 1};
CREATE TABLE k.c (
pk int,
ck int,
sc int static,
primary key (pk, ck)
);
CREATE index ON k.c (sc);
INSERT INTO k.c (pk, ck, sc) values (1, 2, 3);
INSERT INTO k.c (pk, ck, sc) values (-1, 2, 3);
SELECT token(pk), pk, ck, sc FROM k.c where sc = 3 AND token(pk) > 0;
system.token(pk) | pk | ck | sc
----------------------+----+----+----
-4069959284402364209 | 1 | 2 | 3
7297452126230313552 | -1 | 2 | 3
SELECT token(pk), pk, ck, sc FROM k.c where sc = 3 AND token(pk) <= 0;
system.token(pk) | pk | ck | sc
----------------------+----+----+----
-4069959284402364209 | 1 | 2 | 3
7297452126230313552 | -1 | 2 | 3
{code}
This happens because {{CompositesSearcher}} doesn't verify that index hits
satisfy the command's key constraint when dealing with static columns, as it
does for regular columns.
The examples in the issue description don't specify key restrictions, but they
fail when RF is less than the number of nodes because they are internally split
into subqueries directed to specific token ranges. Replicas ignore the token
range restriction, so the coordinator receives duplicate rows from unexpected
token ranges, as shown in the previous example.
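To make the mechanism concrete, here is a small toy model in Python. It is not Cassandra code: {{Row}}, {{replica_read}}, {{coordinator_read}} and the {{check_key_range}} flag are hypothetical names invented for illustration, and the two token ranges are an assumed split at token 0. It only sketches how skipping the key range check on index hits can turn two rows into four once a query is split into per-range subqueries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    token: int  # partitioner token of the partition key
    pk: int
    ck: int
    sc: int     # static column value


# The two rows from the single-node reproduction above.
ROWS = [
    Row(token=-4069959284402364209, pk=1, ck=2, sc=3),
    Row(token=7297452126230313552, pk=-1, ck=2, sc=3),
]


def index_hits(rows, sc_value):
    """Toy index lookup: every row whose static column matches."""
    return [r for r in rows if r.sc == sc_value]


def replica_read(rows, sc_value, token_range, check_key_range):
    """Answer one subquery restricted to low < token <= high."""
    low, high = token_range
    hits = index_hits(rows, sc_value)
    if check_key_range:
        # The check that is skipped for static columns in this toy model.
        hits = [r for r in hits if low < r.token <= high]
    return hits


def coordinator_read(rows, sc_value, ranges, check_key_range):
    """Concatenate the results of one subquery per token range."""
    result = []
    for token_range in ranges:
        result.extend(replica_read(rows, sc_value, token_range, check_key_range))
    return result


# Assumed split of the token ring into two ranges at token 0.
MIN_TOKEN, MAX_TOKEN = -2**63, 2**63 - 1
RANGES = [(MIN_TOKEN, 0), (0, MAX_TOKEN)]

buggy = coordinator_read(ROWS, 3, RANGES, check_key_range=False)
fixed = coordinator_read(ROWS, 3, RANGES, check_key_range=True)
print(len(buggy), len(fixed))  # 4 2
```

Without the check, each subquery returns both rows regardless of its token range, so the coordinator sees every row once per range. With the check, each row comes back only from the range that owns its token.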
An initial version of the patch can be found here:
||[trunk|https://github.com/apache/cassandra/compare/trunk...adelapena:13277-trunk]|[utests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13277-trunk-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13277-trunk-dtest/]|
||[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...adelapena:13277-3.11]|[utests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13277-3.11-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13277-3.11-dtest/]|
> Duplicate results with secondary index on static column
> -------------------------------------------------------
>
> Key: CASSANDRA-13277
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13277
> Project: Cassandra
> Issue Type: Bug
> Reporter: Romain Hardouin
> Assignee: Andrés de la Peña
> Labels: 2i
>
> As a follow-up to
> http://www.mail-archive.com/[email protected]/msg50816.html
> Duplicate results appear with a secondary index on a static column with RF > 1.
> The number of results varies depending on the consistency level.
> Here is a CCM session to reproduce the issue:
> {code}
> romain@debian:~$ ccm create 39 -n 3 -v 3.9 -s
> Current cluster is now: 39
> romain@debian:~$ ccm node1 cqlsh
> Connected to 39 at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.9 | CQL spec 3.4.2 | Native protocol v4]
> Use HELP for help.
> cqlsh> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy',
> 'replication_factor': 2};
> cqlsh> CREATE TABLE test.idx_static (id text, id2 bigint static, added
> timestamp, source text static, dest text, primary key (id, added));
> cqlsh> CREATE index ON test.idx_static (id2);
> cqlsh> INSERT INTO test.idx_static (id, id2, added, source, dest) values
> ('id1', 22,'2017-01-28', 'src1', 'dst1');
> cqlsh> SELECT * FROM test.idx_static where id2=22;
> id | added | id2 | source | dest
> -----+---------------------------------+-----+--------+------
> id1 | 2017-01-27 23:00:00.000000+0000 | 22 | src1 | dst1
> id1 | 2017-01-27 23:00:00.000000+0000 | 22 | src1 | dst1
> (2 rows)
> cqlsh> CONSISTENCY ALL
> Consistency level set to ALL.
> cqlsh> SELECT * FROM test.idx_static where id2=22;
> id | added | id2 | source | dest
> -----+---------------------------------+-----+--------+------
> id1 | 2017-01-27 23:00:00.000000+0000 | 22 | src1 | dst1
> id1 | 2017-01-27 23:00:00.000000+0000 | 22 | src1 | dst1
> id1 | 2017-01-27 23:00:00.000000+0000 | 22 | src1 | dst1
> (3 rows)
> {code}
> When RF matches the number of nodes, it works as expected.
> Example with RF=3 and 3 nodes:
> {code}
> romain@debian:~$ ccm create 39 -n 3 -v 3.9 -s
> Current cluster is now: 39
> romain@debian:~$ ccm node1 cqlsh
> Connected to 39 at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.9 | CQL spec 3.4.2 | Native protocol v4]
> Use HELP for help.
> cqlsh> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy',
> 'replication_factor': 3};
> cqlsh> CREATE TABLE test.idx_static (id text, id2 bigint static, added
> timestamp, source text static, dest text, primary key (id, added));
> cqlsh> CREATE index ON test.idx_static (id2);
> cqlsh> INSERT INTO test.idx_static (id, id2, added, source, dest) values
> ('id1', 22,'2017-01-28', 'src1', 'dst1');
> cqlsh> SELECT * FROM test.idx_static where id2=22;
> id | added | id2 | source | dest
> -----+---------------------------------+-----+--------+------
> id1 | 2017-01-27 23:00:00.000000+0000 | 22 | src1 | dst1
> (1 rows)
> cqlsh> CONSISTENCY all
> Consistency level set to ALL.
> cqlsh> SELECT * FROM test.idx_static where id2=22;
> id | added | id2 | source | dest
> -----+---------------------------------+-----+--------+------
> id1 | 2017-01-27 23:00:00.000000+0000 | 22 | src1 | dst1
> (1 rows)
> {code}
> Example with RF = 2 and 2 nodes:
> {code}
> romain@debian:~$ ccm create 39 -n 2 -v 3.9 -s
> Current cluster is now: 39
> romain@debian:~$ ccm node1 cqlsh
> Connected to 39 at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.9 | CQL spec 3.4.2 | Native protocol v4]
> Use HELP for help.
> cqlsh> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy',
> 'replication_factor': 2};
> cqlsh> CREATE TABLE test.idx_static (id text, id2 bigint static, added
> timestamp, source text static, dest text, primary key (id, added));
> cqlsh> INSERT INTO test.idx_static (id, id2, added, source, dest) values
> ('id1', 22,'2017-01-28', 'src1', 'dst1');
> cqlsh> CREATE index ON test.idx_static (id2);
> cqlsh> INSERT INTO test.idx_static (id, id2, added, source, dest) values
> ('id1', 22,'2017-01-28', 'src1', 'dst1');
> cqlsh> SELECT * FROM test.idx_static where id2=22;
> id | added | id2 | source | dest
> -----+---------------------------------+-----+--------+------
> id1 | 2017-01-27 23:00:00.000000+0000 | 22 | src1 | dst1
> (1 rows)
> cqlsh> CONSISTENCY ALL
> Consistency level set to ALL.
> cqlsh> SELECT * FROM test.idx_static where id2=22;
> id | added | id2 | source | dest
> -----+---------------------------------+-----+--------+------
> id1 | 2017-01-27 23:00:00.000000+0000 | 22 | src1 | dst1
> (1 rows)
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)