[ 
https://issues.apache.org/jira/browse/CASSANDRA-14242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16417254#comment-16417254
 ] 

Andrés de la Peña commented on CASSANDRA-14242:
-----------------------------------------------

This patch should solve the problem:
||[3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...adelapena:14242-3.0]||[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...adelapena:14242-3.11]||[trunk|https://github.com/apache/cassandra/compare/trunk...adelapena:14242-trunk]||
It also fixes an infinite loop that occurs when querying a secondary index (2i)
with a page size of one:
{code}
CREATE TABLE t(k int PRIMARY KEY, v int);
CREATE INDEX on t(v);
INSERT INTO t (k, v) VALUES (1, 10);
PAGING 1
SELECT * FROM t WHERE v = 10 AND k = 1; -- Infinite loop!
{code}
For 3.11 and trunk, where it's possible to create 2i on static columns since 
CASSANDRA-8103, the patch allows selecting only static columns, for example:
{code}
CREATE TABLE t(k int, c int, s int STATIC, PRIMARY KEY(k, c));
CREATE INDEX on t(s);

INSERT INTO t (k, s) VALUES (1, 100);
INSERT INTO t (k, c) VALUES (1, 10);
INSERT INTO t (k, c) VALUES (1, 20);

SELECT DISTINCT s FROM t WHERE s = 100 AND k = 1; -- Previously it would have
-- thrown a "Queries using 2ndary indexes don't support selecting only static
-- columns" error
{code}

[~blerer]/[~ifesdjeen] could you please review?

> Indexed static column returns inconsistent results
> --------------------------------------------------
>
>                 Key: CASSANDRA-14242
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14242
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Cassandra 3.11.2
> Java driver 3.4.0
> Ubuntu - 4.4.0-112-generic
>            Reporter: Ross Black
>            Assignee: Andrés de la Peña
>            Priority: Major
>
> I am using Cassandra 3.11.2 and the Java driver 3.4.0.
> I have a table that has a static column, where the static column has a 
> secondary index.
> When querying the table I get incomplete or duplicated results, depending on 
> the fetch size.
> e.g.
> {code:java}
> CREATE KEYSPACE hack WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> CREATE TABLE hack.stuff (id int, kind text, chunk int static, val1 int, 
> PRIMARY KEY (id, kind));
> CREATE INDEX stuff_chunk_index ON hack.stuff (chunk);{code}
> -- repeat with thousands of values for id =>
> {code:java}
>   INSERT INTO hack.stuff (id, chunk, kind, val1 ) VALUES (${id}, 777, 'A', 
> 123);{code}
> Querying from Java:
> {code:java}
>     final SimpleStatement statement = new SimpleStatement(
>         "SELECT id, kind, val1 FROM hack.stuff WHERE chunk = " + chunk);
>     statement.setFetchSize(fetchSize);
>     statement.setConsistencyLevel(ConsistencyLevel.ALL);
>     final ResultSet resultSet = connection.getSession().execute(statement);
>     for (Row row : resultSet) {
>         final int id = row.getInt("id");
>     }{code}
> *The number of results returned depends on the fetch-size.*
> e.g. For 30k values inserted, I get the following:
> ||fetch-size||result-size||
> |40000|30000|
> |20000|30001|
> |5000|30006|
> |100|30303|
> In production, I have a much larger table where the correct result size for a 
> specific chunk is 20019, but some fetch sizes will return _significantly 
> fewer_ results.
> ||fetch-size||result-size|| ||
> |25000|20019| |
> |5000|9999|*<== this one has far fewer results*|
> |5001|20026| |
> (so far I have been unable to reproduce this with the simpler test table)
> Thanks,
> Ross
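
The result sizes in the report do not by themselves show whether a given fetch size duplicated rows, dropped rows, or both. A minimal sketch of how that could be checked is below. It is not part of the report or of the patch: `PagedResultCheck`, `Summary`, `summarize` and `missing` are hypothetical names, and the sketch assumes each returned row has been reduced to a string built from its primary key (for example `id + ":" + kind`).

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the Java driver or of the patch.
final class PagedResultCheck {

    // Counts for one query run at a given fetch size.
    static final class Summary {
        final int total;      // rows returned by the query
        final int distinct;   // distinct primary keys among those rows
        final int duplicates; // rows whose primary key had already been seen

        Summary(int total, int distinct) {
            this.total = total;
            this.distinct = distinct;
            this.duplicates = total - distinct;
        }
    }

    // primaryKeys holds one entry per returned row (assumed format: id + ":" + kind).
    static Summary summarize(List<String> primaryKeys) {
        Set<String> seen = new HashSet<>(primaryKeys);
        return new Summary(primaryKeys.size(), seen.size());
    }

    // Rows missing relative to the number of rows known to match the query.
    static int missing(Summary summary, int expectedDistinct) {
        return expectedDistinct - summary.distinct;
    }
}
```

In the loop from the report, the list could be filled with `row.getInt("id") + ":" + row.getString("kind")` for each row, and the summary compared across fetch sizes.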



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
