[ https://issues.apache.org/jira/browse/CASSANDRA-16737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ekaterina Dimitrova updated CASSANDRA-16737:
--------------------------------------------
    Description: 
With the following SSTables:
{code:java}
CREATE TABLE my_table (pk int, ck int, v1 int, PRIMARY KEY(pk, ck));

INSERT INTO my_table (pk, ck, v1) VALUES (1, 1, 1) USING TIMESTAMP 1000;
--> flush()
INSERT INTO my_table (pk, ck, v1) VALUES (1, 1, 2) USING TIMESTAMP 2000;
--> flush()
INSERT INTO my_table (pk, ck, v1) VALUES (1, 1, 3) USING TIMESTAMP 3000;
--> flush()
{code}
the following query:
{code:java}
SELECT pk, ck, v1 FROM my_table WHERE pk = 1 AND ck = 1{code}
will only read the third SSTable.

If we add a column to the table (e.g. {{ALTER TABLE my_table ADD v2 int}}) and 
rerun the query, the query will read all 3 SSTables.

This happens because C* tries to read all the {{fetched}} columns to ensure 
that it returns a row if at least one of its columns is non-null.

In practice, for CQL tables, C* does not need to fetch all columns if the row 
contains primary key liveness information, as that is enough to guarantee that 
the row exists. Consequently, even after the addition of the new column, C* 
should read only the third SSTable.
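
The behavior described above can be illustrated with a small toy model. This 
is only a sketch under simplifying assumptions, not Cassandra's actual read 
path: the {{SSTable}} class, the {{sstables_read}} function and the 
{{use_pk_liveness}} flag are hypothetical names introduced for illustration. 
It assumes that SSTables are visited from newest to oldest and that the read 
stops once every needed column has a value newer than anything the next 
SSTable can contain.

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    """Toy stand-in for an SSTable that holds a single row (hypothetical)."""
    liveness_ts: int  # primary key liveness timestamp written by the INSERT
    cells: dict       # column name -> (value, write timestamp)

    @property
    def max_timestamp(self):
        return max([self.liveness_ts, *(ts for _, ts in self.cells.values())])


def sstables_read(sstables, queried, fetched, use_pk_liveness):
    """Count the SSTables a single-row read touches in this toy model."""
    newest_cell_ts = {}              # column -> newest cell timestamp seen
    newest_liveness = float("-inf")  # newest primary key liveness seen
    read = 0
    # Visit SSTables from the newest to the oldest.
    for sstable in sorted(sstables, key=lambda s: s.max_timestamp, reverse=True):
        # If primary key liveness newer than this SSTable was already seen,
        # the row is known to exist, so only the queried columns matter.
        has_liveness = newest_liveness > sstable.max_timestamp
        needed = queried if (use_pk_liveness and has_liveness) else fetched
        # Stop when every needed column already has a newer value.
        if all(newest_cell_ts.get(c, float("-inf")) > sstable.max_timestamp
               for c in needed):
            break
        read += 1
        newest_liveness = max(newest_liveness, sstable.liveness_ts)
        for column, (_, ts) in sstable.cells.items():
            newest_cell_ts[column] = max(newest_cell_ts.get(column, ts), ts)
    return read


# The three flushed SSTables from the example above.
tables = [
    SSTable(1000, {"v1": (1, 1000)}),
    SSTable(2000, {"v1": (2, 2000)}),
    SSTable(3000, {"v1": (3, 3000)}),
]

print(sstables_read(tables, {"v1"}, {"v1"}, False))        # before ALTER: 1
print(sstables_read(tables, {"v1"}, {"v1", "v2"}, False))  # after ALTER: 3
print(sstables_read(tables, {"v1"}, {"v1", "v2"}, True))   # proposed: 1
```

In this model the new column {{v2}} never has a value, so a read that waits 
for all fetched columns never stops early. Relying on the primary key liveness 
restores the single-SSTable read.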

  was:
With the following SSTables:

{code}
CREATE TABLE my_table (pk int, ck int, v1 int, PRIMARY KEY(pk, ck))

INSERT INTO my_table (pk, ck, v1) VALUES (1, 1, 1) USING TIMESTAMP 1000;
--> flush
INSERT INTO my_table (pk, ck, v1) VALUES (1, 1, 2) USING TIMESTAMP 2000;
--> flush()
INSERT INTO my_table  (pk, ck, v1) VALUES (1, 1, 3) USING TIMESTAMP 3000
--> flush()
{code}

the following query:
{code}SELECT pk, ck, v1 FROM my_table WHERE pk = 1 AND ck = 1{code}
will only read the third SSTable.

If we add a column to the table  (e.g. {{ALTER TABLE my_table ADD v2 int}}) and 
rerun the query, the query will read the 3 SSTables.

The reason for this behavior is due to the fact that C* is trying to read all 
the {{fetched}} columns to ensure that it will return a row if at least one of 
its column is non null.

In practice for CQL tables, C* does not need to fetch all columns if the row 
contains a primary key liveness as it is enough to guaranty that the row 
exists. By consequence, even after the addition of the new column C* should 
read only the third SSTable.


> ALTER ... ADD can increase the number of SSTables being read
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-16737
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16737
>             Project: Cassandra
>          Issue Type: Bug
>          Components: CQL/Semantics
>            Reporter: Benjamin Lerer
>            Assignee: Benjamin Lerer
>            Priority: Normal
>             Fix For: 3.11.x, 4.0.x
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>


