DOAN DuyHai created CASSANDRA-8180:
--------------------------------------
Summary: Optimize disk seek using min/max colunm name meta data
when the LIMIT clause is used
Key: CASSANDRA-8180
URL: https://issues.apache.org/jira/browse/CASSANDRA-8180
Project: Cassandra
Issue Type: Improvement
Components: Core
Environment: Cassandra 2.0.10
Reporter: DOAN DuyHai
Priority: Minor
I was working on an example of sensor data table (timeseries) and face a use
case where C* does not optimize read on disk.
{code}
cqlsh:test> CREATE TABLE test(id int, col int, val text, PRIMARY KEY(id,col))
WITH CLUSTERING ORDER BY (col DESC);
cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 10, '10');
...
>nodetool flush test test
...
cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 20, '20');
...
>nodetool flush test test
...
cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 30, '30');
...
>nodetool flush test test
{code}
After that, I activate request tracing:
{code}
cqlsh:test> SELECT * FROM test WHERE id=1 LIMIT 1;
activity |
timestamp | source | source_elapsed
---------------------------------------------------------------------------+--------------+-----------+----------------
execute_cql3_query |
23:48:46,498 | 127.0.0.1 | 0
Parsing SELECT * FROM test WHERE id=1 LIMIT 1; |
23:48:46,498 | 127.0.0.1 | 74
Preparing statement |
23:48:46,499 | 127.0.0.1 | 253
Executing single-partition query on test |
23:48:46,499 | 127.0.0.1 | 930
Acquiring sstable references |
23:48:46,499 | 127.0.0.1 | 943
Merging memtable tombstones |
23:48:46,499 | 127.0.0.1 | 1032
Key cache hit for sstable 3 |
23:48:46,500 | 127.0.0.1 | 1160
Seeking to partition beginning in data file |
23:48:46,500 | 127.0.0.1 | 1173
Key cache hit for sstable 2 |
23:48:46,500 | 127.0.0.1 | 1889
Seeking to partition beginning in data file |
23:48:46,500 | 127.0.0.1 | 1901
Key cache hit for sstable 1 |
23:48:46,501 | 127.0.0.1 | 2373
Seeking to partition beginning in data file |
23:48:46,501 | 127.0.0.1 | 2384
Skipped 0/3 non-slice-intersecting sstables, included 0 due to tombstones |
23:48:46,501 | 127.0.0.1 | 2768
Merging data from memtables and 3 sstables |
23:48:46,501 | 127.0.0.1 | 2784
Read 2 live and 0 tombstoned cells |
23:48:46,501 | 127.0.0.1 | 2976
Request complete |
23:48:46,501 | 127.0.0.1 | 3551
{code}
We can clearly see that C* hits 3 SSTables on disk instead of just one,
although it has the min/max column meta data to decide which SSTable contains
the most recent data.
Funny enough, if we add a clause on the clustering column to the select, this
time C* optimizes the read path:
{code}
cqlsh:test> SELECT * FROM test WHERE id=1 AND col > 25 LIMIT 1;
activity |
timestamp | source | source_elapsed
---------------------------------------------------------------------------+--------------+-----------+----------------
execute_cql3_query |
23:52:31,888 | 127.0.0.1 | 0
Parsing SELECT * FROM test WHERE id=1 AND col > 25 LIMIT 1; |
23:52:31,888 | 127.0.0.1 | 60
Preparing statement |
23:52:31,888 | 127.0.0.1 | 277
Executing single-partition query on test |
23:52:31,889 | 127.0.0.1 | 961
Acquiring sstable references |
23:52:31,889 | 127.0.0.1 | 971
Merging memtable tombstones |
23:52:31,889 | 127.0.0.1 | 1020
Key cache hit for sstable 3 |
23:52:31,889 | 127.0.0.1 | 1108
Seeking to partition beginning in data file |
23:52:31,889 | 127.0.0.1 | 1117
Skipped 2/3 non-slice-intersecting sstables, included 0 due to tombstones |
23:52:31,889 | 127.0.0.1 | 1611
Merging data from memtables and 1 sstables |
23:52:31,890 | 127.0.0.1 | 1624
Read 1 live and 0 tombstoned cells |
23:52:31,890 | 127.0.0.1 | 1700
Request complete |
23:52:31,890 | 127.0.0.1 | 2140
{code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)