[
https://issues.apache.org/jira/browse/CASSANDRA-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sylvain Lebresne updated CASSANDRA-6586:
----------------------------------------
Issue Type: Improvement (was: Bug)
> Cassandra touches all columns on CQL3 select
> --------------------------------------------
>
> Key: CASSANDRA-6586
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6586
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Jan Chochol
>
> It seems that Cassandra is checking (garbage collecting) all columns of all
> returned rows, despite the fact that not all columns are requested.
> Example:
> * use following script to fill Cassandra with test data:
> {noformat}
> perl -e "print(\"DROP KEYSPACE t;\nCREATE KEYSPACE t WITH replication =
> {'class': 'SimpleStrategy', 'replication_factor' : 1};\nuse t;\nCREATE TABLE
> t (a varchar PRIMARY KEY, b varchar, c varchar, d varchar);\nCREATE INDEX t_b
> ON t (b);\nCREATE INDEX t_c ON t (c);\nCREATE INDEX t_d ON t (d);\n\");\$max
> = 200; for(\$i = 0; \$i < \$max; \$i++) { \$j = int(\$i * 10 / \$max); \$k =
> int(\$i * 100 / \$max); print(\"INSERT INTO t (a, b, c, d) VALUES ('a\$i',
> 'b\$j', 'c\$k', 'd\$i');\n\")}\n" | cqlsh
> {noformat}
> * turn on {{ALL}} logging for Cassandra
> * issue this query:
> {noformat}
> select a from t where c = 'c1';
> {noformat}
> This is result:
> {noformat}
> [root@jch3-devel:~/c4] cqlsh --no-color
> Connected to C4 Cluster Single at localhost:9160.
> [cqlsh 3.1.7 | Cassandra 1.2.11-SNAPSHOT | CQL spec 3.0.0 | Thrift protocol
> 19.36.1]
> Use HELP for help.
> cqlsh> use t;
> cqlsh:t> select a from t where c = 'c1';
> a
> ----
> a3
> a2
> {noformat}
> From Cassandra log:
> {noformat}
> 2014-01-15 09:14:56.663+0100 [Thrift:1] [TRACE] QueryProcessor.java(125)
> org.apache.cassandra.cql3.QueryProcessor: component=c4 Process
> org.apache.cassandra.cql3.statements.SelectStatement@614b3189 @CL.ONE
> 2014-01-15 09:14:56.810+0100 [Thrift:1] [TRACE] ReadCallback.java(67)
> org.apache.cassandra.service.ReadCallback: component=c4 Blockfor is 1;
> setting up requests to /127.0.0.1
> 2014-01-15 09:14:56.816+0100 [ReadStage:2] [DEBUG]
> CompositesSearcher.java(112)
> org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4
> Most-selective indexed predicate is 't.c EQ c1'
> 2014-01-15 09:14:56.817+0100 [ReadStage:2] [TRACE]
> ColumnFamilyStore.java(1493) org.apache.cassandra.db.ColumnFamilyStore:
> component=c4 Filtering
> org.apache.cassandra.db.index.composites.CompositesSearcher$1@e15911 for rows
> matching
> org.apache.cassandra.db.filter.ExtendedFilter$FilterWithCompositeClauses@4a9e6b8a
> 2014-01-15 09:14:56.817+0100 [ReadStage:2] [TRACE]
> CompositesSearcher.java(237)
> org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4
> Scanning index 't.c EQ c1' starting with
> 2014-01-15 09:14:56.820+0100 [ReadStage:2] [TRACE] SSTableReader.java(776)
> org.apache.cassandra.io.sstable.SSTableReader: component=c4 Adding cache
> entry for KeyCacheKey(/mnt/ebs/cassandra/data/t/t/t-t.t_c-ic-1, 6331) ->
> org.apache.cassandra.db.RowIndexEntry@66a6574b
> 2014-01-15 09:14:56.821+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164)
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 0 of
> 10000: 6133:false:0@1389773577394000
> 2014-01-15 09:14:56.821+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164)
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of
> 10000: 6132:false:0@1389773577391000
> 2014-01-15 09:14:56.822+0100 [ReadStage:2] [TRACE]
> CompositesSearcher.java(313)
> org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4
> Adding index hit to current row for 6133
> 2014-01-15 09:14:56.825+0100 [ReadStage:2] [TRACE] SSTableReader.java(776)
> org.apache.cassandra.io.sstable.SSTableReader: component=c4 Adding cache
> entry for KeyCacheKey(/mnt/ebs/cassandra/data/t/t/t-t-ic-1, 6133) ->
> org.apache.cassandra.db.RowIndexEntry@32ad3193
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164)
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 0 of
> 2147483647: :false:0@1389773577394000
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164)
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of
> 2147483647: b:false:2@1389773577394000
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164)
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of
> 2147483647: c:false:2@1389773577394000
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164)
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of
> 2147483647: d:false:2@1389773577394000
> 2014-01-15 09:14:56.828+0100 [ReadStage:2] [TRACE]
> CompositesSearcher.java(313)
> org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4
> Adding index hit to current row for 6132
> 2014-01-15 09:14:56.828+0100 [ReadStage:2] [TRACE] SSTableReader.java(776)
> org.apache.cassandra.io.sstable.SSTableReader: component=c4 Adding cache
> entry for KeyCacheKey(/mnt/ebs/cassandra/data/t/t/t-t-ic-1, 6132) ->
> org.apache.cassandra.db.RowIndexEntry@87d66d5
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164)
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 0 of
> 2147483647: :false:0@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164)
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of
> 2147483647: b:false:2@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164)
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of
> 2147483647: c:false:2@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164)
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of
> 2147483647: d:false:2@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE]
> CompositesSearcher.java(232)
> org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4
> Read only 2 (< 10000) last page through, must be done
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE]
> CompositesSearcher.java(232)
> org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4
> Read only 2 (< 10000) last page through, must be done
> 2014-01-15 09:14:56.830+0100 [Thrift:1] [DEBUG] Tracing.java(169)
> org.apache.cassandra.tracing.Tracing: component=c4 request complete
> {noformat}
> Note that Cassandra is checking all columns ({{a}}, {{b}}, {{c}} and {{d}}),
> even we requested only column {{a}}.
> Things became really nasty, when using lots of columns, or bigger collections
> (yes - each member of collection is checked).
> This is quite counter intuitive behaviour, as all Cassandra guides said, that
> using wide rows should not affect performance, but in CQL3 they create big
> performance bottleneck.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)