[ 
https://issues.apache.org/jira/browse/CASSANDRA-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-6586:
----------------------------------------

    Issue Type: Improvement  (was: Bug)

> Cassandra touches all columns on CQL3 select
> --------------------------------------------
>
>                 Key: CASSANDRA-6586
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6586
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jan Chochol
>
> It seems that Cassandra is checking (garbage collecting) all columns of all 
> returned rows, despite the fact that not all columns are requested.
> Example:
> * Use the following script to fill Cassandra with test data:
> {noformat}
> perl -e "print(\"DROP KEYSPACE t;\nCREATE KEYSPACE t WITH replication = 
> {'class': 'SimpleStrategy', 'replication_factor' : 1};\nuse t;\nCREATE TABLE 
> t (a varchar PRIMARY KEY, b varchar, c varchar, d varchar);\nCREATE INDEX t_b 
> ON t (b);\nCREATE INDEX t_c ON t (c);\nCREATE INDEX t_d ON t (d);\n\");\$max 
> = 200; for(\$i = 0; \$i < \$max; \$i++) { \$j = int(\$i * 10 / \$max); \$k = 
> int(\$i * 100 / \$max); print(\"INSERT INTO t (a, b, c, d) VALUES ('a\$i', 
> 'b\$j', 'c\$k', 'd\$i');\n\")}\n" | cqlsh
> {noformat}
> * Turn on {{ALL}}-level logging for Cassandra
> * Issue this query:
> {noformat}
> select a from t where c = 'c1';
> {noformat}
> This is the result:
> {noformat}
> [root@jch3-devel:~/c4] cqlsh --no-color
> Connected to C4 Cluster Single at localhost:9160.
> [cqlsh 3.1.7 | Cassandra 1.2.11-SNAPSHOT | CQL spec 3.0.0 | Thrift protocol 
> 19.36.1]
> Use HELP for help.
> cqlsh> use t;
> cqlsh:t> select a from t where c = 'c1';
>  a
> ----
>  a3
>  a2
> {noformat}
> From Cassandra log:
> {noformat}
> 2014-01-15 09:14:56.663+0100 [Thrift:1] [TRACE] QueryProcessor.java(125) 
> org.apache.cassandra.cql3.QueryProcessor: component=c4 Process 
> org.apache.cassandra.cql3.statements.SelectStatement@614b3189 @CL.ONE
> 2014-01-15 09:14:56.810+0100 [Thrift:1] [TRACE] ReadCallback.java(67) 
> org.apache.cassandra.service.ReadCallback: component=c4 Blockfor is 1; 
> setting up requests to /127.0.0.1
> 2014-01-15 09:14:56.816+0100 [ReadStage:2] [DEBUG] 
> CompositesSearcher.java(112) 
> org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4 
> Most-selective indexed predicate is 't.c EQ c1'
> 2014-01-15 09:14:56.817+0100 [ReadStage:2] [TRACE] 
> ColumnFamilyStore.java(1493) org.apache.cassandra.db.ColumnFamilyStore: 
> component=c4 Filtering 
> org.apache.cassandra.db.index.composites.CompositesSearcher$1@e15911 for rows 
> matching 
> org.apache.cassandra.db.filter.ExtendedFilter$FilterWithCompositeClauses@4a9e6b8a
> 2014-01-15 09:14:56.817+0100 [ReadStage:2] [TRACE] 
> CompositesSearcher.java(237) 
> org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4 
> Scanning index 't.c EQ c1' starting with 
> 2014-01-15 09:14:56.820+0100 [ReadStage:2] [TRACE] SSTableReader.java(776) 
> org.apache.cassandra.io.sstable.SSTableReader: component=c4 Adding cache 
> entry for KeyCacheKey(/mnt/ebs/cassandra/data/t/t/t-t.t_c-ic-1, 6331) -> 
> org.apache.cassandra.db.RowIndexEntry@66a6574b
> 2014-01-15 09:14:56.821+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 0 of 
> 10000: 6133:false:0@1389773577394000
> 2014-01-15 09:14:56.821+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 
> 10000: 6132:false:0@1389773577391000
> 2014-01-15 09:14:56.822+0100 [ReadStage:2] [TRACE] 
> CompositesSearcher.java(313) 
> org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4 
> Adding index hit to current row for 6133
> 2014-01-15 09:14:56.825+0100 [ReadStage:2] [TRACE] SSTableReader.java(776) 
> org.apache.cassandra.io.sstable.SSTableReader: component=c4 Adding cache 
> entry for KeyCacheKey(/mnt/ebs/cassandra/data/t/t/t-t-ic-1, 6133) -> 
> org.apache.cassandra.db.RowIndexEntry@32ad3193
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 0 of 
> 2147483647: :false:0@1389773577394000
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 
> 2147483647: b:false:2@1389773577394000
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 
> 2147483647: c:false:2@1389773577394000
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 
> 2147483647: d:false:2@1389773577394000
> 2014-01-15 09:14:56.828+0100 [ReadStage:2] [TRACE] 
> CompositesSearcher.java(313) 
> org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4 
> Adding index hit to current row for 6132
> 2014-01-15 09:14:56.828+0100 [ReadStage:2] [TRACE] SSTableReader.java(776) 
> org.apache.cassandra.io.sstable.SSTableReader: component=c4 Adding cache 
> entry for KeyCacheKey(/mnt/ebs/cassandra/data/t/t/t-t-ic-1, 6132) -> 
> org.apache.cassandra.db.RowIndexEntry@87d66d5
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 0 of 
> 2147483647: :false:0@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 
> 2147483647: b:false:2@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 
> 2147483647: c:false:2@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 
> 2147483647: d:false:2@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] 
> CompositesSearcher.java(232) 
> org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4 
> Read only 2 (< 10000) last page through, must be done
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] 
> CompositesSearcher.java(232) 
> org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4 
> Read only 2 (< 10000) last page through, must be done
> 2014-01-15 09:14:56.830+0100 [Thrift:1] [DEBUG] Tracing.java(169) 
> org.apache.cassandra.tracing.Tracing: component=c4 request complete
> {noformat}
> Note that Cassandra is checking all columns ({{a}}, {{b}}, {{c}} and {{d}}), 
> even though we requested only column {{a}}.
> Things become really nasty when a table has many columns or large collections 
> (yes, each member of a collection is checked).
> This is quite counterintuitive behaviour: all Cassandra guides say that 
> using wide rows should not affect performance, yet in CQL3 they create a big 
> performance bottleneck.
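
The perl one-liner in the description is hard to read once the mail client has wrapped it. As a reading aid, here is a minimal Python sketch that should produce the same CQL statements. The function name build_statements and its max_rows parameter are made up for this illustration and are not part of the original report.

```python
def build_statements(max_rows=200):
    """Return the CQL statements that the perl one-liner pipes into cqlsh."""
    statements = [
        "DROP KEYSPACE t;",
        "CREATE KEYSPACE t WITH replication = "
        "{'class': 'SimpleStrategy', 'replication_factor' : 1};",
        "use t;",
        "CREATE TABLE t (a varchar PRIMARY KEY, b varchar, c varchar, d varchar);",
        "CREATE INDEX t_b ON t (b);",
        "CREATE INDEX t_c ON t (c);",
        "CREATE INDEX t_d ON t (d);",
    ]
    for i in range(max_rows):
        j = i * 10 // max_rows   # same as perl's int($i * 10 / $max)
        k = i * 100 // max_rows  # same as perl's int($i * 100 / $max)
        statements.append(
            f"INSERT INTO t (a, b, c, d) VALUES ('a{i}', 'b{j}', 'c{k}', 'd{i}');"
        )
    return statements


if __name__ == "__main__":
    # Print one statement per line so the output can be piped into cqlsh.
    print("\n".join(build_statements()))
```

With the default of 200 rows, only rows a2 and a3 receive c = 'c1', which agrees with the two rows returned by the query in the description.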

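To check the claim against a longer trace than the one quoted, the "collecting" lines could be tallied per column name. The sketch below assumes each such line has the shape "collecting N of LIMIT: name:flag:length@timestamp", which is how the sample lines read; the helper name collected_columns is hypothetical, and the default limit of 2147483647 is simply the value that appears on the base-table reads in this trace.

```python
import re
from collections import Counter

# Matches e.g. "collecting 1 of 2147483647: b:false:2@1389773577394000".
# Whitespace is matched loosely so that wrapped log text still parses.
COLLECT_RE = re.compile(
    r"collecting\s+\d+\s+of\s+(\d+):\s+([^:\s]*):(?:true|false):\d+@\d+"
)


def collected_columns(log_text, limit=2147483647):
    """Count collected cells per column name for reads using the given limit."""
    counts = Counter()
    for found_limit, name in COLLECT_RE.findall(log_text):
        if int(found_limit) == limit:
            # Some cells in the trace have an empty name; label them explicitly.
            counts[name or "<empty>"] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "collecting 0 of 10000: 6133:false:0@1389773577394000\n"
        "collecting 0 of 2147483647: :false:0@1389773577394000\n"
        "collecting 1 of 2147483647: b:false:2@1389773577394000\n"
        "collecting 1 of 2147483647: c:false:2@1389773577394000\n"
        "collecting 1 of 2147483647: d:false:2@1389773577394000\n"
    )
    for name, count in sorted(collected_columns(sample).items()):
        print(name, count)
```

Filtering on the limit separates the index scan (limit 10000 in this trace) from the reads of the base table, so the tally covers only the cells touched while fetching the matched rows.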


--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
