[ 
https://issues.apache.org/jira/browse/CASSANDRA-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13873177#comment-13873177
 ] 

Sylvain Lebresne commented on CASSANDRA-6586:
---------------------------------------------

bq. You can still have wide rows and you still can avoid reading all the 
(physical thrift) columns of that row.

Yes you can and yes you can still avoid reading all the thrift columns/physical 
cells of that thrift row/partition. I was hoping that the 'this is *less* of a 
problem for "wide rows" usage' of my previous comment was making it clear 
enough that this did not meant that "you can't really do wide rows in CQL" in 
any way.

> Cassandra touches all columns on CQL3 select
> --------------------------------------------
>
>                 Key: CASSANDRA-6586
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6586
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jan Chochol
>            Priority: Minor
>
> It seems that Cassandra is checking (garbage collecting) all columns of all 
> returned rows, despite the fact that not all columns are requested.
> Example:
> * use following script to fill Cassandra with test data:
> {noformat}
> perl -e "print(\"DROP KEYSPACE t;\nCREATE KEYSPACE t WITH replication = 
> {'class': 'SimpleStrategy', 'replication_factor' : 1};\nuse t;\nCREATE TABLE 
> t (a varchar PRIMARY KEY, b varchar, c varchar, d varchar);\nCREATE INDEX t_b 
> ON t (b);\nCREATE INDEX t_c ON t (c);\nCREATE INDEX t_d ON t (d);\n\");\$max 
> = 200; for(\$i = 0; \$i < \$max; \$i++) { \$j = int(\$i * 10 / \$max); \$k = 
> int(\$i * 100 / \$max); print(\"INSERT INTO t (a, b, c, d) VALUES ('a\$i', 
> 'b\$j', 'c\$k', 'd\$i');\n\")}\n" | cqlsh
> {noformat}
> * turn on {{ALL}} logging for Cassandra
> * issue this query:
> {noformat}
> select a from t where c = 'c1';
> {noformat}
> This is result:
> {noformat}
> [root@jch3-devel:~/c4] cqlsh --no-color
> Connected to C4 Cluster Single at localhost:9160.
> [cqlsh 3.1.7 | Cassandra 1.2.11-SNAPSHOT | CQL spec 3.0.0 | Thrift protocol 
> 19.36.1]
> Use HELP for help.
> cqlsh> use t;
> cqlsh:t> select a from t where c = 'c1';
>  a
> ----
>  a3
>  a2
> {noformat}
> From Cassandra log:
> {noformat}
> 2014-01-15 09:14:56.663+0100 [Thrift:1] [TRACE] QueryProcessor.java(125) 
> org.apache.cassandra.cql3.QueryProcessor: component=c4 Process 
> org.apache.cassandra.cql3.statements.SelectStatement@614b3189 @CL.ONE
> 2014-01-15 09:14:56.810+0100 [Thrift:1] [TRACE] ReadCallback.java(67) 
> org.apache.cassandra.service.ReadCallback: component=c4 Blockfor is 1; 
> setting up requests to /127.0.0.1
> 2014-01-15 09:14:56.816+0100 [ReadStage:2] [DEBUG] 
> CompositesSearcher.java(112) 
> org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4 
> Most-selective indexed predicate is 't.c EQ c1'
> 2014-01-15 09:14:56.817+0100 [ReadStage:2] [TRACE] 
> ColumnFamilyStore.java(1493) org.apache.cassandra.db.ColumnFamilyStore: 
> component=c4 Filtering 
> org.apache.cassandra.db.index.composites.CompositesSearcher$1@e15911 for rows 
> matching 
> org.apache.cassandra.db.filter.ExtendedFilter$FilterWithCompositeClauses@4a9e6b8a
> 2014-01-15 09:14:56.817+0100 [ReadStage:2] [TRACE] 
> CompositesSearcher.java(237) 
> org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4 
> Scanning index 't.c EQ c1' starting with 
> 2014-01-15 09:14:56.820+0100 [ReadStage:2] [TRACE] SSTableReader.java(776) 
> org.apache.cassandra.io.sstable.SSTableReader: component=c4 Adding cache 
> entry for KeyCacheKey(/mnt/ebs/cassandra/data/t/t/t-t.t_c-ic-1, 6331) -> 
> org.apache.cassandra.db.RowIndexEntry@66a6574b
> 2014-01-15 09:14:56.821+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 0 of 
> 10000: 6133:false:0@1389773577394000
> 2014-01-15 09:14:56.821+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 
> 10000: 6132:false:0@1389773577391000
> 2014-01-15 09:14:56.822+0100 [ReadStage:2] [TRACE] 
> CompositesSearcher.java(313) 
> org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4 
> Adding index hit to current row for 6133
> 2014-01-15 09:14:56.825+0100 [ReadStage:2] [TRACE] SSTableReader.java(776) 
> org.apache.cassandra.io.sstable.SSTableReader: component=c4 Adding cache 
> entry for KeyCacheKey(/mnt/ebs/cassandra/data/t/t/t-t-ic-1, 6133) -> 
> org.apache.cassandra.db.RowIndexEntry@32ad3193
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 0 of 
> 2147483647: :false:0@1389773577394000
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 
> 2147483647: b:false:2@1389773577394000
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 
> 2147483647: c:false:2@1389773577394000
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 
> 2147483647: d:false:2@1389773577394000
> 2014-01-15 09:14:56.828+0100 [ReadStage:2] [TRACE] 
> CompositesSearcher.java(313) 
> org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4 
> Adding index hit to current row for 6132
> 2014-01-15 09:14:56.828+0100 [ReadStage:2] [TRACE] SSTableReader.java(776) 
> org.apache.cassandra.io.sstable.SSTableReader: component=c4 Adding cache 
> entry for KeyCacheKey(/mnt/ebs/cassandra/data/t/t/t-t-ic-1, 6132) -> 
> org.apache.cassandra.db.RowIndexEntry@87d66d5
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 0 of 
> 2147483647: :false:0@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 
> 2147483647: b:false:2@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 
> 2147483647: c:false:2@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) 
> org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 
> 2147483647: d:false:2@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] 
> CompositesSearcher.java(232) 
> org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4 
> Read only 2 (< 10000) last page through, must be done
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] 
> CompositesSearcher.java(232) 
> org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4 
> Read only 2 (< 10000) last page through, must be done
> 2014-01-15 09:14:56.830+0100 [Thrift:1] [DEBUG] Tracing.java(169) 
> org.apache.cassandra.tracing.Tracing: component=c4 request complete
> {noformat}
> Note that Cassandra is checking all columns ({{a}}, {{b}}, {{c}} and {{d}}), 
> even we requested only column {{a}}.
> Things became really nasty, when using lots of columns, or bigger collections 
> (yes - each member of collection is checked).
> This is quite counter intuitive behaviour, as all Cassandra guides said, that 
> using wide rows should not affect performance, but in CQL3 they create big 
> performance bottleneck.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Reply via email to