[
https://issues.apache.org/jira/browse/CASSANDRA-7085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093456#comment-14093456
]
sankalp kohli commented on CASSANDRA-7085:
------------------------------------------
Using SliceQueryFilter for all CQL queries is very bad for performance. For
large CQL partitions, it has to touch every sstable for which the bloom filter
says yes. Response times will therefore grow with the amount of data or the
number of levels. Here is a simple example that shows how bad it is:
CREATE TABLE test (
    A int,
    B int,
    C int,
    PRIMARY KEY (A, B)
);
INSERT INTO test (A, B, C) VALUES (1, 2, 3);
nodetool flush
INSERT INTO test (A, B, C) VALUES (1, 2, 4);
SELECT C FROM test WHERE A = 1 AND B = 2;
Instead of being served entirely from the memtable, this query actually
touched the sstable.
We verified this through tracing and by debugging the code.
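
To make the difference concrete, here is a toy Python model of the two read strategies. It is only a sketch under stated assumptions and is not Cassandra code: `Source`, `slice_read` and `names_read` are hypothetical names, the partition check stands in for the bloom filter, and the early exit in `names_read` is a simplified version of stopping once every requested column has been found with a timestamp newer than anything the remaining sources can hold.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """Toy memtable or sstable: (partition, clustering, column) -> (value, timestamp)."""
    name: str
    cells: dict = field(default_factory=dict)

    @property
    def max_timestamp(self):
        return max((ts for _, ts in self.cells.values()), default=-1)

    def may_contain(self, partition):
        # Stand-in for the bloom filter check (assumption: no false positives).
        return any(key[0] == partition for key in self.cells)


def slice_read(sources, partition, clustering):
    """Slice-style read: merge every source that may hold the partition."""
    touched, result = [], {}
    for src in sources:
        if not src.may_contain(partition):
            continue
        touched.append(src.name)
        for (p, c, col), (val, ts) in src.cells.items():
            if p == partition and c == clustering:
                if col not in result or ts > result[col][1]:
                    result[col] = (val, ts)
    return result, touched


def names_read(sources, partition, clustering, columns):
    """Names-style read: newest source first, stop once nothing older can win."""
    touched, result = [], {}
    for src in sorted(sources, key=lambda s: s.max_timestamp, reverse=True):
        found_all = len(result) == len(columns)
        if found_all and all(ts > src.max_timestamp for _, ts in result.values()):
            break
        if not src.may_contain(partition):
            continue
        touched.append(src.name)
        for col in columns:
            cell = src.cells.get((partition, clustering, col))
            if cell is not None and (col not in result or cell[1] > result[col][1]):
                result[col] = cell
    return result, touched


# Mirrors the example above: the first insert is flushed, the second is not.
sstable = Source("sstable", {(1, 2, "C"): (3, 1)})
memtable = Source("memtable", {(1, 2, "C"): (4, 2)})

print(slice_read([memtable, sstable], 1, 2)[1])         # ['memtable', 'sstable']
print(names_read([memtable, sstable], 1, 2, ["C"])[1])  # ['memtable']
```

In this model both reads return C = 4, but only the names-style read is answered from the memtable alone.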
> Specialized query filters for CQL3
> ----------------------------------
>
> Key: CASSANDRA-7085
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7085
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Labels: cql, perfomance
> Fix For: 3.0
>
>
> The semantics of CQL make it so that the current {{NamesQueryFilter}} and
> {{SliceQueryFilter}} are not always as efficient as they could be. Namely,
> when a {{SELECT}} only selects a handful of columns, we still have to query
> all the columns of the selected rows to distinguish between 'live row but
> with no data for the queried columns' and 'no row' (see CASSANDRA-6588 for
> more details).
> We can solve that however by adding new filters (name and slice) specialized
> for CQL. The new name filter would be a list of row prefix + a list of CQL
> column names (instead of one list of cell names). The slice filter would
> still take a ColumnSlice[] but would add the list of column names we care
> about for each row.
> The new sstable readers that go with those filters would use the list of
> column names to filter out all the cells we don't care about, so we don't
> have to ship those back to the coordinator to skip them there, yet would know
> to still return the row marker when necessary.
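
The per-row filtering described in the issue could look roughly like the Python sketch below. It is illustrative only: `ROW_MARKER` and `filter_row` are hypothetical names, a row is modeled as a plain dict from CQL column name to value, and returning the row marker only when none of the requested columns has data is one possible reading of "when necessary".

```python
ROW_MARKER = ""  # toy stand-in for the CQL row marker cell (assumption)


def filter_row(cells, wanted):
    """Keep only the requested columns of one CQL row.

    cells:  dict mapping column name -> value, possibly including ROW_MARKER.
    wanted: set of CQL column names the SELECT asked for.
    """
    kept = {col: val for col, val in cells.items() if col in wanted}
    # If no requested column has data but the row is live, keep the marker so
    # the caller can still tell 'live row, no data' apart from 'no row'.
    if not kept and ROW_MARKER in cells:
        kept[ROW_MARKER] = cells[ROW_MARKER]
    return kept


print(filter_row({ROW_MARKER: b"", "c": 4, "d": 7}, {"c"}))  # {'c': 4}
print(filter_row({ROW_MARKER: b"", "d": 7}, {"c"}))          # {'': b''}
print(filter_row({}, {"c"}))                                 # {}
```

With this shape, cells for unrequested columns are dropped at the replica and never shipped to the coordinator, while a live row with no data in the requested columns still comes back as a marker-only result.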