[jira] [Commented] (CASSANDRA-6377) ALLOW FILTERING should allow seq scan filtering

Srini (JIRA) Fri, 19 Jun 2015 18:04:29 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594225#comment-14594225
 ]


Srini commented on CASSANDRA-6377:
----------------------------------

Just to be clear what I'm referring to, let me give an example.

Primary Key (Key1, Key2, Key3, Key4, Key5)

where Key1 is the partition key.

Assume this is what we want to do:

Select * from merchant_data where Key1 = 'abc' and Key2 = 'xxxx'
  and Key4 = 'yyyy' ALLOW FILTERING;

This is what the current version of Cassandra allows:
Select * from merchant_data where Key1 = 'abc' and Key2 = 'xxxx';

The difference between the two is that with the second query the application 
has to filter on Key4 in its own logic, whereas Cassandra (had it allowed the 
first query) could have done that filtering itself.
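
To make the client-side work concrete, here is a minimal Python sketch of the 
filtering the application has to do today. The rows, their values, and the 
helper name filter_on_key4 are made up for illustration; they stand in for 
whatever the second query would return and are not taken from any driver API.

```python
# Hypothetical result of the second query (Key1 = 'abc' and Key2 = 'xxxx').
# Every row here has already crossed the network to the client.
rows = [
    {"key1": "abc", "key2": "xxxx", "key3": "a", "key4": "yyyy", "key5": "1"},
    {"key1": "abc", "key2": "xxxx", "key3": "b", "key4": "zzzz", "key5": "2"},
    {"key1": "abc", "key2": "xxxx", "key3": "c", "key4": "yyyy", "key5": "3"},
]


def filter_on_key4(rows, wanted):
    """Client-side filter: keep only the rows whose key4 equals `wanted`."""
    return [row for row in rows if row["key4"] == wanted]


matching = filter_on_key4(rows, "yyyy")
print(len(rows), "rows transferred,", len(matching), "rows kept")
```

With the first query, only the rows kept here would need to leave the 
Cassandra node.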

The performance difference would be huge, because filtering on the server 
avoids sending rows that the client will only discard, which cuts network 
load and latency between the Cassandra node and the client. Reducing the use 
of secondary indexes and relying on the core strengths of Cassandra would be 
extremely beneficial for Cassandra's adaptability across many use cases.

I do see how this can be abused if the partition contains thousands of rows, 
but by requiring the ALLOW FILTERING clause, the burden would be on the 
client, who has to make a conscious decision.


> ALLOW FILTERING should allow seq scan filtering
> -----------------------------------------------
>
>                 Key: CASSANDRA-6377
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6377
>             Project: Cassandra
>          Issue Type: Bug
>          Components: API
>            Reporter: Jonathan Ellis
>            Assignee: Sylvain Lebresne
>              Labels: cql
>             Fix For: 3.x
>
>
> CREATE TABLE emp_table2 (
>         empID int PRIMARY KEY,
>         firstname text,
>         lastname text,
>         b_mon text,
>         b_day text,
>         b_yr text,
> );
> INSERT INTO emp_table2 (empID,firstname,lastname,b_mon,b_day,b_yr) 
>    VALUES (100,'jane','doe','oct','31','1980');
> INSERT INTO emp_table2 (empID,firstname,lastname,b_mon,b_day,b_yr) 
>    VALUES (101,'john','smith','jan','01','1981');
> INSERT INTO emp_table2 (empID,firstname,lastname,b_mon,b_day,b_yr) 
>    VALUES (102,'mary','jones','apr','15','1982');
> INSERT INTO emp_table2 (empID,firstname,lastname,b_mon,b_day,b_yr) 
>    VALUES (103,'tim','best','oct','25','1982');
>    
> SELECT b_mon,b_day,b_yr,firstname,lastname FROM emp_table2 
>     WHERE b_mon='oct' ALLOW FILTERING;
> Bad Request: No indexed columns present in by-columns clause with Equal 
> operator



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
