Jonathan Halliday created CASSANDRA-7016:
--------------------------------------------

             Summary: can't map/reduce over subset of rows with cql
                 Key: CASSANDRA-7016
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7016
             Project: Cassandra
          Issue Type: Bug
          Components: Core, Hadoop
            Reporter: Jonathan Halliday


select ... where token(k) < x and token(k) >= y and k in (a,b) allow filtering;

On 2.0.6 this query is rejected: k can't be restricted by more than one relation.

In the context of map/reduce (hence the token range) I want to map over only a 
subset of the keys (hence the 'in').  Pushing the 'in' filter down to cql is 
substantially cheaper than pulling all rows to the client and then discarding 
most of them.

Currently this is possible only if the Hadoop integration code is altered to 
apply the AND on the client side (keeping only those keys of the 'in' set whose 
token falls in the range) and then issue CQL that contains only the resulting 
filtered 'in' set.  The problem is not Hadoop-specific though, so IMO it should 
really be solved in CQL, not in the Hadoop integration code.
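
For illustration, the client-side workaround could look roughly like the 
sketch below. It is not the actual Hadoop integration code. The names 
keys_in_token_range, build_in_query and token_of are hypothetical; token_of 
stands in for whatever hash the cluster's partitioner applies to a key, and the 
sketch assumes a token range that does not wrap around the ring.

```python
from typing import Callable, Hashable, List, Optional, Sequence


def keys_in_token_range(
    keys: Sequence[Hashable],
    token_of: Callable[[Hashable], int],
    lower_inclusive: int,
    upper_exclusive: int,
) -> List[Hashable]:
    """Keep only the keys whose token t satisfies lower <= t < upper.

    This mirrors `token(k) >= y and token(k) < x` from the query in this
    report. `token_of` is an assumed stand-in for the partitioner's hash.
    """
    return [k for k in keys if lower_inclusive <= token_of(k) < upper_exclusive]


def build_in_query(table: str, key_column: str, n_keys: int) -> Optional[str]:
    """Build a CQL SELECT restricted only by IN, with one bind marker per key.

    Returns None when no keys fall in the range, so the caller can skip
    the query entirely for that split.
    """
    if n_keys == 0:
        return None
    placeholders = ", ".join(["?"] * n_keys)
    return f"SELECT * FROM {table} WHERE {key_column} IN ({placeholders})"


if __name__ == "__main__":
    # Toy token function for demonstration only; a real partitioner differs.
    toy_token = lambda k: k * 10
    wanted = keys_in_token_range([1, 2, 3, 4], toy_token, 20, 40)
    print(wanted, build_in_query("t", "k", len(wanted)))
```

The cost of this approach is that every client has to reproduce the 
partitioner's token computation, which is the duplication this report argues 
should not be necessary.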

Most restrictions on CQL syntax seem to exist to prevent unduly expensive 
queries. This one seems to do the opposite: it rules out the cheaper query.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
