Jonathan Halliday created CASSANDRA-7016:
--------------------------------------------
Summary: can't map/reduce over subset of rows with cql
Key: CASSANDRA-7016
URL: https://issues.apache.org/jira/browse/CASSANDRA-7016
Project: Cassandra
Issue Type: Bug
Components: Core, Hadoop
Reporter: Jonathan Halliday
select ... where token(k) < x and token(k) >= y and k in (a,b) allow filtering;
This fails on Cassandra 2.0.6 with an error that k can't be restricted by more than one relation.
In the context of map/reduce (hence the token range) I want to map over only a
subset of the keys (hence the 'in'). Pushing the 'in' filter down to cql is
substantially cheaper than pulling all rows to the client and then discarding
most of them.
Currently this is possible only if the Hadoop integration code is altered to
apply the AND on the client side and issue CQL that contains only the resulting
filtered 'in' set. The problem is not Hadoop-specific though, so IMO it should
really be solved in CQL, not in the Hadoop integration code.
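
For illustration, the client-side workaround might look roughly like the sketch below. It is not the actual Hadoop integration code: the helper names, the table name ks.t, and toy_token (a stand-in for the cluster partitioner's real token function) are all hypothetical. The idea is to keep only the keys whose token falls inside the split's range [y, x), then issue a query that carries just the reduced 'in' set.

```python
from typing import Callable, List, Optional, Sequence


def keys_in_token_range(
    keys: Sequence[str],
    token_of: Callable[[str], int],
    lower_inclusive: int,
    upper_exclusive: int,
) -> List[str]:
    """Apply the AND on the client: keep keys with y <= token(k) < x."""
    return [k for k in keys if lower_inclusive <= token_of(k) < upper_exclusive]


def build_in_query(table: str, key_column: str, keys: Sequence[str]) -> Optional[str]:
    """Build a CQL string with one '?' bind marker per surviving key.

    Returns None when no key falls in the split, so the caller can skip it.
    """
    if not keys:
        return None
    markers = ", ".join("?" for _ in keys)
    return f"SELECT * FROM {table} WHERE {key_column} IN ({markers})"


def toy_token(key: str) -> int:
    """Hypothetical stand-in for the partitioner's token function.

    A real client would have to compute the same token as the cluster does.
    """
    return sum(key.encode("utf-8")) % 100


if __name__ == "__main__":
    wanted = ["a", "b", "c"]  # the keys from the 'in' clause
    local = keys_in_token_range(wanted, toy_token, 98, 100)  # y = 98, x = 100
    print(local)
    print(build_in_query("ks.t", "k", local))
```

Each split then queries only the keys it owns, but the client has to know the partitioner's token function, which is exactly the logic that arguably belongs on the server side.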
Most restrictions on cql syntax seem to exist to prevent unduly expensive
queries. This one seems to be doing the opposite.