[
https://issues.apache.org/jira/browse/CASSANDRA-10221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jon Haddad updated CASSANDRA-10221:
-----------------------------------
Description:
For analytics workloads (Spark in particular), it would be nice if we could
add an arbitrary predicate to the WHERE clause. I added the CL=ONE requirement
because this seems like it may be insane to do at any other consistency
level.
Currently, if you want to filter on an arbitrary column of a table in the Spark
connector, you have to pull the entire table into memory via what is
effectively a distributed SELECT * over token ranges, typically at CL=ONE. It
would be much nicer to avoid pulling the extra data into memory and simply skip
any row that doesn't satisfy the predicates.
For sanity, I think this should require the ALLOW FILTERING clause.
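To make the difference concrete, here is a toy sketch in Python. It does not talk to Cassandra or the Spark connector; the "replica" is just an in-memory list, and the function names (scan_then_filter, filter_on_replica) and the sample "country" column are made up for illustration. It only counts how many rows would have to leave the replica in each approach.

```python
# Toy model only: a "replica" is a list of dict rows. Nothing here uses a
# real Cassandra or Spark API; all names are hypothetical.
from typing import Callable, Dict, Iterable, List, Tuple

Row = Dict[str, object]
Predicate = Callable[[Row], bool]


def scan_then_filter(replica_rows: Iterable[Row],
                     predicate: Predicate) -> Tuple[List[Row], int]:
    """Today's pattern: ship every row to the client, then filter there."""
    shipped = list(replica_rows)  # every row leaves the replica
    kept = [row for row in shipped if predicate(row)]
    return kept, len(shipped)


def filter_on_replica(replica_rows: Iterable[Row],
                      predicate: Predicate) -> Tuple[List[Row], int]:
    """Requested pattern: the replica skips non-matching rows before shipping."""
    shipped = [row for row in replica_rows if predicate(row)]
    return shipped, len(shipped)


# 1000 sample rows; every tenth row has country == "NZ".
rows = [{"id": i, "country": "NZ" if i % 10 == 0 else "US"} for i in range(1000)]


def is_nz(row: Row) -> bool:
    return row["country"] == "NZ"


client_side, shipped_before = scan_then_filter(rows, is_nz)
replica_side, shipped_after = filter_on_replica(rows, is_nz)
```

Both calls return the same matching rows; the only thing that changes is the second value, the number of rows that had to be transferred to get them.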
was: For analytics workloads it would be nice if we could add any predicate.
I added the CL=ONE requirement since it seems like this may be insane to do
with any other level of consistency.
> arbitrary predicate pushdown on CL=ONE
> --------------------------------------
>
> Key: CASSANDRA-10221
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10221
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Jon Haddad