[ https://issues.apache.org/jira/browse/CASSANDRA-10436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sam Tunnicliffe updated CASSANDRA-10436:
----------------------------------------
Component/s: CQL
> Index selection should be weighted in favour of custom expressions
> ------------------------------------------------------------------
>
> Key: CASSANDRA-10436
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10436
> Project: Cassandra
> Issue Type: Improvement
> Components: CQL
> Reporter: Sam Tunnicliffe
> Assignee: Sam Tunnicliffe
> Fix For: 3.0.0 rc2
>
>
> If a SELECT contains a custom index expression (CASSANDRA-10217), that should
> always be chosen as the primary expression during query execution. Should the
> statement contain other expressions which can be satisfied by a built-in
> index, we don't currently have the ability to apply the custom expression as
> a filter. What's more, the method of selecting which index to use is fairly
> primitive (and cannot be overridden until CASSANDRA-10214), so we should
> ensure that a custom expression, if present, is always chosen.
> Suppose we have a custom index implementation which provides prefix matching
> on text fields.
> {code}
> CREATE TABLE ks.t (k int, v1 int, v2 text, PRIMARY KEY(k));
> CREATE INDEX v1_idx ON ks.t(v1);
> CREATE CUSTOM INDEX v2_idx ON ks.t(v2) USING 'com.example.CustomIndex';
> INSERT INTO ks.t(k, v1, v2) VALUES(0, 0, 'abc');
> INSERT INTO ks.t(k, v1, v2) VALUES(1, 1, 'def');
> SELECT * FROM ks.t WHERE v1=0 AND expr(v2_idx, 'd*') ALLOW FILTERING;
> {code}
> In the above example the expected result would contain no rows, which would
> be the case if {{v2_idx}} is selected as the primary (i.e. most selective)
> index during query execution. However, if {{v1_idx}} is chosen instead, the
> results of its lookup will have no further filter applied and so an incorrect
> result will be returned.
> Note: this has always been something of an issue for custom indexes as the
> expressions they support may not be natively filterable by C*. For example,
> with the full text search syntax used by Stratio & DSE Search, if the custom
> index isn't selected the filtering will erroneously remove all rows as the
> value of the dummy column does not match the Lucene/Solr search expression
> literal. It's probably a fairly minor concern as in most cases a query using
> a custom index will not include other expressions (usually because custom
> indexes are per-row indexes, and so can support multi-field expression
> syntax). Also, an index implementation can return a very low estimated
> result count to try to ensure it is selected; custom expressions just
> provide an opportunity to improve the situation.
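
To make the failure mode in the description concrete, here is a toy model in Python. It is only a sketch, not Cassandra's actual selection code: the names Expression, execute, select_by_estimate and select_preferring_custom are hypothetical, the estimated row counts are made up, and the custom index is assumed to do simple prefix matching as in the example. The one behaviour taken from the description is that a custom expression cannot be applied as a post-filter.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]

# The two rows inserted in the example from the issue description.
ROWS: List[Row] = [
    {"k": 0, "v1": 0, "v2": "abc"},
    {"k": 1, "v1": 1, "v2": "def"},
]


@dataclass
class Expression:
    """Hypothetical stand-in for one restriction in the WHERE clause."""
    index_name: str
    is_custom: bool
    estimated_rows: int  # made-up estimate, for illustration only
    matches: Callable[[Row], bool]


def execute(rows: List[Row], expressions: List[Expression],
            primary: Expression) -> List[Row]:
    """Look up rows via the primary expression, then post-filter the rest."""
    result = [r for r in rows if primary.matches(r)]
    for e in expressions:
        # Assumption from the description: a custom expression cannot be
        # applied as a filter, so it is silently skipped when not primary.
        if e is primary or e.is_custom:
            continue
        result = [r for r in result if e.matches(r)]
    return result


def select_by_estimate(expressions: List[Expression]) -> Expression:
    """Primitive selection: pick the lowest estimated result count."""
    return min(expressions, key=lambda e: e.estimated_rows)


def select_preferring_custom(expressions: List[Expression]) -> Expression:
    """Proposed weighting: a custom expression, if present, always wins."""
    custom = [e for e in expressions if e.is_custom]
    return custom[0] if custom else select_by_estimate(expressions)


# WHERE v1=0 AND expr(v2_idx, 'd*')
v1_expr = Expression("v1_idx", False, 1, lambda r: r["v1"] == 0)
v2_expr = Expression("v2_idx", True, 5,
                     lambda r: str(r["v2"]).startswith("d"))
exprs = [v1_expr, v2_expr]

wrong = execute(ROWS, exprs, select_by_estimate(exprs))
right = execute(ROWS, exprs, select_preferring_custom(exprs))
print("estimate-based selection:", wrong)
print("custom-preferring selection:", right)
```

In this model the estimate-based choice picks v1_idx and returns the row with k=0, because the prefix expression is never applied. Preferring the custom expression yields the expected empty result, since v1=0 can still be applied as an ordinary filter afterwards.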