[jira] [Commented] (CASSANDRA-6048) Add the ability to use multiple indexes in a single query

Alex Liu (JIRA) Thu, 17 Oct 2013 17:33:01 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-6048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13798659#comment-13798659
 ]


Alex Liu commented on CASSANDRA-6048:
-------------------------------------

From this link:
http://sqlinthewild.co.za/index.php/2010/09/14/one-wide-index-or-multiple-narrow-indexes/

{code}
SQL can use multiple indexes on a single table (Index Intersection), but it’s 
not the most efficient option. It’s worth noting that SQL won’t always choose 
to do the index intersection. It may quite well decide that a table/clustered 
index scan is faster than the multiple seeks and joins that the intersection 
will do. Or, if one of the conditions is very selective, it may decide to seek 
on one of the indexes, do key lookups to fetch the rest of the columns and then 
do secondary filters to evaluate the rest of the predicates.
{code}

If the primary index is significantly more selective (its mean column count 
is much lower) than the other indexes, a single index lookup followed by a 
filtering loop is better.

We can also add a threshold on the number of indexes to join, so we don't end 
up with too many index seeks.
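
A rough sketch of that decision, in Python for brevity. The function name 
plan_index_usage and the selectivity_ratio and max_indexes knobs are 
hypothetical and not taken from the attached patches; the only input assumed 
is an estimated mean column count per index.

```python
from typing import Dict, List


def plan_index_usage(
    mean_columns: Dict[str, float],
    selectivity_ratio: float = 10.0,
    max_indexes: int = 3,
) -> List[str]:
    """Return the names of the indexes to read, most selective first.

    mean_columns maps an index name to its estimated mean column count
    (lower means more selective). Both keyword arguments are illustrative
    tuning knobs, not existing Cassandra settings.
    """
    if not mean_columns:
        return []
    if max_indexes < 1:
        raise ValueError("max_indexes must be at least 1")

    # Rank the indexes so the most selective one comes first.
    ranked = sorted(mean_columns, key=lambda name: mean_columns[name])
    best = ranked[0]
    chosen = [best]

    for name in ranked[1:]:
        if len(chosen) >= max_indexes:
            break  # threshold on the number of indexes to join
        # When the best index is far more selective than this one, stop
        # joining and leave the remaining predicates to the filter loop.
        if mean_columns[name] > selectivity_ratio * mean_columns[best]:
            break
        chosen.append(name)
    return chosen
```

A result with a single name corresponds to the "one index + loop" case; a 
longer result lists the indexes whose row keys would be joined. The default 
values are placeholders and would need tuning against real workloads.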

> Add the ability to use multiple indexes in a single query
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-6048
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6048
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Alex Liu
>            Assignee: Alex Liu
>             Fix For: 2.1
>
>         Attachments: 6048-1.2-branch.txt, 6048-trunk.txt
>
>
> Existing data filtering uses the following algorithm:
> {code}
>    1. find the most selective predicate, based on the smallest mean column count
>    2. fetch rows for the most selective predicate, then filter the 
> data on the remaining predicates.
> {code}
> So potentially we could improve the performance by
> {code}
>    1.  joining multiple predicates, then filtering the data on the other 
> predicates.
>    2.  fine-tuning the best-predicate selection algorithm
> {code}
> For a multiple-predicate join, performance could improve if one predicate 
> has many entries and another predicate has very few. It means a few index 
> CF reads, a join on the row keys, a row fetch, and then a filter on the 
> other predicates.
> Another approach is to have an index on multiple columns.
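
To make the two strategies in the description concrete, here is a minimal 
in-memory sketch, not the patch itself. Plain dicts stand in for the data CF 
and the index CFs, the names build_index, query_single_index and 
query_index_join are hypothetical, and selectivity is approximated by the 
number of matching index entries rather than by the mean column count.

```python
from typing import Dict, Hashable, Iterable, List, Set

Row = Dict[str, Hashable]
Index = Dict[Hashable, Set[str]]


def build_index(rows: Dict[str, Row], column: str) -> Index:
    """Map each value of `column` to the set of row keys holding it."""
    index: Index = {}
    for key, row in rows.items():
        index.setdefault(row[column], set()).add(key)
    return index


def _matches(row: Row, predicates: Dict[str, Hashable]) -> bool:
    """True if the row satisfies every equality predicate."""
    return all(row.get(col) == val for col, val in predicates.items())


def query_single_index(
    rows: Dict[str, Row],
    indexes: Dict[str, Index],
    predicates: Dict[str, Hashable],
) -> List[str]:
    """Existing approach: read the most selective index, then filter."""
    indexed = [col for col in predicates if col in indexes]
    if not indexed:
        raise ValueError("no indexed predicate to start from")
    # Stand-in for "smallest mean column count": fewest matching entries.
    best = min(indexed, key=lambda col: len(indexes[col].get(predicates[col], ())))
    keys = indexes[best].get(predicates[best], set())
    return sorted(k for k in keys if _matches(rows[k], predicates))


def query_index_join(
    rows: Dict[str, Row],
    indexes: Dict[str, Index],
    predicates: Dict[str, Hashable],
    join_columns: Iterable[str],
) -> List[str]:
    """Proposed approach: intersect row keys from several indexes first."""
    join_columns = list(join_columns)
    if not join_columns:
        raise ValueError("need at least one index to join")
    key_sets = [indexes[col].get(predicates[col], set()) for col in join_columns]
    keys = set.intersection(*key_sets)  # join on row keys
    # Only the predicates not covered by the join still need filtering.
    remaining = {c: v for c, v in predicates.items() if c not in join_columns}
    return sorted(k for k in keys if _matches(rows[k], remaining))
```

Both functions return the same row keys for the same predicates; the 
difference is how many rows have to be fetched before filtering. The join 
reads more index entries but fetches only rows present in every joined 
index.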



--
This message was sent by Atlassian JIRA
(v6.1#6144)
