[jira] Commented: (CASSANDRA-749) Secondary indices for column families

Jonathan Ellis (JIRA) Fri, 12 Mar 2010 15:39:50 -0800

    [ https://issues.apache.org/jira/browse/CASSANDRA-749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844755#action_12844755 ]


Jonathan Ellis commented on CASSANDRA-749:
------------------------------------------

> Is it worth creating a secondary index that only contains local data, versus 
> a distributed secondary index (a normal ColumnFamily?) 

I think my initial reasoning was wrong here.  I was anti-local-indexes because 
"we have to query the full cluster for any index lookup, since we are throwing 
away our usual partitioning scheme."

Which is true, but it ignores the fact that, in most cases, you will have to 
"query the full cluster" to get the actual matching rows, b/c the indexed rows 
will be spread across all machines.  So, having local indexes is better in the 
common case, since it actually saves a round trip between querying the index 
and querying the rows.
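
To make the round-trip argument concrete, here is a rough sketch in Python. 
It is not Cassandra code: the Node class, the owner() helper and the 
hard-coded round counts are all hypothetical, and crc32 stands in for the 
real partitioner.  It only shows that both schemes return the same rows, 
but the local-index path gets there in one scatter/gather round.

```python
import zlib
from collections import defaultdict


class Node:
    """Hypothetical storage node: its rows plus an index over only those rows."""

    def __init__(self):
        self.rows = {}                        # row key -> {column: value}
        self.local_index = defaultdict(set)   # (column, value) -> local row keys

    def put(self, key, columns):
        self.rows[key] = columns
        for col, val in columns.items():
            self.local_index[(col, val)].add(key)

    def matching_rows(self, col, val):
        keys = self.local_index.get((col, val), set())
        return {k: self.rows[k] for k in keys}


def owner(nodes, key):
    # Assumption: a simple hash partitioner picks the node that owns a key.
    return nodes[zlib.crc32(key.encode()) % len(nodes)]


def query_local_indexes(nodes, col, val):
    """One round: every node consults its own index and returns rows directly."""
    result = {}
    for node in nodes:
        result.update(node.matching_rows(col, val))
    return result, 1


def query_distributed_index(nodes, global_index, col, val):
    """Two rounds: fetch matching keys from the index, then fetch rows by key."""
    keys = global_index.get((col, val), set())            # round 1: index lookup
    result = {k: owner(nodes, k).rows[k] for k in keys}   # round 2: row fetch
    return result, 2


nodes = [Node() for _ in range(3)]
global_index = defaultdict(set)   # stands in for an index kept in its own CF
for key, state in [("alice", "TX"), ("bob", "CA"), ("carol", "TX"), ("dave", "TX")]:
    owner(nodes, key).put(key, {"state": state})
    global_index[("state", state)].add(key)

local_rows, local_rounds = query_local_indexes(nodes, "state", "TX")
dist_rows, dist_rounds = query_distributed_index(nodes, global_index, "state", "TX")
```

The trade-off is that the local-index query always touches every node, 
while the distributed lookup touches only the nodes that own matching rows.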

Also, having each node index the rows it has locally means you don't have to 
worry about sharding a very large index since it happens automatically.
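
A small sketch of the "automatic sharding" point, again in Python and under 
assumptions: build_local_indexes() is a made-up helper, and crc32 modulo the 
node count stands in for whatever partitioner places the rows.  The index is 
split along the same lines as the data, so no node ever holds all of it.

```python
import zlib
from collections import defaultdict


def build_local_indexes(rows, node_count):
    """Return one index per node, each covering only the rows that node owns."""
    indexes = [defaultdict(set) for _ in range(node_count)]
    for key, value in rows.items():
        # Assumption: the same hash that places the row also places its index entry.
        node_id = zlib.crc32(key.encode()) % node_count
        indexes[node_id][value].add(key)
    return indexes


rows = {f"user{i}": ("TX" if i % 2 else "CA") for i in range(1000)}
indexes = build_local_indexes(rows, 4)

# Number of index entries each node ends up holding.
entries_per_node = [sum(len(keys) for keys in idx.values()) for idx in indexes]
```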

Finally, it lets us use the local commitlog to keep index + data in sync.
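
Roughly what I mean by that, as a Python sketch.  Everything here is 
hypothetical: a plain list stands in for the commitlog, and IndexedNode is 
not a real class.  Since the data and its index live on the same node, one 
log entry covers both, and replaying the log rebuilds them together.

```python
class IndexedNode:
    """Hypothetical node whose rows and index are both derived from one log."""

    def __init__(self, commitlog=None):
        self.commitlog = []
        self.rows = {}    # row key -> {column: value}
        self.index = {}   # (column, value) -> set of row keys
        # Recovery: replay every logged mutation into both structures.
        for key, col, val in (commitlog or []):
            self.commitlog.append((key, col, val))
            self._apply(key, col, val)

    def write(self, key, col, val):
        # Log first, then apply; a crash after the append is fixed by replay.
        self.commitlog.append((key, col, val))
        self._apply(key, col, val)

    def _apply(self, key, col, val):
        old = self.rows.get(key, {}).get(col)
        if old is not None:
            # Drop the stale index entry so the index never points at old values.
            self.index[(col, old)].discard(key)
        self.rows.setdefault(key, {})[col] = val
        self.index.setdefault((col, val), set()).add(key)


node = IndexedNode()
node.write("alice", "state", "TX")
node.write("alice", "state", "CA")   # overwrite: index entry must move
node.write("bob", "state", "TX")

# Simulate a restart: a fresh node sees only the log.
recovered = IndexedNode(commitlog=list(node.commitlog))
```

With a distributed index the row and its index entry usually sit on 
different nodes, so there is no single log that covers both writes.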

> Secondary indices for column families
> -------------------------------------
>
>                 Key: CASSANDRA-749
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-749
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Gary Dusbabek
>            Assignee: Gary Dusbabek
>            Priority: Minor
>             Fix For: 0.8
>
>         Attachments: 0001-simple-secondary-indices.patch, views-discussion.txt
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
