[ https://issues.apache.org/jira/browse/CASSANDRA-749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844755#action_12844755 ]
Jonathan Ellis commented on CASSANDRA-749:
------------------------------------------
> Is it worth creating a secondary index that only contains local data, versus
> a distributed secondary index (a normal ColumnFamily?)
I think my initial reasoning was wrong here. I was anti-local-indexes because
"we have to query the full cluster for any index lookup, since we are throwing
away our usual partitioning scheme."
That is true, but it ignores the fact that, in most cases, you will have to
"query the full cluster" anyway to get the actual matching rows, b/c the indexed
rows will be spread across all machines. So local indexes are better in the
common case, since they save a round trip: the index lookup and the row read
happen on the same node instead of in two separate steps.
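To make the round-trip argument concrete, here is a toy sketch in Python. It is not Cassandra code: the Node class, the crc32-based owner() partitioner, and the "phases" counter (one per sequential cluster hop) are all assumptions made up for illustration.

```python
import zlib
from collections import defaultdict

class Node:
    """Hypothetical node holding rows plus an index over its own rows only."""
    def __init__(self, name):
        self.name = name
        self.rows = {}                        # row key -> {column: value}
        self.local_index = defaultdict(set)   # (column, value) -> local row keys

    def put(self, key, row):
        self.rows[key] = row
        for col, val in row.items():
            self.local_index[(col, val)].add(key)

def owner(nodes, key):
    # Toy partitioner (assumption): deterministic hash of the key.
    return nodes[zlib.crc32(key.encode()) % len(nodes)]

def build_cluster(rows, n_nodes=3):
    nodes = [Node(f"n{i}") for i in range(n_nodes)]
    # Distributed index: a normal partitioned map keyed by "column=value".
    index_cf = {n.name: defaultdict(set) for n in nodes}
    for key, row in rows.items():
        owner(nodes, key).put(key, row)
        for col, val in row.items():
            index_owner = owner(nodes, f"{col}={val}")
            index_cf[index_owner.name][(col, val)].add(key)
    return nodes, index_cf

def query_distributed_index(nodes, index_cf, col, val):
    # Phase 1: ask the node that owns the index entry for matching keys.
    index_owner = owner(nodes, f"{col}={val}")
    keys = index_cf[index_owner.name].get((col, val), set())
    # Phase 2: fetch each row from whichever node owns it.
    result = {k: owner(nodes, k).rows[k] for k in keys}
    return result, 2

def query_local_index(nodes, col, val):
    # Single phase: every node answers from its own index and its own rows.
    result = {}
    for node in nodes:
        for k in node.local_index.get((col, val), ()):
            result[k] = node.rows[k]
    return result, 1
```

Both functions return the same rows; the distributed variant needs two sequential phases, while the local variant contacts every node once.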
Also, having each node index the rows it has locally means you don't have to
worry about sharding a very large index since it happens automatically.
Finally, it lets us use the local commitlog to keep index + data in sync.
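A minimal sketch of the commitlog point, again as a toy model rather than Cassandra's actual storage code (LocalStore, write, and recover are hypothetical names): because the index lives on the same node as the data, one log record per write is enough to rebuild both on replay.

```python
class LocalStore:
    """Hypothetical single-node store: one commitlog drives rows and index."""
    def __init__(self):
        self.commitlog = []   # append-only list of (key, column, value)
        self.rows = {}        # row key -> {column: value}
        self.index = {}       # (column, value) -> set of row keys

    def write(self, key, column, value):
        # Log first, then apply; the index needs no log record of its own.
        self.commitlog.append((key, column, value))
        self._apply(key, column, value)

    def _apply(self, key, column, value):
        row = self.rows.setdefault(key, {})
        old = row.get(column)
        if old is not None:
            # Drop the stale index entry for the overwritten value.
            self.index[(column, old)].discard(key)
        row[column] = value
        self.index.setdefault((column, value), set()).add(key)

    @classmethod
    def recover(cls, commitlog):
        # Replaying the log rebuilds data and index together.
        store = cls()
        for record in commitlog:
            store.commitlog.append(record)
            store._apply(*record)
        return store
```

With a distributed index, the index entry and the row would usually live on different nodes, so no single local log could cover both.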
> Secondary indices for column families
> -------------------------------------
>
> Key: CASSANDRA-749
> URL: https://issues.apache.org/jira/browse/CASSANDRA-749
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Gary Dusbabek
> Assignee: Gary Dusbabek
> Priority: Minor
> Fix For: 0.8
>
> Attachments: 0001-simple-secondary-indices.patch, views-discussion.txt
>
>