[jira] [Comment Edited] (CASSANDRA-6477) Global indexes

Carl Yeksigian (JIRA) Mon, 04 May 2015 07:18:22 -0700

    [ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526653#comment-14526653 ]


Carl Yeksigian edited comment on CASSANDRA-6477 at 5/4/15 2:16 PM:
-------------------------------------------------------------------

# It is going to be the same mechanism, but we don't want to use the same 
consistency level as the insert. This way, we can ensure that at least one 
node has seen all of the updates, and thus we can generate the correct 
tombstone based on the previous values; we are trying to make the dependency 
between the data table and the index table redundant, so we need to make sure a 
quorum is involved in the write.
# Each replica makes a GI update independently, based on the data that it has, 
which means that we might issue index updates for an older write that hasn't 
made it to all of the replicas yet. To cut down on the amount of work that the 
indexes do, a pretty easy optimization is to send the index mutation only to 
the index replica that the data node will wait on, instead of sending it to all 
of the index replicas.
# If we ever get into a situation where we have data loss in either the base 
table or the index table (both would likely go together), we would really need 
to run a rebuild, since there is no guarantee that the index doesn't contain 
extra data which isn't in the data table. Otherwise, we can repair the data and 
index tables independently, so that a repair issued on the data table should 
also repair all of the global index tables.
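
To make the first two points concrete, here is a minimal Python sketch. It is 
not the actual patch. It only illustrates why a quorum guarantees that at least 
one replica has seen the previous value, how a replica could derive the index 
tombstone from its local previous value, and one way a data replica could pick 
a single index replica. All names (IndexMutation, index_mutations, 
paired_index_replica) are hypothetical, and pairing replicas by position is an 
assumption made for illustration, not something decided in this ticket.

```python
from itertools import combinations
from typing import List, NamedTuple, Optional


class IndexMutation(NamedTuple):
    """Hypothetical shape of a mutation sent to the index table."""
    kind: str            # "tombstone" or "insert"
    indexed_value: str   # value of the indexed column
    primary_key: str     # primary key of the base row
    timestamp: int


def quorum(replication_factor: int) -> int:
    """Smallest majority of the replicas."""
    return replication_factor // 2 + 1


def quorums_always_overlap(replication_factor: int) -> bool:
    """Check by brute force that any two quorums share at least one replica."""
    size = quorum(replication_factor)
    quorums = [set(c) for c in combinations(range(replication_factor), size)]
    return all(a & b for a in quorums for b in quorums)


def index_mutations(primary_key: str,
                    old_value: Optional[str],
                    new_value: Optional[str],
                    timestamp: int) -> List[IndexMutation]:
    """Index mutations one replica generates from the data it holds locally.

    old_value is whatever this replica currently has for the indexed column;
    a replica that missed an earlier write will pass a stale value here.
    """
    mutations = []
    if old_value is not None and old_value != new_value:
        # Remove the index entry that pointed at the previous value.
        mutations.append(
            IndexMutation("tombstone", old_value, primary_key, timestamp))
    if new_value is not None and new_value != old_value:
        mutations.append(
            IndexMutation("insert", new_value, primary_key, timestamp))
    return mutations


def paired_index_replica(base_replicas: List[str],
                         index_replicas: List[str],
                         base_node: str) -> str:
    """Assumed pairing: the i-th base replica sends to the i-th index replica."""
    return index_replicas[base_replicas.index(base_node)]
```

With this sketch, a replica that already holds "a" and receives "b" emits a 
tombstone for "a" plus an insert for "b", while a replica that missed the 
earlier write (old_value is None) emits only the insert. Because any two 
quorums overlap, at least one replica in the write quorum is in the first 
situation and produces the tombstone.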


was (Author: carlyeks):
# It is going to be the same mechanism, but we don't want to use the same 
consistency as what the insert is. This way, we can ensure that at least one 
node has seen all of the updates, and thus we can generate the correct 
tombstone based on the previous values
# Each replica makes a GI update independently, based on the data that it has, 
which means that we might issue updates for an older update that hasn't made it 
to all of the replicas yet. To cut down on the amount of work that the indexes 
do, a pretty easy optimization is to just send the index mutation to the index 
replica that the data node will wait on instead of sending them to all of the 
index replicas
# If we ever get into a situation where we have data loss in either the base 
table or the index table (both would likely go together), we would really need 
to run a rebuild, since there is no guarantee that extra data wouldn't be 
present in the index which isn't in the data table. Otherwise, we can repair 
the data and index tables independently, so that a repair issued on the data 
table should also repair all of the global index tables

> Global indexes
> --------------
>
>                 Key: CASSANDRA-6477
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>            Assignee: Carl Yeksigian
>              Labels: cql
>             Fix For: 3.x
>
>
> Local indexes are suitable for low-cardinality data, where spreading the 
> index across the cluster is a Good Thing.  However, for high-cardinality 
> data, local indexes require querying most nodes in the cluster even if only a 
> handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
