[jira] [Comment Edited] (CASSANDRA-6477) Global indexes

Oleg Anastasyev (JIRA) Mon, 09 Mar 2015 03:03:07 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352732#comment-14352732
 ]


Oleg Anastasyev edited comment on CASSANDRA-6477 at 3/9/15 10:02 AM:
---------------------------------------------------------------------

Sorry for the slight delay. Here are my thoughts; I hope you'll find them useful:

1. Composite indexes are the most useful feature of GI. The majority of our GIs
are composite, some with a different clustering order defined. Implementing
composite indexes is not much more work, by the way. Composite partition keys
on the GI CF are also used to split otherwise wide partitions of a global index
with popular and frequently changing values. Such wide partitions suffer from
too many range tombstones (obviously, each modification of an indexed value
generates a range tombstone in the global index CF; still, it surprises
people).
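
To illustrate the partition-splitting idea from point 1, here is a minimal
sketch in Python. It is not Cassandra code: the bucket count, the helper names
and the use of MD5 are assumptions made for illustration only. A popular
indexed value is spread over several index partitions by adding a bucket,
derived from the base row key, to the index partition key.

```python
import hashlib

NUM_BUCKETS = 8  # hypothetical: number of sub-partitions per indexed value


def bucket_for(base_key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Derive a stable bucket from the base row key (MD5 is an arbitrary choice)."""
    digest = hashlib.md5(base_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_buckets


def index_partition_key(indexed_value: str, base_key: str) -> tuple:
    """Composite partition key of the index row: (indexed value, bucket)."""
    return (indexed_value, bucket_for(base_key))


def partitions_to_read(indexed_value: str, num_buckets: int = NUM_BUCKETS) -> list:
    """A lookup by indexed value has to query every bucket of that value."""
    return [(indexed_value, b) for b in range(num_buckets)]
```

The trade-off in this sketch is that writes and tombstones for one popular
value are spread over NUM_BUCKETS partitions, while a read by that value has to
visit all of them.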

2. As far as I can see from the changes to the CQL syntax and CFMetaData.java,
compaction and compression props are copied from the base CF to the global
index CF. GI CFs can have row sizes, update behaviour, and read and write
patterns very different from those of the base CF, so being able to specify
compaction and compression properties, as well as the others available in the
CREATE TABLE WITH clause, could be useful.

The syntax for a GI with composite keys could then be, e.g.:
{code}
CREATE GLOBAL INDEX indexname ON baseCF( (partk1,partk2),clustkey1,... ) 
DENORMALIZED ... WITH <all the with properties of normal table>
{code}
(I'd also suggest replacing the keyword "DENORMALIZED" with something more
familiar to SQL people, like "INCLUDE", e.g. as in
https://msdn.microsoft.com/en-us/library/ms190806.aspx )

3. I am not sure that forbidding the creation of a global index on a column
with an existing 2i is a good idea. We use 2 modes for a global index. Right
after a global index is created, only writes to the new index are activated. No
reads from it are allowed while the base CF dataset is scanned and its data is
copied to the global index. If another global index or 2i on the same columns
is available for reads, it is used. After the build is complete, the operator
can enable the newly built index for reads using an ALTER GLOBAL INDEX ENABLE
statement and disable the old indexes. This makes the transition to GI, and
changes to the structure of a GI, smoother from an operational perspective. If
something goes wrong, the operator just disables the new index and re-enables
the old ones in no time. Applying the same write-only/read-write mode switch
here could make the transition from 2i to GI easier for people. This feature
also makes an on-the-fly rebuild of a GI possible, which could be useful until
all bugs with inconsistent global index updates are fixed.
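
The mode switch from point 3 can be sketched as a small state holder. This is
an illustrative Python model, not code from our implementation or from the
branch: the class and method names are hypothetical, and it assumes that an
index disabled for reads keeps receiving writes, which is what makes
re-enabling it instant.

```python
from enum import Enum


class IndexMode(Enum):
    WRITE_ONLY = "write-only"  # receives writes, not served to readers
    READ_WRITE = "read-write"  # receives writes and is served to readers


class IndexRegistry:
    """Tracks the mode of every index defined on one base table."""

    def __init__(self):
        self.modes = {}  # index name -> IndexMode

    def create(self, name):
        # A new index starts write-only while the base data is backfilled.
        self.modes[name] = IndexMode.WRITE_ONLY

    def enable_reads(self, name):
        self.modes[name] = IndexMode.READ_WRITE

    def disable_reads(self, name):
        # Assumption: the index stays maintained, so rollback is immediate.
        self.modes[name] = IndexMode.WRITE_ONLY

    def writable(self):
        return list(self.modes)

    def readable(self):
        return [n for n, m in self.modes.items() if m is IndexMode.READ_WRITE]
```

A switch-over is then enable_reads on the new index followed by disable_reads
on the old one, and a rollback is the same two calls with the names swapped.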

4. Scanning the old data of the base CF to fill the new global index
consistently with the base CF is another tricky process, which I arrived at
after several rounds of trial and error. It has no external dependencies and
most of the work is performed locally on the C* nodes. You may find it useful
as well.
It breaks into 6 stages:
        1. First of all, a new empty table to hold the index data is created.
        2. Index writes are started on all nodes, so new modifications start to
fill the index. At this moment the new index is disabled for reading.
        3. C* nodes launch the primary range repair procedure on the base table
to make sure all replicas of it are the same.
        4. Each C* node scans its primary ranges of the base table data
locally, filling the index memtable and, in parallel, preparing data to stream
to other nodes.
        5. Then they stream the necessary data to other nodes, according to
the partitioning scheme and replica count.
        6. When streaming completes, the index is ready and, as a final step,
is enabled for clients to read, either automatically or by operator command.
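
Stages 4 and 5 above can be modelled with a toy simulation. This is only a
sketch under simplifying assumptions: one token slot per node, an MD5-based
toy partitioner, replicas placed on the next nodes of the ring, and the
"stream" step reduced to adding entries to a per-node set. None of the names
come from Cassandra.

```python
import hashlib
from collections import defaultdict


def token(key: str, ring_size: int) -> int:
    """Toy partitioner: map a key to one of `ring_size` slots."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % ring_size


def replicas(key: str, nodes: list, rf: int) -> list:
    """Primary owner plus the next rf-1 nodes on the ring (simplified)."""
    first = token(key, len(nodes))
    return [nodes[(first + i) % len(nodes)] for i in range(rf)]


def backfill(base_rows: dict, nodes: list, rf: int) -> dict:
    """Each node scans only the rows it is primary for (stage 4), then sends
    every index entry to the replicas of its index partition (stage 5)."""
    index_on_node = defaultdict(set)
    for node in nodes:
        for base_key, value in base_rows.items():
            if replicas(base_key, nodes, 1)[0] != node:
                continue  # outside this node's primary range
            for target in replicas(value, nodes, rf):
                index_on_node[target].add((value, base_key))
    return dict(index_on_node)
```

Because every base row has exactly one primary owner, each index entry is
produced once and lands on exactly rf nodes, without any coordinator outside
the cluster.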

Some smaller issues I found in Carl's branch are:

1. I am not sure how base table schema evolution is supported on a fully
denormalized global index. If a column is added to the base table, it must be
added to the GI. The same goes for dropping a column in the base table and for
dropping the base table itself.
2. It looks like, due to the order of schema modifications in
CreateGlobalIndexStatement.announceMigration (define the global index on the
base CF and then create the GI CF itself) and DropGlobalIndex (drop the GI CF
and then deregister it from the base table), there will be mutations to an
unknown CF while the schema modifications are not yet fully applied on all
nodes of the cluster. I'd suggest creating the GI CF first, then registering it
as a global index in the base table metadata.
3. I am not sure whether deletion of a row from the base table is implemented
correctly. It seems that the current implementation of
MutationUnit.oldValueIfUpdated interprets a deletion as no change to the global
index.
4. StorageProxy.mutateAtomic, in the case of concurrent modifications of the
base row, will produce inconsistent records in the global index, but as I
understood it, this is to be resolved later.
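
To make the expectation behind issue 3 concrete, here is a minimal Python
sketch of the index operations I would expect from one base-row change. It is
not the MutationUnit code from the branch; the function name and the tuple
format are hypothetical.

```python
def index_mutations(base_key, old_value, new_value):
    """Return the index operations implied by one base-row change.

    None stands for "no value": old_value is None for a fresh insert,
    new_value is None when the base row (or the indexed column) is deleted.
    """
    ops = []
    if old_value == new_value:
        return ops  # indexed value unchanged: nothing to do
    if old_value is not None:
        ops.append(("delete", old_value, base_key))  # remove the stale entry
    if new_value is not None:
        ops.append(("insert", new_value, base_key))  # add the current entry
    return ops
```

In this model a deletion of the base row yields a single "delete" on the index,
rather than an empty list of operations.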





> Global indexes
> --------------
>
>                 Key: CASSANDRA-6477
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>            Assignee: Carl Yeksigian
>              Labels: cql
>             Fix For: 3.0
>
>
> Local indexes are suitable for low-cardinality data, where spreading the 
> index across the cluster is a Good Thing.  However, for high-cardinality 
> data, local indexes require querying most nodes in the cluster even if only a 
> handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
