[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352732#comment-14352732 ]
Oleg Anastasyev edited comment on CASSANDRA-6477 at 3/9/15 10:02 AM:
---------------------------------------------------------------------

Sorry for the slight delay. Here are my thoughts; I hope you'll find them useful:

1. Composite indexes are the most useful feature of GI. The majority of our GIs are composite, some with a different clustering order defined. Implementing composite indexes is also not much more work, by the way. Composite partition keys on the GI CF are also used to split the otherwise wide partitions that a global index gets for popular and frequently changing values. Such wide partitions suffer from too many range tombstones (obviously, each modification of an indexed value writes a range tombstone to the global index CF; still, it surprises people).

2. As I can see from the changes to the CQL syntax and CFMetaData.java, compaction and compression properties are copied from the base CF to the global index CF. A GI CF could have row sizes, update behaviour, and read and write patterns very different from those of its base CF, so being able to specify compaction and compression properties, as well as the other properties available in the WITH clause of CREATE TABLE, could be useful. The syntax for a GI with composite keys could then be, e.g.:
{code}
CREATE GLOBAL INDEX indexname ON baseCF( (partk1,partk2),clustkey1,... )
DENORMALIZED ...
WITH <all the with properties of normal table>
{code}
(I'd also suggest replacing the keyword "DENORMALIZED" with something more familiar to SQL people, like "INCLUDE", e.g. as in https://msdn.microsoft.com/en-us/library/ms190806.aspx )

3. I am not sure that forbidding the creation of a global index on a column with an existing 2i is a good idea. We use 2 modes for a global index. Right after a global index is created, only writes to the new index are activated. No reads from it are allowed while the base CF dataset is scanned and its data is copied to the global index. If another global index or 2i on the same columns is available for reads, it is used. After the build is complete, the operator can enable the newly built index for reads using an ALTER GLOBAL INDEX ENABLE statement and disable the old indexes. This makes the transition to GI, and changes to the structure of a GI, smoother from an operational perspective. If something goes wrong, the operator just disables the new index and re-enables the old ones in no time. Applying the same write-only/read-write mode switch here could make the transition from 2i to GI easier for people. This feature also makes an on-the-fly rebuild of a GI possible, which could be useful until all bugs with inconsistent global index updates are fixed.

4. Scanning the old data in the base CF to fill the new global index consistently with the base CF is another tricky process, which I arrived at after several rounds of trial and error. It has no external dependencies, and most of the work is performed locally on the C* nodes. You may find it useful as well. It breaks down into 6 stages:
   1. First, a new empty table to hold the index data is created.
   2. Index writes are started on all nodes, so new modifications start to fill the index. At this moment the new index is disabled for reading.
   3. The C* nodes launch the primary range repair procedure on the base table to make sure all of its replicas are the same.
   4. Each C* node scans its primary ranges of the base table locally, filling the index memtable and, in parallel, preparing the data to stream to other nodes.
   5. The nodes then stream the necessary data to other nodes, according to the partitioning schema and replica count.
   6. When streaming completes, the index is ready; as a final step it is enabled for client reads, either automatically or by operator command.

Some smaller issues I found in Carl's branch:
1. I am not sure how base table schema evolution is supported for a fully denormalized global index. If a column is added to the base table, it must be added to the GI as well. The same goes for dropping a column from the base table and for dropping the base table itself.
2.
It looks like, due to the order of schema modifications in CreateGlobalIndexStatement.announceMigration (register the global index on the base CF, then create the GI CF itself) and DropGlobalIndex (drop the GI CF, then deregister it from the base table), there will be mutations to an unknown CF while the schema modifications have not yet been fully applied on all nodes of the cluster. I'd suggest creating the GI CF first and then registering it as a global index in the base table metadata.
3. I am not sure that deletion of a row from the base table is implemented correctly. It seems that the current implementation of MutationUnit.oldValueIfUpdated interprets a deletion as no change to the global index.
4. In case of concurrent modifications of the base row, StorageProxy.mutateAtomic will produce inconsistent records in the global index, but as I understand it, this is to be resolved later.

> Global indexes
> --------------
>
> Key: CASSANDRA-6477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
> Project: Cassandra
> Issue Type: New Feature
> Components: API, Core
> Reporter: Jonathan Ellis
> Assignee: Carl Yeksigian
> Labels: cql
> Fix For: 3.0
>
>
> Local indexes are suitable for low-cardinality data, where spreading the
> index across the cluster is a Good Thing.
> However, for high-cardinality data, local indexes require querying most nodes
> in the cluster even if only a handful of rows is returned.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
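
One way to picture the six build stages and the write-only/read-write switch described in the comment is as a small state machine. The Java sketch below is illustrative only: `IndexBuildStage` and `GlobalIndexBuild` are hypothetical names, not classes from Cassandra or from the branch under review, and the real procedure involves repair, scanning and streaming work that is not modelled here.

```java
/**
 * Illustrative sketch only: these types are hypothetical and are not part of
 * Cassandra or of the patch discussed in CASSANDRA-6477.
 */
enum IndexBuildStage {
    TABLE_CREATED,   // stage 1: empty index table exists, nothing is written yet
    WRITE_ONLY,      // stage 2: new mutations also update the index; reads are disabled
    BASE_REPAIRED,   // stage 3: primary-range repair of the base table has finished
    LOCAL_SCAN_DONE, // stage 4: each node has scanned its primary ranges locally
    STREAMED,        // stage 5: index data has been streamed to the owning replicas
    READ_ENABLED;    // stage 6: index is enabled for client reads

    /** Writes flow into the index from stage 2 onwards. */
    boolean acceptsWrites() {
        return this != TABLE_CREATED;
    }

    /** Reads are served only after the final enable step. */
    boolean servesReads() {
        return this == READ_ENABLED;
    }
}

final class GlobalIndexBuild {
    private IndexBuildStage stage = IndexBuildStage.TABLE_CREATED;

    IndexBuildStage stage() {
        return stage;
    }

    /** Moves to the next stage; stages cannot be skipped. */
    void advance() {
        if (stage == IndexBuildStage.READ_ENABLED) {
            throw new IllegalStateException("index is already enabled for reads");
        }
        stage = IndexBuildStage.values()[stage.ordinal() + 1];
    }

    /** Operator rollback: stop serving reads but keep the index receiving writes. */
    void disableReads() {
        if (stage != IndexBuildStage.READ_ENABLED) {
            throw new IllegalStateException("index is not enabled for reads");
        }
        stage = IndexBuildStage.STREAMED;
    }
}
```

Calling `advance()` five times walks a new index from `TABLE_CREATED` to `READ_ENABLED`. `disableReads()` models the operator switching reads back to the old indexes while the new index keeps receiving writes.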