[jira] [Commented] (CASSANDRA-6717) Modernize schema tables

Aleksey Yeschenko (JIRA) Wed, 12 Aug 2015 12:15:07 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694031#comment-14694031
 ]


Aleksey Yeschenko commented on CASSANDRA-6717:
----------------------------------------------

Publicly-visible bits LGTM, the remained - mostly LGTM. Can ultimately be 
committed as is once (6) gets addressed. Nits and notes, mixed, in no 
particular order, below:

A few places
1. {{indexDef.indexType == IndexMetadata.IndexType.CUSTOM}} and similar 
comparisons to {{IndexType}} enum, would better be {{isCustom()}}, 
{{isKeys()}}, {{isComposites()}} methods in {{IndexMetadata}} 

{{CFMetaData}}
2. {{CFMetaData.getAvailableIndexName()}} should be moved to Indexes
3. {{else if (getIndexes().get(def).isPresent())}} - should be a method in 
{{Indexes}} ({{Indexes.indexes()}}?)
4. {{isIndexNameValid()}} should be moved to {{IndexMetadata}}
5. {{existingIndexNames()}} should be keyspace-scoped - and moved to 
{{KeyspaceMetadata}} (a per-existing issue)
6. {{equals()}}/{{hashCode()}}/{{toString()}} should take indexes into account
7.
{code}
if (index.indexType == IndexMetadata.IndexType.CUSTOM)
{
    if (index.options == null || 
!index.options.containsKey(SecondaryIndex.CUSTOM_INDEX_OPTION_NAME))
        throw new ConfigurationException("Required index option missing: " + 
SecondaryIndex.CUSTOM_INDEX_OPTION_NAME);
}
{code}

belongs to {{IndexMetadata.validate()}}, and so does 

{code}
if (!isIndexNameValid(index.name))
    throw new ConfigurationException("Illegal index name " + index.name);
{code}

8.
{code}
if (isSuper())
    throw new ConfigurationException("Secondary indexes are not supported on 
super column families");
{code}

doesn’t need to be in a loop, just {{if (!indexes.isEmpty() && isSuper())}} on 
top will do (pre-exsiting)

{{CreateIndexStatement}}
9. {{findIndexedCF()}} should be moved to KeyspaceMetadata (pre-existing issue)

{{ColumnFamilyStore}}
10. {{if (info.indexType != null)}} check is redundant, {{indexType}} is 
non-nullable
11.
{code}
// also clean out any index leftovers.
for (IndexMetadata def : metadata.getIndexes())
{
    CFMetaData indexMetadata = SecondaryIndex.newIndexMetadata(metadata, def);
    if (indexMetadata != null)
        scrubDataDirectories(indexMetadata);
}
{code}

should check type for {{CUSTOM}} right there instead, and 
{{SI.newIndexMetadata()}} should assert {{type != CUSTOM}} (pre-existing-ish)

{{SchemaKeyspace}}
12. {{resetCollection}}, {{addMapEntry}}, and {{addSetEntry}} loops should be 
replaced with {{RowUpdateBuilder.map()/set()}}} where possible. And then all 
switched to {{frozen}} collection, with the rest of the schema (I or Robert 
will handle in a separate commit)

> Modernize schema tables
> -----------------------
>
>                 Key: CASSANDRA-6717
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6717
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Sylvain Lebresne
>            Assignee: Sam Tunnicliffe
>              Labels: client-impacting, doc-impacting
>             Fix For: 3.0 beta 1
>
>
> There is a few problems/improvements that can be done with the way we store 
> schema:
> # CASSANDRA-4988: as explained on the ticket, storing the comparator is now 
> redundant (or almost, we'd need to store whether the table is COMPACT or not 
> too, which we don't currently is easy and probably a good idea anyway), it 
> can be entirely reconstructed from the infos in schema_columns (the same is 
> true of key_validator and subcomparator, and replacing default_validator by a 
> COMPACT_VALUE column in all case is relatively simple). And storing the 
> comparator as an opaque string broke concurrent updates of sub-part of said 
> comparator (concurrent collection addition or altering 2 separate clustering 
> columns typically) so it's really worth removing it.
> # CASSANDRA-4603: it's time to get rid of those ugly json maps. I'll note 
> that schema_keyspaces is a problem due to its use of COMPACT STORAGE, but I 
> think we should fix it once and for-all nonetheless (see below).
> # For CASSANDRA-6382 and to allow indexing both map keys and values at the 
> same time, we'd need to be able to have more than one index definition for a 
> given column.
> # There is a few mismatches in table options between the one stored in the 
> schema and the one used when declaring/altering a table which would be nice 
> to fix. The compaction, compression and replication maps are one already 
> mentioned from CASSANDRA-4603, but also for some reason 
> 'dclocal_read_repair_chance' in CQL is called just 'local_read_repair_chance' 
> in the schema table, and 'min/max_compaction_threshold' are column families 
> option in the schema but just compaction options for CQL (which makes more 
> sense).
> None of those issues are major, and we could probably deal with them 
> independently but it might be simpler to just fix them all in one shot so I 
> wanted to sum them all up here. In particular, the fact that 
> 'schema_keyspaces' uses COMPACT STORAGE is annoying (for the replication map, 
> but it may limit future stuff too) which suggest we should migrate it to a 
> new, non COMPACT table. And while that's arguably a detail, it wouldn't hurt 
> to rename schema_columnfamilies to schema_tables for the years to come since 
> that's the prefered vernacular for CQL.
> Overall, what I would suggest is to move all schema tables to a new keyspace, 
> named 'schema' for instance (or 'system_schema' but I prefer the shorter 
> version), and fix all the issues above at once. Since we currently don't 
> exchange schema between nodes of different versions, all we'd need to do that 
> is a one shot startup migration, and overall, I think it could be simpler for 
> clients to deal with one clear migration than to have to handle minor 
> individual changes all over the place. I also think it's somewhat cleaner 
> conceptually to have schema tables in their own keyspace since they are 
> replicated through a different mechanism than other system tables.
> If we do that, we could, for instance, migrate to the following schema tables 
> (details up for discussion of course):
> {noformat}
> CREATE TYPE user_type (
>   name text,
>   column_names list<text>,
>   column_types list<text>
> )
> CREATE TABLE keyspaces (
>   name text PRIMARY KEY,
>   durable_writes boolean,
>   replication map<string, string>,
>   user_types map<string, user_type>
> )
> CREATE TYPE trigger_definition (
>   name text,
>   options map<tex, text>
> )
> CREATE TABLE tables (
>   keyspace text,
>   name text,
>   id uuid,
>   table_type text, // COMPACT, CQL or SUPER
>   dropped_columns map<text, bigint>,
>   triggers map<text, trigger_definition>,
>   // options
>   comment text,
>   compaction map<text, text>,
>   compression map<text, text>,
>   read_repair_chance double,
>   dclocal_read_repair_chance double,
>   gc_grace_seconds int,
>   caching text,
>   rows_per_partition_to_cache text,
>   default_time_to_live int,
>   min_index_interval int,
>   max_index_interval int,
>   speculative_retry text,
>   populate_io_cache_on_flush boolean,
>   bloom_filter_fp_chance double
>   memtable_flush_period_in_ms int,
>   PRIMARY KEY (keyspace, name)
> )
> CREATE TYPE index_definition (
>   name text,
>   index_type text,
>   options map<text, text>
> )
> CREATE TABLE columns (
>   keyspace text,
>   table text,
>   name text,
>   kind text, // PARTITION_KEY, CLUSTERING_COLUMN, REGULAR or COMPACT_VALUE
>   component_index int;
>   type text,
>   indexes map<text, index_definition>,
>   PRIMARY KEY (keyspace, table, name)
> )
> {noformat}
> Nit: wouldn't hurt to create a simple enum that is reuse by both CFMetaData 
> and CFPropDefs for table options names while we're at it once they are the 
> same instead of repeating string constants which is fragile.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-6717) Modernize schema tables

Reply via email to