[
https://issues.apache.org/jira/browse/CASSANDRA-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694031#comment-14694031
]
Aleksey Yeschenko commented on CASSANDRA-6717:
----------------------------------------------
Publicly-visible bits LGTM, the remained - mostly LGTM. Can ultimately be
committed as is once (6) gets addressed. Nits and notes, mixed, in no
particular order, below:
A few places
1. {{indexDef.indexType == IndexMetadata.IndexType.CUSTOM}} and similar
comparisons to {{IndexType}} enum, would better be {{isCustom()}},
{{isKeys()}}, {{isComposites()}} methods in {{IndexMetadata}}
{{CFMetaData}}
2. {{CFMetaData.getAvailableIndexName()}} should be moved to Indexes
3. {{else if (getIndexes().get(def).isPresent())}} - should be a method in
{{Indexes}} ({{Indexes.indexes()}}?)
4. {{isIndexNameValid()}} should be moved to {{IndexMetadata}}
5. {{existingIndexNames()}} should be keyspace-scoped - and moved to
{{KeyspaceMetadata}} (a per-existing issue)
6. {{equals()}}/{{hashCode()}}/{{toString()}} should take indexes into account
7.
{code}
if (index.indexType == IndexMetadata.IndexType.CUSTOM)
{
if (index.options == null ||
!index.options.containsKey(SecondaryIndex.CUSTOM_INDEX_OPTION_NAME))
throw new ConfigurationException("Required index option missing: " +
SecondaryIndex.CUSTOM_INDEX_OPTION_NAME);
}
{code}
belongs to {{IndexMetadata.validate()}}, and so does
{code}
if (!isIndexNameValid(index.name))
throw new ConfigurationException("Illegal index name " + index.name);
{code}
8.
{code}
if (isSuper())
throw new ConfigurationException("Secondary indexes are not supported on
super column families");
{code}
doesn’t need to be in a loop, just {{if (!indexes.isEmpty() && isSuper())}} on
top will do (pre-exsiting)
{{CreateIndexStatement}}
9. {{findIndexedCF()}} should be moved to KeyspaceMetadata (pre-existing issue)
{{ColumnFamilyStore}}
10. {{if (info.indexType != null)}} check is redundant, {{indexType}} is
non-nullable
11.
{code}
// also clean out any index leftovers.
for (IndexMetadata def : metadata.getIndexes())
{
CFMetaData indexMetadata = SecondaryIndex.newIndexMetadata(metadata, def);
if (indexMetadata != null)
scrubDataDirectories(indexMetadata);
}
{code}
should check type for {{CUSTOM}} right there instead, and
{{SI.newIndexMetadata()}} should assert {{type != CUSTOM}} (pre-existing-ish)
{{SchemaKeyspace}}
12. {{resetCollection}}, {{addMapEntry}}, and {{addSetEntry}} loops should be
replaced with {{RowUpdateBuilder.map()/set()}}} where possible. And then all
switched to {{frozen}} collection, with the rest of the schema (I or Robert
will handle in a separate commit)
> Modernize schema tables
> -----------------------
>
> Key: CASSANDRA-6717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6717
> Project: Cassandra
> Issue Type: Sub-task
> Reporter: Sylvain Lebresne
> Assignee: Sam Tunnicliffe
> Labels: client-impacting, doc-impacting
> Fix For: 3.0 beta 1
>
>
> There is a few problems/improvements that can be done with the way we store
> schema:
> # CASSANDRA-4988: as explained on the ticket, storing the comparator is now
> redundant (or almost, we'd need to store whether the table is COMPACT or not
> too, which we don't currently is easy and probably a good idea anyway), it
> can be entirely reconstructed from the infos in schema_columns (the same is
> true of key_validator and subcomparator, and replacing default_validator by a
> COMPACT_VALUE column in all case is relatively simple). And storing the
> comparator as an opaque string broke concurrent updates of sub-part of said
> comparator (concurrent collection addition or altering 2 separate clustering
> columns typically) so it's really worth removing it.
> # CASSANDRA-4603: it's time to get rid of those ugly json maps. I'll note
> that schema_keyspaces is a problem due to its use of COMPACT STORAGE, but I
> think we should fix it once and for-all nonetheless (see below).
> # For CASSANDRA-6382 and to allow indexing both map keys and values at the
> same time, we'd need to be able to have more than one index definition for a
> given column.
> # There is a few mismatches in table options between the one stored in the
> schema and the one used when declaring/altering a table which would be nice
> to fix. The compaction, compression and replication maps are one already
> mentioned from CASSANDRA-4603, but also for some reason
> 'dclocal_read_repair_chance' in CQL is called just 'local_read_repair_chance'
> in the schema table, and 'min/max_compaction_threshold' are column families
> option in the schema but just compaction options for CQL (which makes more
> sense).
> None of those issues are major, and we could probably deal with them
> independently but it might be simpler to just fix them all in one shot so I
> wanted to sum them all up here. In particular, the fact that
> 'schema_keyspaces' uses COMPACT STORAGE is annoying (for the replication map,
> but it may limit future stuff too) which suggest we should migrate it to a
> new, non COMPACT table. And while that's arguably a detail, it wouldn't hurt
> to rename schema_columnfamilies to schema_tables for the years to come since
> that's the prefered vernacular for CQL.
> Overall, what I would suggest is to move all schema tables to a new keyspace,
> named 'schema' for instance (or 'system_schema' but I prefer the shorter
> version), and fix all the issues above at once. Since we currently don't
> exchange schema between nodes of different versions, all we'd need to do that
> is a one shot startup migration, and overall, I think it could be simpler for
> clients to deal with one clear migration than to have to handle minor
> individual changes all over the place. I also think it's somewhat cleaner
> conceptually to have schema tables in their own keyspace since they are
> replicated through a different mechanism than other system tables.
> If we do that, we could, for instance, migrate to the following schema tables
> (details up for discussion of course):
> {noformat}
> CREATE TYPE user_type (
> name text,
> column_names list<text>,
> column_types list<text>
> )
> CREATE TABLE keyspaces (
> name text PRIMARY KEY,
> durable_writes boolean,
> replication map<string, string>,
> user_types map<string, user_type>
> )
> CREATE TYPE trigger_definition (
> name text,
> options map<tex, text>
> )
> CREATE TABLE tables (
> keyspace text,
> name text,
> id uuid,
> table_type text, // COMPACT, CQL or SUPER
> dropped_columns map<text, bigint>,
> triggers map<text, trigger_definition>,
> // options
> comment text,
> compaction map<text, text>,
> compression map<text, text>,
> read_repair_chance double,
> dclocal_read_repair_chance double,
> gc_grace_seconds int,
> caching text,
> rows_per_partition_to_cache text,
> default_time_to_live int,
> min_index_interval int,
> max_index_interval int,
> speculative_retry text,
> populate_io_cache_on_flush boolean,
> bloom_filter_fp_chance double
> memtable_flush_period_in_ms int,
> PRIMARY KEY (keyspace, name)
> )
> CREATE TYPE index_definition (
> name text,
> index_type text,
> options map<text, text>
> )
> CREATE TABLE columns (
> keyspace text,
> table text,
> name text,
> kind text, // PARTITION_KEY, CLUSTERING_COLUMN, REGULAR or COMPACT_VALUE
> component_index int;
> type text,
> indexes map<text, index_definition>,
> PRIMARY KEY (keyspace, table, name)
> )
> {noformat}
> Nit: wouldn't hurt to create a simple enum that is reuse by both CFMetaData
> and CFPropDefs for table options names while we're at it once they are the
> same instead of repeating string constants which is fragile.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)