[ 
https://issues.apache.org/jira/browse/CASSANDRA-4988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788002#comment-13788002
 ] 

Sylvain Lebresne commented on CASSANDRA-4988:
---------------------------------------------

Each separate collection already has its own entry in schema_columns, and 
we have all the information there, so I don't think we need a new table here. 
The information is already redundant. It's just that because the comparator 
object needs to know the collections (to correctly implement 
AbstractType.compareCollectionMembers), we currently include the collection 
names in the comparator's serialized form. We should stop doing that, since we 
already have all the information we need.

In fact, in 2.0 the 'comparator' field in schema_columnfamilies is entirely 
useless: all the information it contains can be reconstructed from 
schema_columns. So the right solution is probably to stop saving that field 
altogether and to reconstruct it from schema_columns instead. That would also 
solve the other problem I discussed above, the "concurrent modification of 
comparator components".

Of course, we'd need to be careful with backward compatibility if we do so.
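
As a rough illustration of the reconstruction idea, here is a minimal Python 
sketch (not Cassandra code). The row layout, the shortened type names and the 
collection_types_from_columns helper are all hypothetical; the point is only 
that the name->type map can be derived from per-column rows instead of being 
parsed out of a serialized comparator string:

```python
# Hypothetical, simplified stand-in for rows of schema_columns:
# one row per column, each written independently of the others.
# Type names are shortened for readability (assumption, not the real format).
schema_columns = [
    {"column_name": "id", "validator": "UUIDType"},
    {"column_name": "tags", "validator": "SetType(UTF8Type)"},
    {"column_name": "scores", "validator": "MapType(UTF8Type,Int32Type)"},
]

COLLECTION_PREFIXES = ("ListType(", "SetType(", "MapType(")


def collection_types_from_columns(rows):
    """Rebuild the name -> collection type map from per-column rows."""
    return {
        row["column_name"]: row["validator"]
        for row in rows
        if row["validator"].startswith(COLLECTION_PREFIXES)
    }


collections = collection_types_from_columns(schema_columns)
print(collections)
```

Because each collection lives in its own row, nothing in this sketch depends 
on a single shared value that concurrent schema updates would have to agree on.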

> Fix concurrent addition of collection columns
> ---------------------------------------------
>
>                 Key: CASSANDRA-4988
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4988
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 2.0.2
>
>
> It is currently not safe to update the schema by adding multiple collection 
> columns to the same table. The reason is that with collections, the 
> comparator embeds a map of names->comparator for each collection column 
> (since different maps can have different key types, for example). And when 
> serialized on disk in the schema table, the comparator is serialized as a 
> string with that map as one column. So if new collection columns are added 
> concurrently, the additions may not be merged correctly.
> One option to fix this would be to stop serializing the names->comparator map 
> of ColumnToCollectionType in toString(), and do one of:
> # reconstruct that map from the information stored in schema_columns. The 
> downside I can see is that code-wise this may not be super clean to do.
> # change ColumnToCollectionType so that instead of having its own 
> names->comparator map, it just stores a pointer to the CFMetaData that 
> contains it, and when it needs to find the exact comparator for a collection 
> column, it uses CFMetaData.column_metadata directly. The downside is that 
> creating a dependency from a comparator to a CFMetaData feels a bit backward.
> Not sure which is the best solution of the two, honestly.
> While probably more anecdotal, we also now allow changing the type of the 
> comparator in some cases (for example, updating to BytesType is always 
> allowed), and doing so concurrently on multiple components of a composite 
> comparator is also not safe for a similar reason. I'm not sure how to fix 
> that one.
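
The lost-update problem described in the quoted issue can be sketched with a 
small Python simulation. This is an illustrative model only: it assumes cells 
are reconciled last-write-wins by timestamp, and the serialized strings, 
timestamps and the lww_merge helper are made up for the example.

```python
def lww_merge(cell_a, cell_b):
    """Last-write-wins: keep the (timestamp, value) cell with the higher timestamp."""
    return max(cell_a, cell_b, key=lambda cell: cell[0])


# Case 1: the whole names->comparator map is serialized into one cell.
# Node A adds collection "tags"; node B concurrently adds "scores".
# Each node only knows about its own addition (hypothetical string format).
cell_from_a = (101, "ColumnToCollectionType(tags:SetType)")
cell_from_b = (102, "ColumnToCollectionType(scores:MapType)")

merged_single_cell = lww_merge(cell_from_a, cell_from_b)
print(merged_single_cell[1])  # only one node's map survives the merge

# Case 2: one row per column. The two additions touch different rows,
# so merging row by row keeps both of them.
rows_from_a = {"tags": (101, "SetType")}
rows_from_b = {"scores": (102, "MapType")}

merged_rows = {}
for rows in (rows_from_a, rows_from_b):
    for name, cell in rows.items():
        if name in merged_rows:
            merged_rows[name] = lww_merge(merged_rows[name], cell)
        else:
            merged_rows[name] = cell

print(sorted(merged_rows))
```

In the first case the merge keeps a single serialized map and drops the other 
node's collection; in the second, both collections remain after the merge.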



--
This message was sent by Atlassian JIRA
(v6.1#6144)
