[jira] [Commented] (CASSANDRA-3794) Avoid ID conflicts from concurrent schema changes

Sylvain Lebresne (JIRA) Fri, 25 May 2012 02:18:30 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283235#comment-13283235
 ]


Sylvain Lebresne commented on CASSANDRA-3794:
---------------------------------------------

Looking more closely, there are actually two problems with respect to rolling 
upgrades:
# because a newly created CF won't have an old-format id, people shouldn't 
create any CF in a mixed-version cluster. That would clearly be fine for a 
major upgrade, but it's more annoying to roll this out in a minor upgrade imo. 
I don't think there is anything we can do about that.
# as is, streaming won't work in a mixed-version cluster (as is the case in a 
major upgrade) by virtue of the following code in IncomingTcpConnection:
{noformat}
if (version == MessagingService.version_)
{
    ....
}
else
{
    // streaming connections are per-session and have a fixed version.  we can't do anything with a wrong-version stream connection, so drop it.
    logger.error("Received stream using protocol version {} (my version {}). Terminating connection",
                 version, MessagingService.version_);
}
{noformat}
We could avoid that by, say, adding an isStreamingCompatible(v1, v2) method 
that would return true for VERSION_11 and VERSION_111, since after all there is 
no change to the stream format. However, the patch also needs to version 
StreamRequestMessage correctly for that to work.
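
A minimal sketch of what such a compatibility check could look like. Only the idea that VERSION_11 and VERSION_111 share a stream format comes from the comment above. The class name StreamingCompat, the helper streamFormatOf, and the numeric values of the constants are placeholders for illustration, not taken from the patch or from MessagingService.

```java
final class StreamingCompat
{
    // Placeholder values (assumption); the real constants would live in MessagingService.
    static final int VERSION_11 = 4;
    static final int VERSION_111 = 5;

    private StreamingCompat() {}

    // Hypothetical helper: maps a messaging version to the stream format it uses.
    static int streamFormatOf(int version)
    {
        // VERSION_111 does not change the stream format, so it shares the VERSION_11 one.
        return version == VERSION_111 ? VERSION_11 : version;
    }

    // Two versions can stream to each other when they use the same stream format.
    static boolean isStreamingCompatible(int v1, int v2)
    {
        return streamFormatOf(v1) == streamFormatOf(v2);
    }
}
```

With something like this, the check in IncomingTcpConnection could accept a stream when isStreamingCompatible(version, MessagingService.version_) holds instead of requiring strict equality, while any other version mismatch would still be dropped.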

Overall, this is not a small patch, and it will induce more limited rolling 
upgrade behavior than is the norm in a minor version, so I'll admit I'm 
personally growing more in favor of solution #2 above (postpone to 1.2).

That being said, on the patch itself:
* In RowCacheKey.compareTo(), == is used instead of equals().
* In Schema, we can remove the cfIdGen field && MIN_CF_ID.
* nameUUIDFromBytes already does an MD5 internally, so we should just pass the 
concatenation of ksName and cfName bytes (doubling the MD5 slightly increases 
the chance of collisions).
* When writing the schema, for the "id" column, the code writes a string/UUID 
(toSchemaNoColumns) but expects an int when reading (fromSchemaNoColumns). The 
fact is, we don't need to save the new-style id in the schema since we can 
recompute it. So we should keep the "id" column for the oldId (if it exists). 
Also, when writing a CF schema, we should check whether it has an associated 
old cfId and write it if so (i.e. we should preserve the old id mapping, when 
it exists, for now; we'll drop that in a future version).
* Schema.addOldCfIdMapping should check for a null oldId and ignore it, since 
in fromSchemaNoColumns, result.getInt("id") will return null for new CFs.
* ColumnFamilySerializer needs to make its serialization version-aware when we 
talk to old nodes (same for the RowMutation serialize method). Of course, when 
a CF doesn't have an old id, we'll have to throw an exception instead (this is 
the 'users shouldn't create CFs in a mixed cluster' case).
* StreamRequestMessage should version cfId correctly.
* In SchemaLoader, I'm not sure we want to always assign an old-style id to the 
CF. Instead, it would probably be better to add a few specific tests 
(serialization tests?) that validate that old ids are correctly handled.
* OCD nit: convertOldCFId could be renamed to convertOldCfId for consistency 
with the rest (i.e. 'F' could be lowercased) :P
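
To illustrate the nameUUIDFromBytes point above, here is a minimal sketch of deriving the id directly from the concatenated keyspace and column family name bytes. The class name CfIdSketch, the method name cfIdFor, and the choice of UTF-8 are assumptions for illustration, not the patch's actual code. java.util.UUID.nameUUIDFromBytes is the standard JDK method and produces a type 3 (MD5-based) UUID.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

final class CfIdSketch
{
    private CfIdSketch() {}

    // Hypothetical helper: derives a deterministic id from the keyspace and CF names.
    static UUID cfIdFor(String ksName, String cfName)
    {
        byte[] ks = ksName.getBytes(StandardCharsets.UTF_8);
        byte[] cf = cfName.getBytes(StandardCharsets.UTF_8);

        // Concatenate the raw name bytes: ksName first, then cfName.
        byte[] both = new byte[ks.length + cf.length];
        System.arraycopy(ks, 0, both, 0, ks.length);
        System.arraycopy(cf, 0, both, ks.length, cf.length);

        // nameUUIDFromBytes applies MD5 itself, so the input is not pre-hashed.
        return UUID.nameUUIDFromBytes(both);
    }
}
```

Because the id is a pure function of the two names, any node can recompute it, which is why the new-style id would not need to be stored in the schema.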

                
> Avoid ID conflicts from concurrent schema changes
> -------------------------------------------------
>
>                 Key: CASSANDRA-3794
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3794
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Pavel Yaskevich
>            Assignee: Pavel Yaskevich
>             Fix For: 1.1.1
>
>         Attachments: CASSANDRA-3794.patch
>
>
> Change ColumnFamily identifiers to be UUIDs instead of sequential Integers. 
> Would be useful in the situation when nodes simultaneously trying to create 
> ColumnFamilies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
