[ https://issues.apache.org/jira/browse/CASSANDRA-3804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196176#comment-13196176 ]
Pavel Yaskevich commented on CASSANDRA-3804:
--------------------------------------------
This exception (taken from Sylvain's #2) explains what will happen when you
only partially migrate:
{noformat}
ERROR [GossipStage:1] 2012-01-30 14:35:13,363 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[GossipStage:1,5,main]
java.lang.UnsupportedOperationException: Not a time-based UUID
at java.util.UUID.timestamp(UUID.java:308)
at org.apache.cassandra.service.MigrationManager.updateHighestKnown(MigrationManager.java:121)
at org.apache.cassandra.service.MigrationManager.rectify(MigrationManager.java:99)
at org.apache.cassandra.service.MigrationManager.onAlive(MigrationManager.java:83)
at org.apache.cassandra.gms.Gossiper.markAlive(Gossiper.java:806)
at org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:849)
at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:908)
at org.apache.cassandra.gms.GossipDigestAckVerbHandler.doVerb(GossipDigestAckVerbHandler.java:68)
at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
{noformat}
Because we switched away from time-based UUIDs for schema versions,
MigrationManager on the old nodes will fail every time a node with the new
schema starts up, or whenever the new nodes request migrations from it (because
they see that their schema version differs from the others). Even if we fix the
MigrationManager.rectify(...) method for 1.0.x, nodes with the new and old
schemas will never come to agreement, because their UUIDs are of different
types and because they are no longer able to run schema mutations.
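To illustrate the failure in the trace above, here is a minimal standalone
sketch of how java.util.UUID.timestamp() behaves for time-based (version 1)
and non-time-based UUIDs. The class name, the helper methods, and the sample
UUID values are hypothetical and are not part of Cassandra; a name-based UUID
merely stands in for a new-style schema version, since the exact UUID type
used on trunk is not stated in this comment.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical illustration only; none of this is Cassandra code.
public class SchemaVersionUuidSketch {

    // java.util.UUID.timestamp() is only defined for version 1 UUIDs.
    static boolean isTimeBased(UUID id) {
        return id.version() == 1;
    }

    // Returns the UUID timestamp, or the fallback when the UUID is not
    // time-based, instead of letting timestamp() throw.
    static long timestampOrDefault(UUID id, long fallback) {
        return isTimeBased(id) ? id.timestamp() : fallback;
    }

    public static void main(String[] args) {
        // A version 1 (time-based) UUID, standing in for an old-style schema version.
        UUID timeBased = UUID.fromString("c4a1a2e0-4b3f-11e1-b86c-0800200c9a66");
        // A version 3 (name-based) UUID, standing in for a non-time-based schema version.
        UUID nameBased = UUID.nameUUIDFromBytes("schema".getBytes(StandardCharsets.UTF_8));

        System.out.println("time-based? " + isTimeBased(timeBased));
        System.out.println("time-based? " + isTimeBased(nameBased));

        try {
            nameBased.timestamp();
        } catch (UnsupportedOperationException e) {
            // Same exception type as in the GossipStage trace.
            System.out.println("timestamp() rejected: " + e.getMessage());
        }
    }
}
```

A guard of this kind would only keep the exception from being thrown; as noted
above, it would not by itself let old and new nodes agree on a schema version.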
> upgrade problems from 1.0 to trunk
> ----------------------------------
>
> Key: CASSANDRA-3804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3804
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 1.1
> Environment: ubuntu, cluster set up with ccm.
> Reporter: Tyler Patterson
> Assignee: Pavel Yaskevich
> Fix For: 1.1
>
>
> A 3-node cluster is on version 0.8.9, 1.0.6, or 1.0.7 and then one and only
> one node is taken down, upgraded to trunk, and started again. An rpc timeout
> exception happens if counter-add operations are done. It usually takes
> between 1 and 500 add operations before the failure occurs. The failure seems
> to happen sooner if the coordinator node is NOT the one that was upgraded.
> Here is the error:
> {code}
> ======================================================================
> ERROR: counter_upgrade_test.TestCounterUpgrade.counter_upgrade_test
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/usr/lib/pymodules/python2.7/nose/case.py", line 187, in runTest
>     self.test(*self.arg)
>   File "/home/tahooie/cassandra-dtest/counter_upgrade_test.py", line 50, in counter_upgrade_test
>     cursor.execute("UPDATE counters SET row = row+1 where key='a'")
>   File "/usr/local/lib/python2.7/dist-packages/cql/cursor.py", line 96, in execute
>     raise cql.OperationalError("Request did not complete within rpc_timeout.")
> OperationalError: Request did not complete within rpc_timeout.
> {code}
> A script has been added to cassandra-dtest (counter_upgrade_test.py) to
> demonstrate the failure. The newest version of CCM is required to run the
> test. It is available here if it hasn't yet been pulled:
> [email protected]:tpatterson/ccm.git