[ 
https://issues.apache.org/jira/browse/CASSANDRA-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Shuler resolved CASSANDRA-5924.
---------------------------------------

    Resolution: Not a Problem

Closing as not a problem: the failure was caused by unexpected files in a 
directory where only Cassandra data is expected. Feel free to re-open with 
concrete reproduction steps on the latest 1.2.x or 2.0.x release if you would 
like to pursue this further.  Thanks!

> If migration (upgrade) failed mid-way, some data will be "lost" on the 
> upgraded instance
> ----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5924
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5924
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jackson Chung
>
> When upgrading from 1.0 to 1.1, C* checks the system keyspace 
> (schema_keyspaces) to see whether a migration is needed.
> If it is, C* proceeds with migrateSSTables.
> This process follows no particular order (File.listFiles() does not 
> guarantee any ordering), and an IOException can be thrown (e.g. on failure 
> to create a directory).
> In some of our upgrades, system was migrated first, followed by some KSs/CFs, 
> but before all the KSs/CFs were finished, the migration failed on a custom 
> directory containing files whose names resemble the sstable file convention 
> (they contain "-"). 
> Those files really shouldn't be there, and we are removing them. But because 
> of them, C* tried to create a directory for such a file and failed with an 
> IOException because of ownership/permissions. As a result, C* failed to start.
> Before anyone knew why C* had failed to start, C* was restarted. This time 
> C* did not think it needed to migrate any more (system had already been 
> migrated, so schema_keyspaces existed). As a result, the remaining KSs/CFs 
> were never migrated.
> Our root cause was the custom directory and its ownership/permissions, and 
> again, we are removing it so that we can re-run the upgrade. But the point of 
> this jira is that an IOException (or any other exception) can still be thrown 
> for various reasons during this process and lead to the same problem: some 
> CFs fail to be migrated.
> 1.2 seems to have some handling code, but it looks like a RuntimeException 
> would still be thrown, and that would still be caught by 
> AbstractCassandraDaemon (or CassandraDaemon in 1.2):
> {code}
>         catch (Throwable e)
>         {
>             logger.error("Exception encountered during startup", e);
>             // try to warn user on stdout too, if we haven't already detached
>             e.printStackTrace();
>             System.out.println("Exception encountered during startup: "
>                                + e.getMessage());
>             System.exit(3);
>         }
> {code}
> And so I think this problem still exists in 1.2.
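
The failure mode described above (a "migration needed" check that stops 
reporting true as soon as one keyspace has moved) can be illustrated with a 
small, self-contained sketch. This is not Cassandra's migration code: the class 
ResumableMigration, the Step interface, and the marker file names 
"migration.complete" and ".migrated" are all hypothetical, chosen only to show 
one way a restart could resume a partially completed migration. The sketch 
sorts the directory listing so that the order is deterministic, and writes the 
completion marker only after every directory has succeeded.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical sketch of a resumable migration; not Cassandra's actual code.
final class ResumableMigration {
    // Hypothetical marker, written only after every directory has migrated.
    static final String MARKER = "migration.complete";
    // Hypothetical per-directory flag, written after that directory succeeds.
    static final String DONE = ".migrated";

    /** Stand-in for the real per-directory work (e.g. moving sstables). */
    interface Step {
        void migrate(Path dir) throws IOException;
    }

    /** Migration stays "needed" until the completion marker exists. */
    static boolean migrationNeeded(Path root) {
        return !Files.exists(root.resolve(MARKER));
    }

    /**
     * Migrates every not-yet-migrated subdirectory of root in sorted order and
     * returns the names migrated during this call. If a step throws, the
     * exception propagates and the completion marker is not written, so a
     * restart still sees migrationNeeded(root) == true.
     */
    static List<String> migrateAll(Path root, Step step) throws IOException {
        List<Path> dirs = new ArrayList<>();
        try (Stream<Path> children = Files.list(root)) {
            children.filter(Files::isDirectory).forEach(dirs::add);
        }
        // Directory listings come back in no guaranteed order, so sort them.
        Collections.sort(dirs);

        List<String> migrated = new ArrayList<>();
        for (Path dir : dirs) {
            Path done = dir.resolve(DONE);
            if (Files.exists(done)) {
                continue; // finished during an earlier, interrupted run
            }
            step.migrate(dir);
            Files.createFile(done);
            migrated.add(dir.getFileName().toString());
        }
        Files.createFile(root.resolve(MARKER));
        return migrated;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("migration-demo");
        for (String name : new String[] {"system", "ks1", "ks2"}) {
            Files.createDirectory(root.resolve(name));
        }
        try {
            // First start: simulate a failure (e.g. a permission problem) on ks2.
            migrateAll(root, dir -> {
                if (dir.getFileName().toString().equals("ks2")) {
                    throw new IOException("simulated failure on " + dir.getFileName());
                }
            });
        } catch (IOException e) {
            System.out.println("first start failed: " + e.getMessage());
        }
        System.out.println("still needed: " + migrationNeeded(root));
        // Restart: only the directories that were not finished are migrated.
        System.out.println("migrated on restart: " + migrateAll(root, dir -> { }));
        System.out.println("still needed: " + migrationNeeded(root));
    }
}
```

In this sketch a caller is expected to check migrationNeeded() before calling 
migrateAll(), since the marker is created with Files.createFile, which fails 
if the file already exists. Whether a per-directory flag or some other 
bookkeeping would suit Cassandra itself is left open here.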



--
This message was sent by Atlassian JIRA
(v6.2#6252)
