Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "LiveSchemaUpdates" page has been changed by gdusbabek.
The comment on this change is: Clarified under which situations manual cleanup is required after crashing during a migration.
http://wiki.apache.org/cassandra/LiveSchemaUpdates?action=diff&rev1=5&rev2=6

--------------------------------------------------

  === Server Migration Process ===
  Applying a migration consists of the following steps:
   1. Generate the migration, which includes a new version UUID.
+   a. `DROP`s only: snapshot the data that is going away.
   2. Update `SCHEMA_CF` with a new schema row.
   3. Update `MIGRATION_CF` by appending a migration column.
   4. Update the `"Last Migration"` row in `SCHEMA_CF`.
   5. Flush the definitions table.
-  6. Update runtime data structures (create directories, etc.)
+  6. Update runtime data structures (create directories, do deletions, etc.)

  === Handling Failure ===
  A node can fail during any step of the update process. Here is an examination of what will happen if a node fails after each part of the update process (see Server Migration Process above).
   1. Nothing has been applied. Update fails outright.
+   a. Same. You will have an extra snapshot though.
   2. Extra data exists in SCHEMA_CF but will be ignored because "Last Migration" was not updated.
   3. Extra data exists in SCHEMA_CF and MIGRATION_CF but will be ignored because "Last Migration" was not updated.
   4. '''Broken''': commit log will not be replayed until *after* schemas are loaded on restart. This means that the "Last Migration" will be read, but cannot be loaded and applied.
-  5. Startup will happen normally.
+  5. Startup will happen normally.
   6. Startup will happen normally.

+ If a node crashes during a migration, chances are you will have to do some manual cleanup. For example, if a node crashes after steps 4 or 5 of a `DROP` migration, you will need to manually delete the data files. (Not deleting them does no harm unless you 'recreate' the same CF via `ADD` later on. Then you have an instant database.)

  === Starting Up ===
  When a node starts up, it checks `SCHEMA_CF` to find out the latest schema version it has. If it finds nothing (as would happen with a new cluster), it loads nothing and logs a warning. Otherwise, it uses the UUID it just read in to load the correct row from `SCHEMA_CF`. That row is deserialized into one or more keyspace definitions which are then loaded in a manner similar to the load-from-XML approach used in the past.
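
The pointer-based lookup described above can be illustrated with a small sketch. This is not Cassandra code: it is a toy in-memory model, written in Python, and the names `apply_migration`, `load_schema_on_startup`, and the `stop_after` parameter are hypothetical. It covers only steps 1 through 4 and the startup read, and it does not model the commit log, so the '''Broken''' case of step 4 is out of scope. What it tries to show is why rows left behind by a crash after step 2 or 3 are ignored: startup only follows the "Last Migration" pointer.

```python
import uuid

# Row name taken from the page; everything else here is a hypothetical model.
LAST_MIGRATION_KEY = "Last Migration"


def apply_migration(schema_cf, migration_cf, keyspaces, stop_after=None):
    """Run steps 1-4 of a migration; stop_after imitates a crash after that step."""
    version = uuid.uuid4()                   # step 1: new version UUID
    if stop_after == 1:
        return version
    schema_cf[version] = keyspaces           # step 2: new schema row
    if stop_after == 2:
        return version
    migration_cf.append(version)             # step 3: append a migration column
    if stop_after == 3:
        return version
    schema_cf[LAST_MIGRATION_KEY] = version  # step 4: move the pointer
    return version


def load_schema_on_startup(schema_cf):
    """Return the keyspace definitions a node would load, or None if there are none."""
    version = schema_cf.get(LAST_MIGRATION_KEY)
    if version is None:
        print("warning: no schema version found, loading nothing")
        return None
    return schema_cf[version]


# A new cluster has no schema, so startup loads nothing and warns.
schema_cf, migration_cf = {}, []
load_schema_on_startup(schema_cf)

# A completed migration becomes the schema that startup loads.
apply_migration(schema_cf, migration_cf, {"Keyspace1": ["Standard1"]})

# A crash after step 3 leaves extra rows behind, but the pointer did not move,
# so startup still loads the definitions from the first migration.
apply_migration(schema_cf, migration_cf, {"Keyspace1": []}, stop_after=3)
current = load_schema_on_startup(schema_cf)
```

In this model the orphaned schema row and migration column stay in place after the simulated crash, which mirrors the "extra data exists ... but will be ignored" wording in the failure list.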
