[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475383#comment-13475383
 ] 

Florent Clairambault commented on CASSANDRA-4481:
-------------------------------------------------

It doesn't work: it failed again a week ago on a 1.1.5 node that had been 
running for a little while.

First of all, it's a commit log writing and/or reading issue, so if you flush 
your data frequently (every hour, and in the stop command of the rc.d script) 
you reduce the risk of big data losses. If you don't, you can lose days of 
data. Restarting Cassandra and finding yourself 2 days in the past is a very 
unpleasant situation.
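
A minimal sketch of such a flush step, assuming nodetool is on the PATH and 
the node is reachable on localhost. The function name flush_node and the 
NODETOOL override are hypothetical, introduced only for illustration:

```shell
# Path to nodetool; overridable so the function can be dry-run (assumption).
NODETOOL="${NODETOOL:-nodetool}"

# Flush all memtables to SSTables on the given host (default: localhost),
# so that less data depends on a successful commit log replay.
flush_node() {
    "$NODETOOL" -h "${1:-localhost}" flush
}
```

Calling flush_node from an hourly cron job and from the stop action of the 
rc.d script, just before the process is stopped, would cover both cases 
mentioned above.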

So here is the new process I applied to fix my data (which in fact amounts to 
restarting from scratch, except that we keep the data):
- Export the keyspace's schema
{code}
cassandra-cli -k ks >schema.txt <<EOF 
show schema;
exit;
EOF
{code}
- Simplify the export (all CFs with key_validation_class set to AsciiType and 
default_validation_class set to UTF8Type, except the CF that contains binary 
data, where I used BytesType).

I simplified an export like this:
{code}
create column family User
  with column_type = 'Standard'
  and comparator = 'AsciiType'
  and default_validation_class = 'UTF8Type'
  and key_validation_class = 'AsciiType'
  and read_repair_chance = 0.1
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
  and caching = 'KEYS_ONLY'
  and column_metadata = [
    {column_name : 'domain',
    validation_class : UTF8Type,
    index_name : 'User_domain_idx',
    index_type : 0},
    {column_name : 'username',
    validation_class : UTF8Type,
    index_name : 'User_username',
    index_type : 0}]
  and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};
{code}

To something like this:
{code}
create column family User
  with column_type = 'Standard'
  and key_validation_class = 'AsciiType'
  and comparator = 'AsciiType'
  and default_validation_class = 'UTF8Type'
  and column_metadata = [
    {column_name : 'domain', validation_class : UTF8Type, index_type : 0},
    {column_name : 'username', validation_class : UTF8Type, index_type : 0}];
{code}

During this simplification process, I discovered that some 
default_validation_class settings had an incorrect type, so maybe the problem 
comes from that. It seems strange that we could "confuse" Cassandra this way, 
but then this problem is indeed very strange...

- Stop Cassandra
- Move the keyspace folder somewhere else (mkdir backup; mv <ks> backup)
- Start Cassandra (not having a keyspace folder is like not having any data; 
it's not a problem).
- Delete the keyspace (I know deletion creates snapshots and moving is 
unnecessary, but it's easier to use sstableloader that way)
- Recreate the keyspace with the exported and simplified schema
- Use sstableloader to import data:
{code}
cd backup; find <ks> -type d -exec sstableloader -d localhost {} \;
{code}

NOTE: Don't try to replay your commit logs against the new schema: the 
column families won't have the same IDs.

Even an empty Cassandra instance replays at least 1 mutation at startup because 
of the "system" keyspace. So I still think 0 replayed mutations should never 
occur, and if it does, we should get a warning about it. And if the cause is 
indeed "a CF that doesn't fully exist", that should be reported at startup.

I hope we can find a way to reproduce it.
                
> Commitlog not replayed after restart - data lost
> ------------------------------------------------
>
>                 Key: CASSANDRA-4481
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4481
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.1.2
>         Environment: Single node cluster on 64Bit CentOS
>            Reporter: Ivo Meißner
>            Priority: Critical
>
> When data is written to the commitlog and I restart the machine, all committed 
> data is lost that has not been flushed to disk. 
> In the startup logs it says that it replays the commitlog successfully, but 
> the data is not available then. 
> When I open the commitlog file in an editor I can see the added data, but 
> after the restart it cannot be fetched from cassandra. 
> {code}
>  INFO 09:59:45,362 Replaying 
> /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
>  INFO 09:59:45,476 Finished reading 
> /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
>  INFO 09:59:45,476 Log replay complete, 0 replayed mutations
> {code}
