[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-10-12 Thread Steven Haufe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475081#comment-13475081
 ] 

Steven Haufe commented on CASSANDRA-4481:
-

I have seen the same data loss on restart with a fresh install of 
Cassandra 1.1.3.

I fixed it by upgrading to Cassandra 1.1.5 and then following the instructions in 
https://issues.apache.org/jira/browse/CASSANDRA-4481?focusedCommentId=13435597&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13435597
 to reload the data.




 Commitlog not replayed after restart - data lost
 

 Key: CASSANDRA-4481
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4481
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Single node cluster on 64Bit CentOS
Reporter: Ivo Meißner
Priority: Critical

 When data is written to the commitlog and I restart the machine, all committed 
 data that has not been flushed to disk is lost. 
 In the startup logs it says that it replays the commitlog successfully, but 
 the data is not available afterwards. 
 When I open the commitlog file in an editor I can see the added data, but 
 after the restart it cannot be fetched from Cassandra. 
 {code}
  INFO 09:59:45,362 Replaying /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
  INFO 09:59:45,476 Finished reading /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
  INFO 09:59:45,476 Log replay complete, 0 replayed mutations
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-10-12 Thread Florent Clairambault (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475383#comment-13475383
 ] 

Florent Clairambault commented on CASSANDRA-4481:
-

It doesn't work: it failed again a week ago on a 1.1.5 node that had been running 
for a little while.

First of all, this is a commitlog writing and/or reading issue, so flushing 
your data frequently (every hour, and in the stop command of the rc.d script) 
reduces the risk of large data losses. You can lose days of data if you don't 
do that. Restarting Cassandra and finding yourself 2 days in the past is a very 
unpleasant situation.
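
The flushing advice above could be scripted roughly as follows. This is a minimal sketch, not a tested init script: the `NODETOOL` and `SERVICE` variables and both function names are hypothetical, and it assumes `nodetool flush` and `service cassandra stop` behave as they do elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical helpers; override NODETOOL/SERVICE to match your install.
NODETOOL="${NODETOOL:-nodetool}"
SERVICE="${SERVICE:-service}"

# Flush all memtables to sstables so that nothing depends on commitlog replay.
flush_all() {
  "$NODETOOL" flush
}

# Flush first, and only stop the node if the flush succeeded.
flush_then_stop() {
  flush_all && "$SERVICE" cassandra stop
}
```

For the hourly part, an entry such as `0 * * * * nodetool flush` in the crontab of a user allowed to run nodetool would be one option; `flush_then_stop` is meant to be called from the stop action of the rc.d script.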

So here is the new process I applied to fix my data (which in fact amounts to 
restarting from scratch, except that we keep the data):
- Export the keyspace's schema
{code}
cassandra-cli -k ks > schema.txt <<EOF
show schema;
exit;
EOF
{code}
- Simplify the export (all CFs with key_validation_class set to AsciiType and 
default_validation_class set to UTF8Type, except the one CF that contains 
binary data, where I used BytesType).

I simplified an export like this:
{code}
create column family User
  with column_type = 'Standard'
  and comparator = 'AsciiType'
  and default_validation_class = 'UTF8Type'
  and key_validation_class = 'AsciiType'
  and read_repair_chance = 0.1
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
  and caching = 'KEYS_ONLY'
  and column_metadata = [
{column_name : 'domain',
validation_class : UTF8Type,
index_name : 'User_domain_idx',
index_type : 0},
{column_name : 'username',
validation_class : UTF8Type,
index_name : 'User_username',
index_type : 0}]
  and compression_options = {'sstable_compression' : 
'org.apache.cassandra.io.compress.SnappyCompressor'};
{code}

into something like this:
{code}
create column family User
  with column_type = 'Standard'
  and key_validation_class = 'AsciiType'
  and comparator = 'AsciiType'
  and default_validation_class = 'UTF8Type'
  and column_metadata = [
{column_name : 'domain', validation_class : UTF8Type, index_type : 0},
{column_name : 'username', validation_class : UTF8Type, index_type : 0}];
{code}

During this simplification, I discovered that some column families had an 
incorrect default_validation_class type, so that may be the cause. It seems 
strange that Cassandra could be confused this way, but this problem is 
indeed very strange...

- Stop Cassandra
- Move the keyspace folder somewhere else (mkdir backup; mv ks backup)
- Start Cassandra (having no keyspace folder is the same as having no data; 
it's not a problem).
- Delete the keyspace (I know deletion creates snapshots, which makes the move 
unnecessary, but it's easier to use sstableloader that way)
- Recreate the keyspace with the exported and simplified schema
- Use sstableloader to import the data:
{code}
cd backup; find ks -type d -exec sstableloader -d localhost {} \;
{code}

NOTE: Don't try to replay your commitlogs with the new schema; the 
column families won't have the same IDs.

Any empty Cassandra instance replays at least 1 mutation at startup because of 
the system keyspace. So I still think 0 replayed mutations should never occur, 
and if it does, there should be a warning. And if the cause is indeed a CF 
that doesn't fully exist, that should be reported at startup.

I hope we can find a way to reproduce it.
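
Until Cassandra itself warns about this, an external check on the startup log is one way to notice an empty replay. The following is only a sketch: the function names are hypothetical, the default log path is an assumption about the local logging setup, and the pattern is the log line quoted in the issue description. Note that a log file covering several restarts will match on any of them, and that an empty replay can be legitimate (for example after a keyspace was dropped on purpose).

```shell
#!/bin/sh
# Returns 0 if the log reports a replay that applied no mutations.
# The default path is an assumption; pass the real log file as $1.
replay_was_empty() {
  log_file="${1:-/var/log/cassandra/output.log}"
  grep -q 'Log replay complete, 0 replayed mutations' "$log_file"
}

# Prints a warning on stderr and returns 1 when an empty replay is found.
warn_on_empty_replay() {
  if replay_was_empty "$1"; then
    echo "WARNING: commitlog replay applied 0 mutations; check for lost writes" >&2
    return 1
  fi
  return 0
}
```

Running `warn_on_empty_replay` right after the node has started, for instance at the end of the start action of the init script, would surface the situation described in this issue instead of leaving it silent.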



[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-10-12 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475428#comment-13475428
 ] 

Jonathan Ellis commented on CASSANDRA-4481:
---

Commitlog failing to replay is CASSANDRA-4782.



[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-10-12 Thread Florent Clairambault (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475449#comment-13475449
 ] 

Florent Clairambault commented on CASSANDRA-4481:
-

Thank you. It makes a lot of sense (and the code is amazingly clear). That's 
why the only way to get the commitlogs read was to delete the previous 
one.

I deleted my last comment (about recreating the CF and reloading the data with 
sstableloader), as it won't help anyone. 

In the meantime, people upgrading Cassandra between 1.1.0 and 1.1.5 
can flush (before stopping the old Cassandra) and delete the commit logs (before 
starting the new Cassandra).
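
The upgrade sequence described above could be sketched as follows. Everything here is an assumption to adapt: the `run` and `upgrade_without_replay` helpers are hypothetical, `COMMITLOG_DIR` defaults to a commonly used Debian location, the `CommitLog-*.log` pattern follows the file name shown in the issue description, and installing the package is assumed to restart the service. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Assumed commitlog location; check commitlog_directory in cassandra.yaml.
COMMITLOG_DIR="${COMMITLOG_DIR:-/var/lib/cassandra/commitlog}"

# Print the command in dry-run mode (the default), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1: path to the package of the new Cassandra version.
upgrade_without_replay() {
  new_deb="$1"
  run nodetool flush || return 1            # 1. persist memtables as sstables
  run service cassandra stop || return 1    # 2. stop the old version
  run rm -f "$COMMITLOG_DIR"/CommitLog-*.log || return 1  # 3. delete commit logs
  run dpkg -i "$new_deb"                    # 4. install the new version
}
```

A dry run such as `upgrade_without_replay cassandra_1.1.5_all.deb` shows the four steps before anything is deleted. Moving the commit logs aside instead of deleting them would be a more cautious variant.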



[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436020#comment-13436020
 ] 

Jonathan Ellis commented on CASSANDRA-4481:
---

I'm not aware of commitlog format changes recently.  What specific versions are 
you saying are incompatible?





[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread Florent Clairambault (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436024#comment-13436024
 ] 

Florent Clairambault commented on CASSANDRA-4481:
-

Well, I spoke a little too fast here. They look incompatible to me, and this 
possible incompatibility (my term) seems to occur between 1.1.1 or 1.1.2 
and 1.1.3. 






[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436030#comment-13436030
 ] 

Jonathan Ellis commented on CASSANDRA-4481:
---

If you can reproduce commitlog incompatibility between those versions, please 
let us know.





[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread Florent Clairambault (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436033#comment-13436033
 ] 

Florent Clairambault commented on CASSANDRA-4481:
-

As I have something that now works fine, I don't think I will do it.





[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436346#comment-13436346
 ] 

Ivo Meißner commented on CASSANDRA-4481:


I have also created the broken keyspace with a version prior to 1.1.2 (I'm 
pretty sure it was 1.1.1). So maybe there is a commitlog incompatibility... 
I also ran into some schema changing issues with that keyspace. Maybe I 
destroyed the keyspace structure. 
But it would be nice to get some kind of error message if something goes wrong 
with the commitlogs. Everything else seems to work with the keyspace. You 
really don't notice until you wonder where the data is...





[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436350#comment-13436350
 ] 

Jonathan Ellis commented on CASSANDRA-4481:
---

But this is exactly the situation if you dropped the keyspace on purpose: 
commitlog will have data for CFs that don't exist anymore.  Not a good idea to 
panic users when things are working as designed.





[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread Florent Clairambault (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436450#comment-13436450
 ] 

Florent Clairambault commented on CASSANDRA-4481:
-

This bug is marked as resolved, so we're just documenting something that never 
happened. We're not scaring anyone here; we're making sure we have all the 
documentation to prove that we were wrong.

So, just to make things clear, I didn't change or delete anything in 
my keyspaces. The two keyspaces were created by code (one with pelops and one 
with hector) once and never changed. I know I said earlier that I created one 
with cassandra-cli, but it turns out that both were created entirely by code.

While doing some tests, I did delete the keyspaces, and in that case it gives 
an error that looks like: Commit logs for non-existing Column Family 1036 were 
ignored (I can't find the exact error in my logs). 

When I deleted the keyspace files, they were recreated by reading the commit 
logs (this is step 4 in my previous report). So I think they were in accordance 
with the schema stored in Cassandra.

--- 

I wanted to actually test it.

The only recent versions I could find were 1.0.11, 1.1.2 and 1.1.3. I created a 
small test script and it definitely works with them. But it would be good if we 
could have access to 1.1.1 to do the same data upgrade we did in the past.

{code}
#!/bin/sh
# Start from a clean machine: remove any existing install and its data
apt-get remove --purge cassandra -y
rm -Rf /var/log/cassandra /var/lib/cassandra

# Download the packages that are not already present
if [ ! -f cassandra_1.0.11_all.deb ]; then
  wget http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.0.11_all.deb
fi

if [ ! -f cassandra_1.1.2_all.deb ]; then
  wget http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.1.2_all.deb
fi

if [ ! -f cassandra_1.1.3_all.deb ]; then
  wget http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.1.3_all.deb
fi

# Block until the Thrift port (9160) accepts connections
wait_for_server() {
  while ! echo exit | nc localhost 9160; do sleep 1; done
}

# Install 1.0.11 and follow its output in the background
dpkg -i cassandra_1.0.11_all.deb
tail -f /var/log/cassandra/output.log &

wait_for_server

# Create the keyspace and column family, then write the first rows
cassandra-cli -h localhost <<EOF

create keyspace m2mp;
use m2mp;

create column family Registry
  with column_type = 'Standard'
  and comparator = 'AsciiType'
  and default_validation_class = 'AsciiType'
  and key_validation_class = 'AsciiType';

set Registry['/user/florent']['first']='Florent';
set Registry['/user/florent']['country']='France';
set Registry['/version']['1.0.11']='done';
EOF

cassandra-cli -h localhost -k m2mp <<EOF
list Registry;
exit;
EOF

# Upgrade to 1.1.2 and write a marker column
dpkg -i cassandra_1.1.2_all.deb

wait_for_server

cassandra-cli -h localhost -k m2mp <<EOF
set Registry['/version']['1.1.2']='done';
list Registry;
exit;
EOF

# Upgrade to 1.1.3 and write a marker column
dpkg -i cassandra_1.1.3_all.deb

wait_for_server

cassandra-cli -h localhost -k m2mp <<EOF
set Registry['/version']['1.1.3']='done';
list Registry;
exit;
EOF

# Restart to force a commitlog replay, then check that every row survived
service cassandra restart

wait_for_server

cassandra-cli -h localhost -k m2mp <<EOF
list Registry;
exit;
EOF
{code}

In the end I do have:
{quote}
---
RowKey: /user/florent
=> (column=country, value=France, timestamp=1345161343036000)
=> (column=first, value=Florent, timestamp=1345161342992000)
---
RowKey: /version
=> (column=1.0.11, value=done, timestamp=1345161343039000)
=> (column=1.1.2, value=done, timestamp=1345161366935000)
=> (column=1.1.3, value=done, timestamp=134516138976)
{quote}

So it's ok. But I would be pretty interested to see if we get the same result 
if we don't skip any version.
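
A variant of the test that walks through every version in order might look like the sketch below. It is untested and rests on several assumptions: a `.deb` for 1.1.1 has been obtained separately, the keyspace `m2mp` and the column family `Registry` already exist, and the package file names follow the pattern used by the script above. The names `VERSIONS`, `deb_file`, `marker_statement` and `upgrade_chain` are hypothetical.

```shell
#!/bin/sh
# Versions to install, oldest first (1.1.1 is the one skipped so far).
VERSIONS="${VERSIONS:-1.0.11 1.1.1 1.1.2 1.1.3}"

# Package file name for a given version.
deb_file() {
  echo "cassandra_${1}_all.deb"
}

# cassandra-cli statement that records a marker column for a version.
marker_statement() {
  echo "set Registry['/version']['${1}']='done';"
}

# Block until the Thrift port (9160) accepts connections.
wait_for_server() {
  while ! echo exit | nc localhost 9160; do sleep 1; done
}

# Install each version in turn, write a marker, restart, and list the rows.
upgrade_chain() {
  for v in $VERSIONS; do
    dpkg -i "$(deb_file "$v")" || return 1
    wait_for_server
    { marker_statement "$v"; echo "exit;"; } |
      cassandra-cli -h localhost -k m2mp || return 1
    service cassandra restart || return 1   # force a commitlog replay
    wait_for_server
    printf 'list Registry;\nexit;\n' |
      cassandra-cli -h localhost -k m2mp || return 1
  done
}
```

Restarting after each write means a marker missing from the following `list Registry;` output would point at the version pair where replay breaks.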





[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436515#comment-13436515
 ] 

Jonathan Ellis commented on CASSANDRA-4481:
---

1.1.1 is available here: http://archive.apache.org/dist/cassandra/1.1.1/





[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-15 Thread Florent Clairambault (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435414#comment-13435414
 ] 

Florent Clairambault commented on CASSANDRA-4481:
-

I'm on cassandra 1.1.3.

I have the same problem here on two different keyspaces that I created in 
different ways at different times (one with cassandra-cli and one with the hector 
library). Both were created at least 6 months ago, without any special 
configuration parameters, as we're on a single host (at this point).

* If some keyspaces don't exist anymore, it would be nice to report it 
somewhere, so that we have an idea of what to fix.
* It really doesn't look like these keyspaces don't exist anymore: everything 
works, including flushing them. The only thing that doesn't work is replaying 
logs.






[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-15 Thread Florent Clairambault (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435597#comment-13435597
 ] 

Florent Clairambault commented on CASSANDRA-4481:
-

So, I didn't find (or try to find) a way to reproduce this bug, but I 
found a fix. I'm on Debian 6.0.5, still with Cassandra 1.1.3.

For a keyspace named dom2dom (so that I don't have to replace any names):

# In my case I removed all the commitlog files that were created prior to 1.1.3

# 1. Flush cassandra
nodetool flush

# 2. Stop cassandra
service cassandra stop

# 3. Move the sstable files to another directory
mkdir /var/lib/cassandra/toload
mv /var/lib/cassandra/data/dom2dom /var/lib/cassandra/toload/dom2dom

# In my case, I had to create a 127.0.0.2 loopback interface
# and update the cassandra.yaml file to change the rpc_address and
# listen_address settings to 127.0.0.2 so that sstableloader could work.

# 4. Start cassandra
service cassandra start

# At that point the commitlogs should work again and you should have some new
# sstables created
du -sh /var/lib/cassandra/data/dom2dom
# Returns: 236K

# You now have the new data and not the old data, so you need to load the old
# data using sstableloader:
find /var/lib/cassandra/toload/dom2dom/ -type d -exec sstableloader -d 127.0.0.2 {} \;

# In my case, I had to put localhost back in cassandra.yaml for the
# rpc_address and listen_address settings

# You can delete the /var/lib/cassandra/toload folder

IMPORTANT NOTE:
I'm not sure that putting back the old (prior to 1.1.3) commitlog files will 
work. From what I've quickly tested, it doesn't.

Still, I think there should be more information in the logs to understand what 
is happening. It seems very strange that the system could get stuck like this 
without a single error message.







[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-03 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427889#comment-13427889
 ] 

Ivo Meißner commented on CASSANDRA-4481:


I can reproduce the bug as follows:

1. I insert data with my client into a column family.
2. When I select the data afterwards with a cassandra client, the data is 
returned.
{code}
get comment['1|3488da80-1dd5-11b2-aff8-030772c33eed'];
=> (super_column=34942d20-1dd5-11b2-bfef-3f53095dd669,
 (column=added, value=1343979036, timestamp=1343979036707674)
 (column=id, value=34942d20-1dd5-11b2-bfef-3f53095dd669, timestamp=1343979036707674)
 (column=itemId, value=3488da801dd511b2aff8030772c33eed, timestamp=1343979036707674)
 (column=text, value=Comment, timestamp=1343979036707674)
 (column=typeId, value=1, timestamp=1343979036707674)
 (column=userId, value=4ab5fcb6753a8021ae02, timestamp=1343979036707674))
Returned 1 results.
Elapsed time: 6 msec(s).
{code}
3. Then I restart the machine
4. When I start cassandra again, I get the following output 
{code}
 INFO 09:33:56,857 Log replay complete, 0 replayed mutations
{code}
5. I select the exact same row and get no results, so the data I inserted 
before is gone.
{code}
get comment['1|3488da80-1dd5-11b2-aff8-030772c33eed'];
Returned 0 results.
Elapsed time: 120 msec(s).
{code}

I tried to reproduce it with a newly created keyspace and column family but 
haven't been able to so far. In the other keyspace I can reproduce it 
consistently, and it happens on all column families. 
Any suggestions on what I can try to narrow it down?
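
When repeating these steps across several restarts, the replay count can be pulled out of the startup log automatically. The following is only a sketch: the function name replayed_mutations is hypothetical, and it relies on nothing but the "Log replay complete, N replayed mutations" line format shown above.

```shell
#!/bin/sh
# Print the number of replayed mutations for each
# "Log replay complete" line read from standard input.
# replayed_mutations is a hypothetical helper, not part of Cassandra.
replayed_mutations() {
    sed -n 's/.*Log replay complete, \([0-9][0-9]*\) replayed mutations.*/\1/p'
}
```

Pipe the startup log into it, for example replayed_mutations < /var/log/cassandra/system.log (the log path is an assumption and depends on the installation). A value of 0 after writes that were never flushed is the symptom described in this issue.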

 Commitlog not replayed after restart - data lost
 

 Key: CASSANDRA-4481
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4481
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Single node cluster on 64Bit CentOS
Reporter: Ivo Meißner
Priority: Critical
 Fix For: 1.1.3






[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-03 Thread sunjian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427904#comment-13427904
 ] 

sunjian commented on CASSANDRA-4481:


I need version 1.1.3 to fix the "schema no longer modifiable" bug. When will 
1.1.3 be released? Please confirm whether this issue is a bug or not.

 Commitlog not replayed after restart - data lost
 

 Key: CASSANDRA-4481
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4481
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Single node cluster on 64Bit CentOS
Reporter: Ivo Meißner
Priority: Critical
 Fix For: 1.1.3






[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-03 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427970#comment-13427970
 ] 

Ivo Meißner commented on CASSANDRA-4481:


I am still trying to narrow it down. I have created a new keyspace 
(testkeyspace) with the same configuration and structure. 
When I only use the testkeyspace, the error does not occur; everything in the 
commitlog is available after reboot: 

The following works as expected: 
1. Insert dataA in testkeyspace
2. Reboot - 1 replayed mutations
3. Get dataA returns data as expected

The following does not work:
1. Insert dataA in testkeyspace
2. Get dataA from testkeyspace - returns data as expected
3. Insert dataB in brokenkeyspace
4. Get dataB from brokenkeyspace - returns data as expected
5. Reboot - 0 replayed mutations
6. Get dataA from testkeyspace - NO DATA
7. Get dataB from brokenkeyspace - NO DATA

So it seems to have something to do with the broken keyspace. I don't know 
yet how to get a keyspace into that state, so any input on how I can figure 
it out or what I could try would be appreciated.

I have changed the Fix-Version to 1.1.4. 

 Commitlog not replayed after restart - data lost
 

 Key: CASSANDRA-4481
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4481
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Single node cluster on 64Bit CentOS
Reporter: Ivo Meißner
Priority: Critical
 Fix For: 1.1.4






[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-03 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13428102#comment-13428102
 ] 

Jonathan Ellis commented on CASSANDRA-4481:
---

(1.1.3 is planned for Monday release; in the meantime, artifacts are at 
http://people.apache.org/~slebresne/.)

 Commitlog not replayed after restart - data lost
 

 Key: CASSANDRA-4481
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4481
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Single node cluster on 64Bit CentOS
Reporter: Ivo Meißner
Priority: Critical





[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427630#comment-13427630
 ] 

Jonathan Ellis commented on CASSANDRA-4481:
---

Please give full instructions on how to reproduce.  0 replayed mutations 
means all the data was flushed to sstables before restart...

 Commitlog not replayed after restart - data lost
 

 Key: CASSANDRA-4481
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4481
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Single node cluster on 64Bit CentOS
Reporter: Ivo Meißner
Priority: Critical
 Fix For: 1.1.3

