Re: Possibly losing data with corrupted SSTables

2014-03-14 Thread Robert Coli
On Wed, Feb 12, 2014 at 9:20 AM, Francisco Nogueira Calmon Sobral 
fsob...@igcorp.com.br wrote:

 I've removed the corrupted sstables and 'nodetool repair' ran successfully
 for the column family. I'm not sure whether or not we've lost data.


If you read/write at CL.ONE, there is a non-zero chance that you have lost
data. In practice, this chance is pretty low unless you constantly drop
mutation messages or have a RF of under 3.
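
To make the arithmetic behind that concrete, here is a minimal sketch. It is not Cassandra code, and the function names are made up for illustration. It only encodes the usual rule that a read is guaranteed to overlap the latest acknowledged write when R + W > RF.

```python
def replicas_required(level: str, rf: int) -> int:
    """Replicas that must acknowledge an operation at a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return rf // 2 + 1
    if level == "ALL":
        return rf
    raise ValueError(f"unsupported consistency level: {level}")


def overlap_guaranteed(read_level: str, write_level: str, rf: int) -> bool:
    """True when every read set must intersect every write set (R + W > RF)."""
    needed = replicas_required(read_level, rf) + replicas_required(write_level, rf)
    return needed > rf


if __name__ == "__main__":
    # Compare CL.ONE and CL.QUORUM for a few replication factors.
    for rf in (1, 2, 3):
        for level in ("ONE", "QUORUM"):
            print(f"RF={rf} CL.{level}: overlap={overlap_guaranteed(level, level, rf)}")
```

With RF=3 and CL.ONE on both sides the check fails: a write can be acknowledged by a single replica, so if that replica's copy sat in a removed sstable and the other replicas never received the mutation, repair has nothing to stream back.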

=Rob


Re: Possibly losing data with corrupted SSTables

2014-02-12 Thread Francisco Nogueira Calmon Sobral
Hi, Rahul.

I've removed the corrupted sstables and 'nodetool repair' ran successfully for 
the column family. I'm not sure whether or not we've lost data.

Best regards,
Francisco Sobral


On Jan 30, 2014, at 3:58 PM, Rahul Menon ra...@apigee.com wrote:

 Yes, you should delete all files related to cfname-ib-num-extension.db
 
 Run a repair after deletion
 
 

Re: Possibly losing data with corrupted SSTables

2014-02-12 Thread sankalp kohli
You might want to look at this JIRA i filed today
CASSANDRA-6696 https://issues.apache.org/jira/browse/CASSANDRA-6696

You are good if you are fine with data reappearing.


On Wed, Feb 12, 2014 at 9:20 AM, Francisco Nogueira Calmon Sobral 
fsob...@igcorp.com.br wrote:

 Hi, Rahul.

 I've removed the corrupted sstables and 'nodetool repair' ran successfully
 for the column family. I'm not sure whether or not we've lost data.

 Best regards,
 Francisco Sobral



Re: Possibly losing data with corrupted SSTables

2014-01-30 Thread Rahul Menon
Looks like the sstables are corrupt. I don't believe there is a method to
recover those sstables. I would delete them and run a repair to ensure data
consistency.

Rahul


On Wed, Jan 29, 2014 at 11:29 PM, Francisco Nogueira Calmon Sobral 
fsob...@igcorp.com.br wrote:

 Hi, Rahul.

 I've run nodetool upgradesstables only on the problematic CF. It threw
 the following exception:

 Error occurred while upgrading the sstables for keyspace Sessions
 java.util.concurrent.ExecutionException: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 3622081913630118729 starting at 32906 would be larger than file /mnt/cassandra/data/Sessions/Users/Sessions-Users-ib-2516-Data.db length 1038893416
 at java.util.concurrent.FutureTask.report(FutureTask.java:122)
 at java.util.concurrent.FutureTask.get(FutureTask.java:188)
 at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:271)
 at org.apache.cassandra.db.compaction.CompactionManager.performSSTableRewrite(CompactionManager.java:287)
 at org.apache.cassandra.db.ColumnFamilyStore.sstablesRewrite(ColumnFamilyStore.java:977)
 at org.apache.cassandra.service.StorageService.upgradeSSTables(StorageService.java:2191)
 ... ...
 Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 3622081913630118729 starting at 32906 would be larger than file /mnt/cassandra/data/Sessions/Users/Sessions-Users-ib-2516-Data.db length 1038893416
 at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:167)
 at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:83)
 at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:69)
 at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:180)
 at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:155)
 at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:142)
 at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:38)
 at org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:202)
 at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
 at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
 at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:134)
 at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
 at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
 at org.apache.cassandra.db.compaction.CompactionManager$4.perform(CompactionManager.java:301)
 at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:250)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 ... 3 more
 Caused by: java.io.IOException: dataSize of 3622081913630118729 starting at 32906 would be larger than file /mnt/cassandra/data/Sessions/Users/Sessions-Users-ib-2516-Data.db length 1038893416
 at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:123)
 ... 20 more


 Regards,
 Francisco



Re: Possibly losing data with corrupted SSTables

2014-01-30 Thread Francisco Nogueira Calmon Sobral
Ok. I'll try this idea with one sstable. But, should I delete all the files 
associated with it? I mean, there is a difference in the number of files 
between the BAD sstable and a GOOD one, as I've already shown:

BAD
--
-rw-r--r-- 8 cassandra cassandra 991M Nov  8 15:11 
Sessions-Users-ib-2516-Data.db
-rw-r--r-- 8 cassandra cassandra 703M Nov  8 15:11 
Sessions-Users-ib-2516-Index.db
-rw-r--r-- 8 cassandra cassandra 5.3M Nov 13 11:42 
Sessions-Users-ib-2516-Summary.db

GOOD
-
-rw-r--r-- 1 cassandra cassandra  22K Jan 15 10:50 
Sessions-Users-ic-2933-CompressionInfo.db
-rw-r--r-- 1 cassandra cassandra 106M Jan 15 10:50 
Sessions-Users-ic-2933-Data.db
-rw-r--r-- 1 cassandra cassandra 2.2M Jan 15 10:50 
Sessions-Users-ic-2933-Filter.db
-rw-r--r-- 1 cassandra cassandra  76M Jan 15 10:50 
Sessions-Users-ic-2933-Index.db
-rw-r--r-- 1 cassandra cassandra 4.3K Jan 15 10:50 
Sessions-Users-ic-2933-Statistics.db
-rw-r--r-- 1 cassandra cassandra 574K Jan 15 10:50 
Sessions-Users-ic-2933-Summary.db
-rw-r--r-- 1 cassandra cassandra   79 Jan 15 10:50 
Sessions-Users-ic-2933-TOC.txt

Should I delete those 3 files? Should I run nodetool refresh after the 
operation?

Best regards,
Francisco.

On Jan 30, 2014, at 2:02 PM, Rahul Menon ra...@apigee.com wrote:

 Looks like the sstables are corrupt. I don't believe there is a method to 
 recover those sstables. I would delete them and run a repair to ensure data 
 consistency.
 
 Rahul  
 
 

Re: Possibly losing data with corrupted SSTables

2014-01-30 Thread Rahul Menon
Yes, you should delete all files related to cfname-ib-num-extension.db

Run a repair after deletion
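
As a rough illustration of "all files related to cfname-ib-num-extension.db", the sketch below collects every component file of a single sstable generation so the list can be reviewed first. The helper name `sstable_components` is hypothetical, and the file-name layout is assumed from the listings in this thread.

```python
from pathlib import Path


def sstable_components(cf_dir, keyspace, cf, version, generation):
    """Return every file that belongs to one sstable generation, sorted by name.

    The trailing '-' in the prefix keeps generation 2516 from also matching
    files of generation 25160.
    """
    prefix = f"{keyspace}-{cf}-{version}-{generation}-"
    return sorted(
        path
        for path in Path(cf_dir).iterdir()
        if path.is_file() and path.name.startswith(prefix)
    )


# Example call, using the path and names quoted in this thread:
#   sstable_components("/mnt/cassandra/data/Sessions/Users",
#                      "Sessions", "Users", "ib", 2516)
```

The function only lists files and removes nothing; the deletion itself and the repair that follows remain manual steps.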


On Thu, Jan 30, 2014 at 10:17 PM, Francisco Nogueira Calmon Sobral 
fsob...@igcorp.com.br wrote:

 Ok. I'll try this idea with one sstable. But, should I delete all the
 files associated with it? I mean, there is a difference in the number of
 files between the BAD sstable and a GOOD one, as I've already shown:

 BAD
 --
 -rw-r--r-- 8 cassandra cassandra 991M Nov  8 15:11
 Sessions-Users-ib-2516-Data.db
 -rw-r--r-- 8 cassandra cassandra 703M Nov  8 15:11
 Sessions-Users-ib-2516-Index.db
 -rw-r--r-- 8 cassandra cassandra 5.3M Nov 13 11:42
 Sessions-Users-ib-2516-Summary.db

 GOOD
 -
 -rw-r--r-- 1 cassandra cassandra  22K Jan 15 10:50
 Sessions-Users-ic-2933-CompressionInfo.db
 -rw-r--r-- 1 cassandra cassandra 106M Jan 15 10:50
 Sessions-Users-ic-2933-Data.db
 -rw-r--r-- 1 cassandra cassandra 2.2M Jan 15 10:50
 Sessions-Users-ic-2933-Filter.db
 -rw-r--r-- 1 cassandra cassandra  76M Jan 15 10:50
 Sessions-Users-ic-2933-Index.db
 -rw-r--r-- 1 cassandra cassandra 4.3K Jan 15 10:50
 Sessions-Users-ic-2933-Statistics.db
 -rw-r--r-- 1 cassandra cassandra 574K Jan 15 10:50
 Sessions-Users-ic-2933-Summary.db
 -rw-r--r-- 1 cassandra cassandra   79 Jan 15 10:50
 Sessions-Users-ic-2933-TOC.txt

 Should I delete those 3 files? Should I run nodetool refresh after the
 operation?

 Best regards,
 Francisco.


Re: Possibly losing data with corrupted SSTables

2014-01-29 Thread Rahul Menon
Francisco,

The sstables with *-ib-* are from a previous version of
C*. The *-ib-* naming convention started at C* 1.2.1, but from 1.2.10 onwards I'm
sure it uses the *-ic-* convention. You could try running nodetool
upgradesstables, which should ideally upgrade the sstables from *-ib-* to
*-ic-*.
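
For spotting which files still carry the older format, a small parser for the file names shown in this thread can help. This is only a sketch: the pattern is inferred from names such as Sessions-Users-ib-2516-Data.db, the names `SSTableName` and `parse_sstable_name` are made up, and temporary files or other layouts are deliberately rejected.

```python
import re
from typing import NamedTuple


class SSTableName(NamedTuple):
    keyspace: str
    column_family: str
    version: str      # on-disk format version, e.g. "ib" or "ic"
    generation: int
    component: str    # e.g. "Data.db", "Index.db", "TOC.txt"


# <keyspace>-<cf>-<two-letter version>-<generation>-<component>
_NAME = re.compile(
    r"^(?P<keyspace>[^-]+)-(?P<cf>[^-]+)-(?P<version>[a-z]{2})"
    r"-(?P<generation>\d+)-(?P<component>.+)$"
)


def parse_sstable_name(filename: str) -> SSTableName:
    """Split an sstable file name into its parts, or raise ValueError."""
    match = _NAME.match(filename)
    if match is None:
        raise ValueError(f"not a recognised sstable file name: {filename}")
    return SSTableName(
        match["keyspace"],
        match["cf"],
        match["version"],
        int(match["generation"]),
        match["component"],
    )
```

Filtering a directory listing on `parse_sstable_name(name).version != "ic"` would then give the candidates for the upgrade.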

Rahul

On Wed, Jan 29, 2014 at 12:55 AM, Francisco Nogueira Calmon Sobral 
fsob...@igcorp.com.br wrote:

 Dear experts,

 We are facing an annoying problem in our cluster.

 We have 9 Amazon extra-large Linux nodes, running Cassandra 1.2.11.

 The short story is that after moving the data from one cluster to another,
 we've been unable to run 'nodetool repair'. It gets stuck due to a
 CorruptSSTableException in some nodes and CFs. After looking at some
 problematic CFs, we observed that some of their sstables are owned by root
 instead of cassandra. Also, their names are different from the
 'good' ones as we can see below:

 BAD
 --
 -rw-r--r-- 8 cassandra cassandra 991M Nov  8 15:11
 Sessions-Users-ib-2516-Data.db
 -rw-r--r-- 8 cassandra cassandra 703M Nov  8 15:11
 Sessions-Users-ib-2516-Index.db
 -rw-r--r-- 8 cassandra cassandra 5.3M Nov 13 11:42
 Sessions-Users-ib-2516-Summary.db

 GOOD
 -
 -rw-r--r-- 1 cassandra cassandra  22K Jan 15 10:50
 Sessions-Users-ic-2933-CompressionInfo.db
 -rw-r--r-- 1 cassandra cassandra 106M Jan 15 10:50
 Sessions-Users-ic-2933-Data.db
 -rw-r--r-- 1 cassandra cassandra 2.2M Jan 15 10:50
 Sessions-Users-ic-2933-Filter.db
 -rw-r--r-- 1 cassandra cassandra  76M Jan 15 10:50
 Sessions-Users-ic-2933-Index.db
 -rw-r--r-- 1 cassandra cassandra 4.3K Jan 15 10:50
 Sessions-Users-ic-2933-Statistics.db
 -rw-r--r-- 1 cassandra cassandra 574K Jan 15 10:50
 Sessions-Users-ic-2933-Summary.db
 -rw-r--r-- 1 cassandra cassandra   79 Jan 15 10:50
 Sessions-Users-ic-2933-TOC.txt


 We changed the permissions back to 'cassandra' and ran 'nodetool scrub' in
 this problematic CF, but it has been running for at least two weeks (it is
 not frozen) and keeps logging many WARNs while working with the above
 mentioned SSTable:

 WARN [CompactionExecutor:15] 2014-01-28 17:01:22,571 OutputHandler.java (line 57) Non-fatal error reading row (stacktrace follows)
 java.io.IOError: java.io.IOException: Impossible row size 3618452438597849419
 at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:171)
 at org.apache.cassandra.db.compaction.CompactionManager.scrubOne(CompactionManager.java:526)
 at org.apache.cassandra.db.compaction.CompactionManager.doScrub(CompactionManager.java:515)
 at org.apache.cassandra.db.compaction.CompactionManager.access$400(CompactionManager.java:70)
 at org.apache.cassandra.db.compaction.CompactionManager$3.perform(CompactionManager.java:280)
 at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:250)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)
 Caused by: java.io.IOException: Impossible row size 3618452438597849419
 ... 10 more


 1) I do not think that deleting all data of one node and running 'nodetool
 rebuild' will work, since we observed that this problem occurs in all
 nodes. So we may not be able to restore all the data. What can be done in
 this case?

 2) Why are some sstables owned by 'root'? Is this problem caused
 by our manual migration of data? (see long story below)


 How did we run into this?

 The long story is that we've tried to move our cluster with sstableloader,
 but it was unable to load all the data correctly. Our solution was to put
 ALL cluster data into EACH new node and run 'nodetool refresh'. I performed
 this task for each node and each column family sequentially. Sometimes I
 had to rename some sstables, because they came from different nodes with
 the same name. I don't remember if I ran 'nodetool repair'  or even
 'nodetool cleanup' in each node. Apparently, the process was successful,
 and (almost) all the data was moved.

 Unfortunately, 3 months after the move, I am unable to perform read
 operations on some keys of some CFs. I think that some of these keys belong
 to the above-mentioned sstables.

 Any insights are welcome.

 Best regards,
 Francisco Sobral


Re: Possibly losing data with corrupted SSTables

2014-01-29 Thread Francisco Nogueira Calmon Sobral
Hi, Rahul.

I've run nodetool upgradesstables only on the problematic CF. It threw the 
following exception:

Error occurred while upgrading the sstables for keyspace Sessions
java.util.concurrent.ExecutionException: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 3622081913630118729 starting at 32906 would be larger than file /mnt/cassandra/data/Sessions/Users/Sessions-Users-ib-2516-Data.db length 1038893416
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:188)
at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:271)
at org.apache.cassandra.db.compaction.CompactionManager.performSSTableRewrite(CompactionManager.java:287)
at org.apache.cassandra.db.ColumnFamilyStore.sstablesRewrite(ColumnFamilyStore.java:977)
at org.apache.cassandra.service.StorageService.upgradeSSTables(StorageService.java:2191)
… …
Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 3622081913630118729 starting at 32906 would be larger than file /mnt/cassandra/data/Sessions/Users/Sessions-Users-ib-2516-Data.db length 1038893416
at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:167)
at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:83)
at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:69)
at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:180)
at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:155)
at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:142)
at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:38)
at org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:202)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:134)
at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
at org.apache.cassandra.db.compaction.CompactionManager$4.perform(CompactionManager.java:301)
at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:250)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
... 3 more
Caused by: java.io.IOException: dataSize of 3622081913630118729 starting at 32906 would be larger than file /mnt/cassandra/data/Sessions/Users/Sessions-Users-ib-2516-Data.db length 1038893416
at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:123)
... 20 more
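
The message itself describes a simple bounds check: a row that claims to be dataSize bytes long, starting at a given offset, has to end inside the file. A minimal sketch of that kind of check follows. It is not the Cassandra source, and the function name is made up; the three numbers are copied from the exception above.

```python
def row_fits_in_file(data_size: int, start: int, file_length: int) -> bool:
    """True when a row of data_size bytes beginning at start ends within the file."""
    return 0 <= data_size <= file_length - start


# Values copied from the exception message.
DATA_SIZE = 3622081913630118729   # claimed row size in bytes
START = 32906                     # offset of the row inside Data.db
FILE_LENGTH = 1038893416          # actual size of Data.db in bytes

if __name__ == "__main__":
    print(row_fits_in_file(DATA_SIZE, START, FILE_LENGTH))
```

The file length of 1038893416 bytes is the 991M Data.db from the directory listing, while the claimed row size is on the order of exabytes, so the length field read at that offset cannot be a real row length.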


Regards,
Francisco


On Jan 29, 2014, at 3:38 PM, Rahul Menon ra...@apigee.com wrote:

 Francisco, 
 
 The sstables with *-ib-* are from a previous version of C*. 
 The *-ib-* naming convention started at C* 1.2.1, but from 1.2.10 onwards I'm sure 
 it uses the *-ic-* convention. You could try running nodetool upgradesstables, 
 which should ideally upgrade the sstables from *-ib-* to *-ic-*. 
 
 Rahul
 