RE: validation compaction

2014-10-17 Thread S C
Thanks for the reply, Rob.

Date: Thu, 16 Oct 2014 11:46:52 -0700
Subject: Re: validation compaction
From: rc...@eventbrite.com
To: user@cassandra.apache.org

On Thu, Oct 16, 2014 at 6:41 AM, S C as...@outlook.com wrote:



 Bob,

Bob is my father's name. Unless you need a gastrointestinal consult, you
probably don't want to ask Bob Coli a question... ;P

 The default compression is Snappy, and I have seen compression ranging
 between 2% and 4% (just as the doc says). I got the storage part. Does it
 mean that SSTables are decompressed as a result of compaction/repair? Is
 that the reason for CPU utilization spiking up a little?

Yes, compaction needs the values in the SSTables, and if those values are
compressed it needs to decompress them. Yes, it is one of the reasons
compaction consumes noticeable CPU.
=Rob
http://twitter.com/rcolidba
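
A minimal sketch of how to check the compression setup being discussed here,
assuming cqlsh and nodetool access; the keyspace and table names are
placeholders, and the exact attribute names and output vary by Cassandra
version:

    -- in cqlsh: the table definition includes the configured compressor
    -- (SnappyCompressor on older releases, LZ4Compressor on newer ones)
    DESCRIBE TABLE my_keyspace.my_table;

    # from a shell on a node: per-table stats include an SSTable compression
    # ratio (compressed size / uncompressed size) on recent-enough versions
    nodetool cfstats my_keyspace.my_table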

RE: validation compaction

2014-10-16 Thread S C
Bob,
The default compression is Snappy, and I have seen compression ranging between
2% and 4% (just as the doc says). I got the storage part. Does it mean that
SSTables are decompressed as a result of compaction/repair? Is that the reason
for CPU utilization spiking up a little?
-SR
From: as...@outlook.com
To: user@cassandra.apache.org
Subject: RE: validation compaction
Date: Tue, 14 Oct 2014 17:09:14 -0500




Thanks, Rob.


Date: Mon, 13 Oct 2014 13:42:39 -0700
Subject: Re: validation compaction
From: rc...@eventbrite.com
To: user@cassandra.apache.org

On Mon, Oct 13, 2014 at 1:04 PM, S C as...@outlook.com wrote:



I have started repairing a 10-node cluster with one of the tables having 1TB
of data. I notice that the validation compaction actually shows 3TB in the
nodetool compactionstats bytes total. However, I have less than 1TB of data on
the machine. If I take the 3 replicas into consideration, then 3TB makes sense.
Per my understanding, validation only cares about data local to the machine
running the validation compaction. Am I missing something here? Any help is
much appreciated.

Compression is enabled by default; it's showing the uncompressed data size. 
Your 1TB of data would be 3TB without compression.
=Rob
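
A hedged sketch of where that bytes-total figure shows up during a repair; the
exact column layout varies by Cassandra version:

    # on the node performing the repair, while it is running:
    nodetool compactionstats
    # rows whose compaction type is "Validation" report progress in
    # uncompressed bytes, so the total can exceed the table's on-disk
    # (compressed) size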
  

Re: validation compaction

2014-10-16 Thread Robert Coli
On Thu, Oct 16, 2014 at 6:41 AM, S C as...@outlook.com wrote:

 Bob,


Bob is my father's name. Unless you need a gastrointestinal consult, you
probably don't want to ask Bob Coli a question... ;P


 The default compression is Snappy, and I have seen compression ranging
 between 2% and 4% (just as the doc says). I got the storage part. Does it
 mean that SSTables are decompressed as a result of compaction/repair? Is
 that the reason for CPU utilization spiking up a little?


Yes, compaction needs the values in the SSTables, and if those values are
compressed it needs to decompress them. Yes, it is one of the reasons
compaction consumes noticeable CPU.

=Rob
http://twitter.com/rcolidba


RE: validation compaction

2014-10-14 Thread S C
Thanks, Rob.


Date: Mon, 13 Oct 2014 13:42:39 -0700
Subject: Re: validation compaction
From: rc...@eventbrite.com
To: user@cassandra.apache.org

On Mon, Oct 13, 2014 at 1:04 PM, S C as...@outlook.com wrote:



I have started repairing a 10-node cluster with one of the tables having 1TB
of data. I notice that the validation compaction actually shows 3TB in the
nodetool compactionstats bytes total. However, I have less than 1TB of data on
the machine. If I take the 3 replicas into consideration, then 3TB makes sense.
Per my understanding, validation only cares about data local to the machine
running the validation compaction. Am I missing something here? Any help is
much appreciated.

Compression is enabled by default; it's showing the uncompressed data size. 
Your 1TB of data would be 3TB without compression.
=Rob  

Re: validation compaction

2014-10-13 Thread Robert Coli
On Mon, Oct 13, 2014 at 1:04 PM, S C as...@outlook.com wrote:

 I have started repairing a 10-node cluster with one of the tables having
 1TB of data. I notice that the validation compaction actually shows 3TB
 in the nodetool compactionstats bytes total. However, I have less than
 1TB of data on the machine. If I take the 3 replicas into consideration,
 then 3TB makes sense. Per my understanding, validation only cares about data
 local to the machine running the validation compaction. Am I missing something
 here? Any help is much appreciated.


Compression is enabled by default; it's showing the uncompressed data size.
Your 1TB of data would be 3TB without compression.

=Rob
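
A rough arithmetic check of the figures above; the keyspace and table names are
placeholders, and the 0.33 ratio is an assumed, illustrative value rather than
something reported in this thread:

    # cfstats reports a compression ratio of compressed size to uncompressed size
    nodetool cfstats my_keyspace.my_table | grep -i "compression ratio"
    # if that ratio were about 0.33, then:
    #   uncompressed size = on-disk size / ratio = 1TB / 0.33 = about 3TB
    # which matches the bytes total reported by the validation compaction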