Re: run cassandra on a small instance

2015-02-18 Thread Tim Dunphy
Hey guys,

After the upgrade to 2.1.3, and after almost exactly 5 hours of running,
Cassandra did indeed crash again on the 2GB RAM VM.

This is how the memory on the VM looked after the crash:

[root@web2:~] #free -m
              total       used       free     shared    buffers     cached
Mem:           2002       1227        774          8         45        386
-/+ buffers/cache:         794       1207
Swap:             0          0          0


And that's with this set in the cassandra-env.sh file:

MAX_HEAP_SIZE="800M"
HEAP_NEWSIZE="200M"

So I'm thinking now, do I just have to abandon this idea I have of running
Cassandra on a 2GB instance? Or is this something we can all agree can be
done? And if so, how can we do that? :)

Thanks
Tim

On Wed, Feb 18, 2015 at 8:39 PM, Jason Kushmaul | WDA <
jason.kushm...@wda.com> wrote:

> I asked this previously when a similar message came through, with a
> similar response.
>
>
>
> planetcassandra seems to have it “right”, in that stable=2.0,
> development=2.1, whereas the apache site says stable is 2.1.
>
> “Right” in that they assume the latest minor version is development.  Why not have
> the apache site do the same?  That’s just my lowly non-contributing opinion
> though.
>
>
>
> *Jason  *
>
>
>
> *From:* Andrew [mailto:redmu...@gmail.com]
> *Sent:* Wednesday, February 18, 2015 8:26 PM
> *To:* Robert Coli; user@cassandra.apache.org
> *Subject:* Re: run cassandra on a small instance
>
>
>
> Robert,
>
>
>
> Let me know if I’m off base about this—but I feel like I see a lot of
> posts that are like this (i.e., use this arbitrary version, not this other
> arbitrary version).  Why are releases going out if they’re “broken”?  This
> seems like a very confusing way for new (and existing) users to approach
> versions...
>
>
>
> Andrew
>
>
>
> On February 18, 2015 at 5:16:27 PM, Robert Coli (rc...@eventbrite.com)
> wrote:
>
> On Wed, Feb 18, 2015 at 5:09 PM, Tim Dunphy  wrote:
>
> I'm attempting to run Cassandra 2.1.2 on a smallish 2GB RAM instance
> over at Digital Ocean. It's a CentOS 7 host.
>
>
>
> 2.1.2 is IMO broken and should not be used for any purpose.
>
>
>
> Use 2.1.1 or 2.1.3.
>
>
>
> https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
>
>
>
> =Rob
>
>
>
>


-- 
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B


Re: Data tiered compaction and data model question

2015-02-18 Thread cass savy
10-20 per minute is the average. The worst case can be 10x the average.

On Wed, Feb 18, 2015 at 4:49 PM, Mohammed Guller 
wrote:

>  What is the maximum number of events that you expect in a day? What is
> the worst-case scenario?
>
>
>
> Mohammed
>
>
>
> *From:* cass savy [mailto:casss...@gmail.com]
> *Sent:* Wednesday, February 18, 2015 4:21 PM
> *To:* user@cassandra.apache.org
> *Subject:* Data tiered compaction and data model question
>
>
>
> We want to track events in a log CF/table and should be able to query for
> events that occurred in a range of minutes or hours for a given day. Multiple
> events can occur in a given minute.  Listed 2 table designs; leaning
> towards table 1 to avoid a large wide row.  Please advise.
>
>
>
> Table 1: not a very wide row; still able to query for a range of minutes
> for a given day, and/or a given day and a range of hours
>
> Create table log_Event
> (
>  event_day text,
>  event_hr int,
>  event_time timeuuid,
>  data text,
>  PRIMARY KEY ((event_day, event_hr), event_time)
> )
>
> Table 2: This will be a very wide row
>
> Create table log_Event
> (
>  event_day text,
>  event_time timeuuid,
>  data text,
>  PRIMARY KEY (event_day, event_time)
> )
>
>
>
> Date-tiered compaction: recommended for time series data as per the doc
> below. Our data will be kept only for 30 days, hence the thought of using
> this compaction strategy.
>
> http://www.datastax.com/dev/blog/datetieredcompactionstrategy
>
> I created table 1 listed above with this compaction strategy, added some
> rows, and did a manual flush. I do not see any SSTables created yet. Is that
> expected?
>
>  compaction={'max_sstable_age_days': '1', 'class':
> 'DateTieredCompactionStrategy'}
>
>
>


Re: C* 2.1.2 invokes oom-killer

2015-02-18 Thread Jacob Rhoden
I neglected to mention, I also adjust the OOM score of Cassandra, to tell the
kernel to kill something else rather than Cassandra. (E.g. if one of your devs
runs a script that uses a lot of memory, the kernel kills the dev's script instead.)

http://lwn.net/Articles/317814/ 
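Jacob's suggestion can be sketched as follows — a minimal illustration of the oom_score_adj mechanism, using the current shell as a stand-in for the Cassandra process (the pgrep pattern is an assumption about how Cassandra is launched):

```shell
# Inspect/adjust the kernel OOM-killer preference for a process.
# oom_score_adj ranges from -1000 (never kill) to +1000 (kill first).
# Using the current shell as a stand-in; for Cassandra you might use
# something like: pid=$(pgrep -f CassandraDaemon)   (an assumption)
pid=$$
cat /proc/$pid/oom_score_adj            # usually 0 by default
echo 300 > /proc/$pid/oom_score_adj     # raising (de-prioritizing) needs no root
cat /proc/$pid/oom_score_adj            # now 300
```

Note that lowering the value (protecting Cassandra) requires root, and that the `oom_adj` field visible in the dmesg output above is the deprecated predecessor interface described in the LWN article.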

> On 19 Feb 2015, at 5:28 am, Michał Łowicki  wrote:
> 
> Hi,
> 
> Couple of times a day 2 out of 4 members cluster nodes are killed
> 
> root@db4:~# dmesg | grep -i oom
> [4811135.792657] [ pid ]   uid  tgid total_vm  rss cpu oom_adj 
> oom_score_adj name
> [6559049.307293] java invoked oom-killer: gfp_mask=0x201da, order=0, 
> oom_adj=0, oom_score_adj=0
> 
> Nodes are using 8GB heap (confirmed with *nodetool info*) and aren't using 
> row cache. 
> 
> Noticed that couple of times a day used RSS is growing really fast within 
> couple of minutes and I see CPU spikes at the same time - 
> https://www.dropbox.com/s/khco2kdp4qdzjit/Screenshot%202015-02-18%2015.10.54.png?dl=0
>  
> .
> 
> Could be related to compaction but after compaction is finished used RSS 
> doesn't shrink. Output from pmap when C* process uses 50GB RAM (out of 64GB) 
> is available on http://paste.ofcode.org/ZjLUA2dYVuKvJHAk9T3Hjb 
> . At the time dump was made 
> heap usage is far below 8GB (~3GB) but total RSS is ~50GB.
> 
> Any help will be appreciated.
> 
> -- 
> BR,
> Michał Łowicki





RE: run cassandra on a small instance

2015-02-18 Thread Jason Kushmaul | WDA
I asked this previously when a similar message came through, with a similar 
response.

planetcassandra seems to have it “right”, in that stable=2.0, development=2.1, 
whereas the apache site says stable is 2.1.
“Right” in that they assume the latest minor version is development.  Why not have the
apache site do the same?  That’s just my lowly non-contributing opinion though.

Jason

From: Andrew [mailto:redmu...@gmail.com]
Sent: Wednesday, February 18, 2015 8:26 PM
To: Robert Coli; user@cassandra.apache.org
Subject: Re: run cassandra on a small instance

Robert,

Let me know if I’m off base about this—but I feel like I see a lot of posts 
that are like this (i.e., use this arbitrary version, not this other arbitrary 
version).  Why are releases going out if they’re “broken”?  This seems like a 
very confusing way for new (and existing) users to approach versions...

Andrew


On February 18, 2015 at 5:16:27 PM, Robert Coli 
(rc...@eventbrite.com) wrote:
On Wed, Feb 18, 2015 at 5:09 PM, Tim Dunphy  wrote:
I'm attempting to run Cassandra 2.1.2 on a smallish 2GB RAM
instance over at Digital Ocean. It's a CentOS 7 host.

2.1.2 is IMO broken and should not be used for any purpose.

Use 2.1.1 or 2.1.3.

https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/

=Rob



Re: C* 2.1.2 invokes oom-killer

2015-02-18 Thread Jacob Rhoden
Are you tweaking the "nice" priority on Cassandra? (Type "man nice" if you
don't know much about it.) Certainly improving Cassandra's nice score becomes
important when you have other things running on the server, like scheduled jobs
or people logging in to the server and doing things.
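A minimal illustration of what nice does (the renice example is hypothetical; lowering a nice value below 0 requires root):

```shell
# 'nice' with no operands prints the current niceness (0 by default);
# prefixing a command with 'nice -n N' runs it at niceness N.
nice -n 10 nice        # prints 10 when starting from the default niceness of 0
# For an already-running process you would raise the niceness of the
# competing workload instead, e.g. (hypothetical pid):
# renice -n 10 -p 12345
```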

__
Sent from iPhone

> On 19 Feb 2015, at 5:28 am, Michał Łowicki  wrote:
> 
> Hi,
> 
> Couple of times a day 2 out of 4 members cluster nodes are killed
> 
> root@db4:~# dmesg | grep -i oom
> [4811135.792657] [ pid ]   uid  tgid total_vm  rss cpu oom_adj 
> oom_score_adj name
> [6559049.307293] java invoked oom-killer: gfp_mask=0x201da, order=0, 
> oom_adj=0, oom_score_adj=0
> 
> Nodes are using 8GB heap (confirmed with *nodetool info*) and aren't using 
> row cache. 
> 
> Noticed that couple of times a day used RSS is growing really fast within 
> couple of minutes and I see CPU spikes at the same time - 
> https://www.dropbox.com/s/khco2kdp4qdzjit/Screenshot%202015-02-18%2015.10.54.png?dl=0.
> 
> Could be related to compaction but after compaction is finished used RSS 
> doesn't shrink. Output from pmap when C* process uses 50GB RAM (out of 64GB) 
> is available on http://paste.ofcode.org/ZjLUA2dYVuKvJHAk9T3Hjb. At the time 
> dump was made heap usage is far below 8GB (~3GB) but total RSS is ~50GB.
> 
> Any help will be appreciated.
> 
> -- 
> BR,
> Michał Łowicki


Re: run cassandra on a small instance

2015-02-18 Thread Andrew
Robert,

Let me know if I’m off base about this—but I feel like I see a lot of posts 
that are like this (i.e., use this arbitrary version, not this other arbitrary 
version).  Why are releases going out if they’re “broken”?  This seems like a 
very confusing way for new (and existing) users to approach versions...

Andrew

On February 18, 2015 at 5:16:27 PM, Robert Coli (rc...@eventbrite.com) wrote:

On Wed, Feb 18, 2015 at 5:09 PM, Tim Dunphy  wrote:
I'm attempting to run Cassandra 2.1.2 on a smallish 2GB RAM instance over at
Digital Ocean. It's a CentOS 7 host.

2.1.2 is IMO broken and should not be used for any purpose.

Use 2.1.1 or 2.1.3.

https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/

=Rob
 

Re: run cassandra on a small instance

2015-02-18 Thread Tim Dunphy
>
> 2.1.2 is IMO broken and should not be used for any purpose.
> Use 2.1.1 or 2.1.3.
> https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
> =Rob


Cool man. Thanks for the info. I just upgraded to 2.1.3. We'll see how that
goes. I can let you know more once it's been running for a while.

Thanks
Tim

On Wed, Feb 18, 2015 at 8:16 PM, Robert Coli  wrote:

> On Wed, Feb 18, 2015 at 5:09 PM, Tim Dunphy  wrote:
>
>> I'm attempting to run Cassandra 2.1.2 on a smallish 2GB RAM instance
>> over at Digital Ocean. It's a CentOS 7 host.
>>
>
> 2.1.2 is IMO broken and should not be used for any purpose.
>
> Use 2.1.1 or 2.1.3.
>
> https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
>
> =Rob
>
>



-- 
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B


Re: run cassandra on a small instance

2015-02-18 Thread Robert Coli
On Wed, Feb 18, 2015 at 5:09 PM, Tim Dunphy  wrote:

> I'm attempting to run Cassandra 2.1.2 on a smallish 2GB RAM instance
> over at Digital Ocean. It's a CentOS 7 host.
>

2.1.2 is IMO broken and should not be used for any purpose.

Use 2.1.1 or 2.1.3.

https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/

=Rob


Logging client ID for YCSB workloads on Cassandra?

2015-02-18 Thread Jatin Ganhotra
Hi,

I'd like to log the client ID for every operation performed by YCSB on
my Cassandra cluster.

The purpose is to identify & analyze various consistency measures
other than eventual consistency.

I wanted to know if people have done something similar in the past. Or am I
missing something really basic here?

Please let me know if you need more information. Thanks
—
Jatin Ganhotra


run cassandra on a small instance

2015-02-18 Thread Tim Dunphy
Hey all,

I'm attempting to run Cassandra 2.1.2 on a smallish 2GB RAM instance over
at Digital Ocean. It's a CentOS 7 host.

But I'm having some difficulty there. Cassandra will start with no problems
and run for a while, but then choke on the lack of memory and crash. This
is what the system looks like after a reboot:

[root@web2:~] #free -m
              total       used       free     shared    buffers     cached
Mem:           2002        568       1433          8         20        207
-/+ buffers/cache:         340       1661
Swap:             0          0          0


After I start cassandra and leave it running for a few minutes, this is
what the memory situation looks like:

[root@web2:~] #free -m
              total       used       free     shared    buffers     cached
Mem:           2002       1669        332          8         21        359
-/+ buffers/cache:        1288        713
Swap:             0          0          0
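Worth noting when reading free output like the above: a good chunk of "used" is buffers and page cache, which the kernel reclaims under memory pressure — the crash is about the Java process itself outgrowing its limits, not the cache. A quick Linux-only sketch of the free-plus-reclaimable figure straight from the kernel:

```shell
# Sum MemFree + Buffers + Cached (reported in kB) and print MB.
awk '/^MemFree:|^Buffers:|^Cached:/ {sum += $2} END {print int(sum / 1024)}' /proc/meminfo
```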

I've been able to find this article on how to tune memory for Cassandra on
the datastax site:

http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_tune_jvm_c.html

So I tried setting MAX_HEAP_SIZE and HEAP_NEWSIZE like this in
cassandra-env.sh:

MAX_HEAP_SIZE="800M"
HEAP_NEWSIZE="200M"


And I gave the latter 200MB based on the fact that this VM has 2 cores:

[root@web2:/etc/alternatives/cassandrahome] #grep name /proc/cpuinfo
model name  : QEMU Virtual CPU version 1.0
model name  : QEMU Virtual CPU version 1.0

So, 100MB per core basically.
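For what it's worth, the sizing rule in that DataStax doc can be sketched as arithmetic — this is my paraphrase of the documented defaults, not a recommendation for this droplet:

```shell
# Paraphrase of the documented auto-sizing defaults (an assumption):
#   MAX_HEAP_SIZE = max(min(ram/2, 1024MB), min(ram/4, 8192MB))
#   HEAP_NEWSIZE  = 100MB * number of cores
ram_mb=2002; cores=2                      # this droplet
half=$((ram_mb / 2));    if [ "$half" -gt 1024 ]; then half=1024; fi
quarter=$((ram_mb / 4)); if [ "$quarter" -gt 8192 ]; then quarter=8192; fi
max_heap=$(( half > quarter ? half : quarter ))
heap_new=$(( 100 * cores ))
echo "MAX_HEAP_SIZE=${max_heap}M HEAP_NEWSIZE=${heap_new}M"
# -> MAX_HEAP_SIZE=1001M HEAP_NEWSIZE=200M
```

By that rule the 800M heap above is actually slightly below what the defaults would pick for 2GB of RAM.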

And I've found that this will run for a while.. like maybe 5 or 6 hours!!
So it does stay up a while. But then it finally will crash due to the lack
of memory.

Are there any tricks I can try or things I can do to get Cassandra to stay
running happily in this amount of space? I'm just using test data and this
is an exercise to learn more about Cassandra. I realize a 'real' data set
will require more resources in terms of memory.

Thanks!
Tim
-- 
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B


RE: Data tiered compaction and data model question

2015-02-18 Thread Mohammed Guller
What is the maximum number of events that you expect in a day? What is the 
worst-case scenario?

Mohammed

From: cass savy [mailto:casss...@gmail.com]
Sent: Wednesday, February 18, 2015 4:21 PM
To: user@cassandra.apache.org
Subject: Data tiered compaction and data model question

We want to track events in a log CF/table and should be able to query for events
that occurred in a range of minutes or hours for a given day. Multiple events can
occur in a given minute. Listed 2 table designs; leaning towards table 1 to
avoid a large wide row. Please advise.

Table 1: not a very wide row; still able to query for a range of minutes for
a given day, and/or a given day and a range of hours
Create table log_Event
(
 event_day text,
 event_hr int,
 event_time timeuuid,
 data text,
PRIMARY KEY ((event_day, event_hr), event_time)
)
Table 2: This will be a very wide row

Create table log_Event
( event_day text,
 event_time timeuuid,
 data text,
PRIMARY KEY (event_day, event_time)
)

Date-tiered compaction: recommended for time series data as per the doc below.
Our data will be kept only for 30 days, hence the thought of using this
compaction strategy.
http://www.datastax.com/dev/blog/datetieredcompactionstrategy
I created table 1 listed above with this compaction strategy, added some rows,
and did a manual flush. I do not see any SSTables created yet. Is that expected?
 compaction={'max_sstable_age_days': '1', 'class': 
'DateTieredCompactionStrategy'}



Data tiered compaction and data model question

2015-02-18 Thread cass savy
We want to track events in a log CF/table and should be able to query for
events that occurred in a range of minutes or hours for a given day. Multiple
events can occur in a given minute. Listed 2 table designs; leaning towards
table 1 to avoid a large wide row. Please advise.

Table 1: not a very wide row; still able to query for a range of minutes
for a given day, and/or a given day and a range of hours

Create table log_Event
(
 event_day text,
 event_hr int,
 event_time timeuuid,
 data text,
 PRIMARY KEY ((event_day, event_hr), event_time)
)
Table 2: This will be a very wide row

Create table log_Event
(
 event_day text,
 event_time timeuuid,
 data text,
 PRIMARY KEY (event_day, event_time)
)


Date-tiered compaction: recommended for time series data as per the doc below.
Our data will be kept only for 30 days, hence the thought of using this
compaction strategy.

http://www.datastax.com/dev/blog/datetieredcompactionstrategy

I created table 1 listed above with this compaction strategy, added some rows,
and did a manual flush. I do not see any SSTables created yet. Is that
expected?

 compaction={'max_sstable_age_days': '1', 'class':
'DateTieredCompactionStrategy'}
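For concreteness, here is Table 1 with the compaction options from the question written out as one CQL statement. The default_time_to_live line is my addition and an assumption, based on the stated 30-day retention (DTCS is designed around TTL-expired data):

```sql
CREATE TABLE log_event (
    event_day  text,
    event_hr   int,
    event_time timeuuid,
    data       text,
    PRIMARY KEY ((event_day, event_hr), event_time)
) WITH compaction = {'class': 'DateTieredCompactionStrategy',
                     'max_sstable_age_days': '1'}
  AND default_time_to_live = 2592000;  -- 30 days, matching the stated retention (an assumption)
```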


Re: Cassandra install on JRE vs JDK

2015-02-18 Thread karim duran
...with JDK 1.7.x... (not 1.6.x, but it works the same). Sorry...

Regards.

Karim Duran

2015-02-18 23:33 GMT+01:00 karim duran :

> Hi Mark, Cass Savy, Robert...
>
> I confirm that Cassandra runs on the JRE (or the JDK, since a JRE is
> provided with the JDK).
> The Oracle JVM (ex Sun Microsystems) is the best choice for running
> Cassandra without issues
> (there are some problems with the IBM JVM and OpenJDK).
>
> Here's a screenshot of Cassandra 2.1.2 running on my computer (Linux
> Ubuntu 14.04 x86-64 with JDK 1.6.x).
>
> Regards.
> Karim Duran.
>
>
> 2015-02-18 21:40 GMT+01:00 Mark Reddy :
>
>> Cassandra 1.2.18 and Java 1.6 u45.
>>
>> Planning an upgrade to the 2.x series in the near future along with a
>> bump in version of Java.
>>
>> Regards,
>> Mark
>>
>> On 18 February 2015 at 20:32, cass savy  wrote:
>>
>>> Thanks, Mark, for the quick response. What version of Cassandra and JDK are
>>> you using in prod?
>>>
>>>
>>> On Wed, Feb 18, 2015 at 11:58 AM, Mark Reddy 
>>> wrote:
>>>
 Yes, you can use the Oracle JDK if you prefer; I've been using the JDK with
 Cassandra in production for years without issue.

 Regards,
 Mark

 On 18 February 2015 at 19:49, cass savy  wrote:

> Can we install Oracle JDK instead of JRE  in Cassandra servers? We
> have few clusters running JDK when we upgraded to C*2.0.
>
> Is there any known issue or impact with using  JDK vs JRE?
> What is the reason to not use Oracle JDK in C* servers?
> Is there any performance impact ?
>
> Please advise.
>
>


>>>
>>
>


Re: Cassandra install on JRE vs JDK

2015-02-18 Thread karim duran
Hi Mark, Cass Savy, Robert...

I confirm that Cassandra runs on the JRE (or the JDK, since a JRE is provided
with the JDK).
The Oracle JVM (ex Sun Microsystems) is the best choice for running Cassandra
without issues
(there are some problems with the IBM JVM and OpenJDK).

Here's a screenshot of Cassandra 2.1.2 running on my computer (Linux Ubuntu
14.04 x86-64 with JDK 1.6.x).

Regards.
Karim Duran.


2015-02-18 21:40 GMT+01:00 Mark Reddy :

> Cassandra 1.2.18 and Java 1.6 u45.
>
> Planning an upgrade to the 2.x series in the near future along with a bump
> in version of Java.
>
> Regards,
> Mark
>
> On 18 February 2015 at 20:32, cass savy  wrote:
>
>> Thanks, Mark, for the quick response. What version of Cassandra and JDK are
>> you using in prod?
>>
>>
>> On Wed, Feb 18, 2015 at 11:58 AM, Mark Reddy 
>> wrote:
>>
>>> Yes, you can use the Oracle JDK if you prefer; I've been using the JDK with
>>> Cassandra in production for years without issue.
>>>
>>> Regards,
>>> Mark
>>>
>>> On 18 February 2015 at 19:49, cass savy  wrote:
>>>
 Can we install Oracle JDK instead of JRE  in Cassandra servers? We have
 few clusters running JDK when we upgraded to C*2.0.

 Is there any known issue or impact with using  JDK vs JRE?
 What is the reason to not use Oracle JDK in C* servers?
 Is there any performance impact ?

 Please advise.


>>>
>>>
>>
>


Re: Cassandra install on JRE vs JDK

2015-02-18 Thread Mark Reddy
Cassandra 1.2.18 and Java 1.6 u45.

Planning an upgrade to the 2.x series in the near future along with a bump
in version of Java.

Regards,
Mark

On 18 February 2015 at 20:32, cass savy  wrote:

> Thanks, Mark, for the quick response. What version of Cassandra and JDK are you
> using in prod?
>
>
> On Wed, Feb 18, 2015 at 11:58 AM, Mark Reddy 
> wrote:
>
>> Yes, you can use the Oracle JDK if you prefer; I've been using the JDK with
>> Cassandra in production for years without issue.
>>
>> Regards,
>> Mark
>>
>> On 18 February 2015 at 19:49, cass savy  wrote:
>>
>>> Can we install Oracle JDK instead of JRE  in Cassandra servers? We have
>>> few clusters running JDK when we upgraded to C*2.0.
>>>
>>> Is there any known issue or impact with using  JDK vs JRE?
>>> What is the reason to not use Oracle JDK in C* servers?
>>> Is there any performance impact ?
>>>
>>> Please advise.
>>>
>>>
>>
>>
>


Re: Cassandra install on JRE vs JDK

2015-02-18 Thread cass savy
Thanks, Robert, for the quick response. I use the Oracle JDK and not OpenJDK.


On Wed, Feb 18, 2015 at 11:54 AM, Robert Stupp  wrote:

> The “natural” dependency of Cassandra is the JRE (not the JDK) - e.g. in
> the Debian package.
> You should be safe using JRE instead of JDK.
>
> If you’re asking whether to use a non-Oracle JVM - the answer would be:
> use the Oracle JVM.
> OpenJDK might work, but I’d not recommend it.
>
>
> Am 18.02.2015 um 20:49 schrieb cass savy :
>
> Can we install Oracle JDK instead of JRE  in Cassandra servers? We have
> few clusters running JDK when we upgraded to C*2.0.
>
> Is there any known issue or impact with using  JDK vs JRE?
> What is the reason to not use Oracle JDK in C* servers?
> Is there any performance impact ?
>
> Please advise.
>
>
>
> —
> Robert Stupp
> @snazy
>
>


Re: Cassandra install on JRE vs JDK

2015-02-18 Thread cass savy
Thanks, Mark, for the quick response. What version of Cassandra and JDK are you
using in prod?


On Wed, Feb 18, 2015 at 11:58 AM, Mark Reddy  wrote:

> Yes, you can use the Oracle JDK if you prefer; I've been using the JDK with
> Cassandra in production for years without issue.
>
> Regards,
> Mark
>
> On 18 February 2015 at 19:49, cass savy  wrote:
>
>> Can we install Oracle JDK instead of JRE  in Cassandra servers? We have
>> few clusters running JDK when we upgraded to C*2.0.
>>
>> Is there any known issue or impact with using  JDK vs JRE?
>> What is the reason to not use Oracle JDK in C* servers?
>> Is there any performance impact ?
>>
>> Please advise.
>>
>>
>
>


Re: Cassandra install on JRE vs JDK

2015-02-18 Thread Mark Reddy
Yes, you can use the Oracle JDK if you prefer; I've been using the JDK with
Cassandra in production for years without issue.

Regards,
Mark

On 18 February 2015 at 19:49, cass savy  wrote:

> Can we install Oracle JDK instead of JRE  in Cassandra servers? We have
> few clusters running JDK when we upgraded to C*2.0.
>
> Is there any known issue or impact with using  JDK vs JRE?
> What is the reason to not use Oracle JDK in C* servers?
> Is there any performance impact ?
>
> Please advise.
>
>


Re: Cassandra install on JRE vs JDK

2015-02-18 Thread Robert Stupp
The “natural” dependency of Cassandra is the JRE (not the JDK) - e.g. in the
Debian package.
You should be safe using JRE instead of JDK.

If you’re asking whether to use a non-Oracle JVM - the answer would be: use the 
Oracle JVM.
OpenJDK might work, but I’d not recommend it.


> Am 18.02.2015 um 20:49 schrieb cass savy :
> 
> Can we install Oracle JDK instead of JRE  in Cassandra servers? We have few 
> clusters running JDK when we upgraded to C*2.0. 
> 
> Is there any known issue or impact with using  JDK vs JRE?
> What is the reason to not use Oracle JDK in C* servers?
> Is there any performance impact ?
> 
> Please advise.
>  

—
Robert Stupp
@snazy



Cassandra install on JRE vs JDK

2015-02-18 Thread cass savy
Can we install the Oracle JDK instead of the JRE on Cassandra servers? We have
a few clusters running the JDK since we upgraded to C* 2.0.

Is there any known issue or impact with using the JDK vs the JRE?
What is the reason not to use the Oracle JDK on C* servers?
Is there any performance impact?

Please advise.


Re: Deleting Statistics.db at startup

2015-02-18 Thread Robert Coli
On Wed, Feb 18, 2015 at 4:02 AM, Tomer Pearl 
wrote:

>  My question is: what are the consequences of deleting this file every time
> the node is starting up? Performance-wise or otherwise.
>

You waste the time Cassandra spends to regenerate it.

I personally would not institute an operational practice whereby I
regularly purged these files to avoid OOM.

=Rob


Re: C* 2.1.2 invokes oom-killer

2015-02-18 Thread Robert Coli
On Wed, Feb 18, 2015 at 10:28 AM, Michał Łowicki  wrote:

> Couple of times a day 2 out of 4 members cluster nodes are killed
>

This sort of issue is usually best handled/debugged interactively on IRC.

But briefly :

- 2.1.2 is IMO broken for production. Downgrade (officially unsupported but
fine between these versions) to 2.1.1 or upgrade to 2.1.3.
- Beyond that, look at the steady state heap consumption. With 2.1.2, it
would likely take at least 1TB of data to fill heap in steady state to
near-failure.

=Rob


C* 2.1.2 invokes oom-killer

2015-02-18 Thread Michał Łowicki
Hi,

Couple of times a day 2 out of 4 members cluster nodes are killed

root@db4:~# dmesg | grep -i oom
[4811135.792657] [ pid ]   uid  tgid total_vm  rss cpu oom_adj
oom_score_adj name
[6559049.307293] java invoked oom-killer: gfp_mask=0x201da, order=0,
oom_adj=0, oom_score_adj=0

Nodes are using 8GB heap (confirmed with *nodetool info*) and aren't using
row cache.

Noticed that couple of times a day used RSS is growing really fast within
couple of minutes and I see CPU spikes at the same time -
https://www.dropbox.com/s/khco2kdp4qdzjit/Screenshot%202015-02-18%2015.10.54.png?dl=0
.

Could be related to compaction but after compaction is finished used RSS
doesn't shrink. Output from pmap when C* process uses 50GB RAM (out of
64GB) is available on http://paste.ofcode.org/ZjLUA2dYVuKvJHAk9T3Hjb. At
the time dump was made heap usage is far below 8GB (~3GB) but total RSS is
~50GB.
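For anyone chasing a similar heap-vs-RSS gap, a small sketch for comparing the kernel's view of the process with the JVM's (the pgrep pattern is an assumption about the process name):

```shell
# Kernel's view of a process's memory (using the current shell as a stand-in;
# for Cassandra, something like pid=$(pgrep -f CassandraDaemon) -- an assumption).
pid=$$
grep -E 'VmRSS|VmSwap' /proc/$pid/status
# Compare with the JVM's own heap figures from 'nodetool info' or 'jstat -gc';
# a large gap usually points at off-heap memory: mmapped SSTables counted in
# RSS, native allocations, direct buffers, etc.
```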

Any help will be appreciated.

-- 
BR,
Michał Łowicki


Re: Many pending compactions

2015-02-18 Thread Ja Sam
As Al Tobey suggest me I upgraded my 2.1.0 to snaphot version of 2.1.3. I
have now installed exactly this build:
https://cassci.datastax.com/job/cassandra-2.1/912/
I see many compaction which completes, but some of them are really slow.
Maybe I should send some stats form OpsCenter or servers? But it is
difficult to me to choose what is important

Regards



On Wed, Feb 18, 2015 at 6:11 PM, Jake Luciani  wrote:

> Ja, Please upgrade to official 2.1.3 we've fixed many things related to
> compaction.  Are you seeing the compactions % complete progress at all?
>
> On Wed, Feb 18, 2015 at 11:58 AM, Roni Balthazar 
> wrote:
>
>> Try repair -pr on all nodes.
>>
>> If after that you still have issues, you can try to rebuild the SSTables
>> using nodetool upgradesstables or scrub.
>>
>> Regards,
>>
>> Roni Balthazar
>>
> On 18/02/2015, at 14:13, Ja Sam  wrote:
>>
> ad 3)  I did this already yesterday (setcompactionthroughput also). But
> still SSTables are increasing.
>>
>> ad 1) What do you think I should use -pr or try to use incremental?
>>
>>
>>
>> On Wed, Feb 18, 2015 at 4:54 PM, Roni Balthazar 
>> wrote:
>>
>>> You are right... Repair makes the data consistent between nodes.
>>>
>>> I understand that you have 2 issues going on.
>>>
>>> You need to run repair periodically without errors and need to decrease
>>> the numbers of compactions pending.
>>>
>>> So I suggest:
>>>
>>> 1) Run repair -pr on all nodes. If you upgrade to the new 2.1.3, you can
>>> use incremental repairs. There were some bugs on 2.1.2.
>>> 2) Run cleanup on all nodes
>>> 3) Since you have too many cold SSTables, set cold_reads_to_omit to
>>> 0.0, and increase setcompactionthroughput for some time and see if the
>>> number of SSTables is going down.
>>>
>>> Let us know what errors are you getting when running repairs.
>>>
>>> Regards,
>>>
>>> Roni Balthazar
>>>
>>>
>>> On Wed, Feb 18, 2015 at 1:31 PM, Ja Sam  wrote:
>>>
 Can you explain to me the correlation between growing SSTables and
 repair?
 I was sure, until your mail, that repair was only to make data
 consistent between nodes.

 Regards


 On Wed, Feb 18, 2015 at 4:20 PM, Roni Balthazar <
 ronibaltha...@gmail.com> wrote:

> Which error are you getting when running repairs?
> You need to run repair on your nodes within gc_grace_seconds (eg:
> weekly). They have data that are not read frequently. You can run
> "repair -pr" on all nodes. Since you do not have deletes, you will not
> have trouble with that. If you have deletes, it's better to increase
> gc_grace_seconds before the repair.
>
> http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
> After repair, try to run a "nodetool cleanup".
>
> Check if the number of SSTables goes down after that... Pending
> compactions must decrease as well...
>
> Cheers,
>
> Roni Balthazar
>
>
>
>
> On Wed, Feb 18, 2015 at 12:39 PM, Ja Sam  wrote:
> > 1) we tried to run repairs but they usually do not succeed. But we
> had
> > Leveled compaction before. Last week we ALTER tables to STCS,
> because guys
> > from DataStax suggest us that we should not use Leveled and alter
> tables in
> > STCS, because we don't have SSD. After this change we did not run any
> > repair. Anyway I don't think it will change anything in SSTable
> count - if I
> > am wrong please let me know
> >
> > 2) I did this. My tables are 99% write only. It is audit system
> >
> > 3) Yes I am using default values
> >
> > 4) In both operations I am using LOCAL_QUORUM.
> >
> > I am almost sure that READ timeouts happen because of too many
> SSTables.
> > Anyway, firstly I would like to fix the many pending compactions. I
> still
> > don't know how to speed up them.
> >
> >
> > On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar <
> ronibaltha...@gmail.com>
> > wrote:
> >>
> >> Are you running repairs within gc_grace_seconds? (default is 10
> days)
> >>
> >>
> http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
> >>
> >> Double check if you set cold_reads_to_omit to 0.0 on tables with
> STCS
> >> that you do not read often.
> >>
> >> Are you using default values for the properties
> >> min_compaction_threshold(4) and max_compaction_threshold(32)?
> >>
> >> Which Consistency Level are you using for reading operations? Check
> if
> >> you are not reading from DC_B due to your Replication Factor and CL.
> >>
> >>
> http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
> >>
> >>
> >> Cheers,
> >>
> >> Roni Balthazar
> >>
> >> On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam 
> wrote:
> >> > I don't have pro

Re: Many pending compactions

2015-02-18 Thread Jake Luciani
Ja, Please upgrade to official 2.1.3 we've fixed many things related to
compaction.  Are you seeing the compactions % complete progress at all?

On Wed, Feb 18, 2015 at 11:58 AM, Roni Balthazar 
wrote:

> Try repair -pr on all nodes.
>
> If after that you still have issues, you can try to rebuild the SSTables
> using nodetool upgradesstables or scrub.
>
> Regards,
>
> Roni Balthazar
>
> On 18/02/2015, at 14:13, Ja Sam  wrote:
>
> ad 3)  I did this already yesterday (setcompactionthroughput also). But
> still SSTables are increasing.
>
> ad 1) What do you think I should use -pr or try to use incremental?
>
>
>
> On Wed, Feb 18, 2015 at 4:54 PM, Roni Balthazar 
> wrote:
>
>> You are right... Repair makes the data consistent between nodes.
>>
>> I understand that you have 2 issues going on.
>>
>> You need to run repair periodically without errors and need to decrease
>> the numbers of compactions pending.
>>
>> So I suggest:
>>
>> 1) Run repair -pr on all nodes. If you upgrade to the new 2.1.3, you can
>> use incremental repairs. There were some bugs on 2.1.2.
>> 2) Run cleanup on all nodes
>> 3) Since you have too many cold SSTables, set cold_reads_to_omit to 0.0,
>> and increase setcompactionthroughput for some time and see if the number
>> of SSTables is going down.
>>
>> Let us know what errors are you getting when running repairs.
>>
>> Regards,
>>
>> Roni Balthazar
>>
>>
>> On Wed, Feb 18, 2015 at 1:31 PM, Ja Sam  wrote:
>>
>>> Can you explain to me the correlation between growing SSTables and
>>> repair?
>>> I was sure, until your mail, that repair was only to make data
>>> consistent between nodes.
>>>
>>> Regards
>>>
>>>
>>> On Wed, Feb 18, 2015 at 4:20 PM, Roni Balthazar >> > wrote:
>>>
 Which error are you getting when running repairs?
 You need to run repair on your nodes within gc_grace_seconds (eg:
 weekly). They have data that are not read frequently. You can run
 "repair -pr" on all nodes. Since you do not have deletes, you will not
 have trouble with that. If you have deletes, it's better to increase
 gc_grace_seconds before the repair.

 http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
 After repair, try to run a "nodetool cleanup".

 Check if the number of SSTables goes down after that... Pending
 compactions must decrease as well...

 Cheers,

 Roni Balthazar




 On Wed, Feb 18, 2015 at 12:39 PM, Ja Sam  wrote:
 > 1) we tried to run repairs but they usually does not succeed. But we
 had
 > Leveled compaction before. Last week we ALTER tables to STCS, because
 guys
 > from DataStax suggest us that we should not use Leveled and alter
 tables in
 > STCS, because we don't have SSD. After this change we did not run any
 > repair. Anyway I don't think it will change anything in SSTable count
 - if I
 > am wrong please give me an information
 >
 > 2) I did this. My tables are 99% write only. It is audit system
 >
 > 3) Yes I am using default values
 >
 > 4) In both operations I am using LOCAL_QUORUM.
 >
 > I am almost sure that READ timeout happens because of too much
 SSTables.
 > Anyway firstly I would like to fix to many pending compactions. I
 still
 > don't know how to speed up them.
 >
 >
 > On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar <
 ronibaltha...@gmail.com>
 > wrote:
 >>
 >> Are you running repairs within gc_grace_seconds? (default is 10 days)
 >>
 >>
 http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
 >>
 >> Double check if you set cold_reads_to_omit to 0.0 on tables with STCS
 >> that you do not read often.
 >>
 >> Are you using default values for the properties
 >> min_compaction_threshold(4) and max_compaction_threshold(32)?
 >>
 >> Which Consistency Level are you using for reading operations? Check
 if
 >> you are not reading from DC_B due to your Replication Factor and CL.
 >>
 >>
 http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
 >>
 >>
 >> Cheers,
 >>
 >> Roni Balthazar
 >>
 >> On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam 
 wrote:
 >> > I don't have problems with DC_B (replica) only in DC_A(my system
 write
 >> > only
 >> > to it) I have read timeouts.
 >> >
 >> > I checked in OpsCenter SSTable count  and I have:
 >> > 1) in DC_A  same +-10% for last week, a small increase for last
 24h (it
 >> > is
 >> > more than 15000-2 SSTables depends on node)
 >> > 2) in DC_B last 24h shows up to 50% decrease, which give nice
 >> > prognostics.
 >> > Now I have less then 1000 SSTables
 >> >
 >> > What did you measure during system optimizations? Or do you have
 an idea
 >> > what m

Re: Many pending compactions

2015-02-18 Thread Roni Balthazar
Try repair -pr on all nodes.

If after that you still have issues, you can try to rebuild the SSTables using 
nodetool upgradesstables or scrub.

Regards,

Roni Balthazar
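As a rough sketch, the sequence suggested above could be scripted along these lines (the node addresses are placeholders, and the commands are only echoed so the plan can be reviewed before anything runs):

```shell
#!/bin/sh
# Sketch: "repair -pr" on every node, with an SSTable rebuild command
# (upgradesstables; scrub is the other option) ready as a fallback for
# nodes where repair keeps failing. NODES is an assumption -- replace
# with your actual node addresses.
NODES="10.0.0.1 10.0.0.2 10.0.0.3"

# Commands are echoed rather than executed; drop the echo to run them.
repair_cmd()  { echo "nodetool -h $1 repair -pr"; }
rebuild_cmd() { echo "nodetool -h $1 upgradesstables"; }

for node in $NODES; do
    repair_cmd "$node"
done
```

Running the rebuild step only on the nodes whose repair failed keeps the extra compaction load localized.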


Re: Many pending compactions

2015-02-18 Thread Ja Sam
ad 3) I already did this yesterday (I also ran setcompactionthroughput), but
SSTables are still increasing.

ad 1) What do you think I should use: -pr, or incremental repair?




Re: Many pending compactions

2015-02-18 Thread Roni Balthazar
You are right... Repair makes the data consistent between nodes.

I understand that you have two issues going on.

You need to run repair periodically without errors, and you need to decrease
the number of pending compactions.

So I suggest:

1) Run repair -pr on all nodes. If you upgrade to the new 2.1.3, you can
use incremental repairs; there were some bugs in 2.1.2.
2) Run cleanup on all nodes.
3) Since you have too many cold SSTables, set cold_reads_to_omit to 0.0,
increase the compaction throughput (nodetool setcompactionthroughput) for a
while, and see whether the number of SSTables goes down.

Let us know what errors you are getting when running repairs.

Regards,

Roni Balthazar
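The three numbered steps above might look like this on the command line (keyspace, table, and throughput value are placeholder assumptions; everything is echoed for review rather than executed):

```shell
#!/bin/sh
# Sketch of the suggested steps. KS/TBL are assumptions.
KS="my_keyspace"; TBL="my_table"

# Step 3: allow STCS to compact cold SSTables (cold_reads_to_omit = 0.0)...
alter_cmd() {
    echo "cqlsh -e \"ALTER TABLE $1.$2 WITH compaction = {'class': 'SizeTieredCompactionStrategy', 'cold_reads_to_omit': 0.0};\""
}
# ...and temporarily raise the compaction throughput cap (MB/s).
throttle_cmd() { echo "nodetool setcompactionthroughput $1"; }
# Step 2: drop data each node no longer owns.
cleanup_cmd()  { echo "nodetool cleanup"; }

alter_cmd "$KS" "$TBL"
throttle_cmd 64
cleanup_cmd
```

Remember to set the throughput back to its normal value once the backlog has drained.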



Re: Many pending compactions

2015-02-18 Thread Marcelo Valle (BLOOMBERG/ LONDON)
Cassandra 2.1 comes with incremental repair, though I haven't read the details
myself:
http://www.datastax.com/documentation/cassandra/2.1/cassandra/operations/ops_repair_nodes_c.html

However, AFAIK, a full repair will rebuild all SSTables, which is why you
should have more than 50% of disk space available on each node. Of course, it
will also make sure data is replicated to the right nodes in the process.

[]s
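A quick way to apply that 50%-free rule of thumb before launching a full repair, sketched with a placeholder data directory (the default here is the current directory purely so the sketch runs anywhere; point it at your Cassandra data directory in practice):

```shell
#!/bin/sh
# Sketch: check that the filesystem holding the data directory is under
# 50% used before starting a full repair. DATA_DIR default is an
# illustration-only assumption.
DATA_DIR="${1:-.}"

# "Use%" of the filesystem holding $1, as a bare number.
used_pct() { df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'; }

# True when usage leaves the headroom a full repair may need for
# rebuilt SSTables.
check_headroom() { [ "$1" -lt 50 ]; }

if check_headroom "$(used_pct "$DATA_DIR")"; then
    echo "enough headroom for a full repair"
else
    echo "less than 50% free: risky to run a full repair"
fi
```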

Re: Many pending compactions

2015-02-18 Thread Ja Sam
Can you explain the correlation between growing SSTables and repair?
Until your mail, I was sure that repair only makes data consistent
between nodes.

Regards



Re: Many pending compactions

2015-02-18 Thread Roni Balthazar
Which error are you getting when running repairs?
You need to run repair on your nodes within gc_grace_seconds (e.g.
weekly). They hold data that is not read frequently. You can run
"repair -pr" on all nodes. Since you do not have deletes, you will not
have trouble with that; if you did have deletes, it would be better to
increase gc_grace_seconds before the repair.
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
After repair, try to run a "nodetool cleanup".

Check if the number of SSTables goes down after that... Pending
compactions must decrease as well...

Cheers,

Roni Balthazar




On Wed, Feb 18, 2015 at 12:39 PM, Ja Sam  wrote:
> 1) we tried to run repairs but they usually does not succeed. But we had
> Leveled compaction before. Last week we ALTER tables to STCS, because guys
> from DataStax suggest us that we should not use Leveled and alter tables in
> STCS, because we don't have SSD. After this change we did not run any
> repair. Anyway I don't think it will change anything in SSTable count - if I
> am wrong please give me an information
>
> 2) I did this. My tables are 99% write only. It is audit system
>
> 3) Yes I am using default values
>
> 4) In both operations I am using LOCAL_QUORUM.
>
> I am almost sure that READ timeout happens because of too much SSTables.
> Anyway firstly I would like to fix to many pending compactions. I still
> don't know how to speed up them.
>
>
> On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar 
> wrote:
>>
>> Are you running repairs within gc_grace_seconds? (default is 10 days)
>>
>> http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
>>
>> Double check if you set cold_reads_to_omit to 0.0 on tables with STCS
>> that you do not read often.
>>
>> Are you using default values for the properties
>> min_compaction_threshold(4) and max_compaction_threshold(32)?
>>
>> Which Consistency Level are you using for reading operations? Check if
>> you are not reading from DC_B due to your Replication Factor and CL.
>>
>> http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
>>
>>
>> Cheers,
>>
>> Roni Balthazar
>>
>> On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam  wrote:
>> > I don't have problems with DC_B (replica) only in DC_A(my system write
>> > only
>> > to it) I have read timeouts.
>> >
>> > I checked in OpsCenter SSTable count  and I have:
>> > 1) in DC_A  same +-10% for last week, a small increase for last 24h (it
>> > is
>> > more than 15000-2 SSTables depends on node)
>> > 2) in DC_B last 24h shows up to 50% decrease, which give nice
>> > prognostics.
>> > Now I have less then 1000 SSTables
>> >
>> > What did you measure during system optimizations? Or do you have an idea
>> > what more should I check?
>> > 1) I look at CPU Idle (one node is 50% idle, rest 70% idle)
>> > 2) Disk queue -> mostly is it near zero: avg 0.09. Sometimes there are
>> > spikes
>> > 3) system RAM usage is almost full
>> > 4) In Total Bytes Compacted most most lines are below 3MB/s. For total
>> > DC_A
>> > it is less than 10MB/s, in DC_B it looks much better (avg is like
>> > 17MB/s)
>> >
>> > something else?
>> >
>> >
>> >
>> > On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar
>> > 
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> You can check if the number of SSTables is decreasing. Look for the
>> >> "SSTable count" information of your tables using "nodetool cfstats".
>> >> The compaction history can be viewed using "nodetool
>> >> compactionhistory".
>> >>
>> >> About the timeouts, check this out:
>> >>
>> >> http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
>> >> Also try to run "nodetool tpstats" to see the threads statistics. It
>> >> can lead you to know if you are having performance problems. If you
>> >> are having too many pending tasks or dropped messages, maybe will you
>> >> need to tune your system (eg: driver's timeout, concurrent reads and
>> >> so on)
>> >>
>> >> Regards,
>> >>
>> >> Roni Balthazar
>> >>
>> >> On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam  wrote:
>> >> > Hi,
>> >> > Thanks for your "tip" it looks that something changed - I still don't
>> >> > know
>> >> > if it is ok.
>> >> >
>> >> > My nodes started to do more compaction, but it looks that some
>> >> > compactions
>> >> > are really slow.
>> >> > In IO we have idle, CPU is quite ok (30%-40%). We set
>> >> > compactionthrouput
>> >> > to
>> >> > 999, but I do not see difference.
>> >> >
>> >> > Can we check something more? Or do you have any method to monitor
>> >> > progress
>> >> > with small files?
>> >> >
>> >> > Regards
>> >> >
>> >> > On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar
>> >> > 
>> >> > wrote:
>> >> >>
>> >> >> HI,
>> >> >>
>> >> >> Yes... I had the same issue and setting cold_reads_to_omit to 0.0
>> >> >> was
>> >> >> the solution...
>> >> >> The number of SSTables decreased from many thousands to a number
>> >> >> below
>> >> >> a hundred and the SSTables a

Re: Many pending compactions

2015-02-18 Thread Ja Sam
1) We tried to run repairs, but they usually do not succeed. We had Leveled
compaction before; last week we altered the tables to STCS, because the guys
from DataStax suggested we should not use Leveled compaction since we don't
have SSDs. After this change we have not run any repair. Anyway, I don't think
repair will change anything in the SSTable count - if I am wrong, please let
me know.

2) I did this. My tables are 99% write-only; it is an audit system.

3) Yes, I am using the default values.

4) In both operations I am using LOCAL_QUORUM.

I am almost sure that the READ timeouts happen because of too many SSTables.
First, though, I would like to fix the many pending compactions; I still
don't know how to speed them up.
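One way to watch whether the backlog is actually shrinking is to scrape the pending-task counter out of captured "nodetool compactionstats" output and track it over time. A minimal sketch (the sample text and its numbers are made up for illustration):

```python
import re

def pending_tasks(compactionstats_output: str) -> int:
    """Extract the pending-task count from `nodetool compactionstats` text."""
    m = re.search(r"pending tasks:\s*(\d+)", compactionstats_output)
    if m is None:
        raise ValueError("no 'pending tasks' line found")
    return int(m.group(1))

# Captured output from one node (fabricated example).
sample = """pending tasks: 1347
   compaction type   keyspace   table    completed      total   unit   progress
        Compaction        prod   audit     91620254   571210342  bytes    16.04%
"""
print(pending_tasks(sample))  # -> 1347
```

Running this periodically (e.g. from cron, once per node) and logging the number gives a simple trend line for whether compactions are keeping up.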


On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar 
wrote:

> Are you running repairs within gc_grace_seconds? (default is 10 days)
>
> http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
>
> Double check if you set cold_reads_to_omit to 0.0 on tables with STCS
> that you do not read often.
>
> Are you using default values for the properties
> min_compaction_threshold(4) and max_compaction_threshold(32)?
>
> Which Consistency Level are you using for reading operations? Check if
> you are not reading from DC_B due to your Replication Factor and CL.
>
> http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
>
>
> Cheers,
>
> Roni Balthazar
>
> On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam  wrote:
> > I don't have problems with DC_B (replica) only in DC_A(my system write
> only
> > to it) I have read timeouts.
> >
> > I checked in OpsCenter SSTable count  and I have:
> > 1) in DC_A  same +-10% for last week, a small increase for last 24h (it
> is
> > more than 15000-2 SSTables depends on node)
> > 2) in DC_B last 24h shows up to 50% decrease, which give nice
> prognostics.
> > Now I have less then 1000 SSTables
> >
> > What did you measure during system optimizations? Or do you have an idea
> > what more should I check?
> > 1) I look at CPU Idle (one node is 50% idle, rest 70% idle)
> > 2) Disk queue -> mostly is it near zero: avg 0.09. Sometimes there are
> > spikes
> > 3) system RAM usage is almost full
> > 4) In Total Bytes Compacted most most lines are below 3MB/s. For total
> DC_A
> > it is less than 10MB/s, in DC_B it looks much better (avg is like 17MB/s)
> >
> > something else?
> >
> >
> >
> > On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar  >
> > wrote:
> >>
> >> Hi,
> >>
> >> You can check if the number of SSTables is decreasing. Look for the
> >> "SSTable count" information of your tables using "nodetool cfstats".
> >> The compaction history can be viewed using "nodetool
> >> compactionhistory".
> >>
> >> About the timeouts, check this out:
> >>
> http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
> >> Also try to run "nodetool tpstats" to see the threads statistics. It
> >> can lead you to know if you are having performance problems. If you
> >> are having too many pending tasks or dropped messages, maybe will you
> >> need to tune your system (eg: driver's timeout, concurrent reads and
> >> so on)
> >>
> >> Regards,
> >>
> >> Roni Balthazar
> >>
> >> On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam  wrote:
> >> > Hi,
> >> > Thanks for your "tip" it looks that something changed - I still don't
> >> > know
> >> > if it is ok.
> >> >
> >> > My nodes started to do more compaction, but it looks that some
> >> > compactions
> >> > are really slow.
> >> > In IO we have idle, CPU is quite ok (30%-40%). We set
> compactionthrouput
> >> > to
> >> > 999, but I do not see difference.
> >> >
> >> > Can we check something more? Or do you have any method to monitor
> >> > progress
> >> > with small files?
> >> >
> >> > Regards
> >> >
> >> > On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar
> >> > 
> >> > wrote:
> >> >>
> >> >> HI,
> >> >>
> >> >> Yes... I had the same issue and setting cold_reads_to_omit to 0.0 was
> >> >> the solution...
> >> >> The number of SSTables decreased from many thousands to a number
> below
> >> >> a hundred and the SSTables are now much bigger with several gigabytes
> >> >> (most of them).
> >> >>
> >> >> Cheers,
> >> >>
> >> >> Roni Balthazar
> >> >>
> >> >>
> >> >>
> >> >> On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam 
> wrote:
> >> >> > After some diagnostic ( we didn't set yet cold_reads_to_omit ).
> >> >> > Compaction
> >> >> > are running but VERY slow with "idle" IO.
> >> >> >
> >> >> > We had a lot of "Data files" in Cassandra. In DC_A it is about
> >> >> > ~12
> >> >> > (only
> >> >> > xxx-Data.db) in DC_B has only ~4000.
> >> >> >
> >> >> > I don't know if this change anything but:
> >> >> > 1) in DC_A avg size of Data.db file is ~13 mb. I have few a really
> >> >> > big
> >> >> > ones,
> >> >> > but most is really small (almost 1 files are less then 100mb).
> >> >> > 2) in DC_B avg size of Data.db is much bigger ~260mb.
> >> >> >
> >> >> > Do you think 

Re: Many pending compactions

2015-02-18 Thread Roni Balthazar
Are you running repairs within gc_grace_seconds? (default is 10 days)
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html

Double check if you set cold_reads_to_omit to 0.0 on tables with STCS
that you do not read often.
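For reference, cold_reads_to_omit is a per-table compaction suboption of Size-Tiered compaction in the 2.0/2.1 series, set via ALTER TABLE. A small sketch that builds the corresponding statement - the keyspace and table names are placeholders; run the printed statement through cqlsh against your own schema:

```python
# Build the ALTER TABLE statement that sets cold_reads_to_omit to 0.0
# for an STCS table. my_ks.my_table is a placeholder name.
compaction_options = {
    "class": "SizeTieredCompactionStrategy",
    "cold_reads_to_omit": "0.0",
}
opts = ", ".join(f"'{k}': '{v}'" for k, v in compaction_options.items())
stmt = f"ALTER TABLE my_ks.my_table WITH compaction = {{{opts}}};"
print(stmt)
```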

Are you using default values for the properties
min_compaction_threshold(4) and max_compaction_threshold(32)?

Which Consistency Level are you using for reading operations? Check if
you are not reading from DC_B due to your Replication Factor and CL.
http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html


Cheers,

Roni Balthazar

On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam  wrote:
> I don't have problems with DC_B (replica) only in DC_A(my system write only
> to it) I have read timeouts.
>
> I checked in OpsCenter SSTable count  and I have:
> 1) in DC_A  same +-10% for last week, a small increase for last 24h (it is
> more than 15000-2 SSTables depends on node)
> 2) in DC_B last 24h shows up to 50% decrease, which give nice prognostics.
> Now I have less then 1000 SSTables
>
> What did you measure during system optimizations? Or do you have an idea
> what more should I check?
> 1) I look at CPU Idle (one node is 50% idle, rest 70% idle)
> 2) Disk queue -> mostly is it near zero: avg 0.09. Sometimes there are
> spikes
> 3) system RAM usage is almost full
> 4) In Total Bytes Compacted most most lines are below 3MB/s. For total DC_A
> it is less than 10MB/s, in DC_B it looks much better (avg is like 17MB/s)
>
> something else?
>
>
>
> On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar 
> wrote:
>>
>> Hi,
>>
>> You can check if the number of SSTables is decreasing. Look for the
>> "SSTable count" information of your tables using "nodetool cfstats".
>> The compaction history can be viewed using "nodetool
>> compactionhistory".
>>
>> About the timeouts, check this out:
>> http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
>> Also try to run "nodetool tpstats" to see the threads statistics. It
>> can lead you to know if you are having performance problems. If you
>> are having too many pending tasks or dropped messages, maybe will you
>> need to tune your system (eg: driver's timeout, concurrent reads and
>> so on)
>>
>> Regards,
>>
>> Roni Balthazar
>>
>> On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam  wrote:
>> > Hi,
>> > Thanks for your "tip" it looks that something changed - I still don't
>> > know
>> > if it is ok.
>> >
>> > My nodes started to do more compaction, but it looks that some
>> > compactions
>> > are really slow.
>> > In IO we have idle, CPU is quite ok (30%-40%). We set compactionthrouput
>> > to
>> > 999, but I do not see difference.
>> >
>> > Can we check something more? Or do you have any method to monitor
>> > progress
>> > with small files?
>> >
>> > Regards
>> >
>> > On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar
>> > 
>> > wrote:
>> >>
>> >> HI,
>> >>
>> >> Yes... I had the same issue and setting cold_reads_to_omit to 0.0 was
>> >> the solution...
>> >> The number of SSTables decreased from many thousands to a number below
>> >> a hundred and the SSTables are now much bigger with several gigabytes
>> >> (most of them).
>> >>
>> >> Cheers,
>> >>
>> >> Roni Balthazar
>> >>
>> >>
>> >>
>> >> On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam  wrote:
>> >> > After some diagnostic ( we didn't set yet cold_reads_to_omit ).
>> >> > Compaction
>> >> > are running but VERY slow with "idle" IO.
>> >> >
>> >> > We had a lot of "Data files" in Cassandra. In DC_A it is about
>> >> > ~12
>> >> > (only
>> >> > xxx-Data.db) in DC_B has only ~4000.
>> >> >
>> >> > I don't know if this change anything but:
>> >> > 1) in DC_A avg size of Data.db file is ~13 mb. I have few a really
>> >> > big
>> >> > ones,
>> >> > but most is really small (almost 1 files are less then 100mb).
>> >> > 2) in DC_B avg size of Data.db is much bigger ~260mb.
>> >> >
>> >> > Do you think that above flag will help us?
>> >> >
>> >> >
>> >> > On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam  wrote:
>> >> >>
>> >> >> I set setcompactionthroughput 999 permanently and it doesn't change
>> >> >> anything. IO is still same. CPU is idle.
>> >> >>
>> >> >> On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar
>> >> >> 
>> >> >> wrote:
>> >> >>>
>> >> >>> Hi,
>> >> >>>
>> >> >>> You can run "nodetool compactionstats" to view statistics on
>> >> >>> compactions.
>> >> >>> Setting cold_reads_to_omit to 0.0 can help to reduce the number of
>> >> >>> SSTables when you use Size-Tiered compaction.
>> >> >>> You can also create a cron job to increase the value of
>> >> >>> setcompactionthroughput during the night or when your IO is not
>> >> >>> busy.
>> >> >>>
>> >> >>> From http://wiki.apache.org/cassandra/NodeTool:
>> >> >>> 0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
>> >> >>> 0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
>> >> >>>
>> >> >>> Cheers,
>> >> >>>
>> >> >>> Roni Balthazar
>> >> >>>
>> >> >

Re: Adding new node to cluster

2015-02-18 Thread Jonathan Lacefield
Hello,

  Please note that DataStax has updated the documentation for replacing a
seed node.  The new docs outline a simplified process to help avoid the
confusion on this topic.


http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_seed_node.html

Jonathan


On Tue, Feb 17, 2015 at 8:04 PM, Robert Coli  wrote:

> On Tue, Feb 17, 2015 at 2:25 PM,  wrote:
>
>>  SimpleSnitch is not rack aware. You would want to choose seed nodes and
>> then not change them. Seed nodes apparently don’t bootstrap.
>>
>
> No one seems to know what a "seed node" actually *is*, but "seed nodes"
> can in fact bootstrap. They just have to temporarily forget to tell
> themselves that they are a seed node while bootstrapping, and then other
> nodes will still gossip to it as a seed once it comes up, even though it
> doesn't consider itself a seed.
>
>
> https://issues.apache.org/jira/browse/CASSANDRA-5836?focusedCommentId=13727032&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13727032
> "
>
> Replacing a seed node is a very common operation, and this best practice
> is confusing/poorly documented. There are regular contacts to
> #cassandra/cassandra-user@ where people ask how to replace a seed node,
> and are confused by the answer. The workaround also means that, if you do
> not restart your node after bootstrapping it (and changing the conf file
> back to indicate to itself that it is a seed) the node runs until next
> restart without any understanding that it is a seed node.
>
> Being a seed node appears to mean two things :
>
> 1) I have myself as an entry in my own seed list, so I know that I am a
> seed.
> 2) Other nodes have me in their seed list, so they consider me a seed.
>
> The current code checks for 1) and refuses to bootstrap. The workaround is
> to remove the 1) state temporarily. But if it is unsafe to bootstrap a seed
> node because of either 1) or 2), the workaround is unsafe.
>
> Can you explicate the special cases here? I sincerely would like to
> understand why the code tries to prevent "a seed" from bootstrapping when
> one can clearly, and apparently safely, bootstrap "a seed".
>
> "
>
>
> Unfortunately, there has been no answer.
>
>
> =Rob
>
>
>
>
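The two conditions quoted above - "I have myself in my own seeds list" versus "other nodes have me in their seeds lists" - can be sketched as a tiny model (IPs are made up; this only illustrates the distinction, not Cassandra's actual implementation):

```python
# Condition 1): the node believes it is a seed, which is what the
# bootstrap check looks at.
def considers_itself_seed(node_ip: str, own_seed_list: list[str]) -> bool:
    return node_ip in own_seed_list

# Condition 2): other nodes treat it as a seed for gossip purposes.
def is_seed_for(node_ip: str, other_nodes_seed_lists: dict[str, list[str]]) -> bool:
    return any(node_ip in seeds for seeds in other_nodes_seed_lists.values())

# The documented workaround removes the node from its OWN seeds list before
# bootstrapping, so condition 1) is false while condition 2) may still hold.
assert not considers_itself_seed("10.0.0.1", ["10.0.0.2", "10.0.0.3"])
assert is_seed_for("10.0.0.1", {"10.0.0.2": ["10.0.0.1"],
                                "10.0.0.3": ["10.0.0.1"]})
```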


Re: Many pending compactions

2015-02-18 Thread Ja Sam
I don't have problems with DC_B (the replica); only in DC_A (my system writes
only to it) do I have read timeouts.

I checked the SSTable count in OpsCenter and I have:
1) in DC_A roughly the same (+-10%) over the last week, with a small increase
in the last 24h (it is more than 15000-2 SSTables, depending on the node)
2) in DC_B the last 24h shows up to a 50% decrease, which is a promising sign.
Now I have fewer than 1000 SSTables.

What did you measure during your system optimizations? Or do you have an idea
what more I should check?
1) I look at CPU idle (one node is 50% idle, the rest 70% idle)
2) Disk queue -> mostly near zero (avg 0.09), with occasional spikes
3) System RAM usage is almost full
4) In Total Bytes Compacted, most lines are below 3 MB/s. The DC_A total is
less than 10 MB/s; DC_B looks much better (avg is around 17 MB/s)

Anything else?
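The throughput numbers quoted in this thread can be put in perspective with a quick back-of-envelope estimate of how long the DC_A backlog would take to drain. All figures are rough values taken from the thread, and STCS rewrites data more than once, so the real drain time is longer:

```python
# Rough drain-time estimate for the DC_A compaction backlog.
# Numbers come from this thread and are approximate.
sstables = 15_000     # approximate SSTable count on a DC_A node
avg_size_mb = 13      # average Data.db size in DC_A, per the thread
rate_mb_s = 10        # observed total compaction rate for DC_A

total_mb = sstables * avg_size_mb
hours = total_mb / rate_mb_s / 3600
print(f"{total_mb} MB backlog -> {hours:.1f} h at {rate_mb_s} MB/s")
```

Even under these optimistic assumptions the backlog takes hours to clear, which is why the thread focuses on raising effective compaction throughput rather than waiting it out.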



On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar 
wrote:

> Hi,
>
> You can check if the number of SSTables is decreasing. Look for the
> "SSTable count" information of your tables using "nodetool cfstats".
> The compaction history can be viewed using "nodetool
> compactionhistory".
>
> About the timeouts, check this out:
> http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
> Also try to run "nodetool tpstats" to see the threads statistics. It
> can lead you to know if you are having performance problems. If you
> are having too many pending tasks or dropped messages, maybe will you
> need to tune your system (eg: driver's timeout, concurrent reads and
> so on)
>
> Regards,
>
> Roni Balthazar
>
> On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam  wrote:
> > Hi,
> > Thanks for your "tip" it looks that something changed - I still don't
> know
> > if it is ok.
> >
> > My nodes started to do more compaction, but it looks that some
> compactions
> > are really slow.
> > In IO we have idle, CPU is quite ok (30%-40%). We set compactionthrouput
> to
> > 999, but I do not see difference.
> >
> > Can we check something more? Or do you have any method to monitor
> progress
> > with small files?
> >
> > Regards
> >
> > On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar  >
> > wrote:
> >>
> >> HI,
> >>
> >> Yes... I had the same issue and setting cold_reads_to_omit to 0.0 was
> >> the solution...
> >> The number of SSTables decreased from many thousands to a number below
> >> a hundred and the SSTables are now much bigger with several gigabytes
> >> (most of them).
> >>
> >> Cheers,
> >>
> >> Roni Balthazar
> >>
> >>
> >>
> >> On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam  wrote:
> >> > After some diagnostic ( we didn't set yet cold_reads_to_omit ).
> >> > Compaction
> >> > are running but VERY slow with "idle" IO.
> >> >
> >> > We had a lot of "Data files" in Cassandra. In DC_A it is about ~12
> >> > (only
> >> > xxx-Data.db) in DC_B has only ~4000.
> >> >
> >> > I don't know if this change anything but:
> >> > 1) in DC_A avg size of Data.db file is ~13 mb. I have few a really big
> >> > ones,
> >> > but most is really small (almost 1 files are less then 100mb).
> >> > 2) in DC_B avg size of Data.db is much bigger ~260mb.
> >> >
> >> > Do you think that above flag will help us?
> >> >
> >> >
> >> > On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam  wrote:
> >> >>
> >> >> I set setcompactionthroughput 999 permanently and it doesn't change
> >> >> anything. IO is still same. CPU is idle.
> >> >>
> >> >> On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar
> >> >> 
> >> >> wrote:
> >> >>>
> >> >>> Hi,
> >> >>>
> >> >>> You can run "nodetool compactionstats" to view statistics on
> >> >>> compactions.
> >> >>> Setting cold_reads_to_omit to 0.0 can help to reduce the number of
> >> >>> SSTables when you use Size-Tiered compaction.
> >> >>> You can also create a cron job to increase the value of
> >> >>> setcompactionthroughput during the night or when your IO is not
> busy.
> >> >>>
> >> >>> From http://wiki.apache.org/cassandra/NodeTool:
> >> >>> 0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
> >> >>> 0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
> >> >>>
> >> >>> Cheers,
> >> >>>
> >> >>> Roni Balthazar
> >> >>>
> >> >>> On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam 
> wrote:
> >> >>> > One think I do not understand. In my case compaction is running
> >> >>> > permanently.
> >> >>> > Is there a way to check which compaction is pending? The only
> >> >>> > information is
> >> >>> > about total count.
> >> >>> >
> >> >>> >
> >> >>> > On Monday, February 16, 2015, Ja Sam  wrote:
> >> >>> >>
> >> >>> >> Of couse I made a mistake. I am using 2.1.2. Anyway night build
> is
> >> >>> >> available from
> >> >>> >> http://cassci.datastax.com/job/cassandra-2.1/
> >> >>> >>
> >> >>> >> I read about cold_reads_to_omit It looks promising. Should I set
> >> >>> >> also
> >> >>> >> compaction throughput?
> >> >>> >>
> >> >>> >> p.s. I am really sad that I didn't read this before:
> >> >>> >>
> >> >>> >>
> >> >>> >>
> https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
>

Re: Many pending compactions

2015-02-18 Thread Roni Balthazar
Hi,

You can check if the number of SSTables is decreasing. Look for the
"SSTable count" information of your tables using "nodetool cfstats".
The compaction history can be viewed using "nodetool
compactionhistory".

About the timeouts, check this out:
http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
Also try running "nodetool tpstats" to see the thread pool statistics. It can
help you determine whether you are having performance problems. If you have
too many pending tasks or dropped messages, you may need to tune your system
(e.g. the driver's timeout, concurrent reads, and so on).
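The dropped-message section at the bottom of "nodetool tpstats" output is easy to check mechanically. A sketch that flags non-zero counters in a captured sample (the sample and its numbers are fabricated):

```python
# Parse the "Message type / Dropped" section of captured `nodetool tpstats`
# output and report only message types with drops.
sample = """Message type           Dropped
READ                        12
MUTATION                     0
"""

dropped = {}
for line in sample.splitlines()[1:]:   # skip the header row
    parts = line.split()
    if len(parts) == 2 and parts[1].isdigit():
        dropped[parts[0]] = int(parts[1])

print({k: v for k, v in dropped.items() if v > 0})  # -> {'READ': 12}
```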

Regards,

Roni Balthazar

On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam  wrote:
> Hi,
> Thanks for your "tip" it looks that something changed - I still don't know
> if it is ok.
>
> My nodes started to do more compaction, but it looks that some compactions
> are really slow.
> In IO we have idle, CPU is quite ok (30%-40%). We set compactionthrouput to
> 999, but I do not see difference.
>
> Can we check something more? Or do you have any method to monitor progress
> with small files?
>
> Regards
>
> On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar 
> wrote:
>>
>> HI,
>>
>> Yes... I had the same issue and setting cold_reads_to_omit to 0.0 was
>> the solution...
>> The number of SSTables decreased from many thousands to a number below
>> a hundred and the SSTables are now much bigger with several gigabytes
>> (most of them).
>>
>> Cheers,
>>
>> Roni Balthazar
>>
>>
>>
>> On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam  wrote:
>> > After some diagnostic ( we didn't set yet cold_reads_to_omit ).
>> > Compaction
>> > are running but VERY slow with "idle" IO.
>> >
>> > We had a lot of "Data files" in Cassandra. In DC_A it is about ~12
>> > (only
>> > xxx-Data.db) in DC_B has only ~4000.
>> >
>> > I don't know if this change anything but:
>> > 1) in DC_A avg size of Data.db file is ~13 mb. I have few a really big
>> > ones,
>> > but most is really small (almost 1 files are less then 100mb).
>> > 2) in DC_B avg size of Data.db is much bigger ~260mb.
>> >
>> > Do you think that above flag will help us?
>> >
>> >
>> > On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam  wrote:
>> >>
>> >> I set setcompactionthroughput 999 permanently and it doesn't change
>> >> anything. IO is still same. CPU is idle.
>> >>
>> >> On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar
>> >> 
>> >> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> You can run "nodetool compactionstats" to view statistics on
>> >>> compactions.
>> >>> Setting cold_reads_to_omit to 0.0 can help to reduce the number of
>> >>> SSTables when you use Size-Tiered compaction.
>> >>> You can also create a cron job to increase the value of
>> >>> setcompactionthroughput during the night or when your IO is not busy.
>> >>>
>> >>> From http://wiki.apache.org/cassandra/NodeTool:
>> >>> 0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
>> >>> 0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
>> >>>
>> >>> Cheers,
>> >>>
>> >>> Roni Balthazar
>> >>>
>> >>> On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam  wrote:
>> >>> > One think I do not understand. In my case compaction is running
>> >>> > permanently.
>> >>> > Is there a way to check which compaction is pending? The only
>> >>> > information is
>> >>> > about total count.
>> >>> >
>> >>> >
>> >>> > On Monday, February 16, 2015, Ja Sam  wrote:
>> >>> >>
>> >>> >> Of couse I made a mistake. I am using 2.1.2. Anyway night build is
>> >>> >> available from
>> >>> >> http://cassci.datastax.com/job/cassandra-2.1/
>> >>> >>
>> >>> >> I read about cold_reads_to_omit It looks promising. Should I set
>> >>> >> also
>> >>> >> compaction throughput?
>> >>> >>
>> >>> >> p.s. I am really sad that I didn't read this before:
>> >>> >>
>> >>> >>
>> >>> >> https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> On Monday, February 16, 2015, Carlos Rolo  wrote:
>> >>> >>>
>> >>> >>> Hi 100% in agreement with Roland,
>> >>> >>>
>> >>> >>> 2.1.x series is a pain! I would never recommend the current 2.1.x
>> >>> >>> series
>> >>> >>> for production.
>> >>> >>>
>> >>> >>> Clocks is a pain, and check your connectivity! Also check tpstats
>> >>> >>> to
>> >>> >>> see
>> >>> >>> if your threadpools are being overrun.
>> >>> >>>
>> >>> >>> Regards,
>> >>> >>>
>> >>> >>> Carlos Juzarte Rolo
>> >>> >>> Cassandra Consultant
>> >>> >>>
>> >>> >>> Pythian - Love your data
>> >>> >>>
>> >>> >>> rolo@pythian | Twitter: cjrolo | Linkedin:
>> >>> >>> linkedin.com/in/carlosjuzarterolo
>> >>> >>> Tel: 1649
>> >>> >>> www.pythian.com
>> >>> >>>
>> >>> >>> On Mon, Feb 16, 2015 at 8:12 PM, Roland Etzenhammer
>> >>> >>>  wrote:
>> >>> 
>> >>>  Hi,
>> >>> 
>> >>>  1) Actual Cassandra 2.1.3, it was upgraded from 2.1.0 (suggested
>> >>>  by
>> >>>  Al
>> >>>  Tobey from DataStax)
>> >>>  7) minimal reads (usually none, sometimes few)
>> >>> 
>> >>>  those two points keep me 

Deleting Statistics.db at startup

2015-02-18 Thread Tomer Pearl
Hello,

I have received the following error:
ERROR [SSTableBatchOpen:2] 2015-01-19 13:55:28,478 CassandraDaemon.java (line 
196) Exception in thread Thread[SSTableBatchOpen:2,5,main]
java.lang.OutOfMemoryError: Java heap space
at 
org.apache.cassandra.utils.EstimatedHistogram$EstimatedHistogramSerializer.deserialize(EstimatedHistogram.java:335)
at 
org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:462)
at 
org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:448)
at 
org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:432)
at 
org.apache.cassandra.io.sstable.SSTableReader.openMetadata(SSTableReader.java:225)
at 
org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:194)
at 
org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:184)
at 
org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:264)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

I have found a solution here:
http://www.mail-archive.com/user%40cassandra.apache.org/msg23682.html
which advises deleting the Statistics.db file.

My question is: what are the consequences of deleting this file every time the
node starts up, performance-wise or otherwise?

Thanks,
Tomer.



Re: Many pending compactions

2015-02-18 Thread Ja Sam
Hi,
Thanks for your tip - it looks like something changed, though I still don't
know if it is OK.

My nodes started doing more compactions, but it looks like some compactions
are really slow.
IO is mostly idle and CPU is quite OK (30%-40%). We set compactionthroughput
to 999, but I do not see a difference.

Is there anything more we can check? Or do you have a method to monitor
progress with the small files?

Regards

On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar 
wrote:

> HI,
>
> Yes... I had the same issue and setting cold_reads_to_omit to 0.0 was
> the solution...
> The number of SSTables decreased from many thousands to a number below
> a hundred and the SSTables are now much bigger with several gigabytes
> (most of them).
>
> Cheers,
>
> Roni Balthazar
>
>
>
> On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam  wrote:
> > After some diagnostic ( we didn't set yet cold_reads_to_omit ).
> Compaction
> > are running but VERY slow with "idle" IO.
> >
> > We had a lot of "Data files" in Cassandra. In DC_A it is about ~12
> (only
> > xxx-Data.db) in DC_B has only ~4000.
> >
> > I don't know if this change anything but:
> > 1) in DC_A avg size of Data.db file is ~13 mb. I have few a really big
> ones,
> > but most is really small (almost 1 files are less then 100mb).
> > 2) in DC_B avg size of Data.db is much bigger ~260mb.
> >
> > Do you think that above flag will help us?
> >
> >
> > On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam  wrote:
> >>
> >> I set setcompactionthroughput 999 permanently and it doesn't change
> >> anything. IO is still same. CPU is idle.
> >>
> >> On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar <
> ronibaltha...@gmail.com>
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>> You can run "nodetool compactionstats" to view statistics on
> compactions.
> >>> Setting cold_reads_to_omit to 0.0 can help to reduce the number of
> >>> SSTables when you use Size-Tiered compaction.
> >>> You can also create a cron job to increase the value of
> >>> setcompactionthroughput during the night or when your IO is not busy.
> >>>
> >>> From http://wiki.apache.org/cassandra/NodeTool:
> >>> 0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
> >>> 0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
> >>>
> >>> Cheers,
> >>>
> >>> Roni Balthazar
> >>>
> >>> On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam  wrote:
> >>> > One think I do not understand. In my case compaction is running
> >>> > permanently.
> >>> > Is there a way to check which compaction is pending? The only
> >>> > information is
> >>> > about total count.
> >>> >
> >>> >
> >>> > On Monday, February 16, 2015, Ja Sam  wrote:
> >>> >>
> >>> >> Of couse I made a mistake. I am using 2.1.2. Anyway night build is
> >>> >> available from
> >>> >> http://cassci.datastax.com/job/cassandra-2.1/
> >>> >>
> >>> >> I read about cold_reads_to_omit It looks promising. Should I set
> also
> >>> >> compaction throughput?
> >>> >>
> >>> >> p.s. I am really sad that I didn't read this before:
> >>> >>
> >>> >>
> https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
> >>> >>
> >>> >>
> >>> >>
> >>> >> On Monday, February 16, 2015, Carlos Rolo  wrote:
> >>> >>>
> >>> >>> Hi 100% in agreement with Roland,
> >>> >>>
> >>> >>> 2.1.x series is a pain! I would never recommend the current 2.1.x
> >>> >>> series
> >>> >>> for production.
> >>> >>>
> >>> >>> Clocks is a pain, and check your connectivity! Also check tpstats
> to
> >>> >>> see
> >>> >>> if your threadpools are being overrun.
> >>> >>>
> >>> >>> Regards,
> >>> >>>
> >>> >>> Carlos Juzarte Rolo
> >>> >>> Cassandra Consultant
> >>> >>>
> >>> >>> Pythian - Love your data
> >>> >>>
> >>> >>> rolo@pythian | Twitter: cjrolo | Linkedin:
> >>> >>> linkedin.com/in/carlosjuzarterolo
> >>> >>> Tel: 1649
> >>> >>> www.pythian.com
> >>> >>>
> >>> >>> On Mon, Feb 16, 2015 at 8:12 PM, Roland Etzenhammer
> >>> >>>  wrote:
> >>> 
> >>>  Hi,
> >>> 
> >>>  1) Actual Cassandra 2.1.3, it was upgraded from 2.1.0 (suggested
> by
> >>>  Al
> >>>  Tobey from DataStax)
> >>>  7) minimal reads (usually none, sometimes few)
> >>> 
> >>>  those two points keep me repeating an anwser I got. First where
> did
> >>>  you
> >>>  get 2.1.3 from? Maybe I missed it, I will have a look. But if it
> is
> >>>  2.1.2
> >>>  whis is the latest released version, that version has many bugs -
> >>>  most of
> >>>  them I got kicked by while testing 2.1.2. I got many problems with
> >>>  compactions not beeing triggred on column families not beeing
> read,
> >>>  compactions and repairs not beeing completed.  See
> >>> 
> >>> 
> >>> 
> >>> 
> https://www.mail-archive.com/search?l=user@cassandra.apache.org&q=subject:%22Re%3A+Compaction+failing+to+trigger%22&o=newest&f=1
> >>> 
> >>> 
> https://www.mail-archive.com/user%40cassandra.apache.org/msg40768.html
> >>> 
> >>>  Apart from that, how are those both datacenters connected? Maybe
> >>>  there
> >

Re: Adding new node to cluster

2015-02-18 Thread Batranut Bogdan
Hello,
I have decommissioned a node, deleted its data, commitlog and saved caches,
changed the yaml file to no longer include its own IP in the seeds list, and
started it. For some reason I do not fully understand, OpsCenter says that the
node is in an unknown datacenter. Nodetool says UJ but shows "?" in the Owns
column. I started the node yesterday. I still see streams towards this node,
so can I assume that once it finishes joining, OpsCenter will see it properly?
I remember that I might have set up the entire initial cluster with each
node's own IP in its yaml seeds list. To bring the cluster to a valid state, I
assume I have to decommission the nodes one by one, delete the data, and
restart with correct yaml settings. Is this correct?
Also, a change of the cluster name would be nice. How can this be done with
minimal impact?
On Wednesday, February 18, 2015 12:56 AM, Eric Stevens  wrote:

> Seed nodes apparently don’t bootstrap

That's right: if a node has itself in its own seeds list, it assumes it is a
foundational member of the cluster, and it will join immediately with no
bootstrap.

If you've done this by accident, you should run nodetool decommission on that
node and, once it has fully left the cluster, wipe its data directory, edit
the yaml, and remove it from the seeds list.
On Tue, Feb 17, 2015 at 3:25 PM,  wrote:

SimpleSnitch is not rack aware. You would want to choose seed nodes and then 
not change them. Seed nodes apparently don’t bootstrap. All nodes need the same 
seeds in the yaml file. Here is more info: 
http://www.datastax.com/documentation/cassandra/2.0/cassandra/initialize/initializeSingleDS.html
Sean Durity – Cassandra Admin, Big Data Team
To engage the team, create a request.

From: Batranut Bogdan [mailto:batra...@yahoo.com]
Sent: Tuesday, February 17, 2015 3:28 PM
To: user@cassandra.apache.org; reynald.bourtembo...@esrf.fr
Subject: Re: Adding new node to cluster

Hello, I use SimpleSnitch. All the nodes are in the same datacenter. Not sure 
if all are in the same rack.

On Tuesday, February 17, 2015 8:53 PM, "sean_r_dur...@homedepot.com" wrote:

What snitch are you using? You may need to do some work on your topology file 
(or rackdc) to make sure you have the topology you want. Also, it is possible 
you may need to restart OpsCenter agents and/or your browser to see the nodes 
represented properly in OpsCenter.

Sean Durity – Cassandra Admin, Home Depot

From: Batranut Bogdan [mailto:batra...@yahoo.com]
Sent: Tuesday, February 17, 2015 10:20 AM
To: user@cassandra.apache.org; reynald.bourtembo...@esrf.fr
Subject: Re: Adding new node to cluster

Hello, I know that UN is good, but what troubles me is the addition of the 
node's own IP in its yaml seeds section.

On Tuesday, February 17, 2015 3:40 PM, Reynald Bourtembourg wrote:

Hi Bogdan

In nodetool status:   
   - UJ: means your node is Up and Joining
   - UN: means your node is Up and in Normal state
UN in nodetool is good ;-)

On 17/02/2015 13:56, Batranut Bogdan wrote:
Hello all,

I have an existing cluster. When adding a new node, I saw that OpsCenter 
placed the node in an unknown cluster, even though the cluster name in the 
yaml is the same. So I stopped the node and added its own IP address to its 
list of seeds. Now OpsCenter sees my node, but nodetool status reports it as 
UN instead of UJ, as it did when it first started. One other mention: even if 
I stop the node and remove its IP from the list of seeds, OpsCenter sees the 
node in the known cluster, but nodetool still sees it as UN. I am not sure 
what the implications of adding a node's own IP to its seed list are, and I 
think I might have done the same for the existing nodes, e.g. started with 
each node's own IP in its seed list; after removing it and restarting the 
nodes for whatever reason, I did not see any changes. Is my cluster OK, or 
what do I need to do to bring the cluster to a good state?

Thank you.
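For illustration, this is roughly what nodetool status shows while a node is still bootstrapping (UJ) versus after it has joined (UN); the addresses, load figures, and host IDs below are made up:

```
Datacenter: datacenter1
=======================
Status=Up/Down |/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns   Host ID   Rack
UN  10.0.0.1   120.5 GB   256     33.3%  5a1c...   rack1
UN  10.0.0.2   118.2 GB   256     33.4%  7f2d...   rack1
UJ  10.0.0.3   15.1 GB    256     ?      9c3e...   rack1
```

A node that lists itself as a seed skips bootstrap entirely, so it goes straight to UN without ever showing UJ, which matches the behavior described above.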
The information in this Internet Email is confidential and may be legally 
privileged. It is intended solely for the addressee. Access to this Email by 
anyone else is unauthorized. If you are not the intended recipient, any 
disclosure, copying, distribution or any action taken or omitted to be taken in 
reliance on it, is prohibited and may be unlawful. When addressed to our 
clients any opinions or advice contained in this Email are subject to the terms 
and conditions expressed in any applicable governing The Home Depot terms of 
business or client engagement letter. The Home Depot disclaims all 
responsibility and liability for the accuracy and content of this attachment 
and for any damages or losses arising from any inaccuracies, errors, viruses, 
e.g., worms, trojan horses, etc., or other items of a destructive nature, which 
may be contained in this attachment and shall not be liable for direct, 
indirect, consequential or special damages in connection with this e-mail 
message or its attachment.  
