Re: IO scheduler for SSDs on EC2?

2015-03-15 Thread Roni Balthazar
Hi Ali,

The best practice is to use the noop scheduler on an array of SSDs behind
your block device (hardware RAID controller).
If you are using only a single SSD, the deadline scheduler is the best
choice to reduce IO latency.
Using cfq on SSDs is not recommended.
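As a sketch (the device name xvdb is an assumption; EC2 instances vary), the scheduler can be inspected and switched through sysfs on Linux:

```shell
# Sketch only: "xvdb" is an assumed EC2 device name; adjust for your system.

# The active scheduler is the bracketed entry in the sysfs line,
# e.g. "noop [deadline] cfq" -> "deadline".
active_scheduler() {
    echo "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

DEV=xvdb
line=$(cat "/sys/block/$DEV/queue/scheduler" 2>/dev/null || echo "noop [deadline] cfq")
echo "active scheduler on $DEV: $(active_scheduler "$line")"

# Switch at runtime (root required; does not survive a reboot):
#   echo noop > /sys/block/$DEV/queue/scheduler
# To persist, add elevator=noop to the kernel boot parameters.
```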

Regards,

Roni Balthazar

On 15 March 2015 at 09:03, Ali Akhtar ali.rac...@gmail.com wrote:
 I was watching a talk recently on Elasticsearch performance in EC2, and they
 recommended setting the IO scheduler to noop for SSDs. Is that the case for
 Cassandra as well, or is it recommended to keep the default 'deadline'
 scheduler for Cassandra?

 Thanks.


Downgrade Cassandra from 2.1.x to 2.0.x

2015-03-06 Thread Roni Balthazar
Hi there,

What is the best way to downgrade a C* 2.1.3 cluster to the stable 2.0.12?
I know it's not supported, but we are running into too many issues with 2.1.x...
This is leading us to think that the best solution is to use the stable version.
Is there a safe way to do that?

Cheers,

Roni


OOM and high SSTables count

2015-03-04 Thread Roni Balthazar
$RunnableAdapter.call(Executors.java:511)
~[na:1.8.0_31]
at 
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
~[apache-cassandra-2.1.3.jar:2.1.3]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
~[apache-cassandra-2.1.3.jar:2.1.3]

How should I debug this issue, and what are the best practices
in this situation?

Regards,

Roni


Re: Possible problem with disk latency

2015-02-25 Thread Roni Balthazar
Hi Ja,

How are the pending compactions distributed between the nodes?
Run nodetool compactionstats on all of your nodes and check whether the
pending tasks are balanced or concentrated on only a few nodes.
You can also check whether the SSTable count is balanced by running
nodetool cfstats on your nodes.
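A small sketch of that check (the node list and the 2x threshold are illustrative assumptions, not hard rules):

```shell
# Sketch: NODES is a hypothetical host list; replace with your own.
NODES="node1 node2 node3"

# Illustrative rule: call the cluster unbalanced when the largest
# pending-compactions count is more than twice the smallest.
unbalanced() {
    min=$1; max=$1
    for n in "$@"; do
        [ "$n" -lt "$min" ] && min=$n
        [ "$n" -gt "$max" ] && max=$n
    done
    [ "$max" -gt $((min * 2)) ]
}

# Collect the numbers per node (guarded: a no-op where nodetool is absent).
if command -v nodetool >/dev/null 2>&1; then
    for host in $NODES; do
        echo "== $host =="
        nodetool -h "$host" compactionstats | grep -i "pending tasks"
        # SSTable count per table; 2.0 prints "Column Family:", 2.1 prints "Table:"
        nodetool -h "$host" cfstats | grep -E "Table:|Column Family:|SSTable count"
    done
fi

unbalanced 2500 6000 && echo "pending compactions look unbalanced"
```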

Cheers,

Roni Balthazar



On 25 February 2015 at 13:29, Ja Sam ptrstp...@gmail.com wrote:
 I do NOT have SSDs. I have normal HDDs grouped as JBOD.
 My CF uses SizeTieredCompactionStrategy.
 I am using LOCAL_QUORUM for reads and writes. To be precise, I have a lot of
 writes and almost zero reads.
 I changed cold_reads_to_omit to 0.0 as someone suggested, and I set
 compactionthroughput to 999.

 So if my disks are idle, my CPU is less than 40%, and I have some free RAM -
 why is the SSTable count growing? How can I speed up compactions?

 On Wed, Feb 25, 2015 at 5:16 PM, Nate McCall n...@thelastpickle.com wrote:



 Could you be so kind as to validate the above and tell me whether my
 disks are a real problem or not? And give me a tip on what I should do with the
 above cluster? Maybe I have a misconfiguration?



 Your disks are effectively idle. What consistency level are you using for
 reads and writes?

 Actually, 'await' is sort of weirdly high for idle SSDs. Check your
 interrupt mappings (cat /proc/interrupts) and make sure the interrupts are
 not being stacked on a single CPU.





Re: Possible problem with disk latency

2015-02-25 Thread Roni Balthazar
Hi Piotr,

Are your repairs finishing without errors?

Regards,

Roni Balthazar

On 25 February 2015 at 15:43, Ja Sam ptrstp...@gmail.com wrote:
 Hi, Roni,
 They aren't exactly balanced, but as I wrote before, they are in the range of
 2500-6000.
 If you need exact data, I will check tomorrow morning. But all nodes
 in AGRAF have had a small increase in pending compactions during the last week,
 which is the wrong direction.

 I will check getcompactionthroughput in the morning, but my feeling about
 this parameter is that it doesn't change anything.

 Regards
 Piotr




 On Wed, Feb 25, 2015 at 7:34 PM, Roni Balthazar ronibaltha...@gmail.com
 wrote:

 Hi Piotr,

 What about the nodes in AGRAF? Are the pending tasks balanced between
 that DC's nodes as well?
 You can check the pending compactions on each node.

 Also try running nodetool getcompactionthroughput on all nodes and
 check whether the compaction throughput is set to 999.

 Cheers,

 Roni Balthazar

 On 25 February 2015 at 14:47, Ja Sam ptrstp...@gmail.com wrote:
  Hi Roni,
 
  It is not balanced. As I wrote you last week I have problems only in DC
  in
  which we writes (on screen it is named as AGRAF:
  https://drive.google.com/file/d/0B4N_AbBPGGwLR21CZk9OV1kxVDA/view). The
  problem is on ALL nodes in this dc.
  In second DC (ZETO) only one node have more than 30 SSTables and pending
  compactions are decreasing to zero.
 
  In AGRAF the minimum pending compaction is 2500 , maximum is 6000 (avg
  on
  screen from opscenter is less then 5000)
 
 
  Regards
  Piotrek.
 
  p.s. I don't know why my mail client display my name as Ja Sam instead
  of
  Piotr Stapp, but this doesn't change anything :)
 
 
  On Wed, Feb 25, 2015 at 5:45 PM, Roni Balthazar
  ronibaltha...@gmail.com
  wrote:
 
  Hi Ja,
 
  How are the pending compactions distributed between the nodes?
  Run nodetool compactionstats on all of your nodes and check if the
  pendings tasks are balanced or they are concentrated in only few
  nodes.
  You also can check the if the SSTable count is balanced running
  nodetool cfstats on your nodes.
 
  Cheers,
 
  Roni Balthazar
 
 
 
  On 25 February 2015 at 13:29, Ja Sam ptrstp...@gmail.com wrote:
   I do NOT have SSD. I have normal HDD group by JBOD.
   My CF have SizeTieredCompactionStrategy
   I am using local quorum for reads and writes. To be precise I have a
   lot
   of
   writes and almost 0 reads.
   I changed cold_reads_to_omit to 0.0 as someone suggest me. I used
   set
   compactionthrouput to 999.
  
   So if my disk are idle, my CPU is less then 40%, I have some free RAM
   -
   why
   SSTables count is growing? How I can speed up compactions?
  
   On Wed, Feb 25, 2015 at 5:16 PM, Nate McCall n...@thelastpickle.com
   wrote:
  
  
  
   If You could be so kind and validate above and give me an answer is
   my
   disk are real problems or not? And give me a tip what should I do
   with
   above
   cluster? Maybe I have misconfiguration?
  
  
  
   You disks are effectively idle. What consistency level are you using
   for
   reads and writes?
  
   Actually, 'await' is sort of weirdly high for idle SSDs. Check your
   interrupt mappings (cat /proc/interrupts) and make sure the
   interrupts
   are
   not being stacked on a single CPU.
  
  
  
 
 




Re: Possible problem with disk latency

2015-02-25 Thread Roni Balthazar
Hi,

Check how many active CompactionExecutors are showing in nodetool tpstats.
Maybe your concurrent_compactors is too low. Enforce 1 per CPU core,
even though that is supposed to be the default value on 2.1.
Some of our nodes were running with only 2 compactors, but we have an 8-core CPU...
After that, monitor your nodes to be sure that the value is not too
high. You may get too much IO if you increase concurrent compactors
while using spinning disks.
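A hedged sketch of that check (the grep pattern assumes the 2.1 tpstats naming; concurrent_compactors itself is set in cassandra.yaml and needs a node restart):

```shell
# Sketch: show CompactionExecutor activity where nodetool is available.
if command -v nodetool >/dev/null 2>&1; then
    nodetool tpstats | grep -i CompactionExecutor
fi

# One compactor per core is the suggestion above; nproc reports the
# core count on Linux (falling back to 8 here purely for illustration).
cores=$(nproc 2>/dev/null || echo 8)
echo "suggested concurrent_compactors: $cores"

# cassandra.yaml (uncomment the setting and restart the node):
#   concurrent_compactors: 8
```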

Regards,

Roni Balthazar

On 25 February 2015 at 16:37, Ja Sam ptrstp...@gmail.com wrote:
 Hi,
 One more thing: Hinted Handoff for the last week on all nodes was less than 5.
 For me every READ is a problem because it must open too many files (3
 SSTables), which shows up as errors in reads, repairs, etc.
 Regards
 Piotrek

 On Wed, Feb 25, 2015 at 8:32 PM, Ja Sam ptrstp...@gmail.com wrote:

 Hi,
 It is not obvious, because data is replicated to the second data center. We
 checked it manually for random records we put into Cassandra, and we found all
 of them in the secondary DC.
 We know about every single GC failure, but this doesn't change anything.
 The only fix for a GC failure is to restart the node. For a few days we
 have not had GC errors anymore. It looks to me like a memory leak.
 We use Chef.

 By MANUAL compaction do you mean running nodetool compact? What does it
 change compared to the permanently running compactions?

 Regards
 Piotrek

 On Wed, Feb 25, 2015 at 8:13 PM, daemeon reiydelle daeme...@gmail.com
 wrote:

 I think you may have a vicious circle of errors: because your data is not
 properly replicated to the neighbour, it is not replicating to the secondary
 data center (yeah, obvious). I would suspect the GC errors are (also
 obviously) the result of a backlog of compactions that take out the
 neighbour. Assuming a replication factor of 3, each neighbour is
 participating in compactions from at least one other node besides the primary
 you are looking at (and it can of course be much more, depending on e.g. the
 vnode count, if used).

 What happens is that when a node fails due to a GC error (it can't reclaim
 space), it causes a cascade of other errors, as you see. Might I suggest
 you have someone in DevOps with monitoring experience install a monitoring
 tool that will notify you of EVERY SINGLE Java GC failure event? Your DevOps
 team may have a favorite log shipping/monitoring tool, or could use e.g. Puppet.

 I think you may have to go through a MANUAL, table-by-table compaction.




 ...
 “Life should not be a journey to the grave with the intention of arriving
 safely in a
 pretty and well preserved body, but rather to skid in broadside in a
 cloud of smoke,
 thoroughly used up, totally worn out, and loudly proclaiming “Wow! What a
 Ride!”
 - Hunter Thompson

 Daemeon C.M. Reiydelle
 USA (+1) 415.501.0198
 London (+44) (0) 20 8144 9872

 On Wed, Feb 25, 2015 at 11:01 AM, Ja Sam ptrstp...@gmail.com wrote:

 Hi Roni,
 The repair result is the following (we ran it on Friday): Cannot proceed on
 repair because a neighbor (/192.168.61.201) is dead: session failed

 But to be honest, the neighbor did not die. It seemed to trigger a
 series of full GC events on the initiating node. The results from the logs are:

 [2015-02-20 16:47:54,884] Starting repair command #2, repairing 7 ranges
 for keyspace prem_maelstrom_2 (parallelism=PARALLEL, full=false)
 [2015-02-21 02:21:55,640] Lost notification. You should check server log
 for repair status of keyspace prem_maelstrom_2
 [2015-02-21 02:22:55,642] Lost notification. You should check server log
 for repair status of keyspace prem_maelstrom_2
 [2015-02-21 02:23:55,642] Lost notification. You should check server log
 for repair status of keyspace prem_maelstrom_2
 [2015-02-21 02:24:55,644] Lost notification. You should check server log
 for repair status of keyspace prem_maelstrom_2
 [2015-02-21 04:41:08,607] Repair session
 d5d01dd0-b917-11e4-bc97-e9a66e5b2124 for range
 (85070591730234615865843651857942052874,102084710076281535261119195933814292480]
 failed with error org.apache.cassandra.exceptions.RepairException: [repair
 #d5d01dd0-b917-11e4-bc97-e9a66e5b2124 on prem_maelstrom_2/customer_events,
 (85070591730234615865843651857942052874,102084710076281535261119195933814292480]]
 Sync failed between /192.168.71.196 and /192.168.61.199
 [2015-02-21 04:41:08,608] Repair session
 eb8d8d10-b967-11e4-bc97-e9a66e5b2124 for range
 (68056473384187696470568107782069813248,85070591730234615865843651857942052874]
 failed with error java.io.IOException: Endpoint /192.168.61.199 died
 [2015-02-21 04:41:08,608] Repair session
 c48aef00-b971-11e4-bc97-e9a66e5b2124 for range (0,10] failed with error
 java.io.IOException: Cannot proceed on repair because a neighbor
 (/192.168.61.201) is dead: session failed
 [2015-02-21 04:41:08,609] Repair session
 c48d38f0-b971-11e4-bc97-e9a66e5b2124 for range
 (42535295865117307932921825928971026442,68056473384187696470568107782069813248]
 failed with error java.io.IOException: Cannot proceed on repair because a
 neighbor

Re: Many pending compactions

2015-02-18 Thread Roni Balthazar
Try repair -pr on all nodes.

If after that you still have issues, you can try to rebuild the SSTables using 
nodetool upgradesstables or scrub.
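A sketch of that rebuild (my_ks/my_table are placeholder names; -a forces rewriting all SSTables, not only those on an older format):

```shell
# Sketch: placeholder names, guarded so this is a no-op without a cluster.
KS=my_ks
TABLE=my_table
cmd="nodetool upgradesstables -a $KS $TABLE"
echo "would run: $cmd"
if command -v nodetool >/dev/null 2>&1; then
    $cmd
    # Or, if on-disk corruption is suspected:
    # nodetool scrub "$KS" "$TABLE"
fi
```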

Regards,

Roni Balthazar

 Em 18/02/2015, às 14:13, Ja Sam ptrstp...@gmail.com escreveu:
 
 ad 3)  I did this already yesterday (setcompactionthroughput as well). But
 SSTables are still increasing.
 
 ad 1) What do you think I should use -pr or try to use incremental?
 
 
 
 On Wed, Feb 18, 2015 at 4:54 PM, Roni Balthazar ronibaltha...@gmail.com 
 wrote:
 You are right... Repair makes the data consistent between nodes.
 
 I understand that you have 2 issues going on.
 
 You need to run repair periodically without errors and need to decrease the 
 numbers of compactions pending.
 
 So I suggest:
 
 1) Run repair -pr on all nodes. If you upgrade to the new 2.1.3, you can use 
 incremental repairs. There were some bugs on 2.1.2.
 2) Run cleanup on all nodes
 3) Since you have too many cold SSTables, set cold_reads_to_omit to 0.0, and 
 increase setcompactionthroughput for some time and see if the number of 
 SSTables is going down.
 
 Let us know what errors are you getting when running repairs.
 
 Regards,
 
 Roni Balthazar
 
 
 On Wed, Feb 18, 2015 at 1:31 PM, Ja Sam ptrstp...@gmail.com wrote:
 Can you explain me what is the correlation between growing SSTables and 
 repair? 
 I was sure, until your  mail, that repair is only to make data consistent 
 between nodes.
 
 Regards
 
 
 
 On Wed, Feb 18, 2015 at 4:20 PM, Roni Balthazar ronibaltha...@gmail.com 
 wrote:
 Which error are you getting when running repairs?
 You need to run repair on your nodes within gc_grace_seconds (eg:
 weekly). They have data that are not read frequently. You can run
 repair -pr on all nodes. Since you do not have deletes, you will not
 have trouble with that. If you have deletes, it's better to increase
 gc_grace_seconds before the repair.
 http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
 After repair, try to run a nodetool cleanup.
 
 Check if the number of SSTables goes down after that... Pending
 compactions must decrease as well...
 
 Cheers,
 
 Roni Balthazar
 
 
 
 
 On Wed, Feb 18, 2015 at 12:39 PM, Ja Sam ptrstp...@gmail.com wrote:
  1) we tried to run repairs but they usually does not succeed. But we had
  Leveled compaction before. Last week we ALTER tables to STCS, because 
  guys
  from DataStax suggest us that we should not use Leveled and alter tables 
  in
  STCS, because we don't have SSD. After this change we did not run any
  repair. Anyway I don't think it will change anything in SSTable count - 
  if I
  am wrong please give me an information
 
  2) I did this. My tables are 99% write only. It is audit system
 
  3) Yes I am using default values
 
  4) In both operations I am using LOCAL_QUORUM.
 
  I am almost sure that READ timeout happens because of too much SSTables.
  Anyway firstly I would like to fix to many pending compactions. I still
  don't know how to speed up them.
 
 
  On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar ronibaltha...@gmail.com
  wrote:
 
  Are you running repairs within gc_grace_seconds? (default is 10 days)
 
  http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
 
  Double check if you set cold_reads_to_omit to 0.0 on tables with STCS
  that you do not read often.
 
  Are you using default values for the properties
  min_compaction_threshold(4) and max_compaction_threshold(32)?
 
  Which Consistency Level are you using for reading operations? Check if
  you are not reading from DC_B due to your Replication Factor and CL.
 
  http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
 
 
  Cheers,
 
  Roni Balthazar
 
  On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam ptrstp...@gmail.com wrote:
   I don't have problems with DC_B (replica) only in DC_A(my system write
   only
   to it) I have read timeouts.
  
   I checked in OpsCenter SSTable count  and I have:
   1) in DC_A  same +-10% for last week, a small increase for last 24h 
   (it
   is
   more than 15000-2 SSTables depends on node)
   2) in DC_B last 24h shows up to 50% decrease, which give nice
   prognostics.
   Now I have less then 1000 SSTables
  
   What did you measure during system optimizations? Or do you have an 
   idea
   what more should I check?
   1) I look at CPU Idle (one node is 50% idle, rest 70% idle)
   2) Disk queue - mostly is it near zero: avg 0.09. Sometimes there are
   spikes
   3) system RAM usage is almost full
   4) In Total Bytes Compacted most most lines are below 3MB/s. For total
   DC_A
   it is less than 10MB/s, in DC_B it looks much better (avg is like
   17MB/s)
  
   something else?
  
  
  
   On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar
   ronibaltha...@gmail.com
   wrote:
  
   Hi,
  
   You can check if the number of SSTables is decreasing. Look for the
   SSTable count information of your tables using nodetool

Re: Many pending compactions

2015-02-18 Thread Roni Balthazar
Are you running repairs within gc_grace_seconds? (default is 10 days)
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html

Double check if you set cold_reads_to_omit to 0.0 on tables with STCS
that you do not read often.

Are you using default values for the properties
min_compaction_threshold(4) and max_compaction_threshold(32)?

Which Consistency Level are you using for reading operations? Check if
you are not reading from DC_B due to your Replication Factor and CL.
http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
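As a sketch (my_ks.my_table is a placeholder), cold_reads_to_omit and the thresholds are sub-options of the table's compaction settings and can be checked and changed through cqlsh:

```shell
# Sketch: placeholder table name; the ALTER only executes if cqlsh exists.
TABLE="my_ks.my_table"
ALTER_CQL="ALTER TABLE $TABLE WITH compaction = {
  'class': 'SizeTieredCompactionStrategy',
  'cold_reads_to_omit': 0.0,
  'min_threshold': 4,
  'max_threshold': 32 }"
echo "$ALTER_CQL"
if command -v cqlsh >/dev/null 2>&1; then
    # DESCRIBE first to see the current compaction options:
    cqlsh -e "DESCRIBE TABLE $TABLE"
    cqlsh -e "$ALTER_CQL;"
fi
```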


Cheers,

Roni Balthazar

On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam ptrstp...@gmail.com wrote:
 I don't have problems with DC_B (the replica); only in DC_A (my system writes
 only to it) do I have read timeouts.

 I checked the SSTable count in OpsCenter and I have:
 1) in DC_A roughly the same +-10% for the last week, with a small increase for
 the last 24h (it is more than 15000-2 SSTables depending on the node)
 2) in DC_B the last 24h shows up to a 50% decrease, which gives a nice prognosis.
 Now I have less than 1000 SSTables

 What did you measure during system optimizations? Or do you have an idea
 of what more I should check?
 1) I looked at CPU idle (one node is 50% idle, the rest 70% idle)
 2) Disk queue - mostly it is near zero: avg 0.09. Sometimes there are
 spikes
 3) system RAM usage is almost full
 4) In Total Bytes Compacted most lines are below 3MB/s. For total DC_A
 it is less than 10MB/s; in DC_B it looks much better (avg is like 17MB/s)

 something else?



 On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar ronibaltha...@gmail.com
 wrote:

 Hi,

 You can check if the number of SSTables is decreasing. Look for the
 SSTable count information of your tables using nodetool cfstats.
 The compaction history can be viewed using nodetool
 compactionhistory.

 About the timeouts, check this out:
 http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
 Also try to run nodetool tpstats to see the threads statistics. It
 can lead you to know if you are having performance problems. If you
 are having too many pending tasks or dropped messages, maybe will you
 need to tune your system (eg: driver's timeout, concurrent reads and
 so on)

 Regards,

 Roni Balthazar

 On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam ptrstp...@gmail.com wrote:
  Hi,
  Thanks for your tip it looks that something changed - I still don't
  know
  if it is ok.
 
  My nodes started to do more compaction, but it looks that some
  compactions
  are really slow.
  In IO we have idle, CPU is quite ok (30%-40%). We set compactionthrouput
  to
  999, but I do not see difference.
 
  Can we check something more? Or do you have any method to monitor
  progress
  with small files?
 
  Regards
 
  On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar
  ronibaltha...@gmail.com
  wrote:
 
  HI,
 
  Yes... I had the same issue and setting cold_reads_to_omit to 0.0 was
  the solution...
  The number of SSTables decreased from many thousands to a number below
  a hundred and the SSTables are now much bigger with several gigabytes
  (most of them).
 
  Cheers,
 
  Roni Balthazar
 
 
 
  On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam ptrstp...@gmail.com wrote:
   After some diagnostic ( we didn't set yet cold_reads_to_omit ).
   Compaction
   are running but VERY slow with idle IO.
  
   We had a lot of Data files in Cassandra. In DC_A it is about
   ~12
   (only
   xxx-Data.db) in DC_B has only ~4000.
  
   I don't know if this change anything but:
   1) in DC_A avg size of Data.db file is ~13 mb. I have few a really
   big
   ones,
   but most is really small (almost 1 files are less then 100mb).
   2) in DC_B avg size of Data.db is much bigger ~260mb.
  
   Do you think that above flag will help us?
  
  
   On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam ptrstp...@gmail.com wrote:
  
   I set setcompactionthroughput 999 permanently and it doesn't change
   anything. IO is still same. CPU is idle.
  
   On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar
   ronibaltha...@gmail.com
   wrote:
  
   Hi,
  
   You can run nodetool compactionstats to view statistics on
   compactions.
   Setting cold_reads_to_omit to 0.0 can help to reduce the number of
   SSTables when you use Size-Tiered compaction.
   You can also create a cron job to increase the value of
   setcompactionthroughput during the night or when your IO is not
   busy.
  
   From http://wiki.apache.org/cassandra/NodeTool:
   0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
   0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
  
   Cheers,
  
   Roni Balthazar
  
   On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam ptrstp...@gmail.com
   wrote:
One think I do not understand. In my case compaction is running
permanently.
Is there a way to check which compaction is pending? The only
information is
about total count.
   
   
On Monday, February 16, 2015, Ja Sam ptrstp...@gmail.com wrote

Re: Many pending compactions

2015-02-18 Thread Roni Balthazar
You are right... Repair makes the data consistent between nodes.

I understand that you have 2 issues going on.

You need to run repair periodically without errors and need to decrease the
number of pending compactions.

So I suggest:

1) Run repair -pr on all nodes. If you upgrade to the new 2.1.3, you can
use incremental repairs. There were some bugs on 2.1.2.
2) Run cleanup on all nodes
3) Since you have too many cold SSTables, set cold_reads_to_omit to 0.0,
and increase setcompactionthroughput for some time and see if the number of
SSTables is going down.

Let us know what errors are you getting when running repairs.

Regards,

Roni Balthazar


On Wed, Feb 18, 2015 at 1:31 PM, Ja Sam ptrstp...@gmail.com wrote:

 Can you explain me what is the correlation between growing SSTables and
 repair?
 I was sure, until your  mail, that repair is only to make data consistent
 between nodes.

 Regards


 On Wed, Feb 18, 2015 at 4:20 PM, Roni Balthazar ronibaltha...@gmail.com
 wrote:

 Which error are you getting when running repairs?
 You need to run repair on your nodes within gc_grace_seconds (eg:
 weekly). They have data that are not read frequently. You can run
 repair -pr on all nodes. Since you do not have deletes, you will not
 have trouble with that. If you have deletes, it's better to increase
 gc_grace_seconds before the repair.

 http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
 After repair, try to run a nodetool cleanup.

 Check if the number of SSTables goes down after that... Pending
 compactions must decrease as well...

 Cheers,

 Roni Balthazar




 On Wed, Feb 18, 2015 at 12:39 PM, Ja Sam ptrstp...@gmail.com wrote:
  1) we tried to run repairs but they usually does not succeed. But we had
  Leveled compaction before. Last week we ALTER tables to STCS, because
 guys
  from DataStax suggest us that we should not use Leveled and alter
 tables in
  STCS, because we don't have SSD. After this change we did not run any
  repair. Anyway I don't think it will change anything in SSTable count -
 if I
  am wrong please give me an information
 
  2) I did this. My tables are 99% write only. It is audit system
 
  3) Yes I am using default values
 
  4) In both operations I am using LOCAL_QUORUM.
 
  I am almost sure that READ timeout happens because of too much SSTables.
  Anyway firstly I would like to fix to many pending compactions. I still
  don't know how to speed up them.
 
 
  On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar 
 ronibaltha...@gmail.com
  wrote:
 
  Are you running repairs within gc_grace_seconds? (default is 10 days)
 
 
 http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
 
  Double check if you set cold_reads_to_omit to 0.0 on tables with STCS
  that you do not read often.
 
  Are you using default values for the properties
  min_compaction_threshold(4) and max_compaction_threshold(32)?
 
  Which Consistency Level are you using for reading operations? Check if
  you are not reading from DC_B due to your Replication Factor and CL.
 
 
 http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
 
 
  Cheers,
 
  Roni Balthazar
 
  On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam ptrstp...@gmail.com wrote:
   I don't have problems with DC_B (replica) only in DC_A(my system
 write
   only
   to it) I have read timeouts.
  
   I checked in OpsCenter SSTable count  and I have:
   1) in DC_A  same +-10% for last week, a small increase for last 24h
 (it
   is
   more than 15000-2 SSTables depends on node)
   2) in DC_B last 24h shows up to 50% decrease, which give nice
   prognostics.
   Now I have less then 1000 SSTables
  
   What did you measure during system optimizations? Or do you have an
 idea
   what more should I check?
   1) I look at CPU Idle (one node is 50% idle, rest 70% idle)
   2) Disk queue - mostly is it near zero: avg 0.09. Sometimes there
 are
   spikes
   3) system RAM usage is almost full
   4) In Total Bytes Compacted most most lines are below 3MB/s. For
 total
   DC_A
   it is less than 10MB/s, in DC_B it looks much better (avg is like
   17MB/s)
  
   something else?
  
  
  
   On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar
   ronibaltha...@gmail.com
   wrote:
  
   Hi,
  
   You can check if the number of SSTables is decreasing. Look for the
   SSTable count information of your tables using nodetool cfstats.
   The compaction history can be viewed using nodetool
   compactionhistory.
  
   About the timeouts, check this out:
  
  
 http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
   Also try to run nodetool tpstats to see the threads statistics. It
   can lead you to know if you are having performance problems. If you
   are having too many pending tasks or dropped messages, maybe will
 you
   need to tune your system (eg: driver's timeout, concurrent reads and
   so on)
  
   Regards,
  
   Roni Balthazar

Re: Many pending compactions

2015-02-18 Thread Roni Balthazar
Hi,

You can check if the number of SSTables is decreasing. Look for the
SSTable count information of your tables using nodetool cfstats.
The compaction history can be viewed using nodetool
compactionhistory.

About the timeouts, check this out:
http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
Also try running nodetool tpstats to see the thread statistics. It
can help you determine whether you are having performance problems. If you
have too many pending tasks or dropped messages, you may need
to tune your system (e.g. the driver's timeout, concurrent reads and
so on)

Regards,

Roni Balthazar

On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam ptrstp...@gmail.com wrote:
 Hi,
 Thanks for your tip; it looks like something has changed - I still don't know
 if it is OK.

 My nodes started doing more compactions, but it looks like some compactions
 are really slow.
 In IO we are idle and CPU is quite OK (30%-40%). We set compactionthroughput to
 999, but I do not see a difference.

 Can we check something more? Or do you have any method to monitor progress
 with small files?

 Regards

 On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar ronibaltha...@gmail.com
 wrote:

 Hi,

 Yes... I had the same issue and setting cold_reads_to_omit to 0.0 was
 the solution...
 The number of SSTables decreased from many thousands to a number below
 a hundred and the SSTables are now much bigger with several gigabytes
 (most of them).

 Cheers,

 Roni Balthazar



 On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam ptrstp...@gmail.com wrote:
  After some diagnostic ( we didn't set yet cold_reads_to_omit ).
  Compaction
  are running but VERY slow with idle IO.
 
  We had a lot of Data files in Cassandra. In DC_A it is about ~12
  (only
  xxx-Data.db) in DC_B has only ~4000.
 
  I don't know if this change anything but:
  1) in DC_A avg size of Data.db file is ~13 mb. I have few a really big
  ones,
  but most is really small (almost 1 files are less then 100mb).
  2) in DC_B avg size of Data.db is much bigger ~260mb.
 
  Do you think that above flag will help us?
 
 
  On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam ptrstp...@gmail.com wrote:
 
  I set setcompactionthroughput 999 permanently and it doesn't change
  anything. IO is still same. CPU is idle.
 
  On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar
  ronibaltha...@gmail.com
  wrote:
 
  Hi,
 
  You can run nodetool compactionstats to view statistics on
  compactions.
  Setting cold_reads_to_omit to 0.0 can help to reduce the number of
  SSTables when you use Size-Tiered compaction.
  You can also create a cron job to increase the value of
  setcompactionthroughput during the night or when your IO is not busy.
 
  From http://wiki.apache.org/cassandra/NodeTool:
  0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
  0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
 
  Cheers,
 
  Roni Balthazar
 
  On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam ptrstp...@gmail.com wrote:
   One think I do not understand. In my case compaction is running
   permanently.
   Is there a way to check which compaction is pending? The only
   information is
   about total count.
  
  
   On Monday, February 16, 2015, Ja Sam ptrstp...@gmail.com wrote:
  
    Of course I made a mistake. I am using 2.1.2. Anyway, a nightly build is
    available from
    http://cassci.datastax.com/job/cassandra-2.1/

    I read about cold_reads_to_omit. It looks promising. Should I also set
    compaction throughput?
  
   p.s. I am really sad that I didn't read this before:
  
  
   https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
  
  
  
   On Monday, February 16, 2015, Carlos Rolo r...@pythian.com wrote:
  
   Hi 100% in agreement with Roland,
  
   2.1.x series is a pain! I would never recommend the current 2.1.x
   series
   for production.
  
    Clocks are a pain, and check your connectivity! Also check tpstats to
    see if your threadpools are being overrun.
  
   Regards,
  
   Carlos Juzarte Rolo
   Cassandra Consultant
  
   Pythian - Love your data
  
   rolo@pythian | Twitter: cjrolo | Linkedin:
   linkedin.com/in/carlosjuzarterolo
   Tel: 1649
   www.pythian.com
  
   On Mon, Feb 16, 2015 at 8:12 PM, Roland Etzenhammer
   r.etzenham...@t-online.de wrote:
  
   Hi,
  
   1) Actual Cassandra 2.1.3, it was upgraded from 2.1.0 (suggested
   by
   Al
   Tobey from DataStax)
   7) minimal reads (usually none, sometimes few)
  
    those two points keep me repeating an answer I got. First, where did
    you get 2.1.3 from? Maybe I missed it, I will have a look. But if it is
    2.1.2, which is the latest released version, that version has many bugs -
    most of them I got kicked by while testing 2.1.2. I got many problems with
    compactions not being triggered on column families not being read, and
    compactions and repairs not being completed.  See
  
  
  
  
   https://www.mail-archive.com/search?l=user@cassandra.apache.orgq=subject:%22Re%3A

Re: Many pending compactions

2015-02-18 Thread Roni Balthazar
Which error are you getting when running repairs?
You need to run repair on your nodes within gc_grace_seconds (e.g.
weekly). They hold data that is not read frequently. You can run
repair -pr on all nodes. Since you do not have deletes, you will not
have trouble with that. If you do have deletes, it's better to increase
gc_grace_seconds before the repair.
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
After repair, try to run a nodetool cleanup.

Check if the number of SSTables goes down after that... Pending
compactions must decrease as well...
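The routine above (a primary-range repair on every node, followed by a cleanup) can be sketched as a small dry-run script; the hostnames are placeholders, not taken from this thread, and the script only prints the commands so you can review them before running anything:

```shell
#!/bin/sh
# Rolling-repair sketch: "repair -pr" repairs only each node's primary
# range, so running it once per node covers the whole ring exactly once.
# NODES is an assumed placeholder list - substitute your own hosts.
NODES="node1 node2 node3"

CMDS=""
for host in $NODES; do
  CMDS="$CMDS
nodetool -h $host repair -pr
nodetool -h $host cleanup"
done

# Dry run: print the commands instead of executing them.
echo "$CMDS"
```

In practice you would schedule something like this from cron so each full pass completes well inside gc_grace_seconds.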

Cheers,

Roni Balthazar




On Wed, Feb 18, 2015 at 12:39 PM, Ja Sam ptrstp...@gmail.com wrote:
 1) we tried to run repairs but they usually do not succeed. But we had
 Leveled compaction before. Last week we ALTERed the tables to STCS, because
 the guys from DataStax suggested that we should not use Leveled and should
 alter the tables to STCS, because we don't have SSDs. After this change we
 did not run any repair. Anyway I don't think it will change anything in the
 SSTable count - if I am wrong please let me know

 2) I did this. My tables are 99% write only. It is audit system

 3) Yes I am using default values

 4) In both operations I am using LOCAL_QUORUM.

 I am almost sure that the READ timeout happens because of too many SSTables.
 Anyway, I would first like to fix the too many pending compactions. I still
 don't know how to speed them up.


 On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar ronibaltha...@gmail.com
 wrote:

 Are you running repairs within gc_grace_seconds? (default is 10 days)

 http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html

 Double check if you set cold_reads_to_omit to 0.0 on tables with STCS
 that you do not read often.

 Are you using default values for the properties
 min_compaction_threshold(4) and max_compaction_threshold(32)?

 Which Consistency Level are you using for reading operations? Check if
 you are not reading from DC_B due to your Replication Factor and CL.

 http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html


 Cheers,

 Roni Balthazar

 On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam ptrstp...@gmail.com wrote:
  I don't have problems with DC_B (replica) only in DC_A(my system write
  only
  to it) I have read timeouts.
 
  I checked the SSTable count in OpsCenter and I have:
  1) in DC_A about the same +-10% for the last week, a small increase for
  the last 24h (it is more than 15000-2 SSTables depending on the node)
  2) in DC_B the last 24h shows up to a 50% decrease, which is a good
  prognosis. Now I have fewer than 1000 SSTables
 
  What did you measure during system optimizations? Or do you have an idea
  of what more I should check?
  1) I look at CPU Idle (one node is 50% idle, rest 70% idle)
  2) Disk queue - mostly is it near zero: avg 0.09. Sometimes there are
  spikes
  3) system RAM usage is almost full
  4) In Total Bytes Compacted most lines are below 3MB/s. For the total of
  DC_A it is less than 10MB/s; in DC_B it looks much better (avg is like
  17MB/s)
 
  something else?
 
 
 
  On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar
  ronibaltha...@gmail.com
  wrote:
 
  Hi,
 
  You can check if the number of SSTables is decreasing. Look for the
  SSTable count information of your tables using nodetool cfstats.
  The compaction history can be viewed using nodetool
  compactionhistory.
 
  About the timeouts, check this out:
 
  http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
  Also try to run nodetool tpstats to see the threads statistics. It
  can lead you to know if you are having performance problems. If you
  are having too many pending tasks or dropped messages, maybe will you
  need to tune your system (eg: driver's timeout, concurrent reads and
  so on)
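The tpstats check quoted above can be automated by scanning the Dropped column; a minimal sketch, using illustrative sample output (in practice replace SAMPLE with the real `nodetool tpstats` output for the node):

```shell
#!/bin/sh
# Illustrative "Message type / Dropped" section of nodetool tpstats;
# in practice:  SAMPLE=$(nodetool -h <host> tpstats)
SAMPLE="Message type           Dropped
READ                   120
MUTATION               0
REQUEST_RESPONSE       0"

# Any non-zero Dropped count signals the node is shedding load.
DROPPED=$(echo "$SAMPLE" | awk 'NR > 1 && $2 > 0 { print $1 " dropped: " $2 }')
echo "$DROPPED"
```

With the sample above this flags only the READ line, which is the pattern you would expect to see alongside read timeouts.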
 
  Regards,
 
  Roni Balthazar
 
  On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam ptrstp...@gmail.com wrote:
   Hi,
   Thanks for your tip; it looks like something changed - I still don't
   know if it is ok.
  
   My nodes started to do more compactions, but it looks like some
   compactions are really slow.
   In IO we have idle time, CPU is quite ok (30%-40%). We set
   compactionthroughput to 999, but I do not see a difference.
  
   Can we check something more? Or do you have any method to monitor
   progress with the small files?
  
   Regards
  
   On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar
   ronibaltha...@gmail.com
   wrote:
  
   HI,
  
   Yes... I had the same issue and setting cold_reads_to_omit to 0.0
   was
   the solution...
   The number of SSTables decreased from many thousands to a number
   below
   a hundred and the SSTables are now much bigger with several
   gigabytes
   (most of them).
  
   Cheers,
  
   Roni Balthazar
  
  
  
   On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam ptrstp...@gmail.com
   wrote:
After some diagnostics (we didn't set cold_reads_to_omit yet), compactions
are running but VERY slowly

Re: Many pending compactions

2015-02-16 Thread Roni Balthazar
Hi,

You can run nodetool compactionstats to view statistics on compactions.
Setting cold_reads_to_omit to 0.0 can help to reduce the number of
SSTables when you use Size-Tiered compaction.
You can also create a cron job to increase the value of
setcompactionthroughput during the night or when your IO is not busy.

From http://wiki.apache.org/cassandra/NodeTool:
0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
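Setting cold_reads_to_omit is done per table through the compaction options; a minimal sketch that builds the ALTER TABLE statement (the keyspace and table names are placeholders, not from this thread) and could be piped to cqlsh:

```shell
#!/bin/sh
# Build the ALTER TABLE statement that sets cold_reads_to_omit to 0.0
# on a Size-Tiered table. my_keyspace/my_table are placeholder names.
KEYSPACE="my_keyspace"
TABLE="my_table"

CQL="ALTER TABLE $KEYSPACE.$TABLE WITH compaction = {
  'class': 'SizeTieredCompactionStrategy',
  'cold_reads_to_omit': 0.0 };"

echo "$CQL"
# To apply it against a live node:  echo "$CQL" | cqlsh <host>
```

Keeping the statement in a script makes it easy to repeat for every cold table that is accumulating SSTables.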

Cheers,

Roni Balthazar

On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam ptrstp...@gmail.com wrote:
 One thing I do not understand: in my case compaction is running permanently.
 Is there a way to check which compactions are pending? The only information
 is about the total count.


 On Monday, February 16, 2015, Ja Sam ptrstp...@gmail.com wrote:

  Of course I made a mistake. I am using 2.1.2. Anyway, the nightly build
  is available from
  http://cassci.datastax.com/job/cassandra-2.1/

  I read about cold_reads_to_omit. It looks promising. Should I also set
  compaction throughput?

 p.s. I am really sad that I didn't read this before:
 https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/



 On Monday, February 16, 2015, Carlos Rolo r...@pythian.com wrote:

 Hi 100% in agreement with Roland,

 2.1.x series is a pain! I would never recommend the current 2.1.x series
 for production.

 Clocks is a pain, and check your connectivity! Also check tpstats to see
 if your threadpools are being overrun.

 Regards,

 Carlos Juzarte Rolo
 Cassandra Consultant

 Pythian - Love your data

 rolo@pythian | Twitter: cjrolo | Linkedin:
 linkedin.com/in/carlosjuzarterolo
 Tel: 1649
 www.pythian.com

 On Mon, Feb 16, 2015 at 8:12 PM, Roland Etzenhammer
 r.etzenham...@t-online.de wrote:

 Hi,

 1) Actual Cassandra 2.1.3, it was upgraded from 2.1.0 (suggested by Al
 Tobey from DataStax)
 7) minimal reads (usually none, sometimes few)

  those two points keep me repeating an answer I got. First, where did you
  get 2.1.3 from? Maybe I missed it, I will have a look. But if it is 2.1.2,
  which is the latest released version, that version has many bugs - most of
  them I got kicked by while testing 2.1.2. I got many problems with
  compactions not being triggered on column families not being read, and
  compactions and repairs not being completed.  See


  https://www.mail-archive.com/search?l=user@cassandra.apache.org&q=subject:%22Re%3A+Compaction+failing+to+trigger%22&o=newest&f=1
 https://www.mail-archive.com/user%40cassandra.apache.org/msg40768.html

  Apart from that, how are those two datacenters connected? Maybe there
  is a bottleneck.

  Also, do you have ntp up and running on all nodes to keep all clocks in
  tight sync?

 Note: I'm no expert (yet) - just sharing my 2 cents.

 Cheers,
 Roland



 --






Re: High read latency after data volume increased

2015-01-09 Thread Roni Balthazar
Hi there,

Compaction remains running under our workload.
We are using RAID arrays of SATA HDDs.

When trying to run cfhistograms on our user_data table, we are getting
this message:
nodetool: Unable to compute when histogram overflowed

Please see what happens when running some queries on this cf:
http://pastebin.com/jbAgDzVK

Thanks,

Roni Balthazar

On Fri, Jan 9, 2015 at 12:03 PM, datastax jlacefi...@datastax.com wrote:
 Hello

   You may not be experiencing versioning issues.   Do you know if compaction
 is keeping up with your workload?  The behavior described in the subject is
 typically associated with compaction falling behind or having a suboptimal
 compaction strategy configured.   What does the output of nodetool
 cfhistograms <keyspace> <table> look like for a table that is experiencing
 this issue?  Also, what type of disks are you using on the nodes?

 Sent from my iPad

 On Jan 9, 2015, at 8:55 AM, Brian Tarbox briantar...@gmail.com wrote:

 C* seems to have more than its share of version x doesn't work, use version
 y  type issues

 On Thu, Jan 8, 2015 at 2:23 PM, Robert Coli rc...@eventbrite.com wrote:

 On Thu, Jan 8, 2015 at 11:14 AM, Roni Balthazar ronibaltha...@gmail.com
 wrote:

 We are using C* 2.1.2 with 2 DCs. 30 nodes DC1 and 10 nodes DC2.


 https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/

 2.1.2 in particular is known to have significant issues. You'd be better
 off running 2.1.1 ...

 =Rob





 --
 http://about.me/BrianTarbox


Re: High read latency after data volume increased

2015-01-08 Thread Roni Balthazar
Hi Robert,

We downgraded to 2.1.1, but got the very same result. The read latency is
still high, but we figured out that it happens only when using a specific
keyspace.
Please see the graphs below...

[graphs omitted from the archive]
Trying another keyspace with 600+ reads/sec, we are getting the acceptable
~30ms read latency.

Let me know if I need to provide more information.

Thanks,

Roni Balthazar

On Thu, Jan 8, 2015 at 5:23 PM, Robert Coli rc...@eventbrite.com wrote:

 On Thu, Jan 8, 2015 at 11:14 AM, Roni Balthazar ronibaltha...@gmail.com
 wrote:

 We are using C* 2.1.2 with 2 DCs. 30 nodes DC1 and 10 nodes DC2.


 https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/

 2.1.2 in particular is known to have significant issues. You'd be better
 off running 2.1.1 ...

 =Rob




High read latency after data volume increased

2015-01-08 Thread Roni Balthazar
Hi there,

We are using C* 2.1.2 with 2 DCs. 30 nodes DC1 and 10 nodes DC2.

While our data volume is increasing (34 TB now), we are running into
some problems:

1) Read latency is around 1000 ms when running 600 reads/sec (DC1
CL.LOCAL_ONE). At the same time the load average is about 20-30 on all
DC1 nodes (8 cores CPU - 32 GB RAM). C* starts timing out connections.
In this scenario OpsCenter has some issues as well: it resets all graph
layouts back to the default on every refresh, and it doesn't return to
normal after the load decreases. I only managed to restore OpsCenter's
normal behavior by reinstalling it.
Just for reference, we are using SATA HDDs on all nodes and running
hdparm to check disk performance. Under this load, some nodes are
reporting very low read rates (under 10 MB/sec), while others are above
100 MB/sec. Under a low load average this rate is above 250 MB/sec.

2) Repair takes at least 4-5 days to complete. Last repair was 20 days
ago. Running repair under high loads is bringing some nodes down with
the exception: JVMStabilityInspector.java:94 - JVM state determined
to be unstable. Exiting forcefully due to: java.lang.OutOfMemoryError:
Java heap space
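The per-node disk check described in (1) can be scripted; a dry-run sketch (hostnames and the device path are placeholder assumptions) that prints the hdparm timing commands to run over ssh:

```shell
#!/bin/sh
# Compare sequential buffered-read throughput across nodes with
# "hdparm -t". Hostnames and the device path are assumed placeholders.
NODES="dc1-node1 dc1-node2 dc1-node3"
DEVICE="/dev/sda"

CHECKS=""
for host in $NODES; do
  CHECKS="$CHECKS
ssh $host hdparm -t $DEVICE"
done

# Dry run: nodes reporting far below their idle-time rate (~250 MB/sec
# in this thread) are the ones to investigate first.
echo "$CHECKS"
```

Running the timing a few times per node and comparing nodes against each other makes the slow disks stand out quickly.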

Any hints?

Regards,

Roni Balthazar


Re: Operating on large cluster

2014-10-23 Thread Roni Balthazar
Hi,

We use Puppet to manage our Cassandra configuration. (http://puppetlabs.com)

You can use Cluster SSH to send commands to the server as well.

Another good choice is Saltstack.
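Fanning one command out to many nodes with parallel-ssh (pssh) looks roughly like the sketch below; hosts.txt is an assumed inventory file, one hostname per line, and the final command is only echoed here as a dry run:

```shell
#!/bin/sh
# Sketch of running one command on many nodes with parallel-ssh (pssh),
# as an alternative to Cluster SSH. hosts.txt is an assumed inventory.
cat > hosts.txt <<'EOF'
node1
node2
node3
EOF

# With pssh installed this would run on all hosts at once (-i prints
# each host's output inline):
CMD="pssh -h hosts.txt -i 'nodetool setcompactionthroughput 999'"
echo "$CMD"
```

The same inventory file then works for any other one-shot action (cleanup, upgradesstables, and so on).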

Regards,

Roni

On Thu, Oct 23, 2014 at 5:18 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:

 Hi,

 I was wondering how you guys handle a large cluster (50+ machines).

 I mean there are times you need to change configuration (cassandra.yaml)
 or send a command to one, some or all nodes (cleanup, upgradesstables,
 setstreamthroughput or whatever).

 So far we have been using things like custom scripts for repairs or any
 routine maintenance, and cssh for specific one-shot actions on the
 cluster. But I guess this doesn't really scale; I guess we could use pssh
 instead. For configuration changes we use Capistrano, which might scale
 properly.

 So I would like to know: what methods do operators use on large clusters
 out there? Have some of you built open-sourced cluster management
 interfaces or scripts that could make things easier while operating on
 large Cassandra clusters?

 Alain



What will be the steps for adding new nodes

2011-04-15 Thread Roni
I have a 0.6.4 Cassandra cluster of two nodes in full replica (replication
factor 2). I want to add two more nodes and balance the cluster (replication
factor 2).

I want all of them to be seeds.

 

What should be the simple steps:

1. add the <AutoBootstrap>true</AutoBootstrap> setting to all the nodes or
only the new ones?

2. add <Seed>[new_node]</Seed> entries to the config file of the old nodes
before adding the new ones?

3. do the old nodes need to be restarted (if no change is needed in their
config file)?
 

TX,

 

 


