Re: Many pending compactions

2015-02-24 Thread Ja Sam
The repair result is as follows (we ran it on Friday): Cannot proceed on
repair because a neighbor (/192.168.61.201) is dead: session failed

To be honest, though, the neighbor did not die. The repair seemed to trigger
a series of full GC events on the initiating node. The results from the logs are:

[2015-02-20 16:47:54,884] Starting repair command #2, repairing 7 ranges
for keyspace prem_maelstrom_2 (parallelism=PARALLEL, full=false)
[2015-02-21 02:21:55,640] Lost notification. You should check server log
for repair status of keyspace prem_maelstrom_2
[2015-02-21 02:22:55,642] Lost notification. You should check server log
for repair status of keyspace prem_maelstrom_2
[2015-02-21 02:23:55,642] Lost notification. You should check server log
for repair status of keyspace prem_maelstrom_2
[2015-02-21 02:24:55,644] Lost notification. You should check server log
for repair status of keyspace prem_maelstrom_2
[2015-02-21 04:41:08,607] Repair session
d5d01dd0-b917-11e4-bc97-e9a66e5b2124 for range
(85070591730234615865843651857942052874,102084710076281535261119195933814292480]
failed with error org.apache.cassandra.exceptions.RepairException: [repair
#d5d01dd0-b917-11e4-bc97-e9a66e5b2124 on prem_maelstrom_2/customer_events,
(85070591730234615865843651857942052874,102084710076281535261119195933814292480]]
Sync failed between /192.168.71.196 and /192.168.61.199
[2015-02-21 04:41:08,608] Repair session
eb8d8d10-b967-11e4-bc97-e9a66e5b2124 for range
(68056473384187696470568107782069813248,85070591730234615865843651857942052874]
failed with error java.io.IOException: Endpoint /192.168.61.199 died
[2015-02-21 04:41:08,608] Repair session
c48aef00-b971-11e4-bc97-e9a66e5b2124 for range (0,10] failed with error
java.io.IOException: Cannot proceed on repair because a neighbor (/
192.168.61.201) is dead: session failed
[2015-02-21 04:41:08,609] Repair session
c48d38f0-b971-11e4-bc97-e9a66e5b2124 for range
(42535295865117307932921825928971026442,68056473384187696470568107782069813248]
failed with error java.io.IOException: Cannot proceed on repair because a
neighbor (/192.168.61.201) is dead: session failed
[2015-02-21 04:41:08,609] Repair session
c48d38f1-b971-11e4-bc97-e9a66e5b2124 for range
(127605887595351923798765477786913079306,136112946768375392941136215564139626496]
failed with error java.io.IOException: Cannot proceed on repair because a
neighbor (/192.168.61.201) is dead: session failed
[2015-02-21 04:41:08,619] Repair session
c48d6000-b971-11e4-bc97-e9a66e5b2124 for range
(136112946768375392941136215564139626496,0] failed with error
java.io.IOException: Cannot proceed on repair because a neighbor (/
192.168.61.201) is dead: session failed
[2015-02-21 04:41:08,620] Repair session
c48d6001-b971-11e4-bc97-e9a66e5b2124 for range
(102084710076281535261119195933814292480,127605887595351923798765477786913079306]
failed with error java.io.IOException: Cannot proceed on repair because a
neighbor (/192.168.61.201) is dead: session failed
[2015-02-21 04:41:08,620] Repair command #2 finished


We tried to run repair one more time. After 24 hours we got some streaming
errors. Moreover, we had to stop it because we started to get write timeouts
on the client :(

We checked iostat when we had the write timeouts. An example from one node in
DC_A is here:
The file also contains tpstats from all nodes. Nodes starting with z are
in DC_B; the rest are in DC_A.
Cassandra's data and commit log are on disk dm-XX.
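As a rough way to read such iostat output for saturation, a small awk filter can flag busy devices. The column position of %util below is an assumption (it varies by iostat version and flags), and the sample rows are invented for illustration, not taken from the attached file:

```shell
# Invented sample rows standing in for `iostat -x` device lines:
# device  r/s    w/s    %util   (column layout is an assumption)
sample='dm-10 12.3 204.5 97.2
dm-11 3.1 0.9 12.4'

# Flag devices busier than 90% -- likely suspects for write timeouts.
busy=$(echo "$sample" | awk '$4 > 90 {print $1, $4 "% busy"}')
echo "$busy"
```

On a real node you would pipe `iostat -x 5` through the same filter and watch whether the dm-XX device holding data and commitlog stays pegged.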

I also read
http://jonathanhui.com/cassandra-performance-tuning-and-monitoring and I
am thinking about:
1) memtable configuration - do you have any suggestions?
2) running INSERTs in batch statements - I am not sure whether this reduces
IO; again, do you have experience with this?
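On 1), a starting point might look like the fragment below; these are hypothetical cassandra.yaml values for a write-heavy 2.1 node, meant to be measured against, not recommendations from this thread. On 2), note that Cassandra batches exist mainly for atomicity: an unlogged batch only saves round trips when all rows target the same partition, and large multi-partition batches tend to add coordinator load rather than reduce IO.

```yaml
# Hypothetical tuning values for a write-heavy Cassandra 2.1 node --
# starting points to benchmark, not settings anyone in this thread confirmed:
memtable_heap_space_in_mb: 2048   # larger memtables => fewer, bigger SSTables
memtable_flush_writers: 2         # roughly one per data directory
```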

Any tips will be helpful

Regards
Piotrek

On Thu, Feb 19, 2015 at 10:34 AM, Roland Etzenhammer 
r.etzenham...@t-online.de wrote:

 Hi,

 2.1.3 is now the official latest release - I checked this morning and got
 this good surprise. Now it's update time - thanks to all the guys involved; if
 I meet any of you, there's a beer from me :-)

 The changelist is rather long:
 https://git1-us-west.apache.org/repos/asf?p=cassandra.git;
 a=blob_plain;f=CHANGES.txt;hb=refs/tags/cassandra-2.1.3

 Hopefully that will solve many of those oddities and not invent too many
 new ones :-)

 Cheers,
 Roland





Re: Many pending compactions

2015-02-19 Thread Roland Etzenhammer

Hi,

2.1.3 is now the official latest release - I checked this morning and 
got this good surprise. Now it's update time - thanks to all the guys 
involved; if I meet any of you, there's a beer from me :-)


The changelist is rather long:
https://git1-us-west.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/cassandra-2.1.3

Hopefully that will solve many of those oddities and not invent too many 
new ones :-)


Cheers,
Roland




Re: Many pending compactions

2015-02-18 Thread Roni Balthazar
Try repair -pr on all nodes.

If after that you still have issues, you can try to rebuild the SSTables using 
nodetool upgradesstables or scrub.

Regards,

Roni Balthazar

 Em 18/02/2015, às 14:13, Ja Sam ptrstp...@gmail.com escreveu:
 
 ad 3)  I did this already yesterday (setcompactionthroughput too). But 
 SSTables are still increasing.
 
 ad 1) Which do you think I should use: -pr, or should I try incremental?
 
 
 
 On Wed, Feb 18, 2015 at 4:54 PM, Roni Balthazar ronibaltha...@gmail.com 
 wrote:
 You are right... Repair makes the data consistent between nodes.
 
 I understand that you have 2 issues going on.
 
 You need to run repair periodically without errors, and you need to decrease the 
 number of pending compactions.
 
 So I suggest:
 
 1) Run repair -pr on all nodes. If you upgrade to the new 2.1.3, you can use 
 incremental repairs. There were some bugs on 2.1.2.
 2) Run cleanup on all nodes
 3) Since you have too many cold SSTables, set cold_reads_to_omit to 0.0, and 
 increase setcompactionthroughput for some time and see if the number of 
 SSTables is going down.
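The three steps above could be scripted roughly as below. The node names are placeholders, and the plan is only echoed (a dry run); on a real cluster you would execute each printed command:

```shell
# Placeholder node list -- replace with your own hosts.
NODES="node1 node2"

# Dry-run plan for steps 1 and 2: repair -pr, then cleanup, on every node.
plan=$(for h in $NODES; do
  echo "nodetool -h $h repair -pr"
  echo "nodetool -h $h cleanup"
done)
echo "$plan"

# Step 3 would be an ALTER TABLE setting cold_reads_to_omit to 0.0, plus
# `nodetool setcompactionthroughput <higher value>` while watching cfstats.
```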
 
 Let us know what errors you are getting when running repairs.
 
 Regards,
 
 Roni Balthazar
 
 
 On Wed, Feb 18, 2015 at 1:31 PM, Ja Sam ptrstp...@gmail.com wrote:
 Can you explain to me what the correlation is between growing SSTables and 
 repair? 
 Until your mail, I was sure that repair only makes data consistent 
 between nodes.
 
 Regards
 
 
 
 On Wed, Feb 18, 2015 at 4:20 PM, Roni Balthazar ronibaltha...@gmail.com 
 wrote:
 Which error are you getting when running repairs?
 You need to run repair on your nodes within gc_grace_seconds (eg:
 weekly), since they hold data that are not read frequently. You can run
 repair -pr on all nodes. Since you do not have deletes, you will not
 have trouble with that. If you have deletes, it's better to increase
 gc_grace_seconds before the repair.
 http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
 After repair, try to run a nodetool cleanup.
 
 Check if the number of SSTables goes down after that... Pending
 compactions must decrease as well...
 
 Cheers,
 
 Roni Balthazar
 
 
 
 
 On Wed, Feb 18, 2015 at 12:39 PM, Ja Sam ptrstp...@gmail.com wrote:
  1) we tried to run repairs but they usually do not succeed. We had
  Leveled compaction before. Last week we ALTERed the tables to STCS, because the 
  guys
  from DataStax suggested that we should not use Leveled and should alter the tables 
  to
  STCS, because we don't have SSDs. After this change we did not run any
  repair. Anyway, I don't think it will change anything in the SSTable count - 
  if I
  am wrong please let me know
 
  2) I did this. My tables are 99% write only. It is an audit system
 
  3) Yes I am using default values
 
  4) In both operations I am using LOCAL_QUORUM.
 
  I am almost sure that the READ timeouts happen because of too many SSTables.
  Anyway, first I would like to fix the many pending compactions. I still
  don't know how to speed them up.
 
 
  On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar ronibaltha...@gmail.com
  wrote:
 
  Are you running repairs within gc_grace_seconds? (default is 10 days)
 
  http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
 
  Double check if you set cold_reads_to_omit to 0.0 on tables with STCS
  that you do not read often.
 
  Are you using default values for the properties
  min_compaction_threshold(4) and max_compaction_threshold(32)?
 
  Which Consistency Level are you using for reading operations? Check if
  you are not reading from DC_B due to your Replication Factor and CL.
 
  http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
 
 
  Cheers,
 
  Roni Balthazar
 
  On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam ptrstp...@gmail.com wrote:
   I don't have problems with DC_B (replica) only in DC_A(my system write
   only
   to it) I have read timeouts.
  
    I checked the SSTable count in OpsCenter and I have:
    1) in DC_A roughly the same (+-10%) for the last week, a small increase 
    for the last 24h (it
    is
    more than 15000-2 SSTables depending on the node)
    2) in DC_B the last 24h shows up to a 50% decrease, which gives a nice
    prognosis.
    Now I have fewer than 1000 SSTables
  
   What did you measure during system optimizations? Or do you have an 
   idea
   what more should I check?
    1) I looked at CPU idle (one node is 50% idle, the rest 70% idle)
    2) Disk queue - mostly it is near zero: avg 0.09. Sometimes there are
    spikes
    3) system RAM usage is almost full
    4) In Total Bytes Compacted, most lines are below 3MB/s. For all of
    DC_A
    it is less than 10MB/s; in DC_B it looks much better (avg is like
    17MB/s)
  
   something else?
  
  
  
   On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar
   ronibaltha...@gmail.com
   wrote:
  
   Hi,
  
   You can check if the number of SSTables is decreasing. Look for the
   SSTable count information of your tables using nodetool

Re: Many pending compactions

2015-02-18 Thread Ja Sam
As Al Tobey suggested, I upgraded my 2.1.0 to a snapshot version of 2.1.3. I
have now installed exactly this build:
https://cassci.datastax.com/job/cassandra-2.1/912/
I see many compactions completing, but some of them are really slow.
Maybe I should send some stats from OpsCenter or the servers? But it is
difficult for me to choose what is important.
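One way to decide what to share is to snapshot the usual nodetool views into a dated directory; the commands are the standard ones mentioned in this thread, and the directory naming is just a convention, not something anyone here prescribed:

```shell
# Dated directory for one snapshot of cluster stats.
out="cstats-$(date +%Y%m%d-%H%M)"
mkdir -p "$out"

# Views worth capturing (uncomment on a real Cassandra node):
# nodetool compactionstats   > "$out/compactionstats.txt"
# nodetool cfstats           > "$out/cfstats.txt"
# nodetool tpstats           > "$out/tpstats.txt"
# nodetool compactionhistory > "$out/compactionhistory.txt"
echo "$out"
```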

Regards



On Wed, Feb 18, 2015 at 6:11 PM, Jake Luciani jak...@gmail.com wrote:

 Ja, Please upgrade to official 2.1.3 we've fixed many things related to
 compaction.  Are you seeing the compactions % complete progress at all?


Re: Many pending compactions

2015-02-18 Thread Jake Luciani
Ja, Please upgrade to official 2.1.3 we've fixed many things related to
compaction.  Are you seeing the compactions % complete progress at all?


Re: Many pending compactions

2015-02-18 Thread Ja Sam
I don't have problems with DC_B (the replica); only in DC_A (my system writes
only to it) do I have read timeouts.

I checked the SSTable count in OpsCenter and I have:
1) in DC_A roughly the same (+-10%) for the last week, a small increase for
the last 24h (it is
more than 15000-2 SSTables depending on the node)
2) in DC_B the last 24h shows up to a 50% decrease, which gives a nice
prognosis.
Now I have fewer than 1000 SSTables

What did you measure during system optimizations? Or do you have an idea
what more I should check?
1) I looked at CPU idle (one node is 50% idle, the rest 70% idle)
2) Disk queue - mostly it is near zero: avg 0.09. Sometimes there are
spikes
3) system RAM usage is almost full
4) In Total Bytes Compacted, most lines are below 3MB/s. For all of DC_A
it is less than 10MB/s; in DC_B it looks much better (avg is like 17MB/s)

anything else?
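As a back-of-envelope check on point 4: if DC_A compacts at under 10MB/s in total, draining a backlog takes roughly backlog divided by rate. The 500GB backlog below is an assumed example for the arithmetic, not a number reported in this thread:

```shell
# hours = backlog_gb * 1024 / rate_mb_s / 3600
hours=$(awk 'BEGIN { printf "%.1f", 500 * 1024 / 10 / 3600 }')
echo "$hours hours to drain 500GB at 10MB/s"
```

At 3MB/s per node the same backlog would take over three times as long, which is one way to see why pending compactions pile up.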



On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar ronibaltha...@gmail.com
wrote:

 Hi,

 You can check if the number of SSTables is decreasing. Look for the
 SSTable count information of your tables using nodetool cfstats.
 The compaction history can be viewed using nodetool
 compactionhistory.

 About the timeouts, check this out:
 http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
 Also try to run nodetool tpstats to see the thread statistics. It
 can help you find out whether you are having performance problems. If you
 are having too many pending tasks or dropped messages, maybe you will
 need to tune your system (eg: driver's timeout, concurrent reads and
 so on)
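A quick way to act on the tpstats advice is to filter for pools with pending work. The sample rows below are invented, and the column positions assume the 2.1 tpstats layout (Pool Name, Active, Pending, Completed, ...):

```shell
# Invented sample rows in tpstats column order: name active pending completed
tp='MutationStage 32 4057 123456789
ReadStage 0 0 987654'

# Pools with a non-zero Pending column deserve a closer look.
backlog=$(echo "$tp" | awk '$3 > 0 {print $1, $3, "pending"}')
echo "$backlog"
```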

 Regards,

 Roni Balthazar


Re: Many pending compactions

2015-02-18 Thread Ja Sam
Hi,
Thanks for your tip; it looks like something changed - I still don't know
if it is ok.

My nodes started doing more compaction, but it looks like some compactions
are really slow.
IO is idle and CPU is quite ok (30%-40%). We set compactionthroughput to
999, but I do not see a difference.

Can we check something more? Or do you have any method to monitor progress
with the small files?

Regards

On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar ronibaltha...@gmail.com
wrote:

 HI,

 Yes... I had the same issue and setting cold_reads_to_omit to 0.0 was
 the solution...
 The number of SSTables decreased from many thousands to a number below
 a hundred and the SSTables are now much bigger with several gigabytes
 (most of them).

 Cheers,

 Roni Balthazar



 On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam ptrstp...@gmail.com wrote:
  After some diagnostics (we didn't set cold_reads_to_omit yet). Compactions
  are running but VERY slowly, with idle IO.
 
  We have a lot of Data files in Cassandra. In DC_A it is about ~12
 (only
  xxx-Data.db); DC_B has only ~4000.
 
  I don't know if this change anything but:
  1) in DC_A the avg size of a Data.db file is ~13 mb. I have a few really big
  ones,
  but most are really small (almost 1 files are less than 100mb).
  2) in DC_B the avg size of a Data.db is much bigger, ~260mb.
 
  Do you think that above flag will help us?
 
 
  On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam ptrstp...@gmail.com wrote:
 
  I set setcompactionthroughput 999 permanently and it doesn't change
  anything. IO is still same. CPU is idle.
 
  On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar 
 ronibaltha...@gmail.com
  wrote:
 
  Hi,
 
  You can run nodetool compactionstats to view statistics on
 compactions.
  Setting cold_reads_to_omit to 0.0 can help to reduce the number of
  SSTables when you use Size-Tiered compaction.
  You can also create a cron job to increase the value of
  setcompactionthroughput during the night or when your IO is not busy.
 
  From http://wiki.apache.org/cassandra/NodeTool:
  0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
  0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
 
  Cheers,
 
  Roni Balthazar
 
  On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam ptrstp...@gmail.com wrote:
    One thing I do not understand: in my case compaction is running
    permanently.
    Is there a way to check which compactions are pending? The only
    information available is
    the total count.
  
  
   On Monday, February 16, 2015, Ja Sam ptrstp...@gmail.com wrote:
  
    Of course I made a mistake. I am using 2.1.2. Anyway, a nightly build is
    available from
    http://cassci.datastax.com/job/cassandra-2.1/
   
    I read about cold_reads_to_omit. It looks promising. Should I also set
    the compaction throughput?
  
   p.s. I am really sad that I didn't read this before:
  
  
 https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
  
  
  
   On Monday, February 16, 2015, Carlos Rolo r...@pythian.com wrote:
  
   Hi 100% in agreement with Roland,
  
   2.1.x series is a pain! I would never recommend the current 2.1.x
   series
   for production.
  
   Clocks is a pain, and check your connectivity! Also check tpstats
 to
   see
   if your threadpools are being overrun.
  
   Regards,
  
   Carlos Juzarte Rolo
   Cassandra Consultant
  
   Pythian - Love your data
  
   rolo@pythian | Twitter: cjrolo | Linkedin:
   linkedin.com/in/carlosjuzarterolo
   Tel: 1649
   www.pythian.com
  
   On Mon, Feb 16, 2015 at 8:12 PM, Roland Etzenhammer
   r.etzenham...@t-online.de wrote:
  
   Hi,
  
   1) Actual Cassandra 2.1.3, it was upgraded from 2.1.0 (suggested
 by
   Al
   Tobey from DataStax)
   7) minimal reads (usually none, sometimes few)
  
    those two points keep me repeating an answer I got. First, where did
    you
    get 2.1.3 from? Maybe I missed it; I will have a look. But if it is
    2.1.2,
    which is the latest released version, that version has many bugs -
    most of
    them I got kicked by while testing 2.1.2. I got many problems with
    compactions not being triggered on column families not being read,
    and compactions and repairs not completing.  See
  
  
  
  
 https://www.mail-archive.com/search?l=user@cassandra.apache.orgq=subject:%22Re%3A+Compaction+failing+to+trigger%22o=newestf=1
  
  
 https://www.mail-archive.com/user%40cassandra.apache.org/msg40768.html
  
   Apart from that, how are those both datacenters connected? Maybe
   there
   is a bottleneck.
  
    Also, do you have ntp up and running on all nodes to keep all
    clocks in
    tight sync?
  
   Note: I'm no expert (yet) - just sharing my 2 cents.
  
   Cheers,
   Roland
  
  
  
   --
  
  
  
  
 
 
 



Re: Many pending compactions

2015-02-18 Thread Ja Sam
1) we tried to run repairs but they usually do not succeed. We had
Leveled compaction before. Last week we ALTERed the tables to STCS, because the
guys from DataStax suggested that we should not use Leveled and should alter the
tables to STCS, because we don't have SSDs. After this change we did not run any
repair. Anyway, I don't think it will change anything in the SSTable count - if
I am wrong please let me know

2) I did this. My tables are 99% write only. It is an audit system

3) Yes, I am using the default values

4) In both operations I am using LOCAL_QUORUM.

I am almost sure that the READ timeouts happen because of too many SSTables.
Anyway, first I would like to fix the many pending compactions. I still
don't know how to speed them up.
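For reference, the Leveled-to-STCS change described in 1), combined with the cold_reads_to_omit setting suggested elsewhere in the thread, would look roughly like this for the customer_events table seen in the repair logs. This is a sketch of the 2.1-era syntax, not the exact statement that was run:

```cql
ALTER TABLE prem_maelstrom_2.customer_events
  WITH compaction = {'class': 'SizeTieredCompactionStrategy',
                     'cold_reads_to_omit': '0.0'};
```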


On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar ronibaltha...@gmail.com
wrote:

 Are you running repairs within gc_grace_seconds? (default is 10 days)

 http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html

 Double check if you set cold_reads_to_omit to 0.0 on tables with STCS
 that you do not read often.

 Are you using default values for the properties
 min_compaction_threshold(4) and max_compaction_threshold(32)?

 Which Consistency Level are you using for reading operations? Check if
 you are not reading from DC_B due to your Replication Factor and CL.

 http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html


 Cheers,

 Roni Balthazar

 On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam ptrstp...@gmail.com wrote:
  I don't have problems with DC_B (the replica); only in DC_A (my system
  writes only to it) do I have read timeouts.

  I checked the SSTable count in OpsCenter and I have:
  1) in DC_A roughly the same (+-10%) for the last week, with a small
  increase over the last 24h (it is more than 15000-2 SSTables, depending
  on the node)
  2) in DC_B the last 24h shows up to a 50% decrease, which is a good
  prognosis. Now I have fewer than 1000 SSTables.

  What did you measure during system optimizations? Or do you have an idea
  what more I should check?
  1) I look at CPU idle (one node is 50% idle, the rest 70% idle)
  2) Disk queue - mostly it is near zero: avg 0.09. Sometimes there are
  spikes.
  3) System RAM usage is almost full.
  4) In Total Bytes Compacted most lines are below 3MB/s. For the total of
  DC_A it is less than 10MB/s; DC_B looks much better (avg is about 17MB/s).

  Something else?
 
 
 
  On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar ronibaltha...@gmail.com
 
  wrote:
 
  Hi,
 
  You can check if the number of SSTables is decreasing. Look for the
  SSTable count information of your tables using nodetool cfstats.
  The compaction history can be viewed using nodetool
  compactionhistory.
 
  About the timeouts, check this out:
 
 http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
   Also try running nodetool tpstats to see the thread pool statistics. It
   can help you find performance problems. If you have too many pending
   tasks or dropped messages, you may need to tune your system (e.g. the
   driver's timeout, concurrent reads, and so on).
 
  Regards,
 
  Roni Balthazar
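
The cfstats suggestion above can be turned into a one-liner that totals the
per-table "SSTable count" values. A minimal sketch - sample output is
embedded (the second table name is made up) so it runs offline; against a
live node you would pipe the real command shown in the comment instead:

```shell
# Stand-in for live output of:  nodetool cfstats prem_maelstrom_2
cat <<'EOF' > /tmp/cfstats_sample.txt
Keyspace: prem_maelstrom_2
  Table: customer_events
  SSTable count: 15213
  Table: other_table
  SSTable count: 842
EOF
# Sum every "SSTable count" line; live equivalent:
#   nodetool cfstats prem_maelstrom_2 | awk -F': ' '/SSTable count/ {s+=$2} END {print s}'
awk -F': ' '/SSTable count/ {s += $2} END {print s}' /tmp/cfstats_sample.txt
```

Tracking that single number per node over time is an easy way to tell
whether compactions are actually catching up.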
 
  On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam ptrstp...@gmail.com wrote:
    Hi,
    Thanks for your tip. It looks like something changed - I still don't
    know if it is OK.

    My nodes started to do more compaction, but it looks like some
    compactions are really slow.
    In IO we have idle time, and CPU is quite OK (30%-40%). We set
    compactionthroughput to 999, but I do not see a difference.

    Can we check something more? Or do you have any method to monitor
    progress with small files?
  
   Regards
  
   On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar
   ronibaltha...@gmail.com
   wrote:
  
    Hi,

    Yes... I had the same issue, and setting cold_reads_to_omit to 0.0 was
    the solution. The number of SSTables decreased from many thousands to
    fewer than a hundred, and the SSTables are now much bigger, several
    gigabytes each (most of them).
  
   Cheers,
  
   Roni Balthazar
  
  
  
    On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam ptrstp...@gmail.com wrote:
     After some diagnostics (we didn't set cold_reads_to_omit yet),
     compactions are running, but VERY slowly, with idle IO.

     We have a lot of Data files in Cassandra. DC_A has about ~12
     (only xxx-Data.db); DC_B has only ~4000.

     I don't know if this changes anything, but:
     1) in DC_A the avg size of a Data.db file is ~13 MB. I have a few
     really big ones, but most are really small (almost 1 files are less
     than 100 MB).
     2) in DC_B the avg size of a Data.db file is much bigger, ~260 MB.

     Do you think that the above flag will help us?
   
   
     On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam ptrstp...@gmail.com wrote:

     I set setcompactionthroughput 999 permanently and it doesn't change
     anything. IO is still the same. CPU is idle.
   
On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar

Re: Many pending compactions

2015-02-18 Thread Roni Balthazar
Are you running repairs within gc_grace_seconds? (default is 10 days)
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html

Double check if you set cold_reads_to_omit to 0.0 on tables with STCS
that you do not read often.

Are you using default values for the properties
min_compaction_threshold(4) and max_compaction_threshold(32)?

Which Consistency Level are you using for reading operations? Check if
you are not reading from DC_B due to your Replication Factor and CL.
http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html


Cheers,

Roni Balthazar
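
On the 2.1.x STCS tables mentioned above, cold_reads_to_omit and the
compaction thresholds are set per table through the CQL compaction options
map. A minimal sketch, assuming the keyspace/table from the repair logs
earlier in the thread (adjust to your schema); the script only writes the
CQL to a file and checks it, so the change itself is applied by feeding the
file to cqlsh:

```shell
# Apply later with:  cqlsh <host> -f /tmp/cold_reads.cql
cat <<'EOF' > /tmp/cold_reads.cql
ALTER TABLE prem_maelstrom_2.customer_events
  WITH compaction = {'class': 'SizeTieredCompactionStrategy',
                     'min_threshold': 4,
                     'max_threshold': 32,
                     'cold_reads_to_omit': 0.0};
EOF
# Sanity check: the option we care about is present exactly once.
grep -c cold_reads_to_omit /tmp/cold_reads.cql
```

Note that min_threshold/max_threshold are the CQL spellings of the
min_compaction_threshold/max_compaction_threshold properties mentioned
above.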

Re: Many pending compactions

2015-02-18 Thread Roni Balthazar
You are right... Repair makes the data consistent between nodes.

I understand that you have 2 issues going on.

You need to run repair periodically without errors and need to decrease the
numbers of compactions pending.

So I suggest:

1) Run repair -pr on all nodes. If you upgrade to the new 2.1.3, you can
use incremental repairs. There were some bugs on 2.1.2.
2) Run cleanup on all nodes
3) Since you have too many cold SSTables, set cold_reads_to_omit to 0.0,
and increase setcompactionthroughput for some time and see if the number of
SSTables is going down.

Let us know what errors you are getting when running repairs.

Regards,

Roni Balthazar
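
The three steps above can be sketched as a dry-run script. The node list
and keyspace name are placeholders (nodes.txt would hold your real
hostnames, one per line), and the script only prints the nodetool commands
so they can be reviewed before piping the output to sh:

```shell
# Placeholder node list - replace with your real hosts.
printf '%s\n' node-a1 node-a2 > /tmp/nodes.txt
# Steps 1 and 2: primary-range repair, then cleanup, on every node.
while read -r host; do
  echo "nodetool -h $host repair -pr prem_maelstrom_2"
  echo "nodetool -h $host cleanup prem_maelstrom_2"
done < /tmp/nodes.txt
# Step 3: raise the throughput cap while the backlog drains (undo later).
echo "nodetool setcompactionthroughput 999"
```

Review the printed commands, then pipe the script's output to sh to
actually run them, one node at a time.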


On Wed, Feb 18, 2015 at 1:31 PM, Ja Sam ptrstp...@gmail.com wrote:

 Can you explain the correlation between growing SSTables and repair?
 I was sure, until your mail, that repair only makes data consistent
 between nodes.

 Regards


 On Wed, Feb 18, 2015 at 4:20 PM, Roni Balthazar ronibaltha...@gmail.com
 wrote:

 Which error are you getting when running repairs?
 You need to run repair on your nodes within gc_grace_seconds (eg:
 weekly). They have data that are not read frequently. You can run
 repair -pr on all nodes. Since you do not have deletes, you will not
 have trouble with that. If you have deletes, it's better to increase
 gc_grace_seconds before the repair.

 http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
 After repair, try to run a nodetool cleanup.

 Check if the number of SSTables goes down after that... Pending
 compactions must decrease as well...

 Cheers,

 Roni Balthazar




 On Wed, Feb 18, 2015 at 12:39 PM, Ja Sam ptrstp...@gmail.com wrote:
  1) We tried to run repairs, but they usually do not succeed. We had
  Leveled compaction before; last week we ALTERed the tables to STCS,
  because the DataStax guys suggested we should not use Leveled compaction
  without SSDs and should switch the tables to STCS. After this change we
  did not run any repair. Anyway, I don't think it will change anything in
  the SSTable count - if I am wrong, please let me know.

  2) I did this. My tables are 99% write-only. It is an audit system.

  3) Yes, I am using default values.

  4) In both operations I am using LOCAL_QUORUM.

  I am almost sure that the READ timeouts happen because of too many
  SSTables. Anyway, first I would like to fix the too many pending
  compactions. I still don't know how to speed them up.
 
 


Re: Many pending compactions

2015-02-17 Thread Ja Sam
After some diagnostics (we didn't set cold_reads_to_omit yet), compactions
are running, but VERY slowly, with idle IO.

We have a lot of Data files in Cassandra. DC_A has about ~12
(only xxx-Data.db); DC_B has only ~4000.

I don't know if this changes anything, but:
1) in DC_A the avg size of a Data.db file is ~13 MB. I have a few really
big ones, but most are really small (almost 1 files are less than 100 MB).
2) in DC_B the avg size of a Data.db file is much bigger, ~260 MB.

Do you think that the above flag will help us?


On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam ptrstp...@gmail.com wrote:

 I set setcompactionthroughput 999 permanently and it doesn't change
 anything. IO is still the same. CPU is idle.





Re: Many pending compactions

2015-02-17 Thread Ja Sam
I set setcompactionthroughput 999 permanently and it doesn't change
anything. IO is still the same. CPU is idle.

On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar ronibaltha...@gmail.com
wrote:

 Hi,

 You can run nodetool compactionstats to view statistics on compactions.
 Setting cold_reads_to_omit to 0.0 can help to reduce the number of
 SSTables when you use Size-Tiered compaction.
 You can also create a cron job to increase the value of
 setcompactionthroughput during the night or when your IO is not busy.

 From http://wiki.apache.org/cassandra/NodeTool:
 0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
 0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16

 Cheers,

 Roni Balthazar
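
To see whether a throughput change has any effect, it helps to log the
pending-compaction count over time (e.g. from the same cron). A minimal
sketch - the `sample` line stands in for live output so it runs offline;
in a real job you would use the commented nodetool command instead:

```shell
# Live version:
#   pending=$(nodetool compactionstats | awk -F': *' '/pending tasks/ {print $2}')
sample='pending tasks: 412'
pending=${sample#pending tasks: }           # -> "412"
echo "$(date '+%F %T') $pending" >> /tmp/pending_compactions.log
tail -n 1 /tmp/pending_compactions.log
```

Plotting or just eyeballing that log tells you whether the backlog is
actually draining or only holding steady.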

 On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam ptrstp...@gmail.com wrote:
  One thing I do not understand: in my case compaction is running
  permanently. Is there a way to check which compactions are pending? The
  only information available is the total count.
 
 
  On Monday, February 16, 2015, Ja Sam ptrstp...@gmail.com wrote:
 
  Of course I made a mistake - I am using 2.1.2. Anyway, a nightly build
  is available from
  http://cassci.datastax.com/job/cassandra-2.1/

  I read about cold_reads_to_omit. It looks promising. Should I also set
  the compaction throughput?

  p.s. I am really sad that I didn't read this before:
  https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
 
 
 

Many pending compactions

2015-02-16 Thread Ja Sam
*Environment*
1) Cassandra 2.1.3, upgraded from 2.1.0 (suggested by Al Tobey from
DataStax)
2) not using vnodes
3) two data centres: 5 nodes in one DC (DC_A), 4 nodes in the second DC
(DC_B)
4) each node is a physical box with two 16-core HT Xeon processors
(E5-2660), 64GB RAM and 10x2TB 7.2K SAS disks (one for the commitlog, nine
for Cassandra data file directories), 1Gbps network. No RAID, only JBOD.
5) 3500 writes per second; I write only to DC_A with LOCAL_QUORUM, with
RF=5 in the local DC_A on our largest CFs.
6) acceptable write times (usually a few ms unless we encounter some
problem within the cluster)
7) minimal reads (usually none, sometimes a few)
8) iostat looks OK -
http://serverfault.com/questions/666136/interpreting-disk-stats-using-sar
9) we use Size-Tiered compaction (STCS); we converted to it from Leveled
(LCS)


*Problems*
Nowadays we see two main problems:
1) In DC_A we have a really large number of pending compactions (400-700
depending on the node). In DC_B everything is fine (10 is the short-term
maximum, usually fewer than 3). The pending-compaction count does not
change in the long term.
2) In DC_A reads usually end with a timeout exception. DC_B is fast and
works without problems.

*The question*
Is there a way I can diagnose what is wrong with my servers? I understand
that DC_A is doing much more work than DC_B, but we tested a much bigger
load on a test machine for a few days and everything was fine.
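
One cheap first check for the DC_A read timeouts - also suggested in
several replies in this thread - is scanning nodetool tpstats for dropped
messages; dropped READs point at overload on the read path. A sketch with
sample output embedded (the numbers are illustrative) so it runs offline;
live, you would pipe the command in the comment instead:

```shell
# Stand-in for the "Message type / Dropped" section of:  nodetool tpstats
cat <<'EOF' > /tmp/tpstats_dropped.txt
Message type           Dropped
READ                       152
MUTATION                     0
REQUEST_RESPONSE             0
EOF
# Print only the message types that were actually dropped.
awk 'NR > 1 && $2 + 0 > 0 {print $1, "dropped:", $2}' /tmp/tpstats_dropped.txt
```

Any non-zero READ row on a DC_A node is direct evidence that its read
stage cannot keep up, which fits the too-many-SSTables theory.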


Re: Many pending compactions

2015-02-16 Thread Roland Etzenhammer

Hi,

1) Actual Cassandra 2.1.3, it was upgraded from 2.1.0 (suggested by Al 
Tobey from DataStax)

7) minimal reads (usually none, sometimes few)

those two points keep me repeating an anwser I got. First where did you 
get 2.1.3 from? Maybe I missed it, I will have a look. But if it is 
2.1.2 whis is the latest released version, that version has many bugs - 
most of them I got kicked by while testing 2.1.2. I got many problems 
with compactions not beeing triggred on column families not beeing read, 
compactions and repairs not beeing completed.  See


https://www.mail-archive.com/search?l=user@cassandra.apache.orgq=subject:%22Re%3A+Compaction+failing+to+trigger%22o=newestf=1
https://www.mail-archive.com/user%40cassandra.apache.org/msg40768.html

Apart from that, how are those both datacenters connected? Maybe there 
is a bottleneck.


Also, do you have ntp up and running on all nodes to keep all clocks in 
tight sync?
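Clock skew can be spot-checked by parsing the offset column (in milliseconds)
of ntpq -p on each node. A sketch assuming the usual 10-column ntpq billboard
layout; verify the column positions on your systems, and note the sample peer
line below is made up:

```python
def max_offset_ms(ntpq_output: str) -> float:
    """Return the largest absolute NTP offset (ms) among peer lines."""
    offsets = []
    for line in ntpq_output.splitlines():
        parts = line.split()
        # Peer lines in the common ntpq billboard have 10 columns;
        # the offset in milliseconds is the 9th column (index 8).
        if len(parts) == 10:
            try:
                offsets.append(abs(float(parts[8])))
            except ValueError:
                continue
    return max(offsets, default=0.0)

sample = "*ntp1.example.com 10.0.0.1 2 u 32 64 377 0.452 -1.234 0.087"
print(max_offset_ms(sample))  # 1.234
```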


Note: I'm no expert (yet) - just sharing my 2 cents.

Cheers,
Roland


Re: Many pending compactions

2015-02-16 Thread Carlos Rolo
Hi, 100% in agreement with Roland.

2.1.x series is a pain! I would never recommend the current 2.1.x series
for production.

Clocks are a pain, and check your connectivity! Also check tpstats to see if
your threadpools are being overrun.
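One quick way to scan nodetool tpstats output for overrun pools is to flag any
row with a non-zero Blocked count or a large Pending count. A sketch assuming
the 2.1-era column order (Active, Pending, Completed, Blocked, All time
blocked); the sample rows and the pending threshold are illustrative:

```python
def overloaded_pools(tpstats_output: str, pending_limit: int = 10):
    """Return thread pools whose Pending exceeds the limit or Blocked > 0."""
    flagged = []
    for line in tpstats_output.splitlines():
        parts = line.split()
        # Pool rows: a name followed by five numeric columns.
        if len(parts) == 6 and all(p.isdigit() for p in parts[1:]):
            name, pending, blocked = parts[0], int(parts[2]), int(parts[4])
            if pending > pending_limit or blocked > 0:
                flagged.append(name)
    return flagged

sample = "\n".join([
    "Pool Name      Active Pending Completed Blocked All-time-blocked",
    "MutationStage  0      512     99999999  0       0",
    "ReadStage      0      0       1234      0       0",
])
print(overloaded_pools(sample))  # ['MutationStage']
```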

Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | LinkedIn:
linkedin.com/in/carlosjuzarterolo
Tel: 1649
www.pythian.com


Re: Many pending compactions

2015-02-16 Thread Ja Sam
One thing I do not understand: in my case compaction is running
permanently. Is there a way to check which compactions are pending? The only
information available is the total count.


Many pending compactions

2015-02-16 Thread Ja Sam
Of course I made a mistake. I am using 2.1.2. Anyway, a nightly build is
available from
http://cassci.datastax.com/job/cassandra-2.1/

I read about cold_reads_to_omit. It looks promising. Should I also set the
compaction throughput?

p.s. I am really sad that I didn't read this before:
https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/




Re: Many pending compactions

2015-02-16 Thread Roni Balthazar
Hi,

You can run nodetool compactionstats to view statistics on compactions.
Setting cold_reads_to_omit to 0.0 can help reduce the number of
SSTables when you use Size-Tiered compaction.
You can also create a cron job to raise the compaction throughput
(setcompactionthroughput) during the night or whenever your IO is not busy.

From http://wiki.apache.org/cassandra/NodeTool:
0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
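The same throttle toggle can be scripted outside cron; a minimal sketch that
only builds the nodetool command line (the subprocess call is left commented
out because it needs a live node, and the host name is illustrative):

```python
import subprocess

def throughput_cmd(host: str, mb_per_s: int) -> list:
    """Build the nodetool invocation that caps compaction throughput in MB/s."""
    return ["nodetool", "-h", host, "setcompactionthroughput", str(mb_per_s)]

night = throughput_cmd("localhost", 999)  # effectively unthrottled
day = throughput_cmd("localhost", 16)     # Cassandra's default cap
# subprocess.run(night, check=True)       # run this on a real node
print(" ".join(night))  # nodetool -h localhost setcompactionthroughput 999
```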

Cheers,

Roni Balthazar
