Many pending compactions
*Environment*
1) Currently Cassandra 2.1.3, upgraded from 2.1.0 (as suggested by Al Tobey from DataStax)
2) Not using vnodes
3) Two data centres: 5 nodes in one DC (DC_A), 4 nodes in the second DC (DC_B)
4) Each node runs on a physical box with two 16-core HT Xeon processors (E5-2660), 64GB RAM and 10x2TB 7.2K SAS disks (one for the commitlog, nine for Cassandra data file directories), 1Gbps network. No RAID, only JBOD.
5) 3500 writes per second. I write only to DC_A with local_quorum, with RF=5 in the local DC_A on our largest CFs.
6) Acceptable write times (usually a few ms unless we encounter some problem within the cluster)
7) Minimal reads (usually none, sometimes a few)
8) iostat looks OK - http://serverfault.com/questions/666136/interpreting-disk-stats-using-sar
9) We use size-tiered compaction. We converted to it from leveled compaction.

*Problems*
We currently see two main problems:
1) In DC_A we have a really large number of pending compactions (400-700 depending on the node). In DC_B everything is fine (10 is the short-term maximum; usually it is less than 3). The number of pending compactions does not change in the long term.
2) In DC_A reads usually fail with a timeout exception. In DC_B reads are fast and work without problems.

*The question*
Is there a way to diagnose what is wrong with my servers? I understand that DC_A is doing much more work than DC_B, but we tested a much bigger load on a test machine for a few days and everything was fine.
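A rough way to put numbers on the DC_A write load described above. This is only a sketch: it assumes the 3500 writes/s figure is measured at the client, that all of it goes to tables with RF=5 in DC_A, and that writes are spread evenly. The function names are made up for illustration.

```python
# Back-of-the-envelope numbers for DC_A, using the figures from the post.
# Assumptions (not confirmed by the poster): 3500 writes/s is the client-side
# rate, all of it hits RF=5 tables, and load is evenly distributed.

def local_quorum(replication_factor: int) -> int:
    """Replicas that must acknowledge a LOCAL_QUORUM write: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replica_writes_per_node(client_writes_per_s: float,
                            replication_factor: int,
                            nodes: int) -> float:
    """Average replica writes per second that each node has to absorb."""
    return client_writes_per_s * replication_factor / nodes


quorum = local_quorum(5)
per_node = replica_writes_per_node(3500, 5, 5)
print(f"LOCAL_QUORUM needs {quorum} acks; each DC_A node takes ~{per_node:.0f} writes/s")
```

With RF=5 on a 5-node data centre, every node holds a replica of every partition, so under these assumptions each DC_A node flushes and compacts the full write stream rather than a fraction of it.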
Re: Many pending compactions
Hi,

1) Actual Cassandra 2.1.3, it was upgraded from 2.1.0 (suggested by Al Tobey from DataStax)
7) minimal reads (usually none, sometimes few)

Those two points make me repeat an answer I got. First, where did you get 2.1.3 from? Maybe I missed it; I will have a look. But if it is 2.1.2, which is the latest released version, that version has many bugs, and most of them bit me while testing 2.1.2. I had many problems with compactions not being triggered on column families that are not being read, and with compactions and repairs not completing. See

https://www.mail-archive.com/search?l=user@cassandra.apache.orgq=subject:%22Re%3A+Compaction+failing+to+trigger%22o=newestf=1
https://www.mail-archive.com/user%40cassandra.apache.org/msg40768.html

Apart from that, how are the two datacenters connected? Maybe there is a bottleneck. Also, do you have NTP up and running on all nodes to keep all clocks in tight sync?

Note: I'm no expert (yet), just sharing my 2 cents.

Cheers,
Roland
Re: How to speed up SELECT * query in Cassandra
Could you please share how much data you store on the cluster and what is HW configuration of the nodes?

These nodes are dedicated HW, 24 CPUs and 50GB RAM. Each node has a few TBs of data (you don't want to go over this) in RAID 50 (we're migrating over to JBOD). Each C* node is running 2.0.11 and configured to use an 8g heap, 2g new, and jdk1.7.0_55. Hadoop (2.2.0) tasktrackers and DFS run on these nodes as well; all up they use up to 12GB RAM, leaving ~30GB RAM for the kernel and page cache. Data locality is an important goal: in the worst-case scenarios we've seen it mean a fourfold throughput benefit. HDFS, being a volatile Hadoop-internals space for us, is on SSDs, providing strong m/r performance. (The commitlog of course is also on SSD. We made the mistake of putting it on the same SSD to begin with. Don't do that; the commitlog gets its own SSD.)

I am really impressed that you are able to read 100M records in ~4minutes on 4 nodes. It makes something like 100k reads per node, which is something we are quite far away from.

These are not individual reads and not the number of partition keys, but m/r records (or CQL rows). But yes, the performance of Spark against Cassandra is impressive.

It leads me to question, whether reading from Spark goes through Cassandra's JVM and thus go through normal read path, or if it reads the sstables directly from disks sequentially and possibly filters out old/tombstone values by itself?

Both the Hadoop-Cassandra integration and the Spark-Cassandra connector go through the normal read path, like all CQL read queries. With our m/r jobs each task works with just one partition key, doing repeated column slice reads through that partition key according to the ConfigHelper.rangeBatchSize setting, which we have set to 100. These Hadoop jobs use a custom-written CqlInputFormat because of the poor performance CqlInputFormat has today against a vnodes setup; our customisation is pretty much the same as the patch on offer in CASSANDRA-6091.

We haven't experienced this vnodes problem with the Spark connector. I presume that, like the Hadoop integration, Spark also bulk reads (column slices) from each partition key. Otherwise this is useful reading: http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting

This is also a cluster that serves requests to web applications that need low latency. Let it be said this isn't something I'd recommend, just the path we had to take because of our small initial dedicated-HW cluster. (You really want to separate online and offline datacenters, so that you can maximise the offline clusters for the heavy batch reads.)

~mck
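The "100k per node" estimate quoted above can be checked with quick arithmetic. A small sketch, assuming "~4 minutes" means exactly 240 seconds and that the work is split evenly across the 4 nodes; as the reply points out, the unit is m/r records (CQL rows), not individual read requests.

```python
# Rough throughput check for "100M records in ~4 minutes on 4 nodes".
# Assumptions: exactly 240 seconds, and an even split across nodes.

def records_per_node_per_second(records: int, seconds: float, nodes: int) -> float:
    """Average number of records each node delivers per second."""
    return records / seconds / nodes


rate = records_per_node_per_second(100_000_000, 4 * 60, 4)
print(f"~{rate:,.0f} records/s per node")
```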
Re: Many pending compactions
Hi,

100% in agreement with Roland: the 2.1.x series is a pain! I would never recommend the current 2.1.x series for production.

Clocks are a pain, and check your connectivity! Also check tpstats to see if your thread pools are being overrun.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant
Pythian - Love your data
rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Tel: 1649
www.pythian.com
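One possible way to act on the tpstats suggestion is to scan the output for pools that have work queued or blocked. The sketch below is hedged: the sample text is invented for illustration, and the column order (Active, Pending, Completed, Blocked, All time blocked) is an assumption about how 2.1-era `nodetool tpstats` prints its table, so check it against your own output. The helper names `parse_tpstats` and `backed_up` are hypothetical.

```python
# Flag thread pools that look overrun in `nodetool tpstats` output.
# Assumption: each pool row ends with five integer columns in the order
# Active, Pending, Completed, Blocked, All time blocked.

def parse_tpstats(text: str) -> dict:
    """Return {pool name: counters} for every row ending in five integers."""
    pools = {}
    for line in text.splitlines():
        parts = line.split()
        # Skips the header, blank lines and the dropped-message section.
        if len(parts) < 6 or not all(p.isdigit() for p in parts[-5:]):
            continue
        active, pending, completed, blocked, all_time_blocked = map(int, parts[-5:])
        pools[" ".join(parts[:-5])] = {
            "active": active,
            "pending": pending,
            "completed": completed,
            "blocked": blocked,
            "all_time_blocked": all_time_blocked,
        }
    return pools


def backed_up(pools: dict, max_pending: int = 0) -> list:
    """Names of pools with queued or blocked work, sorted alphabetically."""
    return sorted(
        name for name, s in pools.items()
        if s["pending"] > max_pending or s["blocked"] > 0 or s["all_time_blocked"] > 0
    )


# Invented sample, NOT taken from the cluster discussed in this thread.
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0       12345678         0                 0
ReadStage                         0         0           4321         0                 0
CompactionExecutor                2       412          98765         0                 0

Message type           Dropped
READ                         0
"""

print(backed_up(parse_tpstats(SAMPLE)))
```

In practice the text would come from running `nodetool tpstats` on each node; comparing the flagged pools between DC_A and DC_B nodes is one way to see where work is piling up.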
Re: Many pending compactions
One thing I do not understand: in my case compaction is running permanently. Is there a way to check which compaction is pending? The only information available is the total count.
Many pending compactions
Of course I made a mistake. I am using 2.1.2. Anyway, a nightly build is available from http://cassci.datastax.com/job/cassandra-2.1/

I read about cold_reads_to_omit. It looks promising. Should I also set the compaction throughput?

P.S. I am really sad that I didn't read this before: https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
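For reference, cold_reads_to_omit is a sub-option of size-tiered compaction and is set per table. The sketch below only builds the CQL text; the keyspace and table names are placeholders, and the helper `stcs_cold_reads_statement` is hypothetical, not part of any driver. As far as I know, altering `compaction` replaces the whole option map, so any other sub-options you rely on would need to be listed as well.

```python
# Build an ALTER TABLE statement that sets cold_reads_to_omit for a table
# using SizeTieredCompactionStrategy. Names below are placeholders.

def stcs_cold_reads_statement(keyspace: str, table: str,
                              cold_reads_to_omit: float = 0.0) -> str:
    """Return CQL setting cold_reads_to_omit on a size-tiered table."""
    return (
        f"ALTER TABLE {keyspace}.{table} WITH compaction = "
        f"{{'class': 'SizeTieredCompactionStrategy', "
        f"'cold_reads_to_omit': {cold_reads_to_omit}}};"
    )


print(stcs_cold_reads_statement("my_keyspace", "my_table"))
```

The resulting statement can be pasted into cqlsh once the placeholder names are replaced with real ones.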
Re: Many pending compactions
Hi,

You can run nodetool compactionstats to view statistics on compactions. Setting cold_reads_to_omit to 0.0 can help to reduce the number of SSTables when you use size-tiered compaction. You can also create a cron job to raise the compaction throughput (nodetool setcompactionthroughput) during the night or when your IO is not busy.

From http://wiki.apache.org/cassandra/NodeTool:

    0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
    0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16

Cheers,
Roni Balthazar
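A minimal sketch of how the two suggestions above could be scripted for tracking a node over time. It assumes that `nodetool compactionstats` prints a line of the form `pending tasks: N`; the sample string is invented. The second helper simply mirrors the cron schedule from the wiki (999 MB/s from midnight until 06:00, 16 MB/s otherwise); both function names are made up for illustration.

```python
import re

# Assumption: compactionstats output contains a line "pending tasks: N".

def pending_tasks(compactionstats_output: str):
    """Return the pending compaction count, or None if the line is missing."""
    match = re.search(r"^pending tasks:\s*(\d+)", compactionstats_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def throughput_for_hour(hour: int, night_mb_s: int = 999, day_mb_s: int = 16) -> int:
    """Compaction throughput (MB/s) matching the cron entries: night 00:00-05:59."""
    return night_mb_s if 0 <= hour < 6 else day_mb_s


# Invented sample, NOT real output from the cluster in this thread.
SAMPLE = "pending tasks: 412\n"

print(pending_tasks(SAMPLE), throughput_for_hour(3), throughput_for_hour(14))
```

Logging the pending count periodically on a DC_A node shows whether the backlog is actually shrinking after a throughput or cold_reads_to_omit change, rather than relying on a single snapshot.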