Re: restarting a node makes the CPU load of the entire cluster rise
Hi guys, just for the record, in case someone has this issue in the future: it is a bug, fixed in 1.2.12. I recommend anyone on this version to upgrade the cluster before getting totally stuck (when adding new machines / DCs / altering keyspaces / ... , anything using gossip, actually). We are going to plan a full downtime to be able to recover, since we are completely stuck. DataStax helped to find the issue, thanks DuyHai! For more information: https://issues.apache.org/jira/browse/CASSANDRA-6297

2014-06-26 16:16 GMT+02:00 Jonathan Lacefield jlacefi...@datastax.com:

Hello Alain, I'm not sure of the root cause of this item. It may be helpful to enable DEBUG and start the node to see what's happening, and to watch compaction stats or tpstats to understand what is taxing your system. The log file you provided shows a large ParNew while replaying commit log segments. Does your app insert very large rows or have individual columns that are large? I quickly reviewed CHANGES.txt (https://github.com/apache/cassandra/blob/cassandra-1.2/CHANGES.txt) to see if anything jumps out as a culprit, but didn't spot anything. Sorry I can't be of more help with this one. It may take some hands-on investigation, or maybe someone else in the community has experienced this issue and can provide feedback.

Thanks, Jonathan

Jonathan Lacefield
Solutions Architect, DataStax
(404) 822 3487
http://www.linkedin.com/in/jlacefield
http://www.datastax.com/cassandrasummit14

On Wed, Jun 18, 2014 at 3:07 PM, Robert Coli rc...@eventbrite.com wrote:

On Wed, Jun 18, 2014 at 5:36 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:

We stop the node using:

    nodetool disablegossip
    nodetool disablethrift
    nodetool disablebinary
    sleep 10
    nodetool drain
    sleep 30
    service cassandra stop

The stuff before nodetool drain here is redundant and doesn't actually do what you are expecting it to do. https://issues.apache.org/jira/browse/CASSANDRA-4162

=Rob
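For reference, per CASSANDRA-4162 a minimal stop sequence is enough, since drain already stops listening for client and inter-node connections and flushes memtables (a sketch; the service name is an assumption that varies by packaging):

    nodetool drain          # flush memtables, stop accepting writes and gossip
    service cassandra stop  # then stop the process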
Counters: consistency, atomic batch
Hi all, I am using Cassandra 2.0.x and the Astyanax 1.56.x driver (2.0.1 shows the same results) via the Thrift protocol. Questions about counters:

1. Consistency. Consider the simplest case, where we update the value of a single counter.

1.1. Is there any difference between updating a counter at ONE or QUORUM level? Yes, I understand that ONE may affect reading: readers may see an old value. That's ok, eventual consistency for the reader is ok. I am asking whether writing a counter at ONE may lead to totally broken data. I will explain:

* Host A stores the most recent value, 100; host B stores the old value, 99 (not replicated yet).
* I increment the counter at ONE. The request is sent to host B.
* B sees 99, adds 1, and saves 100, and this 100 becomes newer than the old 100 stored on host A. Later it will be replicated to A.
* Result: we lost 1 increment, because the value should actually be 101, not 100.

As I understand it, this scenario isn't possible with either QUORUM or ONE, because Cassandra actually stores the counter value in a shard structure. So I can safely update the counter value at ONE. Am I right?

1.2. If I update a counter at QUORUM level, does Cassandra read the old value also at QUORUM level? Or does the same point about the local shard make it possible to read only the value stored on the host doing the writing?

1.3. How will the behavior in 1.1 and 1.2 change in Cassandra 2.1 and 3.0? I read that counters are totally reimplemented in Cassandra 2.1, and will be again in 3.0.

2. Atomicity. I need to log one event as increments to several tables (yes, we use data duplication for different select queries), and I use a single batch mutation for all the increments. Can Cassandra execute a batch of counter increments in an atomic manner? Here: http://www.datastax.com/dev/blog/atomic-batches-in-cassandra-1-2 I see the following: "1.2 also introduces a separate BEGIN COUNTER BATCH for batched counter updates. Unlike other writes, counter updates are not idempotent, so replaying them automatically from the batchlog is not safe. Counter batches are thus strictly for improved performance when updating multiple counters in the same partition."

The text isn't 100% clear. Does it mean that Cassandra can't guarantee an atomic batch for counters even with BEGIN COUNTER BATCH? If it can't, in which Cassandra version will atomic batches for counters work? And what is the difference between 'BEGIN COUNTER BATCH' and 'BEGIN BATCH'? If it can, do you know which drivers support BEGIN COUNTER BATCH? I searched the whole source of Astyanax 2.0.1 and it seems that it doesn't support it currently.

Thanks in advance!

PS. Do you know how to communicate with the Astyanax team? I wrote several questions to the Google Groups email astyanax-cassandra-cli...@googlegroups.com but didn't receive any answers.

-- Best regards, Eugene Voytitsky
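For reference, a counter batch in CQL looks like the sketch below (the table and column names are hypothetical). Per the DataStax post quoted above, it groups counter updates for performance but is not replayed from the batchlog, so it does not carry the atomicity guarantee of a logged BEGIN BATCH:

    -- hypothetical counter tables duplicating the same event
    BEGIN COUNTER BATCH
      UPDATE events_by_day  SET hits = hits + 1 WHERE day = '2014-09-04' AND name = 'click';
      UPDATE events_by_name SET hits = hits + 1 WHERE name = 'click';
    APPLY BATCH;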
Cassandra JBOD disk configuration
Hi, Let's imagine that I have one keyspace with one big table configured with the size-tiered compaction strategy, and nothing else. The disk configuration would be to have 10x 500 GB disks, each mounted to a separate directory, with each directory configured as a separate entry in cassandra.yaml. Over time data accumulates, and at some point I have 4x 300 GB SSTables that Cassandra would like to compact into one 1.2 TB SSTable. Since each directory has at most 500 GB of disk space, that would not work, right? Is JBOD with more than 2 disks really usable with STCS? Would LCS probably be the only way to go in this case?

Cheers, Hannu
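For context, the JBOD layout described above is expressed in cassandra.yaml roughly as follows (the mount points are hypothetical):

    # one data_file_directories entry per 500 GB disk
    data_file_directories:
        - /mnt/disk01/cassandra/data
        - /mnt/disk02/cassandra/data
        # ... one entry per disk, through /mnt/disk10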
Moving a server from one DC to another
I have a situation where I need to move a server from one DC to another DC. I am using the PropertyFileSnitch, and my cassandra-topology.properties looks like this:

Server150=CLV:RAC1
Server151=CLV:RAC1
Server152=CLV:RAC1
Server153=DPT:RAC1
Server154=DPT:RAC1
Server155=DPT:RAC1
Server156=DPT:RAC1
Server157=DPT:RAC1
Server158=DPT:RAC1

I need to move Server153 and Server154 from the DPT DC to the CLV DC. The servers are not going to physically move, and the names/IPs will remain the same:

Server150=CLV:RAC1
Server151=CLV:RAC1
Server152=CLV:RAC1
Server153=CLV:RAC1
Server154=CLV:RAC1
Server155=DPT:RAC1
Server156=DPT:RAC1
Server157=DPT:RAC1
Server158=DPT:RAC1

What is the best way to do this? Do I just need to change cassandra-topology.properties and restart the nodes? Do I need to rebalance after? Or do I need to decommission the servers one at a time, then re-bootstrap them using the correct DC? Any help would be appreciated.

Gene Robichaux
Manager, Database Operations
8300 Douglas Avenue | Suite 800 | Dallas, TX 75225
Phone: 214-576-3273
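For reference, a sketch of the decommission / re-bootstrap option from the last question, run one node at a time (the paths and service name are assumptions; this shows the mechanics only, not a verdict on which option is best):

    nodetool decommission    # stream this node's replicas to the remaining DPT nodes
    service cassandra stop
    rm -rf /var/lib/cassandra/data/* /var/lib/cassandra/commitlog/*   # assumed paths; wipe old state
    # edit cassandra-topology.properties on ALL nodes: Server153=CLV:RAC1
    service cassandra start  # node bootstraps fresh into the CLV DC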
Pending tasks are not being processed.
Hi all, Yesterday I put a lot of blobs into Cassandra, and it created many pending tasks, probably compactions (a few hundred, according to OpsCenter). On all nodes the pending tasks were eventually processed, but on one problematic node I see no related activity. The problematic node seems to be responsive and there are no errors in the logs. What can be the reason?

Cassandra ver: multiple versions on 14 nodes (2.0.8 and 2.0.9)
OpsCenter ver: 4.1.2
Compaction type: leveled (we had capacity issues with size-tiered)

To process the pending tasks as fast as possible, I temporarily changed compaction_throughput_mb_per_sec to 0 on this specific node. It helps, but only while compaction is running, and currently the pending tasks are not being processed.

Thanks, Pavel
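As a side note, the throughput change described above can also be made at runtime with nodetool, without editing cassandra.yaml or restarting (a sketch):

    nodetool setcompactionthroughput 0   # 0 disables throttling; restore the original value later, e.g. 16
    nodetool compactionstats             # check pending tasks and active compactions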
Re: Pending tasks are not being processed.
On Thu, Sep 4, 2014 at 8:09 AM, Pavel Kogan pavel.ko...@cortica.com wrote:

[snip]

Cassandra ver: multiple versions on 14 nodes (2.0.8 and 2.0.9)

Running for extended periods of time with split versions is Not Supported. That said, perhaps you are running into https://issues.apache.org/jira/browse/CASSANDRA-7145 or https://issues.apache.org/jira/browse/CASSANDRA-7808 ?

=Rob
Re: Pending tasks are not being processed.
Should I experience any problems even if the split versions vary only by a minor digit? After another restart of the node, it seems that the problem was somehow solved.

Regards, Pavel

On Thu, Sep 4, 2014 at 1:59 PM, Robert Coli rc...@eventbrite.com wrote:

[snip]

Running for extended periods of time with split versions is Not Supported. That said, perhaps you are running into https://issues.apache.org/jira/browse/CASSANDRA-7145 or https://issues.apache.org/jira/browse/CASSANDRA-7808 ?

=Rob
Re: Pending tasks are not being processed.
On Thu, Sep 4, 2014 at 11:54 AM, Pavel Kogan pavel.ko...@cortica.com wrote:

Should I experience any problems even if the split versions vary only by a minor digit?

My statement states what it states, no more and no less. It is Not Supported to run with split minor versions, single digit increment or not, for longer than an upgrade takes.

After another restart of the node, it seems that the problem was somehow solved.

Fixed forever!

=Rob
http://twitter.com/rcolidba
Re: Counters: consistency, atomic batch
Counters are way more complicated than what you're illustrating. DataStax did a good blog post on this: http://www.datastax.com/dev/blog/whats-new-in-cassandra-2-1-a-better-implementation-of-counters

On Thu, Sep 4, 2014 at 6:34 AM, Eugene Voytitsky viy@gmail.com wrote:

[full original question snipped; see the message above]
--
Ken Hancock | System Architect, Advanced Advertising
SeaChange International, 50 Nagog Park, Acton, Massachusetts 01720
ken.hanc...@schange.com | www.schange.com | NASDAQ:SEAC
Office: +1 (978) 889-3329 | Skype: hancockks | Yahoo IM: hancockks
Question about EC2 and SSDs
Hi all,

We are migrating a small cluster on AWS from instances based on spinning disks (using instance store) to SSD-backed instances, and we're trying to pick the proper instance type. Some of the recommendations for spinning disks say to use different drives for the log vs. data partitions, to avoid issues with seek delays and contention for the disk heads. Since SSDs don't have the same seek delays, is it still recommended to use 2 SSD drives, or is one sufficient?

Thanks, Steve
Re: Question about EC2 and SSDs
With SSD, one drive should be sufficient for both data and commit logs.

Rahul Neelakantan

On Sep 4, 2014, at 8:05 PM, Steve Robenalt sroben...@highwire.org wrote:

[original question snipped]
Re: Question about EC2 and SSDs
On Thu, Sep 4, 2014 at 5:05 PM, Steve Robenalt sroben...@highwire.org wrote:

[snip] Since SSDs don't have the same seek delays, is it still recommended to use 2 SSD drives, or is one sufficient?

The purpose of distinct mount points for commitlog and data is to allow the commitlog to operate in an append-only manner without seeking. This is possible with an SSD disk.

=Rob
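For what it's worth, the shared-volume layout under discussion is just two settings in cassandra.yaml (the paths below are hypothetical):

    # sketch: commitlog and data on the same SSD-backed volume
    commitlog_directory: /mnt/ssd/cassandra/commitlog
    data_file_directories:
        - /mnt/ssd/cassandra/data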
Re: Question about EC2 and SSDs
Thanks Rahul! That was my inclination, but I don't want to take things like that for granted. Anybody have a dissenting view?

Steve

On Thu, Sep 4, 2014 at 5:14 PM, Rahul Neelakantan ra...@rahul.be wrote:

With SSD, one drive should be sufficient for both data and commit logs.

[rest of quote snipped]
Re: Question about EC2 and SSDs
Thanks Robert! I am assuming that you meant that it's possible with a single SSD, right?

On Thu, Sep 4, 2014 at 5:42 PM, Robert Coli rc...@eventbrite.com wrote:

[snip]

The purpose of distinct mount points for commitlog and data is to allow the commitlog to operate in an append-only manner without seeking. This is possible with an SSD disk.

=Rob
Re: Question about EC2 and SSDs
On Thu, Sep 4, 2014 at 5:44 PM, Steve Robenalt sroben...@highwire.org wrote:

Thanks Robert! I am assuming that you meant that it's possible with a single SSD, right?

Yes; no matter how many SSDs you have, you are unlikely to be able to convince one of them to physically seek a drive head across its platter, because they don't have heads or platters.

=Rob
Re: Question about EC2 and SSDs
On 5 Sep 2014, at 10:05 am, Steve Robenalt sroben...@highwire.org wrote:

[original question snipped]

As a side note, splitting the commit log and data dirs onto different volumes doesn't do a whole lot of good on AWS, irrespective of whether you are on spinning disks or SSDs, simply because the volumes presented to the VM may be on the same physical disk. Just RAID the available volumes and be done with it.

Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359
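A minimal sketch of the RAID approach Ben mentions, assuming two instance-store volumes at /dev/xvdb and /dev/xvdc (the device names are assumptions and vary by instance type):

    # stripe the ephemeral volumes into one RAID0 device
    mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/xvdb /dev/xvdc
    mkfs.ext4 /dev/md0
    mount /dev/md0 /mnt/cassandra   # then point the data and commitlog dirs here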
Re: Question about EC2 and SSDs
Yes, I am aware there are no heads on an SSD. I have also seen plenty of examples where compatibility issues force awkward engineering tradeoffs even as technology advances, so I am jaded enough to be wary of making assumptions, which is why I asked the question.

Steve

On Sep 4, 2014 5:50 PM, Robert Coli rc...@eventbrite.com wrote:

[snip]

Yes; no matter how many SSDs you have, you are unlikely to be able to convince one of them to physically seek a drive head across its platter, because they don't have heads or platters.

=Rob