Monitoring replication lag/latency in multi DC setup
Hi,

We have a multi-DC Cassandra ring with two DCs. We use LOCAL_QUORUM for writes and reads. The network between the DCs is sometimes flaky, with episodes lasting from a few minutes to a few tens of minutes. I wanted to know the best way to measure or monitor either the lag or the replication latency between the data centers. Are there any metrics I can monitor to find the backlog of data that needs to be transferred?

Thanks in advance.
VR
Re: Monitoring replication lag/latency in multi DC setup
Thanks for the quick reply, Mohit.

Can we measure or monitor the size of the hinted handoffs? Would that be a good enough indicator of my backlog? Although we know when the network is flaky, we are interested in knowing how much data is piling up in the local DC that still needs to be transferred.

Greatly appreciate your help.
VR

On Wed, Sep 5, 2012 at 8:33 PM, Mohit Anchlia mohitanch...@gmail.com wrote:

As far as I know, Cassandra doesn't use an internal queueing mechanism specific to replication. Cassandra sends the write to the remote DC, and after that it is up to the TCP/IP stack to deal with buffering. If requests start to time out, Cassandra uses hinted handoff (HH) for up to a certain time. For a longer outage you would have to run repair.

Also look at the TCP/IP tuning parameters that are helpful in your scenario: http://kaivanov.blogspot.com/2010/09/linux-tcp-tuning.html

Run iperf and test the latency.
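If iperf is not at hand, a rough latency check between the DCs can be sketched in a few lines of Python. This is only an approximation under stated assumptions: it times TCP connection setup, which is not the same as Cassandra's replication latency, and the function names `tcp_connect_latency_ms` and `summarize` are made up for this illustration.

```python
import socket
import statistics
import time


def tcp_connect_latency_ms(host, port, samples=5, timeout=2.0):
    """Return TCP connect times in milliseconds, one per successful attempt."""
    results = []
    for _ in range(samples):
        start = time.perf_counter()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results.append((time.perf_counter() - start) * 1000.0)
        except OSError:
            # Timeouts and refused connections are skipped; compare
            # len(results) with samples to see how many attempts failed.
            pass
    return results


def summarize(latencies_ms):
    """Return (min, median, max) of the samples, or None if all attempts failed."""
    if not latencies_ms:
        return None
    return (min(latencies_ms), statistics.median(latencies_ms), max(latencies_ms))
```

Calling `summarize(tcp_connect_latency_ms("remote-dc-node.example", 7000))` from a node in the local DC gives a coarse picture; the host name is a placeholder, and 7000 assumes the default storage_port. A rising median or a growing number of failed attempts would line up with the flaky periods described above.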
Secondary index read/write explanation
Hi All,

I am a newbie to Cassandra and am trying to understand how secondary indexes work. I have been going over the discussion of local secondary indexes at https://issues.apache.org/jira/browse/CASSANDRA-749, and an interesting question at http://www.mail-archive.com/user@cassandra.apache.org/msg16966.html. The discussion seems to assume that the most common use cases are ones with range queries. Is this right? I am trying to understand the low-cardinality reasoning and how the read gets executed. I have the following questions, and I hope I can explain them well :)

1. When a write request is received, the data is written to the base CF and the index entry to a secondary (hidden) CF. If this is right, will the index entry be written locally on the node, or will it follow RP/OPP to be written to other nodes?
2. When a coordinator receives a read request with, say, the predicate x=y, where column x has a secondary index, how does the coordinator query the relevant node(s)? How does it avoid sending the request to all nodes if the data is indexed locally?

If there is any article or blog that can help me understand this better, please let me know.

Thanks again in advance.
VR
Re: Monitoring replication lag/latency in multi DC setup
Is there a specific metric you can recommend?

VR

On Wed, Sep 5, 2012 at 9:19 PM, Mohit Anchlia mohitanch...@gmail.com wrote:

Cassandra exposes a lot of metrics through JConsole. You might be able to get some information from there.
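One coarse signal that may be worth watching is the pending count of the hinted handoff thread pool, which is visible over JMX and, as far as I know, in the output of `nodetool tpstats`. The sketch below parses tpstats-style text. Several things here are assumptions: the column layout is taken from 1.x-era nodetool and should be checked against your version, the sample text uses invented numbers, and the Pending column counts queued tasks rather than bytes of hints, so it indicates that a backlog exists without giving its size.

```python
def pending_by_pool(tpstats_text):
    """Map pool name -> pending task count from `nodetool tpstats`-style text."""
    pending = {}
    for line in tpstats_text.splitlines():
        fields = line.split()
        # Thread-pool rows are assumed to be: name, active, pending, completed, ...
        # The header row and the dropped-message section fail this check.
        if len(fields) >= 4 and all(f.isdigit() for f in fields[1:4]):
            pending[fields[0]] = int(fields[2])
    return pending


# Illustrative sample only; the numbers are made up, not captured from a cluster.
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                         0         0           1200         0                 0
MutationStage                     0         0           3400         0                 0
HintedHandoff                     1         3             12         0                 0
"""

hints_pending = pending_by_pool(SAMPLE).get("HintedHandoff", 0)
```

Sampling this value periodically and alerting when it stays above zero would at least show when hints are queued for the remote DC.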
Re: Adding a new node
Thanks for the pointer. I restarted the entire cluster and started all nodes at the same time. However, I still see the issue: the ring view is not consistent. I am running 0.7.5. In general, if a node with a bad ring view starts first, then I guess the restart doesn't help either, as that node might be propagating its view. Is this assumption correct?

On Sun, May 8, 2011 at 9:02 PM, aaron morton aa...@thelastpickle.com wrote:

It is possible to change the IP address of a node; for background see http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/change-node-IP-address-td6197607.html

If you have already brought a new node back with a different IP, and the nodes in the cluster have different views of the ring (nodetool ring), you should see http://www.datastax.com/docs/0.7/troubleshooting/index#view-of-ring-differs-between-some-nodes

What version are you on, and what does nodetool ring say?

Hope that helps.

- Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
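To pin down which nodes hold a stale view, the ring views can be compared mechanically. The sketch below assumes the endpoint list reported by each node has already been collected (for example by running nodetool ring against every node); that collection step is left out, and the function name `find_divergent_views` and the IP addresses are hypothetical.

```python
from collections import Counter


def find_divergent_views(views):
    """Compare per-node ring views and report the nodes that disagree.

    views maps each node to the set of endpoints it reports in its ring and
    must be non-empty. Returns (majority_view, divergent), where divergent maps
    each disagreeing node to a (missing, extra) pair of endpoint sets.
    """
    counts = Counter(frozenset(v) for v in views.values())
    majority = counts.most_common(1)[0][0]
    divergent = {}
    for node, view in views.items():
        view = frozenset(view)
        if view != majority:
            divergent[node] = (set(majority - view), set(view - majority))
    return set(majority), divergent


# Hypothetical example: 10.0.0.3 is the dead node, 10.0.0.4 its replacement.
example_views = {
    "10.0.0.1": {"10.0.0.1", "10.0.0.2", "10.0.0.4"},
    "10.0.0.2": {"10.0.0.1", "10.0.0.2", "10.0.0.3"},
    "10.0.0.4": {"10.0.0.1", "10.0.0.2", "10.0.0.4"},
}
majority_view, stale = find_divergent_views(example_views)
```

In this made-up case `stale` flags 10.0.0.2 as missing 10.0.0.4 and still listing 10.0.0.3. The majority view is not necessarily the correct one, so the result only shows where to look.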
Adding a new node
Hi,

I am trying to bring up a new node (with a different IP) to replace a dead node on Cassandra 0.7.5. Rather than bootstrapping, I am copying the backed-up SSTable files to the new node, as my data runs into several GB. Although the node successfully joins the ring, some of the ring nodes still seem to point to the old dead node, as seen from the ring command. Is there a way to notify all nodes about the new node? I am looking for options that can bring the cluster back to its original state in a faster and more reliable manner, since I do have all the SSTable files. One option I looked at was to remove all the system tables and restart the entire cluster, but I lose the schemas with this approach.

Thanks in advance for your reply.
VR