Re: Cassandra DCOS | monitoring connection and user activity
Hi,

It looks like you need monitoring for Cassandra without using JMX. You can use the metric reporting libraries supported by Cassandra: https://wiki.apache.org/cassandra/Metrics#Reporting

I do not have specific experience with Cassandra on DCOS, but monitoring with these libraries and tools should not be any different. There are various options available to establish good monitoring (Graphite, Prometheus, Grafana).

Helpful links:
https://blog.pythian.com/monitoring-apache-cassandra-metrics-graphite-grafana/
https://github.com/instaclustr/cassandra-exporter
https://prometheus.io/docs/instrumenting/exporters/
https://grafana.com/dashboards/5408
http://thelastpickle.com/blog/2017/04/05/cassandra-graphite-measurements-filtered.html

Regards,
Anup Shirolkar

On Wed, 31 Oct 2018 at 18:41, Caesar, Maik wrote:
> Hello All,
>
> Does anyone have experience with monitoring Cassandra in DCOS?
>
> If we increase the load on Cassandra in DCOS, the application gets
> timeouts and loses its connections, and I do not have any information
> about what happened.
>
> Is there a way to get information about the number of current connections
> and which queries are being executed? Cassandra in DCOS has the JMX
> interface disabled, and I think nodetool does not provide such information.
>
> Regards
> Maik
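To make the suggestion above concrete, here is a minimal sketch of wiring Cassandra's built-in metrics reporting to Graphite via the metrics-reporter-config library. This assumes the metrics-reporter-config jar is already on Cassandra's classpath (lib/); the file paths, cluster prefix, and `graphite.example.com:2003` host are placeholders for your environment, not values from the thread:

```shell
# Sketch: configure Cassandra's pluggable metrics reporting (no remote JMX needed).
# Assumes the metrics-reporter-config jar is in Cassandra's lib/ directory.

cat > /etc/cassandra/metrics-reporter-graphite.yaml <<'EOF'
graphite:
  - period: 60
    timeunit: 'SECONDS'
    prefix: 'cassandra-cluster1'
    hosts:
      - host: 'graphite.example.com'   # placeholder Graphite host
        port: 2003
    predicate:
      color: 'white'
      useQualifiedName: true
      patterns:
        - '^org.apache.cassandra.metrics.ClientRequest.+'
        - '^org.apache.cassandra.metrics.Storage.+'
EOF

# Point the JVM at the reporter config (append to cassandra-env.sh, then restart):
echo 'JVM_OPTS="$JVM_OPTS -Dcassandra.metricsReporterConfigFile=metrics-reporter-graphite.yaml"' \
  >> /etc/cassandra/cassandra-env.sh
```

The blog posts linked above walk through the same approach in more detail, including dashboarding the resulting metrics in Grafana.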
Re: Cassandra | Cross Data Centre Replication Status
Akshay, avoid running repair in that case: it will take far longer than a rebuild, and it will stream data back to your original DC, even between nodes in that original DC, which is not what you're after and could lead to all sorts of trouble.

Run "nodetool rebuild <source_dc>" as recommended by Jon and Surbhi. All the data in the original DC will be streamed out to the new one, including the data that was written since you altered your keyspace replication settings (so the 2 weeks of data). It will then use some extra disk space until compaction catches up.

Cheers,

On Wed, Oct 31, 2018 at 2:45 PM Kiran mk wrote:
> Run the repair with the -pr option on each node, which will repair only
> the partition range:
>
> nodetool repair -pr

--
Alexander Dejanovski
France
@alexanderdeja
Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org
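The rebuild advice above boils down to one command per node. A minimal sketch, assuming the existing source data centre is named AWS_Sgp as in this thread (substitute the name `nodetool status` reports for your cluster):

```shell
# Run on EACH node of the NEW data centre.
# AWS_Sgp is the existing (source) DC name -- a placeholder for your own.

# Stream all existing data for this node's token ranges from the source DC:
nodetool rebuild -- AWS_Sgp

# From another terminal, watch streaming progress (hide completed files):
nodetool netstats | grep -v "100%"
```

Rebuild only streams data the node does not yet have, which is why it is cheap and safe for a freshly added, empty DC, and why repair (which compares and reconciles replicas in both directions) is the wrong tool here.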
Re: Cassandra | Cross Data Centre Replication Status
Repair will take way more time than rebuild.

On Wed, Oct 31, 2018 at 6:45 AM Kiran mk wrote:
> Run the repair with the -pr option on each node, which will repair only
> the partition range:
>
> nodetool repair -pr
Re: Cassandra | Cross Data Centre Replication Status
Run the repair with the -pr option on each node, which will repair only the partition range:

nodetool repair -pr

On Wed, Oct 31, 2018 at 7:04 PM Surbhi Gupta wrote:
>
> Nodetool repair will take way more time than nodetool rebuild.
> How much data do you have in your original data center?
> Repair should be run to make the data consistent in cases where a node
> was down for longer than the hinted handoff period, or where mutations
> were dropped. But as a rule of thumb, we generally run repair using
> OpsCenter (if using DataStax).
>
> So in your case, run "nodetool rebuild <source_dc>" on all the nodes in
> the new data center. To make the rebuild process fast, increase three
> parameters: compaction throughput, stream throughput and inter-DC stream
> throughput.
>
> Thanks
> Surbhi

--
Best Regards,
Kiran.M.K.
Re: Cassandra | Cross Data Centre Replication Status
Nodetool repair will take way more time than nodetool rebuild. How much data do you have in your original data center?

Repair should be run to make the data consistent in cases where a node was down for longer than the hinted handoff period, or where mutations were dropped. But as a rule of thumb, we generally run repair using OpsCenter (if using DataStax).

So in your case, run "nodetool rebuild <source_dc>" on all the nodes in the new data center. To make the rebuild process fast, increase three parameters: compaction throughput, stream throughput and inter-DC stream throughput.

Thanks
Surbhi

On Tue, Oct 30, 2018 at 11:29 PM Akshay Bhardwaj <akshay.bhardwaj1...@gmail.com> wrote:
> Hi Jonathan,
>
> That makes sense. Thank you for the explanation.
>
> Another quick question: as the cluster is still operative and the data
> for the past 2 weeks (since updating the replication factor) is present
> in both data centres, should I run "nodetool rebuild" or "nodetool repair"?
>
> I read that nodetool rebuild is faster and is useful only while the new
> data centre is empty and no partition keys are present. So when is the
> right time to use either of the commands, and what impact can each have
> on data centre operations?
>
> Thanks and Regards
> Akshay Bhardwaj
> +91-97111-33849
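The three throughput parameters named above can be raised at runtime via nodetool, with no restart. A hedged sketch; the numbers are purely illustrative, not recommendations, and should be sized to your hardware and cross-DC link:

```shell
# Applied live on each node; values revert to cassandra.yaml settings on restart.
# Numbers are illustrative only.

nodetool setcompactionthroughput 64       # MB/s (yaml: compaction_throughput_mb_per_sec)
nodetool setstreamthroughput 400          # megabits/s (yaml: stream_throughput_outbound_megabits_per_sec)
nodetool setinterdcstreamthroughput 400   # megabits/s, cap for cross-DC streaming

# Verify the new values:
nodetool getcompactionthroughput
nodetool getstreamthroughput
nodetool getinterdcstreamthroughput
```

Note the unit mismatch: compaction throughput is set in MB/s while the two streaming throttles are in megabits/s, which is easy to trip over when tuning all three together.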
Cassandra DCOS | monitoring connection and user activity
Hello All,

Does anyone have experience with monitoring Cassandra in DCOS?

If we increase the load on Cassandra in DCOS, the application gets timeouts and loses its connections, and I do not have any information about what happened.

Is there a way to get information about the number of current connections and which queries are being executed? Cassandra in DCOS has the JMX interface disabled, and I think nodetool does not provide such information.

Regards
Maik

DXC Technology Company -- This message is transmitted to you by or on behalf of DXC Technology Company or one of its affiliates. It is intended exclusively for the addressee. The substance of this message, along with any attachments, may contain proprietary, confidential or privileged information or information that is otherwise legally exempt from disclosure. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient of this message, you are not authorized to read, print, retain, copy or disseminate any part of this message. If you have received this message in error, please destroy and delete all copies and notify the sender by return e-mail. Regardless of content, this e-mail shall not operate to bind DXC Technology Company or any of its affiliates to any order or other contract unless pursuant to explicit written agreement or government initiative expressly permitting the use of e-mail for such purpose.
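Not a full answer to the question, but if nodetool is still usable locally on each node (it talks to the node's local JMX port even when remote JMX is closed), two built-in mechanisms give partial visibility into connections and queries. The sampling probability below is an assumption; keep it very low in production, since tracing every request is expensive:

```shell
# Thread-pool stats: pending/blocked Native-Transport-Requests during the
# timeout window point at client connection pressure.
nodetool tpstats

# Count established native-protocol (port 9042) client connections on this node:
ss -tn state established '( sport = :9042 )' | wc -l

# Sample a small fraction of all queries into the system_traces keyspace:
nodetool settraceprobability 0.001

# Later, inspect which queries ran and from which client addresses:
cqlsh -e "SELECT started_at, client, request FROM system_traces.sessions LIMIT 20;"
```

If local JMX is also unavailable, the metrics-reporting route discussed in the reply above (Graphite/Prometheus exporters) is the remaining option, since it pushes metrics out from inside the JVM.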
Re: Cassandra | Cross Data Centre Replication Status
Hi Jonathan,

That makes sense. Thank you for the explanation.

Another quick question: as the cluster is still operative and the data for the past 2 weeks (since updating the replication factor) is present in both data centres, should I run "nodetool rebuild" or "nodetool repair"?

I read that nodetool rebuild is faster and is useful only while the new data centre is empty and no partition keys are present. So when is the right time to use either of the commands, and what impact can each have on data centre operations?

Thanks and Regards
Akshay Bhardwaj
+91-97111-33849

On Wed, Oct 31, 2018 at 2:34 AM Jonathan Haddad wrote:
> You need to run "nodetool rebuild -- <existing_dc>" on each node in the
> new DC to get the old data to replicate. It doesn't do it automatically
> because Cassandra has no way of knowing if you're done adding nodes, and
> if it were to migrate automatically, it could cause a lot of problems.
> Imagine streaming 100 nodes' data to 3 nodes in the new DC -- not fun.
>
> On Tue, Oct 30, 2018 at 1:59 PM Akshay Bhardwaj <akshay.bhardwaj1...@gmail.com> wrote:
>
>> Hi Experts,
>>
>> I previously had 1 Cassandra data centre in the AWS Singapore region
>> with 5 nodes, with my keyspace's replication factor as 3 using the
>> network topology strategy.
>>
>> After this cluster had been running smoothly for 4 months (500 GB of
>> data on each node's disk), I added a 2nd data centre in the AWS Mumbai
>> region, again with 5 nodes.
>>
>> After updating my keyspace's replication factor to
>> {"AWS_Sgp":3,"AWS_Mum":3}, my expectation was that the data present in
>> the Sgp region would immediately start replicating to the Mum region's
>> nodes. However, even after 2 weeks I do not see the historical data
>> replicated, though new data being written in the Sgp region is present
>> in the Mum region as well.
>>
>> Any help or suggestions to debug this issue will be highly appreciated.
>>
>> Regards
>> Akshay Bhardwaj
>> +91-97111-33849
>
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade
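For reference, the replication change described in this thread would look like the following in cqlsh. The keyspace name is a placeholder; the DC names must exactly match what `nodetool status` reports. As Jon's reply explains, this statement only changes where new writes land -- existing data still needs an explicit rebuild on each node of the new DC:

```shell
# DC names are case-sensitive and must match `nodetool status` output.
cqlsh -e "ALTER KEYSPACE my_keyspace
  WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'AWS_Sgp': 3,
    'AWS_Mum': 3
  };"
```

This is why new writes appeared in Mumbai immediately (the coordinator now sends each write to replicas in both DCs) while the 4 months of historical data stayed put until `nodetool rebuild` was run.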
Re: [ANNOUNCE] Stratio's Lucene plugin fork
Pretty cool!

Dinesh

> On Oct 30, 2018, at 6:31 PM, Jonathan Haddad wrote:
>
> Very cool Ben, thanks for sharing!
>
>> On Tue, Oct 30, 2018 at 6:14 PM Ben Slater wrote:
>>
>> For anyone who is interested, we've published a blog post with some more
>> background on this and some more detail of our ongoing plans:
>> https://www.instaclustr.com/instaclustr-support-cassandra-lucene-index/
>>
>> Cheers
>> Ben
>>
>>> On Fri, 19 Oct 2018 at 09:42 kurt greaves wrote:
>>>
>>> Hi all,
>>>
>>> We've had confirmation from Stratio that they are no longer maintaining
>>> their Lucene plugin for Apache Cassandra. We've thus decided to fork the
>>> plugin to continue maintaining it. At this stage we won't be making any
>>> additions to the plugin in the short term unless absolutely necessary,
>>> and as 4.0 nears we'll begin making it compatible with the new major
>>> release. We plan on taking the existing PRs and issues from the Stratio
>>> repository and getting them merged/resolved; however, this likely won't
>>> happen until early next year. Having said that, we welcome all
>>> contributions and will dedicate time to reviewing bugs in the current
>>> versions if people lodge them and can help.
>>>
>>> I'll note that this is new ground for us; we don't have much existing
>>> knowledge of the plugin but are determined to learn. If anyone out there
>>> has established knowledge about the plugin, we'd be grateful for any
>>> assistance!
>>>
>>> You can find our fork here:
>>> https://github.com/instaclustr/cassandra-lucene-index
>>> At the moment, the only difference is that there is a 3.11.3 branch
>>> which just has some minor changes to dependencies to better support
>>> 3.11.3.
>>>
>>> Cheers,
>>> Kurt
>>
>> --
>> Ben Slater
>> Chief Product Officer
>> This email has been sent on behalf of Instaclustr Pty. Limited
>> (Australia) and Instaclustr Inc (USA).
>
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade