Re: Cassandra DCOS | monitoring connection and user activity

2018-10-31 Thread Anup Shirolkar
Hi,

It looks like you need monitoring for Cassandra, but without using JMX.
It is possible to use metric reporting libraries in Cassandra:
https://wiki.apache.org/cassandra/Metrics#Reporting

I do not have specific experience with using Cassandra on DCOS, but
monitoring with these libraries and tools should not be any different.
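
For the immediate question about the number of connections, a quick
OS-level check can help until metric reporting is in place. A sketch,
assuming shell access to the nodes/containers and the default native
transport port 9042:

    # Count established client connections to the native transport port.
    ss -tn state established '( sport = :9042 )' | tail -n +2 | wc -l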

There are various options available to establish good monitoring (Graphite,
Prometheus, Grafana).

Helpful links:

https://blog.pythian.com/monitoring-apache-cassandra-metrics-graphite-grafana/
https://github.com/instaclustr/cassandra-exporter
https://prometheus.io/docs/instrumenting/exporters/
https://grafana.com/dashboards/5408
http://thelastpickle.com/blog/2017/04/05/cassandra-graphite-measurements-filtered.html
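
As a rough sketch of the reporter approach (the file name, prefix, host and
metric patterns below are assumptions; the layout follows the
metrics-reporter-config format covered in the Pythian post above):

    # metrics-reporter-graphite.yaml -- host and prefix are placeholders
    graphite:
      -
        period: 60
        timeunit: 'SECONDS'
        prefix: 'cassandra-node1'          # identifies this node in Graphite
        hosts:
          - host: 'graphite.example.com'   # assumed Carbon endpoint
            port: 2003
        predicate:
          color: 'white'                   # report only whitelisted metrics
          useQualifiedName: true
          patterns:
            - '^org.apache.cassandra.metrics.Client.+'        # connected clients
            - '^org.apache.cassandra.metrics.ClientRequest.+' # request latencies

Cassandra would then pick the file up via a JVM flag, e.g. adding
-Dcassandra.metricsReporterConfigFile=metrics-reporter-graphite.yaml
to JVM_OPTS in cassandra-env.sh, with the reporter jars on the classpath.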

Regards,

Anup Shirolkar




On Wed, 31 Oct 2018 at 18:41, Caesar, Maik  wrote:

> Hello All,
>
> Does anyone have experience with monitoring Cassandra in DCOS?
>
> If we increase the load on Cassandra in DCOS, the application gets
> timeouts and loses the connection, and I do not have any information about
> what happened.
>
> Is there a way to get information about the number of current connections
> and which queries are executed? Cassandra in DCOS has the JMX
> interface disabled, and I think nodetool does not provide such information.
>
>
>
> Regards
>
> Maik
>


Re: Cassandra | Cross Data Centre Replication Status

2018-10-31 Thread Alexander Dejanovski
Akshay,

Avoid running repair in that case: it'll take way longer than rebuild, and
it will stream data back to your original DC, even between nodes in that
original DC, which is not what you're after and could lead to all
sorts of trouble.

Run "nodetool rebuild " as recommended by Jon and Surbhi. All
the data in the original DC will be streamed out to the new one, including
the data that was already written since you altered your keyspace
replication settings (so 2 weeks of data). It will then use some extra disk
space until compaction catches up.
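
A hedged sketch of that sequence (the source DC name comes from the
keyspace settings quoted below; run it on each node in the new DC, a few
at a time):

    # On every node in the new (Mum) DC; "AWS_Sgp" is the source DC name
    # used in this thread -- substitute your own DC name.
    nodetool rebuild -- AWS_Sgp

    # Watch streaming progress while it runs:
    nodetool netstats | grep -v "Not sending"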

Cheers,


On Wed, Oct 31, 2018 at 2:45 PM Kiran mk  wrote:

> Run the repair with the -pr option on each node, which will repair only the
> partition range.
>
> nodetool repair -pr
> On Wed, Oct 31, 2018 at 7:04 PM Surbhi Gupta 
> wrote:
> >
> > Nodetool repair will take way more time than nodetool rebuild.
> > How much data do you have in your original data center?
> > Repair should be run to make the data consistent in case a node was down
> longer than the hinted handoff period, or mutations were dropped.
> > But as a rule of thumb, we generally run repair using OpsCenter (if using
> DataStax) most of the time.
> >
> > So in your case, run “nodetool rebuild ” on all the
> nodes in the new data center.
> > To make the rebuild process fast, increase three parameters:
> compaction throughput, stream throughput and inter-DC stream throughput.
> >
> > Thanks
> > Surbhi
> > On Tue, Oct 30, 2018 at 11:29 PM Akshay Bhardwaj <
> akshay.bhardwaj1...@gmail.com> wrote:
> >>
> >> Hi Jonathan,
> >>
> >> That makes sense. Thank you for the explanation.
> >>
> >> Another quick question: as the cluster is still operative and the data
> for the past 2 weeks (since updating the replication factor) is present in both
> the data centres, should I run "nodetool rebuild" or "nodetool repair"?
> >>
> >> I read that nodetool rebuild is faster and is only useful while the new data
> centre is empty and no partition keys are present. So when is the right time
> to use either of the commands, and what impact can each have on the data
> centre operations?
> >>
> >> Thanks and Regards
> >>
> >> Akshay Bhardwaj
> >> +91-97111-33849 <+91%2097111%2033849>
> >>
> >>
> >> On Wed, Oct 31, 2018 at 2:34 AM Jonathan Haddad 
> wrote:
> >>>
> >>> You need to run "nodetool rebuild -- " on each node
> in the new DC to get the old data to replicate.  It doesn't do it
> automatically because Cassandra has no way of knowing if you're done adding
> nodes, and if it were to migrate automatically, it could cause a lot of
> problems. Imagine streaming 100 nodes' data to 3 nodes in the new DC, not
> fun.
> >>>
> >>> On Tue, Oct 30, 2018 at 1:59 PM Akshay Bhardwaj <
> akshay.bhardwaj1...@gmail.com> wrote:
> 
>  Hi Experts,
> 
>  I previously had 1 Cassandra data centre in the AWS Singapore region
> with 5 nodes, with my keyspace's replication factor as 3 in Network topology.
> 
>  After this cluster had been running smoothly for 4 months (500 GB of
> data on each node's disk), I added a 2nd data centre in the AWS Mumbai region
> with yet again 5 nodes in Network topology.
> 
>  After updating my keyspace's replication factor to
> {"AWS_Sgp":3,"AWS_Mum":3}, my expectation was that the data present in the Sgp
> region would immediately start replicating to the Mum region's nodes.
> However, even after 2 weeks I do not see historical data being replicated,
> but new data written in the Sgp region is present in the Mum region as well.
> 
>  Any help or suggestions to debug this issue will be highly
> appreciated.
> 
>  Regards
>  Akshay Bhardwaj
>  +91-97111-33849 <+91%2097111%2033849>
> 
> 
> >>>
> >>>
> >>> --
> >>> Jon Haddad
> >>> http://www.rustyrazorblade.com
> >>> twitter: rustyrazorblade
> >>>
> >>>
> >>
> >>
>
>
> --
> Best Regards,
> Kiran.M.K.
>
> --
--
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: Cassandra | Cross Data Centre Replication Status

2018-10-31 Thread Surbhi Gupta
Repair will take way more time than rebuild.

On Wed, Oct 31, 2018 at 6:45 AM Kiran mk  wrote:

> Run the repair with the -pr option on each node, which will repair only the
> partition range.
>
> nodetool repair -pr
> On Wed, Oct 31, 2018 at 7:04 PM Surbhi Gupta 
> wrote:
> >
> > Nodetool repair will take way more time than nodetool rebuild.
> > How much data do you have in your original data center?
> > Repair should be run to make the data consistent in case a node was down
> longer than the hinted handoff period, or mutations were dropped.
> > But as a rule of thumb, we generally run repair using OpsCenter (if using
> DataStax) most of the time.
> >
> > So in your case, run “nodetool rebuild ” on all the
> nodes in the new data center.
> > To make the rebuild process fast, increase three parameters:
> compaction throughput, stream throughput and inter-DC stream throughput.
> >
> > Thanks
> > Surbhi
> > On Tue, Oct 30, 2018 at 11:29 PM Akshay Bhardwaj <
> akshay.bhardwaj1...@gmail.com> wrote:
> >>
> >> Hi Jonathan,
> >>
> >> That makes sense. Thank you for the explanation.
> >>
> >> Another quick question: as the cluster is still operative and the data
> for the past 2 weeks (since updating the replication factor) is present in both
> the data centres, should I run "nodetool rebuild" or "nodetool repair"?
> >>
> >> I read that nodetool rebuild is faster and is only useful while the new data
> centre is empty and no partition keys are present. So when is the right time
> to use either of the commands, and what impact can each have on the data
> centre operations?
> >>
> >> Thanks and Regards
> >>
> >> Akshay Bhardwaj
> >> +91-97111-33849
> >>
> >>
> >> On Wed, Oct 31, 2018 at 2:34 AM Jonathan Haddad 
> wrote:
> >>>
> >>> You need to run "nodetool rebuild -- " on each node
> in the new DC to get the old data to replicate.  It doesn't do it
> automatically because Cassandra has no way of knowing if you're done adding
> nodes, and if it were to migrate automatically, it could cause a lot of
> problems. Imagine streaming 100 nodes' data to 3 nodes in the new DC, not
> fun.
> >>>
> >>> On Tue, Oct 30, 2018 at 1:59 PM Akshay Bhardwaj <
> akshay.bhardwaj1...@gmail.com> wrote:
> 
>  Hi Experts,
> 
>  I previously had 1 Cassandra data centre in the AWS Singapore region
> with 5 nodes, with my keyspace's replication factor as 3 in Network topology.
> 
>  After this cluster had been running smoothly for 4 months (500 GB of
> data on each node's disk), I added a 2nd data centre in the AWS Mumbai region
> with yet again 5 nodes in Network topology.
> 
>  After updating my keyspace's replication factor to
> {"AWS_Sgp":3,"AWS_Mum":3}, my expectation was that the data present in the Sgp
> region would immediately start replicating to the Mum region's nodes.
> However, even after 2 weeks I do not see historical data being replicated,
> but new data written in the Sgp region is present in the Mum region as well.
> 
>  Any help or suggestions to debug this issue will be highly
> appreciated.
> 
>  Regards
>  Akshay Bhardwaj
>  +91-97111-33849
> 
> 
> >>>
> >>>
> >>> --
> >>> Jon Haddad
> >>> http://www.rustyrazorblade.com
> >>> twitter: rustyrazorblade
> >>>
> >>>
> >>
> >>
>
>
> --
> Best Regards,
> Kiran.M.K.
>


Re: Cassandra | Cross Data Centre Replication Status

2018-10-31 Thread Kiran mk
Run the repair with the -pr option on each node, which will repair only the
partition range.

nodetool repair -pr
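
For reference, a hedged sketch of the keyspace-scoped form (the keyspace
name is a placeholder; run it on every node in turn so each token range
is repaired exactly once):

    # Primary-range repair of one keyspace; repeat on each node serially.
    nodetool repair -pr my_keyspace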
On Wed, Oct 31, 2018 at 7:04 PM Surbhi Gupta  wrote:
>
> Nodetool repair will take way more time than nodetool rebuild.
> How much data do you have in your original data center?
> Repair should be run to make the data consistent in case a node was down
> longer than the hinted handoff period, or mutations were dropped.
> But as a rule of thumb, we generally run repair using OpsCenter (if using
> DataStax) most of the time.
>
> So in your case, run “nodetool rebuild ” on all the nodes
> in the new data center.
> To make the rebuild process fast, increase three parameters: compaction
> throughput, stream throughput and inter-DC stream throughput.
>
> Thanks
> Surbhi
> On Tue, Oct 30, 2018 at 11:29 PM Akshay Bhardwaj 
>  wrote:
>>
>> Hi Jonathan,
>>
>> That makes sense. Thank you for the explanation.
>>
>> Another quick question: as the cluster is still operative and the data for
>> the past 2 weeks (since updating the replication factor) is present in both the
>> data centres, should I run "nodetool rebuild" or "nodetool repair"?
>>
>> I read that nodetool rebuild is faster and is only useful while the new data
>> centre is empty and no partition keys are present. So when is the right time
>> to use either of the commands, and what impact can each have on the data centre
>> operations?
>>
>> Thanks and Regards
>>
>> Akshay Bhardwaj
>> +91-97111-33849
>>
>>
>> On Wed, Oct 31, 2018 at 2:34 AM Jonathan Haddad  wrote:
>>>
>>> You need to run "nodetool rebuild -- " on each node in 
>>> the new DC to get the old data to replicate.  It doesn't do it 
>>> automatically because Cassandra has no way of knowing if you're done adding 
>>> nodes, and if it were to migrate automatically, it could cause a lot of
>>> problems. Imagine streaming 100 nodes' data to 3 nodes in the new DC, not
>>> fun.
>>>
>>> On Tue, Oct 30, 2018 at 1:59 PM Akshay Bhardwaj 
>>>  wrote:

 Hi Experts,

 I previously had 1 Cassandra data centre in the AWS Singapore region with 5
 nodes, with my keyspace's replication factor as 3 in Network topology.

 After this cluster had been running smoothly for 4 months (500 GB of data
 on each node's disk), I added a 2nd data centre in the AWS Mumbai region with
 yet again 5 nodes in Network topology.

 After updating my keyspace's replication factor to
 {"AWS_Sgp":3,"AWS_Mum":3}, my expectation was that the data present in the Sgp
 region would immediately start replicating to the Mum region's nodes.
 However, even after 2 weeks I do not see historical data being replicated,
 but new data written in the Sgp region is present in the Mum region as well.

 Any help or suggestions to debug this issue will be highly appreciated.

 Regards
 Akshay Bhardwaj
 +91-97111-33849


>>>
>>>
>>> --
>>> Jon Haddad
>>> http://www.rustyrazorblade.com
>>> twitter: rustyrazorblade
>>>
>>>
>>
>>


-- 
Best Regards,
Kiran.M.K.




Re: Cassandra | Cross Data Centre Replication Status

2018-10-31 Thread Surbhi Gupta
Nodetool repair will take way more time than nodetool rebuild.
How much data do you have in your original data center?
Repair should be run to make the data consistent in case a node was down
longer than the hinted handoff period, or mutations were dropped.
But as a rule of thumb, we generally run repair using OpsCenter (if using
DataStax) most of the time.

So in your case, run “nodetool rebuild ” on all the
nodes in the new data center.
To make the rebuild process fast, increase three parameters:
compaction throughput, stream throughput and inter-DC stream throughput.
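
For illustration only (the values are assumptions to tune for your
hardware and network; these nodetool setters take effect immediately,
no restart needed):

    # Illustrative values -- tune for your disks and network links.
    nodetool setcompactionthroughput 256      # MB/s; 0 disables the throttle
    nodetool setstreamthroughput 400          # megabits/s between nodes
    nodetool setinterdcstreamthroughput 400   # megabits/s between data centers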

Thanks
Surbhi
On Tue, Oct 30, 2018 at 11:29 PM Akshay Bhardwaj <
akshay.bhardwaj1...@gmail.com> wrote:

> Hi Jonathan,
>
> That makes sense. Thank you for the explanation.
>
> Another quick question: as the cluster is still operative and the data for
> the past 2 weeks (since updating the replication factor) is present in both the
> data centres, should I run "nodetool rebuild" or "nodetool repair"?
>
> I read that nodetool rebuild is faster and is only useful while the new data
> centre is empty and no partition keys are present. So when is the right time
> to use either of the commands, and what impact can each have on the data
> centre operations?
>
> Thanks and Regards
>
> Akshay Bhardwaj
> +91-97111-33849
>
>
> On Wed, Oct 31, 2018 at 2:34 AM Jonathan Haddad  wrote:
>
>> You need to run "nodetool rebuild -- " on each node in
>> the new DC to get the old data to replicate.  It doesn't do it
>> automatically because Cassandra has no way of knowing if you're done adding
>> nodes, and if it were to migrate automatically, it could cause a lot of
>> problems. Imagine streaming 100 nodes' data to 3 nodes in the new DC, not
>> fun.
>>
>> On Tue, Oct 30, 2018 at 1:59 PM Akshay Bhardwaj <
>> akshay.bhardwaj1...@gmail.com> wrote:
>>
>>> Hi Experts,
>>>
>>> I previously had 1 Cassandra data centre in the AWS Singapore region with 5
>>> nodes, with my keyspace's replication factor as 3 in Network topology.
>>>
>>> After this cluster had been running smoothly for 4 months (500 GB of
>>> data on each node's disk), I added a 2nd data centre in the AWS Mumbai region
>>> with yet again 5 nodes in Network topology.
>>>
>>> After updating my keyspace's replication factor to
>>> {"AWS_Sgp":3,"AWS_Mum":3}, my expectation was that the data present in the Sgp
>>> region would immediately start replicating to the Mum region's nodes.
>>> However, even after 2 weeks I do not see historical data being replicated,
>>> but new data written in the Sgp region is present in the Mum region as well.
>>>
>>> Any help or suggestions to debug this issue will be highly appreciated.
>>>
>>> Regards
>>> Akshay Bhardwaj
>>> +91-97111-33849
>>>
>>>
>>>
>>
>> --
>> Jon Haddad
>> http://www.rustyrazorblade.com
>> twitter: rustyrazorblade
>>
>>
>>
>
>


Cassandra DCOS | monitoring connection and user activity

2018-10-31 Thread Caesar, Maik
Hello All,
Does anyone have experience with monitoring Cassandra in DCOS?
If we increase the load on Cassandra in DCOS, the application gets timeouts
and loses the connection, and I do not have any information about what happened.
Is there a way to get information about the number of current connections and
which queries are executed? Cassandra in DCOS has the JMX interface disabled,
and I think nodetool does not provide such information.

Regards
Maik




Re: Cassandra | Cross Data Centre Replication Status

2018-10-31 Thread Akshay Bhardwaj
Hi Jonathan,

That makes sense. Thank you for the explanation.

Another quick question: as the cluster is still operative and the data for
the past 2 weeks (since updating the replication factor) is present in both the
data centres, should I run "nodetool rebuild" or "nodetool repair"?

I read that nodetool rebuild is faster and is only useful while the new data
centre is empty and no partition keys are present. So when is the right time
to use either of the commands, and what impact can each have on the data
centre operations?

Thanks and Regards
Akshay Bhardwaj
+91-97111-33849


On Wed, Oct 31, 2018 at 2:34 AM Jonathan Haddad  wrote:

> You need to run "nodetool rebuild -- " on each node in
> the new DC to get the old data to replicate.  It doesn't do it
> automatically because Cassandra has no way of knowing if you're done adding
> nodes, and if it were to migrate automatically, it could cause a lot of
> problems. Imagine streaming 100 nodes' data to 3 nodes in the new DC, not
> fun.
>
> On Tue, Oct 30, 2018 at 1:59 PM Akshay Bhardwaj <
> akshay.bhardwaj1...@gmail.com> wrote:
>
>> Hi Experts,
>>
>> I previously had 1 Cassandra data centre in the AWS Singapore region with 5
>> nodes, with my keyspace's replication factor as 3 in Network topology.
>>
>> After this cluster had been running smoothly for 4 months (500 GB of data
>> on each node's disk), I added a 2nd data centre in the AWS Mumbai region with
>> yet again 5 nodes in Network topology.
>>
>> After updating my keyspace's replication factor to
>> {"AWS_Sgp":3,"AWS_Mum":3}, my expectation was that the data present in the Sgp
>> region would immediately start replicating to the Mum region's nodes.
>> However, even after 2 weeks I do not see historical data being replicated,
>> but new data written in the Sgp region is present in the Mum region as well.
>>
>> Any help or suggestions to debug this issue will be highly appreciated.
>>
>> Regards
>> Akshay Bhardwaj
>> +91-97111-33849
>>
>
>
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade
>


Re: [ANNOUNCE] StratIO's Lucene plugin fork

2018-10-31 Thread Dinesh Joshi
Pretty cool!

Dinesh

> On Oct 30, 2018, at 6:31 PM, Jonathan Haddad  wrote:
> 
> Very cool Ben, thanks for sharing!  
> 
>> On Tue, Oct 30, 2018 at 6:14 PM Ben Slater  
>> wrote:
>> For anyone who is interested, we’ve published a blog with some more 
>> background on this and some more detail of our ongoing plans:
>> https://www.instaclustr.com/instaclustr-support-cassandra-lucene-index/
>> 
>> Cheers
>> Ben
>> 
>>> On Fri, 19 Oct 2018 at 09:42 kurt greaves  wrote:
>>> Hi all,
>>> 
>>> We've had confirmation from Stratio that they are no longer maintaining 
>>> their Lucene plugin for Apache Cassandra. We've thus decided to fork the 
>>> plugin to continue maintaining it. At this stage we won't be making any 
>>> additions to the plugin in the short term unless absolutely necessary, and 
>>> as 4.0 nears we'll begin making it compatible with the new major release. 
>>> We plan on taking the existing PR's and issues from the Stratio repository 
>>> and getting them merged/resolved, however this likely won't happen until 
>>> early next year. Having said that, we welcome all contributions and will 
>>> dedicate time to reviewing bugs in the current versions if people lodge 
>>> them and can help.
>>> 
>>> I'll note that this is new ground for us, we don't have much existing 
>>> knowledge of the plugin but are determined to learn. If anyone out there 
>>> has established knowledge about the plugin we'd be grateful for any 
>>> assistance!
>>> 
>>> You can find our fork here: 
>>> https://github.com/instaclustr/cassandra-lucene-index
>>> At the moment, the only difference is that there is a 3.11.3 branch which 
>>> just has some minor changes to dependencies to better support 3.11.3.
>>> 
>>> Cheers,
>>> Kurt
>> -- 
>> Ben Slater
>> Chief Product Officer
>> 
>> 
> 
> 
> -- 
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade