Re: Network throughput requirements

2018-07-10 Thread Elliott Sims
Among the hosts in a cluster?  It depends on how much data you're trying to
read and write.  In general, you're going to want a lot more bandwidth
among hosts in the cluster than you have external-facing.  Otherwise things
like repairs and bootstrapping new nodes can get slow/difficult.  To put it
in perspective, by default it's configured to use up to 200Mbps output
streaming traffic per source node (which might mean a multiple of that
incoming to one node in some cases).
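That default corresponds to a setting in cassandra.yaml; a minimal sketch of the relevant knobs (the values shown are the defaults, not recommendations):

```yaml
# cassandra.yaml -- caps on outbound streaming traffic (repair, bootstrap,
# decommission) per source node, in megabits per second.
stream_throughput_outbound_megabits_per_sec: 200

# Separate cap applied to streaming that crosses datacenter boundaries.
inter_dc_stream_throughput_outbound_megabits_per_sec: 200
```

Both can also be adjusted at runtime with `nodetool setstreamthroughput` and `nodetool setinterdcstreamthroughput`, which is handy if you want to temporarily speed up a bootstrap.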

What specifically are you trying to size?  If it's NICs on the hosts, 1Gbps
will be OK for low load but a bit of a bottleneck for higher-traffic
clusters.  10Gbps will probably be more than Cassandra can saturate even
with some tuning.  Or are you trying to size an overall LAN?  Same general
idea, but be aware that the traffic sort of comes in "waves" with repairs
and bootstrapping.  Or are you planning on having geographically spread
nodes within a cluster and want to know how big of a WAN link you need?
Putting those in separate logical "datacenters" with multiple replicas per
DC will give you more options in terms of limiting inter-DC traffic.
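A sketch of the multi-DC keyspace definition that enables this (keyspace and datacenter names are invented for illustration; the DC names must match what your snitch reports):

```
CREATE KEYSPACE app_data
  WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'us_east': 3,   -- replicas kept in the local DC
    'eu_west': 3    -- replicas kept in the remote DC
  };
```

With this layout, a client writing at LOCAL_QUORUM only waits on replicas in its own DC, and Cassandra forwards a single copy of each mutation across the WAN to each remote DC, where it is fanned out locally -- which is what keeps the inter-DC traffic bounded.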

On Tue, Jul 10, 2018 at 11:14 AM, Justin Sanciangco <
jsancian...@blizzard.com.invalid> wrote:

> Hello,
>
>
>
> What is the general network throughput (mb/s) requirement for Cassandra?
>
>
>
> Thanks in advance for your advice,
>
>
>
> Justin
>


Network throughput requirements

2018-07-10 Thread Justin Sanciangco
Hello,

What is the general network throughput (mb/s) requirement for Cassandra?

Thanks in advance for your advice,

Justin


Re: Write Time of a Row in Multi DC Cassandra Cluster

2018-07-10 Thread Saladi Naidu
Simon,
Tracing would be a significant burden on the cluster, and it would have to be on
all the time. I am trying to find a way to know when a row was written, on an
on-demand basis. Is there a way to determine that?
Naidu Saladi 
 

On Tuesday, July 10, 2018 2:24 AM, Simon Fontana Oscarsson 
 wrote:
 

 Have you tried trace?
-- 
SIMON FONTANA OSCARSSON
Software Developer

Ericsson
Ölandsgatan 1
37133 Karlskrona, Sweden
simon.fontana.oscars...@ericsson.com
www.ericsson.com

On mån, 2018-07-09 at 19:30 +, Saladi Naidu wrote:
> Cassandra is an eventual consistent DB, how to find when a row is actually 
> written in multi DC environment? Here is the problem I am trying to solve 
> 
> - I have multi DC (3 DC's) Cassandra cluster/ring - One of the application 
> wrote a row to DC1(using Local Quorum)  and within span of 50 ms, it tried to 
> read same row from DC2 and could not find the
> row. Our both DC's have sub milli second latency at network level, usually <2 
> ms. We promised 20 ms consistency. In this case Application could not find 
> the row in DC2 in 50 ms
> 
> I tried to use "select WRITETIME(authorizations_json) from 
> token_authorizations where " to find  when the Row is written in each DC, 
> but both DC's returned same Timestamp. After further research
> I found that Client V3 onwards Timestamp is supplied at Client level so 
> WRITETIME does not help 
> "https://docs.datastax.com/en/developer/java-driver/3.4/manual/query_timestamps/"
> 
> So how to determine when the row is actually written in each DC?
> 
>  
> Naidu Saladi 

   

Re: Installation

2018-07-10 Thread rajasekhar kommineni
Thanks Michael. While I agree with the advantage of symlinks, I am worried about 
future upgrades.

My concern here is how to unlink the Cassandra binaries like nodetool, cassandra, 
cqlsh, etc. after migrating to the tar.gz installation.

Thanks,


> On Jul 10, 2018, at 5:46 AM, Michael Shuler  wrote:
> 
> On 07/10/2018 02:48 AM, rajasekhar kommineni wrote:
>> Hi Rahul,
>> 
>> The problem for removing the old links is Cassandra binaries are pointed
>> from /usr//bin/, /usr//sbin etc ..
>> 
>> $ which nodetool 
>> /usr/bin/nodetool
>> $ which cqlsh
>> /usr/bin/cqlsh
>> $ which cassandra
>> /usr/sbin/cassandra
> 
> This is a basic linux usage thing, not really a cassandra problem, but
> it's why packages make things simple for general use - the default
> /usr/{s}bin locations are in $PATH. If you wish to have nodetool, etc.
> in your user's $PATH, just update the user's shell configuration to
> include the tar locations.
> 
> export CASSANDRA_HOME=
> export PATH="$CASSANDRA_HOME/bin:$CASSANDRA_HOME/tools/bin:$PATH"
> 
> This can be added to the bottom of ~/.bashrc for persistence. Bonus
> points for symlink of generic cassandra_home to versioned one, which is
> used for upgrades without messing with PATH env for user and within
> configs for Cassandra.
> 
> -- 
> Michael
> 
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
> 


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Write Time of a Row in Multi DC Cassandra Cluster

2018-07-10 Thread Saladi Naidu
Alain,
Thanks for the response, and I completely agree with your approach, but there is 
a small caveat: we have another DC in Europe. Right now this keyspace is not 
replicating there, but eventually it will be added. The EU DC has significant 
latency (200 ms RTT), so going with EACH_QUORUM would not be feasible. We can 
reset the SLAs for consistency, but my question remains: how do we determine 
when the row was written to the remote DC? Is there any way to determine that?
Naidu Saladi 
 

On Tuesday, July 10, 2018 8:56 AM, Alain RODRIGUEZ  
wrote:
 

 Hello,

 I have multi DC (3 DC's) Cassandra cluster/ring - One of the application wrote 
a row to DC1(using Local Quorum)  and within span of 50 ms, it tried to read 
same row from DC2 and could not find the row.

 [...]

So how to determine when the row is actually written in each DC? 

To me, the guarantee you are trying to achieve could be obtained by using 
'EACH_QUORUM' for writes (i.e. 'LOCAL_QUORUM' in each DC) and 'LOCAL_QUORUM' for 
reads, for example. You would then have strong consistency, as long as the same 
client application runs the write and then the read, or sends a trigger for the 
second call sequentially, after validating the write, in some way.

Our both DC's have sub milli second latency at network level, usually <2 ms. We 
promised 20 ms consistency. In this case Application could not find the row in 
DC2 in 50 ms


In these conditions, using 'EACH_QUORUM' might not be too much of a burden for 
the coordinator and the client. The writes are already being sent to every 
replica; this would only increase the latency at the coordinator level (and thus 
at the client level), but you would be sure that every DC has the row in a 
majority of its replicas before triggering the read.
C*heers,
---
Alain Rodriguez - @arodream - alain@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2018-07-10 8:24 GMT+01:00 Simon Fontana Oscarsson 
:

Have you tried trace?
-- 
SIMON FONTANA OSCARSSON
Software Developer

Ericsson
Ölandsgatan 1
37133 Karlskrona, Sweden
simon.fontana.oscarsson@ericsson.com
www.ericsson.com

On mån, 2018-07-09 at 19:30 +, Saladi Naidu wrote:
> Cassandra is an eventual consistent DB, how to find when a row is actually 
> written in multi DC environment? Here is the problem I am trying to solve 
> 
> - I have multi DC (3 DC's) Cassandra cluster/ring - One of the application 
> wrote a row to DC1(using Local Quorum)  and within span of 50 ms, it tried to 
> read same row from DC2 and could not find the
> row. Our both DC's have sub milli second latency at network level, usually <2 
> ms. We promised 20 ms consistency. In this case Application could not find 
> the row in DC2 in 50 ms
> 
> I tried to use "select WRITETIME(authorizations_json) from 
> token_authorizations where " to find  when the Row is written in each DC, 
> but both DC's returned same Timestamp. After further research
> I found that Client V3 onwards Timestamp is supplied at Client level so 
> WRITETIME does not help 
> "https://docs.datastax.com/en/developer/java-driver/3.4/manual/query_timestamps/"
> 
> So how to determine when the row is actually written in each DC?
> 
>  
> Naidu Saladi 



   

Re: Tuning Replication Factor - All, Consistency ONE

2018-07-10 Thread Jeff Jirsa
On Tue, Jul 10, 2018 at 8:29 AM, Code Wiget  wrote:

> Hi,
>
> I have been tasked with picking and setting up a database with the
> following characteristics:
>
>- Ultra-high availability - The real requirement is uptime - our whole
>platform becomes inaccessible without a “read” from the database. We need
>the read to authenticate users. Databases will never be spread across
>multiple networks.
>
>
Sooner or later life will happen and you're going to have some
unavailability - may be worth taking the time to make it fail gracefully
(cache auth responses, etc).

>
>- Reasonably quick access speeds
>- Very low data storage - The data storage is very low - for 10
>million users, we would have around 8GB of storage total.
>
> Having done a bit of research on Cassandra, I think the optimal approach
> for my use-case would be to replicate the data on *ALL* nodes possible,
> but require reads to only have a consistency level of one. So, in the case
> that a node goes down, we can still read/write to other nodes. It is not
> very important that a read be unanimously agreed upon, as long as Cassandra
> is eventually consistent, within around 1s, then there shouldn’t be an
> issue.
>

Seems like a reasonably good fit, but there's no 1s guarantee - it'll
USUALLY happen within milliseconds, but the edge cases don't have a strict
guarantee at all (imagine two hosts in adjacent racks, the link between the
two racks goes down, but both are otherwise functional - a query at ONE in
either rack would be able to read and write data, but it would diverge
between the two racks for some period of time).


>
> When I go to set up the database though, I am required to set a
> replication factor to a number - 1,2,3,etc. So I can’t just say “ALL” and
> have it replicate to all nodes.
>

That option doesn't exist. It's been proposed (and exists in Datastax
Enterprise, which is a proprietary fork), but reportedly causes quite a bit
of pain when misused, so people have successfully lobbied against its
inclusion in OSS Apache Cassandra. You could (assuming some basic java
knowledge) extend NetworkTopologyStrategy to have it accomplish this, but I
imagine you don't REALLY want this unless you're frequently auto-scaling
nodes in/out of the cluster. You should probably just pick a high RF and
you'll be OK with it.


> Right now, I have a 2 node cluster with replication factor 3. Will this
> cause any issues, having a RF > #nodes? Or is there a way to just have it
> copy to *all* nodes?
>

It's obviously not the intended config, but I don't think it'll cause many
problems.


> Is there any way that I can tune Cassandra to be more read-optimized?
>
>
Yes - definitely use leveled compaction instead of STCS (the default), and
definitely take the time to tune the JVM args - read path generates a lot
of short lived java objects, so a larger eden will help you (maybe up to
40-50% of max heap size).
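As a sketch of the compaction change (keyspace/table names are placeholders):

```
ALTER TABLE auth.users
  WITH compaction = {'class': 'LeveledCompactionStrategy'};
```

On the heap side, with CMS the eden size is set via `-Xmn` (HEAP_NEWSIZE in cassandra-env.sh); for example, `-Xmn4G` against an 8 GB max heap would land in the 40-50% range mentioned above.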


> Finally, I have some misgivings about how well Cassandra fits my use-case.
> Please, if anyone has a suggestion as to why or why not it is a good fit, I
> would really appreciate your input! If this could be done with a simple SQL
> database and this is overkill, please let me know.
>
> Thanks for your input!
>
>


Tuning Replication Factor - All, Consistency ONE

2018-07-10 Thread Code Wiget
Hi,

I have been tasked with picking and setting up a database with the following 
characteristics:

• Ultra-high availability - The real requirement is uptime - our whole platform 
becomes inaccessible without a “read” from the database. We need the read to 
authenticate users. Databases will never be spread across multiple networks.
• Reasonably quick access speeds
• Very low data storage - The data storage is very low - for 10 million users, 
we would have around 8GB of storage total.

Having done a bit of research on Cassandra, I think the optimal approach for my 
use-case would be to replicate the data on ALL nodes possible, but require 
reads to only have a consistency level of one. So, in the case that a node goes 
down, we can still read/write to other nodes. It is not very important that a 
read be unanimously agreed upon, as long as Cassandra is eventually consistent, 
within around 1s, then there shouldn’t be an issue.

When I go to set up the database though, I am required to set a replication 
factor to a number - 1,2,3,etc. So I can’t just say “ALL” and have it replicate 
to all nodes. Right now, I have a 2 node cluster with replication factor 3. 
Will this cause any issues, having a RF > #nodes? Or is there a way to just 
have it copy to all nodes? Is there any way that I can tune Cassandra to be 
more read-optimized?

Finally, I have some misgivings about how well Cassandra fits my use-case. 
Please, if anyone has a suggestion as to why or why not it is a good fit, I 
would really appreciate your input! If this could be done with a simple SQL 
database and this is overkill, please let me know.

Thanks for your input!



Re: Cassandra 2FA

2018-07-10 Thread Vitali Dyachuk
Thanks, I checked the ticket, which is about client hostname verification, but
this is not an optimal solution for us; maintaining the allowed-hosts list is
not a convenient approach, since once new hosts are added you have to reissue a
new cert and deploy it. What we are looking for is, for example, certificate
validation based on the CN, which adds an additional small layer of security.
I'm also thinking of trying the OID "challengePassword" as a pre-shared key, but
that's not related to C*.


On Tue, Jul 10, 2018 at 10:43 AM Stefan Podkowinski  wrote:

> You may want to keep an eye on the following ticket:
> https://issues.apache.org/jira/browse/CASSANDRA-13404
>
>
> On 09.07.2018 17:12, Vitali Dyachuk wrote:
> > Hi,
> > There is a certificate validation based on the mutual CA this is a 1st
> > factor, the 2nd factor could be checking the common name of the client
> > certificate, probably this requires writing a patch, but probably some
> > has already done that ?
> >
> > Vitali Djatsuk.
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: CPU Spike with Jmx_exporter

2018-07-10 Thread Alain RODRIGUEZ
Hello,

I did not work with the 'jmx_exporter' in a production cluster, but for
datadog agent and other collectors I could work with, the number of metrics
being collected was a key point.

Cassandra exposes a lot of metrics and I saw datadog agents taking too much
CPU, I even saw Graphite servers falling because of the load due to some
Cassandra nodes sending metrics. I would recommend you to make sure that
you are filtering-in only metrics that are used to display some charts or
used for alerting purposes. Restrict the pattern for the rules as much as
possible.
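For example, a jmx_exporter config that scrapes only a whitelisted subset of MBeans might look like this (the object-name patterns and rule are illustrative; keep only what your dashboards and alerts actually use):

```yaml
# jmx_exporter config sketch: whitelist a small set of MBeans instead of
# scraping everything Cassandra exposes.
lowercaseOutputName: true
lowercaseOutputLabelNames: true
whitelistObjectNames:
  - "org.apache.cassandra.metrics:type=ClientRequest,*"
  - "org.apache.cassandra.metrics:type=Storage,*"
rules:
  # Keep only the Count attribute of latency/timeout metrics from the
  # whitelisted client-request MBeans.
  - pattern: "org.apache.cassandra.metrics<type=ClientRequest, scope=(\\w+), name=(Latency|Timeouts)><>(Count)"
    name: "cassandra_clientrequest_$2_count"
    labels:
      operation: "$1"
```

Anything not matched is simply not exported, which cuts both the scrape time and the CPU spent serializing metrics.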

Also, for datadog agents, some work was done in the latest version so that
metric collection requires less CPU. Maybe a similar update was released for
jmx_exporter, or you could ask for one. CPU usage might also be related to GC;
since the agent runs inside a JVM, some GC tuning might help.

If you really cannot do much to improve it on your side, I would open an
issue or a discussion on prometheus side (
https://github.com/prometheus/jmx_exporter/issues maybe?).

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com



2018-07-05 20:39 GMT+01:00 rajpal reddy :

> We are seeing the CPU spike only when Jmx metrics are exposed using
> Jmx_exporter.  tried setting up imx authentication still see cpu spike. if
> i stop using jmx exporter  we don’t see any cpu spike. is there any thing
> we have to tune to make work with Jmx_exporter?
>
>
> On Jun 14, 2018, at 2:18 PM, rajpal reddy 
> wrote:
>
> Hey Chris,
>
> Sorry to bother you. Did you get a chance to look at the gclog file I sent
> last night.
>
> On Wed, Jun 13, 2018, 8:44 PM rajpal reddy 
> wrote:
>
>> Chris,
>>
>> sorry, I attached the wrong log file. Attaching GC collection seconds and CPU;
>> they were going high at the same time. Also attached the gc.log. The grafana
>> dashboard and gc.log timings are 4 hours apart; the GC can be seen on 06/12
>> around 22:50
>>
>> rate(jvm_gc_collection_seconds_sum[5m])
>>
>> > On Jun 13, 2018, at 5:26 PM, Chris Lohfink  wrote:
>> >
>> > There are not even a 100ms GC pause in that, are you certain theres a
>> problem?
>> >
>> >> On Jun 13, 2018, at 3:00 PM, rajpal reddy 
>> wrote:
>> >>
>> >> Thanks Chris I did attached the gc logs already. reattaching them
>> now.
>> >>
>> >> it started yesterday around 11:54PM
>> >>> On Jun 13, 2018, at 3:56 PM, Chris Lohfink 
>> wrote:
>> >>>
>>  What is the criteria for picking up the value for G1ReservePercent?
>> >>>
>> >>>
>> >>> it depends on the object allocation rate vs the size of the heap.
>> Cassandra ideally would be sub 500-600mb/s allocations but it can spike
>> pretty high with something like reading a wide partition or repair
>> streaming which might exceed what the g1 ygcs tenuring and timing is
>> prepared for from previous steady rate. Giving it a bigger buffer is a nice
>> safety net for allocation spikes.
>> >>>
>>  is the HEAP_NEWSIZE is required only for CMS
>> >>>
>> >>>
>> >>> it should only set Xmn with that if using CMS, with G1 it should be
>> ignored or else yes it would be bad to set Xmn. Giving the gc logs will
>> give the results of all the bash scripts along with details of whats
>> happening so its your best option if you want help to share that.
>> >>>
>> >>> Chris
>> >>>
>>  On Jun 13, 2018, at 12:17 PM, Subroto Barua <
>> sbarua...@yahoo.com.INVALID > wrote:
>> 
>>  Chris,
>>  What is the criteria for picking up the value for G1ReservePercent?
>> 
>>  Subroto
>> 
>> > On Jun 13, 2018, at 6:52 AM, Chris Lohfink 
>> wrote:
>> >
>> > G1ReservePercent
>> 
>>  
>> -
>>  To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>  For additional commands, e-mail: user-h...@cassandra.apache.org
>> 
>> >>>
>> >>>
>> >>> -
>> >>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> >>> For additional commands, e-mail: user-h...@cassandra.apache.org
>> >>>
>> >>
>> >>
>> >>
>> >> -
>> >> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> >> For additional commands, e-mail: user-h...@cassandra.apache.org
>> >
>> >
>> > -
>> > To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> > For additional commands, e-mail: user-h...@cassandra.apache.org
>> >
>>
>>
>


Re: Write Time of a Row in Multi DC Cassandra Cluster

2018-07-10 Thread Alain RODRIGUEZ
Hello,

 I have multi DC (3 DC's) Cassandra cluster/ring - One of the application
> wrote a row to DC1(using Local Quorum)  and within span of 50 ms, it tried
> to read same row from DC2 and could not find the row.

 [...]

So how to determine when the row is actually written in each DC?


To me, the guarantee you are trying to achieve could be obtained by using
'EACH_QUORUM' for writes (i.e. 'LOCAL_QUORUM' in each DC) and 'LOCAL_QUORUM' for
reads, for example. You would then have strong consistency, as long as the same
client application runs the write and then the read, or sends a trigger
for the second call sequentially, after validating the write, in some way.

Our both DC's have sub milli second latency at network level, usually <2
> ms. We promised 20 ms consistency. In this case Application could not find
> the row in DC2 in 50 ms
>

In these conditions, using 'EACH_QUORUM' might not be too much of a burden
for the coordinator and the client. The writes are already being sent to every
replica; this would only increase the latency at the coordinator level (and thus
at the client level), but you would be sure that every DC has the row in
a majority of its replicas before triggering the read.
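In cqlsh terms, the split would look like this (the keyspace, key column, and values are placeholders; the table name is taken from the thread):

```
-- writer side: block until a quorum of replicas in *every* DC has the write
CONSISTENCY EACH_QUORUM;
INSERT INTO ks.token_authorizations (id, authorizations_json)
VALUES (42, '{"scope": "read"}');

-- reader side, from any DC: a local quorum now overlaps that DC's write quorum
CONSISTENCY LOCAL_QUORUM;
SELECT authorizations_json FROM ks.token_authorizations WHERE id = 42;
```

Note that EACH_QUORUM is only available for writes; reads would stay at LOCAL_QUORUM (or higher) as sketched above.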

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com


2018-07-10 8:24 GMT+01:00 Simon Fontana Oscarsson <
simon.fontana.oscars...@ericsson.com>:

> Have you tried trace?
> --
> SIMON FONTANA OSCARSSON
> Software Developer
>
> Ericsson
> Ölandsgatan 1
> 37133 Karlskrona, Sweden
> simon.fontana.oscars...@ericsson.com
> www.ericsson.com
>
> On mån, 2018-07-09 at 19:30 +, Saladi Naidu wrote:
> > Cassandra is an eventual consistent DB, how to find when a row is
> actually written in multi DC environment? Here is the problem I am trying
> to solve
> >
> > - I have multi DC (3 DC's) Cassandra cluster/ring - One of the
> application wrote a row to DC1(using Local Quorum)  and within span of 50
> ms, it tried to read same row from DC2 and could not find the
> > row. Our both DC's have sub milli second latency at network level,
> usually <2 ms. We promised 20 ms consistency. In this case Application
> could not find the row in DC2 in 50 ms
> >
> > I tried to use "select WRITETIME(authorizations_json) from
> token_authorizations where " to find  when the Row is written in each
> DC, but both DC's returned same Timestamp. After further research
> > I found that Client V3 onwards Timestamp is supplied at Client level so
> WRITETIME does not help "https://docs.datastax.com/en/developer/java-driver/3.4/manual/query_timestamps/"
> >
> > So how to determine when the row is actually written in each DC?
> >
> >
> > Naidu Saladi
>


Re: Paging in Cassandra

2018-07-10 Thread Alain RODRIGUEZ
Hello,

It sounds like a client/coding issue. People here work with many distinct
clients to connect to Cassandra, and it looks like there are not many
'spring-data-cassandra' users around ¯\_(ツ)_/¯.

You could try asking there to see if you have more luck:
https://spring.io/questions.

C*heers,

Alain

2018-07-05 6:21 GMT+01:00 Ghazi Naceur :

> Hello Eveyone,
>
> I'm facing a problem with CassandraPageRequest and Slice.
> In fact, I'm always obtaining the same Slice and I'm not able to get the
> next slice (or Page) of data.
> I'm based on this example :
>
> Link : https://github.com/spring-projects/spring-data-cassandra/pull/114
>
>
> Query query = Query.empty().pageRequest(CassandraPageRequest.first(10));
> Slice<User> slice = template.slice(query, User.class);
>
> do {
>     // consume slice
>     if (slice.hasNext()) {
>         slice = template.select(query, slice.nextPageable(), User.class);
>     } else {
>         break;
>     }
> } while (!slice.getContent().isEmpty());
>
>
>
> I appreciate your help.
>


Re: Installation

2018-07-10 Thread Michael Shuler
On 07/10/2018 02:48 AM, rajasekhar kommineni wrote:
> Hi Rahul,
> 
> The problem for removing the old links is Cassandra binaries are pointed
> from /usr//bin/, /usr//sbin etc ..
> 
> $ which nodetool 
> /usr/bin/nodetool
> $ which cqlsh
> /usr/bin/cqlsh
> $ which cassandra
> /usr/sbin/cassandra

This is a basic linux usage thing, not really a cassandra problem, but
it's why packages make things simple for general use - the default
/usr/{s}bin locations are in $PATH. If you wish to have nodetool, etc.
in your user's $PATH, just update the user's shell configuration to
include the tar locations.

export CASSANDRA_HOME=
export PATH="$CASSANDRA_HOME/bin:$CASSANDRA_HOME/tools/bin:$PATH"

This can be added to the bottom of ~/.bashrc for persistence. Bonus
points for symlink of generic cassandra_home to versioned one, which is
used for upgrades without messing with PATH env for user and within
configs for Cassandra.
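The bonus-points symlink setup can be sketched like this (paths and version numbers are examples for illustration):

```shell
# Unpack each versioned tarball side by side, then point a generic symlink at
# the live one; PATH and Cassandra configs reference only the generic name.
mkdir -p /tmp/cass-demo/apache-cassandra-3.11.2/bin
ln -sfn /tmp/cass-demo/apache-cassandra-3.11.2 /tmp/cass-demo/current

export CASSANDRA_HOME=/tmp/cass-demo/current
export PATH="$CASSANDRA_HOME/bin:$CASSANDRA_HOME/tools/bin:$PATH"

# An upgrade is then just repointing the symlink at the new version;
# nothing in PATH or the configs has to change.
mkdir -p /tmp/cass-demo/apache-cassandra-3.11.3/bin
ln -sfn /tmp/cass-demo/apache-cassandra-3.11.3 /tmp/cass-demo/current
readlink /tmp/cass-demo/current   # -> /tmp/cass-demo/apache-cassandra-3.11.3
```
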

-- 
Michael

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Installation

2018-07-10 Thread rajasekhar kommineni
Hi Rahul,

The problem with removing the old links is that the Cassandra binaries are pointed 
to from /usr//bin/, /usr//sbin, etc.

$ which nodetool 
/usr/bin/nodetool
$ which cqlsh
/usr/bin/cqlsh
$ which cassandra
/usr/sbin/cassandra
$ 


Thanks,


> On Jul 10, 2018, at 12:28 AM, Rahul Singh  
> wrote:
> 
> That approach will work, however that may take a long time. 
> 
> The important things that are unique to your cluster will be your 
> configuration files & your data /log  directories. 
> 
> The binaries can be placed on the same machines via tar installation. While 
> keeping the machines running on the old binaries, you can migrate the data / 
> log new directories. If you move your data, you can use links in linux to 
> point the old directories to the new locations. 
> 
> Once this is done, you can configure your tar installation to point to your 
> new data directories, and turn off the old binaries and turn on the new 
> binaries, one node at a time. 
> 
> 
> --
> Rahul Singh
> rahul.si...@anant.us
> 
> Anant Corporation
> On Jul 9, 2018, 6:35 PM -0500, rajpal reddy , wrote:
>> We have our infrastructure in cloud so opted for adding new dc with tar.gz 
>> then removed the old dc with package installation
>> 
>> Sent from my iPhone
>> 
>>> On Jul 9, 2018, at 2:23 PM, rajasekhar kommineni  
>>> wrote:
>>> 
>>> Hello All,
>>> 
>>> I have a cassandra cluster where package installation is done, I want to 
>>> convert it to tar.gz installation. Is there any procedure to follow.
>>> 
>>> Thanks,
>>> Rajasekhar Kommineni
>>> 
>>> 
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>> 
>> 
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>> 



Re: Cassandra 2FA

2018-07-10 Thread Stefan Podkowinski
You may want to keep an eye on the following ticket:
https://issues.apache.org/jira/browse/CASSANDRA-13404


On 09.07.2018 17:12, Vitali Dyachuk wrote:
> Hi,
> There is a certificate validation based on the mutual CA this is a 1st
> factor, the 2nd factor could be checking the common name of the client
> certificate, probably this requires writing a patch, but probably some
> has already done that ?
> 
> Vitali Djatsuk.

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Jmx_exporter CPU spike

2018-07-10 Thread Rahul Singh
Nice find, Ben. I added this to my list of c* monitoring tools.

--
Rahul Singh
rahul.si...@anant.us

Anant Corporation
On Jul 9, 2018, 8:20 PM -0500, rajpal reddy , wrote:
> Thanks Ben!. will look into it
> > On Jul 9, 2018, at 10:42 AM, Ben Bromhead  wrote:
> >
> > Hi Rajpal
> >
> > I'd invite you to have a look at 
> > https://github.com/zegelin/cassandra-exporter
> >
> > Significantly faster (bypasses JMX rpc stuff, 10ms to collect metrics for 
> > 300 tables vs 2-3 seconds via JMX), plus the naming/tagging fits far better 
> > into the Prometheus world. Still missing a few stats like GC etc, but feel 
> > free to submit a PR!
> >
> > Ben
> >
> >
> >
> > > On Mon, Jul 9, 2018 at 12:03 AM Rahul Singh 
> > >  wrote:
> > > > How often are you polling the JMX? How much of a spike are you seeing 
> > > > in CPU?
> > > >
> > > > --
> > > > Rahul Singh
> > > > rahul.si...@anant.us
> > > >
> > > > Anant Corporation
> > > > On Jul 5, 2018, 2:45 PM -0500, rajpal reddy , 
> > > > wrote:
> > > > >
> > > > > we have Qualys security scan running causing the cpu spike. We are 
> > > > > seeing the CPU spike only when Jmx metrics are exposed using 
> > > > > Jmx_exporter. tried setting up imx authentication still see cpu 
> > > > > spike. if i stop using jmx exporter we don’t see any cpu spike. is 
> > > > > there any thing we have to tune to make work with Jmx_exporter?
> > > > >
> > > > >
> > > > > -
> > > > > To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> > > > > For additional commands, e-mail: user-h...@cassandra.apache.org
> > > > >
> > --
> > Ben Bromhead
> > CTO | Instaclustr
> > +1 650 284 9692
> > Reliability at Scale
> > Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
>


Re: Installation

2018-07-10 Thread Rahul Singh
That approach will work, however that may take a long time.

The important things that are unique to your cluster will be your configuration 
files & your data /log  directories.

The binaries can be placed on the same machines via a tar installation. While 
keeping the machines running on the old binaries, you can migrate the data/log 
directories to new locations. If you move your data, you can use links in Linux 
to point the old directories to the new locations.

Once this is done, you can configure your tar installation to point to your new 
data directories, and turn off the old binaries and turn on the new binaries, 
one node at a time.


--
Rahul Singh
rahul.si...@anant.us

Anant Corporation
On Jul 9, 2018, 6:35 PM -0500, rajpal reddy , wrote:
> We have our infrastructure in cloud so opted for adding new dc with tar.gz 
> then removed the old dc with package installation
>
> Sent from my iPhone
>
> > On Jul 9, 2018, at 2:23 PM, rajasekhar kommineni  
> > wrote:
> >
> > Hello All,
> >
> > I have a cassandra cluster where package installation is done, I want to 
> > convert it to tar.gz installation. Is there any procedure to follow.
> >
> > Thanks,
> > Rajasekhar Kommineni
> >
> >
> > -
> > To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: user-h...@cassandra.apache.org
> >
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>


Re: Write Time of a Row in Multi DC Cassandra Cluster

2018-07-10 Thread Simon Fontana Oscarsson
Have you tried trace?
-- 
SIMON FONTANA OSCARSSON
Software Developer

Ericsson
Ölandsgatan 1
37133 Karlskrona, Sweden
simon.fontana.oscars...@ericsson.com
www.ericsson.com

On mån, 2018-07-09 at 19:30 +, Saladi Naidu wrote:
> Cassandra is an eventual consistent DB, how to find when a row is actually 
> written in multi DC environment? Here is the problem I am trying to solve 
> 
> - I have multi DC (3 DC's) Cassandra cluster/ring - One of the application 
> wrote a row to DC1(using Local Quorum)  and within span of 50 ms, it tried to 
> read same row from DC2 and could not find the
> row. Our both DC's have sub milli second latency at network level, usually <2 
> ms. We promised 20 ms consistency. In this case Application could not find 
> the row in DC2 in 50 ms
> 
> I tried to use "select WRITETIME(authorizations_json) from 
> token_authorizations where " to find  when the Row is written in each DC, 
> but both DC's returned same Timestamp. After further research
> I found that Client V3 onwards Timestamp is supplied at Client level so 
> WRITETIME does not help 
> "https://docs.datastax.com/en/developer/java-driver/3.4/manual/query_timestamps/"
> 
> So how to determine when the row is actually written in each DC?
> 
>  
> Naidu Saladi 
