Cassandra Resource Planning

2016-03-31 Thread Joe Hicks
I am doing resource planning and could use some help. How many operations people will I need to manage my Cassandra implementation for two sites with 10 nodes at each site? As my cluster grows, at what point will I need to add another person?

Re: Multi DC setup for analytics

2016-03-31 Thread Anishek Agarwal
Hey Bryan, Thanks for the info, we inferred as much. Currently the only other thing we were trying was to start two separate instances in the Analytics cluster on the same set of machines to talk to the respective individual DCs, but within 2 mins we dropped that as we will have to change ports on atlas

Re: Multi DC setup for analytics

2016-03-31 Thread Bryan Cheng
I'm jumping into this thread late, so sorry if this has been covered before. But am I correct in reading that you have two different Cassandra rings, not talking to each other at all, and you want to have a shared DC with a third Cassandra ring? I'm not sure what you want to do is possible. If I

Re: Upgrade cassandra from 2.1.9 to 3.x?

2016-03-31 Thread adanec...@yahoo.com
Sure I have asked that before. I was told to get it from datastax but why would the install not have that? Sent from my Verizon 4G LTE Smartphone -- Original message -- From: John Wong Date: Thu, Mar 31, 2016 3:20 PM To: user@cassandra.apache.org; Tony Anecito; Cc: Subject: Re: Upgrade

Re: Adding Options to Create Statements...

2016-03-31 Thread James Carman
No thoughts? Would an upgrade of the driver "fix" this? On Wed, Mar 30, 2016 at 10:42 AM James Carman wrote: > I am trying to perform the following operation: > > public Create createCreate() { > Create create = >

Re: Upgrade cassandra from 2.1.9 to 3.x?

2016-03-31 Thread John Wong
Even if you can upgrade Cassandra straight from 1.2 to 3.X, also consider driver compatibility. On Thu, Mar 31, 2016 at 2:43 PM, Tony Anecito wrote: > I would also like to know. > > Thanks! > > > On Thursday, March 31, 2016 6:14 AM, Steven Choo > wrote:

RE: Speeding up "nodetool rebuild"

2016-03-31 Thread Anubhav Kale
Thanks, is there any way to determine that rebuild is complete? Based on the following line in StorageService.java, it's not logged. So, is there any other way to check besides checking data size through nodetool status? finally { // rebuild is done (successfully or not)

Re: How many nodes do we require

2016-03-31 Thread Jack Krupansky
Maybe that's a great definition of a modern distributed cluster: each person (node) has a different notion of priority. I'll wait for the next user email in which they complain that their data is "too stable" (missing updates.) -- Jack Krupansky On Thu, Mar 31, 2016 at 12:04 PM, Jacques-Henri

Re: Thrift composite partition key to cql migration

2016-03-31 Thread Tyler Hobbs
Also, can you paste the results of the relevant portions of "SELECT * FROM system.schema_columns" and "SELECT * FROM system.schema_columnfamilies"? On Thu, Mar 31, 2016 at 2:35 PM, Tyler Hobbs wrote: > In the Thrift schema, is the key_validation_class actually set to >

Re: Thrift composite partition key to cql migration

2016-03-31 Thread Tyler Hobbs
In the Thrift schema, is the key_validation_class actually set to CompositeType(UTF8Type, UTF8Type), or is it just BytesType? What Cassandra version? On Wed, Mar 30, 2016 at 4:44 PM, Jan Kesten wrote: > Hi, > > while migrating the reminder of thrift operations in my

Re: Inconsistent query results and node state

2016-03-31 Thread Tyler Hobbs
On Thu, Mar 31, 2016 at 11:53 AM, Jason Kania wrote: > > To me it just seems like the timestamp column value is sometimes not being > set somewhere in the pipeline and the result is the epoch 0 value. > I agree, especially since you can't directly query this row and that

Re: Upgrade cassandra from 2.1.9 to 3.x?

2016-03-31 Thread Tony Anecito
I would also like to know. Thanks! On Thursday, March 31, 2016 6:14 AM, Steven Choo wrote: Hi, Is it possible to update Cassandra from 2.1.9 to 3.x (e.g. 3.4) in one step? The info about the tick-tock release schedule does not say anything specific about it and the

Re: Consistency Level (QUORUM vs LOCAL_QUORUM)

2016-03-31 Thread Robert Coli
On Thu, Mar 31, 2016 at 4:35 AM, Alain RODRIGUEZ wrote: > My understanding is using RF 3 and LOCAL_QUORUM for both reads and writes > will provide a strong consistency and a high availability. One node can go > down and also without lowering the consistency. Or RF = 5, Quorum
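
The quorum arithmetic behind the RF 3 / LOCAL_QUORUM claim quoted above can be sketched as follows. This is a minimal illustration, not code from the thread: the function names are hypothetical, and it assumes the usual rule that a quorum is floor(RF / 2) + 1 replicas and that reads see the latest acknowledged write when read replicas plus write replicas exceed RF. For LOCAL_QUORUM, RF means the replication factor of the local datacenter.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for QUORUM / LOCAL_QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def is_strongly_consistent(rf: int, read_replicas: int, write_replicas: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > rf


def tolerated_failures(rf: int) -> int:
    """Replicas that can be down while a quorum is still reachable."""
    return rf - quorum(rf)


if __name__ == "__main__":
    for rf in (3, 5):
        q = quorum(rf)
        print(
            f"RF={rf}: quorum={q}, "
            f"strong with quorum reads and writes={is_strongly_consistent(rf, q, q)}, "
            f"nodes that can be down={tolerated_failures(rf)}"
        )
    # Reading and writing at ONE with RF=3 gives no overlap guarantee.
    print("RF=3, ONE/ONE strong:", is_strongly_consistent(3, 1, 1))
```

Under these assumptions RF 3 tolerates one replica being down and RF 5 tolerates two, which matches the two options mentioned in the message.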

Re: NTP Synchronization Setup Changes

2016-03-31 Thread Eric Evans
On Wed, Mar 30, 2016 at 8:07 PM, Mukil Kesavan wrote: > Are there any issues if this causes a huge time correction on the cassandra > cluster? I know that NTP gradually corrects the time on all the servers. I > just wanted to understand if there were any corner cases

Re: Inconsistent query results and node state

2016-03-31 Thread Jason Kania
Thanks for responding. The problems that we are having are in Cassandra 3.0.3 and 3.0.4. We had upgraded to see if the problem went away. The values have been out of sync this way for some time and we cannot get a row with the 1969 timestamp in any query that directly queries on the timestamp.
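
A "1969 timestamp" and the "epoch 0 value" mentioned elsewhere in this thread can be the same stored value: an unset timestamp of 0 milliseconds is 1970-01-01 00:00 UTC, which renders as late 1969 in any timezone behind UTC. The sketch below illustrates this; the function name and the UTC-5 offset are assumptions chosen for the example, not details taken from the thread.

```python
from datetime import datetime, timedelta, timezone

EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)


def render_epoch_millis(millis: int, utc_offset_hours: int) -> str:
    """Format a milliseconds-since-epoch value in a fixed-offset timezone."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return (EPOCH_UTC + timedelta(milliseconds=millis)).astimezone(tz).isoformat()


if __name__ == "__main__":
    # The same stored value (0) shown in UTC and in an assumed UTC-5 client timezone.
    print(render_epoch_millis(0, 0))
    print(render_epoch_millis(0, -5))
```

If the client displaying the rows runs in a timezone west of UTC, a zero timestamp would therefore show up with a 1969 date.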

Re: Speeding up "nodetool rebuild"

2016-03-31 Thread Eric Evans
On Wed, Mar 30, 2016 at 3:44 PM, Anubhav Kale wrote: > Any other ways to make the “rebuild” faster ? TL;DR add more nodes If you're encountering a per-stream bottleneck (easy to do if using compression), then having a higher node count will translate to higher stream

Re: Inconsistent query results and node state

2016-03-31 Thread Jason Kania
Thanks for the response. All nodes are using NTP. Thanks, Jason From: Kai Wang To: user@cassandra.apache.org; Jason Kania Sent: Wednesday, March 30, 2016 10:59 AM Subject: Re: Inconsistent query results and node state Do you have NTP setup

RE: How many nodes do we require

2016-03-31 Thread Jacques-Henri Berthemet
You’re right. I meant about data integrity, I understand it’s not everybody’s priority! -- Jacques-Henri Berthemet From: Jonathan Haddad [mailto:j...@jonhaddad.com] Sent: jeudi 31 mars 2016 17:48 To: user@cassandra.apache.org Subject: Re: How many nodes do we require Losing a write is very

Re: How many nodes do we require

2016-03-31 Thread Jonathan Haddad
Losing a write is very different from having a fragile cluster. A fragile cluster implies that the whole thing will fall apart, that it breaks easily. Writing at CL=ONE gives you a pretty damn stable cluster at the potential risk of losing a write that hasn't replicated (but has been ack'ed) which

Re: auto_bootstrap when a node is down

2016-03-31 Thread Carlos Alonso
Mmm ok, then I think you may need to follow the standard dead node replacement procedure: https://docs.datastax.com/en/cassandra/2.2/cassandra/operations/opsReplaceNode.html Cheers! Carlos Alonso | Software Engineer | @calonso On 31 March 2016 at 16:34, Peddi, Praveen

Re: auto_bootstrap when a node is down

2016-03-31 Thread Peddi, Praveen
Hi Carlos, In our case, old node is dead and is not accessible. So I am not sure if we can use rsync in this case. Praveen From: Carlos Alonso > Reply-To: "user@cassandra.apache.org"

Re: auto_bootstrap when a node is down

2016-03-31 Thread Carlos Alonso
If that's your use case, I've developed a quick disk-based replacement procedure. Basically all it involves is rsyncing the data from the old node to the new node and bringing the new one up as if it were the old one (only the IP will change). Step by step details here:

Re: Upgrade cassandra from 2.1.9 to 3.x?

2016-03-31 Thread Paulo Motta
If there isn't anything on NEWS.txt forbidding it, then it *should* be possible. That is the authoritative source for upgrade information. As noted by you, the only known restriction is that you upgrade from at least 2.1.9 as noted in the NEWS.txt entry. But as always, and especially when doing

Re: auto_bootstrap when a node is down

2016-03-31 Thread Peddi, Praveen
Hi Paulo, Thanks a lot for detailed explanation. Our usecase is that, when one node goes down, a new node in the same AZ comes up immediately (5 to 10 mins) and it is safe to assume that no other nodes in another AZ are down at this point of time. So based on your explanation, using

Re: StatusLogger output

2016-03-31 Thread Vasileios Vlachos
Anyone else any idea on how to interpret StatusLogger output? As Sean said, this may not help in determining the problem, but it would definitely help my general understanding. Thanks, Bill On Thu, Mar 24, 2016 at 5:24 PM, Vasileios Vlachos < vasileiosvlac...@gmail.com> wrote: > Thanks for your

Upgrade cassandra from 2.1.9 to 3.x?

2016-03-31 Thread Steven Choo
Hi, Is it possible to update Cassandra from 2.1.9 to 3.x (e.g. 3.4) in one step? The info about the tick-tock release schedule does not say anything specific about it and the documentation on http://docs.datastax.com is separated by 3.0 and 3.x. Even though the info on

Re: How many nodes do we require

2016-03-31 Thread Alain RODRIGUEZ
@Rakesh: Are you telling a SimpleReplication topology with RF=3 > or NetworkTopology with RF=3. Just always go with NetworkTopology, I see no reason not to do it nowadays, even on test clusters. If you use SimpleReplicationTopology, all the machines will be considered as only one datacenter, no

Re: How many nodes do we require

2016-03-31 Thread Alain RODRIGUEZ
Hi, Because if you lose a node you have chances to lose some data forever if it > was not yet replicated. I think I get your point, but keep in mind that CL ONE (or LOCAL_ONE) will not prevent the coordinator from sending the data to the 2 other replicas, it will just wait for the first ack,

Re: Consistency Level (QUORUM vs LOCAL_QUORUM)

2016-03-31 Thread Alain RODRIGUEZ
Hi, If you want the full immediate consistency of a traditional relational > database, then go with CL=ALL, otherwise, take your pick from the many > degrees of immediacy that Cassandra offers: My understanding is using RF 3 and LOCAL_QUORUM for both reads and writes will provide a strong

Optimizing read and write performance in cassandra through multi-DCs.

2016-03-31 Thread Atul Saroha
Hi, Would like to understand what's a good approach for handling different read and write patterns. We have both read intensive (like direct traffic from web) and write intensive tasks (continuous bulk/batch upload of data). For this, we had set up two datacenters, one for read and one for

Re: Runtime exception during repair job task

2016-03-31 Thread Carlos Alonso
This is probably due to corrupt data or a cassandra upgrade where you didn't run upgradesstables. I'd then suggest scrubbing the column family (or upgrading it). Hope it helps. Carlos Alonso | Software Engineer | @calonso On 31 March 2016 at 12:10,

Re: Re: Data export with consistency problem

2016-03-31 Thread Alain RODRIGUEZ
> > If we remove the network cable of one node, import 30 million rows of > data into that table, and then reconnect the network cable, we export the > data immediately and we cannot get all the 30 million rows of data. > But if we manually run ' kill -9 pid' of one node, import 30 million rows >

Runtime exception during repair job task

2016-03-31 Thread me
Hi all, Recently we tried to repair one of our biggest tables, and we keep getting hit by errors related to hard links. Here's a stacktrace: ERROR [RepairJobTask:4] 2016-03-31 05:47:27,268 RepairJob.java:145 - Error occurred during snapshot phase java.lang.RuntimeException: Could not create