Re: Modify keyspace replication strategy and rebalance the nodes

2017-09-13 Thread Jeff Jirsa
The token distribution isn't going to change - the way Cassandra maps replicas will change. How many data centers/regions will you have when you're done? What's your RF now? You definitely need to run repair before you ALTER, but you've got a bit of a race here between the repairs and the

Cassandra configuration thumb rules

2017-09-13 Thread Avi Levi
Hi all, I plan to install Cassandra on-prem; we expect a load of 10 million inserts per minute. Are there any rules of thumb for configuration, hardware requirements, memory allocation, etc.? Thanks, Avi

Re: Modify keyspace replication strategy and rebalance the nodes

2017-09-13 Thread Fabrice Facorat
Hi, the steps are:
- ALTER KEYSPACE to change your replication strategy
- "nodetool repair -pr" on ALL nodes, or a full "nodetool repair" on enough replicas to distribute and rebalance your data to the replicas
- "nodetool cleanup" on every node to remove superfluous data
Please note that you'd
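As a rough illustration of the ALTER KEYSPACE step mentioned above, the sketch below only builds the CQL statement text for switching a keyspace to NetworkTopologyStrategy. The helper name `build_alter_keyspace`, the keyspace name `my_ks`, and the datacenter name `us-east` are hypothetical; the datacenter name has to match whatever your snitch actually reports (for example in `nodetool status`), and the replica counts are placeholders, not a recommendation.

```python
def build_alter_keyspace(keyspace, replicas_per_dc):
    """Return a CQL ALTER KEYSPACE statement using NetworkTopologyStrategy.

    keyspace        -- keyspace name (hypothetical in the example below)
    replicas_per_dc -- dict mapping datacenter name to replication factor
    """
    if not replicas_per_dc:
        raise ValueError("at least one datacenter is required")

    # The replication map always starts with the strategy class,
    # followed by one entry per datacenter.
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc_name, rf in replicas_per_dc.items():
        options.append("'%s': %d" % (dc_name, int(rf)))

    replication_map = "{" + ", ".join(options) + "}"
    return "ALTER KEYSPACE %s WITH replication = %s;" % (keyspace, replication_map)


if __name__ == "__main__":
    # Assumed names: keyspace "my_ks", datacenter "us-east", RF 3.
    print(build_alter_keyspace("my_ks", {"us-east": 3}))
```

The statement would still need to be run through cqlsh or a driver, and the repair and cleanup steps from the list are separate nodetool invocations that this sketch does not cover.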

question on the code formatter

2017-09-13 Thread preetika tyagi
Hi all, I was trying to configure the Cassandra code formatter and downloaded IntelliJ-codestyle.jar from this link: https://wiki.apache.org/cassandra/CodeStyle After extracting this JAR, I was able to import codestyle/Default_1_.xml into my project and formatting seemed to work. However, I'm

Re: load distribution that I can't explain

2017-09-13 Thread kaveh minooie
I am using RoundRobin:

    cluster = Cluster.builder()...( socket stuff, pool option stuff ... )
        .withLoadBalancingPolicy( new RoundRobinPolicy() )
        .addContactPoints( hosts )
        .build();

On 09/13/2017 03:02 AM, kurt greaves wrote: Are you

Re: BATCH OPERATION issue

2017-09-13 Thread Eric Stevens
The original timestamp is bigger than the timestamp you're using in your batch. Cassandra uses timestamps for conflict resolution, so the batch write will lose. On Wed, Sep 13, 2017 at 11:59 AM Deepak Panda wrote: > Hi All, > > Am in the process of learning batch
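To make the conflict-resolution point concrete, here is a minimal Python simulation of timestamp-based last-write-wins. It is not Cassandra code: the `Cell` type and `reconcile` function are hypothetical names, the timestamps are made-up microsecond values, and equal timestamps are simply resolved in favor of the existing cell, which simplifies Cassandra's real tie-breaking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    value: str
    write_time: int  # write timestamp, e.g. microseconds since the epoch


def reconcile(existing, incoming):
    """Return the cell that survives: the one with the larger timestamp.

    Simplification: on equal timestamps the existing cell is kept.
    """
    if incoming.write_time > existing.write_time:
        return incoming
    return existing


if __name__ == "__main__":
    stored = Cell("original", write_time=2000)

    # A batch write carrying an older explicit timestamp loses.
    print(reconcile(stored, Cell("from batch", write_time=1000)).value)

    # The same write with a newer timestamp wins.
    print(reconcile(stored, Cell("from batch", write_time=3000)).value)
```

In this model the write with the older timestamp is accepted without error but never becomes visible, which matches the behavior described in the reply above.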

Modify keyspace replication strategy and rebalance the nodes

2017-09-13 Thread Dominik Petrovic
Dear community, I'd like to receive additional info on how to modify a keyspace replication strategy. My Cassandra cluster is on AWS, Cassandra 2.1.15 using vnodes, the cluster's snitch is configured to Ec2Snitch, but the keyspace the developers created has replication class SimpleStrategy =

BATCH OPERATION issue

2017-09-13 Thread Deepak Panda
Hi all, I am in the process of learning batch operations. Here is what I tried. I executed a CQL query against the student table (student_id is the primary key): SELECT student_id, position, WRITETIME(class_id), WRITETIME(position) FROM student WHERE student_id='s123'; student_id position

Re: Rebalance a cassandra cluster

2017-09-13 Thread kurt greaves
You should choose a partition key that enables you to have a uniform distribution of partitions amongst the nodes and refrain from having too many wide rows/a small number of wide partitions. If your tokens are already uniformly distributed, recalculating in order to achieve a better data load

Re: load distribution that I can't explain

2017-09-13 Thread kurt greaves
Are you using a load balancing policy? That sounds like you are only using node2 as a coordinator.

Rebalance a cassandra cluster

2017-09-13 Thread Akshit Jain
Suppose I have a Cassandra cluster whose data is skewed such that one node has 40% more data than the other nodes. When the cluster was created, the tokens were distributed uniformly. Now, to make the data uniform, I have to recalculate the tokens and assign them to nodes in the cluster.

Maturity and Stability of Enabling CDC

2017-09-13 Thread Michael Fong
Hi all, We've noticed there is a new feature for streaming changed data to other streaming services. Doc: http://cassandra.apache.org/doc/latest/operating/cdc.html We are evaluating the stability (and maturity) of this feature, and possibly integrating it with Kafka (associated w/ its connector).

Re: Rebalance a cassandra cluster

2017-09-13 Thread Akshit Jain
Suppose I have a Cassandra cluster whose data is skewed such that one node has 40% more data than the other nodes. When the cluster was created, the tokens were distributed uniformly. Now, to make the data uniform, I have to recalculate the tokens and assign them to nodes in the cluster.

Re: Historical data movement to other cluster

2017-09-13 Thread Hannu Kröger
Hi, if you have that data in different tables, then it's a relatively straightforward operation of loading only certain tables with sstableloader. If not, then you could use Spark to read and filter data from one cluster and store it in another cluster. Hannu On 13 September 2017 at

Re: Rebalance a cassandra cluster

2017-09-13 Thread Hannu Kröger
Hi, you should make sure that the token range is evenly distributed if you have a single token configured per node. You can use e.g. this tool to calculate tokens: https://www.geroba.com/cassandra/cassandra-token-calculator/ Also, make sure that none of the partitions in your data model are hotspots
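For the single-token-per-node case mentioned above, evenly spaced tokens can also be computed by hand. The sketch below assumes the Murmur3Partitioner, whose token range runs from -2^63 to 2^63 - 1; the function name `evenly_spaced_tokens` is hypothetical, and the result does not apply to clusters using vnodes or a different partitioner.

```python
def evenly_spaced_tokens(node_count):
    """Return one token per node, evenly spaced over the Murmur3 token range.

    Assumes Murmur3Partitioner (tokens from -2**63 to 2**63 - 1) and a
    single token per node.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")

    ring_size = 2 ** 64   # number of distinct tokens in the ring
    min_token = -(2 ** 63)  # smallest Murmur3 token
    return [min_token + (i * ring_size) // node_count for i in range(node_count)]


if __name__ == "__main__":
    for index, token in enumerate(evenly_spaced_tokens(4)):
        print("node %d: %d" % (index, token))
```

Even spacing of tokens only balances ownership of the ring; a data model with a few very large partitions can still leave one node holding more data than the others.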

RE: Rebalance a cassandra cluster

2017-09-13 Thread Harika Vangapelli -T (hvangape - AKRAYA INC at Cisco)
Check with nodetool repair.