RE: Token Ring Question

2016-06-03 Thread Anubhav Kale
some tools. In reality, it's just the replica that owns the first (clockwise) token. I'm not sure what you're really asking, though -- what are you concerned about? On Wed, Jun 1, 2016 at 2:40 PM, Anubhav Kale <anubhav.k...@microsoft.com> wrote: Hel

Token Ring Question

2016-06-01 Thread Anubhav Kale
Hello, I recently learnt that regardless of the number of Data Centers, there is really only one token ring across all nodes. (I was under the impression that there is one per DC, like how DataStax OpsCenter would show it). Suppose we have 4 vnodes, and 2 DCs (2 nodes in each DC) and a keyspace
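A quick way to see the single cross-DC ring on a live cluster (illustrative commands; keyspace, table, and key names are placeholders):

    # nodetool ring prints every token in the cluster; tokens owned by nodes
    # in different DCs interleave on the same single ring.
    nodetool ring

    # For any partition key, list the replicas that own it, across all DCs:
    nodetool getendpoints <keyspace> <table> <partition_key>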

RE: StreamCoordinator.ConnectionsPerHost set to 1

2016-06-17 Thread Anubhav Kale
. 2016-06-16 14:43 GMT-03:00 Anubhav Kale <anubhav.k...@microsoft.com>: Hello, I noticed that StreamCoordinator.ConnectionsPerHost is always set to 1 (Cassandra 2.1.13). If I am reading the code correctly, this means there will always be just one soc

StreamCoordinator.ConnectionsPerHost set to 1

2016-06-16 Thread Anubhav Kale
Hello, I noticed that StreamCoordinator.ConnectionsPerHost is always set to 1 (Cassandra 2.1.13). If I am reading the code correctly, this means there will always be just one socket (well, technically two, one for each direction) between nodes when rebuilding; thus the data transfer will always be serialized.

RE: Token Ring Question

2016-06-24 Thread Anubhav Kale
is correct, In my view, unless the drivers execute the *Topology.GetReplicas from Cassandra core somehow (something that isn’t available to them), they will never be able to tell the correct node that holds data for a given token. Is my understanding wrong? From: Anubhav Kale [mailto:anubhav.k

RE: DTCS Question

2016-03-18 Thread Anubhav Kale
Thanks for the explanation. From: Marcus Eriksson [mailto:krum...@gmail.com] Sent: Thursday, March 17, 2016 12:56 AM To: user@cassandra.apache.org Subject: Re: DTCS Question On Wed, Mar 16, 2016 at 6:49 PM, Anubhav Kale <anubhav.k...@microsoft.com>

DTCS bucketing Question

2016-03-19 Thread Anubhav Kale
Hello, I am trying to concretely understand how DTCS makes buckets. I am looking at the DateTieredCompactionStrategyTest.testGetBuckets method and have played with some of the parameters to the getBuckets method call (Cassandra 2.1.12). I don't think I fully understand something there. Let me try

Speeding up "nodetool rebuild"

2016-03-30 Thread Anubhav Kale
Hello, Will changing compactionthroughput and streamingthroughput help with reducing the "rebuild" time on a brand new node? We will do it both on the new node and on the nodes in the source DC from where data is streamed. Any other ways to make the "rebuild" faster? Thanks!
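For reference, both throttles can be changed at runtime with nodetool (a sketch; the values are arbitrary examples, and 0 disables the throttle):

    # run on the new node and on the source-DC nodes:
    nodetool setstreamthroughput 400     # streaming cap, in megabits/s
    nodetool setcompactionthroughput 0   # compaction cap in MB/s; 0 = unthrottled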

Removing a DC

2016-04-07 Thread Anubhav Kale
Hello, We removed a DC using instructions from https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_decomission_dc_t.html After all nodes were gone: 1. system.peers doesn't have an entry for the nodes that were removed (confirmed via a cqlsh query with consistency ALL).
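The check in point 1 can be reproduced with a cqlsh session like this (a sketch; the removed nodes' IPs should simply be absent from the output):

    cqlsh> CONSISTENCY ALL;
    cqlsh> SELECT peer, data_center, host_id FROM system.peers;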

RE: DTCS bucketing Question

2016-03-19 Thread Anubhav Kale
wordpress.com/2014/12/dtcs3.png [Anubhav Kale] This doesn’t seem correct. In the original test (look at the comments), the first window is pretty big, and in many cases the first window is big. > Note that if I keep the base at the original (100L) or increase it and play with > min_threshold the

DTCS Question

2016-03-19 Thread Anubhav Kale
I am using Cassandra 2.1.13 which has all the latest DTCS fixes (it does STCS within the DTCS windows). It also introduced a field called MAX_WINDOW_SIZE which defaults to one day. So in my data folders, I may see SS Tables that span beyond a day (generated from old data arriving through repairs or

RE: Rack aware question.

2016-03-23 Thread Anubhav Kale
consistency to ALL, and now I can get data all the time. From: Anubhav Kale [mailto:anubhav.k...@microsoft.com] Sent: Wednesday, March 23, 2016 4:14 PM To: user@cassandra.apache.org Subject: RE: Rack aware question. Thanks, Read repair is what I thought must be causing this, so I experimented some

RE: Rack aware question.

2016-03-23 Thread Anubhav Kale
=Rob On Wed, Mar 23, 2016 at 2:06 PM, Anubhav Kale <anubhav.k...@microsoft.com> wrote: Oh, and the query I ran was “select * from racktest.ra

RE: Rack aware question.

2016-03-23 Thread Anubhav Kale
ust be deprecated. ignore_rack flag can be useful if you move your data manually, with rsync or sstableloader. 2016-03-23 19:09 GMT-03:00 Anubhav Kale <anubhav.k...@microsoft.com>: Thanks for the pointer – appreciate it. My test is on the latest trun

RE: Rack aware question.

2016-03-23 Thread Anubhav Kale
t data can be retrieved even when a rack-level failure occurs. In short, if CL=ALL is acceptable, then you might as well dump the rack-aware approach, which was how you got into this situation in the first place. -- Jack Krupansky On Wed, Mar 23, 2016 at 7:31 PM, Anubhav Kale <

Rack aware question.

2016-03-23 Thread Anubhav Kale
Hello, Suppose we change the racks on VMs in a running cluster. (We need to do this while running on Azure, because sometimes when a VM gets moved, its rack changes.) In this situation, new writes will be laid out on the appropriate replicas based on the new rack info. What happens for existing data

RE: Speeding up "nodetool rebuild"

2016-03-31 Thread Anubhav Kale
or not) isRebuilding.set(false); } -Original Message- From: Eric Evans [mailto:eev...@wikimedia.org] Sent: Thursday, March 31, 2016 9:50 AM To: user@cassandra.apache.org Subject: Re: Speeding up "nodetool rebuild" On Wed, Mar 30, 2016 at 3:44 PM, Anubhav Kale

RE: Expiring Columns

2016-03-21 Thread Anubhav Kale
I think the answer is no. There are explicit checks in the read code path to ignore anything that’s past the TTL (based on the local time of the node under question). From: Anuj Wadehra [mailto:anujw_2...@yahoo.co.in] Sent: Monday, March 21, 2016 5:19 AM To: User Subject:

RE: Problem Replacing a Dead Node

2016-04-21 Thread Anubhav Kale
Is the datastax-agent running fine on the node? What do nodetool status and system.log show? From: Mir Tanvir Hossain [mailto:mir.tanvir.hoss...@gmail.com] Sent: Thursday, April 21, 2016 10:02 AM To: user@cassandra.apache.org Subject: Problem Replacing a Dead Node Hi, I am trying to replace

RE: Problem Replacing a Dead Node

2016-04-21 Thread Anubhav Kale
Reusing the bootstrapping node could have caused this, but it's hard to tell. Since you have only 7 nodes, have you tried doing a few rolling restarts of all nodes to let gossip settle? Also, the node is pingable from other nodes even though it says Unreachable below. Correct? Based on nodetool
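For context, the usual dead-node replacement path in this era of Cassandra is the replace_address startup flag (a sketch; the IP is a placeholder):

    # in cassandra-env.sh on the fresh replacement node, before its first start:
    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=<dead_node_ip>"

    # then watch it stream and join:
    nodetool status      # shows UJ while joining, UN when done
    nodetool gossipinfo  # confirm all nodes agree on cluster state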

Removing a datacenter

2016-05-23 Thread Anubhav Kale
Hello, Suppose we have 2 DCs and we know that the data is correctly replicated in both. In such a situation, is it safe to "remove" one of the DCs by simply doing a "nodetool removenode" followed by "nodetool removenode force" for each node in that DC (instead of doing a "nodetool decommission"
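For reference, the command sequence being asked about (host IDs come from nodetool status; force is meant only as a fallback when a removal stalls):

    nodetool status                 # note the Host ID of each node in the DC
    nodetool removenode <host_id>   # run from a live node in the surviving DC
    nodetool removenode status      # check progress
    nodetool removenode force       # only if the removal hangs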

RE: Removing a datacenter

2016-05-24 Thread Anubhav Kale
the keyspace, remove the replication settings for that DC, and then you can decommission (and they won’t need to stream nearly as much, since they no longer own that data – decom will go much faster). From: Anubhav Kale Reply-To: "user@cassandra.apache.org

Nodetool repair question

2016-05-10 Thread Anubhav Kale
Hello, Suppose I have 3 nodes, and stop Cassandra on one of them. Then I run a repair. Will repair move the token ranges from the down node to another node? In other words, does a repair operation ever change token ownership in any situation? Thanks!

Applying TTL Change quickly

2016-05-17 Thread Anubhav Kale
Hello, We use STCS and DTCS on our tables and recently made a TTL change (reduced from 8 days to 2) on a table with large amounts of data. What is the best way to quickly purge old data? I am playing with tombstone_compaction_interval at the moment, but would like some suggestions on what
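The knob mentioned here is a compaction subproperty; a minimal sketch of setting it from cqlsh (keyspace and table names are hypothetical, and unchecked_tombstone_compaction is the companion option often paired with it):

    cqlsh> ALTER TABLE myks.events WITH compaction = {
       ...     'class': 'DateTieredCompactionStrategy',
       ...     'tombstone_compaction_interval': '86400',
       ...     'unchecked_tombstone_compaction': 'true'};

Note that ALTER TABLE replaces the whole compaction map (so the class must be restated), and tombstones still cannot be purged before gc_grace_seconds has elapsed.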

SS Tables Files Streaming

2016-05-06 Thread Anubhav Kale
Hello, In what scenarios can SS Table files on disk from Node 1 go to Node 2 as-is? I'm aware this happens in nodetool rebuild, and I am assuming this does not happen in repairs. Can someone confirm? The reason I ask is I am working on a solution for backup / restore and I need to be sure

RE: SS Tables Files Streaming

2016-05-06 Thread Anubhav Kale
@cassandra.apache.org" Subject: Re: SS Tables Files Streaming Repairs, bootstrap, decommission. On Fri, May 6, 2016 at 1:16 PM Anubhav Kale <anubhav.k...@microsoft.com> wrote: Hello, In what scenarios can SS

SS Table File Names not containing GUIDs

2016-05-02 Thread Anubhav Kale
Hello, I am wondering if there is any reason as to why the SS Table file name format doesn't include a GUID. As far as I can tell, the incrementing number isn't really used for any special purpose in code, and having a unique name for the file seems to be a better thing, in general. Specifically, this

RE: Nodetool rebuild and bootstrap

2016-04-14 Thread Anubhav Kale
: Nodetool rebuild and bootstrap https://issues.apache.org/jira/browse/CASSANDRA-8838 Bootstrap only resumes on 2.2.0 and newer. I’m unsure of rebuild, but I suspect it does not resume at all. From: Anubhav Kale Reply-To: "user@cassandra.apache.org"

Leak Detected while bootstrap

2016-04-13 Thread Anubhav Kale
Hello, Since we upgraded to Cassandra 2.1.12, we are noticing that the failure below happens when we are trying to bootstrap nodes, and the process just gets stuck. Restarting the process / VM does not help. Our nodes are around 300 GB and run on local SSDs, and we haven't seen this problem on older

RE: Leak Detected while bootstrap

2016-04-13 Thread Anubhav Kale
share your logs leading up to the error? On Wed, Apr 13, 2016 at 3:37 PM, Anubhav Kale <anubhav.k...@microsoft.com> wrote: Hello, Since we upgraded to Cassandra 2.1.12, we are noticing that the failure below happens when we are trying to bootstrap nodes, and the

Changing Racks of Nodes

2016-04-20 Thread Anubhav Kale
Hello, If a running node moves around and changes its rack in the process, when it's back in the cluster (through the ignore-rack property), is it a correct statement that queries will not see some data residing on this node until a repair is run? Or, is it more like the node may get requests for
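For reference, the override and the follow-up repair look roughly like this (a sketch; in 2.1 the rack-change guard at startup is bypassed with a system property):

    # in cassandra-env.sh on the node whose rack changed:
    JVM_OPTS="$JVM_OPTS -Dcassandra.ignore_rack=true"

    # after it rejoins, re-sync replica placement for the new topology:
    nodetool repair <keyspace>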

Repairs at scale in Cassandra 2.1.13

2016-09-26 Thread Anubhav Kale
Hello, We run Cassandra 2.1.13 (don't have plans to upgrade yet). What is the best way to run repairs at scale (400 nodes, each holding ~600GB) that actually works? I'm considering doing subrange repairs (https://github.com/BrianGallew/cassandra_range_repair/blob/master/src/range_repair.py)
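The subrange approach boils down to many small invocations of the following form (token values are placeholders; the linked script automates computing the splits):

    # repair one slice of the ring at a time:
    nodetool repair -st <start_token> -et <end_token> <keyspace> <table>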

RE: Nodetool rebuild question

2016-10-06 Thread Anubhav Kale
ect: Re: Nodetool rebuild question If you set RF to 0, you can ignore my second sentence/paragraph. The third still applies. From: Anubhav Kale <anubhav.k...@microsoft.com> Reply-To: "user@cassandra.apache.org

Nodetool rebuild question

2016-10-05 Thread Anubhav Kale
Hello, As part of rebuild, I noticed that the destination node gets -tmp- files from other nodes. Are the following statements correct? 1. The files are written to disk without going through memtables. 2. Regular compactors eventually compact them to bring down the # of SSTables to a

RE: Nodetool rebuild question

2016-10-05 Thread Anubhav Kale
~100 more nodes into a cluster, you’d have to wait potentially a day or more per node to compact away the leftovers before bootstrapping the next, which is prohibitive at scale. - Jeff From: Anubhav Kale <anubhav.k...@microsoft.com> Rep

RE: Repairs at scale in Cassandra 2.1.13

2016-09-29 Thread Anubhav Kale
thelastpickle.com 2016-09-26 23:51 GMT+02:00 Anubhav Kale <anubhav.k...@microsoft.com>: Hello

Anticompaction Question

2016-10-25 Thread Anubhav Kale
Hello, If incremental repairs are enabled, there is logic in every compaction strategy to make sure not to mix repaired and unrepaired SS Tables. Does this mean that, if some SS Table files are repaired and some aren't, and incremental repairs don't work reliably, the unrepaired tables will never get
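One way to inspect and reset the repaired flag by hand, using the tools shipped under tools/bin in 2.1 (a sketch; the file path is a placeholder, and sstablerepairedset requires the node to be stopped):

    # 'Repaired at: 0' means the SSTable is still marked unrepaired:
    sstablemetadata /path/to/ks-table-ka-1-Data.db | grep "Repaired at"

    # mark everything unrepaired to fall back to full repairs (node must be down):
    sstablerepairedset --really-set --is-unrepaired /path/to/ks-table-ka-1-Data.db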

RE: Question on Read Repair

2016-11-03 Thread Anubhav Kale
Subject: Re: Question on Read Repair Yes: https://github.com/apache/cassandra/blob/81f6c784ce967fadb6ed7f58de1328e713eaf53c/src/java/org/apache/cassandra/db/ConsistencyLevel.java#L286 From: Anubhav Kale <anubhav.k...@microsoft.com> Rep

RE: Backup restore with a different name

2016-11-02 Thread Anubhav Kale
You would have to build some logic on top of what’s natively supported. Here is an option: https://github.com/anubhavkale/CassandraTools/tree/master/BackupRestore From: Jens Rantil [mailto:jens.ran...@tink.se] Sent: Wednesday, November 2, 2016 2:21 PM To: Cassandra Group

Question on Read Repair

2016-10-11 Thread Anubhav Kale
Hello, This is more of a theory / concept question. I set CL=ALL and do a read. Say one replica was down; will the rest of the replicas get repaired as part of this? (I am hoping the answer is yes). Thanks!
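A minimal way to exercise this from cqlsh (a sketch; keyspace, table, and key are placeholders):

    cqlsh> CONSISTENCY ALL;
    cqlsh> SELECT * FROM myks.mytable WHERE id = 42;

With all replicas up, a digest mismatch repairs any stale replica in the foreground before the result returns; as the reply below notes, a down replica fails the CL=ALL read instead of being repaired.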

RE: Question on Read Repair

2016-10-11 Thread Anubhav Kale
that the node is down, it won’t attempt a read, because the consistency level can’t be satisfied – none of the other replicas will be repaired. From: Anubhav Kale <anubhav.k...@microsoft.com> Reply-To: "user@cassandra.apache.org

New node overstreaming data ?

2016-10-13 Thread Anubhav Kale
Hello, We run 2.1.13 and are seeing an odd issue. A node went down, and stayed down for a while, so it went out of gossip. When we try to bootstrap it again (as a new node), it overstreams from other nodes; eventually the disk becomes full and it crashes. This repeated 3 times. Does anyone have any

RE: Repair in Multi Datacenter - Should you use -dc Datacenter repair or repair with -pr

2016-10-12 Thread Anubhav Kale
The default repair process doesn't usually work at scale, unfortunately. Depending on your data size, you have the following options. Netflix Tickler: https://github.com/ckalantzis/cassTickler (Read at CL.ALL via CQL continuously :: Python) Spotify Reaper:

VNode Streaming Math

2016-10-12 Thread Anubhav Kale
Hello, Suppose I have a 100-node ring, with num_tokens=32 (thus, 32 vnodes per physical machine). Assume this cluster has just one keyspace with one table. There are 10 SS Tables on each node, and the size on disk is 10GB on each node. For simplicity, assume each SSTable is 1GB. Now, a node
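Back-of-envelope numbers for this setup (my arithmetic, assuming perfectly even token distribution and ignoring replication factor):

    # ranges on the ring:  100 nodes x 32 vnodes = 3,200 token ranges
    # data per node:       10 SSTables x 1 GB    = 10 GB
    # data per vnode:      10 GB / 32 vnodes     = ~320 MB per range
    # so replacing one node means streaming ~32 ranges x ~320 MB = ~10 GB,
    # sourced from the replicas adjacent to each of its 32 ranges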

RE: Repair in Multi Datacenter - Should you use -dc Datacenter repair or repair with -pr

2016-10-12 Thread Anubhav Kale
as repaired (it’s a simpler version of mutation-based repair in a sense), so Cassandra has no way to prove that the data has been repaired. With tickets like https://issues.apache.org/jira/browse/CASSANDRA-6434, this has implications on tombstone removal. From: Anubhav Kale <anubha

RE: RemoveNode CPU Spike Question

2017-01-10 Thread Anubhav Kale
Well, looking through logs I confirmed that my understanding below is correct, but it would be good to hear from experts for sure :) From: Anubhav Kale [mailto:anubhav.k...@microsoft.com] Sent: Tuesday, January 10, 2017 9:58 AM To: user@cassandra.apache.org Cc: Sean Usher <s

RemoveNode CPU Spike Question

2017-01-10 Thread Anubhav Kale
Hello, Recently, I started noticing an interesting pattern. When I execute "removenode", a subset of the nodes that now own the tokens show a CPU spike / disk activity, and sometimes the number of SSTables on those nodes shoots up. After looking through the code, it appears to me that the function below

RE: High CPU on nodes

2016-12-21 Thread Anubhav Kale
www.thelastpickle.com 2016-12-17 0:10 GMT+01:00 Anubha

High CPU on nodes

2016-12-16 Thread Anubhav Kale
Hello, I am trying to fight a high CPU problem on some of our nodes. Thread dumps show that it's not GC threads (we have a 30GB heap), and iostat %iowait confirms it's not disk (it ranges between 0.3 and 0.9%). One of the ways in which the problem manifests is that the nodes can't compact SSTables and it
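A standard JVM triage loop for pinning hot threads to stacks (generic technique, not from this thread; the TID is a placeholder):

    top -H -p "$(pgrep -f CassandraDaemon)"   # find the hottest thread IDs
    printf '%x\n' <tid>                       # convert a TID to hex
    jstack "$(pgrep -f CassandraDaemon)" | grep -A 20 'nid=0x<hex>'
    nodetool compactionstats                  # pending compactions, per the thread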

RE: nodetool repair failure

2017-06-30 Thread Anubhav Kale
If possible, simply read the table in question with consistency=ALL. This will trigger a read repair and is far more reliable than the nodetool command. From: Balaji Venkatesan [mailto:venkatesan.bal...@gmail.com] Sent: Thursday, June 29, 2017 7:26 PM To: user@cassandra.apache.org Subject: Re:

Cassandra 2.1.13: Using JOIN_RING=False

2017-05-09 Thread Anubhav Kale
Hello, With some inspiration from the Cassandra Summit talk from last year, we are trying to set up a cluster with coordinator-only nodes. We set join_ring=false in env.sh, disabled auth in the YAML, and the nodes are able to start just fine. However, we're running into a few problems: 1] The
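For reference, the env.sh change described here (the cassandra.join_ring property is real in 2.1; the note about nodetool join is an assumption about the intended workflow):

    # in cassandra-env.sh: start the node as a coordinator that owns no token ranges
    JVM_OPTS="$JVM_OPTS -Dcassandra.join_ring=false"

    # later, 'nodetool join' makes the node take ownership of tokens and join the ring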

unrecognized column family in logs

2017-11-08 Thread Anubhav Kale
Hello, We run Cassandra 2.1.13, and for the last few days we've been seeing the error below in the logs occasionally. The node then becomes unresponsive to cqlsh. ERROR [SharedPool-Worker-2] 2017-11-08 17:02:32,362 CommitLogSegment.java:441 - Attempted to write commit log entry for unrecognized column family:

RE: Cassandra proxy to control read/write throughput

2017-10-31 Thread Anubhav Kale
There are some caveats with coordinator-only nodes. You can read about our experience in detail here. From: Nate McCall [mailto:n...@thelastpickle.com] Sent: Sunday, October 29, 2017 2:12 PM To: Cassandra

RE: Re: Tuning bootstrap new node

2017-10-31 Thread Anubhav Kale
You can change the YAML setting memtable_cleanup_threshold to 0.7 (from the default of 0.3). This will flush memtables to disk less often (producing fewer SSTables) and will reduce compaction time. While this won’t change the streaming time, it will reduce the overall time for your node to be healthy. From:
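The suggested change, as a cassandra.yaml excerpt (a restart is required for it to take effect; 0.7 is the value proposed in this reply):

    # cassandra.yaml
    memtable_cleanup_threshold: 0.7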