Re: Tombstone removal optimization and question

2018-11-06 Thread kurt greaves
Yes it does. Consider if it didn't and you kept writing to the same partition, you'd never be able to remove any tombstones for that partition. On Tue., 6 Nov. 2018, 19:40 DuyHai Doan Hello all > > I have tried to sum up all rules related to tombstone removal: > > >

[ANNOUNCE] StratIO's Lucene plugin fork

2018-10-18 Thread kurt greaves
Hi all, We've had confirmation from Stratio that they are no longer maintaining their Lucene plugin for Apache Cassandra. We've thus decided to fork the plugin to continue maintaining it. At this stage we won't be making any additions to the plugin in the short term unless absolutely necessary,

Re: SSTableMetadata Util

2018-10-01 Thread kurt greaves
Pranay, 3.11.3 should include all the C* binaries in /usr/bin. Maybe try reinstalling? Sounds like something got messed up along the way. Kurt On Tue, 2 Oct 2018 at 12:45, Pranay akula wrote: > Thanks Christophe, > > I have installed using rpm package I actually ran locate command to find >
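
For reference, once the binaries are in place a typical invocation looks like this (data path and SSTable generation are illustrative); it prints min/max timestamps, estimated droppable tombstones and other metadata:

    sstablemetadata /var/lib/cassandra/data/my_ks/my_table-*/mc-1-big-Data.db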

Re: TWCS + subrange repair = excessive re-compaction?

2018-09-26 Thread kurt greaves
Not any faster, as you'll still have to wait for all the SSTables to age off, as a partition level tombstone will simply go to a new SSTable and likely will not be compacted with the old SSTables. On Tue, 25 Sep 2018 at 17:03, Martin Mačura wrote: > Most partitions in our dataset span one or

Re: node replacement failed

2018-09-22 Thread kurt greaves
I don't like your cunning plan. Don't drop the system auth and distributed keyspaces, instead just change them to NTS and then do your replacement for each down node. If you're actually using auth and worried about consistency I believe 3.11 has the feature to be able to exclude nodes during a

Re: stuck with num_tokens 256

2018-09-22 Thread kurt greaves
No, that's not true. On Sat., 22 Sep. 2018, 21:58 onmstester onmstester, wrote: > > If you have problems with balance you can add new nodes using the > algorithm and it'll balance out the cluster. You probably want to stick to > 256 tokens though. > > > I read somewhere (don't remember the ref)

Re: stuck with num_tokens 256

2018-09-22 Thread kurt greaves
new clusters which i'm going to > setup? > Is the Allocation algorithm, now recommended algorithm and mature enough > to replace the Random algorithm? if its so, it should be the default one at > 4.0? > > > On Sat, 22 Sep 2018 13:41:47 +0330 *kurt greaves > >* wrote

Re: stuck with num_tokens 256

2018-09-22 Thread kurt greaves
If you have problems with balance you can add new nodes using the algorithm and it'll balance out the cluster. You probably want to stick to 256 tokens though. To reduce your # tokens you'll have to do a DC migration (best way). Spin up a new DC using the algorithm on the nodes and set a lower

Re: Recommended num_tokens setting for small cluster

2018-08-29 Thread kurt greaves
For 10 nodes you probably want to use between 32 and 64. Make sure you use the token allocation algorithm by specifying allocate_tokens_for_keyspace On Thu., 30 Aug. 2018, 04:40 Jeff Jirsa, wrote: > 3.0 has a (optional?) feature to guarantee better distribution, and the > blog focuses on 2.2. >
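
A minimal sketch of that setting in cassandra.yaml, assuming a keyspace named my_keyspace already exists with the production replication factor:

    # cassandra.yaml on each new node
    num_tokens: 32
    allocate_tokens_for_keyspace: my_keyspace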

Re: URGENT: disable reads from node

2018-08-29 Thread kurt greaves
Note that you'll miss incoming writes if you do that, so you'll be inconsistent even after the repair. I'd say best to just query at QUORUM until you can finish repairs. On 29 August 2018 at 21:22, Alexander Dejanovski wrote: > Hi Vlad, you must restart the node but first disable joining the

Re: Nodetool refresh v/s sstableloader

2018-08-29 Thread kurt greaves
Removing dev... Nodetool refresh only picks up new SSTables that have been placed in the table's directory. It doesn't account for actual ownership of the data like sstableloader does. Refresh will only work properly if the SSTables you are copying in are completely covered by that node's tokens. It
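
A sketch of the two paths, with placeholder paths, addresses and table directory id:

    # refresh: files must land in the table directory and be covered by this node's ranges
    cp /backups/my_ks/my_table/* /var/lib/cassandra/data/my_ks/my_table-<id>/
    nodetool refresh my_ks my_table

    # sstableloader: streams each row to whichever replicas actually own it
    sstableloader -d 10.0.0.1 /backups/my_ks/my_table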

Re: Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread kurt greaves
do)? > > Sent using Zoho Mail <https://www.zoho.com/mail/> > > > ==== Forwarded message > From : kurt greaves > To : "User" > Date : Wed, 29 Aug 2018 12:03:47 +0430 > Subject : Re: bigger data density with Cassandra 4.0? >

Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread kurt greaves
Good. On 28 August 2018 at 01:37, Dinesh Joshi wrote: > Although the extent of benefits depend on the specific use case, the > cluster size is definitely not a limiting factor. > > Dinesh > > On Aug 27, 2018, at 5:05 AM, kurt greaves wrote: > > I believe there are cavea

Re: 2.2 eats memory

2018-08-27 Thread kurt greaves
I'm thinking it's unlikely that top is lying to you. Are you sure that you're measuring free memory versus available memory? Cassandra will utilise the OS page cache heavily, which will cache files in memory but leave the memory able to be reclaimed if needed. Have you checked the output of free?
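
A quick way to check on any Linux host:

    free -h
    # "free" will look alarmingly low on a healthy node because of the page cache;
    # "available" is the number that matters, since cached pages are reclaimable.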

Re: bigger data density with Cassandra 4.0?

2018-08-27 Thread kurt greaves
I believe there are caveats that it will only really help if you're not using vnodes, or you have a very small cluster, and also internode encryption is not enabled. Alternatively if you're using JBOD vnodes will be marginally better, but JBOD is not a great idea (and doesn't guarantee a massive

Re: Configuration parameter to reject incremental repair?

2018-08-20 Thread kurt greaves
Yeah I meant 2.2. Keep telling myself it was 3.0 for some reason. On 20 August 2018 at 19:29, Oleksandr Shulgin wrote: > On Mon, Aug 13, 2018 at 1:31 PM kurt greaves wrote: > >> No flag currently exists. Probably a good idea considering the serious >> issues with increme

Re: JBOD disk failure

2018-08-17 Thread kurt greaves
As far as I'm aware, yes. I recall hearing someone mention tying system tables to a particular disk but at the moment that doesn't exist. On Fri., 17 Aug. 2018, 01:04 Eric Evans, wrote: > On Wed, Aug 15, 2018 at 3:23 AM kurt greaves wrote: > > Yep. It might require a full nod

Re: JBOD disk failure

2018-08-15 Thread kurt greaves
> Thank you for the answers. We are using the current version 3.11.3, so this one includes CASSANDRA-6696. > So if I get this right, losing system tables will need a full node rebuild. Otherwise repair will get the node consistent again. > Regards, > Ch

Re: JBOD disk failure

2018-08-14 Thread kurt greaves
If that disk had important data in the system tables however you might have some trouble and need to replace the entire instance anyway. On 15 August 2018 at 12:20, Jeff Jirsa wrote: > Depends on version > > For versions without the fix from Cassandra-6696, the only safe option on > single disk

Re: 90million reads

2018-08-14 Thread kurt greaves
Not a great idea to make config changes without testing. For a lot of changes you can make the change on one node and measure if there is an improvement, however. You'd probably be best to add nodes (double should be sufficient), do tuning and testing afterwards, and then decommission a few nodes

Re: Data Corruption due to multiple Cassandra 2.1 processes?

2018-08-13 Thread kurt greaves
backport referencing 11540 or re-open 11540? > Thanks for your help. > Thomas > From: kurt greaves > Sent: Monday, 13 August 2018 13:24 > To: User > Subject: Re: Data Corruption due to multiple Cassandra 2.1 processes?

Re: Configuration parameter to reject incremental repair?

2018-08-13 Thread kurt greaves
No flag currently exists. Probably a good idea considering the serious issues with incremental repairs since forever, and the change of defaults since 3.0. On 7 August 2018 at 16:44, Steinmaurer, Thomas < thomas.steinmau...@dynatrace.com> wrote: > Hello, > > > > we are running Cassandra in AWS

Re: Data Corruption due to multiple Cassandra 2.1 processes?

2018-08-13 Thread kurt greaves
Yeah that's not ideal and could lead to problems. I think corruption is only likely if compactions occur, but seems like data loss is a potential not to mention all sorts of other possible nasties that could occur running two C*'s at once. Seems to me that 11540 should have gone to 2.1 in the

Re: Hinted Handoff

2018-08-06 Thread kurt greaves
> > Does Cassandra TTL out the hints after max_hint_window_in_ms? From my > understanding, Cassandra only stops collecting hints after > max_hint_window_in_ms but can still keep replaying the hints if the node > comes back again. Is this correct? Is there a way to TTL out hints? No, but it won't
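
The window itself lives in cassandra.yaml, and stored hints can be dropped by hand; a sketch:

    # cassandra.yaml: stop *collecting* hints for a node down longer than this
    max_hint_window_in_ms: 10800000   # 3 hours (default)

    # already-stored hints are still replayed when the node returns;
    # to discard them manually instead:
    nodetool truncatehints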

Re: 3.11.2 memory leak

2018-07-22 Thread kurt greaves
Likely in the next few weeks. On Mon., 23 Jul. 2018, 01:17 Abdul Patel, wrote: > Any idea when 3.11.3 is coming in? > > On Tuesday, June 19, 2018, kurt greaves wrote: > >> At this point I'd wait for 3.11.3. If you can't, you can get away with >> backporting a few repair

Re: Limitations of Hinted Handoff OverloadedException exception

2018-07-16 Thread kurt greaves
The coordinator will refuse to send writes/hints to a node if it has a large backlog of hints (128 * #cores) already and the destination replica is one of the nodes with hints destined to it. It will still send writes to any "healthy" node (a node with no outstanding hints). The idea is to not

Re: batchstatement

2018-07-16 Thread kurt greaves
What is the primary key for the user_by_ext table? I'd assume it's ext_id, which would imply your update doesn't make sense as you can't change the primary key for a row - which would be the problem you're seeing. On Sat., 14 Jul. 2018, 06:14 Randy Lynn, wrote: > TL/DR: > - only 1 out of 14
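
A minimal sketch of the usual workaround, assuming ext_id is the partition key (column names are guesses from the thread):

    -- a primary key column cannot be UPDATEd; delete the old row, insert the new one
    BEGIN BATCH
      DELETE FROM user_by_ext WHERE ext_id = 'old-ext-id';
      INSERT INTO user_by_ext (ext_id, user_id) VALUES ('new-ext-id', 42);
    APPLY BATCH;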

Re: default_time_to_live vs TTL on insert statement

2018-07-11 Thread kurt greaves
The Datastax documentation is wrong. It won't error, and it shouldn't. If you want to fix that documentation I suggest contacting Datastax. On 11 July 2018 at 19:56, Nitan Kainth wrote: > Hi DuyHai, > > Could you please explain in what case C* will error based on documented > statement: > > You
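
A quick illustration with placeholder names; the statement-level TTL simply wins, with no error either way:

    ALTER TABLE my_ks.my_table WITH default_time_to_live = 86400;           -- 1 day
    INSERT INTO my_ks.my_table (id, val) VALUES (1, 'x') USING TTL 172800;  -- 2 days, overrides the default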

[ANNOUNCE] LDAP Authenticator for Cassandra

2018-07-05 Thread kurt greaves
We've seen a need for an LDAP authentication implementation for Apache Cassandra so we've gone ahead and created an open source implementation (ALv2) utilising the pluggable auth support in C*. Now, I'm positive there are multiple implementations floating around that haven't been open sourced,

Re: Inconsistent Quorum Read after Quorum Write

2018-07-03 Thread kurt greaves
Shouldn't happen. Any chance you could trace the queries, or have you been able to reproduce it? Also, what version of Cassandra? On Wed., 4 Jul. 2018, 06:41 Visa, wrote: > Hi all, > > We recently experienced an unexpected behavior with C* consistency. > > For example, a table t consists of 4

Re: C* in multiple AWS AZ's

2018-06-29 Thread kurt greaves
status would report rack of 1a, even though in 1e? > > Thanks in advance for the help/thoughts!! > > > On Thu, Jun 28, 2018 at 6:20 PM, kurt greaves > wrote: > >> There is a need for a repair with both DCs as rebuild will not stream all >> replicas, so unles

Re: C* in multiple AWS AZ's

2018-06-28 Thread kurt greaves
There is a need for a repair with both DCs as rebuild will not stream all replicas, so unless you can guarantee you were perfectly consistent at time of rebuild you'll want to do a repair after rebuild. On another note you could just replace the nodes but use GPFS instead of EC2 snitch, using the
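
GPFS reads its topology from cassandra-rackdc.properties; a sketch mirroring the AZ naming the EC2 snitch would infer:

    # conf/cassandra-rackdc.properties (with endpoint_snitch: GossipingPropertyFileSnitch)
    dc=us-east
    rack=1a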

Re: Re: Re: stream failed when bootstrap

2018-06-28 Thread kurt greaves
cassandra and start cassandra command one by one, right? > Only one node is executed at a time > > Dayu > > > > At 2018-06-28 11:37:43, "kurt greaves" wrote: > > Best off trying a rolling restart. > > On 28 June 2018 at 03:18, dayu wrote: > >> the outpu

Re: Re: stream failed when bootstrap

2018-06-27 Thread kurt greaves
Best off trying a rolling restart. On 28 June 2018 at 03:18, dayu wrote: > the output of nodetool describecluster > Cluster Information: > Name: online-xxx > Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch > Partitioner: org.apache.cassandra.dht.Murmur3Partitioner > Schema versions:

Re: Is it ok to add more than one node to a exist cluster

2018-06-27 Thread kurt greaves
> Dayu > At 2018-06-27 17:50:34, "kurt greaves" wrote: > Don't bootstrap nodes simultaneously unless you really know what you're doing, and you're using single tokens. It's not straightforward and will li

Re: Is it ok to add more than one node to a exist cluster

2018-06-27 Thread kurt greaves
Don't bootstrap nodes simultaneously unless you really know what you're doing, and you're using single tokens. It's not straightforward and will likely lead to data loss/inconsistencies. This applies for all current versions. On 27 June 2018 at 10:21, dayu wrote: > Hi, > I have read a

Re: 3.11.2 memory leak

2018-06-19 Thread kurt greaves
At this point I'd wait for 3.11.3. If you can't, you can get away with backporting a few repair fixes or just doing sub range repairs on 3.11.2. On Wed., 20 Jun. 2018, 01:10 Abdul Patel, wrote: > Hi All, > > Do we know what's the stable version for now if you wish to upgrade? > > On Tuesday, June

Re: Timestamp on hints file and system.hints table data

2018-06-18 Thread kurt greaves
June 2018 at 13:56, learner dba wrote: > Yes Kurt, system log is flooded with hints sent and replayed messages. > > On Monday, June 18, 2018, 7:30:34 AM EDT, kurt greaves < > k...@instaclustr.com> wrote: > > > Not sure what to make of that. Are there any log me

Re: Timestamp on hints file and system.hints table data

2018-06-18 Thread kurt greaves
RAC1 > > > > On Thu, Jun 14, 2018 at 12:45 AM, kurt greaves > wrote: > >> Does the UUID on the filename correspond with a UUID in nodetool status? >> >> Sounds to me like it could be something weird with an old node that no >> longer exists, although hin

Re:

2018-06-18 Thread kurt greaves
> > 1) Am I correct to assume that the larger page size some user session has > set - the larger portion of cluster/coordinator node resources will be > hogged by the corresponding session? > 2) Do I understand correctly that page size (imagine we have no timeout > settings) is limited by RAM and

Re: Compaction strategy for update heavy workload

2018-06-13 Thread kurt greaves
that's never deleted and really small sstables sticking around > forever. If you use really large buckets, what's the point of TWCS? > > Honestly this is such a small workload you could easily use STCS or > LCS and you'd likely never, ever see a problem. > On Wed, Jun 13, 2018 at 3:34

Re: Timestamp on hints file and system.hints table data

2018-06-13 Thread kurt greaves
e is down for months. And yes, I am surprised to look at Unix > timestamp on files. > > > > On Jun 13, 2018, at 6:41 PM, kurt greaves wrote: > > system.hints is not used in Cassandra 3. Can't explain the files though, > are you referring to the files timestamp or the

Re: Timestamp on hints file and system.hints table data

2018-06-13 Thread kurt greaves
system.hints is not used in Cassandra 3. Can't explain the files though, are you referring to the files timestamp or the Unix timestamp in the file name? Is there a node that's been down for several months? On Wed., 13 Jun. 2018, 23:41 Nitan Kainth, wrote: > Hi, > > I observed a strange

Re: Compaction strategy for update heavy workload

2018-06-13 Thread kurt greaves
TWCS is probably still worth trying. If you mean updating old rows in TWCS, "out of order updates" will only really mean you'll hit more SSTables on read. This might add a bit of complexity in your client if you're bucketing partitions (not strictly necessary), but that's about it. As long as you're
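
A sketch of day-bucketed partitions under TWCS (schema is illustrative):

    CREATE TABLE my_ks.metrics (
        sensor_id text,
        day       date,
        ts        timestamp,
        value     double,
        PRIMARY KEY ((sensor_id, day), ts)
    ) WITH compaction = {'class': 'TimeWindowCompactionStrategy',
                         'compaction_window_unit': 'DAYS',
                         'compaction_window_size': '1'};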

Re: Migrating to Reaper: Switching From Incremental to Reaper's Full Subrange Repair

2018-06-13 Thread kurt greaves
Not strictly necessary but probably a good idea as you don't want two separate pools of SSTables unnecessarily. Also if you've set "only_purge_repaired_tombstones" you'll need to turn that off. On Wed., 13 Jun. 2018, 23:06 Fd Habash, wrote: > For those who are using Reaper … > > > > Currently,
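
Turning that off is a compaction-option change; note that ALTER replaces the whole compaction map, so restate your existing options (the strategy shown here is a placeholder):

    ALTER TABLE my_ks.my_table WITH compaction = {
        'class': 'SizeTieredCompactionStrategy',
        'only_purge_repaired_tombstones': 'false'};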

Re: Cassandra 3.0.X migarte to VPC

2018-06-07 Thread kurt greaves
> > I meant migrating to the gossiping snitch while adding the new dc. New dc will be > empty so all the data will be streamed based on the snitch property chosen. Should work fine on the new DC, as long as the original DC is using a snitch that supports datacenters - then just don't mix and match snitches

Re: nodetool (2.1.18) - Xmx, ParallelGCThreads, High CPU usage

2018-05-29 Thread kurt greaves
(43 on our large machine) and running with Xmx128M or Xmx31G > (derived from $MAX_HEAP_SIZE). For both Xmx, we saw the high CPU caused by > nodetool. > Regards, > Thomas > From: kurt greaves [mailto:k...@instaclustr.com] > Sent: Tuesday, 29.

Re: nodetool (2.1.18) - Xmx, ParallelGCThreads, High CPU usage

2018-05-29 Thread kurt greaves
https://issues.apache.org/jira/browse/CASSANDRA-14475 > Thanks, > Thomas > From: kurt greaves [mailto:k...@instaclustr.com] > Sent: Tuesday, 29 May 2018 05:54 > To: User > Subject: Re: nodetool (2.1.18) - Xmx, ParallelGCThreads, High CPU usag

Re: nodetool (2.1.18) - Xmx, ParallelGCThreads, High CPU usage

2018-05-28 Thread kurt greaves
> > 1) nodetool is reusing the $MAX_HEAP_SIZE environment variable, thus if we > are running Cassandra with e.g. Xmx31G, nodetool is started with Xmx31G as > well This was fixed in 3.0.11/3.10 in CASSANDRA-12739. Not sure why it didn't make

Re: performance on reading only the specific nonPk column

2018-05-21 Thread kurt greaves
Every populated column will be retrieved from disk and the requested column will then be sliced out in memory and sent back. On 21 May 2018 at 08:34, sujeet jog wrote: > Folks, > > consider a table with 100 metrics with (id , timestamp ) as key, > if one wants to
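
In other words, for the table shape described in the thread (names assumed), these two reads cost the same at the disk level; only the response differs:

    SELECT metric_42 FROM metrics WHERE id = 'abc' AND timestamp = '2018-05-21 00:00:00';
    SELECT *         FROM metrics WHERE id = 'abc' AND timestamp = '2018-05-21 00:00:00';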

Re: Invalid metadata has been detected for role

2018-05-17 Thread kurt greaves
Can you post the stack trace and your version of Cassandra? On Fri., 18 May 2018, 09:48 Abdul Patel, wrote: > Hi > > I had to decommission one dc , now while adding back the same nodes ( i > used nodetool decommission) they both get added fine and i also see them in >

Re: row level atomicity and isolation

2018-05-16 Thread kurt greaves
Atomicity and isolation are only guaranteed within a replica. If you have multiple concurrent requests across replicas last timestamp will win. You can get better isolation using LWT which uses paxos under the hood. On 16 May 2018 at 08:55, Rajesh Kishore wrote: > Hi, >
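
A sketch of the difference with a hypothetical table:

    -- plain write: across replicas, the highest timestamp simply wins
    UPDATE accounts SET balance = 90 WHERE id = 1;

    -- LWT: serialized through Paxos, at a latency cost
    UPDATE accounts SET balance = 90 WHERE id = 1 IF balance = 100;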

Re: Suggestions for migrating data from cassandra

2018-05-15 Thread kurt greaves
COPY might work but over hundreds of gigabytes you'll probably run into issues if you're overloaded. If you've got access to Spark that would be an efficient way to pull down an entire table and dump it out using the spark-cassandra-connector. On 15 May 2018 at 10:59, Jing Meng
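
If COPY is attempted anyway, tuning its paging options can help it survive a large table (names are placeholders):

    cqlsh -e "COPY my_ks.my_table TO 'my_table.csv' WITH PAGESIZE=1000 AND PAGETIMEOUT=60"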

Re: dtests failing with - ValueError: unsupported hash type md5

2018-05-10 Thread kurt greaves
What command did you run? Probably worth checking that cqlsh is installed in the virtual environment and that you are executing pytest from within the virtual env. On 10 May 2018 at 05:06, Rajiv Dimri wrote: > Hi All, > > > > We have setup a dtest environment to run

Re: compaction: huge number of random reads

2018-05-07 Thread kurt greaves
If you've got small partitions/small reads you should test lowering your compression chunk size on the table and disabling read ahead. This sounds like it might just be a case of read amplification. On Tue., 8 May 2018, 05:43 Kyrylo Lebediev, wrote: > Dear Experts, > >
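
A sketch of both knobs (table name and device are placeholders):

    -- default chunk size is 64 KB; small reads can waste most of that
    ALTER TABLE my_ks.my_table WITH compression =
        {'class': 'LZ4Compressor', 'chunk_length_in_kb': '4'};

    # read-ahead, in 512-byte sectors, on the data disk
    blockdev --setra 8 /dev/sdX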

Re: Version Upgrade

2018-05-03 Thread kurt greaves
> > In other words, if I am running Cassandra 1.2.x and upgrading to 2.0.x, > 2.0.x will continue to read all the old Cassandra 1.2.x table. However, if > I then want to upgrade to Cassandra 2.1.x, I’d better make sure all tables > have been upgraded to 2.0.x before making the next upgrade.

Re: Shifting data to DCOS

2018-05-02 Thread kurt greaves
workflow? > Can anyone please suggest the best way to move data from one cluster to > another? > > Any help will be greatly appreciated. > > On Tue, Apr 17, 2018 at 6:52 AM, Faraz Mateen <fmat...@an10.io> wrote: > >> Thanks for the response guys. >> >> L

Re: Determining active sstables and table- dir

2018-05-01 Thread kurt greaves
In 2.2 it's cf_id from system.schema_columnfamilies. If it's not then that's a bug. From 2.2 we stopped including table name in the SSTable name, so whatever directory contains the SSTables is the active one. Conversely, if you've dropped a table and re-added it, the directory without any SSTables
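
A sketch of the lookup on 2.2 (keyspace/table names are placeholders):

    SELECT cf_id FROM system.schema_columnfamilies
     WHERE keyspace_name = 'my_ks' AND columnfamily_name = 'my_table';
    -- the live directory is my_table-<cf_id with the dashes stripped>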

Re: Regular NullPointerExceptions from `nodetool compactionstats` on 3.7 node

2018-04-25 Thread kurt greaves
Typically have seen that in the past when the node is overloaded. Is that a possibility for you? If it works consistently after restarting C* it's likely the issue. On 20 April 2018 at 19:27, Paul Pollack wrote: > Hi all, > > We have a cluster running on Cassandra 3.7

Re: Memtable type and size allocation

2018-04-23 Thread kurt greaves
Hi Vishal, In Cassandra 3.11.2, there are 3 choices for the type of Memtable > allocation and as per my understanding, if I want to keep Memtables on JVM > heap I can use heap_buffers and if I want to store Memtables outside of JVM > heap then I've got 2 options offheap_buffers and

Re: SSTable count in Nodetool tablestats(LevelCompactionStrategy)

2018-04-20 Thread kurt greaves
I'm currently investigating this issue on one of our clusters (but much worse, we're seeing >100 SSTables and only 2 in the levels) on 3.11.1. What version are you using? It's definitely a bug. On 17 April 2018 at 10:09, wrote: > Dear Community, > > > > One of the tables

Re: Phantom growth resulting automatically node shutdown

2018-04-19 Thread kurt greaves
This was fixed (again) in 3.0.15. https://issues.apache.org/jira/browse/CASSANDRA-13738 On Fri., 20 Apr. 2018, 00:53 Jeff Jirsa, wrote: > There have also been a few sstable ref counting bugs that would over > report load in nodetool ring/status due to overlapping normal and >

Re: Token range redistribution

2018-04-19 Thread kurt greaves
That's assuming your data is perfectly consistent, which is unlikely. Typically that strategy is a bad idea and you should avoid it. On Thu., 19 Apr. 2018, 07:00 Richard Gray, <richard.g...@smxemail.com> wrote: > On 2018-04-18 21:28, kurt greaves wrote: > > replacing. Simply remo

Re: Token range redistribution

2018-04-18 Thread kurt greaves
A new node always generates more tokens. A replaced node using replace_address[_on_first_boot] will reclaim the tokens of the node it's replacing. Simply removing and adding back a new node without replace address will end up with the new node having different tokens, which would mean data loss in

Re: about the tombstone and hinted handoff

2018-04-16 Thread kurt greaves
I don't think that's true/maybe that comment is misleading. Tombstones AFAIK will be propagated by hints, and the hint system doesn't do anything to check if a particular row has been tombstoned. To the node receiving the hints it just looks like it's receiving a bunch of writes, it doesn't know

Re: Shifting data to DCOS

2018-04-16 Thread kurt greaves
Sorry for the delay. > Is the problem related to token ranges? How can I find out token range for > each node? > What can I do to further debug and root cause this? Very likely. See below. My previous cluster has 3 nodes but replication factor is 2. I am not > exactly sure how I would handle

Re: Many SSTables only on one node

2018-04-09 Thread kurt greaves
If there were no other messages about anti-compaction similar to: > > SSTable YYY (ranges) will be anticompacted on range [range] Then no anti-compaction needed to occur and yes, it was not the cause. On 5 April 2018 at 13:52, Dmitry Simonov wrote: > Hi, Evelyn! > >

Re: Shifting data to DCOS

2018-04-06 Thread kurt greaves
Without looking at the code I'd say maybe the keyspaces are displayed purely because the directories exist (but it seems unlikely). The process you should follow instead is to exclude the system keyspaces for each node and manually apply your schema, then upload your CFs into the correct

Re: auto_bootstrap for seed node

2018-04-03 Thread kurt greaves
Setting auto_bootstrap on seed nodes is unnecessary and irrelevant. If the node is a seed it will ignore auto_bootstrap and it *will not* bootstrap. On 28 March 2018 at 15:49, Ali Hubail wrote: > "it seems that we still need to keep bootstrap false?" > > Could you shed
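
A minimal sketch; on any node that appears in its own seeds list, the flag is simply ignored:

    # cassandra.yaml
    auto_bootstrap: true        # no effect on a seed; it will not bootstrap
    seed_provider:
        - class_name: org.apache.cassandra.locator.SimpleSeedProvider
          parameters:
              - seeds: "10.0.0.1,10.0.0.2"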

Re: Execute an external program

2018-04-03 Thread kurt greaves
Correct. Note that both triggers and CDC aren't widely used yet so be sure to test. On 28 March 2018 at 13:02, Earl Lapus wrote: > > On Wed, Mar 28, 2018 at 8:39 AM, Jeff Jirsa wrote: > >> CDC may also work for newer versions, but it’ll happen after the

Re: replace dead node vs remove node

2018-03-25 Thread kurt greaves
Didn't read the blog but it's worth noting that if you replace the node and give it a *different* ip address repairs will not be necessary as it will receive writes during replacement. This works as long as you start up the replacement node before HH window ends.
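
A sketch of the replacement flag, set on the new node before its first start (the dead node's IP is a placeholder):

    # cassandra-env.sh
    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address_first_boot=10.0.0.5"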

Re: Nodetool Repair --full

2018-03-18 Thread kurt greaves
Worth noting that if you have racks == RF you only need to repair one rack to repair all the data in the cluster if you *don't* use -pr. Also note that full repairs on >=3.0 cause anti-compactions and will mark things as repaired, so once you start repairs you need to keep repairing to ensure you
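
For reference, with a placeholder keyspace name:

    nodetool repair -full my_ks        # all ranges this node replicates
    nodetool repair -full -pr my_ks    # primary ranges only; must then run on every node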

Re: Best way to Drop Tombstones/after GC Grace

2018-03-14 Thread kurt greaves
At least set GCGS == max_hint_window_in_ms that way you don't effectively disable hints for the table while your compaction is running. Might be preferable to use nodetool garbagecollect if you don't have enough disk space for a major compaction. Also worth noting you should do a splitting major
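
A sketch of both suggestions (3-hour hint window assumed; table names are placeholders):

    -- keep gc_grace_seconds at or above max_hint_window_in_ms (10800000 ms = 10800 s)
    ALTER TABLE my_ks.my_table WITH gc_grace_seconds = 10800;

    # per-SSTable single-table tombstone cleanup, available from 3.10
    nodetool garbagecollect my_ks my_table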

Re: What versions should the documentation support now?

2018-03-13 Thread kurt greaves
> > I’ve never heard of anyone shipping docs for multiple versions, I don’t > know why we’d do that. You can get the docs for any version you need by > downloading C*, the docs are included. I’m a firm -1 on changing that > process. We should still host versioned docs on the website however.

Re: Removing initial_token parameter

2018-03-09 Thread kurt greaves
Correct, tokens will be stored in the node's system tables after the first boot, so feel free to remove them (although it's not really necessary) On 9 Mar. 2018 20:16, "Mikhail Tsaplin" wrote: > Is it safe to remove initial_token parameter on a cluster created by > snapshot
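
Easy to verify after removing the parameter:

    -- tokens persist in the local system table once the node has bootstrapped
    SELECT tokens FROM system.local;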

Re: Cassandra/Spark failing to process large table

2018-03-08 Thread kurt greaves
Note that read repairs only occur for QUORUM/equivalent and higher, and also with a 10% (default) chance on anything less than QUORUM (ONE/LOCAL_ONE). This is configured at the table level through the dclocal_read_repair_chance and read_repair_chance settings (which are going away in 4.0). So if
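
Both knobs are per-table options; a sketch with placeholder names:

    ALTER TABLE my_ks.my_table
     WITH dclocal_read_repair_chance = 0.1   -- the 10% default mentioned above
      AND read_repair_chance = 0.0;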

Re: One time major deletion/purge vs periodic deletion

2018-03-07 Thread kurt greaves
The important point to consider is whether you are deleting old data or recently written data. How old/recent depends on your write rate to the cluster and there's no real formula. Basically you want to avoid deleting a lot of old data all at once because the tombstones will end up in new SSTables

Re: Cassandra 2.1.18 - Concurrent nodetool repair resulting in > 30K SSTables for a single small (GBytes) CF

2018-03-06 Thread kurt greaves
> > What we did have was some sort of overlap between our daily repair > cronjob and the newly added node still in the process of joining. Don't know if > this sort of combination might be causing trouble. I wouldn't be surprised if this caused problems. Probably want to avoid that, with waiting a

Re: Cassandra 2.1.18 - Concurrent nodetool repair resulting in > 30K SSTables for a single small (GBytes) CF

2018-03-04 Thread kurt greaves
Repairs with vnodes is likely to cause a lot of small SSTables if you have inconsistencies (at least 1 per vnode). Did you have any issues when adding nodes, or did you add multiple nodes at a time? Anything that could have lead to a bit of inconsistency could have been the cause. I'd probably

Re: Right sizing Cassandra data nodes

2018-02-28 Thread kurt greaves
The problem with higher densities is operations, not querying. When you need to add nodes/repair/do any streaming operation having more than 3TB per node becomes more difficult. It's certainly doable, but you'll probably run into issues. Having said that, an insert only workload is the best

Re: The home page of Cassandra is mobile friendly but the link to the third parties is not

2018-02-28 Thread kurt greaves
Already addressed in CASSANDRA-14128, however waiting on review/comments regarding what we actually do with this page. If you want to bring attention to JIRAs, the user list is probably appropriate. I'd avoid spamming it too much though. On 26

Re: Memtable flush -> SSTable: customizable or same for all compaction strategies?

2018-02-21 Thread kurt greaves
> > Also, I was wondering if the key cache maintains a count of how many local > accesses a key undergoes. Such information might be very useful for > compactions of sstables by splitting data by frequency of use so that those > can be preferentially compacted. No we don't currently have metrics

Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread kurt greaves
> > Instead of saying "Make X better" you can quantify "Here's how we can make > X better" in a jira and the conversation will continue with interested > parties (opening jiras are free!). Being combative and insulting project on > mailing list may help vent some frustrations but it is counter

Re: Memtable flush -> SSTable: customizable or same for all compaction strategies?

2018-02-20 Thread kurt greaves
Probably a lot of work but it would be incredibly useful for vnodes if flushing was range aware (to be used with RangeAwareCompactionStrategy). The writers are already range aware for JBOD, but that's not terribly valuable ATM. On 20 February 2018 at 21:57, Jeff Jirsa wrote: >

Re: vnode random token assignment and replicated data antipatterns

2018-02-20 Thread kurt greaves
> > Outside of rack awareness, would the next primary ranges take the replica > ranges? Yes.

Re: Roadmap for 4.0

2018-02-15 Thread kurt greaves
means an extended testing cycle. If all of those patches > landed tomorrow, I'd still expect us to be months away from a release, > because we need to bake the next major - there's too many changes to throw > out an alpha/beta/rc and hope someone actually runs it. > > I don't belie

Re: Rapid scaleup of cassandra nodes with snapshots and initial_token in the yaml

2018-02-15 Thread kurt greaves
Ben did a talk that might have some useful information. It's much more complicated with vnodes though and I doubt you'll be able to get it to be as rapid as you'd want. > sets up schema to match This shouldn't be

Re: node restart causes application latency

2018-02-12 Thread kurt greaves
Actually, it's not really clear to me why disablebinary and thrift are necessary prior to drain, because they happen in the same order during drain anyway. It also really doesn't make sense that disabling gossip after drain would make a difference here, because it should be already stopped. This

Re: node restart causes application latency

2018-02-12 Thread kurt greaves
Drain will take care of stopping gossip, and does a few tasks before stopping gossip (stops batchlog, hints, auth, cache saver and a few other things). I'm not sure why this causes a side effect when you restart the node, but there should be no need to issue a disablegossip anyway, just leave that
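
So the restart sequence can be as simple as the following (service name varies by packaging):

    nodetool drain               # flushes memtables; stops batchlog, hints, gossip, client transports
    systemctl restart cassandra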

Roadmap for 4.0

2018-02-11 Thread kurt greaves
Hi friends, *TL;DR: Making a plan for 4.0, ideally everyone interested should provide up to two lists, one for tickets they can contribute resources to getting finished, and one for features they think would be desirable for 4.0, but not necessarily have the resources to commit to helping with.*

Re: Heavy one-off writes best practices

2018-02-04 Thread kurt greaves
> > Would you know if there is evidence that inserting skinny rows in sorted > order (no batching) helps C*? This won't have any effect as each insert will be handled separately by the coordinator (or a different coordinator, even). Sorting is also very unlikely to help even if you did batch.

Re: Nodes show different number of tokens than initially

2018-02-01 Thread kurt greaves
So one time I tried to understand why only a single node could have a token, and it appeared that it came over the fence from facebook and has been kept ever since. Personally I don't think it's necessary, and agree that it is kind of problematic (but there's probably lots of stuff that relies on

Re: Not what I‘ve expected Performance

2018-02-01 Thread kurt greaves
start more workers in parallel, which boosts throughput in my example, but is still way too slow and far from needing to be throttled. And that is what I actually expected when 100 processes start hammering the database cluster. Definitely I'll give your code a try. 2018-02-01 6:36 GMT+01:00

Re: Nodes show different number of tokens than initially

2018-01-31 Thread kurt greaves
> > I don’t know why this is a surprise (maybe because people like to talk > about multiple rings, but the fact that replication strategy is set per > keyspace and that you could use SimpleStrategy in a multiple dc cluster > demonstrates this), but we can chat about that another time This is

Re: Security Updates

2018-01-31 Thread kurt greaves
Regarding security releases, nothing currently exists to notify users when security related patches are released. At the moment I imagine announcements would only be made in NEWS.txt or on the user mailing list... but only if you're lucky. On 31 January 2018 at 19:18, Michael Shuler

Re: Upgrading sstables not using all available compaction slots on version 2.2

2018-01-31 Thread kurt greaves
Would you be able to create a JIRA ticket for this? Not sure if this is still a problem in 3.0+ but worth creating a ticket to investigate. It'd be really helpful if you could try and reproduce on 3.0.15 or 3.11.1 to see if it's an issue there as well.

Re: group by select queries

2018-01-31 Thread kurt greaves
y_id ;

     account_id | security_id | counter | avg_exec_price | quantity | update_time
    ------------+-------------+---------+----------------+----------+-------------
         user_1 |        AMZN |       2 |         1239.2 |     1011 |

Re: Not what I‘ve expected Performance

2018-01-31 Thread kurt greaves
How are you copying? With CQLSH COPY or your own script? If you've got spark already it's quite simple to copy between tables and it should be pretty much as fast as you can get it. (you may even need to throttle). There's some sample code here (albeit it's copying between clusters but easily

Re: TWCS not deleting expired sstables

2018-01-31 Thread kurt greaves
From: Kenneth Brotman <kenbrot...@yahoo.com.INVALID> Date: Tuesday, January 30, 2018 at 7:37 AM To: <user@cassandra.apache.org> Subject: RE: TWCS not deleting expired sstables > Wow! It’s in the DataStax docum

Re: Cleanup blocking snapshots - Options?

2018-01-31 Thread kurt greaves
perspective, a bit annoying right now :) > Have asked on https://issues.apache.org/jira/browse/CASSANDRA-13873 regarding a backport to 2.1, but possibly won’t get attention, cause the ticket has been resolved for 2.2+ already. > Regards, > Thomas
