Re: ALL range query monitors failing frequently

2017-06-28 Thread kurt greaves
You're correct in that the timeout is only driver side. The server will have its own timeouts configured in the cassandra.yaml file. I suspect either that you have a node down in your cluster (or 4), or your queries are gradually getting slower. This kind of aligns with the slow query statements
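The server-side timeouts referred to live in cassandra.yaml; as a reference, the stock 3.x defaults look roughly like this (values assumed from a default config):

    # cassandra.yaml (server-side timeouts, in milliseconds)
    read_request_timeout_in_ms: 5000
    range_request_timeout_in_ms: 10000
    write_request_timeout_in_ms: 2000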

Re: Restore Snapshot

2017-06-28 Thread kurt greaves
There are many scenarios where it can be useful, but to address what seems to be your main concern; you could simply restore and then only read at ALL until your repair completes. If you use snapshot restore with commitlog archiving you're in a better state, but granted the case you described can

Re: Running Cassandra in Integration Tests

2017-04-28 Thread kurt greaves
Use ccmlib. https://github.com/pcmanus/ccm On 28 April 2017 at 12:59, Matteo Moci wrote: > Sorry for bumping this old thread, but what would be your suggestion for > programmatically start/stop nodes in a cluster? > > I'd like to make some experiments and perform QUORUM writes
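For example, a minimal ccm session for spinning a local cluster up and down (commands as documented in the ccm README; version and cluster name are placeholders):

    ccm create test -v 3.11.1 -n 3 -s   # create and start a 3-node 3.11.1 cluster
    ccm node2 stop                      # stop a single node
    ccm node2 start                     # bring it back
    ccm remove test                     # tear the cluster down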

Re: nodetool repair failure

2017-07-30 Thread kurt greaves
You need to check the node that failed validation to find the relevant error. The IP should be in the logs of the node you started repair on. You shouldn't run multiple repairs on the same table from multiple nodes unless you really know what you're doing and are not using vnodes. The failure you are

Re: Maximum and recommended storage per node

2017-07-28 Thread kurt greaves
There are many different recommendations floating around, typically the limit depends on how well you know Cassandra and your workload. If your workload is CPU bound, you should go for more, less dense nodes. If not, you can sustain higher data density per node. Typically I'd say the usable range

Re: Questions on time series use case, tombstones, TWCS

2017-08-09 Thread kurt greaves
> > With STCS, estimated droppable tombstones being always 0.0 (thus also no > automatic single SSTable compaction may happen): Is this a matter of not > writing with TTL? If yes, would enabling TTL with STCS improve the disk > reclaim situation, cause then single SSTAble compactions will kick in?

Re: Cassandra isn't compacting old files

2017-07-31 Thread kurt greaves
How long is your ttl and how much data do you write per day (ie, what is the difference in disk usage over a day)? Did you always TTL? I'd say it's likely there is live data in those older sstables but you're not generating enough data to push new data to the highest level before it expires.

Re: How to minimize side effects induced by tombstones when using deletion?

2017-08-01 Thread kurt greaves
> Also, if we repaired once successfully, will the next repair process take a more reasonable time? Depends on if there was a lot of inconsistent data to repair in the first place. Also full repairs or incremental? Repairs are complicated and tricky to get working efficiently. If you're using

Re: rebuild constantly fails, 3.11

2017-08-08 Thread kurt greaves
If the error is reproducible can you upload the logs to a gist from the same time period as when the error occurs?​

Re: Creating a copy of a C* cluster

2017-08-07 Thread kurt greaves
The most effective way to "divorce" it is to remove connectivity between the datacentres. I would put firewall rules in place between the DCs to stop them from communicating, and then do a rolling restart of one of the DCs. You should be left with 2 datacentres that see each other as down, and on each

Re: Getting all unique keys

2017-08-18 Thread kurt greaves
You can SELECT DISTINCT in CQL, however I would recommend against such a pattern as it is very unlikely to be efficient, and prone to errors. A distinct query will search every partition for the first live cell, which could be buried behind a lot of tombstones. It's safe to say at some point you
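For reference, DISTINCT only applies to partition key columns — a sketch against a hypothetical table:

    -- hypothetical table with PRIMARY KEY ((user_id), event_time)
    SELECT DISTINCT user_id FROM my_keyspace.events;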

Re: Moving all LCS SSTables to a repaired state

2017-08-18 Thread kurt greaves
You need to run an incremental repair for SSTables to be marked repaired. However, an SSTable will only be marked repaired if all of the data in it is repaired during that repair; otherwise an anticompaction will occur and split the unrepaired data into its own SSTable. It's

Re: Cassandra isn't compacting old files

2017-08-22 Thread kurt greaves
LCS major compaction on 2.2 should compact each level to have a single SSTable. It seems more likely to me that you are simply not generating enough data to require compactions in L3 and most data is TTL'ing before it gets there. Out of curiosity, what does sstablemetadata report for Estimated

Re: Moving all LCS SSTables to a repaired state

2017-08-20 Thread kurt greaves
Correction: Full repairs do mark SSTables as repaired in 2.2 (CASSANDRA-7586). My mistake, I thought that was only introduced in 3.0. Note that if mixing full and incremental repairs you probably want to be using at least 2.2.10 because of

Re: Cassandra crashes....

2017-08-22 Thread kurt greaves
sounds like Cassandra is being killed by the oom killer. can you check dmesg to see if this is the case? sounds a bit absurd with 256g of memory but could be a config problem.
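A quick way to confirm (standard Linux tooling, not Cassandra-specific):

    dmesg | grep -i -E "killed process|out of memory"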

Re: Bootstrapping a node fails because of compactions not keeping up

2017-08-22 Thread kurt greaves
What version are you running? 2.2 has an improvement that will retain levels when streaming and this shouldn't really happen. If you're on 2.1 best bet is to upgrade

Re: Bootstrapping a node fails because of compactions not keeping up

2017-08-23 Thread kurt greaves
Well, that sucks. Be interested if you could find out if any of the streamed SSTables are retaining their levels. To answer your questions: 1) No. However, you could set your nodes to join in write_survey mode, which will stop them from joining the ring and you can initiate the join over JMX when
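A rough sketch of the write_survey approach (startup flag and JMX-backed nodetool command as I understand them — verify the behaviour on your version before relying on it):

    # start the node receiving writes but without joining the ring, e.g. in cassandra-env.sh
    JVM_OPTS="$JVM_OPTS -Dcassandra.write_survey=true"
    # later, once compactions have caught up, drive the join (this invokes joinRing over JMX)
    nodetool join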

Re: Cassandra isn't compacting old files

2017-08-23 Thread kurt greaves
Ignore me, I was getting the major compaction for LCS mixed up with STCS. Estimated droppable tombstones tends to be fairly accurate. If your SSTables in level 2 have that many tombstones I'd say that's definitely the reason L3 isn't being compacted. As for how you got here in the first place,

Re: C* 3 node issue -Urgent

2017-08-23 Thread kurt greaves
The cassandra user requires QUORUM consistency to be achieved for authentication. Normal users only require ONE. I suspect your system_auth keyspace has an RF of 1, and the node that owns the cassandra user's data is down. Steps to recover: 1. Turn off authentication on all the nodes 2. Restart
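Once auth is working again, bumping the replication of system_auth and repairing it avoids a repeat — a sketch (keyspace RF and DC name are only examples):

    ALTER KEYSPACE system_auth
        WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3};
    -- then, on each node:
    -- nodetool repair system_auth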

Re: Bootstrapping a node fails because of compactions not keeping up

2017-08-23 Thread kurt greaves
> ​1) You mean restarting the node in the middle of the bootstrap with > join_ring=false? Would this option require me to issue a nodetool boostrap > resume, correct? I didn't know you could instruct the join via JMX. Would > it be the same of the nodetool boostrap command? write_survey is

Re: C* 3 node issue -Urgent

2017-08-23 Thread kurt greaves
Common trap. It's an unfortunate default that is not so easy to change.​

Re: Bootstrapping a node fails because of compactions not keeping up

2017-08-23 Thread kurt greaves
> > But if it also streams, it means I'd still be under-pressure if I am not > mistaken. I am under the assumption that the compactions are the by-product > of streaming too many SStables at the same time, and not because of my > current write load. > Ah yeah I wasn't thinking about the capacity

Re: Repairs on 2.1.12

2017-05-10 Thread kurt greaves
never seen a repair loop, seems very unlikely. when you say "on a ring" what do you mean? what arguments are you passing to repair? On 10 May 2017 03:22, "Mark Furlong" wrote: I have a large cluster running a -dc repair on a ring which has been running for nearly two

Re: Repairs on 2.1.12

2017-05-11 Thread kurt greaves
to clarify, what exactly was your repair command, and in reference to a ring did you mean the DC or the cluster, and has the repair been running for 2 weeks or is that in reference to the "ring"? It would be helpful if you provided the relevant logs as well, also, the cassandra version you are

Re: Smart Table creation for 2D range query

2017-05-08 Thread kurt greaves
Note that will not give you the desired range queries of 0 <= x <= 1 and 0 <= y <= 1. Something akin to Jon's solution could give you those range queries if you made the x and y components part of the clustering key. For example, a space of (1,1) could contain all x,y coordinates where x and y
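A sketch of that idea (table and bucketing scheme are hypothetical): bucket the plane into coarse cells as the partition key and keep x and y as clustering columns. CQL will then let you range-restrict the first clustering column within a cell; the second coordinate has to be filtered client-side.

    CREATE TABLE points (
        cell_x int,            -- e.g. floor(x)
        cell_y int,            -- e.g. floor(y)
        x double,
        y double,
        id uuid,
        PRIMARY KEY ((cell_x, cell_y), x, y, id)
    );

    SELECT * FROM points WHERE cell_x = 0 AND cell_y = 0 AND x >= 0 AND x <= 1;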

Re: Materialize View in production

2017-05-08 Thread kurt greaves
Generally we still don't consider them stable and you should avoid using them for the moment. As you can see on my favourite search, the list of open bugs for MV's is not small, and there are some scary ones in there: https://issues.apache.org/jira/browse/CASSANDRA-13127?filter=12340733 On 8 May

Re: Question: Large partition warning

2017-06-14 Thread kurt greaves
Looks like you've hit a bug (not the first time I've seen this in relation to C* configs). compaction_large_partition_warning_threshold_mb resolves to an int, and in the codebase is represented in bytes. 4096 * 1024 * 1024 and you've got some serious overflow. Granted, you should have this warning
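The arithmetic, for anyone curious (assuming the byte value is held in a signed 32-bit int):

    4096 MB * 1024 * 1024 = 4,294,967,296 bytes = 2^32
    2^32 wraps a signed 32-bit int, so the configured threshold no longer means 4096 MB at all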

Re: Question: Large partition warning

2017-06-15 Thread kurt greaves
fyi ticket already existed for this, I've submitted a patch that fixes this specific issue but it looks like there are a few other properties that will suffer from the same. As I said on the ticket, we should probably fix these up even though setting things this high is generally bad practice. If

Re: [Cassandra] nodetool compactionstats not showing pending task.

2017-05-02 Thread kurt greaves
I believe this is a bug with the estimation of tasks, however not aware of any JIRA that covers the issue. On 28 April 2017 at 06:19, Abhishek Kumar Maheshwari < abhishek.maheshw...@timesinternet.in> wrote: > Hi , > > > > I will try with JMX but I try with tpstats. In tpstats its showing pending

Re: Bootstrapping node on Cassandra 3.7 causes cluster-wide performance issues

2017-09-11 Thread kurt greaves
What version are you using? There are improvements to streaming with LCS in 2.2. Also, are you unthrottling compaction throughput while the node is bootstrapping? ​

Re: load distribution that I can't explain

2017-09-11 Thread kurt greaves
Your first query will effectively have to perform table scans to satisfy what you are asking. If a query requires ALLOW FILTERING to be specified, it means that Cassandra can't really optimise that query in any way and it's going to have to query a lot of data (all of it...) to satisfy the result.

Re: Bootstrapping node on Cassandra 3.7 causes cluster-wide performance issues

2017-09-11 Thread kurt greaves
> > > Kurt - We're on 3.7, and our approach was to try thorttling compaction > throughput as much as possible rather than the opposite. I had found some > resources that suggested unthrottling to let it get it over with, but > wasn't sure if this would really help in our situation since the I/O

Re: Re[6]: Modify keyspace replication strategy and rebalance the nodes

2017-09-18 Thread kurt greaves
I haven't completely thought through this, so don't just go ahead and do it. Definitely test first. Also if anyone sees something terribly wrong don't be afraid to say. Seeing as you're only using SimpleStrategy and it doesn't care about racks, you could change to SimpleSnitch, or

RE: Multi-node repair fails after upgrading to 3.0.14

2017-09-19 Thread kurt greaves
CFs > being kicked off on all nodes in parallel, without all the magic behind the > scene introduced by incremental repairs, even if not used, as > anticompaction even with –full has been introduced with 2.2+ J > > > > > > Regards, > > Thomas > > > > *From:* kurt

Re: Multi-node repair fails after upgrading to 3.0.14

2017-09-18 Thread kurt greaves
https://issues.apache.org/jira/browse/CASSANDRA-13153 implies full repairs still trigger anti-compaction on non-repaired SSTables (if I'm reading that right), so you might need to make sure you don't run multiple repairs at the same time across your nodes (if you're using vnodes), otherwise could still

Re: ConsitencyLevel and Mutations : Behaviour if the update of the commitlog fails

2017-09-18 Thread kurt greaves
> ​Does the coordinator "cancel" the mutation on the "committed" nodes (and > how)? No. Those mutations are applied on those nodes. > Is it an heuristic case where two nodes have the data whereas they > shouldn't and we hope that HintedHandoff will replay the mutation ? Yes. But really you

Re: Drastic increase in disk usage after starting repair on 3.7

2017-09-20 Thread kurt greaves
repair does overstream by design, so if that node is inconsistent you'd expect a bit of an increase. if you've got a backlog of compactions that's probably due to repair and likely the cause of the increase. if you're really worried you can rolling restart to stop the repair, otherwise maybe try

Re: From SimpleStrategy to DCs approach

2017-09-15 Thread kurt greaves
You can add a tiny node with 3 tokens. it will own a very small amount of data and be responsible for replicas of that data and thus included in quorum queries for that data. What is the use case? This won't give you any real improvement in meeting consistency.

Re: Maturity and Stability of Enabling CDC

2017-09-17 Thread kurt greaves
I don't believe it's used by many, if any. it certainly hasn't had enough attention to determine it production ready, nor has it been out long enough for many people to be in a version where cdc is available. FWIW I've never even seen any inquiries about using it. On 17 Sep. 2017 13:18, "Michael

Re: Re[4]: Modify keyspace replication strategy and rebalance the nodes

2017-09-14 Thread kurt greaves
If you have racks configured and lose nodes you should replace the node with one from the same rack. You then need to repair, and definitely don't decommission until you do. Also 40 nodes with 256 vnodes is not a fun time for repair. On 15 Sep. 2017 03:36, "Dominik Petrovic"

Re: Re[4]: Modify keyspace replication strategy and rebalance the nodes

2017-09-14 Thread kurt greaves
Sorry, that only applies if you're using NTS. You're right that SimpleStrategy won't work very well in this case. To migrate you'll likely need to do a DC migration to ensure no downtime, as replica placement will change even if RF stays the same. On 15 Sep. 2017 08:26, "kurt greave

Re: add new nodes in two DCs at the same time

2017-09-22 Thread kurt greaves
Theoretically yes, but you should stick to 1 node at a time to keep things simple. only add multiple nodes simultaneously if you really know what you're doing and have good reason to. Also by default Cassandra will stop you from adding multiple nodes at once unless you pass certain flags.
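The guard being referred to is, as far as I know, the consistent range movement check; it can be bypassed with a startup flag, but only do so deliberately:

    # allows multiple nodes to bootstrap simultaneously — use with care
    JVM_OPTS="$JVM_OPTS -Dcassandra.consistent.rangemovement=false"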

RE: Massive deletes -> major compaction?

2017-09-22 Thread kurt greaves
yes, yes, yes. A compaction on a single sstable will only get rid of tombstones if there is no live data that tombstone shadows in any other sstable. to actually remove data with a tombstone the compaction needs to include other sstables that contain data the tombstone covers, and the tombstone

Re: What is performance gain of clustering columns

2017-10-03 Thread kurt greaves
Clustering info is stored in the index of an SSTable, so if you are only querying a subset of rows within the partition you don't necessarily have to hit all SSTables, just the SSTables that contain the relevant clustering col's. They make a big improvement, and can also be used quite effectively

Re: CREATE INDEX without IF NOT EXISTS when snapshoting

2017-10-03 Thread kurt greaves
Certainly would make sense and should be trivial. Here is where you want to look. Just create a ticket for it and prod here for a reviewer once

Re: Cassandra 3.11.0 compaction attempting impossible to complete compactions

2017-10-15 Thread kurt greaves
I believe that's the decompressed data size, so if your data is heavily compressed it might be perfectly logical for you to be doing such large compactions. Worth checking what SSTables are included in the compaction. If you've been running STCS for a while you probably just have a few very large

Re: version 3.11.1 number_of_keys_estimate is missing

2017-10-15 Thread kurt greaves
It's been renamed to "Number of partitions".

Re: Migrate from one cluster of N nodes to another cluster of M nodes where N>M

2017-10-15 Thread kurt greaves
If you create a new cluster and mimic the tokens across less nodes you will still have downtime/missing data between the point when you copy all the SSTables across and any new writes to the old cluster after you take the copy. Only way to really do this effectively is to do a DC migration. Brief

Re: QueryProcessor.java:160 - prepared statement recreation error

2017-10-16 Thread kurt greaves
This was a problem fixed in 3.11.1 (CASSANDRA-13641). You might have lots of old prepared statements in there that didn't get cleared out so truncating should probably fix it for you.

Re: How do TTLs generate tombstones

2017-10-05 Thread kurt greaves
No it's never safe to set it to 0 as you'll disable hinted handoff for the table. If you are never doing updates and manual deletes and you always insert with a ttl you can get away with setting it to the hinted handoff period. On 6 Oct. 2017 1:28 am, "eugene miretsky"
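Assuming the setting in question is gc_grace_seconds, a sketch of matching it to the default hint window (3 hours) on an insert-only, always-TTL'd table:

    ALTER TABLE my_keyspace.metrics WITH gc_grace_seconds = 10800;  -- 3h, matches the default max_hint_window_in_ms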

Re: Migrate from one cluster of N nodes to another cluster of M nodes where N>M

2017-10-16 Thread kurt greaves
possible ? That's why you mentioned (firewall rules are helpful > here) isn't it? > > > > > Saludos > > Jean Carlo > > "The best way to predict the future is to invent it" Alan Kay > > On Mon, Oct 16, 2017 at 5:29 AM, kurt greaves <k...@instac

Re: Elastic IP for Cassandra in AWS

2017-10-16 Thread kurt greaves
AWS API's provide the functionality to allocate and associate elastic IPs to instances. Generally the API's aren't pretty but they work. What issues are you having? If it's a configuration problem there are a variety of config management tools that you can use to populate the yaml/env files with

Re: Rack Awareness

2017-08-29 Thread kurt greaves
Cassandra understands racks based on the configured snitch and the rack assigned to each node (for example in cassandra-rackdc.properties if using GossipingPropertyFileSnitch). If you have racks configured, to perform a "rack-aware" repair you would simply need to run repair on only one rack. Note
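For reference, with GossipingPropertyFileSnitch each node declares its own DC and rack in cassandra-rackdc.properties, something like:

    dc=DC1
    rack=RAC1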

Re: Working With Prepared Statements

2017-08-29 Thread kurt greaves
From memory, prepared statements that are idempotent will not automatically be marked as idempotent, so if you are using prepared statements that you know are idempotent you should make sure to set the idempotent flag on them. For the java driver see https://github.com/datastax/java-driver/tree/3.x/manual/idempotence
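A minimal sketch with the DataStax Java driver 3.x (keyspace, table and statement are hypothetical):

    import com.datastax.driver.core.*;

    Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
    Session session = cluster.connect();
    PreparedStatement ps = session.prepare("INSERT INTO ks.tbl (id, val) VALUES (?, ?)");
    ps.setIdempotent(true);  // bound statements created from this prepared statement inherit the flag
    session.execute(ps.bind(java.util.UUID.randomUUID(), "value"));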

Re: Cassandra snapshot restore with VNODES missing some data

2017-08-30 Thread kurt greaves
Does the source cluster also use vnodes? You will need to ensure you use the same tokens for each node as the snapshots used in the source (and also ensure same tokens apply to same racks).

Re: system_auth replication factor in Cassandra 2.1

2017-08-30 Thread kurt greaves
For that many nodes mixed with vnodes you probably want a lower RF than N per datacenter. 5 or 7 would be reasonable. The only down side is that auth queries may take slightly longer as they will often have to go to other nodes to be resolved, but in practice this is likely not a big deal as the

Re: Lightweight transaction in Multi DC

2017-09-09 Thread kurt greaves
Yes it will "slow down" as more nodes need to be involved. Yes you will need to use SERIAL for both reads and writes. On 9 Sep. 2017 08:49, "Charulata Sharma (charshar)" wrote: > Thanks for your reply. I understand that LOCAL_SERIAL is for within a DC , > will setting up

Re: No columns are defined for Materialized View other than primary key

2017-09-07 Thread kurt greaves
He wants the same primary key, but just wants to filter on site_id = 1 at view creation time. seems like a valid and reasonable use case to me, but I would say this case was overlooked when setting the PK restrictions.

Re: Cassandra 3.11 is compacting forever

2017-09-07 Thread kurt greaves
Might be worth turning on debug logging for that node and when the compaction kicks off and CPU skyrockets send through the logs.​

Re: No columns are defined for Materialized View other than primary key

2017-09-07 Thread kurt greaves
Looks like there are a couple of oversights w.r.t what we allow in the view statement here. I'm creating a JIRA and will link it here shortly as soon as I've fleshed out all the details.​

Re: No columns are defined for Materialized View other than primary key

2017-09-07 Thread kurt greaves
https://issues.apache.org/jira/browse/CASSANDRA-13857 for anyone interested. Appreciate a sanity check on the logic if anyone gets the chance. On 8 September 2017 at 02:14, kurt greaves <k...@instaclustr.com> wrote: > Looks like there are a couple of oversights w.r.t what we allow in t

Re: load distribution that I can't explain

2017-09-13 Thread kurt greaves
Are you using a load balancing policy? That sounds like you are only using node2 as a coordinator.​

Re: Rebalance a cassandra cluster

2017-09-13 Thread kurt greaves
You should choose a partition key that enables you to have a uniform distribution of partitions amongst the nodes and refrain from having too many wide rows/a small number of wide partitions. If your tokens are already uniformly distributed, recalculating in order to achieve a better data load

Re: Migrating a Limit/Offset Pagination and Sorting to Cassandra

2017-10-03 Thread kurt greaves
I get the impression that you are paging through a single partition in Cassandra? If so you should probably use bounds on clustering keys to get your "next page". You could use LIMIT as well here but it's mostly unnecessary. Probably just use the pagesize that you intend for the API. Yes you'll
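A sketch of paging within a partition on clustering-key bounds (table is hypothetical): fetch a page, remember the last clustering value seen, and feed it back as the lower bound of the next query.

    -- hypothetical table with PRIMARY KEY ((user_id), event_time)
    SELECT * FROM events WHERE user_id = ? AND event_time > ? LIMIT 100;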

Re: Compaction through put and compaction tasks

2017-09-26 Thread kurt greaves
Number of active tasks is controlled by concurrent_compactors yaml setting. Recommendation is set to number of cpu cores you have. Number of pending tasks is an estimate generated by Cassandra to achieve a completely compacted state (i.e, there are no more possible compactions). How this is
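For reference, both knobs live in cassandra.yaml (the values here are only examples):

    concurrent_compactors: 8             # often set to the number of CPU cores (or disks)
    compaction_throughput_mb_per_sec: 64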

Re: Datastax Driver Mapper & Secondary Indexes

2017-09-26 Thread kurt greaves
If you've created a secondary index you simply query it by specifying it as part of the where clause. Note that you should really understand the drawbacks of secondary indexes before using them, as they might not be incredibly efficient depending on what you need them for.
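In plain CQL terms (table and index names are hypothetical):

    CREATE INDEX users_by_country ON ks.users (country);
    SELECT * FROM ks.users WHERE country = 'NZ';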

Re: Interrogation about expected performance

2017-09-26 Thread kurt greaves
Sounds reasonable. How big are your writes? Also are you seeing a bottleneck? If so, what are the details? If everything is running fine with 200k writes/sec (compactions aren't backing up, not a lot of disk IO) then that's good. However you will also need to compare what you can achieve when you

Re: Cassandra 3.11 is compacting forever

2017-08-21 Thread kurt greaves
Why are you adding new nodes? If you're upgrading you should upgrade the existing nodes first and then add nodes. ​

Re: Moving all LCS SSTables to a repaired state

2017-08-21 Thread kurt greaves
Is there any specific reason you are trying to achieve this? It shouldn't really matter if you have a few SSTables in the unrepaired pool.​

Re: timeouts on counter tables

2017-08-27 Thread kurt greaves
What is your RF? Also, as a side note RAID 1 shouldn't be necessary if you have >1 RF and would give you worse performance

Re: timeouts on counter tables

2017-08-27 Thread kurt greaves
If every node is a replica it sounds like you've got hardware issues. Have you compared iostat to the "normal" nodes? I assume there is nothing different in the logs on this one node? Also sanity check, you are using DCAwareRoundRobinPolicy? ​

Re: Cassandra and OpenJDK

2017-08-28 Thread kurt greaves
OpenJDK is fine.

Re: Moving all LCS SSTables to a repaired state

2017-08-20 Thread kurt greaves
Pretty much, I wouldn't set your heart on having 0 unrepaired SSTables.

Re: Cassandra 3.11 is compacting forever

2017-09-01 Thread kurt greaves
are you seeing any errors in the logs? Is that one compaction still getting stuck?

Re: Cassandra snapshot restore with VNODES missing some data

2017-09-01 Thread kurt greaves
is num_tokens also set to 256?

Re: timeouts on counter tables

2017-09-04 Thread kurt greaves
Likely. I believe counter mutations are a tad more expensive than a normal mutation. If you're doing a lot of counter updates that probably doesn't help. Regardless, high amounts of pending reads/mutations is generally not good and indicates the node being overloaded. Are you just seeing this on

Re: Test repair command

2017-09-04 Thread kurt greaves
Try checking the Percent Repaired reported in nodetool cfstats​

Re: Cassandra 3.11 is compacting forever

2017-09-03 Thread kurt greaves
B), cannot >> allocate chunk of 1.000MiB* >> *INFO [CompactionExecutor:475] 2017-09-01 10:31:42,032 >> NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot >> allocate chunk of 1.000MiB* >> *INFO [CompactionExecutor:478] 2017-09-01 10:46:42,108 >> NoSpamLogger.j

Re: From SimpleStrategy to DCs approach

2017-09-05 Thread kurt greaves
data will be distributed amongst racks correctly, however only if you are using a snitch that understands racks and also NetworkTopologyStrategy. SimpleStrategy doesn't understand racks or DCs. You should use a snitch that understands racks and then transition to a 2 rack cluster, keeping only 1

Re: Cassandra snapshot restore with VNODES missing some data

2017-08-31 Thread kurt greaves
What Erick said. That error in particular implies you aren't setting all 256 tokens in initial_token
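i.e. the restored node's cassandra.yaml needs something like the following, where the token list is copied verbatim from the source node:

    num_tokens: 256
    initial_token: <the full comma-separated list of the source node's 256 tokens>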

Re: Got error, removing parent repair session - When doing multiple repair -pr — Cassandra 3.x

2017-10-08 Thread kurt greaves
Note 3.11 uses incremental repair by default. If you were using full repairs previously you'll need to specify -full as well when repairing in 3.11. Incremental repair won't work if you run it on multiple nodes at the same time that have replica overlap in the token ring. ​
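i.e. to keep the pre-3.x behaviour, something like:

    nodetool repair -full -pr my_keyspace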

Re: Got error, removing parent repair session - When doing multiple repair -pr — Cassandra 3.x

2017-10-08 Thread kurt greaves
ctions as well now... On 9 October 2017 at 01:42, kurt greaves <k...@instaclustr.com> wrote: > Note 3.11 uses incremental repair by default. If you were using full > repairs previously you'll need to specify -full as well when repairing in > 3.11. > > Incremental r

Re: DataStax Spark driver performance for analytics workload

2017-10-08 Thread kurt greaves
spark-cassandra-connector will provide the best way to achieve what you want, however under the hood it's still going to result in reading all the data, and because of the way Cassandra works it will essentially read the same SSTables multiple times from random points. You might be able to tune to

Re: Ip restriction for username

2017-10-08 Thread kurt greaves
No this functionality doesn't exist in Apache Cassandra. You could add it though...​

Re: nodetool cleanup in parallel

2017-09-26 Thread kurt greaves
correct. you can run it in parallel across many nodes if you have capacity. generally see about a 10% CPU increase from cleanups which isn't a big deal if you have the capacity to handle it + the io. on that note on later versions you can specify -j to run multiple cleanup compactions at the
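For example (the -j jobs flag exists on recent versions; the keyspace name is a placeholder):

    nodetool cleanup -j 2 my_keyspace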

Re: Cassandra and G1 Garbage collector stop the world event (STW)

2017-10-09 Thread kurt greaves
Have you tried CMS with that sized heap? G1 is only really worthwhile with 24gb+ heap size, which wouldn't really make sense on machines with 28gb of RAM. In general CMS is found to work better for C*, leaving excess memory to be utilised by the OS page cache​

Re: split one DC from a cluster

2017-10-19 Thread kurt greaves
Easiest way is to separate them via firewall/network partition so the DC's can't talk to each other, ensure each DC sees the other DC as DOWN, then remove the other DC from replication, then remove all the nodes in the opposite DC using removenode.​
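A sketch of the replication and removal steps (keyspace, DC name and host ID are placeholders):

    ALTER KEYSPACE my_keyspace
        WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3};  -- the removed DC no longer listed
    nodetool removenode <host-id-of-a-node-in-the-removed-DC>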

Re: Looking for advice and assistance upgrading from Cassandra 1.2.9

2017-10-17 Thread kurt greaves
+1 what Jon said On 18 Oct. 2017 06:38, "Jon Haddad" wrote: > I recommend going all the way to 2.2. > > On Oct 17, 2017, at 12:37 PM, Jeff Jirsa wrote: > > You’ll go from 1.2 to 2.0 to 2.1 - should be basic steps: > - make sure you have all 1.2 sstables by

Re: Cassandra 3.10 Bootstrap- Error

2017-10-23 Thread kurt greaves
Looks like you're having SSL issues. Is the new node configured with the same internode_encryption settings as the existing nodes? "No appropriate protocol (protocol is disabled or cipher suites are inappropriate)" implies the new node is making a connection without SSL or with the wrong ciphers.
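The settings to compare live under server_encryption_options in cassandra.yaml, roughly (paths shown are the stock defaults):

    server_encryption_options:
        internode_encryption: all      # must match across nodes (all / dc / rack / none)
        keystore: conf/.keystore
        truststore: conf/.truststore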

Re: Adding a New Node

2017-10-24 Thread kurt greaves
Your node shouldn't show up in DC1 in nodetool status from the other nodes; this implies a configuration problem. Sounds like you haven't added the new node to all the existing nodes' cassandra-topology.properties files. You don't need to do a rolling restart with PropertyFileSnitch, it should
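With PropertyFileSnitch, every existing node's cassandra-topology.properties needs an entry for the new node; the address, DC and rack below are made up:

    # cassandra-topology.properties
    192.168.1.50=DC2:RAC1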

Re: running C* on AWS EFS storage ...

2017-11-12 Thread kurt greaves
What's wrong with just detaching the EBS volume and then attaching it to the new node?​ Assuming you have a separate mount for your C* data (which you probably should).

Re: consistency against rebuild a new DC

2017-11-27 Thread kurt greaves
No. Rebuilds don't keep consistency as they aren't smart enough to stream from a specific replica, thus all replicas for a rebuild can stream from a single replica. You need to repair after rebuilding. If you're using NTS with #racks >= RF you can stream consistently. if this patch gets in

Re: What happens to coordinators and clients when I drain a node?

2017-11-22 Thread kurt greaves
It stops accepting writes immediately. New requests won't be sent to the node, but existing in-flight queries should be completed. There could be some client exceptions for coordinated queries, but if you have speculative execution set up on your clients these cases should be covered. Consistency

Re: Deleted data comes back on node decommission

2017-12-14 Thread kurt greaves
Are you positive your repairs are completing successfully? Can you send through an example of the data in the wrong order? What you're saying certainly shouldn't happen, but there's a lot of room for mistakes. On 14 Dec. 2017 20:13, "Python_Max" wrote: > Thank you for

Re: Lots of simultaneous connections?

2017-12-14 Thread kurt greaves
I see timeouts and I immediately blame firewalls. Have you triple checked them? Is this only occurring to a subset of clients? Also, 3.0.6 is pretty dated and has many bugs, you should definitely upgrade to the latest 3.0 (don't forget to read NEWS.txt) On 14 Dec. 2017 19:18, "Max Campos"

Re: Deleted data comes back on node decommission

2017-12-15 Thread kurt greaves
X==5. I was meant to fill that in... On 16 Dec. 2017 07:46, "kurt greaves" <k...@instaclustr.com> wrote: > Yep, if you don't run cleanup on all nodes (except new node) after step x, > when you decommissioned node 4 and 5 later on, their tokens will be > reclaimed by the

Re: Deleted data comes back on node decommission

2017-12-15 Thread kurt greaves
Yep, if you don't run cleanup on all nodes (except new node) after step x, when you decommissioned node 4 and 5 later on, their tokens will be reclaimed by the previous owner. Suddenly the data in those SSTables is now live again because the token ownership has changed and any data in those

Re: Tombstoned data seems to remain after compaction

2017-12-12 Thread kurt greaves
As long as you've limited the throughput of compactions you should be fine (by default it's 16 MB/s; this can be changed through nodetool setcompactionthroughput or in the yaml) - it will be no different to any other compaction occurring, the compaction will just take longer. You should be aware
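For example, to raise the cap temporarily and confirm it (units are MB/s):

    nodetool setcompactionthroughput 64
    nodetool getcompactionthroughput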

Re: Error during select query

2017-12-19 Thread kurt greaves
Can you send through the full stack trace as reported in the Cassandra logs? Also, what version are you running? On 19 Dec. 2017 9:23 pm, "Dipan Shah" wrote: > Hello, > > > I am getting an error message when I'm running a select query from 1 > particular node. The error

Re: Problem adding a new node to a cluster

2017-12-17 Thread kurt greaves
You haven't provided enough logs for us to really tell what's wrong. I suggest running nodetool netstats | grep -v 100% to see if any streams are still ongoing, and also running nodetool compactionstats -H to see if there are any index builds the node might be waiting for prior to joining
