Insert from Select - CQL

2018-10-24 Thread Philip Ó Condúin
Hi All, I have a problem that I'm trying to work out and can't find anything online that may help me. I have been asked to delete 4K records from a Column Family that has a total of 1.8 million rows. I have been given an excel spreadsheet with a list of the 4K PRIMARY KEY numbers to be deleted.
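
One possible approach, sketched under assumptions: export the spreadsheet column to CSV, then generate batched CQL DELETE statements to run through cqlsh. The table name `my_keyspace.my_table`, the key column `id`, and the chunk size are hypothetical. The sketch also assumes a single-column, numeric primary key, which the post does not confirm.

```python
import csv
import io


def read_keys(csv_text):
    """Return the non-empty values from the first column of CSV text."""
    keys = []
    for row in csv.reader(io.StringIO(csv_text)):
        if row and row[0].strip():
            keys.append(row[0].strip())
    return keys


def delete_statements(keys, table="my_keyspace.my_table",
                      key_column="id", chunk_size=100):
    """Yield one CQL DELETE per chunk of keys.

    The table and column names are placeholders (assumptions), and the
    keys are assumed to be integers, so each one is validated with int()
    before being interpolated into the statement.
    """
    for start in range(0, len(keys), chunk_size):
        chunk = keys[start:start + chunk_size]
        values = ", ".join(str(int(k)) for k in chunk)
        yield f"DELETE FROM {table} WHERE {key_column} IN ({values});"


if __name__ == "__main__":
    sample = "101\n102\n\n103\n"  # stand-in for the exported spreadsheet column
    for stmt in delete_statements(read_keys(sample), chunk_size=2):
        print(stmt)
```

Small chunks keep each IN list short, which limits the work a single coordinator has to do per statement. Each delete still writes a tombstone, so 4K deletes mean 4K tombstones until compaction removes them.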

java.lang.AssertionError: Memory was freed during index rebuild

2018-10-24 Thread Mark Bryant
Has anyone seen this error or might have an idea what is causing this? Version: 3.5 java.lang.AssertionError: Memory was freed at org.apache.cassandra.io.util.SafeMemory.checkBounds(SafeMemory.java:103) at org.apache.cassandra.io.util.Memory.getLong(Memory.java:260) at

Re: java.lang.AssertionError: Memory was freed during index rebuild

2018-10-24 Thread Jeff Jirsa
3.5 is probably not a version you should be using in production in 2018 - it was a feature release and has had no bug fixes for years. Going up to 3.11.3 will likely fix many serious bugs you’re not noticing, and maybe the bug below you are noticing -- Jeff Jirsa > On Oct 24, 2018, at

RE: TWCS: Repair create new buckets with old data

2018-10-24 Thread Meg Mara
Hi Maik, I have a similar Cassandra env, with similar table requirements. So these would be my suggestions: · Set a table level TTL with TWCS, and stop setting it with inserts/updates (insert TTL overrides table level TTL). So, that your entire sstable expires at the same time, as

Re: Cassandra: Inconsistent data on reads (LOCAL_QUORUM)

2018-10-24 Thread Naik, Ninad
Mick, sorry I think I missed your following questions: - SPECULATIVE_RETRY='ALWAYS' We saw this issue a couple of times a few years ago. That's why we introduced this change. Although at that time, it was on cassandra 1.x version. - Topology changes: The only change we did was that we added

Re: Cassandra: Inconsistent data on reads (LOCAL_QUORUM)

2018-10-24 Thread Naik, Ninad
Thanks Mick. Yeah we are planning to try with tracing and by enabling trace level logs for a short duration. I will update this thread with the related details. One other thing we verified is that these partial reads happen all across the cluster. It's not limited to certain cassandra

Re: Cassandra trace

2018-10-24 Thread Nate McCall
At this point, query tracing is easier to do from the driver side. Docs for python and java: http://datastax.github.io/python-driver/api/cassandra/query.html# https://github.com/datastax/java-driver/tree/3.x/manual/logging#logging-query-latencies This has been completely redone in 4.0. For

Re: TWCS: Repair create new buckets with old data

2018-10-24 Thread Jonathan Haddad
Hey Meg, a couple thoughts. > Set a table level TTL with TWCS, and stop setting it with inserts/updates (insert TTL overrides table level TTL). So, that your entire sstable expires at the same time, as opposed to each insert expiring at its own pace. So that for tombstone clean up, the system

Cassandra running Multiple JVM's

2018-10-24 Thread Bobbie Haynes
I have three physical servers. I want to run Cassandra in multiple JVMs, i.e. each physical node hosts 2 Cassandra nodes, so that I can run a 6-node cluster. Could anyone point me to a setup guide? Each physical node's configuration: RAM: 256 GB --- I want to assign each JVM(64GB)

Re: Cassandra 4.0

2018-10-24 Thread Nate McCall
When it's ready :) In all seriousness, the past two blog posts include some discussion on our motivations and current goals with regard to 4.0: http://cassandra.apache.org/blog/ On Wed, Oct 24, 2018 at 4:49 AM Abdul Patel wrote: > > Hi all, > > Any idea when 4.0 is planned to release?

sstable corruption and schema migration issues

2018-10-24 Thread David Payne
Which versions of Cassandra 2.x and 3.x are best for avoiding sstable corruption and schema migration slowness? Is this a "Cassandra is not a set it and forget it system" concept?

Re: Cassandra running Multiple JVM's

2018-10-24 Thread Jonathan Haddad
Another issue you'll need to consider is how the JVM allocates resources towards GC, especially if you're using G1 with a pause time goal. Specifically, if you let it pick its own numbers for ParallelGCThreads & ConcGCThreads they'll be based on the total number of CPUs, not the number you've
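
A rough sketch of sizing those two flags per instance rather than per host. The formulas approximate HotSpot's default heuristics and are an assumption here (they differ between JVM versions), and the core count of 16 per JVM is hypothetical. Only the flag names `-XX:ParallelGCThreads` and `-XX:ConcGCThreads` come from the JVM itself.

```python
def parallel_gc_threads(cpus):
    """Approximate HotSpot's default: all CPUs up to 8, then 5/8 of the rest.

    This mirrors the commonly documented heuristic and is an assumption,
    not a guarantee for any particular JVM version.
    """
    if cpus <= 8:
        return cpus
    return 8 + (cpus - 8) * 5 // 8


def gc_flags(cores_per_jvm):
    """Build explicit GC thread flags for a JVM limited to cores_per_jvm cores."""
    parallel = parallel_gc_threads(cores_per_jvm)
    # Assumed G1 default for concurrent threads: roughly a quarter of the
    # parallel threads, and never fewer than one.
    concurrent = max((parallel + 2) // 4, 1)
    return [
        f"-XX:ParallelGCThreads={parallel}",
        f"-XX:ConcGCThreads={concurrent}",
    ]


if __name__ == "__main__":
    # Hypothetical: each of the two JVMs on a host is given 16 cores.
    print(" ".join(gc_flags(16)))
```

Setting the flags explicitly means each JVM sizes its GC thread pools for the cores it was actually given, instead of for every CPU the host exposes.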

Re: Cassandra running Multiple JVM's

2018-10-24 Thread Jeff Jirsa
I don't have time to reply to your stackoverflow post, but what you proposed is a great idea for a server that size. You can use taskset or numactl to bind each JVM to the appropriate cores/zones. Set up a data directory on each SSD for the data. There are two caveats you need to think about: 1)
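
The binding step mentioned above could look roughly like the following sketch, which only builds the command lines and does not run them. The install paths, NUMA node numbers, and CPU ranges are hypothetical. The `taskset -c` and `numactl --cpunodebind/--membind` options are standard, but the right values depend on the host's topology.

```python
def numactl_command(node, launch_cmd):
    """Prefix launch_cmd so the process runs on one NUMA node's CPUs and memory."""
    return ["numactl", f"--cpunodebind={node}", f"--membind={node}", *launch_cmd]


def taskset_command(cpu_list, launch_cmd):
    """Prefix launch_cmd so the process is pinned to the given CPU list."""
    return ["taskset", "-c", cpu_list, *launch_cmd]


if __name__ == "__main__":
    # Hypothetical install locations, one Cassandra instance per NUMA node.
    instances = {
        0: ["/opt/cassandra-a/bin/cassandra", "-f"],
        1: ["/opt/cassandra-b/bin/cassandra", "-f"],
    }
    for node, cmd in instances.items():
        print(" ".join(numactl_command(node, cmd)))
    # Alternative without NUMA awareness: pin by CPU range only.
    print(" ".join(taskset_command("0-15", instances[0])))
```

Each instance would also need its own configuration, ports or IP address, and data directories, none of which this sketch covers.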

RE: TWCS: Repair create new buckets with old data

2018-10-24 Thread Meg Mara
Hey Jon, About table level TTL -> It wasn’t for optimization, just a suggestion. The user had no table level TTL set. It was at default 0. So if an insert comes in with no TTL, that row would never expire. There is no default TTL to fall back on in his case. Just thinking about possible

Re: Cassandra: Inconsistent data on reads (LOCAL_QUORUM)

2018-10-24 Thread Mick Semb Wever
Ninad, > Here's a bit more information: > > -Few rows in this column family can grow quite wide (> 100K columns) > > -But we keep seeing this behavior most frequently with rows with just 1 or > two columns . The typical behavior is: Machine A adds a new row and a column. > 30-60 seconds later

RE: TWCS: Repair create new buckets with old data

2018-10-24 Thread Caesar, Maik
Hi Meg, the TTL (4 months) is set at insert time by the application, in the INSERT statement. The repair is started each day on one of ten hosts with the command: nodetool --host hostname_# repair -pr Regards Maik From: Meg Mara [mailto:mm...@digitalriver.com] Sent: Tuesday, 23 October 2018 17:05