Re: Bottleneck for small inserts?

2017-05-25 Thread Eric Pederson
Due to a cut and paste error those flamegraphs were a recording of the whole system, not just Cassandra. Throughput is approximately 30k rows/sec. Here are the graphs with just the Cassandra PID: - http://sourcedelica.com/wordpress/wp-content/uploads/2017/05/flamegraph_ultva01_sars2.svg

Re: Bottleneck for small inserts?

2017-05-25 Thread Eric Pederson
Totally understood :) I forgot to mention - I set the /proc/irq/*/smp_affinity mask to include all of the CPUs. Actually most of them were set that way already (for example, ,) - it might be because irqbalanced is running. But for some reason the interrupts are all being handled
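
For readers following along, here is a minimal sketch of how a CPU set maps to the hex bitmask written to /proc/irq/*/smp_affinity. The helper name `smp_affinity_mask` is hypothetical and not from the thread; the output assumes the kernel's comma-separated 32-bit group notation, and the sketch only computes the string without touching /proc.

```python
def smp_affinity_mask(cpus):
    """Return a hex bitmask string with one bit set per CPU index.

    Hypothetical helper for illustration only. Masks wider than 32 bits
    are split into comma-separated 32-bit groups, most significant first.
    """
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu

    # Split the mask into 32-bit groups, least significant first.
    groups = []
    while mask:
        groups.append(mask & 0xFFFFFFFF)
        mask >>= 32
    if not groups:
        groups = [0]

    # Most significant group is unpadded; lower groups are padded to 8 digits.
    top, rest = groups[-1], groups[-2::-1]
    return ",".join([format(top, "x")] + [format(g, "08x") for g in rest])


if __name__ == "__main__":
    print(smp_affinity_mask(range(8)))     # CPUs 0-7
    print(smp_affinity_mask(range(1, 8)))  # CPUs 1-7, leaving CPU 0 out
```

Writing the resulting string to an IRQ's smp_affinity file requires root, and a running irqbalance daemon may overwrite it.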

Re: Effect of frequent mutations / memtable

2017-05-25 Thread Thakrar, Jayesh
That's because Zookeeper is purpose-built for this kind of usage. Its asynchronous nature (e.g. you can create "watchers" with callbacks that fire when ephemeral nodes die/disappear due to servers crashing) makes it easier to program against. It also reduces the "checkin" and "polling" cycle

Re: Effect of frequent mutations / memtable

2017-05-25 Thread Jan Algermissen
Hi Jayesh, On 25 May 2017, at 18:31, Thakrar, Jayesh wrote: Hi Jan, I would suggest looking at using Zookeeper for such a usecase. Thanks - yes, it is an alternative. Out of curiosity: since both Zk and C* implement Paxos to enable this kind of thing, why do you think Zookeeper would be

Re: Bottleneck for small inserts?

2017-05-25 Thread Jonathan Haddad
You shouldn't need a kernel recompile. Check out the section "Simple solution for the problem" in http://www.alexonlinux.com/smp-affinity-and-proper-interrupt-handling-in-linux. You can balance your requests across up to 8 CPUs. I'll check out the flame graphs in a little bit - in the middle of

Re: Bottleneck for small inserts?

2017-05-25 Thread Eric Pederson
Hi Jonathan - It looks like these machines are configured to use CPU 0 for all I/O interrupts. I don't think I'm going to get the OK to compile a new kernel for them to balance the interrupts across CPUs, but to mitigate the problem I taskset the Cassandra process to run on all CPUs except 0. It
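
The "all CPUs except 0" mitigation described above can be sketched as a small helper that builds the CPU list to hand to taskset. The function name `cpu_list_except` and the 8-CPU example are assumptions for illustration; the actual CPU count of the machines is not given in the thread.

```python
def cpu_list_except(total_cpus, excluded):
    """Return a comma-separated CPU list covering 0..total_cpus-1 minus `excluded`.

    Hypothetical helper: the string is in the form accepted by a CPU-list
    argument such as the one taken by `taskset -c`.
    """
    excluded = set(excluded)
    allowed = [cpu for cpu in range(total_cpus) if cpu not in excluded]
    if not allowed:
        raise ValueError("at least one CPU must remain allowed")
    return ",".join(str(cpu) for cpu in allowed)


if __name__ == "__main__":
    # Assumed 8-CPU machine, keeping the process off CPU 0.
    print(cpu_list_except(8, {0}))
```

The resulting list would then be applied with something like `taskset -a -c -p <list> <pid>`; the `-a` flag matters for a JVM because affinity is otherwise set only on the single task named by the PID.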

Re: How to avoid flush if the data can fit into memtable

2017-05-25 Thread preetika tyagi
I agree that for such a small dataset, Cassandra is obviously not needed. However, this is purely an experimental setup that I'm using to understand how and exactly when a memtable flush is triggered. As I mentioned in my post, I read the documentation and tweaked the parameters accordingly

Re: How do you do automatic restacking of AWS instance for cassandra?

2017-05-25 Thread daemeon reiydelle
What is restacking?

Re: How to avoid flush if the data can fit into memtable

2017-05-25 Thread Avi Kivity
It doesn't have to fit in memory. If your key distribution has strong temporal locality, then a larger memtable that can coalesce overwrites greatly reduces the disk I/O load for the memtable flush and subsequent compactions. Of course, I have no idea if that is what the OP had in mind. On
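
The coalescing effect described above can be illustrated with a toy model. This is not how Cassandra sizes or flushes memtables: the function name `cells_flushed` and the "capacity in distinct keys" rule are simplifying assumptions, chosen only to show why overwrites absorbed in memory never reach disk.

```python
def cells_flushed(writes, capacity):
    """Count cells written to disk by a toy memtable holding `capacity` distinct keys.

    Toy model (assumption): the memtable is a dict; an overwrite of a key
    already in memory replaces it in place, and the memtable is flushed
    whole when a new key arrives while it is full.
    """
    memtable = {}
    flushed = 0
    for key, value in writes:
        if key not in memtable and len(memtable) >= capacity:
            flushed += len(memtable)  # flush everything currently held
            memtable.clear()
        memtable[key] = value         # insert, or coalesce an overwrite
    return flushed + len(memtable)    # final flush of whatever remains


if __name__ == "__main__":
    # Two hot keys overwritten repeatedly (strong temporal locality).
    writes = [("a", 1), ("b", 2), ("a", 3), ("b", 4), ("a", 5), ("b", 6)]
    print(cells_flushed(writes, capacity=1))
    print(cells_flushed(writes, capacity=2))
```

In this model a memtable large enough to hold both hot keys flushes each key once, while the smaller one flushes every write, which is also more input for later compactions.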

How do you do automatic restacking of AWS instance for cassandra?

2017-05-25 Thread Surbhi Gupta
Hi, Wanted to understand, how do you do automatic restacking of cassandra nodes on AWS? Thanks Surbhi

Re: How to avoid flush if the data can fit into memtable

2017-05-25 Thread daemeon reiydelle
This sounds exactly like a previous post that ended when I asked the person to document the number of nodes, EC2 instance type, and size. I suspected a single-node system. So the poster reposts? Hmm.

Re: Effect of frequent mutations / memtable

2017-05-25 Thread Thakrar, Jayesh
Hi Jan, I would suggest looking at using Zookeeper for such a usecase. See http://zookeeper.apache.org/doc/trunk/recipes.html for some examples. Zookeeper is used for such purposes in Apache HBase (active master), Apache Kafka (active controller), Apache Hadoop, etc. Look for the "Leader

Re: How to avoid flush if the data can fit into memtable

2017-05-25 Thread Jonathan Haddad
Sorry for the confusion. That was for the OP. I wrote it quickly right after waking up. What I'm asking is why does the OP want to keep his data in the memtable exclusively? If the goal is to "make reads fast", then just turn on row caching. If there's so little data that it fits in memory

Partition range incremental repairs

2017-05-25 Thread Chris Stokesmore
Hi, We are running a 7 node Cassandra 2.2.8 cluster, RF=3, and had been running repairs with the -pr option, via a cron job that runs on each node once per week. We changed that as some advice on the Cassandra IRC channel said it would cause more anticompaction and

Re: How to avoid flush if the data can fit into memtable

2017-05-25 Thread Avi Kivity
Not sure whether you're asking me or the original poster, but the more times data gets overwritten in a memtable, the less it has to be compacted later on (and even without overwrites, larger memtables result in less compaction). On 05/25/2017 05:59 PM, Jonathan Haddad wrote: Why do you

Re: How to avoid flush if the data can fit into memtable

2017-05-25 Thread Jonathan Haddad
Why do you think keeping your data in the memtable is what you need to do? On Thu, May 25, 2017 at 7:16 AM Avi Kivity wrote: > Then it doesn't have to (it still may, for other reasons). > > On 05/25/2017 05:11 PM, preetika tyagi wrote: > > What if the commit log is disabled?

Effect of frequent mutations / memtable

2017-05-25 Thread Jan Algermissen
Hi, I am using updates to a column with a TTL to represent a lock. The owning process keeps updating the lock's TTL as long as it is running. If the process crashes, the lock will time out and be deleted. Then another process can take over. I have used this pattern very successfully over
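
The lock pattern described above can be sketched as an in-memory simulation. This is not Cassandra driver code: the class `TtlLock`, its method, and the explicit `now` argument are hypothetical stand-ins for a column written with a TTL, and a real deployment would also need some form of conditional write so that two processes cannot both take an expired lock.

```python
class TtlLock:
    """In-memory stand-in for a lock column that expires after a TTL."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.owner = None
        self.expires_at = 0.0

    def acquire(self, who, now):
        """Take the lock if it is free or expired, or renew it if `who` owns it.

        `now` is passed in explicitly so the behaviour is deterministic.
        """
        expired = self.owner is None or now >= self.expires_at
        if expired or self.owner == who:
            self.owner = who
            self.expires_at = now + self.ttl  # each update resets the TTL
            return True
        return False


if __name__ == "__main__":
    lock = TtlLock(ttl_seconds=10)
    print(lock.acquire("worker-a", now=0))   # worker-a takes the lock
    print(lock.acquire("worker-b", now=5))   # still held by worker-a
    print(lock.acquire("worker-a", now=8))   # renewal pushes expiry to 18
    print(lock.acquire("worker-b", now=18))  # worker-a stopped renewing
```

Each renewal is a fresh write, which is why a long-lived lock produces the steady stream of mutations the post asks about.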

Re: How to avoid flush if the data can fit into memtable

2017-05-25 Thread Avi Kivity
Then it doesn't have to (it still may, for other reasons). On 05/25/2017 05:11 PM, preetika tyagi wrote: What if the commit log is disabled? On May 25, 2017 4:31 AM, "Avi Kivity" wrote: Cassandra has to flush the memtable occasionally, or

Re: How to avoid flush if the data can fit into memtable

2017-05-25 Thread preetika tyagi
What if the commit log is disabled? On May 25, 2017 4:31 AM, "Avi Kivity" wrote: > Cassandra has to flush the memtable occasionally, or the commit log grows > without bounds. > > On 05/25/2017 03:42 AM, preetika tyagi wrote: > > Hi, > > I'm running Cassandra with a very small

Re: How to avoid flush if the data can fit into memtable

2017-05-25 Thread Avi Kivity
Cassandra has to flush the memtable occasionally, or the commit log grows without bounds. On 05/25/2017 03:42 AM, preetika tyagi wrote: Hi, I'm running Cassandra with a very small dataset so that the data can exist on memtable only. Below are my configurations: In jvm.options: |-Xms4G