Due to a cut-and-paste error, those flame graphs were a recording of the
whole system, not just Cassandra. Throughput is approximately 30k
rows/sec.
Here are the graphs with just the Cassandra PID:
-
http://sourcedelica.com/wordpress/wp-content/uploads/2017/05/flamegraph_ultva01_sars2.svg
Totally understood :)
I forgot to mention - I set the /proc/irq/*/smp_affinity mask to include
all of the CPUs. Actually, most of them were set that way already - it
might be because irqbalance is running.
But for some reason the interrupts are all being handled
That's because Zookeeper is purpose-built for this kind of usage.
Its asynchronous nature - e.g., you can create "watchers" with callbacks
that fire when ephemeral nodes die/disappear (due to servers crashing) -
makes it easier to program against.
It also reduces the "check-in" and "polling" cycle
Hi Jayesh,
On 25 May 2017, at 18:31, Thakrar, Jayesh wrote:
Hi Jan,
I would suggest looking at using Zookeeper for such a use case.
thanks - yes, it is an alternative.
Out of curiosity: since both ZK and C* implement Paxos to enable this
kind of thing, why do you think Zookeeper would be
You shouldn't need a kernel recompile. Check out the section "Simple
solution for the problem" in
http://www.alexonlinux.com/smp-affinity-and-proper-interrupt-handling-in-linux.
You can balance your requests across up to 8 CPUs.
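The approach from that article can be sketched as follows. This is only a sketch, assuming an 8-CPU machine and root access; the loop over /proc/irq is illustrative, and irqbalance should be stopped first or it may rewrite the masks:

```shell
# smp_affinity takes a hex bitmask of the CPUs allowed to service an IRQ.
# Build a mask covering CPUs 1-7, leaving CPU 0 out (assumes 8 CPUs).
mask=$(printf '%x' $(( (1 << 8) - 2 )))   # 0b11111110 -> "fe"
echo "$mask"

# Apply it to every IRQ (requires root):
# for irq in /proc/irq/[0-9]*; do echo "$mask" > "$irq/smp_affinity"; done
```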
I'll check out the flame graphs in a little bit - in the middle of
Hi Jonathan -
It looks like these machines are configured to use CPU 0 for all I/O
interrupts. I don't think I'm going to get the OK to compile a new kernel
for them to balance the interrupts across CPUs, but to mitigate the problem
I taskset the Cassandra process to run on all CPUs except 0. It
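That mitigation can be sketched with taskset. This is an assumption-laden sketch: the CassandraDaemon pgrep pattern is a guess at the process name, and the demonstration at the end pins a throwaway command just to show the syntax:

```shell
# Restrict a running process to CPUs 1-7 so CPU 0 stays free for IRQs.
# Pinning an existing Cassandra JVM (process name is an assumption):
#   taskset -cp 1-7 "$(pgrep -f CassandraDaemon | head -n1)"

# Demonstration of the syntax on a trivial command:
taskset -c 0 echo "pinned ok"
```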
I agree that for such a small dataset, Cassandra is obviously not needed.
However, this is purely an experimental setup through which I'm trying to
understand how and exactly when a memtable flush is triggered. As I mentioned
in my post, I read the documentation and tweaked the parameters accordingly
What is restacking?
Daemeon C.M. Reiydelle | USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872
“All men dream, but not equally. Those who dream by night in the dusty
recesses of their minds wake up in the day to find it was vanity, but the
dreamers of the day are dangerous men, for they
It doesn't have to fit in memory. If your key distribution has strong
temporal locality, then a larger memtable that can coalesce overwrites
greatly reduces the disk I/O load for the memtable flush and subsequent
compactions. Of course, I have no idea if this is what the OP had in mind.
On
Hi,
Wanted to understand: how do you do automatic restacking of Cassandra nodes
on AWS?
Thanks
Surbhi
This sounds exactly like a previous post that ended when I asked the person
to document the number of nodes, EC2 instance type, and size. I suspected a
single-node system. So the poster reposts? Hmm.
Hi Jan,
I would suggest looking at using Zookeeper for such a use case.
See http://zookeeper.apache.org/doc/trunk/recipes.html for some examples.
Zookeeper is used for such purposes in Apache HBase (active master), Apache
Kafka (active controller), Apache Hadoop, etc.
Look for the "Leader
Sorry for the confusion. That was for the OP. I wrote it quickly right
after waking up.
What I'm asking is why does the OP want to keep his data in the memtable
exclusively? If the goal is to "make reads fast", then just turn on row
caching.
If there's so little data that it fits in memory
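For reference, row caching is enabled per table. The keyspace and table names below are illustrative, and `row_cache_size_in_mb` must also be set non-zero in cassandra.yaml for the cache to hold anything:

```cql
-- Cache all partition keys and all rows per partition for this table.
ALTER TABLE mykeyspace.mytable
WITH caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'};
```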
Hi,
We are running a 7 node Cassandra 2.2.8 cluster, RF=3, and had been running
repairs with the -pr option, via a cron job that runs on each node once per
week.
We changed that as some advice on the Cassandra IRC channel said it would cause
more anticompaction and
Not sure whether you're asking me or the original poster, but the more
times data gets overwritten in a memtable, the less it has to be
compacted later on (and even without overwrites, larger memtables result
in less compaction).
On 05/25/2017 05:59 PM, Jonathan Haddad wrote:
Why do you
Why do you think keeping your data in the memtable is what you need to do?
On Thu, May 25, 2017 at 7:16 AM Avi Kivity wrote:
> Then it doesn't have to (it still may, for other reasons).
>
> On 05/25/2017 05:11 PM, preetika tyagi wrote:
>
> What if the commit log is disabled?
Hi,
I am using updates to a column with a TTL to represent a lock. The
owning process keeps updating the lock's TTL as long as it is running.
If the process crashes, the lock will timeout and be deleted. Then
another process can take over.
I have used this pattern very successfully over
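In CQL, the pattern described above might look roughly like this. The keyspace, table, and values are hypothetical, and the 30-second TTL is arbitrary:

```cql
-- Hypothetical lease table for the TTL-lock pattern.
CREATE TABLE locks.leases (
    name  text PRIMARY KEY,
    owner text
);

-- Acquire: the lightweight transaction succeeds only if no live row exists.
INSERT INTO locks.leases (name, owner)
VALUES ('job-1', 'worker-a')
IF NOT EXISTS
USING TTL 30;

-- Renew while running: refreshes the TTL, conditional on still owning it.
UPDATE locks.leases USING TTL 30
SET owner = 'worker-a'
WHERE name = 'job-1'
IF owner = 'worker-a';
```

If the owning process crashes, the row expires after the TTL and the next `INSERT ... IF NOT EXISTS` from another process succeeds.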
Then it doesn't have to (it still may, for other reasons).
On 05/25/2017 05:11 PM, preetika tyagi wrote:
What if the commit log is disabled?
On May 25, 2017 4:31 AM, "Avi Kivity" wrote:
Cassandra has to flush the memtable occasionally, or
What if the commit log is disabled?
On May 25, 2017 4:31 AM, "Avi Kivity" wrote:
> Cassandra has to flush the memtable occasionally, or the commit log grows
> without bounds.
>
> On 05/25/2017 03:42 AM, preetika tyagi wrote:
>
> Hi,
>
> I'm running Cassandra with a very small
Cassandra has to flush the memtable occasionally, or the commit log
grows without bounds.
On 05/25/2017 03:42 AM, preetika tyagi wrote:
Hi,
I'm running Cassandra with a very small dataset so that the data can
exist on memtable only. Below are my configurations:
In jvm.options:
-Xms4G
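For reference, these are the cassandra.yaml settings (present in 2.2) that bound memtable size and trigger flushes; the values shown are illustrative, not recommendations:

```yaml
# On-heap / off-heap space shared by all memtables; a flush of the
# largest memtable is triggered once usage crosses the cleanup threshold.
memtable_heap_space_in_mb: 2048
memtable_offheap_space_in_mb: 2048
memtable_cleanup_threshold: 0.11

# The commit log also forces flushes: when it reaches this size, the
# oldest dirty memtables are flushed so segments can be recycled.
commitlog_total_space_in_mb: 8192
```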