Hi,
We have a 6-node Cassandra cluster that got into an unstable state because
a few servers were very low on Java heap space for a while. This resulted
in them flushing an SSTable to disk for almost every write, such that some
column families ended up with 1000+ SSTables, most of which contain ...
Hello all,
I have a column family where I have to update a field, frequency, but it
is part of the clustering key. So I am deleting the existing row and
inserting a new row with the updated frequency.
I want to free the space used by deleted rows as soon as possible, so I
decided to change gc_grace_seconds ...
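(A sketch of that workflow in cqlsh; the table and column names here are
made up, not Chamila's actual schema. One caveat on lowering
gc_grace_seconds: repair must complete within the window, or deleted data
can reappear.)

    -- frequency is part of the clustering key, so it cannot be UPDATEd
    -- in place; delete the existing row and insert a new one
    DELETE FROM word_frequencies WHERE word = 'foo' AND frequency = 41;
    INSERT INTO word_frequencies (word, frequency) VALUES ('foo', 42);

    -- shorten how long tombstones are kept before they may be purged
    -- (default is 864000 seconds = 10 days)
    ALTER TABLE word_frequencies WITH gc_grace_seconds = 3600;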
Hello Chamila,
If you're deleting and re-inserting a clustering column, it looks like
the queue anti-pattern, which should be avoided:
http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets
On Mon, Dec 15, 2014 at 10:06 AM, Chamila Wijayarathna
cdwijayarat...@gmail.com
Hi,
We've noticed that the number of SSTables grows drastically after running
*repair*. What we did today was compact everything, so each node had ~10
SSTables. After repair the count jumped to ~1600 on each node. What is
interesting is that many of them are very small. The smallest ones are
~60 ...
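(One way to watch this from the shell; the keyspace and table names below
are placeholders, and cfstats is the 2.1-era command name:)

    # per-table SSTable count on one node
    nodetool cfstats my_ks.my_table | grep "SSTable count"

    # major compaction: merge each table's SSTables on this node
    nodetool compact my_ks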
I also meant to point out that you have to be careful with very wide
partitions, like those where the partition key is the year, with all
usages for that year. Thousands of rows in a partition is probably okay,
but millions could become problematic. 100 MB for a single partition is a
reasonable ...
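(A common mitigation, sketched with hypothetical names: add a time bucket
to the partition key so one year's data is split across many bounded
partitions.)

    -- instead of PRIMARY KEY ((year), usage_id)
    CREATE TABLE usages_by_month (
        year     int,
        month    int,          -- bucket keeps each partition bounded
        usage_id timeuuid,
        payload  text,
        PRIMARY KEY ((year, month), usage_id)
    );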
Is it safe to replace Snappy 1.0.5 in a Cassandra 2.1.2 environment with Snappy
1.1.0?
I’ve tried running with 1.1.0 and Cassandra seems to run with no issues,
and according to this post https://github.com/xerial/snappy-java/issues/60
1.1.0 is ...
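(For anyone trying the same experiment: the swap is just replacing the
bundled jar and restarting; exact paths and jar names may differ per
install.)

    # stop the node first, then replace the bundled snappy-java jar
    rm $CASSANDRA_HOME/lib/snappy-java-1.0.5*.jar
    cp snappy-java-1.1.0.jar $CASSANDRA_HOME/lib/
    # restart and watch system.log for compression-related errors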
Thanks very much, Jonathan!!
On Wed, Dec 10, 2014 at 1:00 PM, Jonathan Haddad j...@jonhaddad.com wrote:
I did a presentation on diagnosing performance problems in production at
the US and Euro summits, in which I covered quite a few tools and
preventative measures you should know when running a ...
Nice, I got it. =]
If I have more questions, I'll send more emails. xD
Thank you
On Thu, Dec 11, 2014 at 12:17 PM, DuyHai Doan doanduy...@gmail.com wrote:
What is a good partition key? Is the partition key directly related to my
query performance? What are the best practices?
A good partition key ...
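(For illustration, with made-up names: a good partition key is one your
queries can supply exactly, so each read hits a single partition.)

    -- query: all readings for a given sensor on a given day
    CREATE TABLE readings_by_sensor_day (
        sensor_id uuid,
        day       text,        -- e.g. '2014-12-11'
        ts        timestamp,
        value     double,
        PRIMARY KEY ((sensor_id, day), ts)
    );

    -- partition key fully specified: a single-partition read
    SELECT * FROM readings_by_sensor_day
     WHERE sensor_id = 123e4567-e89b-12d3-a456-426655440000
       AND day = '2014-12-11';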
Unfortunately my Scala isn't the best, so I'm going to have to take a
little while to wade through the code.
I think the important things to take from this code are:
1) execution order is randomized for each run, and new data is randomly
generated for each run to eliminate biases.
2) we write ...
Hi All,
I have a 20-node Cassandra cluster with 500 GB of data and a replication
factor of 1. I increased the replication factor to 3 and ran nodetool
repair on each node, one by one, as the docs say. But it takes hours for
one node to finish repair. Is that normal, or am I doing something wrong?
Also, ...
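(For reference, the usual sequence, with a placeholder keyspace name;
hours per node is not unusual, since repair has to validate and stream
the data for the new replicas.)

    -- in cqlsh: raise the replication factor (use NetworkTopologyStrategy
    -- with per-DC counts if you have multiple data centers)
    ALTER KEYSPACE my_ks
      WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

    # then, in a shell on each node in turn
    nodetool repair my_ks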
We have one ring and two virtual data centers in our Cassandra cluster:
one is for Real-Time and the other is for analytics. My questions are:
1. Are there memtables in the Analytics data center? To my understanding,
there are.
2. Is it possible to flush memtables, if they exist, in the Analytics
Data ...
You are, of course, free to use batches in your application. Keep in mind,
however, that both my and Ryan's advice comes from debugging issues in
production. I don't know why your Scala script is performing better with
batches than with async. It could be:
1) network. Are you running the test ...
Hi,
You have memtables on each machine. So:
1) Yes.
2) Yes. In any case, you have to run nodetool flush on each node that you
want to flush; in this case, run it on each node in your analytics DC.
Hannu
2014-12-16 1:20 GMT+02:00 Benyi Wang bewang.t...@gmail.com:
We have one ring and two ...
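(Hannu's point 2, spelled out; the keyspace and table names below are
placeholders.)

    # run on every node in the analytics DC
    nodetool flush my_ks              # flush all tables in a keyspace
    nodetool flush my_ks my_table     # or just one table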