Hi,
I'm doing a stress test on Cassandra, and I've learned that using a ring cache can
improve performance because client requests can go directly to the
target Cassandra server, so the coordinator node is also the desired
target node. This way, there is no need for the coordinator node to
Those sequences are not fixed. All sequences with the same seq_id tend to
grow at the same rate. If it's one partition per seq_id, the size will most
likely exceed the threshold quickly
-- Then use bucketing to avoid overly wide partitions
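For reference, a bucketed layout might look something like this (a sketch only; the table and column names are made up, not from Kai's actual schema — the idea is just to fold a bucket number into the partition key so no single seq_id partition grows without bound):

```cql
CREATE TABLE sequences (
    seq_id    text,
    bucket    int,     -- e.g. seq_index / 10000, caps partition size
    seq_type  text,
    seq_index bigint,
    value     blob,
    PRIMARY KEY ((seq_id, bucket), seq_type, seq_index)
);
```

The client computes the bucket from the sequence position on both write and read, so reads for a range of a sequence touch a known, bounded set of partitions.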
Also new seq_types can be added and old seq_types can be deleted. This
means I often need to ALTER TABLE to add and drop columns.
Kai, unless I'm misunderstanding something, I don't see why you need to
alter the table to add a new seq type. From a data model perspective,
these are just new
It would be helpful to look at some specific examples of sequences, showing how
they grow. I suspect that the term “sequence” is being overloaded in some
subtly misleading way here.
Besides, we’ve already answered the headline question – data locality is
achieved by having a common partition
Hi Joy,
Are you resetting your data after each test run? I wonder if your tests
are actually causing you to fall behind on data grooming tasks such as
compaction, and so performance suffers for your later tests.
There are *so many* factors which can affect performance, without reviewing
test
I'm sorry, I meant to say 6 nodes rf=3.
Also look at performance over sustained periods of time, not burst
writing. Run your test for several hours and watch memory and especially
compaction stats. See if you can work out what data volume you can write
while keeping outstanding compaction
2014-12-05 15:40 GMT+08:00 Jonathan Haddad j...@jonhaddad.com:
I recommend reading through
https://issues.apache.org/jira/browse/CASSANDRA-8150 to get an idea of
how the JVM GC works and what you can do to tune it. Also good is Blake
Eggleston's writeup which can be found here:
What's a ring cache?
FYI if you're using the DataStax CQL drivers they will automatically route
requests to the correct node.
On Sun Dec 07 2014 at 12:59:36 AM kong kongjiali...@gmail.com wrote:
Hi,
I'm doing a stress test on Cassandra, and I've learned that using a ring cache can
improve the
There are a lot of factors that go into tuning, and I don't know of any
reliable formula you can use to figure out what's going to work
optimally for your hardware. Personally I recommend:
1) find the bottleneck
2) play with a parameter (or two)
3) see what changed, performance-wise
If
On Dec 2, 2014 3:45 PM, Robert Coli rc...@eventbrite.com wrote:
On Tue, Dec 2, 2014 at 12:21 PM, Robert Wille rwi...@fold3.com wrote:
As a test, I took down
Thanks for the help. I wasn't clear on how clustering columns work. Coming
from a Thrift background, it took me a while to understand how a clustering
column impacts partition storage on disk. Now I believe using seq_type as
the first clustering column solves my problem. As for partition size, I will
As a general rule, partitions can certainly be much larger than 1 MB, even up
to 100 MB. 5 MB to 10 MB might be a good target size.
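As a quick sanity check on those targets (toy arithmetic only; the per-row size is an assumption for illustration, not a number from this thread):

```python
def rows_per_partition(target_bytes, avg_row_bytes):
    # Rough capacity of a partition at a target size, ignoring
    # per-row and per-partition storage overhead.
    return target_bytes // avg_row_bytes

# A 10 MB target with ~200-byte rows allows on the order of 50k rows
# per partition before bucketing should roll over.
print(rows_per_partition(10 * 1024 * 1024, 200))
```

So with modest row sizes, a 5-10 MB target still leaves tens of thousands of rows per partition.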
Originally you stated that the number of seq_types could be “unlimited”... is
that really true? Is there no practical upper limit you can establish, like
10,000
I think he mentioned 100 MB as the max size - planning for 1 MB might make
your data model difficult to work with.
On Sun Dec 07 2014 at 12:07:47 PM Kai Wang dep...@gmail.com wrote:
Thanks for the help. I wasn't clear on how clustering columns work. Coming
from a Thrift background, it took me a while to
Hi Eric,
Thank you very much for your reply!
Do you mean that I should clear my table after each run? Indeed, I can see
compaction run several times during my test, but could just a few
compactions affect performance that much? Also, I can see from
OpsCenter that some ParNew GCs happen, but
I found that under the src/client folder of the Cassandra 2.1.0 source code there is
a *RingCache.java* file. It uses a Thrift client calling the
*describe_ring()* API to get the token range of each Cassandra node. It is
used on the client side. The client can use it, combined with the
partitioner, to get
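Mechanically, the lookup that RingCache does can be sketched like this (a toy Python illustration: the describe_ring() output is faked as a static token list, and a stand-in hash replaces Cassandra's real Murmur3Partitioner):

```python
import bisect

# Faked output of describe_ring(): one (token, owner) pair per range
# boundary -- a stand-in, not real Murmur3 tokens.
RING = [(0, "node-a"), (100, "node-b"), (200, "node-c")]
TOKENS = [t for t, _ in RING]

def toy_token(key):
    # Stand-in partitioner hash; a real client would use Murmur3.
    return sum(key.encode()) % 300

def owner(key):
    # A node with token T owns the range (previous_T, T], so route to
    # the first node whose token is >= hash(key), wrapping around the
    # ring the way Cassandra does.
    i = bisect.bisect_left(TOKENS, toy_token(key))
    return RING[i % len(RING)][1]
```

With the ring cached client-side, each request can open a connection straight to the owning replica instead of bouncing through an arbitrary coordinator.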
I am running Cassandra 2.1.2 in an Ubuntu VM.
cqlsh or cqlsh localhost works fine.
But I cannot connect from outside the VM (firewall, etc. disabled).
Even when I do cqlsh 192.168.111.136 in my VM I get connection refused.
This is strange because when I check my network config I can see that
I think your client could use improvement. How many threads are you
running in your test? With a Thrift call like that you can only do one
request at a time per connection. For example, assuming C* takes 0 ms, a
10 ms network latency/driver overhead will mean a 20 ms RTT and a max
throughput
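To make that arithmetic concrete (a back-of-the-envelope check using the 10 ms figure from the point above, with server time assumed to be zero):

```python
def max_throughput_per_conn(one_way_ms):
    # One synchronous Thrift request at a time per connection means
    # every request pays a full round trip before the next can start.
    rtt_ms = 2 * one_way_ms
    return 1000.0 / rtt_ms  # requests/second on that connection

# 10 ms one-way latency -> 20 ms RTT -> 50 requests/s per connection
print(max_throughput_per_conn(10))
```

So a single-threaded, single-connection test caps out at ~50 req/s regardless of how fast the cluster is; more threads/connections are needed to actually load the server.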
Try:
$ netstat -lnt
and see which interface port 9042 is listening on. You will likely need to
update cassandra.yaml to change the interface. By default, Cassandra is
listening on localhost so your local cqlsh session works.
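If that turns out to be the cause, the relevant cassandra.yaml settings look roughly like this (the address below is just Richard's VM IP from this thread, used as an example; binding to 0.0.0.0 for rpc_address is another common choice):

```yaml
# cassandra.yaml
listen_address: 192.168.111.136   # node-to-node traffic
rpc_address: 192.168.111.136      # client connections (9042 native / 9160 thrift)
```

Restart the node after changing these, then re-run netstat to confirm 9042 is bound to the external interface.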
On Sun, 7 Dec 2014 23:44 Richard Snowden richard.t.snow...@gmail.com
Hi All,
There is a practice for the Cassandra UPDATE statement. It may not be the best, but
it is a reference for updating a row at high frequency.
In my case, Cassandra failed when the UPDATE statement was executed more than once on
the same row.
In the end, I change the primary key to let Cassandra
I would really not recommend using Thrift for anything at this point,
including your load tests. Take a look at CQL; all development is going
there, and in 2.1 it has seen a massive performance boost over 2.0.
You may want to try the Cassandra stress tool included in 2.1, it can
stress a table