At Netflix we rotate the major compactions around the cluster rather than
running them all at once. We also either take that node out of client
traffic so it doesn't get used as a coordinator, or use the Astyanax
client, which is latency- and token-aware, to steer traffic to the other
replicas.
We are running
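The rotation described above can be sketched as a simple one-node-at-a-time schedule. This is a hypothetical illustration, not Netflix's actual tooling; the host names are made up, and `nodetool compact` is invoked per node strictly in sequence so only one node is compacting at any moment:

```python
import subprocess

# Hypothetical node list; substitute the hosts in your ring.
NODES = ["cass-1", "cass-2", "cass-3"]

def compaction_commands(nodes):
    """One major compaction per node, in ring order, so the
    rotation touches a single node at a time."""
    return [["nodetool", "-h", host, "compact"] for host in nodes]

def run_rotation(nodes):
    # Run each compaction to completion before moving to the next node.
    for cmd in compaction_commands(nodes):
        subprocess.run(cmd, check=True)
```

In practice you would also remove the node from client traffic (or rely on a latency-aware client) before its turn in the rotation, as described above.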
Hi folks,
we just posted a detailed Netflix technical blog entry on this
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
Hope you find it interesting/useful.
Cheers, Adrian
You are using a replication factor of one and the Lustre clustered
filesystem over the network. That is not good practice.
Try RF=3 and local disks. Lustre duplicates much of the functionality
of Cassandra; there is no point in using both. Make your Lustre server
nodes into Cassandra nodes instead.
Adrian
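The reason RF=1 is risky can be stated in one line: data is only readable while at least one replica survives. A minimal sketch of that arithmetic (purely illustrative, not Cassandra code):

```python
def data_survives(replication_factor, failed_replicas):
    """Data remains readable as long as at least one replica is alive."""
    return failed_replicas < replication_factor

# With RF=1, losing any single node loses that node's data outright.
assert not data_survives(1, 1)
# With RF=3, the data survives up to two replica failures.
assert data_survives(3, 2)
```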
This has been proposed a few times and there are some good use cases for
it. There is no current mechanism for it, but it has been discussed
as a possible enhancement.
Adrian
On Wed, Sep 14, 2011 at 11:06 AM, Todd Burruss bburr...@expedia.com wrote:
Has anyone done any work on what I'll call
You should be using the off-heap row cache option. That way you avoid GC
overhead, and the rows are stored in a compact serialized form, which means
you get more cache entries in RAM. The trade-off is slightly more CPU for
deserialization, etc.
Adrian
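The trade-off described above can be sketched in miniature: rows are held as serialized bytes (compact, no per-object GC pressure) and each read pays a small CPU cost to deserialize. Here `pickle` stands in for Cassandra's serializer; this is an illustration of the idea, not Cassandra's implementation:

```python
import pickle

class SerializingCache:
    """Toy serializing row cache: values are stored as compact bytes,
    so more entries fit in a given amount of RAM; reads deserialize."""
    def __init__(self):
        self._store = {}

    def put(self, key, row):
        self._store[key] = pickle.dumps(row)  # serialize on write

    def get(self, key):
        blob = self._store.get(key)
        return None if blob is None else pickle.loads(blob)  # CPU cost on read

cache = SerializingCache()
cache.put("user:42", {"name": "khanh", "visits": 7})
assert cache.get("user:42") == {"name": "khanh", "visits": 7}
```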
On Sunday, September 11, 2011, aaron morton
Sounds like Khanh thinks he can do joins... :-)
User-oriented data is easy: key by Facebook ID and let Cassandra handle
location. Set replication factor=3 so you don't lose data and can do a
consistent but slower read-after-write, when you need it, using quorum.
If you are running on AWS you should
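The "consistent but slower read-after-write using quorum" point rests on the standard overlap rule: a read sees the latest write whenever read replicas + write replicas > RF. A small sketch of that check (illustrative only):

```python
def quorum(rf):
    """Majority of replicas for a given replication factor."""
    return rf // 2 + 1

def read_sees_latest_write(read_replicas, write_replicas, rf):
    """Reads are guaranteed to overlap the most recent write when R + W > RF."""
    return read_replicas + write_replicas > rf

RF = 3
# Quorum reads and writes (2 of 3) overlap: consistent read-after-write.
assert read_sees_latest_write(quorum(RF), quorum(RF), RF)
# Reads and writes at consistency ONE do not guarantee it.
assert not read_sees_latest_write(1, 1, RF)
```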
Hi Yang,
You could also use Hadoop (i.e. Brisk), and run a MapReduce job or
Hive query to extract and summarize/renormalize the data into whatever
format you like.
If you use sstable2json, you have to run it on every file on every node
and deduplicate/merge all the output across machines, which is
we had in mind for Brisk. That also has the
ability to parallelize the workload and finish rapidly.
thanks,
Sri
On Sun, May 22, 2011 at 11:31 PM, Adrian Cockcroft
adrian.cockcr...@gmail.com wrote:
Hi Yang,
You could also use Hadoop (i.e. Brisk), and run a MapReduce job or
Hive query
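The deduplicate/merge step mentioned above is essentially a last-timestamp-wins reduce over every node's dump. A hypothetical sketch of that reconciliation (the dump structure here is simplified; real sstable2json output is more involved):

```python
def merge_dumps(dumps):
    """Merge per-node dumps: for each (row, column), keep the value with
    the highest timestamp, mirroring Cassandra's own reconciliation rule.
    Each dump maps row_key -> {column: (value, timestamp)}."""
    merged = {}
    for dump in dumps:
        for row_key, columns in dump.items():
            row = merged.setdefault(row_key, {})
            for col, (value, ts) in columns.items():
                if col not in row or ts > row[col][1]:
                    row[col] = (value, ts)
    return merged

node_a = {"user1": {"city": ("SF", 100)}}
node_b = {"user1": {"city": ("LA", 200)}}  # newer replica wins
assert merge_dumps([node_a, node_b])["user1"]["city"] == ("LA", 200)
```

This is the work a MapReduce job on Brisk would parallelize for you instead.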
Hi Alex,
This has been a useful thread, we've been comparing your numbers with
our own tests.
Why did you choose four big instances rather than more smaller ones?
For $8/hr you get four m2.4xl with a total of 8 disks.
For $8.16/hr you could have twelve m1.xl with a total of 48 disks, 3x
disk
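The arithmetic behind that comparison, with hourly prices inferred from the totals quoted above (2011-era EC2 pricing; treat the numbers as period assumptions):

```python
# Two ways to spend roughly $8/hr on 2011-era EC2 (figures from the thread).
m2_4xl = {"count": 4, "price_hr": 2.00, "disks_per_node": 2}   # 4 x $2.00 = $8.00/hr
m1_xl  = {"count": 12, "price_hr": 0.68, "disks_per_node": 4}  # 12 x $0.68 = $8.16/hr

def totals(cfg):
    """Total hourly cost and total ephemeral disks for a configuration."""
    cost = round(cfg["count"] * cfg["price_hr"], 2)
    disks = cfg["count"] * cfg["disks_per_node"]
    return cost, disks

assert totals(m2_4xl) == (8.00, 8)    # four big nodes, 8 disks
assert totals(m1_xl) == (8.16, 48)    # twelve smaller nodes, 48 disks
```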
Netflix has also gone down this path: we run a regular full backup to
S3 as a compressed tar archive, and we have scripts that restore everything
into the right place on a different cluster (it needs the same node
count). We also pick up the SSTables as they are created and drop
them in S3.
</gr-replace>
Whatever you
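A possible S3 key layout for the scheme described above. The bucket name and path structure here are hypothetical (the Netflix scripts themselves are not shown); the point is that keying by cluster, node, and date lets a restore put everything back on the matching node of a same-sized cluster:

```python
import datetime

BUCKET = "my-cassandra-backups"  # hypothetical bucket name

def full_backup_key(cluster, node, day):
    """S3 key for a node's regular full backup (compressed tar)."""
    return f"{cluster}/{node}/{day:%Y%m%d}/full.tar.gz"

def sstable_key(cluster, node, sstable_name):
    """S3 key for an individual SSTable picked up as it is created."""
    return f"{cluster}/{node}/sstables/{sstable_name}"

day = datetime.date(2011, 4, 21)
assert full_backup_key("prod", "cass-1", day) == "prod/cass-1/20110421/full.tar.gz"
```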
or
similar on the resultset rather than returning an error.
Terje
On Thu, Apr 21, 2011 at 5:01 AM, Adrian Cockcroft
adrian.cockcr...@gmail.com wrote:
Hi Terje,
If you feed data to two rings, you will get inconsistency drift, as an
update to one succeeds while the update to the other fails from time to time.
If you want to use local quorum for a distributed setup, it doesn't
make sense to have less than RF=3 local and remote. Three copies at
both ends will give you high availability. Only one copy of the data
is sent over the wide area link (with recent versions).
There is no need to use mirrored or
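Why less than RF=3 per datacenter makes little sense for LOCAL_QUORUM can be made concrete: quorum needs a majority of the local replicas, so the failures a DC can tolerate is RF minus that majority. A small sketch:

```python
def local_quorum(rf):
    """Replicas that must answer within one datacenter (majority)."""
    return rf // 2 + 1

def tolerated_failures(rf):
    """Nodes that can be down in a DC while LOCAL_QUORUM still succeeds."""
    return rf - local_quorum(rf)

# RF=3 per datacenter: quorum is 2, so one node per DC can be down.
assert local_quorum(3) == 2 and tolerated_failures(3) == 1
# RF=2: quorum is also 2, so no failures are tolerated at quorum.
assert tolerated_failures(2) == 0
```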
We have similar requirements for wide area backup/archive at Netflix.
I think what you want is a replica with an RF of at least 3 in NY for all
the satellites; each satellite could then have a lower RF, but if you want
safe local quorum I would use 3 everywhere.
Then NY is the sum of all the
the replication factor of the
satellites to 1 and NY to 3, we'll run out of space very quickly in
the satellites.
On Thu, Apr 14, 2011 at 11:23 AM, Adrian Cockcroft
acockcr...@netflix.com wrote:
We have similar requirements for wide area backup/archive at Netflix.
I think what you want
How many nodes do you have? You should be able to run a rolling compaction
around the ring, one node at a time, to minimize impact. If taking one node
out at a time is too big an impact, maybe you should have a bigger cluster.
If you are on EC2, try running more but smaller instances.
Adrian
From: shimi
The book is available via safarionline.com - many educational institutions
have group memberships, and there is a free trial for individuals. It's run
by the publishers, and the authors get paid. There is a NoSQL training video
there as well.
Adrian
On Dec 29, 2010, at
What filesystem are you using? You might try ext3 or ext4 vs. XFS as another
area of diversity. It sounds as if the page cache or filesystem is messed up.
Are there any clues in /var/log/messages? How much swap space do you have
configured?
The kernel-level debug stuff I know is all for Solaris
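One of the checks suggested above (how much swap is configured) can be scripted. A small sketch that parses `/proc/swaps`-style output on Linux; in practice you would feed it `open("/proc/swaps").read()`:

```python
def parse_proc_swaps(text):
    """Return total configured swap in KB from /proc/swaps-style text."""
    lines = text.strip().splitlines()[1:]  # skip the header row
    return sum(int(line.split()[2]) for line in lines)

sample = """Filename Type Size Used Priority
/dev/sda2 partition 2097148 0 -1"""
assert parse_proc_swaps(sample) == 2097148  # ~2 GB of swap configured
```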
I'm currently working to configure AppDynamics to monitor Cassandra. It
does byte-code instrumentation, so there is an agent added to the
Cassandra JVM, which gives the ability to capture latency for requests and
see where the bottleneck is coming from. We have been using it on our
other Java