our dataset is too big to fit into cache, so we are hitting disk. not a problem for normal operation, but when a node is restored, receives hinted handoff, is load balanced, or when reads/writes simply build up, we see a problem: the nodes can't seem to catch up. this seems to be centered around drive seek time, not cassandra per se.

to combat this we are doing the following:

- add more, smaller drives per machine in RAID 0 to combat drive seek time.
- scale horizontally - add more machines to the cluster to spread the load.
- we also plan to try out SSDs.
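
A rough back-of-the-envelope sketch of why more spindles help with seek-bound load. The seek time and RPM below are assumed, typical-looking figures for a 7200 rpm SATA drive, not measurements from our cluster, and the function names are made up for illustration. The RAID 0 number is an upper bound that assumes random reads spread evenly across all drives.

```python
def random_iops_per_disk(avg_seek_ms, rpm):
    """Approximate random-read IOPS of one spinning disk.

    Each random read pays an average seek plus, on average,
    half a platter revolution of rotational latency.
    """
    rotational_latency_ms = 0.5 * 60_000.0 / rpm
    return 1000.0 / (avg_seek_ms + rotational_latency_ms)


def raid0_random_iops(n_disks, avg_seek_ms, rpm):
    """Optimistic upper bound for a RAID 0 stripe of identical disks.

    Assumes requests are small and land evenly on all members.
    """
    return n_disks * random_iops_per_disk(avg_seek_ms, rpm)


if __name__ == "__main__":
    # Assumed drive characteristics: 8.5 ms average seek, 7200 rpm.
    for n in (1, 2, 4, 8):
        iops = raid0_random_iops(n, avg_seek_ms=8.5, rpm=7200)
        print(f"{n} disk(s): ~{iops:.0f} random reads/sec")
```

The per-disk figure stays under a hundred or so random reads per second for any plausible seek time, so the number of spindles (or moving to SSDs) matters far more than the capacity of each drive.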

Jonathan Ellis wrote:
Yes, but I would guess 90% of workloads are better served with
spending the extra money on more machines w/ cheap sata disks and lots
of ram.


On Sun, Mar 7, 2010 at 1:00 PM, Boris Shulman <shulm...@gmail.com> wrote:
Do you think having SAS disks will give better performance?

On Sat, Mar 6, 2010 at 5:47 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
I think http://wiki.apache.org/cassandra/CassandraHardware answers
most of your questions.

If possible, it's definitely useful to try out a small fraction of
your anticipated workload against a test cluster, even a single node,
before finalizing your production hardware purchase.

On Sat, Mar 6, 2010 at 1:12 AM, Rosenberry, Eric
<eric.rosenbe...@iovation.com> wrote:
I am looking for advice from others that are further along in deploying
Cassandra in production environments than we are.  I want to know what you
are finding your bottlenecks to be.  I would feel silly purchasing dual
processor quad core 2.93ghz Nehalem machines with 192 gigs of RAM just to
find out that the two local SATA disks kept all that CPU and RAM from being
useful (clearly that example would be dumb).

I need to spec out hardware for an “optimal” Cassandra node (though our
read/write characteristics are not yet fully defined so let’s go with an
“average” configuration).

My main concern is finding the right balance of:

- Available CPU
- RAM amount
- RAM speed (think Nehalem architecture where memory comes in a few
  speeds, though I doubt this is much of a concern as it is mainly dictated by
  which processor you buy and how many slots you populate)
- Total iops available (i.e. number of disks)
- Total disk space available (depending on the ratio of iops/space
  deciding on SAS vs. SATA and various rotational speeds)

My current thinking is 1U boxes with four 3.5 inch disks since that seems to
be a readily available config.  One big question is should I go with a
single processor Nehalem system to go with those four disks, or would two
CPUs be useful, and also, how much RAM is appropriate to match?  I am
making the assumption that Cassandra nodes are going to be disk bound as
they must do a random read to answer any given query (i.e. indexes in RAM,
but all data lives on disk?).

The other big decision is what type of hard disks others are finding to
provide the optimal ratio of iops to available space?  SAS or SATA?  And
what rotational speed?

Let me throw out here an actual hardware config and feel free to tell me the
error of my ways:

- A SuperMicro SuperServer 6016T-NTRF configured as follows:
  - 2.26 GHz E5520 dual processor quad core hyperthreaded Nehalem
    architecture (this proc provides a lot of bang for the buck, faster procs
    get more expensive quickly)
  - Qty 12, 4 gig 1066 MHz DIMMs for a total of 48 gigs RAM (the 4 gig DIMMs
    seem to be the price sweet spot)
  - Dual on-board 1 gigabit NICs (perhaps one for client connections and
    the other for cluster communication?)
  - Dual power supplies (I don’t want to lose half my cluster due to a
    failure on one power leg)
  - 4x 1TB SATA disks (this is a complete SWAG)
  - No RAID controller (all just single individual disks presented to the
    OS), though is there any downside to using a RAID controller with RAID 0
    (perhaps one single disk for the log for sequential io’s, and 3x disks in a
    stripe for the random io’s)?
  - The on-board IPMI based OOB controller (so we can kick the boxes
    remotely if need be)


I can’t help but think the above config has way too much RAM and CPU and not
enough iops capacity.  My understanding is that Cassandra does not cache
much in RAM though?
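
One hedged way to put numbers on the RAM versus iops question is to ask what fraction of the on-disk data could sit in memory at all, and how many reads would still reach the disks. The sketch below uses the 48 gigs of RAM and 4x 1TB of disk from the config above. Everything else is an assumption for illustration: the 8 gigs reserved for the JVM and OS, the 1000 reads/sec request rate, and the uniform access pattern (real workloads with a hot subset would cache much better). The function names are hypothetical.

```python
def cacheable_fraction(ram_gb, reserved_gb, data_gb):
    """Fraction of the on-disk data that could fit in leftover RAM."""
    if data_gb <= 0:
        return 1.0
    usable_gb = max(ram_gb - reserved_gb, 0.0)
    return min(usable_gb / data_gb, 1.0)


def disk_reads_per_sec(read_rate, hit_ratio):
    """Reads per second that miss memory and must go to disk."""
    return read_rate * (1.0 - hit_ratio)


if __name__ == "__main__":
    # From the proposed config: 48 GB RAM, 4 x 1 TB disks (assumed full).
    # Assumed: 8 GB reserved for JVM heap and OS, uniform random access,
    # so the hit ratio equals the cacheable fraction.
    hit = cacheable_fraction(ram_gb=48, reserved_gb=8, data_gb=4000)
    misses = disk_reads_per_sec(read_rate=1000, hit_ratio=hit)
    print(f"cacheable fraction: {hit:.1%}")
    print(f"disk reads/sec at 1000 reads/sec: {misses:.0f}")
```

Under those assumptions almost every read still goes to disk, which is the scenario where four SATA spindles would be the bottleneck long before CPU or RAM.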

Any thoughts are appreciated.  Thanks.


Eric Rosenberry
Sr. Infrastructure Architect | Chief Bit Plumber

111 SW Fifth Avenue
Suite 3200
Portland, OR 97204

