Do you think having SAS disks will give better performance?
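For a rough sense of the gap, here is a back-of-the-envelope sketch of random-read iops per spinning disk, using the usual approximation of one I/O per (average seek time + half a rotation). The function name and the seek times below are illustrative assumptions, not measured or vendor figures, and the model ignores caching, command queueing, and sequential access, so treat the output as a ballpark only.

```python
def random_iops(rpm: float, avg_seek_ms: float) -> float:
    """Estimate random-read IOPS for a single spinning disk."""
    # On average the platter must turn half a revolution before the
    # requested sector passes under the head.
    rotational_latency_ms = 0.5 * 60_000.0 / rpm
    # One random I/O costs roughly one seek plus that rotational latency.
    return 1000.0 / (avg_seek_ms + rotational_latency_ms)


# Assumed, illustrative average seek times (ms); substitute the values
# from the data sheet of the drive actually being considered.
drives = {
    "7200 rpm SATA": (7200, 8.5),
    "10k rpm SAS": (10_000, 4.5),
    "15k rpm SAS": (15_000, 3.5),
}

for name, (rpm, seek_ms) in drives.items():
    per_disk = random_iops(rpm, seek_ms)
    # Four disks per node, as in the 1U config described in the quoted message.
    print(f"{name}: ~{per_disk:.0f} iops per disk, ~{4 * per_disk:.0f} for four disks")
```

Under these assumptions the faster-spinning drives come out ahead on random reads per spindle, which is the case that matters if the nodes turn out to be disk bound. The trade-off is less space per disk.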

On Sat, Mar 6, 2010 at 5:47 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> I think http://wiki.apache.org/cassandra/CassandraHardware answers
> most of your questions.
>
> If possible, it's definitely useful to try out a small fraction of
> your anticipated workload against a test cluster, even a single node,
> before finalizing your production hardware purchase.
>
> On Sat, Mar 6, 2010 at 1:12 AM, Rosenberry, Eric
> <eric.rosenbe...@iovation.com> wrote:
>> I am looking for advice from others that are further along in deploying
>> Cassandra in production environments than we are. I want to know what you
>> are finding your bottlenecks to be. I would feel silly purchasing dual
>> processor quad core 2.93 GHz Nehalem machines with 192 gigs of RAM just to
>> find out that the two local SATA disks kept all that CPU and RAM from being
>> useful (clearly that example would be dumb).
>>
>> I need to spec out hardware for an “optimal” Cassandra node (though our
>> read/write characteristics are not yet fully defined so let’s go with an
>> “average” configuration).
>>
>> My main concern is finding the right balance of:
>>
>> · Available CPU
>>
>> · RAM amount
>>
>> · RAM speed (think Nehalem architecture where memory comes in a few
>> speeds, though I doubt this is much of a concern as it is mainly dictated by
>> which processor you buy and how many slots you populate)
>>
>> · Total iops available (i.e. number of disks)
>>
>> · Total disk space available (depending on the ratio of iops/space
>> deciding on SAS vs. SATA and various rotational speeds)
>>
>> My current thinking is 1U boxes with four 3.5 inch disks since that seems to
>> be a readily available config. One big question is should I go with a
>> single processor Nehalem system to go with those four disks, or would two
>> CPUs be useful, and also, how much RAM is appropriate to match? I am
>> making the assumption that Cassandra nodes are going to be disk bound as
>> they must do a random read to answer any given query (i.e. indexes in RAM,
>> but all data lives on disk?).
>>
>> The other big decision is what type of hard disks others are finding to
>> provide the optimal ratio of iops to available space? SAS or SATA? And
>> what rotational speed?
>>
>> Let me throw out here an actual hardware config and feel free to tell me the
>> error of my ways:
>>
>> · A SuperMicro SuperServer 6016T-NTRF configured as follows:
>>
>> o 2.26 GHz E5520 dual processor quad core hyperthreaded Nehalem
>> architecture (this proc provides a lot of bang for the buck, faster procs
>> get more expensive quickly)
>>
>> o Qty 12, 4 gig 1066 MHz DIMMs for a total of 48 gigs RAM (the 4 gig DIMMs
>> seem to be the price sweet spot)
>>
>> o Dual on board 1 gigabit NICs (perhaps one for client connections and
>> the other for cluster communication?)
>>
>> o Dual power supplies (I don’t want to lose half my cluster due to a
>> failure on one power leg)
>>
>> o 4x 1TB SATA disks (this is a complete SWAG)
>>
>> o No RAID controller (all just single individual disks presented to the
>> OS) – Though is there any down side to using a RAID controller with RAID 0
>> (perhaps one single disk for the log for sequential I/Os, and 3x disks in a
>> stripe for the random I/Os)
>>
>> o The on-board IPMI based OOB controller (so we can kick the boxes
>> remotely if need be)
>>
>> · http://www.supermicro.com/products/system/1U/6016/SYS-6016T-NTRF.cfm
>>
>> I can’t help but think the above config has way too much RAM and CPU and not
>> enough iops capacity. My understanding is that Cassandra does not cache
>> much in RAM though?
>>
>> Any thoughts are appreciated. Thanks.
>>
>> -Eric
>>
>> _______________________________________________________________
>> Eric Rosenberry
>> Sr. Infrastructure Architect | Chief Bit Plumber
>>
>> iovation
>> 111 SW Fifth Avenue
>> Suite 3200
>> Portland, OR 97204
>> www.iovation.com