Thanks for the detailed answers Dan, what you said makes sense. I think my biggest worry right now is making correct predictions of my data storage needs based on measurements from the current cluster. Other than that I should be fairly comfortable with the rest of the HW specs.
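To make sure I extrapolate correctly, this is roughly the back-of-the-envelope calculation I plan to run against my current column counts. The 15-byte per-column and 23-byte per-row overheads are just what I took away from Todd's post about the pre-1.0 storage format, and the sample sizes come from my U32 measurements, so please correct me if any of this is off:

# Back-of-the-envelope disk estimate - the overhead constants below are my
# assumptions taken from Todd's post, not official numbers.
COLUMN_OVERHEAD_BYTES = 15   # assumed per-column overhead (timestamp, lengths, flags)
ROW_OVERHEAD_BYTES = 23      # assumed per-row overhead
REPLICATION_FACTOR = 3

def row_size(key_bytes, columns):
    """columns is a list of (name_bytes, value_bytes) pairs."""
    size = ROW_OVERHEAD_BYTES + key_bytes
    for name_bytes, value_bytes in columns:
        size += COLUMN_OVERHEAD_BYTES + name_bytes + value_bytes
    return size

# One U32 sample per column: 8-byte timestamp as the column name, 4-byte value.
# 360 samples per row and a 32-byte row key are made-up numbers.
print(row_size(32, [(8, 4)] * 360))   # ~9.8 KB per row, mostly metadata, not payload

# Scale up the measured U32 rate: 2 million samples per 8 minutes.
samples_per_day = 2000000 * (24 * 60 / 8.0)
bytes_per_day = samples_per_day * (COLUMN_OVERHEAD_BYTES + 8 + 4)
print(bytes_per_day * REPLICATION_FACTOR / 2**30)  # ~27 GB/day for the U32s alone,
                                                   # before compaction headroom

If that is roughly right, the 4-byte payload is almost negligible next to the column names, timestamps and overhead, which matches what I am seeing on disk.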
Thanks for the observation Mohit, I'll keep a closer eye on this disk parameter, which I do see in the specs all the time. Todd, your link answers questions I have had for quite some time... I have found that I am indeed dominated by metadata, like one of the examples shows.

Since we're on this subject, I want to ask you guys another question. My monitoring data sources sit within an enclosed network, so my Cassandra cluster will also be in that enclosed network (by enclosed network I mean that any communication or data transfer in and out of the network must go through a gateway). The problem is I need to make the data available outside! Do any of you have suggestions for doing that?

My first thought was to have an internal 3-node cluster taking the insertion load and then, during a period of low load, do a major compaction and ship the data out to an external Cassandra node used only for reading. This outside node would need a lot of disk (it has to hold a year of data) and be optimised for reading - I was thinking of putting an SSD caching layer between my bulk storage and Cassandra, with only hot data going into that layer... somehow!

So my questions:

1) Is my method unheard of, or does it sound reasonable?

2) What is the best way to transfer data from the cluster inside the enclosed network to the node outside? I heard some time ago that there is a tool that does bulk transfers of data, but I'm not sure how that would work in practice... a script that calls this tool on a certain trigger? Any ideas? (A rough sketch of what I mean is below.)

3) Is this intermediate SSD cache idea doable, or should I just stick to a normal RAID array of disks plus the indexes and in-memory column caching that Cassandra offers?
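For question 2, something along these lines is what I had in mind: a script run on one of the internal nodes during the low-load window that compacts, snapshots, pushes the snapshot through the gateway, and then streams it into the external node with sstableloader. All the host names, paths and the keyspace below are made up, and I am not sure of the exact nodetool/sstableloader options for each version, so please treat it as a sketch of the idea only:

#!/usr/bin/env python
# Sketch of the "compact, snapshot, ship out, bulk load" idea from question 2.
# Assumes nodetool and sstableloader from the Cassandra distribution are on
# the PATH; the keyspace, hosts and paths are invented placeholders.
import subprocess

KEYSPACE = "monitoring"                  # hypothetical keyspace name
DATA_DIR = "/var/lib/cassandra/data"     # default data file directory
GATEWAY = "gateway.example.org"          # the only host allowed to cross out
STAGING = "/staging/%s" % KEYSPACE       # drop point on the gateway

def run(cmd):
    print("+ " + " ".join(cmd))
    subprocess.check_call(cmd)

# 1. During the low-load window: major compaction, then snapshot everything
#    (newer nodetool versions can restrict the snapshot to one keyspace).
run(["nodetool", "-h", "localhost", "compact", KEYSPACE])
run(["nodetool", "-h", "localhost", "snapshot"])

# 2. Push the snapshotted sstables through the gateway. The snapshot directory
#    layout differs between versions; in practice you would copy the contents
#    of the newest snapshot tag directory so STAGING ends up holding the
#    sstable files directly.
snapshot_dir = "%s/%s/snapshots/" % (DATA_DIR, KEYSPACE)
run(["rsync", "-av", snapshot_dir, "%s:%s/" % (GATEWAY, STAGING)])

# 3. From the gateway, stream the sstables into the external read-only node.
#    sstableloader picks up its target cluster from the cassandra.yaml it is
#    run with, so that copy has to point at the external node.
run(["ssh", GATEWAY, "sstableloader", STAGING])

Is sstableloader the bulk tool you were thinking of, or is there a better way to do this kind of one-way export?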
Cheers,
Alex

On Tue, Oct 25, 2011 at 9:06 PM, Todd Burruss <bburr...@expedia.com> wrote:
> This may help determine your data storage requirements ...
>
> http://btoddb-cass-storage.blogspot.com/
>
>
> On 10/25/11 11:22 AM, "Mohit Anchlia" <mohitanch...@gmail.com> wrote:
>
> >On Tue, Oct 25, 2011 at 11:18 AM, Dan Hendry <dan.hendry.j...@gmail.com>
> >wrote:
> >>> 2. ... So I am going to use rotational disk for the commit log and an
> >>> SSD for data. Does this make sense?
> >>
> >> Yes, just keep in mind however that the primary characteristic of SSDs
> >> is lower seek times, which translates into faster random access. We have
> >> a similar Cassandra use case (time series data and comparable volumes)
> >> and decided the random read performance boost (unquantified in our case,
> >> to be fair) was not worth the price, so we went with more, larger,
> >> cheaper 7.2k HDDs.
> >>
> >>> 3. What's the best way to find out how big my commitlog disk and my
> >>> data disk has to be? The Cassandra hardware page says the Commitlog
> >>> disk shouldn't be big but still I need to choose a size!
> >>
> >> As of Cassandra 1.0, the commit log has an explicit size bound
> >> (defaulting to 4GB I believe). In 0.8, I don't think I have ever seen my
> >> commit log grow beyond that point, but the limit should be the amount of
> >> data you insert within the maximum CF timed flush period
> >> ("memtable_flush_after" parameter; to be safe, the maximum across all
> >> CFs). Any modern drive should be sufficient. As for the size of your
> >> data disks, that is largely application dependent, and you should be
> >> able to judge best based on your current cluster.
> >>
> >>> 4. I also noticed RAID 0 configuration is recommended for the data file
> >>> directory. Can anyone explain why?
> >>
> >> In comparison to RAID 1/RAID 1+0? For any RF > 1, Cassandra already
> >> takes care of redundancy by replicating the data across multiple nodes.
> >> Your application's choice of replication factor and read/write
> >> consistencies should be specified to tolerate a node failing (for any
> >> reason: disk failure, network failure, a disgruntled employee taking a
> >> sledgehammer to the box, etc.). As such, what is the point of wasting
> >> your disks duplicating data on a single machine to minimize the chances
> >> of one particular type of failure when it should not matter anyway?
> >
> >It all boils down to operations cost vs hardware cost. Also consider
> >MTBF and how equipped you are to handle disk failures, which are more
> >common than other kinds of failure.
> >>
> >> Dan
> >>
> >> From: Alexandru Sicoe [mailto:adsi...@gmail.com]
> >> Sent: October-25-11 8:23
> >> To: user@cassandra.apache.org
> >> Subject: Cassandra cluster HW spec (commit log directory vs data file
> >> directory)
> >>
> >> Hi everyone,
> >>
> >> I am currently in the process of writing a hardware proposal for a
> >> Cassandra cluster for storing a lot of monitoring time series data. My
> >> workload is write intensive and my data set is extremely varied in types
> >> of variables and insertion rates for these variables (I will have to
> >> handle on the order of 2 million variables coming in, each at very
> >> different rates - the majority of them will come at very low rates, but
> >> there are many that will come at higher constant rates and a few coming
> >> in with huge spikes in rates). These variables correspond to all basic
> >> C++ types and arrays of these types. The highest insertion rates are
> >> received for basic types, out of which U32 variables seem to be the most
> >> prevalent (e.g. I recorded 2 million U32 vars inserted in 8 mins of
> >> operation, while 600,000 doubles and 170,000 strings were inserted
> >> during the same time. Note this measurement was only for a subset of the
> >> total data currently taken in).
> >>
> >> At the moment I am partitioning the data in Cassandra in 75 CFs (each CF
> >> corresponds to a logical partitioning of the set of variables mentioned
> >> before - but this partitioning is not related to the amount of data or
> >> rates... it is somewhat random). These 75 CFs account for ~1 million of
> >> the variables I need to store. I have a 3 node Cassandra 0.8.5 cluster
> >> (each node has 4 real cores and 4 GB RAM, with the commit log directory
> >> and the data file directory split between two RAID arrays of HDDs). I
> >> can handle the load in this configuration but the average CPU usage of
> >> the Cassandra nodes is slightly above 50%. As I will need to add 12 more
> >> CFs (corresponding to another ~1 million variables) plus potentially
> >> other data later, it is clear that I need better hardware (also for the
> >> retrieval part).
> >>
> >> I am looking at Dell servers (PowerEdge etc.)
> >>
> >> Questions:
> >>
> >> 1. Is anyone using Dell HW for their Cassandra clusters? How do they
> >> behave? Anybody care to share their configurations or tips for buying,
> >> what to avoid etc?
> >> 2. Obviously I am going to keep to the advice on
> >> http://wiki.apache.org/cassandra/CassandraHardware and split the
> >> commitlog and data on separate disks. I was going to use an SSD for the
> >> commitlog but then did some more research and found out that it doesn't
> >> make sense to use SSDs for sequential appends because they won't have a
> >> performance advantage over rotational media. So I am going to use a
> >> rotational disk for the commit log and an SSD for data. Does this make
> >> sense?
> >>
> >> 3. What's the best way to find out how big my commitlog disk and my data
> >> disk has to be? The Cassandra hardware page says the Commitlog disk
> >> shouldn't be big but still I need to choose a size!
> >>
> >> 4. I also noticed RAID 0 configuration is recommended for the data file
> >> directory. Can anyone explain why?
> >>
> >> Sorry for the huge email...
> >>
> >> Cheers,
> >> Alex
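P.S. Dan, just to check I understood your commit log sizing rule, here is the arithmetic I would do with my measured rates. The per-write overhead, the average string size and the 60-minute flush period are all made-up assumptions, only the counts come from my 8-minute measurement:

# Rough check of Dan's rule: commit log size ~= data inserted within the
# largest memtable_flush_after period across all CFs.
U32_BYTES, DOUBLE_BYTES, STRING_BYTES = 4, 8, 64   # assumed average value sizes
WRITE_OVERHEAD = 30                                # assumed per-write overhead

# (count, value_size) pairs measured over 8 minutes
writes_per_8min = [(2000000, U32_BYTES), (600000, DOUBLE_BYTES), (170000, STRING_BYTES)]
bytes_per_8min = sum(n * (size + WRITE_OVERHEAD) for n, size in writes_per_8min)

flush_after_minutes = 60.0                         # assumed maximum across CFs
commit_log_bound = bytes_per_8min * (flush_after_minutes / 8.0)
print(commit_log_bound / 2**30)                    # ~0.75 GB -> a small disk is fine

So unless I misunderstood the rule, even a small dedicated commit log disk should be more than enough for my write rates.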