We are using Dell 1950s: 8 CPUs, 16 GB RAM, dual 1 TB disks. You can get machines in this range for around $2k. I run HBase on 1 TB of data on 20 of these. You can probably look at doing 15+ machines.
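As a rough sanity check on node count versus data size, here is a back-of-the-envelope sketch. The replication factor of 3 is the HDFS default; the 25% headroom for compactions and temporary files is purely an assumption for illustration, and the function name is hypothetical.

```python
# Rough capacity sketch for nodes with dual 1 TB disks.
# Assumptions (not from the thread): HDFS replication factor 3 (the default)
# and 25% of raw disk held back as headroom for compactions and temp files.

def usable_tb(nodes, disks_per_node=2, tb_per_disk=1.0, replication=3, headroom=0.25):
    """Estimate how many TB of data fit after replication and reserved headroom."""
    raw = nodes * disks_per_node * tb_per_disk
    return raw * (1.0 - headroom) / replication

for n in (15, 20):
    print(n, "nodes ->", usable_tb(n), "TB usable")
```

Under those assumptions, disk capacity at 15 to 20 nodes is well above the ~2 TB in question, so the node count is driven more by RAM and IO than by raw storage.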
The master machine doesn't do much work, but it has to be reliable: RAID, dual power supplies, etc. If the namenode goes down, it takes your entire system down. I run mine on a standard node, but with some of the dual power features enabled.

The regionservers do far more work, so in theory you could have a smaller master, but not too small. It's probably best to stick to one node type and keep it cheap.

You can run ZK on those nodes, but if you run into IO wait issues, you might see stalls that hurt badly. I'd avoid running massive map-reduce jobs with large intermediate output on these machines.

-ryan

On Tue, Aug 11, 2009 at 4:14 PM, llpind<[email protected]> wrote:
>
> Thanks for the link. I will keep that in mind.
>
> Yeah 256MB isn't much. Moving up to 3-4G for 10-15 boxes gets expensive.
>
>
> Alejandro Pérez-Linaza wrote:
>>
>> You might want to check out www.rackspacecloud.com where you can get boxes
>> and pay by the hour (as cheap as $0.015 / hour for a 256MB box). We used
>> it a couple of weeks ago to set up a MySQL Cluster test and ended up having
>> around 18 boxes. Memory can be changed from 256MB to 16GB in a couple of
>> minutes. They also have various flavors to choose from.
>>
>> The bottom line is that we love it and it solves the problem of the "test
>> boxes" that you would need right away.
>>
>> Have fun,
>>
>> Alex
>>
>> Alejandro Pérez-Linaza
>> CEO
>> Vertical Technologies, LLC
>>
>> -----Original Message-----
>> From: llpind [mailto:[email protected]]
>> Sent: Tuesday, August 11, 2009 12:21 PM
>> To: [email protected]
>> Subject: HBase in a real world application
>>
>> As some of you know, I've been playing with HBase on/off for the past few
>> months.
>>
>> I'd like your take on some cluster setup/configuration settings that you've
>> found successful. Also, any other thoughts on how I can make the case for
>> using HBase.
>>
>> Assume: Working with ~2 TB of data. A few very tall tables. Hadoop/HBase
>> 0.20.0.
>>
>> 1. What specs should a master box have (speed, HD, RAM)? Should slave
>> boxes be different?
>> 2. Recommended size of cluster? I realize this depends on what
>> load/performance requirements we have, but I'd like to know your thoughts
>> based on #1 specs.
>> 3. Should zookeeper quorums run on different boxes than regionservers?
>>
>> Basically, if you could give some example cluster configurations with the
>> amount of data you're working with, that would be a lot of help (or point me
>> to a place where this has been discussed for .20). Currently I don't have the
>> funds to play around with a lot of boxes, but I hope to soon. :) Thanks.
