Oleg,

Do you have Ganglia or some other graphing tool running against the
cluster? It gives you metrics that are crucial here, for example the
load on Hadoop and its DataNodes, as well as insertion rates on
HBase. The compaction queue is also interesting to watch, to see if
the cluster is falling behind.

Did you try loading into an empty system, or was it already filled
and you are trying to add more? Are you spreading the load across
servers, or are you using sequential keys that tax only one server
at a time?
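
The sequential-key problem can be worked around by salting the row
keys. Here is a minimal sketch of the idea; the bucket count, key
format, and helper name are my own assumptions for illustration, not
something from your setup:

```python
import hashlib

def salt_key(seq_key: str, buckets: int = 16) -> str:
    """Prefix a sequential key with a stable hash-derived bucket number.

    Keys that would otherwise sort adjacently (and so land on one
    region server) get spread across up to `buckets` key ranges.
    """
    # md5 is used only for a stable, well-distributed hash, not security.
    bucket = int(hashlib.md5(seq_key.encode()).hexdigest(), 16) % buckets
    return f"{bucket:02d}-{seq_key}"

# Example: sequential log keys now scatter across bucket prefixes.
for i in range(4):
    print(salt_key(f"20101122-{i:04d}"))
```

The trade-off is that scans for a key range then have to fan out
across all buckets, so this only pays off for write-heavy tables.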

16GB should work, but is not ideal. The various daemons simply need
room to breathe. That said, I have personally started with as little
as 12GB and it worked.

Lars

On Mon, Nov 22, 2010 at 12:17 PM, Oleg Ruchovets <[email protected]> wrote:
> On Sun, Nov 21, 2010 at 10:39 PM, Krishna Sankar <[email protected]>wrote:
>
>> Oleg & Lior,
>>
>> Couple of questions & couple of suggestions to ponder:
>> A)  When you say 20 Name Servers, I assume you are talking about 20 Task
>> Servers
>>
>
> Yes
>
>
>> B)  What type are your M/R jobs ? Compute Intensive vs. storage intensive ?
>>
>
> M/R -- most of it is parsing; only 5%-10% of the results are stored
> in HBase.
>
>
>> C)  What is your Data growth ?
>>
>
>  Currently we have 50GB per day; it could grow to ~150GB.
>
>
>> D)  With the current jobs, are you saturating RAM ? CPU ? Or storage ?
>>
>    The map phase takes 100% CPU since it is parsing and the input
> files are gzipped.
>    We definitely have memory issues.
>
>
>> Ganglia/Hadoop metrics should tell.
>> E)  Also are your jobs long running or short tasks ?
>>
>    Map tasks take from 5 seconds to 2 minutes.
>    The reducer (insertion into HBase) takes ~3 hours.
>
>
>> Suggestions:
>> A)  Your name node could be 32 GB, 2TB disk. Make sure it is an enterprise
>> class server and also back up to an NFS mount.
>> B)  Also have a decent machine as the checkpoint name node. It could be
>> similar to the task nodes.
>> C)  I assume by Master Machine you mean the Job Tracker. It could be similar
>> to the Task Trackers - 16/24 GB memory, with 4-8 TB disk.
>> D)  As Jean-Daniel pointed out, 500GB (with more spindles) is what I would
>> also recommend. But it also depends on your primary data, intermediate
>> data and final data size. 1 or 2 TB disks are also fine, because they give
>> you more storage. I assume you have the default replication of 3.
>> E)  A 1Gb dedicated network would be good. As there are only ~25 machines,
>> you can hang them off of a good Gb switch. Consider 10Gb in the future if
>> there is too much intermediate data traffic.
>> Cheers
>> <k/>
>>
>> On 11/21/10 Sun Nov 21, 10, "Oleg Ruchovets" <[email protected]> wrote:
>>
>> >Hi all,
>> >After testing HBase for a few months with a very light configuration (5
>> >machines, 2 TB disk, 8 GB RAM), we are now planning for production.
>> >Our Load -
>> >1) 50GB log files to process per day by Map/Reduce jobs.
>> >2)  Insert 4-5GB to 3 tables in hbase.
>> >3) Run 10-20 scans per day (scanning about 20 regions in a table).
>> >All this should run in parallel.
>> >Our current configuration can't cope with this load and we are having many
>> >stability issues.
>> >
>> >This is what we have in mind :
>> >1. Master machine - 32 GB, 4 TB, Two quad core CPUs.
>> >2. Name node - 16 GB, 2TB, Two quad core CPUs.
>> >we plan to have up to 20 name servers (starting with 5).
>> >
>> >We already read
>> >http://www.cloudera.com/blog/2010/03/clouderas-support-team-shares-some-basic-hardware-recommendations/
>> >
>> >We would appreciate your feedback on our proposed configuration.
>> >
>> >
>> >Regards Oleg & Lior
>>
>>
>>
>