We have a Hadoop cluster continuously running multiple MapReduce jobs on log files that can be up to 10 GB per day. Our log file volume varies a lot depending on the time of year, so every now and then we need to do capacity planning and come up with a forecast, given a projection of log file sizes or a projection of the number of jobs. I need some tips on how to forecast capacity requirements for a Hadoop cluster. Is capacity derivable as a function of current hardware load, log size, and future growth? Does the JobTracker expose any performance data that would help with this planning?
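As a rough illustration of the kind of function I have in mind, here is a minimal sketch. The numbers and names (daily_log_gb, gb_per_slot_hour, the 30% headroom, compound monthly growth) are assumptions for illustration, not measurements from our cluster, and it presumes processing time scales roughly linearly with input size:

    import math

    def required_map_slots(daily_log_gb: float,
                           gb_per_slot_hour: float,
                           processing_window_hours: float,
                           headroom: float = 1.3) -> int:
        """Estimate map slots needed to process one day's logs within the
        allowed window, padded with some headroom for peaks."""
        slot_hours_needed = daily_log_gb / gb_per_slot_hour
        slots = slot_hours_needed / processing_window_hours
        return math.ceil(slots * headroom)

    def forecast_daily_gb(current_daily_gb: float,
                          monthly_growth: float,
                          months: int) -> float:
        """Project daily log volume forward assuming compound monthly growth."""
        return current_daily_gb * (1 + monthly_growth) ** months

    if __name__ == "__main__":
        # Hypothetical inputs: 10 GB/day today, 5% monthly growth, 12 months out,
        # 2 GB processed per map slot per hour, 6-hour processing window.
        future_gb = forecast_daily_gb(current_daily_gb=10, monthly_growth=0.05, months=12)
        print(required_map_slots(daily_log_gb=future_gb,
                                 gb_per_slot_hour=2.0,
                                 processing_window_hours=6))

Is this style of extrapolation reasonable, or is there a better-established way to do it?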
Thanks & Regards,
Sandhya