A previous post to core-user mentioned a formula for determining job
time. I was wondering if anyone out there is trying to design a
formula that can calculate the run time of a map/reduce program.
Obviously there are many variables here, including but not limited to
disk speed, network speed, processor speed, input data, many
constants, data skew, map complexity, reduce complexity, and number
of nodes.

As an intellectual challenge, has anyone started trying to write a
formula that takes all these factors into account and actually
predicts a job time in minutes or hours?
