Hi,
We recently ran some experiments on MapReduce job scheduling and found that sometimes two jobs were running on the same machine at the same time, and each of them ran very slowly. We used to think that the second job would wait for the first to free the slave machine it occupied before starting to run, but it seems that this is wrong.

Our questions are:

(1) How does this scenario happen? Is it because there is a workload threshold, and a slave machine that has not reached the threshold will accept a new task even though another task is already running on it?

(2) If (1) is true, how can we avoid it? If (1) is not true, what is the reason for this scenario, and how can we avoid it?

Thanks very much in advance. :)

Best regards,
Wisteria.Lavender

One is never too old to learn. ^^
