prajyot bankade wrote:
Hello Everyone,
I have just started reading about the Hadoop job tracker. In one book I read
that there is only one job tracker, which is responsible for distributing tasks
to the worker systems. Please correct me if I say something wrong.
I have a few questions.
Why is there only one job tracker?
To provide a single place to make scheduling decisions.
What will happen if that job tracker fails or crashes?
Look at the source. You will find it saves state to the filesystem.
Can we have more than one job tracker?
Yes. If you partition up your workers and bind them to different JTs,
you can have more than one JT per HDFS filesystem, but it complicates
locality, as each JT only schedules work to its own workers, so more data
has to be read over the network. I hope your network cables are fat enough.
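To make the locality point concrete, here is a rough sketch (not Hadoop code). It assumes each block is described by the hosts holding its replicas, and that a JT can only place tasks on its own workers. The host names w1..w4 and the function name local_fraction are made up for illustration.

```python
def local_fraction(block_replicas, jt_workers):
    """Fraction of blocks with at least one replica on a worker this JT can use.

    block_replicas: list of host collections, one per block (hypothetical data).
    jt_workers: hosts bound to this JT (hypothetical partition).
    """
    if not block_replicas:
        return 0.0
    workers = set(jt_workers)
    # A block can be processed locally only if some replica lives on our workers.
    local = sum(1 for hosts in block_replicas if workers & set(hosts))
    return local / len(block_replicas)


# Four blocks, two replicas each, spread over four hypothetical workers.
blocks = [("w1", "w2"), ("w3", "w4"), ("w1", "w3"), ("w2", "w4")]

all_workers = local_fraction(blocks, ["w1", "w2", "w3", "w4"])  # single JT
half_workers = local_fraction(blocks, ["w1", "w2"])             # one of two JTs
```

With one JT over all four workers every block has a local candidate; a JT bound to only w1 and w2 loses local access to the block stored on w3 and w4, and that block has to come over the network.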
Can I create my own backup job tracker to support the system if the job
tracker crashes?
It is better to monitor the health of the JT and restart that service or
machine when it goes down. As it serves up HTTP pages, it is fairly easy to
detect a complete failure. It is harder to detect situations in which jobs
get submitted but never executed; periodic test jobs can detect that for you.
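A minimal sketch of such a monitor, under some assumptions: the status URL, the test-job command and the restart action are all placeholders you supply yourself, and the function names are mine, not part of Hadoop. It makes the two checks described above: an HTTP probe for complete failure, and a small test job with a timeout for the "accepted but never executed" case.

```python
import http.client
import subprocess
import urllib.error
import urllib.request


def jobtracker_http_ok(url, timeout=5.0):
    """True if the JT status page answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Refused connection, timeout, HTTP error status or bad URL: treat as down.
        return False


def canary_job_ok(command, timeout=300.0):
    """True if a small test job exits with status 0 within the timeout.

    command is whatever you use to submit a trivial job (placeholder);
    a job that is accepted but never runs shows up here as a timeout.
    """
    try:
        result = subprocess.run(
            command,
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def check_jobtracker(url, canary_command, restart,
                     http_timeout=5.0, job_timeout=300.0):
    """Run both checks; call restart() and return False if either fails."""
    healthy = (jobtracker_http_ok(url, http_timeout)
               and canary_job_ok(canary_command, job_timeout))
    if not healthy:
        restart()  # placeholder: restart the JT service or the machine
    return healthy
```

You would run check_jobtracker from cron or a loop, pointing url at your JT's web page (on a default install of that era I would expect something like port 50030 and jobtracker.jsp, but check your own configuration). Pick a job timeout comfortably above what the test job normally takes, or you will restart a healthy but busy JT.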