Rod Taylor wrote:
The machine the namenode is running on does have very high load at
times. Do you recommend a separate box for the namenode and jobtracker
which runs strictly those items?

That would be optimal, but it shouldn't be required. If a tasktracker or datanode is sluggish then its impact is small, but if the jobtracker or namenode becomes sluggish the impact is systemic. That said, so long as these don't crash, things should work. The problem is that the code paths for recovery when namenodes and jobtrackers are sluggish have not been tested as much.

What's in the jobtracker logs around this time? Did it report this tasktracker as lost?

The jobtracker did not indicate such a thing (via an exception anyway).
Tasktracker connections seem to be established and dropped
fairly frequently. Perhaps this is what you mean?

No, there's a "lost tracker" message when the jobtracker times out a tasktracker. These are bad, since the jobtracker then assumes that all of the temporary map data at that tasktracker is gone, and re-schedules those map tasks.
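
To make the consequence concrete, here is a toy sketch of the timeout behaviour described above. It is not Hadoop's actual code: the class ToyJobTracker, its method names, and the expiry_seconds parameter are all hypothetical, and time is passed in explicitly so the example stays deterministic. It only shows why a timed-out tasktracker costs more than a log line: every map task that finished on it goes back into the queue.

```python
from dataclasses import dataclass, field


@dataclass
class ToyJobTracker:
    """Toy model of heartbeat-based expiry; hypothetical, not Hadoop's code."""
    expiry_seconds: float                                # assumed timeout value
    last_heartbeat: dict = field(default_factory=dict)   # tracker -> last heartbeat time
    map_outputs: dict = field(default_factory=dict)      # tracker -> finished map task ids
    pending_maps: set = field(default_factory=set)       # map tasks waiting to be re-run

    def heartbeat(self, tracker, now):
        # Record that the tracker was heard from at time `now`.
        self.last_heartbeat[tracker] = now

    def map_finished(self, tracker, task_id):
        # The map output lives on the tracker's local disk.
        self.map_outputs.setdefault(tracker, set()).add(task_id)

    def expire_trackers(self, now):
        # Any tracker silent for longer than the timeout is declared lost.
        lost = [t for t, seen in self.last_heartbeat.items()
                if now - seen > self.expiry_seconds]
        for tracker in lost:
            del self.last_heartbeat[tracker]
            # Its map output is assumed gone, so those maps are re-scheduled.
            self.pending_maps |= self.map_outputs.pop(tracker, set())
        return lost
```

Note that in this model a tracker which is merely slow to heartbeat is treated exactly like one that has died, so its completed map work is redone either way.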

Doug
