Dear all, We have recently been investigating dynamic scheduling of parallel, I/O-intensive applications on large-scale clusters. We are interested in Hadoop, especially its task scheduling schemes. Is the scheduling module part of one of Hadoop's libraries, or is it a standalone library? Are there any publications specifically on scheduling in Hadoop? Could you please share some details about its scheduling, or suggest some literature on Hadoop?
Thanks! Best wishes, Huijun Zhu