Hi,

First of all, thank you for your responses.

"One interesting direction for research would be more sophisticated scheduling policies for the JobTracker to help improve locality and overall cluster utilization."

This is a very interesting area. In fact, I was trying a simple round-robin scheduler, but I didn't take data location into account.
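To make the contrast concrete, here is a minimal Python sketch of the two policies. It is not Hadoop code and does not use the JobTracker's actual interfaces; the function names, the `block_locations` map (task to nodes holding its input block), and `slots_per_node` are all hypothetical, chosen only for illustration.

```python
from itertools import cycle


def round_robin_assign(tasks, nodes):
    """Hand out tasks to nodes in rotation, ignoring where the data lives."""
    node_cycle = cycle(nodes)
    return {task: next(node_cycle) for task in tasks}


def locality_aware_assign(tasks, nodes, block_locations, slots_per_node):
    """Prefer a node that stores the task's input block (hypothetical policy).

    If no such node has a free slot, fall back to the least-loaded node
    that still has capacity.
    """
    load = {node: 0 for node in nodes}
    assignment = {}
    for task in tasks:
        # Nodes that hold the input block and still have a free slot.
        local = [n for n in block_locations.get(task, [])
                 if n in load and load[n] < slots_per_node]
        # Otherwise consider any node with a free slot.
        candidates = local or [n for n in nodes if load[n] < slots_per_node]
        if not candidates:
            raise RuntimeError("no free task slots left in the cluster")
        chosen = min(candidates, key=lambda n: load[n])
        load[chosen] += 1
        assignment[task] = chosen
    return assignment


def local_fraction(assignment, block_locations):
    """Fraction of tasks placed on a node that stores their input block."""
    local = sum(1 for task, node in assignment.items()
                if node in block_locations.get(task, []))
    return local / len(assignment)
```

Comparing `local_fraction` for the two assignments on the same task list gives a rough measure of how much locality the round-robin policy gives up. A real scheduler would also have to account for rack locality, replicas, and slots freeing up over time, none of which this sketch models.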
On Mon, Feb 25, 2008 at 8:32 AM, Daming Wang <[EMAIL PROTECTED]> wrote:
> How about combine the decentralized strategy to improve HDFS? Something
> like o(N) DHT architecture used by the Amazon s3
> Of course, using decentralized method to change hadoop will cause huge
> work, but it is a good direction if as a research topic I think...

By a decentralized strategy, do you mean a peer-to-peer system? Although that would be very fault tolerant, wouldn't there be consistency and performance issues? If I understand correctly, the rationale behind the current centralized architecture is that it keeps the system simple. Would it be useful to study how much decentralization is possible without adversely affecting performance?

Again, thanks a lot for your comments.

Regards,
Jaideep
