I think your paper is similar to one I read: "Beehive: O(1) Lookup Performance for Power-Law Query Distributions in Peer-to-Peer Overlays", from Cornell University.
The Amazon Dynamo paper also deserves a read, since I think the two are related; you can see the system's advantages compared with Hadoop's architecture.

-----Original Message-----
From: Ahmad Humayun [mailto:[EMAIL PROTECTED]
Sent: Tuesday, February 26, 2008 1:31 PM
To: [email protected]
Subject: Re: Hadoop research

Hello there everyone,

Great discussion going on here :)

I was just looking for the Beehive paper and wasn't sure which one Daming pointed to. Can somebody tell me the conference, or maybe send me a link? There was a paper published in SOSP called "Beehive: O(1) Lookup Performance in Peer to Peer Overlays through Popularity based Replication". Was it this one?

Thanks for the help.

Regards,
Ahmad H.

On Tue, Feb 26, 2008 at 7:47 AM, Daming Wang <[EMAIL PROTECTED]> wrote:

> To my understanding, the decentralized strategy not only improves fault
> tolerance but can also improve performance. It may need a more
> sophisticated way to control consistency, but obviously a decentralized
> architecture has many advantages over centralized control. I suggest you
> look at these two papers first: (1) amazon-dynamo-sosp2007 (2) Beehive.
> You can get them from the internet.
>
> Of course, decentralization will make the system more complex. I just
> want to ask: what kind of research do you want to do based on Hadoop?
> If it is a small improvement to a current module, the scheduling policies
> are fine, but if you want to research a relatively big improvement to the
> whole architecture, how to adopt a decentralized strategy may be a
> direction and help you publish papers. :)
>
>
>
> -----Original Message-----
> From: Jaideep Dhok [mailto:[EMAIL PROTECTED]
> Sent: Monday, February 25, 2008 8:53 PM
> To: [email protected]
> Subject: Re: Hadoop research
>
> Hi,
> First of all thank you for your responses.
>
> "One interesting direction for research would be more sophisticated
> scheduling policies for the JobTracker to help improve locality and
> overall cluster utilization."
> This is a very interesting area. In fact I was trying a simple Round Robin
> scheduler, but I didn't take data location into account.
>
> On Mon, Feb 25, 2008 at 8:32 AM, Daming Wang <[EMAIL PROTECTED]>
> wrote:
>
> > How about combining a decentralized strategy to improve HDFS? Something
> > like the o(N) DHT architecture used by Amazon S3.
> > Of course, using a decentralized method to change Hadoop would be a huge
> > amount of work, but I think it is a good direction as a research topic...
>
> By a decentralized strategy do you mean a peer to peer system? Although
> that would be very fault tolerant, wouldn't there be consistency and
> performance issues?
> If I understand correctly, the rationale behind the current centralized
> architecture is that it keeps the system simple. Would it be useful to
> study how much decentralization is possible without adversely affecting
> performance?
>
>
> Again, thanks a lot for your comments.
>
> Regards,
> Jaideep
>

--
Ahmad Humayun
Research Assistant
Computer Science Dpt., LUMS
+92 321 4457315
