RE: Hadoop research

Daming Wang Mon, 25 Feb 2008 23:09:10 -0800

I think your paper is similar to the one I read:
"Beehive: O(1) Lookup Performance for Power-Law Query Distributions
in Peer-to-Peer Overlays", from Cornell University.


The amazon-dynamo paper is also worth reading, since I think the two are
related; it shows that system's advantages compared with Hadoop's
architecture.



-----Original Message-----
From: Ahmad Humayun [mailto:[EMAIL PROTECTED]
Sent: Tuesday, February 26, 2008 1:31 PM
To: [email protected]
Subject: Re: Hadoop research

Hello there everyone,

Great discussion going on here :) I was just looking for the Beehive paper
and wasn't sure which one Daming was pointing to. Can somebody tell me the
conference or maybe send me a link?

There was a paper published in SOSP called "Beehive: O(1) Lookup Performance
in Peer to Peer Overlays through Popularity based Replication". Was it this
one?


Thanks for the help.

regards,
Ahmad H.

On Tue, Feb 26, 2008 at 7:47 AM, Daming Wang <[EMAIL PROTECTED]>
wrote:

> To my understanding, a decentralized strategy not only improves fault
> tolerance but can also improve performance. It may need a more sophisticated
> way to control consistency, but a decentralized architecture clearly has many
> advantages over centralized control. I suggest you look at these two papers
> first: (1) amazon-dynamo-sosp2007 and (2) Beehive. You can get them from the
> internet.
>
> Of course, decentralization will add complexity to the system. I
> just want to ask: what kind of research do you want to do based on Hadoop?
> If it is a small improvement to a current module, scheduling policies are
> fine, but if you want to research a relatively big improvement to the whole
> architecture, how to adopt a decentralized strategy may be a direction and
> help you to publish papers. :)
>
>
>
> -----Original Message-----
> From: Jaideep Dhok [mailto:[EMAIL PROTECTED]
> Sent: Monday, February 25, 2008 8:53 PM
> To: [email protected]
> Subject: Re: Hadoop research
>
> Hi,
> First of all thank you for your responses.
>
> "One interesting direction for research would be more sophisticated
> scheduling policies for the JobTracker to help improve locality and
> overall cluster utilization."
> This is a very interesting area. In fact I was trying a simple Round Robin
> scheduler, but I didn't take data location into account.
>
> On Mon, Feb 25, 2008 at 8:32 AM, Daming Wang <[EMAIL PROTECTED]>
> wrote:
>
> > How about using a decentralized strategy to improve HDFS? Something
> > like the o(N) DHT architecture used by Amazon S3.
> > Of course, changing Hadoop to use a decentralized method would be a huge
> > amount of work, but I think it is a good direction as a research topic...
>
> By a decentralized strategy do you mean a peer to peer system? Although
> that would be very fault tolerant, wouldn't there be consistency and
> performance issues?
> If I understand correctly, the rationale behind current centralized
> architecture is that it keeps the system simple. Would it be useful to
> study how much decentralization is possible without adversely affecting
> performance?
>
>
> Again, thanks a lot for your comments.
>
> Regards,
> Jaideep
>



--
Ahmad Humayun
Research Assistant
Computer Science Dpt., LUMS
+92 321 4457315
