That's great, Junping.

Hoping to see this in trunk / hadoop 2.0 and hadoop 1.1 soon.

- milind

On Jun 4, 2012, at 8:48 AM, Jun Ping Du wrote:

> Hello Folks,
>      I filed an umbrella JIRA today to address a limitation of the current 
> NetworkTopology: it is bound strictly to a three-tier network. The 
> motivation is to make Hadoop more flexible in the topologies it can be 
> deployed on (especially for cloud/virtualization cases) and more 
> configurable in data-locality-related policies such as replica placement, 
> task scheduling, choosing a block for DFSClient reads, and balancing. 
>      We have attached a draft proposal to the umbrella issue, along with the 
> implementation code. As the code base is large (~260K), it is separated 
> into 7 sub-JIRA issues, which should be more convenient for reviewing. 
> However, we split the code by functionality, which causes some 
> dependencies between patches, and we are not sure this is the best way. 
> Comments and suggestions on the doc and code are welcome, and we look 
> forward to working with all of you to improve Hadoop for these new 
> situations.
>      Hope this is a good start.
> 
> Cheers,
> 
> Junping
> 
> ----- Original Message -----
> From: "Junping Du (JIRA)" <j...@apache.org>
> To: common-iss...@hadoop.apache.org
> Sent: Monday, June 4, 2012 12:09:22 PM
> Subject: [jira] [Created] (HADOOP-8468) Umbrella of enhancements to support 
> different failure and locality topologies
> 
> Junping Du created HADOOP-8468:
> ----------------------------------
> 
>             Summary: Umbrella of enhancements to support different failure 
> and locality topologies
>                 Key: HADOOP-8468
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8468
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ha, io
>    Affects Versions: 2.0.0-alpha, 1.0.0
>            Reporter: Junping Du
>            Assignee: Junping Du
>            Priority: Critical
> 
> 
> The current Hadoop network topology (described in earlier issues such as 
> HADOOP-692) worked well for the classic three-tier network it was designed 
> for. However, it does not take into account other failure models or 
> infrastructure changes that can affect network bandwidth efficiency, such as 
> virtualization. 
> A virtualized platform has the following characteristics that the Hadoop 
> topology should not ignore when scheduling tasks, placing replicas, 
> balancing, or fetching blocks for reading: 
> 1. VMs on the same physical host are affected by the same hardware failure. 
> In order to match the reliability of a physical deployment, replication of 
> data across two virtual machines on the same host should be avoided.
> 2. The network between VMs on the same physical host has higher throughput 
> and lower latency and does not consume any physical switch bandwidth.
> Thus, we propose to make the Hadoop network topology extensible and 
> introduce a new level in the hierarchical topology, a node group level, 
> which maps well onto an infrastructure based on a virtualized environment.
> 

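To make the node group idea in the quoted proposal concrete, here is a minimal sketch of how an extra topology level could keep two replicas off VMs that share a physical host. It is not taken from the HADOOP-8468 patches. The class name NodeGroupSketch, the path layout /rack/nodegroup/host, and the simple first-fit selection are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: hypothetical names, not the HADOOP-8468 implementation. */
final class NodeGroupSketch {

    /**
     * Returns the node group part of a location path.
     * Assumed layout: /rack/nodegroup/host, e.g. "/rack1/ng1/vm1" -> "/rack1/ng1".
     */
    static String nodeGroupOf(String path) {
        int lastSlash = path.lastIndexOf('/');
        if (lastSlash <= 0) {
            throw new IllegalArgumentException("expected /rack/nodegroup/host: " + path);
        }
        return path.substring(0, lastSlash);
    }

    /** True when both nodes sit in the same node group (assumed: same physical host). */
    static boolean sameNodeGroup(String a, String b) {
        return nodeGroupOf(a).equals(nodeGroupOf(b));
    }

    /**
     * First-fit selection: walks the candidates in order and keeps at most one
     * target per node group, so no two replicas share a failure domain.
     */
    static List<String> chooseTargets(List<String> candidates, int replicas) {
        Set<String> usedGroups = new HashSet<>();
        List<String> chosen = new ArrayList<>();
        for (String candidate : candidates) {
            if (chosen.size() == replicas) {
                break;
            }
            // Set.add returns false if this node group already holds a replica.
            if (usedGroups.add(nodeGroupOf(candidate))) {
                chosen.add(candidate);
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList(
            "/rack1/ng1/vm1", "/rack1/ng1/vm2", "/rack1/ng2/vm3", "/rack2/ng3/vm4");
        System.out.println(chooseTargets(candidates, 3));
    }
}
```

In this sketch vm2 is skipped because it shares node group ng1 with vm1, which corresponds to point 1 of the proposal. A real placement policy would also weigh rack diversity and load, which this sketch leaves out.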