[ 
https://issues.apache.org/jira/browse/MAPREDUCE-225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000460#comment-13000460
 ] 

Arun C Murthy commented on MAPREDUCE-225:
-----------------------------------------

Leitao and Hari - apologies for coming in late. I've missed this so far.

HADOOP-1876 and HADOOP-3245 have had too many issues in the past and we have 
since moved away from this model - in fact we never deployed either at a 
reasonable scale due to issues we have seen with them. Also, we have actually 
removed a lot of this code in future versions of Hadoop since it didn't 
work well at all and complicated the JobTracker to a very large extent.

OTOH, we have been working on a completely revamped architecture for Hadoop 
Map-Reduce via MAPREDUCE-279. You guys might be interested... also we would 
*love* your feedback based on your experiences there. Thanks!

> Fault tolerant Hadoop Job Tracker
> ---------------------------------
>
>                 Key: MAPREDUCE-225
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-225
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>         Environment: High availability enterprise system
>            Reporter: Francesco Salbaroli
>            Assignee: Francesco Salbaroli
>         Attachments: Enhancing the Hadoop MapReduce framework by adding 
> fault.ppt, FaultTolerantHadoop.pdf, HADOOP-4586-0.1.patch, 
> HADOOP-4586v0.3.patch, jgroups-all.jar
>
>
> The Hadoop framework has been designed, in an effort to enhance performance,
> with a single JobTracker (master node). Its responsibilities range
> from managing the job submission process and computing the input splits to
> scheduling tasks on the slave nodes (TaskTrackers) and monitoring their health.
> In some environments, like the IBM and Google's Internet-scale computing
> initiative, there is a need for high availability, and performance
> becomes a secondary issue. In these environments, having a system with
> a Single Point of Failure (such as Hadoop's single JobTracker) is a major
> concern.
> My proposal is to provide a redundant version of Hadoop by adding
> support for multiple replicated JobTrackers. This design can be approached
> in many different ways.
> In the document at: 
> http://sites.google.com/site/hadoopthesis/Home/FaultTolerantHadoop.pdf?attredirects=0
> I wrote an overview of the problem and some approaches to solve it.
> I post this to the community to gather feedback on the best way to proceed in 
> my work.
> Thank you!

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
