[ 
https://issues.apache.org/jira/browse/HADOOP-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054963#comment-13054963
 ] 

Eric Yang commented on HADOOP-7417:
-----------------------------------

Allen, I did extensive studies of the existing systems, including Puppet, 
MCollective, Chef, CFEngine, ControlTier, and Bcfg2.  Most configuration 
management systems focus on generating a set of templates and config parameters 
and pushing out changes one node at a time.  This works fine for a small number 
of machines, but most of these systems fail or become difficult to maintain 
beyond 1800 nodes.  For example, MCollective uses a spanning tree model on 
puppeteer, hence the puppet master becomes a single point of failure.  One 
puppet master failure could take a large chunk of the nodes offline.  HMS is 
designed to remove single points of failure in the deployment system and to 
improve performance.  We found it more reliable to store system state in 
ZooKeeper for HA.  Zeroconf is great for resolving service location.  Existing 
config management systems require installation and configuration before they 
can be used for deployment.  HMS is designed to install and operate without 
having to configure the management system.  BitTorrent is much faster than 
installing software from a yum repository on large-scale systems.  Granted, 
this system started several years behind the existing systems, but it solves 
some scalability and reliability issues.  

To summarize, HMS does the following better:

- Scale
- Reliability
- Cross-node application orchestration (action dependencies)
- Speed
- Sophisticated monitoring system (reuses Chukwa)
- Self-healing cluster (ability to replay history to heal nodes)
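
To make the "action dependencies" item a bit more concrete, here is a rough 
sketch of ordering cluster actions so that each one runs only after the actions 
it depends on.  This is not HMS code: the order_actions helper and the action 
names are made up for illustration, and a real orchestrator would also need to 
track per-node state and failures.

```python
from collections import deque


def order_actions(dependencies):
    """Return actions in an order where each action follows its dependencies.

    `dependencies` maps an action name to the list of actions it waits for.
    Raises ValueError if the dependencies contain a cycle.
    """
    # Collect every action, including ones that appear only as a dependency.
    actions = set(dependencies)
    for deps in dependencies.values():
        actions.update(deps)

    # remaining[a] holds the dependencies of `a` that have not run yet.
    remaining = {a: set(dependencies.get(a, ())) for a in actions}
    # dependents[a] lists the actions that are waiting on `a`.
    dependents = {a: [] for a in actions}
    for action, deps in remaining.items():
        for dep in deps:
            dependents[dep].append(action)

    # Start with the actions that wait on nothing (sorted for a stable order).
    ready = deque(sorted(a for a, deps in remaining.items() if not deps))
    ordered = []
    while ready:
        action = ready.popleft()
        ordered.append(action)
        for waiting in sorted(dependents[action]):
            remaining[waiting].discard(action)
            if not remaining[waiting]:
                ready.append(waiting)

    if len(ordered) != len(actions):
        raise ValueError("dependency cycle detected")
    return ordered


# Hypothetical cluster-wide actions; the names are illustrative only.
plan = {
    "start-namenode": ["install-hadoop"],
    "start-datanode": ["start-namenode"],
    "start-jobtracker": ["start-namenode"],
    "install-hadoop": [],
}
print(order_actions(plan))
```

The same ordered list could in principle be kept as a history and replayed 
against a node that fell behind, which is the idea behind the self-healing 
item, though the sketch above does not attempt that.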


> Hadoop Management System (Umbrella)
> -----------------------------------
>
>                 Key: HADOOP-7417
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7417
>             Project: Hadoop Common
>          Issue Type: New Feature
>         Environment: Java 6, Linux
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>
> The primary goal of Hadoop Management System is to build a component around 
> management and deployment of Hadoop-related projects. This includes software 
> installation, configuration, application orchestration, deployment automation, 
> and monitoring of Hadoop.
> Prototype demo source code can be obtained from:
> http://github.com/macroadster/hms

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

