[ https://issues.apache.org/jira/browse/HADOOP-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055244#comment-13055244 ]
Rajiv Chittajallu commented on HADOOP-7417:
-------------------------------------------
{quote}
Allen, I did extensive studies on all existing systems including puppet,
mcollective, chef, cfengine, controlTier, Bcfg2. Most of the configuration
management system focus on generating a set of templates and config parameters
and push out changes one node at a time. This works fine in small number of
machines, but most of the system fails beyond 1800 nodes or become difficult to
maintain.
{quote}
We use tar, ssh, wget, rsync & gpg with a custom roles system
(https://computing.llnl.gov/linux/genders.html can be an alternative) to manage
configuration and packages. Our environment is probably still too small to hit
the limits of these tools.
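To illustrate what a roles system of this kind does, here is a minimal Python sketch that maps hosts to roles from a flat text database. The layout (one host per line followed by comma-separated attributes) is modeled loosely on a genders-style file; the host names, attribute names and the helpers `parse_roles` and `hosts_for` are hypothetical and are not our actual tooling.

```python
from collections import defaultdict

# Hypothetical role database in a genders-like "host attr,attr=value" layout.
SAMPLE_ROLES = """\
# host        attributes
nn01          namenode,rack=r1
dn01          datanode,rack=r1
dn02          datanode,rack=r2
"""


def parse_roles(text):
    """Map each attribute name to the set of hosts that carry it."""
    hosts_by_attr = defaultdict(set)
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split(None, 1)
        host = parts[0]
        attrs = parts[1] if len(parts) > 1 else ""
        for attr in attrs.split(","):
            name = attr.split("=", 1)[0].strip()  # ignore any "=value" part
            if name:
                hosts_by_attr[name].add(host)
    return hosts_by_attr


def hosts_for(role, text=SAMPLE_ROLES):
    """Return the sorted list of hosts that have the given role."""
    return sorted(parse_roles(text).get(role, ()))
```

The resulting host list can then be fed to ssh or rsync in a loop, e.g. `hosts_for("datanode")` to pick the targets for a datanode config push.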
Our challenge with managing a Hadoop cluster is the lack of standard interfaces
to reliably monitor the cluster. Standard Unix tools expect a process to exit
with a non-zero status on error and counters to be positive numbers.
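As a rough sketch of the convention described above, the following Python check validates a set of counters and reports the result through its exit status. The counter names and the functions `check_counters` and `main` are hypothetical, and treating zero as healthy while rejecting negative values is an assumption of this sketch, not something Hadoop defines.

```python
import sys


def check_counters(counters):
    """Return a list of problems; an empty list means the counters look sane."""
    problems = []
    for name, value in sorted(counters.items()):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name}: not a number ({value!r})")
        elif value < 0:
            problems.append(f"{name}: negative value {value}")
    return problems


def main(counters):
    """Print problems to stderr and return a Unix-style exit status."""
    problems = check_counters(counters)
    for problem in problems:
        print(problem, file=sys.stderr)
    return 1 if problems else 0  # 0 = success, non-zero = error
```

A wrapper script would end with something like `sys.exit(main({"blocks_read": 1024, "bytes_written": -1}))`, so that cron, shell `&&` chains and existing monitoring agents can act on the result without parsing any output.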
IMHO what's needed here are features like HADOOP-6728 & HADOOP-7144, made
consistent across all components and integrated with existing tools, as in
HADOOP-7324.
bq. Zeroconf is great for resolving service location.
As part of this proposal, are there plans to update how Hadoop daemon and
client configurations are handled, or is this specific to HMS?
bq. Bittorrent is much faster than install software from yum repository for
large scale system.
BitTorrent is a file sharing protocol and yum is a utility for RPM package
management. I guess you mean to say BitTorrent is faster at distributing files
than HTTP. If RPM is chosen as the package format but you don't want to use
yum, HMS may need to implement another RPM-based package manager.
Alternatively, this could just be a yum plugin.
That's my 0.2 cents. But hey, if you want to invest your time in writing Yet
Another Monitoring System ;-), I wish you all the best!
> Hadoop Management System (Umbrella)
> -----------------------------------
>
> Key: HADOOP-7417
> URL: https://issues.apache.org/jira/browse/HADOOP-7417
> Project: Hadoop Common
> Issue Type: New Feature
> Environment: Java 6, Linux
> Reporter: Eric Yang
> Assignee: Eric Yang
>
> The primary goal of Hadoop Management System is to build a component around
> management and deployment of Hadoop related projects. This includes software
> installation, configuration, application orchestration, deployment automation
> and monitoring Hadoop.
> Prototype demo source code can be obtained from:
> http://github.com/macroadster/hms