[
https://issues.apache.org/jira/browse/BIGTOP-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
jay vyas updated BIGTOP-1177:
-----------------------------
Summary: Puppet Recipes: Can we modularize them to foster HCFS initiatives?
(was: Puppet Recipes: Can we modularize them?)
> Puppet Recipes: Can we modularize them to foster HCFS initiatives?
> ------------------------------------------------------------------
>
> Key: BIGTOP-1177
> URL: https://issues.apache.org/jira/browse/BIGTOP-1177
> Project: Bigtop
> Issue Type: Improvement
> Reporter: jay vyas
>
> In the spirit of interoperability, can we work on modularizing the Bigtop
> Puppet recipes so that "hadoop_cluster_node" is not defined as an HDFS-specific class?
>
> I'm not a Puppet expert, but here are two reasons why:
> - For HDFS users: In some use cases we might want to use Bigtop to provision
> many nodes, only some of which are datanodes. For example: let's say our
> cluster is crawling the web in mappers, doing some machine learning, and
> distilling large pages into small relational database tuples that
> summarize the "entities" in each page. In this case we don't necessarily
> benefit much from locality, because we might be CPU-bound rather than
> network/IO-bound. So we might want to provision a cluster of 50 machines: 40
> multicore, CPU-heavy ones and just 10 datanodes to support the DFS. I know
> this is an extreme case, but it's a good example.
> - For NON-HDFS users: One important aspect of emerging Hadoop workflows is
> HCFS (https://wiki.apache.org/hadoop/HCFS/) -- the idea that filesystems like
> S3, OrangeFS, GlusterFileSystem, etc. are all just as capable as HDFS,
> although not necessarily optimal, of supporting YARN and Hadoop operations.
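> As a rough sketch of what this could look like (class and parameter names
> here are purely illustrative, not the actual Bigtop recipe layout): the node
> class could take the filesystem as a parameter instead of baking in HDFS:
> {code}
> # Hypothetical sketch only -- not the current Bigtop manifests.
> # The filesystem backend becomes a class parameter; HDFS stays the default.
> class hadoop_cluster_node($hadoop_fs = "hdfs") {
>   case $hadoop_fs {
>     "hdfs":      { include hadoop::datanode_client }
>     "glusterfs": { include glusterfs::client }
>     "s3":        { include hadoop::s3_client }
>   }
>   # Filesystem-agnostic pieces (YARN, MapReduce, common config) stay shared.
>   include hadoop::common
> }
> {code}
> That way an HCFS deployment just sets $hadoop_fs, and an HDFS deployment
> could also provision compute-only nodes that skip the datanode pieces.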
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)