Hello Nate, hi all,

On Tue, Jan 20, 2015 at 08:34:34PM +0000, Nate DAmico (JIRA) wrote:

> Nate DAmico commented on BIGTOP-1122:
> We are putting some work in on puppet modules, starting with latest
> 3.x branch.  Hiera work wasn't first on our list, but can look to work
> with you on it during same time.  We are mostly looking at some
> parameter refactoring and minimal changes to support a wider range of
> topologies and setups

> As Cos suggested we can take conversation to the dev list and we were
> going to setup wiki page with more of param refactoring ideas if need
> be

I am facing the same problem: We want to employ a different topology and
HA setup than what's currently hardwired into cluster.pp. Also, we
already have a puppet 3 infrastructure in place that makes heavy use of
hiera, e.g. for machine role assignment.

When first looking at the current puppet code I was completely lost as
to how to adapt cluster.pp for our scenario. It was really hard to get
an idea of what is used where and why, because everything is set up in
cluster.pp and handed to the modules either via resource parameters or
just the global scope, and from there passed on under different
variable names to included common classes.

To get a handle on things I started an ad-hoc conversion of the modules
and cluster/init.pp to parametrised classes instead of defined
resources. This automagically cleaned up namespacing and scoping and
revealed what's actually needed where. It also gets rid of 90% of the
extlookup calls; the remaining 10% I replaced with explicit hiera
lookups. It also revealed some things that were plain broken before.
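As a rough sketch of what that conversion looks like (the class name is
real, the parameter and bodies are simplified for illustration):

```puppet
# Before: a defined resource type pulling its default from the global
# scope via extlookup()
define hadoop::datanode($namenode_host = extlookup("hadoop_head_node")) {
  # ... config file and service resources ...
}

# After: a parametrised class; under puppet 3 the default is filled in
# by an automatic hiera lookup of "hadoop::datanode::namenode_host"
class hadoop::datanode($namenode_host) {
  # ... config file and service resources ...
}
```

The automatic lookup key is always qualified with the class name, which
is what makes the data flow visible.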

The attached patch is against latest git master HEAD and on my test box
produces exactly the same results as the stock puppet code. (There is
some additional HA setting code in there that's needed for our setup but
doesn't do anything unless activated. I would eventually split that out
into separate additional changes.)

All the parameter defaulting logic is moved from cluster.pp into hiera's
bigtop/cluster.yaml, in concert with a variable lookup hierarchy for
non-HA and HA setups orchestrated in site.pp. cluster.pp now mostly
contains node role assignments and inter-module/class/resource
dependencies - as it should, IMO.

Site-specific adjustments are done in a site.yaml that sits "in front"
of cluster.yaml in the hiera search path (hierarchy). That way,
cluster.pp never needs changing by a site admin, but everything
still remains fully adjustable and overridable without any need to
touch puppet code. Also, if required, cluster/site.pp can be dropped
entirely and the modules used on their own.
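For illustration, overriding a single default could look like this (the
keys are ones from the attached cluster.yaml; the port value is made
up):

```yaml
# hieradata/site.yaml -- searched before bigtop/cluster.yaml,
# so any key placed here shadows the shipped default
bigtop::hadoop_head_node: "head.node.fqdn"
hadoop::common_hdfs::hadoop_namenode_port: "9000"
```

No puppet code is touched; the override takes effect through the lookup
hierarchy alone.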

I wouldn't dare say the attached patch is clean in every way, but rather
that it shows where that road might lead. To keep the diff small I
didn't try hard to preserve the extlookup key names. Instead I used the
internal class variable names, which get qualified in hiera with the
module name anyway. So the naming of keys is a bit chaotic now (mostly
hadoop_ vs. no prefix).

But it opens up a load of further opportunities:
- node role assignment could be moved to an ENC or into hiera using
  hiera_include(), rendering cluster/init.pp virtually redundant
- cluster.yaml can be seriously cleaned up and streamlined
- variable and lookup key names can easily be reviewed and changed to a
  common naming scheme
- because all external variable lookups now are qualified with a scope,
  it's clearly visible what's used where allowing for easy review and
  redesign
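The hiera_include() idea, for instance, could look roughly like this
(hypothetical per-node data layout, not part of the patch):

```yaml
# hieradata/node/worker01.example.com.yaml -- per-node role assignment
classes:
  - hadoop_worker_node
```

site.pp would then boil down to a single hiera_include('classes') call,
making the node blocks in cluster/init.pp unnecessary.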

The price to pay is that the patch currently is puppet 3 and hiera only.
As far as I understand, it could easily be made puppet 2/3 compatible.
However, I have no clear idea how to continue to support non-hiera
users. That's because in the hadoop module common logic is pulled in via
"include common_xyz", which must be used in order to allow multiple such
includes from namenode/datanode/..., but this also makes it impossible
to pass parameters other than via hiera (or the global scope).
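To make the constraint concrete (class names as in the hadoop module,
bodies invented):

```puppet
class hadoop::common_hdfs($hadoop_namenode_host = "localhost") {
  # shared HDFS settings used by several daemon classes
}

class hadoop::namenode {
  include hadoop::common_hdfs   # fine: include is idempotent
}

class hadoop::datanode {
  include hadoop::common_hdfs   # fine again -- but a resource-like
  # declaration with parameters here would raise a duplicate
  # declaration error when both classes land on one node, so
  # parameters can only arrive via hiera's automatic lookup
  # (or the global scope)
}
```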

But since the README states that the code is puppet 3 only, and puppet 3
supports hiera as a drop-in add-on by default, I wonder if that's really
much of an issue?

Thanks,
-- 
Michael Weiser                science + computing ag
Senior Systems Engineer       Geschaeftsstelle Duesseldorf
                              Faehrstrasse 1
phone: +49 211 302 708 32     D-40221 Duesseldorf
fax:   +49 211 302 708 50     www.science-computing.de
-- 
Vorstandsvorsitzender/Chairman of the board of management:
Gerd-Lothar Leonhart
Vorstand/Board of Management:
Dr. Bernd Finkbeiner, Michael Heinrichs, Dr. Arno Steitz
Vorsitzender des Aufsichtsrats/
Chairman of the Supervisory Board:
Philippe Miltin
Sitz/Registered Office: Tuebingen
Registergericht/Registration Court: Stuttgart
Registernummer/Commercial Register No.: HRB 382196
From cdca9ff5988e31ab1d39f97b1b6d059f4a62b450 Mon Sep 17 00:00:00 2001
From: Michael Weiser <[email protected]>
Date: Tue, 27 Jan 2015 14:25:21 +0100
Subject: [PATCH] [BIGTOP-????] puppet: Replace extlookup by hiera, use
 parametrised classes

Update the puppet code to use self-contained, parametrised classes and proper
scoping. Replace all extlookup calls by either explicit or automatic hiera
parameter lookups. Implement the HA/non-HA alternative via the hiera lookup
hierarchy. Replace append_each from bigtop_util with suffix from stdlib.
---
 bigtop-deploy/puppet/config/site.csv.example       |  28 --
 bigtop-deploy/puppet/hiera.yaml                    |   7 +
 bigtop-deploy/puppet/hieradata/bigtop/cluster.yaml | 123 +++++++
 bigtop-deploy/puppet/hieradata/bigtop/ha.yaml      |   7 +
 bigtop-deploy/puppet/hieradata/bigtop/noha.yaml    |   2 +
 bigtop-deploy/puppet/hieradata/site.yaml           |  32 ++
 bigtop-deploy/puppet/manifests/cluster.pp          | 352 ++++-----------------
 bigtop-deploy/puppet/manifests/site.pp             |  24 +-
 .../lib/puppet/parser/functions/append_each.rb     |  22 --
 .../puppet/modules/crunch/manifests/init.pp        |   2 +-
 .../puppet/modules/giraph/manifests/init.pp        |   2 +-
 .../puppet/modules/hadoop-flume/manifests/init.pp  |   2 +-
 .../puppet/modules/hadoop-hbase/manifests/init.pp  |  14 +-
 .../puppet/modules/hadoop-hive/manifests/init.pp   |   2 +-
 .../puppet/modules/hadoop-oozie/manifests/init.pp  |   4 +-
 .../puppet/modules/hadoop-pig/manifests/init.pp    |   2 +-
 .../puppet/modules/hadoop-sqoop/manifests/init.pp  |   4 +-
 .../modules/hadoop-zookeeper/manifests/init.pp     |   4 +-
 .../puppet/modules/hadoop/manifests/init.pp        | 340 ++++++++++++--------
 .../puppet/modules/hadoop/templates/hadoop-env.sh  |   2 +-
 .../puppet/modules/hadoop/templates/hdfs-site.xml  |  32 +-
 .../puppet/modules/hadoop/templates/yarn-site.xml  |  71 +++++
 .../puppet/modules/hcatalog/manifests/init.pp      |   4 +-
 bigtop-deploy/puppet/modules/hue/manifests/init.pp |   2 +-
 .../puppet/modules/kerberos/manifests/init.pp      |  23 +-
 .../puppet/modules/mahout/manifests/init.pp        |   2 +-
 .../puppet/modules/solr/manifests/init.pp          |   2 +-
 .../puppet/modules/spark/manifests/init.pp         |   6 +-
 .../puppet/modules/tachyon/manifests/init.pp       |   8 +-
 29 files changed, 586 insertions(+), 539 deletions(-)
 delete mode 100644 bigtop-deploy/puppet/config/site.csv.example
 create mode 100644 bigtop-deploy/puppet/hiera.yaml
 create mode 100644 bigtop-deploy/puppet/hieradata/bigtop/cluster.yaml
 create mode 100644 bigtop-deploy/puppet/hieradata/bigtop/ha.yaml
 create mode 100644 bigtop-deploy/puppet/hieradata/bigtop/noha.yaml
 create mode 100644 bigtop-deploy/puppet/hieradata/site.yaml
 delete mode 100644 bigtop-deploy/puppet/modules/bigtop_util/lib/puppet/parser/functions/append_each.rb

diff --git a/bigtop-deploy/puppet/config/site.csv.example b/bigtop-deploy/puppet/config/site.csv.example
deleted file mode 100644
index 60c88eb..0000000
--- a/bigtop-deploy/puppet/config/site.csv.example
+++ /dev/null
@@ -1,28 +0,0 @@
-### WARNING:
-### actual site.csv file shouldn't contain lines starting with '#'
-### It will cause the parse to choke.
-### End of WARNING
-### This file needs to be customized to reflect the configuration of your cluster
-### Store it as $BIGTOP_DEPLOY_PATH/config/site.csv
-### use --confdir=$BIGTOP_DEPLOY_PATH (see README for more info)
-# FQDN of Namenode
-hadoop_head_node,hadoopmaster.example.com
-# FQDN of standby node (for HA)
-#standby_head_node,standbyNN.example.com
-# FQDN of gateway node (if separate from NN)
-#standby_head_node,gateway.example.com
-# Storage directories (will be created if doesn't exist)
-hadoop_storage_dirs,/data/1,/data/2,/data/3,/data/4
-bigtop_yumrepo_uri,http://mirror.example.com/path/to/mirror/
-# A list of stack' components to be deployed can be specified via special
-# "$components" list. If $components isn't set then everything in the stack 
will
-# be installed as usual. Otherwise only a specified list will be set
-# Possible elements:
-# hadoop,yarn,hbase,tachyon,flume,solrcloud,spark,oozie,hcat,sqoop,httpfs,
-# hue,mahout,giraph,crunch,pig,hive,zookeeper
-# Example (to deploy only HDFS and YARN server and gateway parts)
-#components,hadoop,yarn
-# Test-only variable controls if user hdfs' sshkeys should be installed to allow
-# for passwordless login across the cluster. Required by some integration tests
-#testonly_hdfs_sshkeys=no
-
diff --git a/bigtop-deploy/puppet/hiera.yaml b/bigtop-deploy/puppet/hiera.yaml
new file mode 100644
index 0000000..b276006
--- /dev/null
+++ b/bigtop-deploy/puppet/hiera.yaml
@@ -0,0 +1,7 @@
+---
+:yaml:
+  :datadir: /etc/puppet/hieradata
+:hierarchy:
+  - site
+  - "bigtop/%{hadoop_hiera_ha_path}"
+  - bigtop/cluster
diff --git a/bigtop-deploy/puppet/hieradata/bigtop/cluster.yaml b/bigtop-deploy/puppet/hieradata/bigtop/cluster.yaml
new file mode 100644
index 0000000..41c8e31
--- /dev/null
+++ b/bigtop-deploy/puppet/hieradata/bigtop/cluster.yaml
@@ -0,0 +1,123 @@
+---
+### This file implements defaults and some dependent parameter defaulting logic.
+### Every parameter can be overridden using the hiera lookup hierarchy. The enclosed
+### hiera.yaml provides for this by adding a site.yaml to the lookup where
+### site-specific overrides can be placed. Therefore this file should never need
+### changing by site admins.
+
+# FQDN of Namenode
+#bigtop::hadoop_head_node: "hadoopmaster.example.com"
+# FQDN of standby node (enables HA if set)
+#bigtop::hadoop_standby_head_node: "standbyNN.example.com"
+# FQDN of gateway node (if separate from NN)
+#bigtop::hadoop_gateway_node: "gateway.example.com"
+
+# A list of stack' components to be deployed can be specified via special
+# "$components" list. If $components isn't set then everything in the stack 
will
+# Possible elements:
+# hadoop,yarn,hbase,tachyon,flume,solrcloud,spark,oozie,hcat,sqoop,httpfs,
+# hue,mahout,giraph,crunch,pig,hive,zookeeper
+# Example (to deploy only HDFS and YARN server and gateway parts)
+# This can be a comma-separated list or an array.
+#hadoop_cluster_node::cluster_components:
+#  - hadoop
+#  - yarn
+
+# Storage directories (will be created if doesn't exist)
+#hadoop::hadoop_storage_dirs:
+#  - /data/1
+#  - /data/2
+#  - /data/3
+#  - /data/4
+
+#bigtop::bigtop_yumrepo_uri: "http://mirror.example.com/path/to/mirror/"
+
+# Test-only variable controls if user hdfs' sshkeys should be installed to allow
+# for passwordless login across the cluster. Required by some integration tests
+#hadoop::common_hdfs::testonly_hdfs_sshkeys: "no"
+
+# Default
+#hadoop::common_hdfs::ha: "disabled"
+
+# Kerberos
+#hadoop::hadoop_security_authentication: "kerberos"
+#kerberos::site::domain: "do.main"
+#kerberos::site::realm: "DO.MAIN"
+#kerberos::site::kdc_server: "localhost"
+#kerberos::site::kdc_port: "88"
+#kerberos::site::admin_port: "749"
+#kerberos::site::keytab_export_dir: "/var/lib/bigtop_keytabs"
+
+hadoop::common_hdfs::hadoop_namenode_host: "%{hiera('bigtop::hadoop_head_node')}"
+# actually default but needed for hadoop_namenode_uri here
+hadoop::common_hdfs::hadoop_namenode_port: "8020"
+
+hadoop::common_yarn::hadoop_ps_host: "%{hiera('bigtop::hadoop_head_node')}"
+hadoop::common_yarn::hadoop_rm_host: "%{hiera('bigtop::hadoop_head_node')}"
+# actually default but needed for hue::server::rm_port here
+hadoop::common_yarn::hadoop_rm_port: "8032"
+hadoop::common_yarn::kerberos_realm: "%{hiera('kerberos::site::realm')}"
+
+hadoop::common_mapred_app::hadoop_hs_host: "%{hiera('bigtop::hadoop_head_node')}"
+hadoop::common_mapred_app::hadoop_jobtracker_host: "%{hiera('bigtop::hadoop_head_node')}"
+
+# actually default but needed for hue::server::webhdfs_url here
+hadoop::httpfs::hadoop_httpfs_port: "14000"
+
+bigtop::hadoop_zookeeper_port: "2181"
+hadoop::zk: "%{hiera('bigtop::hadoop_head_node')}:%{hiera('bigtop::hadoop_zookeeper_port')}"
+
+bigtop::hadoop_namenode_uri: "hdfs://%{hiera('hadoop::common_hdfs::hadoop_namenode_host')}:%{hiera('hadoop::common_hdfs::hadoop_namenode_port')}"
+hadoop-hbase::base_relative_rootdir: "/hbase"
+hadoop-hbase::common_config::rootdir: "%{hiera('bigtop::hadoop_namenode_uri')}%{hiera('hadoop-hbase::base_relative_rootdir')}"
+hadoop-hbase::common_config::zookeeper_quorum: "%{hiera('bigtop::hadoop_head_node')}"
+hadoop-hbase::common_config::kerberos_realm: "%{hiera('kerberos::site::realm')}"
+hadoop-hbase::client::thrift: true
+
+solr::server::root_url: "%{hiera('bigtop::hadoop_namenode_uri')}"
+solr::server::zk: "%{hiera('hadoop::zk')}"
+solr::server::kerberos_realm: "%{hiera('kerberos::site::realm')}"
+# Default but needed here to make sure, hue uses the same port
+solr::server::port: "1978"
+
+hadoop-oozie::server::kerberos_realm: "%{hiera('kerberos::site::realm')}"
+
+hcatalog::server::kerberos_realm: "%{hiera('kerberos::site::realm')}"
+hcatalog::webhcat::server::kerberos_realm: "%{hiera('kerberos::site::realm')}"
+
+spark::common::spark_master_host: "%{hiera('bigtop::hadoop_head_node')}"
+
+tachyon::common::master_host: "%{hiera('bigtop::hadoop_head_node')}"
+
+hadoop-zookeeper::server::myid: "0"
+hadoop-zookeeper::server::ensemble:
+  - ["%{hiera('bigtop::hadoop_head_node')}:2888:3888"]
+hadoop-zookeeper::server::kerberos_realm: "%{hiera('kerberos::site::realm')}"
+
+# those are only here because they were present as extlookup keys previously
+bigtop::hadoop_rm_http_port: "8088"
+bigtop::hadoop_rm_proxy_port: "8088"
+bigtop::hadoop_history_server_port: "19888"
+bigtop::sqoop_server_port: "<never defined correctly>"
+bigtop::hbase_thrift_port: "9090"
+bigtop::hadoop_oozie_port: "11000"
+
+hue::server::rm_host: "%{hiera('hadoop::common_yarn::hadoop_rm_host')}"
+hue::server::rm_port: "%{hiera('hadoop::common_yarn::hadoop_rm_port')}"
+hue::server::rm_url: "http://%{hiera('bigtop::hadoop_head_node')}:%{hiera('bigtop::hadoop_rm_http_port')}"
+hue::server::rm_proxy_url: "http://%{hiera('bigtop::hadoop_head_node')}:%{hiera('bigtop::hadoop_rm_proxy_port')}"
+hue::server::history_server_url: "http://%{hiera('bigtop::hadoop_head_node')}:%{hiera('bigtop::hadoop_history_server_port')}"
+# those use fqdn instead of hadoop_head_node because it's only ever activated
+# on the gatewaynode
+hue::server::webhdfs_url: "http://%{fqdn}:%{hiera('hadoop::httpfs::hadoop_httpfs_port')}/webhdfs/v1"
+hue::server::sqoop_url: "http://%{fqdn}:%{hiera('bigtop::sqoop_server_port')}/sqoop"
+hue::server::solr_url: "http://%{fqdn}:%{hiera('solr::server::port')}/solr/"
+hue::server::hbase_thrift_url: "%{fqdn}:%{hiera('bigtop::hbase_thrift_port')}"
+hue::server::oozie_url: "http://%{hiera('bigtop::hadoop_head_node')}:%{hiera('bigtop::hadoop_oozie_port')}/oozie"
+hue::server::default_fs: "%{hiera('bigtop::hadoop_namenode_uri')}"
+hue::server::kerberos_realm: "%{hiera('kerberos::site::realm')}"
+
+giraph::client::zookeeper_quorum: "%{hiera('bigtop::hadoop_head_node')}"
+
+hadoop-hive::client::hbase_zookeeper_quorum: "%{hiera('hadoop-hbase::common_config::zookeeper_quorum')}"
diff --git a/bigtop-deploy/puppet/hieradata/bigtop/ha.yaml b/bigtop-deploy/puppet/hieradata/bigtop/ha.yaml
new file mode 100644
index 0000000..3654987
--- /dev/null
+++ b/bigtop-deploy/puppet/hieradata/bigtop/ha.yaml
@@ -0,0 +1,7 @@
+---
+hadoop::common_hdfs::ha: "manual"
+hadoop::common_hdfs::hadoop_namenode_host:
+  - "%{hiera('bigtop::hadoop_head_node')}"
+  - "%{hiera('bigtop::standby_head_node')}"
+hadoop::common_hdfs::hadoop_ha_nameservice_id: "ha-nn-uri"
+hadoop_cluster_node::hadoop_namenode_uri: "hdfs://%{hiera('hadoop_ha_nameservice_id')}:8020"
diff --git a/bigtop-deploy/puppet/hieradata/bigtop/noha.yaml b/bigtop-deploy/puppet/hieradata/bigtop/noha.yaml
new file mode 100644
index 0000000..ac81412
--- /dev/null
+++ b/bigtop-deploy/puppet/hieradata/bigtop/noha.yaml
@@ -0,0 +1,2 @@
+---
+# all done via defaults
diff --git a/bigtop-deploy/puppet/hieradata/site.yaml b/bigtop-deploy/puppet/hieradata/site.yaml
new file mode 100644
index 0000000..339e2ab
--- /dev/null
+++ b/bigtop-deploy/puppet/hieradata/site.yaml
@@ -0,0 +1,32 @@
+---
+bigtop::hadoop_head_node: "head.node.fqdn"
+#bigtop::standby_head_node: "standby.head.node.fqdn"
+
+hadoop::hadoop_storage_dirs:
+  - /data/1
+  - /data/2
+  - /data/3
+  - /data/4
+
+#hadoop_cluster_node::cluster_components:
+#  - crunch
+#  - flume
+#  - giraph
+#  - hbase
+#  - hcat
+#  - hive
+#  - httpfs
+#  - hue
+#  - mahout
+#  - mapred-app
+#  - oozie
+#  - pig
+#  - solrcloud
+#  - spark
+#  - sqoop
+#  - tachyon
+#  - yarn
+#  - zookeeper
+
+# Debian:
+#bigtop::jdk_package_name: "openjdk-7-jre-headless"
diff --git a/bigtop-deploy/puppet/manifests/cluster.pp b/bigtop-deploy/puppet/manifests/cluster.pp
index 903f3e8..d4bae8a 100644
--- a/bigtop-deploy/puppet/manifests/cluster.pp
+++ b/bigtop-deploy/puppet/manifests/cluster.pp
@@ -13,131 +13,37 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-class hadoop_cluster_node {
-  require bigtop_util  
-
-  $hadoop_head_node        = extlookup("hadoop_head_node") 
-  $standby_head_node = extlookup("standby_head_node", "")
-  $hadoop_gateway_node     = extlookup("hadoop_gateway_node", $hadoop_head_node)
-
-  $hadoop_ha = $standby_head_node ? {
-    ""      => disabled,
-    default => extlookup("hadoop_ha", "manual"),
-  }
-
-
-  $hadoop_namenode_host        = $hadoop_ha ? {
-    "disabled" => $hadoop_head_node,
-    default    => [ $hadoop_head_node, $standby_head_node ],
-  }
-  $hadoop_namenode_port        = extlookup("hadoop_namenode_port", "8020")
-  $hadoop_dfs_namenode_plugins = extlookup("hadoop_dfs_namenode_plugins", "")
-  $hadoop_dfs_datanode_plugins = extlookup("hadoop_dfs_datanode_plugins", "")
-  # $hadoop_dfs_namenode_plugins="org.apache.hadoop.thriftfs.NamenodePlugin"
-  # $hadoop_dfs_datanode_plugins="org.apache.hadoop.thriftfs.DatanodePlugin"
-  $hadoop_ha_nameservice_id    = extlookup("hadoop_ha_nameservice_id", "ha-nn-uri")
-  $hadoop_namenode_uri   = $hadoop_ha ? {
-    "disabled" => "hdfs://$hadoop_namenode_host:$hadoop_namenode_port",
-    default    => "hdfs://${hadoop_ha_nameservice_id}:8020",
-  }
-
-  $hadoop_rm_host        = $hadoop_head_node
-  $hadoop_rt_port        = extlookup("hadoop_rt_port", "8025")
-  $hadoop_rm_port        = extlookup("hadoop_rm_port", "8032")
-  $hadoop_sc_port        = extlookup("hadoop_sc_port", "8030")
-
-  $hadoop_hs_host        = $hadoop_head_node
-  $hadoop_hs_port        = extlookup("hadoop_hs_port", "10020")
-  $hadoop_hs_webapp_port = extlookup("hadoop_hs_webapp_port", "19888")
-
-  $hadoop_ps_host        = $hadoop_head_node
-  $hadoop_ps_port        = extlookup("hadoop_ps_port", "20888")
-
-  $hadoop_jobtracker_host            = $hadoop_head_node
-  $hadoop_jobtracker_port            = extlookup("hadoop_jobtracker_port", "8021")
-  $hadoop_mapred_jobtracker_plugins  = extlookup("hadoop_mapred_jobtracker_plugins", "")
-  $hadoop_mapred_tasktracker_plugins = extlookup("hadoop_mapred_tasktracker_plugins", "")
-
-  $hadoop_zookeeper_port             = extlookup("hadoop_zookeeper_port", "2181")
-  $solrcloud_port                    = extlookup("solrcloud_port", "1978")
-  $solrcloud_admin_port              = extlookup("solrcloud_admin_port", "1979")
-  $hadoop_oozie_port                 = extlookup("hadoop_oozie_port", "11000")
-  $hadoop_httpfs_port                = extlookup("hadoop_httpfs_port", "14000")
-  $hadoop_rm_http_port               = extlookup("hadoop_rm_http_port", "8088")
-  $hadoop_rm_proxy_port              = extlookup("hadoop_rm_proxy_port", "8088")
-  $hadoop_history_server_port        = extlookup("hadoop_history_server_port", "19888")
-  $hbase_thrift_port                 = extlookup("hbase_thrift_port", "9090")
-  $spark_master_port                 = extlookup("spark_master_port", "7077")
-  $spark_master_ui_port              = extlookup("spark_master_ui_port", "18080")
-
-  # Lookup comma separated components (i.e. hadoop,spark,hbase ).
-  $components_tmp                        = extlookup("components",    split($components, ","))
+class hadoop_cluster_node (
+  $hadoop_security_authentication = hiera("hadoop::hadoop_security_authentication", "simple"),
+
+  # Lookup component array or comma separated components (i.e.
+  # hadoop,spark,hbase ) as a default via facter.
+  $cluster_components = "$::components"
+  ) {
   # Ensure (even if a single value) that the type is an array.
-  if is_array($components_tmp) {
-    $components = $components_tmp
-  }
-  else {
-    $components = any2array($components_tmp,",")
+  if is_array($cluster_components) {
+    $components = $cluster_components
+  } else {
+    $components = any2array($cluster_components, ",")
   }
 
   $all = ($components[0] == undef)
 
-  $hadoop_ha_zookeeper_quorum        = "${hadoop_head_node}:${hadoop_zookeeper_port}"
-  $solrcloud_zk                      = "${hadoop_head_node}:${hadoop_zookeeper_port}"
-  $hbase_thrift_address              = "${hadoop_head_node}:${hbase_thrift_port}"
-  $hadoop_oozie_url                  = "http://${hadoop_head_node}:${hadoop_oozie_port}/oozie"
-  $hadoop_httpfs_url                 = "http://${hadoop_head_node}:${hadoop_httpfs_port}/webhdfs/v1"
-  $sqoop_server_url                  = "http://${hadoop_head_node}:${sqoop_server_port}/sqoop"
-  $solrcloud_url                     = "http://${hadoop_head_node}:${solrcloud_port}/solr/"
-  $hadoop_rm_url                     = "http://${hadoop_head_node}:${hadoop_rm_http_port}"
-  $hadoop_rm_proxy_url               = "http://${hadoop_head_node}:${hadoop_rm_proxy_port}"
-  $hadoop_history_server_url         = "http://${hadoop_head_node}:${hadoop_history_server_port}"
-
-  $bigtop_real_users = [ 'jenkins', 'testuser', 'hudson' ]
-
-  $hadoop_core_proxyusers = { oozie => { groups => 'hudson,testuser,root,hadoop,jenkins,oozie,httpfs,hue,users', hosts => "*" },
-                                hue => { groups => 'hudson,testuser,root,hadoop,jenkins,oozie,httpfs,hue,users', hosts => "*" },
-                             httpfs => { groups => 'hudson,testuser,root,hadoop,jenkins,oozie,httpfs,hue,users', hosts => "*" } }
-
-  $hbase_relative_rootdir        = extlookup("hadoop_hbase_rootdir", "/hbase")
-  $hadoop_hbase_rootdir = "$hadoop_namenode_uri$hbase_relative_rootdir"
-  $hadoop_hbase_zookeeper_quorum = $hadoop_head_node
-  $hbase_heap_size               = extlookup("hbase_heap_size", "1024")
-  $hbase_thrift_server           = $hadoop_head_node
-
-  $giraph_zookeeper_quorum       = $hadoop_head_node
-
-  $spark_master_host             = $hadoop_head_node
-  $tachyon_master_host            = $hadoop_head_node
-
-  $hadoop_zookeeper_ensemble = ["$hadoop_head_node:2888:3888"]
-
-  # Set from facter if available
-  $roots              = extlookup("hadoop_storage_dirs",       split($hadoop_storage_dirs, ";"))
-  $namenode_data_dirs = extlookup("hadoop_namenode_data_dirs", append_each("/namenode", $roots))
-  $hdfs_data_dirs     = extlookup("hadoop_hdfs_data_dirs",     append_each("/hdfs",     $roots))
-  $mapred_data_dirs   = extlookup("hadoop_mapred_data_dirs",   append_each("/mapred",   $roots))
-  $yarn_data_dirs     = extlookup("hadoop_yarn_data_dirs",     append_each("/yarn",     $roots))
-
-  $hadoop_security_authentication = extlookup("hadoop_security", "simple")
   if ($hadoop_security_authentication == "kerberos") {
-    $kerberos_domain     = extlookup("hadoop_kerberos_domain")
-    $kerberos_realm      = extlookup("hadoop_kerberos_realm")
-    $kerberos_kdc_server = extlookup("hadoop_kerberos_kdc_server")
-
     include kerberos::client
   }
 
   # Flume agent is the only component that goes on EVERY node in the cluster
   if ($all or "flume" in $components) {
-    hadoop-flume::agent { "flume agent":
-    }
+    include hadoop-flume::agent
   }
 }
 
 
 
-class hadoop_worker_node inherits hadoop_cluster_node {
+class hadoop_worker_node (
+  $bigtop_real_users = [ 'jenkins', 'testuser', 'hudson' ]
+  ) inherits hadoop_cluster_node {
   user { $bigtop_real_users:
     ensure     => present,
     system     => false,
@@ -150,80 +56,42 @@ class hadoop_worker_node inherits hadoop_cluster_node {
     User<||> -> Kerberos::Host_keytab<||>
   }
 
-  hadoop::datanode { "datanode":
-        namenode_host => $hadoop_namenode_host,
-        namenode_port => $hadoop_namenode_port,
-        dirs => $hdfs_data_dirs,
-        auth => $hadoop_security_authentication,
-        ha   => $hadoop_ha,
-  }
-
+  include hadoop::datanode
   if ($all or "yarn" in $components) {
-    hadoop::nodemanager { "nodemanager":
-          rm_host => $hadoop_rm_host,
-          rm_port => $hadoop_rm_port,
-          rt_port => $hadoop_rt_port,
-          dirs => $yarn_data_dirs,
-          auth => $hadoop_security_authentication,
-    }
+    include hadoop::nodemanager
   }
   if ($all or "hbase" in $components) {
-    hadoop-hbase::server { "hbase region server":
-          rootdir => $hadoop_hbase_rootdir,
-          heap_size => $hbase_heap_size,
-          zookeeper_quorum => $hadoop_hbase_zookeeper_quorum,
-          kerberos_realm => $kerberos_realm,
-    }
+    include hadoop-hbase::server
   }
 
   ### If mapred is not installed, yarn can fail.
   ### So, when we install yarn, we also need mapred for now.
   ### This dependency should be cleaned up eventually.
   if ($all or "mapred-app" or "yarn" in $components) {
-    hadoop::mapred-app { "mapred-app":
-          namenode_host => $hadoop_namenode_host,
-          namenode_port => $hadoop_namenode_port,
-          jobtracker_host => $hadoop_jobtracker_host,
-          jobtracker_port => $hadoop_jobtracker_port,
-          auth => $hadoop_security_authentication,
-          dirs => $mapred_data_dirs,
-    }
+    include hadoop::mapred-app
   }
 
   if ($all or "solrcloud" in $components) {
-    solr::server { "solrcloud server":
-         port        => $solrcloud_port,
-         port_admin  => $solrcloud_admin_port,
-         zk          => $solrcloud_zk,
-         root_url    => $hadoop_namenode_uri,
-         kerberos_realm => $kerberos_realm,
-    }
+    include solr::server
   }
 
   if ($all or "spark" in $components) {
-    spark::worker { "spark worker":
-         master_host    => $spark_master_host,
-         master_port    => $spark_master_port,
-         master_ui_port => $spark_master_ui_port,
-    }
+    include spark::worker
   }
 
-  if ($components[0] == undef or "tachyon" in $components) {
-    tachyon::worker { "tachyon worker":
-         master_host => $tachyon_master_host
-    }
+  if ($all or "tachyon" in $components) {
+    include tachyon::worker
   }
 
 }
 
 class hadoop_head_node inherits hadoop_worker_node {
-
   exec { "init hdfs":
     path    => ['/bin','/sbin','/usr/bin','/usr/sbin'],
     command => 'bash -x /usr/lib/hadoop/libexec/init-hdfs.sh',
     require => Package['hadoop-hdfs']
   }
-  Hadoop::Namenode<||> -> Hadoop::Datanode<||> -> Exec<| title == "init hdfs" |>
+  Class['Hadoop::Namenode'] -> Class['Hadoop::Datanode'] -> Exec<| title == "init hdfs" |>
 
 if ($hadoop_security_authentication == "kerberos") {
     include kerberos::server
@@ -231,196 +99,104 @@ if ($hadoop_security_authentication == "kerberos") {
     include kerberos::kdc::admin_server
   }
 
-  hadoop::namenode { "namenode":
-        host => $hadoop_namenode_host,
-        port => $hadoop_namenode_port,
-        dirs => $namenode_data_dirs,
-        auth => $hadoop_security_authentication,
-        ha   => $hadoop_ha,
-        zk   => $hadoop_ha_zookeeper_quorum,
-  }
+  include hadoop::namenode
 
-  if ($hadoop_ha == "disabled") {
-    hadoop::secondarynamenode { "secondary namenode":
-          namenode_host => $hadoop_namenode_host,
-          namenode_port => $hadoop_namenode_port,
-          auth => $hadoop_security_authentication,
-    }
+  if ($hadoop::common_hdfs::ha == "disabled") {
+    include hadoop::secondarynamenode
   }
 
   if ($all or "yarn" in $components) {
-    hadoop::resourcemanager { "resourcemanager":
-          host => $hadoop_rm_host,
-          port => $hadoop_rm_port,
-          rt_port => $hadoop_rt_port,
-          sc_port => $hadoop_sc_port,
-          auth => $hadoop_security_authentication,
-    }
-
-    hadoop::historyserver { "historyserver":
-          host => $hadoop_hs_host,
-          port => $hadoop_hs_port,
-          webapp_port => $hadoop_hs_webapp_port,
-          auth => $hadoop_security_authentication,
-    }
-
-    hadoop::proxyserver { "proxyserver":
-          host => $hadoop_ps_host,
-          port => $hadoop_ps_port,
-          auth => $hadoop_security_authentication,
-    }
-    Exec<| title == "init hdfs" |> -> Hadoop::Resourcemanager<||> -> 
Hadoop::Nodemanager<||>
-    Exec<| title == "init hdfs" |> -> Hadoop::Historyserver<||>
+    include hadoop::resourcemanager
+    include hadoop::historyserver
+    include hadoop::proxyserver
+    Exec<| title == "init hdfs" |> -> Class['Hadoop::Resourcemanager'] -> 
Class['Hadoop::Nodemanager']
+    Exec<| title == "init hdfs" |> -> Class['Hadoop::Historyserver']
   }
 
   if ($all or "hbase" in $components) {
-    hadoop-hbase::master { "hbase master":
-          rootdir => $hadoop_hbase_rootdir,
-          heap_size => $hbase_heap_size,
-          zookeeper_quorum => $hadoop_hbase_zookeeper_quorum,
-          kerberos_realm => $kerberos_realm,
-    }
-    Exec<| title == "init hdfs" |> -> Hadoop-hbase::Master<||>
+    include hadoop-hbase::master
+    Exec<| title == "init hdfs" |> -> Class['Hadoop-hbase::Master']
   }
 
   if ($all or "oozie" in $components) {
-    hadoop-oozie::server { "oozie server":
-          kerberos_realm => $kerberos_realm,
+    include hadoop-oozie::server
+    if ($all or "mapred-app" in $components) {
+      Class['Hadoop::Mapred-app'] -> Class['Hadoop-oozie::Server']
     }
-    Hadoop::Mapred-app<||> -> Hadoop-oozie::Server<||>
-    Exec<| title == "init hdfs" |> -> Hadoop-oozie::Server<||>
+    Exec<| title == "init hdfs" |> -> Class['Hadoop-oozie::Server']
   }
 
   if ($all or "hcat" in $components) {
-  hcatalog::server { "hcatalog server":
-        kerberos_realm => $kerberos_realm,
-  }
-  hcatalog::webhcat::server { "webhcat server":
-        kerberos_realm => $kerberos_realm,
-  }
+    include hcatalog::server
+    include hcatalog::webhcat::server
   }
 
   if ($all or "spark" in $components) {
-  spark::master { "spark master":
-       master_host    => $spark_master_host,
-       master_port    => $spark_master_port,
-       master_ui_port => $spark_master_ui_port,
-  }
+    include spark::master
   }
 
-  if ($all == undef or "tachyon" in $components) {
-   tachyon::master { "tachyon-master":
-       master_host => $tachyon_master_host
-   }
+  if ($all or "tachyon" in $components) {
+   include tachyon::master
   }
 
   if ($all or "hbase" in $components) {
-    hadoop-zookeeper::server { "zookeeper":
-          myid => "0",
-          ensemble => $hadoop_zookeeper_ensemble,
-          kerberos_realm => $kerberos_realm,
-    }
+    include hadoop-zookeeper::server
   }
 
-  Exec<| title == "init hdfs" |> -> Hadoop::Rsync_hdfs<||>
-
+  # class hadoop::rsync_hdfs isn't used anywhere
+  #Exec<| title == "init hdfs" |> -> Class['Hadoop::Rsync_hdfs']
 }
 
 class standby_head_node inherits hadoop_cluster_node {
-  hadoop::namenode { "namenode":
-        host => $hadoop_namenode_host,
-        port => $hadoop_namenode_port,
-        dirs => $namenode_data_dirs,
-        auth => $hadoop_security_authentication,
-        ha   => $hadoop_ha,
-        zk   => $hadoop_ha_zookeeper_quorum,
-  }
+  include hadoop::namenode
 }
 
 class hadoop_gateway_node inherits hadoop_cluster_node {
-  $hbase_thrift_address              = "${fqdn}:${hbase_thrift_port}"
-  $hadoop_httpfs_url                 = "http://${fqdn}:${hadoop_httpfs_port}/webhdfs/v1"
-  $sqoop_server_url                  = "http://${fqdn}:${sqoop_server_port}/sqoop"
-  $solrcloud_url                     = "http://${fqdn}:${solrcloud_port}/solr/"
-
   if ($all or "sqoop" in $components) {
-    hadoop-sqoop::server { "sqoop server":
-    }
+    include hadoop-sqoop::server
   }
 
   if ($all or "httpfs" in $components) {
-    hadoop::httpfs { "httpfs":
-          namenode_host => $hadoop_namenode_host,
-          namenode_port => $hadoop_namenode_port,
-          auth => $hadoop_security_authentication,
+    include hadoop::httpfs
+    if ($all or "hue" in $components) {
+      Class['Hadoop::Httpfs'] -> Class['Hue::Server']
     }
-    Hadoop::Httpfs<||> -> Hue::Server<||>
   }
 
   if ($all or "hue" in $components) {
-    hue::server { "hue server":
-          rm_url      => $hadoop_rm_url,
-          rm_proxy_url => $hadoop_rm_proxy_url,
-          history_server_url => $hadoop_history_server_url,
-          webhdfs_url => $hadoop_httpfs_url,
-          sqoop_url   => $sqoop_server_url,
-          solr_url    => $solrcloud_url,
-          hbase_thrift_url => $hbase_thrift_address,
-          rm_host     => $hadoop_rm_host,
-          rm_port     => $hadoop_rm_port,
-          oozie_url   => $hadoop_oozie_url,
-          default_fs  => $hadoop_namenode_uri,
-          kerberos_realm => $kerberos_realm,
+    include hue::server
+    if ($all or "hbase" in $components) {
+      Class['Hadoop-hbase::Client'] -> Class['Hue::Server']
     }
   }
-  Hadoop-hbase::Client<||> -> Hue::Server<||>
 
-  hadoop::client { "hadoop client":
-    namenode_host => $hadoop_namenode_host,
-    namenode_port => $hadoop_namenode_port,
-    jobtracker_host => $hadoop_jobtracker_host,
-    jobtracker_port => $hadoop_jobtracker_port,
-    # auth => $hadoop_security_authentication,
-  }
+  include hadoop::client
 
   if ($all or "mahout" in $components) {
-    mahout::client { "mahout client":
-    }
+    include mahout::client
   }
   if ($all or "giraph" in $components) {
-    giraph::client { "giraph client":
-       zookeeper_quorum => $giraph_zookeeper_quorum,
-    }
+    include giraph::client
   }
   if ($all or "crunch" in $components) {
-    crunch::client { "crunch client":
-    }
+    include crunch::client
   }
   if ($all or "pig" in $components) {
-    hadoop-pig::client { "pig client":
-    }
+    include hadoop-pig::client
   }
   if ($all or "hive" in $components) {
-    hadoop-hive::client { "hive client":
-       hbase_zookeeper_quorum => $hadoop_hbase_zookeeper_quorum,
-    }
+    include hadoop-hive::client
   }
   if ($all or "sqoop" in $components) {
-    hadoop-sqoop::client { "sqoop client":
-    }
+    include hadoop-sqoop::client
   }
   if ($all or "oozie" in $components) {
-    hadoop-oozie::client { "oozie client":
-    }
+    include hadoop-oozie::client
   }
   if ($all or "hbase" in $components) {
-    hadoop-hbase::client { "hbase thrift client":
-      thrift => true,
-      kerberos_realm => $kerberos_realm,
-    }
+    include hadoop-hbase::client
   }
   if ($all or "zookeeper" in $components) {
-    hadoop-zookeeper::client { "zookeeper client":
-    }
+    include hadoop-zookeeper::client
   }
 }
diff --git a/bigtop-deploy/puppet/manifests/site.pp b/bigtop-deploy/puppet/manifests/site.pp
index 8997140..dd5921c 100644
--- a/bigtop-deploy/puppet/manifests/site.pp
+++ b/bigtop-deploy/puppet/manifests/site.pp
@@ -13,19 +13,15 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-require bigtop_util
-$puppet_confdir = get_setting("confdir")
 $default_yumrepo = "http://bigtop01.cloudera.org:8080/view/Releases/job/Bigtop-0.8.0/label=centos6/6/artifact/output/"
-$extlookup_datadir="$puppet_confdir/config"
-$extlookup_precedence = ["site", "default"]
-$jdk_package_name = extlookup("jdk_package_name", "jdk")
+$jdk_package_name = hiera("bigtop::jdk_package_name", "jdk")
 
 stage {"pre": before => Stage["main"]}
 
 case $operatingsystem {
     /(OracleLinux|Amazon|CentOS|Fedora|RedHat)/: {
        yumrepo { "Bigtop":
-          baseurl => extlookup("bigtop_yumrepo_uri", $default_yumrepo),
+          baseurl => hiera("bigtop::bigtop_yumrepo_uri", $default_yumrepo),
           descr => "Bigtop packages",
           enabled => 1,
           gpgcheck => 0,
@@ -44,10 +40,16 @@ package { $jdk_package_name:
 import "cluster.pp"
 
 node default {
-  include stdlib
-  $hadoop_head_node = extlookup("hadoop_head_node") 
-  $standby_head_node = extlookup("standby_head_node", "")
-  $hadoop_gateway_node = extlookup("hadoop_gateway_node", $hadoop_head_node)
+  $hadoop_head_node = hiera("bigtop::hadoop_head_node")
+  $standby_head_node = hiera("bigtop::standby_head_node", "")
+  $hadoop_gateway_node = hiera("bigtop::hadoop_gateway_node", $hadoop_head_node)
+
+  # look into alternate hiera datasources configured using this path in
+  # hiera.yaml
+  $hadoop_hiera_ha_path = $standby_head_node ? {
+    ""      => "noha",
+    default => "ha",
+  }
 
   case $::fqdn {
     $hadoop_head_node: {
@@ -69,7 +71,7 @@ node default {
 Yumrepo<||> -> Package<||>
 
 if versioncmp($::puppetversion,'3.6.1') >= 0 {
-  $allow_virtual_packages = hiera('allow_virtual_packages',false)
+  $allow_virtual_packages = hiera('bigtop::allow_virtual_packages',false)
   Package {
     allow_virtual => $allow_virtual_packages,
   }
diff --git a/bigtop-deploy/puppet/modules/bigtop_util/lib/puppet/parser/functions/append_each.rb b/bigtop-deploy/puppet/modules/bigtop_util/lib/puppet/parser/functions/append_each.rb
deleted file mode 100644
index b360b1e..0000000
--- a/bigtop-deploy/puppet/modules/bigtop_util/lib/puppet/parser/functions/append_each.rb
+++ /dev/null
@@ -1,22 +0,0 @@
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-     
-# Append a string to every element of an array
-
-Puppet::Parser::Functions::newfunction(:append_each, :type => :rvalue) do |args|
-  suffix = (args[0].is_a? Array) ? args[0].join("") : args[0]
-  inputs = (args[1].is_a? Array) ? args[1] : [ args[1] ]
-  inputs.map { |item| item + suffix }
-end
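
[Annotation, not part of the patch: append_each() can go because stdlib's suffix() covers its job, as used for the storage-dir defaults further down. For anyone double-checking the equivalence, here is the deleted function's logic as plain standalone Ruby, no Puppet required:]

```ruby
# What append_each did, transcribed from the deleted parser function:
# append a suffix string to every element of an array, first joining an
# array-valued suffix and wrapping a scalar input into a one-element array.
def append_each(suffix, inputs)
  suffix = suffix.join("") if suffix.is_a?(Array)
  inputs = [inputs] unless inputs.is_a?(Array)
  inputs.map { |item| item + suffix }
end

# For the array case this matches stdlib's suffix($dirs, "/hdfs"):
p append_each("/hdfs", ["/data/1", "/data/2"])
# => ["/data/1/hdfs", "/data/2/hdfs"]
```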
diff --git a/bigtop-deploy/puppet/modules/crunch/manifests/init.pp b/bigtop-deploy/puppet/modules/crunch/manifests/init.pp
index d446667..b31edf6 100644
--- a/bigtop-deploy/puppet/modules/crunch/manifests/init.pp
+++ b/bigtop-deploy/puppet/modules/crunch/manifests/init.pp
@@ -14,7 +14,7 @@
 # limitations under the License.
 
 class crunch {
-  define client {
+  class client {
     package { ["crunch", "crunch-doc"]:
       ensure => latest,
     } 
diff --git a/bigtop-deploy/puppet/modules/giraph/manifests/init.pp b/bigtop-deploy/puppet/modules/giraph/manifests/init.pp
index 6652e40..1dc0d9b 100644
--- a/bigtop-deploy/puppet/modules/giraph/manifests/init.pp
+++ b/bigtop-deploy/puppet/modules/giraph/manifests/init.pp
@@ -14,7 +14,7 @@
 # limitations under the License.
 
 class giraph {
-  define client($zookeeper_quorum = 'localhost') {
+  class client($zookeeper_quorum = 'localhost') {
     package { "giraph":
       ensure => latest,
     } 
diff --git a/bigtop-deploy/puppet/modules/hadoop-flume/manifests/init.pp b/bigtop-deploy/puppet/modules/hadoop-flume/manifests/init.pp
index 8e3bf64..daf352a 100644
--- a/bigtop-deploy/puppet/modules/hadoop-flume/manifests/init.pp
+++ b/bigtop-deploy/puppet/modules/hadoop-flume/manifests/init.pp
@@ -14,7 +14,7 @@
 # limitations under the License.
 
 class hadoop-flume {
-  define agent($sources = [], $sinks = [], $channels = []) {
+  class agent($sources = [], $sinks = [], $channels = []) {
     package { "flume-agent":
       ensure => latest,
     } 
diff --git a/bigtop-deploy/puppet/modules/hadoop-hbase/manifests/init.pp b/bigtop-deploy/puppet/modules/hadoop-hbase/manifests/init.pp
index 3bbaa8a..5ef45b1 100644
--- a/bigtop-deploy/puppet/modules/hadoop-hbase/manifests/init.pp
+++ b/bigtop-deploy/puppet/modules/hadoop-hbase/manifests/init.pp
@@ -20,7 +20,7 @@ class hadoop-hbase {
     } 
   }
 
-  class common-config {
+  class common_config ($rootdir, $zookeeper_quorum, $kerberos_realm = "", $heap_size="1024") {
     include client-package
     if ($kerberos_realm) {
       require kerberos::client
@@ -45,8 +45,8 @@ class hadoop-hbase {
     }
   }
 
-  define client($thrift = false, $kerberos_realm = "") {
-    include common-config
+  class client($thrift = false) {
+    include common_config
 
     if ($thrift) {
       package { "hbase-thrift":
@@ -64,8 +64,8 @@ class hadoop-hbase {
     }
   }
 
-  define server($rootdir, $zookeeper_quorum, $kerberos_realm = "", $heap_size="1024") {
-    include common-config
+  class server {
+    include common_config
 
     package { "hbase-regionserver":
       ensure => latest,
@@ -81,8 +81,8 @@ class hadoop-hbase {
    Kerberos::Host_keytab <| title == "hbase" |> -> Service["hbase-regionserver"]
   }
 
-  define master($rootdir, $zookeeper_quorum, $kerberos_realm = "", $heap_size="1024") {
-    include common-config
+  class master {
+    include common_config
 
     package { "hbase-master":
       ensure => latest,
diff --git a/bigtop-deploy/puppet/modules/hadoop-hive/manifests/init.pp b/bigtop-deploy/puppet/modules/hadoop-hive/manifests/init.pp
index 891d4be..f9dede4 100644
--- a/bigtop-deploy/puppet/modules/hadoop-hive/manifests/init.pp
+++ b/bigtop-deploy/puppet/modules/hadoop-hive/manifests/init.pp
@@ -14,7 +14,7 @@
 # limitations under the License.
 
 class hadoop-hive {
-  define client($hbase_master = "", $hbase_zookeeper_quorum = "") {
+  class client($hbase_master = "", $hbase_zookeeper_quorum = "") {
     package { "hive":
       ensure => latest,
     } 
diff --git a/bigtop-deploy/puppet/modules/hadoop-oozie/manifests/init.pp b/bigtop-deploy/puppet/modules/hadoop-oozie/manifests/init.pp
index 46b937b..f1177e9 100644
--- a/bigtop-deploy/puppet/modules/hadoop-oozie/manifests/init.pp
+++ b/bigtop-deploy/puppet/modules/hadoop-oozie/manifests/init.pp
@@ -14,13 +14,13 @@
 # limitations under the License.
 
 class hadoop-oozie {
-  define client($kerberos_realm = "") {
+  class client {
     package { "oozie-client":
       ensure => latest,
     } 
   }
 
-  define server($kerberos_realm = "") {
+  class server($kerberos_realm = "") {
     if ($kerberos_realm) {
       require kerberos::client
       kerberos::host_keytab { "oozie":
diff --git a/bigtop-deploy/puppet/modules/hadoop-pig/manifests/init.pp b/bigtop-deploy/puppet/modules/hadoop-pig/manifests/init.pp
index f26047b..37bfde0 100644
--- a/bigtop-deploy/puppet/modules/hadoop-pig/manifests/init.pp
+++ b/bigtop-deploy/puppet/modules/hadoop-pig/manifests/init.pp
@@ -14,7 +14,7 @@
 # limitations under the License.
 
 class hadoop-pig {
-  define client {
+  class client {
     package { "pig":
       ensure => latest,
       require => Package["hadoop"],
diff --git a/bigtop-deploy/puppet/modules/hadoop-sqoop/manifests/init.pp b/bigtop-deploy/puppet/modules/hadoop-sqoop/manifests/init.pp
index d1d08db..e0223ba 100644
--- a/bigtop-deploy/puppet/modules/hadoop-sqoop/manifests/init.pp
+++ b/bigtop-deploy/puppet/modules/hadoop-sqoop/manifests/init.pp
@@ -14,13 +14,13 @@
 # limitations under the License.
 
 class hadoop-sqoop {
-  define client {
+  class client {
     package { "sqoop-client":
       ensure => latest,
     } 
   }
 
-  define server {
+  class server {
     package { "sqoop-server":
       ensure => latest,
     } 
diff --git a/bigtop-deploy/puppet/modules/hadoop-zookeeper/manifests/init.pp b/bigtop-deploy/puppet/modules/hadoop-zookeeper/manifests/init.pp
index 701590e..dfbb6eb 100644
--- a/bigtop-deploy/puppet/modules/hadoop-zookeeper/manifests/init.pp
+++ b/bigtop-deploy/puppet/modules/hadoop-zookeeper/manifests/init.pp
@@ -14,14 +14,14 @@
 # limitations under the License.
 
 class hadoop-zookeeper {
-  define client {
+  class client {
     package { "zookeeper":
       ensure => latest,
       require => Package["jdk"],
     } 
   }
 
-  define server($myid, $ensemble = ["localhost:2888:3888"],
+  class server($myid, $ensemble = ["localhost:2888:3888"],
                 $kerberos_realm = "") 
   {
     package { "zookeeper-server":
diff --git a/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp b/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp
index 32eebe2..2c631ba 100644
--- a/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp
+++ b/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp
@@ -13,7 +13,16 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-class hadoop {
+class hadoop ($hadoop_security_authentication = "simple",
+  $zk = "",
+  # Set from facter if available
+  $hadoop_storage_dirs = split($::hadoop_storage_dirs, ";"),
+  $proxyusers = {
+     oozie => { groups => 'hudson,testuser,root,hadoop,jenkins,oozie,httpfs,hue,users', hosts => "*" },
+       hue => { groups => 'hudson,testuser,root,hadoop,jenkins,oozie,httpfs,hue,users', hosts => "*" },
+    httpfs => { groups => 'hudson,testuser,root,hadoop,jenkins,oozie,httpfs,hue,users', hosts => "*" } } ) {
+
+  include stdlib
 
   /**
    * Common definitions for hadoop nodes.
@@ -36,8 +45,28 @@ class hadoop {
     }
   }
 
-  class common {
-    if ($auth == "kerberos") {
+  class common ($hadoop_java_home = undef,
+      $hadoop_classpath = undef,
+      $hadoop_heapsize = undef,
+      $hadoop_opts = undef,
+      $hadoop_namenode_opts = undef,
+      $hadoop_secondarynamenode_opts = undef,
+      $hadoop_datanode_opts = undef,
+      $hadoop_balancer_opts = undef,
+      $hadoop_jobtracker_opts = undef,
+      $hadoop_tasktracker_opts = undef,
+      $hadoop_client_opts = undef,
+      $hadoop_ssh_opts = undef,
+      $hadoop_log_dir = undef,
+      $hadoop_slaves = undef,
+      $hadoop_master = undef,
+      $hadoop_slave_sleep = undef,
+      $hadoop_pid_dir = undef,
+      $hadoop_ident_string = undef,
+      $hadoop_niceness = undef,
+      $hadoop_security_authentication = $hadoop::hadoop_security_authentication ) inherits hadoop {
+
+    if ($hadoop_security_authentication == "kerberos") {
       include hadoop::kerberos
     }
 
@@ -58,7 +87,25 @@ class hadoop {
     #}
   }
 
-  class common-yarn inherits common {
+  class common_yarn (
+      $yarn_data_dirs = suffix($hadoop::hadoop_storage_dirs, "/yarn"),
+      $kerberos_realm = undef,
+      $hadoop_ps_host,
+      $hadoop_ps_port = "20888",
+      $hadoop_rm_host,
+      $hadoop_rm_port = "8032",
+      $hadoop_rt_port = "8025",
+      $hadoop_sc_port = "8030",
+      $yarn_nodemanager_resource_memory_mb = undef,
+      $yarn_scheduler_maximum_allocation_mb = undef,
+      $yarn_scheduler_minimum_allocation_mb = undef,
+      $yarn_resourcemanager_scheduler_class = undef,
+      $yarn_resourcemanager_ha_enabled = undef,
+      $yarn_resourcemanager_cluster_id = "ha-rm-uri",
+      $yarn_resourcemanager_zk_address = $hadoop::zk) inherits hadoop {
+
+    include common
+
     package { "hadoop-yarn":
       ensure => latest,
       require => [Package["jdk"], Package["hadoop"]],
@@ -76,18 +123,55 @@ class hadoop {
     }
   }
 
-  class common-hdfs inherits common {
+  class common_hdfs ($ha = "disabled",
+      $hadoop_config_dfs_block_size = undef,
+      $hadoop_config_namenode_handler_count = undef,
+      $hadoop_dfs_datanode_plugins = "",
+      $hadoop_dfs_namenode_plugins = "",
+      $hadoop_namenode_host = $fqdn,
+      $hadoop_namenode_port = "8020",
+      $hadoop_namenode_http_port = "50070",
+      $hadoop_namenode_https_port = "50470",
+      $hdfs_data_dirs = suffix($hadoop::hadoop_storage_dirs, "/hdfs"),
+      $hdfs_shortcut_reader_user = undef,
+      $hdfs_support_append = undef,
+      $hdfs_webhdfs_enabled = "true",
+      $hdfs_replication = undef,
+      $hdfs_datanode_fsdataset_volume_choosing_policy = undef,
+      $namenode_data_dirs = suffix($hadoop::hadoop_storage_dirs, "/namenode"),
+      $nameservice_id = "ha-nn-uri",
+      $journalnode_edits_dir = undef,
+      $shared_edits_dir = "/hdfs_shared",
+      $testonly_hdfs_sshkeys  = "no",
+      $hadoop_ha_sshfence_user_home = "/var/lib/hadoop-hdfs",
+      $sshfence_user = "hdfs",
+      $zk = $hadoop::zk,
+      $hadoop_config_fs_inmemory_size_mb = undef,
+      $hadoop_security_group_mapping = undef,
+      $hadoop_core_proxyusers = $hadoop::proxyusers,
+      $hadoop_snappy_codec = undef,
+      $hadoop_security_authentication = $hadoop::hadoop_security_authentication ) inherits hadoop {
+
+    require bigtop_util
+    $sshfence_keydir  = "$hadoop_ha_sshfence_user_home/.ssh"
+    $sshfence_keypath = "$sshfence_keydir/id_sshfence"
+    $puppet_confdir   = get_setting("confdir")
+    $configdir        = hiera("hadoop::configdir", "$puppet_confdir/config")
+    $sshfence_privkey = hiera("hadoop::common_hdfs::sshfence_privkey", "$configdir/hadoop/id_sshfence")
+    $sshfence_pubkey  = hiera("hadoop::common_hdfs::sshfence_pubkey", "$configdir/hadoop/id_sshfence.pub")
+
+    include common
+
  # Check if test mode is enforced, so we can install hdfs ssh-keys for passwordless
-    $testonly   = extlookup("testonly_hdfs_sshkeys", 'no')
-    if ($testonly == "yes") {
+    if ($testonly_hdfs_sshkeys == "yes") {
       notify{"WARNING: provided hdfs ssh keys are for testing purposes only.\n
         They shouldn't be used in production cluster": }
       $ssh_user        = "hdfs"
       $ssh_user_home   = "/var/lib/hadoop-hdfs"
       $ssh_user_keydir = "$ssh_user_home/.ssh"
       $ssh_keypath     = "$ssh_user_keydir/id_hdfsuser"
-      $ssh_privkey     = "$extlookup_datadir/hdfs/id_hdfsuser"
-      $ssh_pubkey      = "$extlookup_datadir/hdfs/id_hdfsuser.pub"
+      $ssh_privkey     = "$configdir/hdfs/id_hdfsuser"
+      $ssh_pubkey      = "$configdir/hdfs/id_hdfsuser.pub"
 
       file { $ssh_user_keydir:
         ensure  => directory,
@@ -113,14 +197,10 @@ class hadoop {
         require => File[$ssh_user_keydir],
       }
     }
-    if ($auth == "kerberos" and $ha != "disabled") {
+    if ($hadoop_security_authentication == "kerberos" and $ha != "disabled") {
       fail("High-availability secure clusters are not currently supported")
     }
 
-    if ($ha != 'disabled') {
-      $nameservice_id = extlookup("hadoop_ha_nameservice_id", "ha-nn-uri")
-    }
-
     package { "hadoop-hdfs":
       ensure => latest,
       require => [Package["jdk"], Package["hadoop"]],
@@ -139,7 +219,32 @@ class hadoop {
     }
   }
 
-  class common-mapred-app inherits common-hdfs {
+  class common_mapred_app (
+      $hadoop_config_io_sort_factor = undef,
+      $hadoop_config_io_sort_mb = undef,
+      $hadoop_config_mapred_child_ulimit = undef,
+      $hadoop_config_mapred_fairscheduler_assignmultiple = undef,
+      $hadoop_config_mapred_fairscheduler_sizebasedweight = undef,
+      $hadoop_config_mapred_job_tracker_handler_count = undef,
+      $hadoop_config_mapred_reduce_parallel_copies = undef,
+      $hadoop_config_mapred_reduce_slowstart_completed_maps = undef,
+      $hadoop_config_mapred_reduce_tasks_speculative_execution = undef,
+      $hadoop_config_tasktracker_http_threads = undef,
+      $hadoop_config_use_compression = undef,
+      $hadoop_hs_host = undef,
+      $hadoop_hs_port = "10020",
+      $hadoop_hs_webapp_port = "19888",
+      $hadoop_jobtracker_fairscheduler_weightadjuster = undef,
+      $hadoop_jobtracker_host,
+      $hadoop_jobtracker_port = "8021",
+      $hadoop_jobtracker_taskscheduler = undef,
+      $hadoop_mapred_jobtracker_plugins = "",
+      $hadoop_mapred_tasktracker_plugins = "",
+      $mapred_acls_enabled = undef,
+      $mapred_data_dirs = suffix($hadoop::hadoop_storage_dirs, "/mapred")) {
+
+    include common_hdfs
+
     package { "hadoop-mapreduce":
       ensure => latest,
       require => [Package["jdk"], Package["hadoop"]],
@@ -157,22 +262,8 @@ class hadoop {
     }
   }
 
-  define datanode ($namenode_host, $namenode_port, $port = "50075", $auth = "simple", $dirs = ["/tmp/data"], $ha = 'disabled') {
-
-    $hadoop_namenode_host           = $namenode_host
-    $hadoop_namenode_port           = $namenode_port
-    $hadoop_datanode_port           = $port
-    $hadoop_security_authentication = $auth
-
-    if ($ha != 'disabled') {
-      # Needed by hdfs-site.xml
-      $sshfence_keydir  = "/usr/lib/hadoop/.ssh"
-      $sshfence_keypath = "$sshfence_keydir/id_sshfence"
-      $sshfence_user    = extlookup("hadoop_ha_sshfence_user",    "hdfs") 
-      $shared_edits_dir = extlookup("hadoop_ha_shared_edits_dir", "/hdfs_shared")
-    }
-
-    include common-hdfs
+  class datanode {
+    include common_hdfs
 
     package { "hadoop-hdfs-datanode":
       ensure => latest,
@@ -189,11 +280,11 @@ class hadoop {
       ensure => running,
       hasstatus => true,
      subscribe => [Package["hadoop-hdfs-datanode"], File["/etc/hadoop/conf/core-site.xml"], File["/etc/hadoop/conf/hdfs-site.xml"], File["/etc/hadoop/conf/hadoop-env.sh"]],
-      require => [ Package["hadoop-hdfs-datanode"], File["/etc/default/hadoop-hdfs-datanode"], File[$dirs] ],
+      require => [ Package["hadoop-hdfs-datanode"], File["/etc/default/hadoop-hdfs-datanode"], File[$hadoop::common_hdfs::hdfs_data_dirs] ],
     }
    Kerberos::Host_keytab <| title == "hdfs" |> -> Exec <| tag == "namenode-format" |> -> Service["hadoop-hdfs-datanode"]
 
-    file { $dirs:
+    file { $hadoop::common_hdfs::hdfs_data_dirs:
       ensure => directory,
       owner => hdfs,
       group => hdfs,
@@ -202,14 +293,12 @@ class hadoop {
     }
   }
 
-  define httpfs ($namenode_host, $namenode_port, $port = "14000", $auth = "simple", $secret = "hadoop httpfs secret") {
-
-    $hadoop_namenode_host = $namenode_host
-    $hadoop_namenode_port = $namenode_port
-    $hadoop_httpfs_port = $port
-    $hadoop_security_authentication = $auth
+  class httpfs ($hadoop_httpfs_port = "14000",
+      $secret = "hadoop httpfs secret",
+      $hadoop_core_proxyusers = $hadoop::proxyusers,
+      $hadoop_security_authentication = $hadoop::hadoop_security_authentication ) inherits hadoop {
 
-    if ($auth == "kerberos") {
+    if ($hadoop_security_authentication == "kerberos") {
       kerberos::host_keytab { "httpfs":
         spnego => true,
         require => Package["hadoop-httpfs"],
@@ -255,11 +344,12 @@ class hadoop {
     }
   }
 
-  define create_hdfs_dirs($hdfs_dirs_meta, $auth="simple") {
+  class create_hdfs_dirs($hdfs_dirs_meta,
+      $hadoop_security_authentication = $hadoop::hadoop_security_authentication ) inherits hadoop {
     $user = $hdfs_dirs_meta[$title][user]
     $perm = $hdfs_dirs_meta[$title][perm]
 
-    if ($auth == "kerberos") {
+    if ($hadoop_security_authentication == "kerberos") {
       require hadoop::kinit
       Exec["HDFS kinit"] -> Exec["HDFS init $title"]
     }
@@ -272,10 +362,11 @@ class hadoop {
     Exec <| title == "activate nn1" |>  -> Exec["HDFS init $title"]
   }
 
-  define rsync_hdfs($files, $auth="simple") {
+  class rsync_hdfs($files,
+      $hadoop_security_authentication = $hadoop::hadoop_security_authentication ) inherits hadoop {
     $src = $files[$title]
 
-    if ($auth == "kerberos") {
+    if ($hadoop_security_authentication == "kerberos") {
       require hadoop::kinit
       Exec["HDFS kinit"] -> Exec["HDFS init $title"]
     }
@@ -288,25 +379,11 @@ class hadoop {
     Exec <| title == "activate nn1" |>  -> Exec["HDFS rsync $title"]
   }
 
-  define namenode ($host = $fqdn , $port = "8020", $auth = "simple", $dirs = ["/tmp/nn"], $ha = 'disabled', $zk = '') {
-
-    $first_namenode = inline_template("<%= Array(@host)[0] %>")
-    $hadoop_namenode_host = $host
-    $hadoop_namenode_port = $port
-    $hadoop_security_authentication = $auth
-
-    if ($ha != 'disabled') {
-      $sshfence_user      = extlookup("hadoop_ha_sshfence_user",      "hdfs") 
-      $sshfence_user_home = extlookup("hadoop_ha_sshfence_user_home", "/var/lib/hadoop-hdfs")
-      $sshfence_keydir    = "$sshfence_user_home/.ssh"
-      $sshfence_keypath   = "$sshfence_keydir/id_sshfence"
-      $sshfence_privkey   = extlookup("hadoop_ha_sshfence_privkey",   "$extlookup_datadir/hadoop/id_sshfence")
-      $sshfence_pubkey    = extlookup("hadoop_ha_sshfence_pubkey",    "$extlookup_datadir/hadoop/id_sshfence.pub")
-      $shared_edits_dir   = extlookup("hadoop_ha_shared_edits_dir",   "/hdfs_shared")
-      $nfs_server         = extlookup("hadoop_ha_nfs_server",         "")
-      $nfs_path           = extlookup("hadoop_ha_nfs_path",           "")
-
-      file { $sshfence_keydir:
+  class namenode ( $nfs_server = "", $nfs_path = "" ) {
+    include common_hdfs
+
+    if ($hadoop::common_hdfs::ha != 'disabled') {
+      file { $hadoop::common_hdfs::sshfence_keydir:
         ensure  => directory,
         owner   => 'hdfs',
         group   => 'hdfs',
@@ -314,25 +391,25 @@ class hadoop {
         require => Package["hadoop-hdfs"],
       }
 
-      file { $sshfence_keypath:
-        source  => $sshfence_privkey,
+      file { $hadoop::common_hdfs::sshfence_keypath:
+        source  => $hadoop::common_hdfs::sshfence_privkey,
         owner   => 'hdfs',
         group   => 'hdfs',
         mode    => '0600',
         before  => Service["hadoop-hdfs-namenode"],
-        require => File[$sshfence_keydir],
+        require => File[$hadoop::common_hdfs::sshfence_keydir],
       }
 
-      file { "$sshfence_keydir/authorized_keys":
+      file { "$hadoop::common_hdfs::sshfence_keydir/authorized_keys":
         source  => $sshfence_pubkey,
         owner   => 'hdfs',
         group   => 'hdfs',
         mode    => '0600',
         before  => Service["hadoop-hdfs-namenode"],
-        require => File[$sshfence_keydir],
+        require => File[$hadoop::common_hdfs::sshfence_keydir],
       }
 
-      file { $shared_edits_dir:
+      file { $hadoop::common_hdfs::shared_edits_dir:
         ensure => directory,
       }
 
@@ -343,20 +420,18 @@ class hadoop {
 
         require nfs::client
 
-        mount { $shared_edits_dir:
+        mount { $hadoop::common_hdfs::shared_edits_dir:
           ensure  => "mounted",
           atboot  => true,
           device  => "${nfs_server}:${nfs_path}",
           fstype  => "nfs",
           options => "tcp,soft,timeo=10,intr,rsize=32768,wsize=32768",
-          require => File[$shared_edits_dir],
+          require => File[$hadoop::common_hdfs::shared_edits_dir],
           before  => Service["hadoop-hdfs-namenode"],
         }
       }
     }
 
-    include common-hdfs
-
     package { "hadoop-hdfs-namenode":
       ensure => latest,
       require => Package["jdk"],
@@ -370,7 +445,7 @@ class hadoop {
     } 
    Kerberos::Host_keytab <| title == "hdfs" |> -> Exec <| tag == "namenode-format" |> -> Service["hadoop-hdfs-namenode"]
 
-    if ($ha == "auto") {
+    if ($hadoop::common_hdfs::ha == "auto") {
       package { "hadoop-hdfs-zkfc":
         ensure => latest,
         require => Package["jdk"],
@@ -385,17 +460,18 @@ class hadoop {
      Service <| title == "hadoop-hdfs-zkfc" |> -> Service <| title == "hadoop-hdfs-namenode" |>
     }
 
+    $first_namenode = any2array($hadoop::common_hdfs::hadoop_namenode_host)[0]
     if ($::fqdn == $first_namenode) {
       exec { "namenode format":
         user => "hdfs",
        command => "/bin/bash -c 'yes Y | hdfs namenode -format >> /var/lib/hadoop-hdfs/nn.format.log 2>&1'",
-        creates => "${dirs[0]}/current/VERSION",
-        require => [ Package["hadoop-hdfs-namenode"], File[$dirs], File["/etc/hadoop/conf/hdfs-site.xml"] ],
+        creates => "${hadoop::common_hdfs::namenode_data_dirs[0]}/current/VERSION",
+        require => [ Package["hadoop-hdfs-namenode"], File[$hadoop::common_hdfs::namenode_data_dirs], File["/etc/hadoop/conf/hdfs-site.xml"] ],
         tag     => "namenode-format",
       } 
 
-      if ($ha != "disabled") {
-        if ($ha == "auto") {
+      if ($hadoop::common_hdfs::ha != "disabled") {
+        if ($hadoop::common_hdfs::ha == "auto") {
           exec { "namenode zk format":
             user => "hdfs",
            command => "/bin/bash -c 'yes N | hdfs zkfc -formatZK >> /var/lib/hadoop-hdfs/zk.format.log 2>&1 || :'",
@@ -413,11 +489,11 @@ class hadoop {
           }
         }
       }
-    } elsif ($ha != "disabled") {
-      hadoop::namedir_copy { $namenode_data_dirs: 
+    } elsif ($hadoop::common_hdfs::ha != "disabled") {
+      hadoop::namedir_copy { $hadoop::common_hdfs::namenode_data_dirs:
         source       => $first_namenode,
-        ssh_identity => $sshfence_keypath,
-        require      => File[$sshfence_keypath],
+        ssh_identity => $hadoop::common_hdfs::sshfence_keypath,
+        require      => File[$hadoop::common_hdfs::sshfence_keypath],
       }
     }
 
@@ -427,7 +503,7 @@ class hadoop {
         require => [Package["hadoop-hdfs-namenode"]],
     }
     
-    file { $dirs:
+    file { $hadoop::common_hdfs::namenode_data_dirs:
       ensure => directory,
       owner => hdfs,
       group => hdfs,
@@ -445,12 +521,8 @@ class hadoop {
     }
   }
       
-  define secondarynamenode ($namenode_host, $namenode_port, $port = "50090", $auth = "simple") {
-
-    $hadoop_secondarynamenode_port = $port
-    $hadoop_security_authentication = $auth
-
-    include common-hdfs
+  class secondarynamenode {
+    include common_hdfs
 
     package { "hadoop-hdfs-secondarynamenode":
       ensure => latest,
@@ -472,15 +544,34 @@ class hadoop {
    Kerberos::Host_keytab <| title == "hdfs" |> -> Service["hadoop-hdfs-secondarynamenode"]
   }
 
+  class journalnode {
+    include common_hdfs
 
-  define resourcemanager ($host = $fqdn, $port = "8032", $rt_port = "8025", $sc_port = "8030", $auth = "simple") {
-    $hadoop_rm_host = $host
-    $hadoop_rm_port = $port
-    $hadoop_rt_port = $rt_port
-    $hadoop_sc_port = $sc_port
-    $hadoop_security_authentication = $auth
+    package { "hadoop-hdfs-journalnode":
+      ensure => latest,
+      require => Package["jdk"],
+    }
+ 
+    service { "hadoop-hdfs-journalnode":
+      ensure => running,
+      hasstatus => true,
+      subscribe => [Package["hadoop-hdfs-journalnode"], File["/etc/hadoop/conf/hadoop-env.sh"],
+                    File["/etc/hadoop/conf/hdfs-site.xml"], File["/etc/hadoop/conf/core-site.xml"]],
+      require => [ Package["hadoop-hdfs-journalnode"], File[$hadoop::common_hdfs::journalnode_edits_dir] ],
+    }
+
+    file { "${hadoop::common_hdfs::journalnode_edits_dir}/${hadoop::common_hdfs::nameservice_id}":
+      ensure => directory,
+      owner => 'hdfs',
+      group => 'hdfs',
+      mode => 755,
+      require => [Package["hadoop-hdfs"]],
+    }
+  }
 
-    include common-yarn
+
+  class resourcemanager {
+    include common_yarn
 
     package { "hadoop-yarn-resourcemanager":
       ensure => latest,
@@ -497,12 +588,8 @@ class hadoop {
    Kerberos::Host_keytab <| tag == "mapreduce" |> -> Service["hadoop-yarn-resourcemanager"]
   }
 
-  define proxyserver ($host = $fqdn, $port = "8088", $auth = "simple") {
-    $hadoop_ps_host = $host
-    $hadoop_ps_port = $port
-    $hadoop_security_authentication = $auth
-
-    include common-yarn
+  class proxyserver {
+    include common_yarn
 
     package { "hadoop-yarn-proxyserver":
       ensure => latest,
@@ -519,13 +606,8 @@ class hadoop {
    Kerberos::Host_keytab <| tag == "mapreduce" |> -> Service["hadoop-yarn-proxyserver"]
   }
 
-  define historyserver ($host = $fqdn, $port = "10020", $webapp_port = "19888", $auth = "simple") {
-    $hadoop_hs_host = $host
-    $hadoop_hs_port = $port
-    $hadoop_hs_webapp_port = $app_port
-    $hadoop_security_authentication = $auth
-
-    include common-mapred-app
+  class historyserver {
+    include common_mapred_app
 
     package { "hadoop-mapreduce-historyserver":
       ensure => latest,
@@ -543,12 +625,8 @@ class hadoop {
   }
 
 
-  define nodemanager ($rm_host, $rm_port, $rt_port, $auth = "simple", $dirs = ["/tmp/yarn"]){
-    $hadoop_rm_host = $rm_host
-    $hadoop_rm_port = $rm_port
-    $hadoop_rt_port = $rt_port
-
-    include common-yarn
+  class nodemanager {
+    include common_yarn
 
     package { "hadoop-yarn-nodemanager":
       ensure => latest,
@@ -560,11 +638,11 @@ class hadoop {
       hasstatus => true,
      subscribe => [Package["hadoop-yarn-nodemanager"], File["/etc/hadoop/conf/hadoop-env.sh"], 
                    File["/etc/hadoop/conf/yarn-site.xml"], File["/etc/hadoop/conf/core-site.xml"]],
-      require => [ Package["hadoop-yarn-nodemanager"], File[$dirs] ],
+      require => [ Package["hadoop-yarn-nodemanager"], File[$hadoop::common_yarn::yarn_data_dirs] ],
     }
    Kerberos::Host_keytab <| tag == "mapreduce" |> -> Service["hadoop-yarn-nodemanager"]
 
-    file { $dirs:
+    file { $hadoop::common_yarn::yarn_data_dirs:
       ensure => directory,
       owner => yarn,
       group => yarn,
@@ -573,21 +651,10 @@ class hadoop {
     }
   }
 
-  define mapred-app ($namenode_host, $namenode_port, $jobtracker_host, $jobtracker_port, $auth = "simple", $jobhistory_host = "", $jobhistory_port="10020", $dirs = ["/tmp/mr"]){
-    $hadoop_namenode_host = $namenode_host
-    $hadoop_namenode_port = $namenode_port
-    $hadoop_jobtracker_host = $jobtracker_host
-    $hadoop_jobtracker_port = $jobtracker_port
-    $hadoop_security_authentication = $auth
+  class mapred-app {
+    include common_mapred_app
 
-    include common-mapred-app
-
-    if ($jobhistory_host != "") {
-      $hadoop_hs_host = $jobhistory_host
-      $hadoop_hs_port = $jobhistory_port
-    }
-
-    file { $dirs:
+    file { $hadoop::common_mapred_app::mapred_data_dirs:
       ensure => directory,
       owner => yarn,
       group => yarn,
@@ -596,12 +663,9 @@ class hadoop {
     }
   }
 
-  define client ($namenode_host, $namenode_port, $jobtracker_host, $jobtracker_port, $auth = "simple") {
-      $hadoop_namenode_host = $namenode_host
-      $hadoop_namenode_port = $namenode_port
-      $hadoop_jobtracker_host = $jobtracker_host
-      $hadoop_jobtracker_port = $jobtracker_port
-      $hadoop_security_authentication = $auth
+  class client {
+      include common_mapred_app
+
       $hadoop_client_packages = $operatingsystem ? {
        /(OracleLinux|CentOS|RedHat|Fedora)/  => [ "hadoop-doc", "hadoop-hdfs-fuse", "hadoop-client", "hadoop-libhdfs", "hadoop-debuginfo" ],
        /(SLES|OpenSuSE)/                     => [ "hadoop-doc", "hadoop-hdfs-fuse", "hadoop-client", "hadoop-libhdfs" ],
@@ -609,8 +673,6 @@ class hadoop {
        default                               => [ "hadoop-doc", "hadoop-hdfs-fuse", "hadoop-client" ],
       }
 
-      include common-mapred-app
-  
       package { $hadoop_client_packages:
         ensure => latest,
        require => [Package["jdk"], Package["hadoop"], Package["hadoop-hdfs"], Package["hadoop-mapreduce"]],  
diff --git a/bigtop-deploy/puppet/modules/hadoop/templates/hadoop-env.sh b/bigtop-deploy/puppet/modules/hadoop/templates/hadoop-env.sh
index 6b28bdd..f2e355b 100644
--- a/bigtop-deploy/puppet/modules/hadoop/templates/hadoop-env.sh
+++ b/bigtop-deploy/puppet/modules/hadoop/templates/hadoop-env.sh
@@ -15,7 +15,7 @@
 
 <% def shell_config(shell_var, *puppet_var)
      puppet_var = puppet_var[0] || shell_var.downcase
-     if has_variable? puppet_var
+     if scope.lookupvar(puppet_var)
         return "export #{shell_var}=#{scope.lookupvar(puppet_var)}"
      else
         return "#export #{shell_var}="
diff --git a/bigtop-deploy/puppet/modules/hadoop/templates/hdfs-site.xml b/bigtop-deploy/puppet/modules/hadoop/templates/hdfs-site.xml
index 351508d..11c59be 100644
--- a/bigtop-deploy/puppet/modules/hadoop/templates/hdfs-site.xml
+++ b/bigtop-deploy/puppet/modules/hadoop/templates/hdfs-site.xml
@@ -30,7 +30,7 @@
 <%   end -%>
 
   <property> 
-    <name>dfs.federation.nameservices</name>
+    <name>dfs.nameservices</name>
     <value><%= @nameservice_id %></value>
   </property>
 
@@ -47,7 +47,12 @@
  
   <property>
    <name>dfs.namenode.http-address.<%= @nameservice_id %>.nn<%= idx+1 %></name>
-    <value><%= host %>:50070</value>
+    <value><%= host %>:<%= @hadoop_namenode_http_port %></value>
+  </property>
+
+  <property>
+    <name>dfs.namenode.https-address.<%= @nameservice_id %>.nn<%= idx+1 %></name>
+    <value><%= host %>:<%= @hadoop_namenode_https_port %></value>
   </property>
 
 <%   end -%>
@@ -249,7 +254,28 @@
 <% end -%>
   <property>
     <name>dfs.webhdfs.enabled</name>
-    <value>true</value>
+    <value><%= @hdfs_webhdfs_enabled %></value>
+  </property>
+
+<% if @hdfs_datanode_fsdataset_volume_choosing_policy -%>
+  <property>
+    <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
+    <value><%= @hdfs_datanode_fsdataset_volume_choosing_policy %></value>
+  </property>
+
+<% end -%>
+<% if @hdfs_replication -%>
+  <property>
+    <name>dfs.replication</name>
+    <value><%= @hdfs_replication %></value>
+  </property>
+
+<% end -%>
+<% if @journalnode_edits_dir -%>
+  <property>
+    <name>dfs.journalnode.edits.dir</name>
+    <value><%= @journalnode_edits_dir %></value>
   </property>
 
+<% end -%>
 </configuration>
diff --git a/bigtop-deploy/puppet/modules/hadoop/templates/yarn-site.xml b/bigtop-deploy/puppet/modules/hadoop/templates/yarn-site.xml
index 0713d97..57ce85a 100644
--- a/bigtop-deploy/puppet/modules/hadoop/templates/yarn-site.xml
+++ b/bigtop-deploy/puppet/modules/hadoop/templates/yarn-site.xml
@@ -17,6 +17,7 @@
 -->
 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
 
+<% resourcemanager_hosts = Array(@hadoop_rm_host) -%>
 <configuration>
 <% if @hadoop_security_authentication == "kerberos" %>
   <!-- JobTracker security configs -->
@@ -61,6 +62,47 @@
     <value><%= @hadoop_ps_host %>:<%= @hadoop_ps_port %></value>
   </property> 
 
+<% if @yarn_resourcemanager_ha_enabled -%>
+
+  <property>
+    <name>yarn.resourcemanager.ha.enabled</name>
+    <value><%= @yarn_resourcemanager_ha_enabled %></value>
+  </property>
+
+  <property>
+    <name>yarn.resourcemanager.cluster-id</name>
+    <value><%= @yarn_resourcemanager_cluster_id %></value>
+  </property>
+
+  <property>
+    <name>yarn.resourcemanager.ha.rm-ids</name>
+    <value><%= (1..resourcemanager_hosts.length).map { |n| "rm#{n}" }.join(",") %></value>
+  </property>
+
+<%   resourcemanager_hosts.each_with_index do |host,idx| -%>
+  <property>
+    <name>yarn.resourcemanager.resource-tracker.address.rm<%= idx+1 %></name>
+    <value><%= host %>:<%= @hadoop_rt_port %></value>
+  </property>
+
+  <property>
+    <name>yarn.resourcemanager.address.rm<%= idx+1 %></name>
+    <value><%= host %>:<%= @hadoop_rm_port %></value>
+  </property>
+
+  <property>
+    <name>yarn.resourcemanager.scheduler.address.rm<%= idx+1 %></name>
+    <value><%= host %>:<%= @hadoop_sc_port %></value>
+  </property>
+<%   end -%>
+<% if @yarn_resourcemanager_zk_address -%>
+
+  <property>
+    <name>yarn.resourcemanager.zk-address</name>
+    <value><%= @yarn_resourcemanager_zk_address %></value>
+  </property>
+<% end -%>
+<% else -%>
   <property>
     <name>yarn.resourcemanager.resource-tracker.address</name>
     <value><%= @hadoop_rm_host %>:<%= @hadoop_rt_port %></value>
@@ -75,6 +117,7 @@
     <name>yarn.resourcemanager.scheduler.address</name>
     <value><%= @hadoop_rm_host %>:<%= @hadoop_sc_port %></value>
   </property>
+<% end -%>
 
   <property>
     <name>yarn.nodemanager.aux-services</name>
@@ -125,4 +168,32 @@
         $HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*
      </value>
   </property>
+<% if @yarn_scheduler_minimum_allocation_mb -%>
+
+  <property>
+    <name>yarn.scheduler.minimum-allocation-mb</name>
+    <value><%= @yarn_scheduler_minimum_allocation_mb %></value>
+  </property>
+<% end -%>
+<% if @yarn_scheduler_maximum_allocation_mb -%>
+
+  <property>
+    <name>yarn.scheduler.maximum-allocation-mb</name>
+    <value><%= @yarn_scheduler_maximum_allocation_mb %></value>
+  </property>
+<% end -%>
+<% if @yarn_nodemanager_resource_memory_mb -%>
+
+  <property>
+    <name>yarn.nodemanager.resource.memory-mb</name>
+    <value><%= @yarn_nodemanager_resource_memory_mb %></value>
+  </property>
+<% end -%>
+<% if @yarn_resourcemanager_scheduler_class -%>
+
+  <property>
+    <name>yarn.resourcemanager.scheduler.class</name>
+    <value><%= @yarn_resourcemanager_scheduler_class %></value>
+  </property>
+<% end -%>
 </configuration>
diff --git a/bigtop-deploy/puppet/modules/hcatalog/manifests/init.pp b/bigtop-deploy/puppet/modules/hcatalog/manifests/init.pp
index f9c07aa..6585dd3 100644
--- a/bigtop-deploy/puppet/modules/hcatalog/manifests/init.pp
+++ b/bigtop-deploy/puppet/modules/hcatalog/manifests/init.pp
@@ -14,7 +14,7 @@
 # limitations under the License.
 
 class hcatalog {
-  define server($port = "9083", $kerberos_realm = "") {
+  class server($port = "9083", $kerberos_realm = "") {
     package { "hcatalog-server":
       ensure => latest,
     }
@@ -33,7 +33,7 @@ class hcatalog {
   }
 
   class webhcat {
-    define server($port = "50111", $kerberos_realm = "") {
+    class server($port = "50111", $kerberos_realm = "") {
       package { "webhcat-server":
         ensure => latest,
       }
diff --git a/bigtop-deploy/puppet/modules/hue/manifests/init.pp b/bigtop-deploy/puppet/modules/hue/manifests/init.pp
index f4c6f95..e5c7762 100644
--- a/bigtop-deploy/puppet/modules/hue/manifests/init.pp
+++ b/bigtop-deploy/puppet/modules/hue/manifests/init.pp
@@ -14,7 +14,7 @@
 # limitations under the License.
 
 class hue {
-  define server($sqoop_url, $solr_url, $hbase_thrift_url,
+  class server($sqoop_url, $solr_url, $hbase_thrift_url,
                $webhdfs_url, $rm_host, $rm_port, $oozie_url, $rm_url, $rm_proxy_url, $history_server_url,
                $hue_host = "0.0.0.0", $hue_port = "8888", $default_fs = "hdfs://localhost:8020",
                 $kerberos_realm = "") {
diff --git a/bigtop-deploy/puppet/modules/kerberos/manifests/init.pp b/bigtop-deploy/puppet/modules/kerberos/manifests/init.pp
index 5476235..dd83500 100644
--- a/bigtop-deploy/puppet/modules/kerberos/manifests/init.pp
+++ b/bigtop-deploy/puppet/modules/kerberos/manifests/init.pp
@@ -14,23 +14,12 @@
 # limitations under the License.
 
 class kerberos {
-  class site {
-    # The following is our interface to the world. This is what we allow
-    # users to tweak from the outside (see tests/init.pp for a complete
-    # example) before instantiating target classes.
-    # Once we migrate to Puppet 2.6 we can potentially start using 
-    # parametrized classes instead.
-    $domain     = $kerberos_domain     ? { '' => inline_template('<%= domain %>'),
-                                           default => $kerberos_domain }
-    $realm      = $kerberos_realm      ? { '' => inline_template('<%= domain.upcase %>'),
-                                           default => $kerberos_realm } 
-    $kdc_server = $kerberos_kdc_server ? { '' => 'localhost',
-                                           default => $kerberos_kdc_server }
-    $kdc_port   = $kerberos_kdc_port   ? { '' => '88', 
-                                           default => $kerberos_kdc_port } 
-    $admin_port = 749 /* BUG: linux daemon packaging doesn't let us tweak this */
-
-    $keytab_export_dir = "/var/lib/bigtop_keytabs"
+  class site ($domain = inline_template('<%= domain %>'),
+      $realm = inline_template('<%= domain.upcase %>'),
+      $kdc_server = 'localhost',
+      $kdc_port = '88',
+      $admin_port = 749,
+      $keytab_export_dir = "/var/lib/bigtop_keytabs") {
 
     case $operatingsystem {
         'ubuntu': {
diff --git a/bigtop-deploy/puppet/modules/mahout/manifests/init.pp b/bigtop-deploy/puppet/modules/mahout/manifests/init.pp
index 9f10b17..0d9bd8c 100644
--- a/bigtop-deploy/puppet/modules/mahout/manifests/init.pp
+++ b/bigtop-deploy/puppet/modules/mahout/manifests/init.pp
@@ -14,7 +14,7 @@
 # limitations under the License.
 
 class mahout {
-  define client {
+  class client {
     package { "mahout":
       ensure => latest,
       require => Package["hadoop"],
diff --git a/bigtop-deploy/puppet/modules/solr/manifests/init.pp b/bigtop-deploy/puppet/modules/solr/manifests/init.pp
index 22c4d9e..119fbd1 100644
--- a/bigtop-deploy/puppet/modules/solr/manifests/init.pp
+++ b/bigtop-deploy/puppet/modules/solr/manifests/init.pp
@@ -14,7 +14,7 @@
 # limitations under the License.
 
 class solr {
-  define server($port = "1978", $port_admin = "1979", $zk = "localhost:2181", $root_url = "hdfs://localhost:8020/solr", $kerberos_realm = "") {
+  class server($port = "1978", $port_admin = "1979", $zk = "localhost:2181", $root_url = "hdfs://localhost:8020/solr", $kerberos_realm = "") {
     package { "solr-server":
       ensure => latest,
     }
diff --git a/bigtop-deploy/puppet/modules/spark/manifests/init.pp b/bigtop-deploy/puppet/modules/spark/manifests/init.pp
index 1281ff4..d7a9360 100644
--- a/bigtop-deploy/puppet/modules/spark/manifests/init.pp
+++ b/bigtop-deploy/puppet/modules/spark/manifests/init.pp
@@ -14,7 +14,7 @@
 # limitations under the License.
 
 class spark {
-  class common {
+  class common ($master_host = $fqdn, $master_port = "7077", $master_ui_port = "18080") {
     package { "spark-core":
       ensure => latest,
     }
@@ -25,7 +25,7 @@ class spark {
     }
   }
 
-  define master($master_host = $fqdn, $master_port = "7077", $master_ui_port = "18080") {
+  class master {
     include common   
 
     package { "spark-master":
@@ -43,7 +43,7 @@ class spark {
     }
   }
 
-  define worker($master_host = $fqdn, $master_port = "7077", $master_ui_port = "18080") {
+  class worker {
     include common
 
     package { "spark-worker":
diff --git a/bigtop-deploy/puppet/modules/tachyon/manifests/init.pp b/bigtop-deploy/puppet/modules/tachyon/manifests/init.pp
index 55fb34a..fe1a7b6 100644
--- a/bigtop-deploy/puppet/modules/tachyon/manifests/init.pp
+++ b/bigtop-deploy/puppet/modules/tachyon/manifests/init.pp
@@ -10,7 +10,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 class tachyon {
-  class common {
+  class common ($master_host){
     package { "tachyon":
       ensure => latest,
     }
@@ -29,7 +29,7 @@ class tachyon {
     }
   }
 
-  define master($master_host) {
+  class master {
     include common
 
    exec {
@@ -49,10 +49,10 @@ class tachyon {
 
   }
 
-  define worker($master_host) {
+  class worker {
     include common
 
-   if ( $fqdn == $master_host ) {
+   if ( $fqdn == $tachyon::common::master_host ) {
       notice("tachyon ---> master host")
       # We want master to run first in all cases
       Service["tachyon-master"] ~> Service["tachyon-worker"]
-- 
2.1.4
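
PS: To illustrate how the parametrised classes are then fed from hiera: with
Puppet 3's automatic parameter lookup, hiera keys named after the
fully-qualified class parameters supply the values directly, so no glue code
in cluster.pp is needed. A minimal sketch, assuming the variables referenced
above (namenode_data_dirs, journalnode_edits_dir, yarn_data_dirs) are class
parameters; the paths and realm below are invented example values, not
defaults from the patch:

```yaml
# hieradata sketch -- example values only; each key maps onto a class
# parameter via Puppet 3 automatic parameter lookup
hadoop::common_hdfs::namenode_data_dirs:
  - /data/1/hdfs/namenode
hadoop::common_hdfs::journalnode_edits_dir: /data/1/hdfs/journalnode
hadoop::common_yarn::yarn_data_dirs:
  - /data/1/yarn/local
kerberos::site::realm: EXAMPLE.COM
```

A node manifest then only needs e.g. "include hadoop::journalnode" for its
assigned roles and picks everything else up from hiera.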
