I took a glance at the patch and saw two hugely attractive features you introduced:
* fine-grained cluster topology layout
* journal-based HA

It would be a huge improvement for our puppet recipes to step ahead and get
closer to real-world cluster deployment. As Cos said, it would be great to
create a JIRA and state the features; that way we can help to review, test
and get that awesome stuff in :)

2015-01-28 12:18 GMT+08:00 Konstantin Boudnik <[email protected]>:

> Thanks for picking this up Michael! I think having a way to describe the
> deployment topology would be very beneficial for all! We had a quick
> discussion about this during today's meetup and I think Rich and Nate have
> some pretty cool ideas, so I'd rather let the experts discuss the matter.
>
> But from a pure user's perspective - I would _love_ to have something like
> this.
>
> Do you think you can open a JIRA and put the patch there? It would also
> help to keep the further discussion in a more coherent way...
>
> Thanks
> Cos
>
> On Tue, Jan 27, 2015 at 02:35PM, Michael Weiser wrote:
> > Hello Nate,
> > Hi all,
> >
> > On Tue, Jan 20, 2015 at 08:34:34PM +0000, Nate DAmico (JIRA) wrote:
> >
> > > Nate DAmico commented on BIGTOP-1122:
> > > We are putting some work in on puppet modules, starting with the
> > > latest 3.x branch. Hiera work wasn't first on our list, but we can
> > > look to work with you on it during the same time. We are mostly
> > > looking at some parameter refactoring and minimal changes to support
> > > a wider range of topologies and setups.
> >
> > > As Cos suggested we can take the conversation to the dev list, and we
> > > were going to set up a wiki page with more of the param refactoring
> > > ideas if need be.
> >
> > I am facing the same problem: we want to employ a different topology
> > and HA setup than what's currently hardwired into cluster.pp. Also, we
> > already have a puppet 3 infrastructure with heavy use of hiera, e.g.
> > for machine role assignment, in place.
> >
> > When first looking at the current puppet code I was completely lost as
> > to how to adapt cluster.pp for our scenario. It was really hard to get
> > an idea of what is used where and why, because everything is set up in
> > cluster.pp and handed to the modules either via resource parameters or
> > just the global scope, and from there via different variable names on
> > to included common classes.
> >
> > To get a handle on things I started an ad-hoc conversion of the modules
> > and cluster/init.pp to parametrised classes instead of defined
> > resources. This automagically cleaned up namespacing and scoping and
> > revealed what's actually needed where. It also got rid of 90% of the
> > extlookups; the remaining 10% I replaced with explicit hiera lookups.
> > It also revealed some things that were plain broken before.
> >
> > The attached patch is against the latest git master HEAD and on my test
> > box produces exactly the same results as the stock puppet code. (There
> > is some additional HA setup code in there that's needed for our
> > environment but doesn't do anything unless activated. I would
> > eventually split that out into separate additional changes.)
> >
> > All the parameter defaulting logic is moved from cluster.pp into
> > hiera's bigtop/cluster.yaml, in concert with a variable lookup
> > hierarchy for non-HA and HA setups orchestrated in site.pp. cluster.pp
> > now mostly contains node role assignments and
> > inter-module/class/resource dependencies - as it should, IMO.
> >
> > Site-specific adjustments are done in a site.yaml that sits "in front"
> > of cluster.yaml in the hiera search path (hierarchy). That way,
> > cluster.pp would never need changing by a site admin, but everything
> > still remains fully adjustable and overridable without any need to
> > touch puppet code. Also, if required, cluster/site.pp can be totally
> > canned and the modules used on their own.
> >
> > I wouldn't dare say the attached patch is clean in every way, but
> > rather that it shows where that road might lead. To keep the diff small
> > I didn't try hard to preserve the extlookup key names. Rather, I used
> > the internal class variable names, which get qualified in hiera using
> > the module name anyway. So the naming of keys is a bit chaotic now
> > (mostly hadoop_ vs. no prefix).
> >
> > But it opens up a load of further opportunities:
> > - node role assignment could be moved to an ENC or into hiera using
> >   hiera_include(), rendering cluster/init.pp virtually redundant
> > - cluster.yaml can be seriously cleaned up and streamlined
> > - variable and lookup key names can easily be reviewed and changed to a
> >   common naming scheme
> > - because all external variable lookups are now qualified with a scope,
> >   it's clearly visible what's used where, allowing for easy review and
> >   redesign
> >
> > The price to pay is that the patch currently is puppet 3 and hiera
> > only. As far as I understand it could easily be made puppet 2/3
> > compatible. I have no clear idea, however, how to continue to support
> > non-hiera users. That's because in module hadoop the common logic is
> > pulled in via "include common_xyz", which must be used in order to
> > allow multiple such includes from namenode/datanode/... but also makes
> > it impossible to pass parameters other than via hiera (or the global
> > scope).
> >
> > But since the README states that the code is puppet 3 only, and puppet
> > 3 by default supports hiera as a drop-in add-on, I wonder if that's
> > really much of an issue?
> >
> > Thanks,
> > --
> > Michael Weiser                science + computing ag
> > Senior Systems Engineer       Geschaeftsstelle Duesseldorf
> >                               Faehrstrasse 1
> > phone: +49 211 302 708 32     D-40221 Duesseldorf
> > fax: +49 211 302 708 50       www.science-computing.de
> > --
> > Vorstandsvorsitzender/Chairman of the board of management:
> > Gerd-Lothar Leonhart
> > Vorstand/Board of Management:
> > Dr. Bernd Finkbeiner, Michael Heinrichs, Dr. Arno Steitz
> > Vorsitzender des Aufsichtsrats/
> > Chairman of the Supervisory Board: Philippe Miltin
> > Sitz/Registered Office: Tuebingen
> > Registergericht/Registration Court: Stuttgart
> > Registernummer/Commercial Register No.: HRB 382196
> >
> > From cdca9ff5988e31ab1d39f97b1b6d059f4a62b450 Mon Sep 17 00:00:00 2001
> > From: Michael Weiser <[email protected]>
> > Date: Tue, 27 Jan 2015 14:25:21 +0100
> > Subject: [PATCH] [BIGTOP-????] puppet: Replace extlookup by hiera, use
> >  parametrised classes
> >
> > Update the puppet code to use self-contained, parametrised classes and
> > proper scoping. Replace all extlookup calls by either explicit or
> > automatic hiera parameter lookups. Implement the HA/non-HA alternative
> > via the hiera lookup hierarchy. Replace append_each from bigtop_util by
> > suffix from stdlib.
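
For readers skimming the patch below: the heart of the conversion - extlookup
calls plus the append_each() helper giving way to parametrised classes with
automatic hiera lookups and stdlib's suffix() - condenses to roughly the
following sketch. The names follow the patch, but the snippet is condensed and
illustrative rather than a verbatim excerpt.

    # Before: values pulled out of the global scope via extlookup
    class hadoop_cluster_node {
      $roots          = extlookup("hadoop_storage_dirs", split($hadoop_storage_dirs, ";"))
      $hdfs_data_dirs = extlookup("hadoop_hdfs_data_dirs", append_each("/hdfs", $roots))
    }

    # After: parametrised classes. Puppet 3 fills the parameters
    # automatically from hiera keys named after their fully qualified
    # scope (hadoop::hadoop_storage_dirs,
    # hadoop::common_hdfs::hdfs_data_dirs), and suffix() from
    # puppetlabs-stdlib replaces the bigtop_util append_each() function.
    class hadoop (
      $hadoop_storage_dirs = split($::hadoop_storage_dirs, ";")
    ) {
      include stdlib
    }

    class hadoop::common_hdfs (
      $hdfs_data_dirs = suffix($hadoop::hadoop_storage_dirs, "/hdfs")
    ) inherits hadoop {
    }

A site admin can then override hadoop::common_hdfs::hdfs_data_dirs in
site.yaml instead of editing cluster.pp.
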
> > --- > > bigtop-deploy/puppet/config/site.csv.example | 28 -- > > bigtop-deploy/puppet/hiera.yaml | 7 + > > bigtop-deploy/puppet/hieradata/bigtop/cluster.yaml | 123 +++++++ > > bigtop-deploy/puppet/hieradata/bigtop/ha.yaml | 7 + > > bigtop-deploy/puppet/hieradata/bigtop/noha.yaml | 2 + > > bigtop-deploy/puppet/hieradata/site.yaml | 32 ++ > > bigtop-deploy/puppet/manifests/cluster.pp | 352 > ++++----------------- > > bigtop-deploy/puppet/manifests/site.pp | 24 +- > > .../lib/puppet/parser/functions/append_each.rb | 22 -- > > .../puppet/modules/crunch/manifests/init.pp | 2 +- > > .../puppet/modules/giraph/manifests/init.pp | 2 +- > > .../puppet/modules/hadoop-flume/manifests/init.pp | 2 +- > > .../puppet/modules/hadoop-hbase/manifests/init.pp | 14 +- > > .../puppet/modules/hadoop-hive/manifests/init.pp | 2 +- > > .../puppet/modules/hadoop-oozie/manifests/init.pp | 4 +- > > .../puppet/modules/hadoop-pig/manifests/init.pp | 2 +- > > .../puppet/modules/hadoop-sqoop/manifests/init.pp | 4 +- > > .../modules/hadoop-zookeeper/manifests/init.pp | 4 +- > > .../puppet/modules/hadoop/manifests/init.pp | 340 > ++++++++++++-------- > > .../puppet/modules/hadoop/templates/hadoop-env.sh | 2 +- > > .../puppet/modules/hadoop/templates/hdfs-site.xml | 32 +- > > .../puppet/modules/hadoop/templates/yarn-site.xml | 71 +++++ > > .../puppet/modules/hcatalog/manifests/init.pp | 4 +- > > bigtop-deploy/puppet/modules/hue/manifests/init.pp | 2 +- > > .../puppet/modules/kerberos/manifests/init.pp | 23 +- > > .../puppet/modules/mahout/manifests/init.pp | 2 +- > > .../puppet/modules/solr/manifests/init.pp | 2 +- > > .../puppet/modules/spark/manifests/init.pp | 6 +- > > .../puppet/modules/tachyon/manifests/init.pp | 8 +- > > 29 files changed, 586 insertions(+), 539 deletions(-) > > delete mode 100644 bigtop-deploy/puppet/config/site.csv.example > > create mode 100644 bigtop-deploy/puppet/hiera.yaml > > create mode 100644 bigtop-deploy/puppet/hieradata/bigtop/cluster.yaml > > create mode 100644 bigtop-deploy/puppet/hieradata/bigtop/ha.yaml > > create mode 100644 bigtop-deploy/puppet/hieradata/bigtop/noha.yaml > > create mode 100644 bigtop-deploy/puppet/hieradata/site.yaml > > delete mode 100644 > bigtop-deploy/puppet/modules/bigtop_util/lib/puppet/parser/functions/append_each.rb > > > > diff --git a/bigtop-deploy/puppet/config/site.csv.example > b/bigtop-deploy/puppet/config/site.csv.example > > deleted file mode 100644 > > index 60c88eb..0000000 > > --- a/bigtop-deploy/puppet/config/site.csv.example > > +++ /dev/null > > @@ -1,28 +0,0 @@ > > -### WARNING: > > -### actual site.csv file shouldn't contain lines starting with '#' > > -### It will cause the parse to choke. > > -### End of WARNING > > -### This file needs to be customized to reflect the configuration of > your cluster > > -### Store it as $BIGTOP_DEPLOY_PATH/config/site.csv > > -### use --confdir=$BIGTOP_DEPLOY_PATH (see README for more info) > > -# FQDN of Namenode > > -hadoop_head_node,hadoopmaster.example.com > > -# FQDN of standby node (for HA) > > -#standby_head_node,standbyNN.example.com > > -# FQDN of gateway node (if separate from NN) > > -#standby_head_node,gateway.example.com > > -# Storage directories (will be created if doesn't exist) > > -hadoop_storage_dirs,/data/1,/data/2,/data/3,/data/4 > > -bigtop_yumrepo_uri,http://mirror.example.com/path/to/mirror/ > > -# A list of stack' components to be deployed can be specified via > special > > -# "$components" list. 
If $components isn't set then everything in the > stack will > > -# be installed as usual. Otherwise only a specified list will be set > > -# Possible elements: > > -# > hadoop,yarn,hbase,tachyon,flume,solrcloud,spark,oozie,hcat,sqoop,httpfs, > > -# hue,mahout,giraph,crunch,pig,hive,zookeeper > > -# Example (to deploy only HDFS and YARN server and gateway parts) > > -#components,hadoop,yarn > > -# Test-only variable controls if user hdfs' sshkeys should be installed > to allow > > -# for passwordless login across the cluster. Required by some > integration tests > > -#testonly_hdfs_sshkeys=no > > - > > diff --git a/bigtop-deploy/puppet/hiera.yaml > b/bigtop-deploy/puppet/hiera.yaml > > new file mode 100644 > > index 0000000..b276006 > > --- /dev/null > > +++ b/bigtop-deploy/puppet/hiera.yaml > > @@ -0,0 +1,7 @@ > > +--- > > +:yaml: > > + :datadir: /etc/puppet/hieradata > > +:hierarchy: > > + - site > > + - "bigtop/%{hadoop_hiera_ha_path}" > > + - bigtop/cluster > > diff --git a/bigtop-deploy/puppet/hieradata/bigtop/cluster.yaml > b/bigtop-deploy/puppet/hieradata/bigtop/cluster.yaml > > new file mode 100644 > > index 0000000..41c8e31 > > --- /dev/null > > +++ b/bigtop-deploy/puppet/hieradata/bigtop/cluster.yaml > > @@ -0,0 +1,123 @@ > > +--- > > +### This file implements defaults and some dependant parameter > defaulting logic. > > +### Every parameter can be overridden using the hiera lookup hierarchy. > The enclosd > > +### hiera.yaml provides for this by adding a site.yaml to the lookup > where > > +### site-specific overrides can be placed. Therefore this file should > never need > > +### changing by site admins. > > + > > +# FQDN of Namenode > > +#bigtop::hadoop_head_node: "hadoopmaster.example.com" > > +# FQDN of standby node (enables HA if set) > > +#bigtop::hadoop_standby_head_node: "standbyNN.example.com" > > +# FQDN of gateway node (if separate from NN) > > +#bigtop::hadoop_gateway_node: "gateway.example.com" > > + > > +# A list of stack' components to be deployed can be specified via > special > > +# "$components" list. If $components isn't set then everything in the > stack will > > +# be installed as usual. Otherwise only a specified list will be set > > +# Possible elements: > > +# > hadoop,yarn,hbase,tachyon,flume,solrcloud,spark,oozie,hcat,sqoop,httpfs, > > +# hue,mahout,giraph,crunch,pig,hive,zookeeper > > +# Example (to deploy only HDFS and YARN server and gateway parts) > > +# This can be a comma-separated list or an array. > > +#hadoop_cluster_node::cluster_components: > > +# - hadoop > > +# - yarn > > + > > +# Storage directories (will be created if doesn't exist) > > +#hadoop::hadoop_storage_dirs: > > +# - /data/1 > > +# - /data/2 > > +# - /data/3 > > +# - /data/4 > > + > > +#bigtop::bigtop_yumrepo_uri: "http://mirror.example.com/path/to/mirror/ > " > > + > > +# Test-only variable controls if user hdfs' sshkeys should be installed > to allow > > +# for passwordless login across the cluster. 
Required by some > integration tests > > +#hadoop::common_hdfs::testonly_hdfs_sshkeys: "no" > > + > > +# Default > > +#hadoop::common_hdfs::ha: "disabled" > > + > > +# Kerberos > > +#hadoop::hadoop_security_authentication: "kerberos" > > +#kerberos::site::domain: "do.main" > > +#kerberos::site::realm: "DO.MAIN" > > +#kerberos::site::kdc_server: "localhost" > > +#kerberos::site::kdc_port: "88" > > +#kerberos::site::admin_port: "749" > > +#kerberos::site::keytab_export_dir: "/var/lib/bigtop_keytabs" > > + > > +hadoop::common_hdfs::hadoop_namenode_host: > "%{hiera('bigtop::hadoop_head_node')}" > > +# actually default but needed for hadoop_namenode_uri here > > +hadoop::common_hdfs::hadoop_namenode_port: "8020" > > + > > +hadoop::common_yarn::hadoop_ps_host: > "%{hiera('bigtop::hadoop_head_node')}" > > +hadoop::common_yarn::hadoop_rm_host: > "%{hiera('bigtop::hadoop_head_node')}" > > +# actually default but needed for hue::server::rm_port here > > +hadoop::common_yarn::hadoop_rm_port: "8032" > > +hadoop::common_yarn::kerberos_realm: "%{hiera('kerberos::site::realm')}" > > + > > +hadoop::common_mapred_app::hadoop_hs_host: > "%{hiera('bigtop::hadoop_head_node')}" > > +hadoop::common_mapred_app::hadoop_jobtracker_host: > "%{hiera('bigtop::hadoop_head_node')}" > > + > > +# actually default but needed for hue::server::webhdfs_url here > > +hadoop::httpfs::hadoop_httpfs_port: "14000" > > + > > +bigtop::hadoop_zookeeper_port: "2181" > > +hadoop::zk: > "%{hiera('bigtop::hadoop_head_node')}:%{hiera('bigtop::hadoop_zookeeper_port')}" > > + > > +bigtop::hadoop_namenode_uri: > "hdfs://%{hiera('hadoop::common_hdfs::hadoop_namenode_host')}:%{hiera('hadoop::common_hdfs::hadoop_namenode_port')}" > > +hadoop-hbase::base_relative_rootdir: "/hbase" > > +hadoop-hbase::common_config::rootdir: > "%{hiera('bigtop::hadoop_namenode_uri')}%{hiera('hadoop-hbase::base_relative_rootdir')}" > > +hadoop-hbase::common_config::zookeeper_quorum: > "%{hiera('bigtop::hadoop_head_node')}" > > +hadoop-hbase::common_config::kerberos_realm: > "%{hiera('kerberos::site::realm')}" > > +hadoop-hbase::client::thrift: true > > + > > +solr::server::root_url: "%{hiera('bigtop::hadoop_namenode_uri')}" > > +solr::server::zk: "%{hiera('hadoop::zk')}" > > +solr::server::kerberos_realm: "%{hiera('kerberos::site::realm')}" > > +# Default but needed here to make sure, hue uses the same port > > +solr::server::port: "1978" > > + > > +hadoop-oozie::server::kerberos_realm: > "%{hiera('kerberos::site::realm')}" > > + > > +hcatalog::server::kerberos_realm: "%{hiera('kerberos::site::realm')}" > > +hcatalog::webhcat::server::kerberos_realm: > "%{hiera('kerberos::site::realm')}" > > + > > +spark::common::spark_master_host: "%{hiera('bigtop::hadoop_head_node')}" > > + > > +tachyon::common::master_host: "%{hiera('bigtop::hadoop_head_node')}" > > + > > +hadoop-zookeeper::server::myid: "0" > > +hadoop-zookeeper::server::ensemble: > > + - ["%{hiera('bigtop::hadoop_head_node')}:2888:3888"] > > +hadoop-zookeeper::server::kerberos_realm: > "%{hiera('kerberos::site::realm')}" > > + > > +# those are only here because they were present as extlookup keys > previously > > +bigtop::hadoop_rm_http_port: "8088" > > +bigtop::hadoop_rm_proxy_port: "8088" > > +bigtop::hadoop_history_server_port: "19888" > > +bigtop::sqoop_server_port: "<never defined correctly>" > > +bigtop::hbase_thrift_port: "9090" > > +bigtop::hadoop_oozie_port: "11000" > > + > > +hue::server::rm_host: "%{hiera('hadoop::common_yarn::hadoop_rm_host')}" > > +hue::server::rm_port: 
"%{hiera('hadoop::common_yarn::hadoop_rm_port')}" > > +hue::server::rm_url: "http:// > %{hiera('bigtop::hadoop_head_node')}:%{hiera('bigtop::hadoop_rm_http_port')}" > > +hue::server::rm_proxy_url: "http:// > %{hiera('bigtop::hadoop_head_node')}:%{hiera('bigtop::hadoop_rm_proxy_port')}" > > +hue::server::history_server_url: "http:// > %{hiera('bigtop::hadoop_head_node')}:%{hiera('bigtop::hadoop_history_server_port')}" > > +# those use fqdn instead of hadoop_head_node because it's only ever > activated > > +# on the gatewaynode > > +hue::server::webhdfs_url: "http:// > %{fqdn}:%{hiera('hadoop::httpfs::hadoop_httpfs_port')}/webhdfs/v1" > > +hue::server::sqoop_url: "http:// > %{fqdn}:%{hiera('bigtop::sqoop_server_port')}/sqoop" > > +hue::server::solr_url: "http:// > %{fqdn}:%{hiera('solr::server::port')}/solr/" > > +hue::server::hbase_thrift_url: > "%{fqdn}:%{hiera('bigtop::hbase_thrift_port')}" > > +hue::server::oozie_url: "http:// > %{hiera('bigtop::hadoop_head_node')}:%{hiera('bigtop::hadoop_oozie_port')}/oozie" > > +hue::server::default_fs: "%{hiera('bigtop::hadoop_namenode_uri')}" > > +hue::server::kerberos_realm: "%{hiera('kerberos::site::realm')}" > > + > > +giraph::client::zookeeper_quorum: "%{hiera('bigtop::hadoop_head_node')}" > > + > > +hadoop-hive::client::hbase_zookeeper_quorum: > "%{hiera('hadoop-hbase::common_config::zookeeper_quorum')}" > > diff --git a/bigtop-deploy/puppet/hieradata/bigtop/ha.yaml > b/bigtop-deploy/puppet/hieradata/bigtop/ha.yaml > > new file mode 100644 > > index 0000000..3654987 > > --- /dev/null > > +++ b/bigtop-deploy/puppet/hieradata/bigtop/ha.yaml > > @@ -0,0 +1,7 @@ > > +--- > > +hadoop::common_hdfs::ha: "manual" > > +hadoop::common_hdfs::hadoop_namenode_host: > > + - "%{hiera('bigtop::hadoop_head_node')}" > > + - "%{hiera('bigtop::standby_head_node')}" > > +hadoop::common_hdfs::hadoop_ha_nameservice_id: "ha-nn-uri" > > +hadoop_cluster_node::hadoop_namenode_uri: > "hdfs://%{hiera('hadoop_ha_nameservice_id')}:8020" > > diff --git a/bigtop-deploy/puppet/hieradata/bigtop/noha.yaml > b/bigtop-deploy/puppet/hieradata/bigtop/noha.yaml > > new file mode 100644 > > index 0000000..ac81412 > > --- /dev/null > > +++ b/bigtop-deploy/puppet/hieradata/bigtop/noha.yaml > > @@ -0,0 +1,2 @@ > > +--- > > +# all done via defaults > > diff --git a/bigtop-deploy/puppet/hieradata/site.yaml > b/bigtop-deploy/puppet/hieradata/site.yaml > > new file mode 100644 > > index 0000000..339e2ab > > --- /dev/null > > +++ b/bigtop-deploy/puppet/hieradata/site.yaml > > @@ -0,0 +1,32 @@ > > +--- > > +bigtop::hadoop_head_node: "head.node.fqdn" > > +#bigtop::standby_head_node: "standby.head.node.fqdn" > > + > > +hadoop::hadoop_storage_dirs: > > + - /data/1 > > + - /data/2 > > + - /data/3 > > + - /data/4 > > + > > +#hadoop_cluster_node::cluster_components: > > +# - crunch > > +# - flume > > +# - giraph > > +# - hbase > > +# - hcat > > +# - hive > > +# - httpfs > > +# - hue > > +# - mahout > > +# - mapred-app > > +# - oozie > > +# - pig > > +# - solrcloud > > +# - spark > > +# - sqoop > > +# - tachyon > > +# - yarn > > +# - zookeeper > > + > > +# Debian: > > +#bigtop::jdk_package_name: "openjdk-7-jre-headless" > > diff --git a/bigtop-deploy/puppet/manifests/cluster.pp > b/bigtop-deploy/puppet/manifests/cluster.pp > > index 903f3e8..d4bae8a 100644 > > --- a/bigtop-deploy/puppet/manifests/cluster.pp > > +++ b/bigtop-deploy/puppet/manifests/cluster.pp > > @@ -13,131 +13,37 @@ > > # See the License for the specific language governing permissions and > > # limitations under the License. 
> > > > -class hadoop_cluster_node { > > - require bigtop_util > > - > > - $hadoop_head_node = extlookup("hadoop_head_node") > > - $standby_head_node = extlookup("standby_head_node", "") > > - $hadoop_gateway_node = extlookup("hadoop_gateway_node", > $hadoop_head_node) > > - > > - $hadoop_ha = $standby_head_node ? { > > - "" => disabled, > > - default => extlookup("hadoop_ha", "manual"), > > - } > > - > > - > > - $hadoop_namenode_host = $hadoop_ha ? { > > - "disabled" => $hadoop_head_node, > > - default => [ $hadoop_head_node, $standby_head_node ], > > - } > > - $hadoop_namenode_port = extlookup("hadoop_namenode_port", > "8020") > > - $hadoop_dfs_namenode_plugins = > extlookup("hadoop_dfs_namenode_plugins", "") > > - $hadoop_dfs_datanode_plugins = > extlookup("hadoop_dfs_datanode_plugins", "") > > - # > $hadoop_dfs_namenode_plugins="org.apache.hadoop.thriftfs.NamenodePlugin" > > - # > $hadoop_dfs_datanode_plugins="org.apache.hadoop.thriftfs.DatanodePlugin" > > - $hadoop_ha_nameservice_id = extlookup("hadoop_ha_nameservice_id", > "ha-nn-uri") > > - $hadoop_namenode_uri = $hadoop_ha ? { > > - "disabled" => "hdfs://$hadoop_namenode_host:$hadoop_namenode_port", > > - default => "hdfs://${hadoop_ha_nameservice_id}:8020", > > - } > > - > > - $hadoop_rm_host = $hadoop_head_node > > - $hadoop_rt_port = extlookup("hadoop_rt_port", "8025") > > - $hadoop_rm_port = extlookup("hadoop_rm_port", "8032") > > - $hadoop_sc_port = extlookup("hadoop_sc_port", "8030") > > - > > - $hadoop_hs_host = $hadoop_head_node > > - $hadoop_hs_port = extlookup("hadoop_hs_port", "10020") > > - $hadoop_hs_webapp_port = extlookup("hadoop_hs_webapp_port", "19888") > > - > > - $hadoop_ps_host = $hadoop_head_node > > - $hadoop_ps_port = extlookup("hadoop_ps_port", "20888") > > - > > - $hadoop_jobtracker_host = $hadoop_head_node > > - $hadoop_jobtracker_port = > extlookup("hadoop_jobtracker_port", "8021") > > - $hadoop_mapred_jobtracker_plugins = > extlookup("hadoop_mapred_jobtracker_plugins", "") > > - $hadoop_mapred_tasktracker_plugins = > extlookup("hadoop_mapred_tasktracker_plugins", "") > > - > > - $hadoop_zookeeper_port = > extlookup("hadoop_zookeeper_port", "2181") > > - $solrcloud_port = extlookup("solrcloud_port", > "1978") > > - $solrcloud_admin_port = > extlookup("solrcloud_admin_port", "1979") > > - $hadoop_oozie_port = extlookup("hadoop_oozie_port", > "11000") > > - $hadoop_httpfs_port = extlookup("hadoop_httpfs_port", > "14000") > > - $hadoop_rm_http_port = extlookup("hadoop_rm_http_port", > "8088") > > - $hadoop_rm_proxy_port = > extlookup("hadoop_rm_proxy_port", "8088") > > - $hadoop_history_server_port = > extlookup("hadoop_history_server_port", "19888") > > - $hbase_thrift_port = extlookup("hbase_thrift_port", > "9090") > > - $spark_master_port = extlookup("spark_master_port", > "7077") > > - $spark_master_ui_port = > extlookup("spark_master_ui_port", "18080") > > - > > - # Lookup comma separated components (i.e. hadoop,spark,hbase ). > > - $components_tmp = extlookup("components", > split($components, ",")) > > +class hadoop_cluster_node ( > > + $hadoop_security_authentication = > hiera("hadoop::hadoop_security_authentication", "simple"), > > + > > + # Lookup component array or comma separated components (i.e. > > + # hadoop,spark,hbase ) as a default via facter. > > + $cluster_components = "$::components" > > + ) { > > # Ensure (even if a single value) that the type is an array. 
> > - if is_array($components_tmp) { > > - $components = $components_tmp > > - } > > - else { > > - $components = any2array($components_tmp,",") > > + if is_array($cluster_components) { > > + $components = $cluster_components > > + } else { > > + $components = any2array($cluster_components, ",") > > } > > > > $all = ($components[0] == undef) > > > > - $hadoop_ha_zookeeper_quorum = > "${hadoop_head_node}:${hadoop_zookeeper_port}" > > - $solrcloud_zk = > "${hadoop_head_node}:${hadoop_zookeeper_port}" > > - $hbase_thrift_address = > "${hadoop_head_node}:${hbase_thrift_port}" > > - $hadoop_oozie_url = "http:// > ${hadoop_head_node}:${hadoop_oozie_port}/oozie" > > - $hadoop_httpfs_url = "http:// > ${hadoop_head_node}:${hadoop_httpfs_port}/webhdfs/v1" > > - $sqoop_server_url = "http:// > ${hadoop_head_node}:${sqoop_server_port}/sqoop" > > - $solrcloud_url = "http:// > ${hadoop_head_node}:${solrcloud_port}/solr/" > > - $hadoop_rm_url = "http:// > ${hadoop_head_node}:${hadoop_rm_http_port}" > > - $hadoop_rm_proxy_url = "http:// > ${hadoop_head_node}:${hadoop_rm_proxy_port}" > > - $hadoop_history_server_url = "http:// > ${hadoop_head_node}:${hadoop_history_server_port}" > > - > > - $bigtop_real_users = [ 'jenkins', 'testuser', 'hudson' ] > > - > > - $hadoop_core_proxyusers = { oozie => { groups => > 'hudson,testuser,root,hadoop,jenkins,oozie,httpfs,hue,users', hosts => "*" > }, > > - hue => { groups => > 'hudson,testuser,root,hadoop,jenkins,oozie,httpfs,hue,users', hosts => "*" > }, > > - httpfs => { groups => > 'hudson,testuser,root,hadoop,jenkins,oozie,httpfs,hue,users', hosts => "*" > } } > > - > > - $hbase_relative_rootdir = extlookup("hadoop_hbase_rootdir", > "/hbase") > > - $hadoop_hbase_rootdir = "$hadoop_namenode_uri$hbase_relative_rootdir" > > - $hadoop_hbase_zookeeper_quorum = $hadoop_head_node > > - $hbase_heap_size = extlookup("hbase_heap_size", "1024") > > - $hbase_thrift_server = $hadoop_head_node > > - > > - $giraph_zookeeper_quorum = $hadoop_head_node > > - > > - $spark_master_host = $hadoop_head_node > > - $tachyon_master_host = $hadoop_head_node > > - > > - $hadoop_zookeeper_ensemble = ["$hadoop_head_node:2888:3888"] > > - > > - # Set from facter if available > > - $roots = extlookup("hadoop_storage_dirs", > split($hadoop_storage_dirs, ";")) > > - $namenode_data_dirs = extlookup("hadoop_namenode_data_dirs", > append_each("/namenode", $roots)) > > - $hdfs_data_dirs = extlookup("hadoop_hdfs_data_dirs", > append_each("/hdfs", $roots)) > > - $mapred_data_dirs = extlookup("hadoop_mapred_data_dirs", > append_each("/mapred", $roots)) > > - $yarn_data_dirs = extlookup("hadoop_yarn_data_dirs", > append_each("/yarn", $roots)) > > - > > - $hadoop_security_authentication = extlookup("hadoop_security", > "simple") > > if ($hadoop_security_authentication == "kerberos") { > > - $kerberos_domain = extlookup("hadoop_kerberos_domain") > > - $kerberos_realm = extlookup("hadoop_kerberos_realm") > > - $kerberos_kdc_server = extlookup("hadoop_kerberos_kdc_server") > > - > > include kerberos::client > > } > > > > # Flume agent is the only component that goes on EVERY node in the > cluster > > if ($all or "flume" in $components) { > > - hadoop-flume::agent { "flume agent": > > - } > > + include hadoop-flume::agent > > } > > } > > > > > > > > -class hadoop_worker_node inherits hadoop_cluster_node { > > +class hadoop_worker_node ( > > + $bigtop_real_users = [ 'jenkins', 'testuser', 'hudson' ] > > + ) inherits hadoop_cluster_node { > > user { $bigtop_real_users: > > ensure => present, > > system => false, > 
> @@ -150,80 +56,42 @@ class hadoop_worker_node inherits > hadoop_cluster_node { > > User<||> -> Kerberos::Host_keytab<||> > > } > > > > - hadoop::datanode { "datanode": > > - namenode_host => $hadoop_namenode_host, > > - namenode_port => $hadoop_namenode_port, > > - dirs => $hdfs_data_dirs, > > - auth => $hadoop_security_authentication, > > - ha => $hadoop_ha, > > - } > > - > > + include hadoop::datanode > > if ($all or "yarn" in $components) { > > - hadoop::nodemanager { "nodemanager": > > - rm_host => $hadoop_rm_host, > > - rm_port => $hadoop_rm_port, > > - rt_port => $hadoop_rt_port, > > - dirs => $yarn_data_dirs, > > - auth => $hadoop_security_authentication, > > - } > > + include hadoop::nodemanager > > } > > if ($all or "hbase" in $components) { > > - hadoop-hbase::server { "hbase region server": > > - rootdir => $hadoop_hbase_rootdir, > > - heap_size => $hbase_heap_size, > > - zookeeper_quorum => $hadoop_hbase_zookeeper_quorum, > > - kerberos_realm => $kerberos_realm, > > - } > > + include hadoop-hbase::server > > } > > > > ### If mapred is not installed, yarn can fail. > > ### So, when we install yarn, we also need mapred for now. > > ### This dependency should be cleaned up eventually. > > if ($all or "mapred-app" or "yarn" in $components) { > > - hadoop::mapred-app { "mapred-app": > > - namenode_host => $hadoop_namenode_host, > > - namenode_port => $hadoop_namenode_port, > > - jobtracker_host => $hadoop_jobtracker_host, > > - jobtracker_port => $hadoop_jobtracker_port, > > - auth => $hadoop_security_authentication, > > - dirs => $mapred_data_dirs, > > - } > > + include hadoop::mapred-app > > } > > > > if ($all or "solrcloud" in $components) { > > - solr::server { "solrcloud server": > > - port => $solrcloud_port, > > - port_admin => $solrcloud_admin_port, > > - zk => $solrcloud_zk, > > - root_url => $hadoop_namenode_uri, > > - kerberos_realm => $kerberos_realm, > > - } > > + include solr::server > > } > > > > if ($all or "spark" in $components) { > > - spark::worker { "spark worker": > > - master_host => $spark_master_host, > > - master_port => $spark_master_port, > > - master_ui_port => $spark_master_ui_port, > > - } > > + include spark::worker > > } > > > > - if ($components[0] == undef or "tachyon" in $components) { > > - tachyon::worker { "tachyon worker": > > - master_host => $tachyon_master_host > > - } > > + if ($all or "tachyon" in $components) { > > + include tachyon::worker > > } > > > > } > > > > class hadoop_head_node inherits hadoop_worker_node { > > - > > exec { "init hdfs": > > path => ['/bin','/sbin','/usr/bin','/usr/sbin'], > > command => 'bash -x /usr/lib/hadoop/libexec/init-hdfs.sh', > > require => Package['hadoop-hdfs'] > > } > > - Hadoop::Namenode<||> -> Hadoop::Datanode<||> -> Exec<| title == "init > hdfs" |> > > + Class['Hadoop::Namenode'] -> Class['Hadoop::Datanode'] -> Exec<| > title == "init hdfs" |> > > > > if ($hadoop_security_authentication == "kerberos") { > > include kerberos::server > > @@ -231,196 +99,104 @@ if ($hadoop_security_authentication == > "kerberos") { > > include kerberos::kdc::admin_server > > } > > > > - hadoop::namenode { "namenode": > > - host => $hadoop_namenode_host, > > - port => $hadoop_namenode_port, > > - dirs => $namenode_data_dirs, > > - auth => $hadoop_security_authentication, > > - ha => $hadoop_ha, > > - zk => $hadoop_ha_zookeeper_quorum, > > - } > > + include hadoop::namenode > > > > - if ($hadoop_ha == "disabled") { > > - hadoop::secondarynamenode { "secondary namenode": > > - namenode_host => $hadoop_namenode_host, > 
> - namenode_port => $hadoop_namenode_port, > > - auth => $hadoop_security_authentication, > > - } > > + if ($hadoop::common_hdfs::ha == "disabled") { > > + include hadoop::secondarynamenode > > } > > > > if ($all or "yarn" in $components) { > > - hadoop::resourcemanager { "resourcemanager": > > - host => $hadoop_rm_host, > > - port => $hadoop_rm_port, > > - rt_port => $hadoop_rt_port, > > - sc_port => $hadoop_sc_port, > > - auth => $hadoop_security_authentication, > > - } > > - > > - hadoop::historyserver { "historyserver": > > - host => $hadoop_hs_host, > > - port => $hadoop_hs_port, > > - webapp_port => $hadoop_hs_webapp_port, > > - auth => $hadoop_security_authentication, > > - } > > - > > - hadoop::proxyserver { "proxyserver": > > - host => $hadoop_ps_host, > > - port => $hadoop_ps_port, > > - auth => $hadoop_security_authentication, > > - } > > - Exec<| title == "init hdfs" |> -> Hadoop::Resourcemanager<||> -> > Hadoop::Nodemanager<||> > > - Exec<| title == "init hdfs" |> -> Hadoop::Historyserver<||> > > + include hadoop::resourcemanager > > + include hadoop::historyserver > > + include hadoop::proxyserver > > + Exec<| title == "init hdfs" |> -> Class['Hadoop::Resourcemanager'] > -> Class['Hadoop::Nodemanager'] > > + Exec<| title == "init hdfs" |> -> Class['Hadoop::Historyserver'] > > } > > > > if ($all or "hbase" in $components) { > > - hadoop-hbase::master { "hbase master": > > - rootdir => $hadoop_hbase_rootdir, > > - heap_size => $hbase_heap_size, > > - zookeeper_quorum => $hadoop_hbase_zookeeper_quorum, > > - kerberos_realm => $kerberos_realm, > > - } > > - Exec<| title == "init hdfs" |> -> Hadoop-hbase::Master<||> > > + include hadoop-hbase::master > > + Exec<| title == "init hdfs" |> -> Class['Hadoop-hbase::Master'] > > } > > > > if ($all or "oozie" in $components) { > > - hadoop-oozie::server { "oozie server": > > - kerberos_realm => $kerberos_realm, > > + include hadoop-oozie::server > > + if ($all or "mapred-app" in $components) { > > + Class['Hadoop::Mapred-app'] -> Class['Hadoop-oozie::Server'] > > } > > - Hadoop::Mapred-app<||> -> Hadoop-oozie::Server<||> > > - Exec<| title == "init hdfs" |> -> Hadoop-oozie::Server<||> > > + Exec<| title == "init hdfs" |> -> Class['Hadoop-oozie::Server'] > > } > > > > if ($all or "hcat" in $components) { > > - hcatalog::server { "hcatalog server": > > - kerberos_realm => $kerberos_realm, > > - } > > - hcatalog::webhcat::server { "webhcat server": > > - kerberos_realm => $kerberos_realm, > > - } > > + include hcatalog::server > > + include hcatalog::webhcat::server > > } > > > > if ($all or "spark" in $components) { > > - spark::master { "spark master": > > - master_host => $spark_master_host, > > - master_port => $spark_master_port, > > - master_ui_port => $spark_master_ui_port, > > - } > > + include spark::master > > } > > > > - if ($all == undef or "tachyon" in $components) { > > - tachyon::master { "tachyon-master": > > - master_host => $tachyon_master_host > > - } > > + if ($all or "tachyon" in $components) { > > + include tachyon::master > > } > > > > if ($all or "hbase" in $components) { > > - hadoop-zookeeper::server { "zookeeper": > > - myid => "0", > > - ensemble => $hadoop_zookeeper_ensemble, > > - kerberos_realm => $kerberos_realm, > > - } > > + include hadoop-zookeeper::server > > } > > > > - Exec<| title == "init hdfs" |> -> Hadoop::Rsync_hdfs<||> > > - > > + # class hadoop::rsync_hdfs isn't used anywhere > > + #Exec<| title == "init hdfs" |> -> Class['Hadoop::Rsync_hdfs'] > > } > > > > class standby_head_node inherits 
hadoop_cluster_node { > > - hadoop::namenode { "namenode": > > - host => $hadoop_namenode_host, > > - port => $hadoop_namenode_port, > > - dirs => $namenode_data_dirs, > > - auth => $hadoop_security_authentication, > > - ha => $hadoop_ha, > > - zk => $hadoop_ha_zookeeper_quorum, > > - } > > + include hadoop::namenode > > } > > > > class hadoop_gateway_node inherits hadoop_cluster_node { > > - $hbase_thrift_address = "${fqdn}:${hbase_thrift_port}" > > - $hadoop_httpfs_url = "http:// > ${fqdn}:${hadoop_httpfs_port}/webhdfs/v1" > > - $sqoop_server_url = "http:// > ${fqdn}:${sqoop_server_port}/sqoop" > > - $solrcloud_url = "http:// > ${fqdn}:${solrcloud_port}/solr/" > > - > > if ($all or "sqoop" in $components) { > > - hadoop-sqoop::server { "sqoop server": > > - } > > + include hadoop-sqoop::server > > } > > > > if ($all or "httpfs" in $components) { > > - hadoop::httpfs { "httpfs": > > - namenode_host => $hadoop_namenode_host, > > - namenode_port => $hadoop_namenode_port, > > - auth => $hadoop_security_authentication, > > + include hadoop::httpfs > > + if ($all or "hue" in $components) { > > + Class['Hadoop::Httpfs'] -> Class['Hue::Server'] > > } > > - Hadoop::Httpfs<||> -> Hue::Server<||> > > } > > > > if ($all or "hue" in $components) { > > - hue::server { "hue server": > > - rm_url => $hadoop_rm_url, > > - rm_proxy_url => $hadoop_rm_proxy_url, > > - history_server_url => $hadoop_history_server_url, > > - webhdfs_url => $hadoop_httpfs_url, > > - sqoop_url => $sqoop_server_url, > > - solr_url => $solrcloud_url, > > - hbase_thrift_url => $hbase_thrift_address, > > - rm_host => $hadoop_rm_host, > > - rm_port => $hadoop_rm_port, > > - oozie_url => $hadoop_oozie_url, > > - default_fs => $hadoop_namenode_uri, > > - kerberos_realm => $kerberos_realm, > > + include hue::server > > + if ($all or "hbase" in $components) { > > + Class['Hadoop-hbase::Client'] -> Class['Hue::Server'] > > } > > } > > - Hadoop-hbase::Client<||> -> Hue::Server<||> > > > > - hadoop::client { "hadoop client": > > - namenode_host => $hadoop_namenode_host, > > - namenode_port => $hadoop_namenode_port, > > - jobtracker_host => $hadoop_jobtracker_host, > > - jobtracker_port => $hadoop_jobtracker_port, > > - # auth => $hadoop_security_authentication, > > - } > > + include hadoop::client > > > > if ($all or "mahout" in $components) { > > - mahout::client { "mahout client": > > - } > > + include mahout::client > > } > > if ($all or "giraph" in $components) { > > - giraph::client { "giraph client": > > - zookeeper_quorum => $giraph_zookeeper_quorum, > > - } > > + include giraph::client > > } > > if ($all or "crunch" in $components) { > > - crunch::client { "crunch client": > > - } > > + include crunch::client > > } > > if ($all or "pig" in $components) { > > - hadoop-pig::client { "pig client": > > - } > > + include hadoop-pig::client > > } > > if ($all or "hive" in $components) { > > - hadoop-hive::client { "hive client": > > - hbase_zookeeper_quorum => $hadoop_hbase_zookeeper_quorum, > > - } > > + include hadoop-hive::client > > } > > if ($all or "sqoop" in $components) { > > - hadoop-sqoop::client { "sqoop client": > > - } > > + include hadoop-sqoop::client > > } > > if ($all or "oozie" in $components) { > > - hadoop-oozie::client { "oozie client": > > - } > > + include hadoop-oozie::client > > } > > if ($all or "hbase" in $components) { > > - hadoop-hbase::client { "hbase thrift client": > > - thrift => true, > > - kerberos_realm => $kerberos_realm, > > - } > > + include hadoop-hbase::client > > } > > if ($all or "zookeeper" in 
$components) { > > - hadoop-zookeeper::client { "zookeeper client": > > - } > > + include hadoop-zookeeper::client > > } > > } > > diff --git a/bigtop-deploy/puppet/manifests/site.pp > b/bigtop-deploy/puppet/manifests/site.pp > > index 8997140..dd5921c 100644 > > --- a/bigtop-deploy/puppet/manifests/site.pp > > +++ b/bigtop-deploy/puppet/manifests/site.pp > > @@ -13,19 +13,15 @@ > > # See the License for the specific language governing permissions and > > # limitations under the License. > > > > -require bigtop_util > > -$puppet_confdir = get_setting("confdir") > > $default_yumrepo = " > http://bigtop01.cloudera.org:8080/view/Releases/job/Bigtop-0.8.0/label=centos6/6/artifact/output/ > " > > -$extlookup_datadir="$puppet_confdir/config" > > -$extlookup_precedence = ["site", "default"] > > -$jdk_package_name = extlookup("jdk_package_name", "jdk") > > +$jdk_package_name = hiera("bigtop::jdk_package_name", "jdk") > > > > stage {"pre": before => Stage["main"]} > > > > case $operatingsystem { > > /(OracleLinux|Amazon|CentOS|Fedora|RedHat)/: { > > yumrepo { "Bigtop": > > - baseurl => extlookup("bigtop_yumrepo_uri", $default_yumrepo), > > + baseurl => hiera("hiera::bigtop_yumrepo_uri", > $default_yumrepo), > > descr => "Bigtop packages", > > enabled => 1, > > gpgcheck => 0, > > @@ -44,10 +40,16 @@ package { $jdk_package_name: > > import "cluster.pp" > > > > node default { > > - include stdlib > > - $hadoop_head_node = extlookup("hadoop_head_node") > > - $standby_head_node = extlookup("standby_head_node", "") > > - $hadoop_gateway_node = extlookup("hadoop_gateway_node", > $hadoop_head_node) > > + $hadoop_head_node = hiera("bigtop::hadoop_head_node") > > + $standby_head_node = hiera("bigtop::standby_head_node", "") > > + $hadoop_gateway_node = hiera("bigtop::hadoop_gateway_node", > $hadoop_head_node) > > + > > + # look into alternate hiera datasources configured using this path in > > + # hiera.yaml > > + $hadoop_hiera_ha_path = $standby_head_node ? { > > + "" => "noha", > > + default => "ha", > > + } > > > > case $::fqdn { > > $hadoop_head_node: { > > @@ -69,7 +71,7 @@ node default { > > Yumrepo<||> -> Package<||> > > > > if versioncmp($::puppetversion,'3.6.1') >= 0 { > > - $allow_virtual_packages = hiera('allow_virtual_packages',false) > > + $allow_virtual_packages = > hiera('bigtop::allow_virtual_packages',false) > > Package { > > allow_virtual => $allow_virtual_packages, > > } > > diff --git > a/bigtop-deploy/puppet/modules/bigtop_util/lib/puppet/parser/functions/append_each.rb > b/bigtop-deploy/puppet/modules/bigtop_util/lib/puppet/parser/functions/append_each.rb > > deleted file mode 100644 > > index b360b1e..0000000 > > --- > a/bigtop-deploy/puppet/modules/bigtop_util/lib/puppet/parser/functions/append_each.rb > > +++ /dev/null > > @@ -1,22 +0,0 @@ > > -# Licensed to the Apache Software Foundation (ASF) under one or more > > -# contributor license agreements. See the NOTICE file distributed with > > -# this work for additional information regarding copyright ownership. > > -# The ASF licenses this file to You under the Apache License, Version > 2.0 > > -# (the "License"); you may not use this file except in compliance with > > -# the License. You may obtain a copy of the License at > > -# > > -# http://www.apache.org/licenses/LICENSE-2.0 > > -# > > -# Unless required by applicable law or agreed to in writing, software > > -# distributed under the License is distributed on an "AS IS" BASIS, > > -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or > implied. 
> > -# See the License for the specific language governing permissions and > > -# limitations under the License. > > - > > -# Append a string to every element of an array > > - > > -Puppet::Parser::Functions::newfunction(:append_each, :type => :rvalue) > do |args| > > - suffix = (args[0].is_a? Array) ? args[0].join("") : args[0] > > - inputs = (args[1].is_a? Array) ? args[1] : [ args[1] ] > > - inputs.map { |item| item + suffix } > > -end > > diff --git a/bigtop-deploy/puppet/modules/crunch/manifests/init.pp > b/bigtop-deploy/puppet/modules/crunch/manifests/init.pp > > index d446667..b31edf6 100644 > > --- a/bigtop-deploy/puppet/modules/crunch/manifests/init.pp > > +++ b/bigtop-deploy/puppet/modules/crunch/manifests/init.pp > > @@ -14,7 +14,7 @@ > > # limitations under the License. > > > > class crunch { > > - define client { > > + class client { > > package { ["crunch", "crunch-doc"]: > > ensure => latest, > > } > > diff --git a/bigtop-deploy/puppet/modules/giraph/manifests/init.pp > b/bigtop-deploy/puppet/modules/giraph/manifests/init.pp > > index 6652e40..1dc0d9b 100644 > > --- a/bigtop-deploy/puppet/modules/giraph/manifests/init.pp > > +++ b/bigtop-deploy/puppet/modules/giraph/manifests/init.pp > > @@ -14,7 +14,7 @@ > > # limitations under the License. > > > > class giraph { > > - define client($zookeeper_quorum = 'localhost') { > > + class client($zookeeper_quorum = 'localhost') { > > package { "giraph": > > ensure => latest, > > } > > diff --git a/bigtop-deploy/puppet/modules/hadoop-flume/manifests/init.pp > b/bigtop-deploy/puppet/modules/hadoop-flume/manifests/init.pp > > index 8e3bf64..daf352a 100644 > > --- a/bigtop-deploy/puppet/modules/hadoop-flume/manifests/init.pp > > +++ b/bigtop-deploy/puppet/modules/hadoop-flume/manifests/init.pp > > @@ -14,7 +14,7 @@ > > # limitations under the License. 
> > > class hadoop-flume { > > - define agent($sources = [], $sinks = [], $channels = []) { > > + class agent($sources = [], $sinks = [], $channels = []) { > > package { "flume-agent": > > ensure => latest, > > } > > diff --git a/bigtop-deploy/puppet/modules/hadoop-hbase/manifests/init.pp > b/bigtop-deploy/puppet/modules/hadoop-hbase/manifests/init.pp > > index 3bbaa8a..5ef45b1 100644 > > --- a/bigtop-deploy/puppet/modules/hadoop-hbase/manifests/init.pp > > +++ b/bigtop-deploy/puppet/modules/hadoop-hbase/manifests/init.pp > > @@ -20,7 +20,7 @@ class hadoop-hbase { > > } > > } > > > > - class common-config { > > + class common_config ($rootdir, $zookeeper_quorum, $kerberos_realm = > "", $heap_size="1024") { > > include client-package > > if ($kerberos_realm) { > > require kerberos::client > > @@ -45,8 +45,8 @@ class hadoop-hbase { > > } > > } > > > > - define client($thrift = false, $kerberos_realm = "") { > > - include common-config > > + class client($thrift = false) { > > + include common_config > > > > if ($thrift) { > > package { "hbase-thrift": > > @@ -64,8 +64,8 @@ class hadoop-hbase { > > } > > } > > > > - define server($rootdir, $zookeeper_quorum, $kerberos_realm = "", > $heap_size="1024") { > > - include common-config > > + class server { > > + include common_config > > > > package { "hbase-regionserver": > > ensure => latest, > > @@ -81,8 +81,8 @@ class hadoop-hbase { > > Kerberos::Host_keytab <| title == "hbase" |> -> > Service["hbase-regionserver"] > > } > > > > - define master($rootdir, $zookeeper_quorum, $kerberos_realm = "", > $heap_size="1024") { > > - include common-config > > + class master { > > + include common_config > > > > package { "hbase-master": > > ensure => latest, > > diff --git a/bigtop-deploy/puppet/modules/hadoop-hive/manifests/init.pp > b/bigtop-deploy/puppet/modules/hadoop-hive/manifests/init.pp > > index 891d4be..f9dede4 100644 > > --- a/bigtop-deploy/puppet/modules/hadoop-hive/manifests/init.pp > > +++ b/bigtop-deploy/puppet/modules/hadoop-hive/manifests/init.pp > > @@ -14,7 +14,7 @@ > > # limitations under the License. > > > > class hadoop-hive { > > - define client($hbase_master = "", $hbase_zookeeper_quorum = "") { > > + class client($hbase_master = "", $hbase_zookeeper_quorum = "") { > > package { "hive": > > ensure => latest, > > } > > diff --git a/bigtop-deploy/puppet/modules/hadoop-oozie/manifests/init.pp > b/bigtop-deploy/puppet/modules/hadoop-oozie/manifests/init.pp > > index 46b937b..f1177e9 100644 > > --- a/bigtop-deploy/puppet/modules/hadoop-oozie/manifests/init.pp > > +++ b/bigtop-deploy/puppet/modules/hadoop-oozie/manifests/init.pp > > @@ -14,13 +14,13 @@ > > # limitations under the License. > > > > class hadoop-oozie { > > - define client($kerberos_realm = "") { > > + class client { > > package { "oozie-client": > > ensure => latest, > > } > > } > > > > - define server($kerberos_realm = "") { > > + class server($kerberos_realm = "") { > > if ($kerberos_realm) { > > require kerberos::client > > kerberos::host_keytab { "oozie": > > diff --git a/bigtop-deploy/puppet/modules/hadoop-pig/manifests/init.pp > b/bigtop-deploy/puppet/modules/hadoop-pig/manifests/init.pp > > index f26047b..37bfde0 100644 > > --- a/bigtop-deploy/puppet/modules/hadoop-pig/manifests/init.pp > > +++ b/bigtop-deploy/puppet/modules/hadoop-pig/manifests/init.pp > > @@ -14,7 +14,7 @@ > > # limitations under the License.
> > > > class hadoop-pig { > > - define client { > > + class client { > > package { "pig": > > ensure => latest, > > require => Package["hadoop"], > > diff --git a/bigtop-deploy/puppet/modules/hadoop-sqoop/manifests/init.pp > b/bigtop-deploy/puppet/modules/hadoop-sqoop/manifests/init.pp > > index d1d08db..e0223ba 100644 > > --- a/bigtop-deploy/puppet/modules/hadoop-sqoop/manifests/init.pp > > +++ b/bigtop-deploy/puppet/modules/hadoop-sqoop/manifests/init.pp > > @@ -14,13 +14,13 @@ > > # limitations under the License. > > > > class hadoop-sqoop { > > - define client { > > + class client { > > package { "sqoop-client": > > ensure => latest, > > } > > } > > > > - define server { > > + class server { > > package { "sqoop-server": > > ensure => latest, > > } > > diff --git > a/bigtop-deploy/puppet/modules/hadoop-zookeeper/manifests/init.pp > b/bigtop-deploy/puppet/modules/hadoop-zookeeper/manifests/init.pp > > index 701590e..dfbb6eb 100644 > > --- a/bigtop-deploy/puppet/modules/hadoop-zookeeper/manifests/init.pp > > +++ b/bigtop-deploy/puppet/modules/hadoop-zookeeper/manifests/init.pp > > @@ -14,14 +14,14 @@ > > # limitations under the License. > > > > class hadoop-zookeeper { > > - define client { > > + class client { > > package { "zookeeper": > > ensure => latest, > > require => Package["jdk"], > > } > > } > > > > - define server($myid, $ensemble = ["localhost:2888:3888"], > > + class server($myid, $ensemble = ["localhost:2888:3888"], > > $kerberos_realm = "") > > { > > package { "zookeeper-server": > > diff --git a/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp > b/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp > > index 32eebe2..2c631ba 100644 > > --- a/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp > > +++ b/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp > > @@ -13,7 +13,16 @@ > > # See the License for the specific language governing permissions and > > # limitations under the License. > > > > -class hadoop { > > +class hadoop ($hadoop_security_authentication = "simple", > > + $zk = "", > > + # Set from facter if available > > + $hadoop_storage_dirs = split($::hadoop_storage_dirs, ";"), > > + $proxyusers = { > > + oozie => { groups => > 'hudson,testuser,root,hadoop,jenkins,oozie,httpfs,hue,users', hosts => "*" > }, > > + hue => { groups => > 'hudson,testuser,root,hadoop,jenkins,oozie,httpfs,hue,users', hosts => "*" > }, > > + httpfs => { groups => > 'hudson,testuser,root,hadoop,jenkins,oozie,httpfs,hue,users', hosts => "*" > } } ) { > > + > > + include stdlib > > > > /** > > * Common definitions for hadoop nodes. 
> > @@ -36,8 +45,28 @@ class hadoop { > > } > > } > > > > - class common { > > - if ($auth == "kerberos") { > > + class common ($hadoop_java_home = undef, > > + $hadoop_classpath = undef, > > + $hadoop_heapsize = undef, > > + $hadoop_opts = undef, > > + $hadoop_namenode_opts = undef, > > + $hadoop_secondarynamenode_opts = undef, > > + $hadoop_datanode_opts = undef, > > + $hadoop_balancer_opts = undef, > > + $hadoop_jobtracker_opts = undef, > > + $hadoop_tasktracker_opts = undef, > > + $hadoop_client_opts = undef, > > + $hadoop_ssh_opts = undef, > > + $hadoop_log_dir = undef, > > + $hadoop_slaves = undef, > > + $hadoop_master = undef, > > + $hadoop_slave_sleep = undef, > > + $hadoop_pid_dir = undef, > > + $hadoop_ident_string = undef, > > + $hadoop_niceness = undef, > > + $hadoop_security_authentication = > $hadoop::hadoop_security_authentication ) inherits hadoop { > > + > > + if ($hadoop_security_authentication == "kerberos") { > > include hadoop::kerberos > > } > > > > @@ -58,7 +87,25 @@ class hadoop { > > #} > > } > > > > - class common-yarn inherits common { > > + class common_yarn ( > > + $yarn_data_dirs = suffix($hadoop::hadoop_storage_dirs, "/yarn"), > > + $kerberos_realm = undef, > > + $hadoop_ps_host, > > + $hadoop_ps_port = "20888", > > + $hadoop_rm_host, > > + $hadoop_rm_port = "8032", > > + $hadoop_rt_port = "8025", > > + $hadoop_sc_port = "8030", > > + $yarn_nodemanager_resource_memory_mb = undef, > > + $yarn_scheduler_maximum_allocation_mb = undef, > > + $yarn_scheduler_minimum_allocation_mb = undef, > > + $yarn_resourcemanager_scheduler_class = undef, > > + $yarn_resourcemanager_ha_enabled = undef, > > + $yarn_resourcemanager_cluster_id = "ha-rm-uri", > > + $yarn_resourcemanager_zk_address = $hadoop::zk) inherits hadoop { > > + > > + include common > > + > > package { "hadoop-yarn": > > ensure => latest, > > require => [Package["jdk"], Package["hadoop"]], > > @@ -76,18 +123,55 @@ class hadoop { > > } > > } > > > > - class common-hdfs inherits common { > > + class common_hdfs ($ha = "disabled", > > + $hadoop_config_dfs_block_size = undef, > > + $hadoop_config_namenode_handler_count = undef, > > + $hadoop_dfs_datanode_plugins = "", > > + $hadoop_dfs_namenode_plugins = "", > > + $hadoop_namenode_host = $fqdn, > > + $hadoop_namenode_port = "8020", > > + $hadoop_namenode_http_port = "50070", > > + $hadoop_namenode_https_port = "50470", > > + $hdfs_data_dirs = suffix($hadoop::hadoop_storage_dirs, "/hdfs"), > > + $hdfs_shortcut_reader_user = undef, > > + $hdfs_support_append = undef, > > + $hdfs_webhdfs_enabled = "true", > > + $hdfs_replication = undef, > > + $hdfs_datanode_fsdataset_volume_choosing_policy = undef, > > + $namenode_data_dirs = suffix($hadoop::hadoop_storage_dirs, > "/namenode"), > > + $nameservice_id = "ha-nn-uri", > > + $journalnode_edits_dir = undef, > > + $shared_edits_dir = "/hdfs_shared", > > + $testonly_hdfs_sshkeys = "no", > > + $hadoop_ha_sshfence_user_home = "/var/lib/hadoop-hdfs", > > + $sshfence_user = "hdfs", > > + $zk = $hadoop::zk, > > + $hadoop_config_fs_inmemory_size_mb = undef, > > + $hadoop_security_group_mapping = undef, > > + $hadoop_core_proxyusers = $hadoop::proxyusers, > > + $hadoop_snappy_codec = undef, > > + $hadoop_security_authentication = > $hadoop::hadoop_security_authentication ) inherits hadoop { > > + > > + require bigtop_util > > + $sshfence_keydir = "$hadoop_ha_sshfence_user_home/.ssh" > > + $sshfence_keypath = "$sshfence_keydir/id_sshfence" > > + $puppet_confdir = get_setting("confdir") > > + $configdir = hiera("hadoop::configdir", 
> "$puppet_confdir/config") > > + $sshfence_privkey = hiera("hadoop::common_hdfs::sshfence_privkey", > "$configdir/hadoop/id_sshfence") > > + $sshfence_pubkey = hiera("hadoop::common_hdfs::sshfence_pubkey", > "$configdir/hadoop/id_sshfence.pub") > > + > > + include common > > + > > # Check if test mode is enforced, so we can install hdfs ssh-keys for > passwordless > > - $testonly = extlookup("testonly_hdfs_sshkeys", 'no') > > - if ($testonly == "yes") { > > + if ($testonly_hdfs_sshkeys == "yes") { > > notify{"WARNING: provided hdfs ssh keys are for testing purposes > only.\n > > They shouldn't be used in production cluster": } > > $ssh_user = "hdfs" > > $ssh_user_home = "/var/lib/hadoop-hdfs" > > $ssh_user_keydir = "$ssh_user_home/.ssh" > > $ssh_keypath = "$ssh_user_keydir/id_hdfsuser" > > - $ssh_privkey = "$extlookup_datadir/hdfs/id_hdfsuser" > > - $ssh_pubkey = "$extlookup_datadir/hdfs/id_hdfsuser.pub" > > + $ssh_privkey = "$configdir/hdfs/id_hdfsuser" > > + $ssh_pubkey = "$configdir/hdfs/id_hdfsuser.pub" > > > > file { $ssh_user_keydir: > > ensure => directory, > > @@ -113,14 +197,10 @@ class hadoop { > > require => File[$ssh_user_keydir], > > } > > } > > - if ($auth == "kerberos" and $ha != "disabled") { > > + if ($hadoop_security_authentication == "kerberos" and $ha != > "disabled") { > > fail("High-availability secure clusters are not currently > supported") > > } > > > > - if ($ha != 'disabled') { > > - $nameservice_id = extlookup("hadoop_ha_nameservice_id", > "ha-nn-uri") > > - } > > - > > package { "hadoop-hdfs": > > ensure => latest, > > require => [Package["jdk"], Package["hadoop"]], > > @@ -139,7 +219,32 @@ class hadoop { > > } > > } > > > > - class common-mapred-app inherits common-hdfs { > > + class common_mapred_app ( > > + $hadoop_config_io_sort_factor = undef, > > + $hadoop_config_io_sort_mb = undef, > > + $hadoop_config_mapred_child_ulimit = undef, > > + $hadoop_config_mapred_fairscheduler_assignmultiple = undef, > > + $hadoop_config_mapred_fairscheduler_sizebasedweight = undef, > > + $hadoop_config_mapred_job_tracker_handler_count = undef, > > + $hadoop_config_mapred_reduce_parallel_copies = undef, > > + $hadoop_config_mapred_reduce_slowstart_completed_maps = undef, > > + $hadoop_config_mapred_reduce_tasks_speculative_execution = undef, > > + $hadoop_config_tasktracker_http_threads = undef, > > + $hadoop_config_use_compression = undef, > > + $hadoop_hs_host = undef, > > + $hadoop_hs_port = "10020", > > + $hadoop_hs_webapp_port = "19888", > > + $hadoop_jobtracker_fairscheduler_weightadjuster = undef, > > + $hadoop_jobtracker_host, > > + $hadoop_jobtracker_port = "8021", > > + $hadoop_jobtracker_taskscheduler = undef, > > + $hadoop_mapred_jobtracker_plugins = "", > > + $hadoop_mapred_tasktracker_plugins = "", > > + $mapred_acls_enabled = undef, > > + $mapred_data_dirs = suffix($hadoop::hadoop_storage_dirs, > "/mapred")) { > > + > > + include common_hdfs > > + > > package { "hadoop-mapreduce": > > ensure => latest, > > require => [Package["jdk"], Package["hadoop"]], > > @@ -157,22 +262,8 @@ class hadoop { > > } > > } > > > > - define datanode ($namenode_host, $namenode_port, $port = "50075", > $auth = "simple", $dirs = ["/tmp/data"], $ha = 'disabled') { > > - > > - $hadoop_namenode_host = $namenode_host > > - $hadoop_namenode_port = $namenode_port > > - $hadoop_datanode_port = $port > > - $hadoop_security_authentication = $auth > > - > > - if ($ha != 'disabled') { > > - # Needed by hdfs-site.xml > > - $sshfence_keydir = "/usr/lib/hadoop/.ssh" > > - $sshfence_keypath = 
"$sshfence_keydir/id_sshfence" > > - $sshfence_user = extlookup("hadoop_ha_sshfence_user", > "hdfs") > > - $shared_edits_dir = extlookup("hadoop_ha_shared_edits_dir", > "/hdfs_shared") > > - } > > - > > - include common-hdfs > > + class datanode { > > + include common_hdfs > > > > package { "hadoop-hdfs-datanode": > > ensure => latest, > > @@ -189,11 +280,11 @@ class hadoop { > > ensure => running, > > hasstatus => true, > > subscribe => [Package["hadoop-hdfs-datanode"], > File["/etc/hadoop/conf/core-site.xml"], > File["/etc/hadoop/conf/hdfs-site.xml"], > File["/etc/hadoop/conf/hadoop-env.sh"]], > > - require => [ Package["hadoop-hdfs-datanode"], > File["/etc/default/hadoop-hdfs-datanode"], File[$dirs] ], > > + require => [ Package["hadoop-hdfs-datanode"], > File["/etc/default/hadoop-hdfs-datanode"], > File[$hadoop::common_hdfs::hdfs_data_dirs] ], > > } > > Kerberos::Host_keytab <| title == "hdfs" |> -> Exec <| tag == > "namenode-format" |> -> Service["hadoop-hdfs-datanode"] > > > > - file { $dirs: > > + file { $hadoop::common_hdfs::hdfs_data_dirs: > > ensure => directory, > > owner => hdfs, > > group => hdfs, > > @@ -202,14 +293,12 @@ class hadoop { > > } > > } > > > > - define httpfs ($namenode_host, $namenode_port, $port = "14000", $auth > = "simple", $secret = "hadoop httpfs secret") { > > - > > - $hadoop_namenode_host = $namenode_host > > - $hadoop_namenode_port = $namenode_port > > - $hadoop_httpfs_port = $port > > - $hadoop_security_authentication = $auth > > + class httpfs ($hadoop_httpfs_port = "14000", > > + $secret = "hadoop httpfs secret", > > + $hadoop_core_proxyusers = $hadoop::proxyusers, > > + $hadoop_security_authentcation = > $hadoop::hadoop_security_authentication ) inherits hadoop { > > > > - if ($auth == "kerberos") { > > + if ($hadoop_security_authentication == "kerberos") { > > kerberos::host_keytab { "httpfs": > > spnego => true, > > require => Package["hadoop-httpfs"], > > @@ -255,11 +344,12 @@ class hadoop { > > } > > } > > > > - define create_hdfs_dirs($hdfs_dirs_meta, $auth="simple") { > > + class create_hdfs_dirs($hdfs_dirs_meta, > > + $hadoop_security_authentcation = > $hadoop::hadoop_security_authentication ) inherits hadoop { > > $user = $hdfs_dirs_meta[$title][user] > > $perm = $hdfs_dirs_meta[$title][perm] > > > > - if ($auth == "kerberos") { > > + if ($hadoop_security_authentication == "kerberos") { > > require hadoop::kinit > > Exec["HDFS kinit"] -> Exec["HDFS init $title"] > > } > > @@ -272,10 +362,11 @@ class hadoop { > > Exec <| title == "activate nn1" |> -> Exec["HDFS init $title"] > > } > > > > - define rsync_hdfs($files, $auth="simple") { > > + class rsync_hdfs($files, > > + $hadoop_security_authentcation = > $hadoop::hadoop_security_authentication ) inherits hadoop { > > $src = $files[$title] > > > > - if ($auth == "kerberos") { > > + if ($hadoop_security_authentication == "kerberos") { > > require hadoop::kinit > > Exec["HDFS kinit"] -> Exec["HDFS init $title"] > > } > > @@ -288,25 +379,11 @@ class hadoop { > > Exec <| title == "activate nn1" |> -> Exec["HDFS rsync $title"] > > } > > > > - define namenode ($host = $fqdn , $port = "8020", $auth = "simple", > $dirs = ["/tmp/nn"], $ha = 'disabled', $zk = '') { > > - > > - $first_namenode = inline_template("<%= Array(@host)[0] %>") > > - $hadoop_namenode_host = $host > > - $hadoop_namenode_port = $port > > - $hadoop_security_authentication = $auth > > - > > - if ($ha != 'disabled') { > > - $sshfence_user = extlookup("hadoop_ha_sshfence_user", > "hdfs") > > - $sshfence_user_home = 
extlookup("hadoop_ha_sshfence_user_home", > "/var/lib/hadoop-hdfs") > > - $sshfence_keydir = "$sshfence_user_home/.ssh" > > - $sshfence_keypath = "$sshfence_keydir/id_sshfence" > > - $sshfence_privkey = extlookup("hadoop_ha_sshfence_privkey", > "$extlookup_datadir/hadoop/id_sshfence") > > - $sshfence_pubkey = extlookup("hadoop_ha_sshfence_pubkey", > "$extlookup_datadir/hadoop/id_sshfence.pub") > > - $shared_edits_dir = extlookup("hadoop_ha_shared_edits_dir", > "/hdfs_shared") > > - $nfs_server = extlookup("hadoop_ha_nfs_server", > "") > > - $nfs_path = extlookup("hadoop_ha_nfs_path", > "") > > - > > - file { $sshfence_keydir: > > + class namenode ( $nfs_server = "", $nfs_path = "" ) { > > + include common_hdfs > > + > > + if ($hadoop::common_hdfs::ha != 'disabled') { > > + file { $hadoop::common_hdfs::sshfence_keydir: > > ensure => directory, > > owner => 'hdfs', > > group => 'hdfs', > > @@ -314,25 +391,25 @@ class hadoop { > > require => Package["hadoop-hdfs"], > > } > > > > - file { $sshfence_keypath: > > - source => $sshfence_privkey, > > + file { $hadoop::common_hdfs::sshfence_keypath: > > + source => $hadoop::common_hdfs::sshfence_privkey, > > owner => 'hdfs', > > group => 'hdfs', > > mode => '0600', > > before => Service["hadoop-hdfs-namenode"], > > - require => File[$sshfence_keydir], > > + require => File[$hadoop::common_hdfs::sshfence_keydir], > > } > > > > - file { "$sshfence_keydir/authorized_keys": > > + file { "$hadoop::common_hdfs::sshfence_keydir/authorized_keys": > > source => $sshfence_pubkey, > > owner => 'hdfs', > > group => 'hdfs', > > mode => '0600', > > before => Service["hadoop-hdfs-namenode"], > > - require => File[$sshfence_keydir], > > + require => File[$hadoop::common_hdfs::sshfence_keydir], > > } > > > > - file { $shared_edits_dir: > > + file { $hadoop::common_hdfs::shared_edits_dir: > > ensure => directory, > > } > > > > @@ -343,20 +420,18 @@ class hadoop { > > > > require nfs::client > > > > - mount { $shared_edits_dir: > > + mount { $hadoop::common_hdfs::shared_edits_dir: > > ensure => "mounted", > > atboot => true, > > device => "${nfs_server}:${nfs_path}", > > fstype => "nfs", > > options => "tcp,soft,timeo=10,intr,rsize=32768,wsize=32768", > > - require => File[$shared_edits_dir], > > + require => File[$hadoop::common::hdfs::shared_edits_dir], > > before => Service["hadoop-hdfs-namenode"], > > } > > } > > } > > > > - include common-hdfs > > - > > package { "hadoop-hdfs-namenode": > > ensure => latest, > > require => Package["jdk"], > > @@ -370,7 +445,7 @@ class hadoop { > > } > > Kerberos::Host_keytab <| title == "hdfs" |> -> Exec <| tag == > "namenode-format" |> -> Service["hadoop-hdfs-namenode"] > > > > - if ($ha == "auto") { > > + if ($hadoop::common_hdfs::ha == "auto") { > > package { "hadoop-hdfs-zkfc": > > ensure => latest, > > require => Package["jdk"], > > @@ -385,17 +460,18 @@ class hadoop { > > Service <| title == "hadoop-hdfs-zkfc" |> -> Service <| title == > "hadoop-hdfs-namenode" |> > > } > > > > + $first_namenode = > any2array($hadoop::common_hdfs::hadoop_namenode_host)[0] > > if ($::fqdn == $first_namenode) { > > exec { "namenode format": > > user => "hdfs", > > command => "/bin/bash -c 'yes Y | hdfs namenode -format >> > /var/lib/hadoop-hdfs/nn.format.log 2>&1'", > > - creates => "${dirs[0]}/current/VERSION", > > - require => [ Package["hadoop-hdfs-namenode"], File[$dirs], > File["/etc/hadoop/conf/hdfs-site.xml"] ], > > + creates => > "${hadoop::common_hdfs::namenode_data_dirs[0]}/current/VERSION", > > + require => [ 
Package["hadoop-hdfs-namenode"], > File[$hadoop::common_hdfs::namenode_data_dirs], > File["/etc/hadoop/conf/hdfs-site.xml"] ], > > tag => "namenode-format", > > } > > > > - if ($ha != "disabled") { > > - if ($ha == "auto") { > > + if ($hadoop::common_hdfs::ha != "disabled") { > > + if ($hadoop::common_hdfs::ha == "auto") { > > exec { "namenode zk format": > > user => "hdfs", > > command => "/bin/bash -c 'yes N | hdfs zkfc -formatZK >> > /var/lib/hadoop-hdfs/zk.format.log 2>&1 || :'", > > @@ -413,11 +489,11 @@ class hadoop { > > } > > } > > } > > - } elsif ($ha != "disabled") { > > - hadoop::namedir_copy { $namenode_data_dirs: > > + } elsif ($hadoop::common_hdfs::ha != "disabled") { > > + hadoop::namedir_copy { $hadoop::common_hdfs::namenode_data_dirs: > > source => $first_namenode, > > - ssh_identity => $sshfence_keypath, > > - require => File[$sshfence_keypath], > > + ssh_identity => $hadoop::common_hdfs::sshfence_keypath, > > + require => File[$hadoop::common_hdfs::sshfence_keypath], > > } > > } > > > > @@ -427,7 +503,7 @@ class hadoop { > > require => [Package["hadoop-hdfs-namenode"]], > > } > > > > - file { $dirs: > > + file { $hadoop::common_hdfs::namenode_data_dirs: > > ensure => directory, > > owner => hdfs, > > group => hdfs, > > @@ -445,12 +521,8 @@ class hadoop { > > } > > } > > > > - define secondarynamenode ($namenode_host, $namenode_port, $port = > "50090", $auth = "simple") { > > - > > - $hadoop_secondarynamenode_port = $port > > - $hadoop_security_authentication = $auth > > - > > - include common-hdfs > > + class secondarynamenode { > > + include common_hdfs > > > > package { "hadoop-hdfs-secondarynamenode": > > ensure => latest, > > @@ -472,15 +544,34 @@ class hadoop { > > Kerberos::Host_keytab <| title == "hdfs" |> -> > Service["hadoop-hdfs-secondarynamenode"] > > } > > > > + class journalnode { > > + include common_hdfs > > > > - define resourcemanager ($host = $fqdn, $port = "8032", $rt_port = > "8025", $sc_port = "8030", $auth = "simple") { > > - $hadoop_rm_host = $host > > - $hadoop_rm_port = $port > > - $hadoop_rt_port = $rt_port > > - $hadoop_sc_port = $sc_port > > - $hadoop_security_authentication = $auth > > + package { "hadoop-hdfs-journalnode": > > + ensure => latest, > > + require => Package["jdk"], > > + } > > + > > + service { "hadoop-hdfs-journalnode": > > + ensure => running, > > + hasstatus => true, > > + subscribe => [Package["hadoop-hdfs-journalnode"], > File["/etc/hadoop/conf/hadoop-env.sh"], > > + File["/etc/hadoop/conf/hdfs-site.xml"], > File["/etc/hadoop/conf/core-site.xml"]], > > + require => [ Package["hadoop-hdfs-journalnode"], > File[$hadoop::common_hdfs::journalnode_edits_dir] ], > > + } > > + > > + file { > "${hadoop::common_hdfs::journalnode_edits_dir}/${hadoop::common_hdfs::nameservice_id}": > > + ensure => directory, > > + owner => 'hdfs', > > + group => 'hdfs', > > + mode => 755, > > + require => [Package["hadoop-hdfs"]], > > + } > > + } > > > > - include common-yarn > > + > > + class resourcemanager { > > + include common_yarn > > > > package { "hadoop-yarn-resourcemanager": > > ensure => latest, > > @@ -497,12 +588,8 @@ class hadoop { > > Kerberos::Host_keytab <| tag == "mapreduce" |> -> > Service["hadoop-yarn-resourcemanager"] > > } > > > > - define proxyserver ($host = $fqdn, $port = "8088", $auth = "simple") { > > - $hadoop_ps_host = $host > > - $hadoop_ps_port = $port > > - $hadoop_security_authentication = $auth > > - > > - include common-yarn > > + class proxyserver { > > + include common_yarn > > > > package { 
"hadoop-yarn-proxyserver": > > ensure => latest, > > @@ -519,13 +606,8 @@ class hadoop { > > Kerberos::Host_keytab <| tag == "mapreduce" |> -> > Service["hadoop-yarn-proxyserver"] > > } > > > > - define historyserver ($host = $fqdn, $port = "10020", $webapp_port = > "19888", $auth = "simple") { > > - $hadoop_hs_host = $host > > - $hadoop_hs_port = $port > > - $hadoop_hs_webapp_port = $app_port > > - $hadoop_security_authentication = $auth > > - > > - include common-mapred-app > > + class historyserver { > > + include common_mapred_app > > > > package { "hadoop-mapreduce-historyserver": > > ensure => latest, > > @@ -543,12 +625,8 @@ class hadoop { > > } > > > > > > - define nodemanager ($rm_host, $rm_port, $rt_port, $auth = "simple", > $dirs = ["/tmp/yarn"]){ > > - $hadoop_rm_host = $rm_host > > - $hadoop_rm_port = $rm_port > > - $hadoop_rt_port = $rt_port > > - > > - include common-yarn > > + class nodemanager { > > + include common_yarn > > > > package { "hadoop-yarn-nodemanager": > > ensure => latest, > > @@ -560,11 +638,11 @@ class hadoop { > > hasstatus => true, > > subscribe => [Package["hadoop-yarn-nodemanager"], > File["/etc/hadoop/conf/hadoop-env.sh"], > > File["/etc/hadoop/conf/yarn-site.xml"], > File["/etc/hadoop/conf/core-site.xml"]], > > - require => [ Package["hadoop-yarn-nodemanager"], File[$dirs] ], > > + require => [ Package["hadoop-yarn-nodemanager"], > File[$hadoop::common_yarn::yarn_data_dirs] ], > > } > > Kerberos::Host_keytab <| tag == "mapreduce" |> -> > Service["hadoop-yarn-nodemanager"] > > > > - file { $dirs: > > + file { $hadoop::common_yarn::yarn_data_dirs: > > ensure => directory, > > owner => yarn, > > group => yarn, > > @@ -573,21 +651,10 @@ class hadoop { > > } > > } > > > > - define mapred-app ($namenode_host, $namenode_port, $jobtracker_host, > $jobtracker_port, $auth = "simple", $jobhistory_host = "", > $jobhistory_port="10020", $dirs = ["/tmp/mr"]){ > > - $hadoop_namenode_host = $namenode_host > > - $hadoop_namenode_port = $namenode_port > > - $hadoop_jobtracker_host = $jobtracker_host > > - $hadoop_jobtracker_port = $jobtracker_port > > - $hadoop_security_authentication = $auth > > + class mapred-app { > > + include common_mapred_app > > > > - include common-mapred-app > > - > > - if ($jobhistory_host != "") { > > - $hadoop_hs_host = $jobhistory_host > > - $hadoop_hs_port = $jobhistory_port > > - } > > - > > - file { $dirs: > > + file { $hadoop::common_mapred_app::mapred_data_dirs: > > ensure => directory, > > owner => yarn, > > group => yarn, > > @@ -596,12 +663,9 @@ class hadoop { > > } > > } > > > > - define client ($namenode_host, $namenode_port, $jobtracker_host, > $jobtracker_port, $auth = "simple") { > > - $hadoop_namenode_host = $namenode_host > > - $hadoop_namenode_port = $namenode_port > > - $hadoop_jobtracker_host = $jobtracker_host > > - $hadoop_jobtracker_port = $jobtracker_port > > - $hadoop_security_authentication = $auth > > + class client { > > + include common_mapred_app > > + > > $hadoop_client_packages = $operatingsystem ? 
> > @@ -596,12 +663,9 @@ class hadoop {
> >      }
> >    }
> >
> > -  define client ($namenode_host, $namenode_port, $jobtracker_host, $jobtracker_port, $auth = "simple") {
> > -    $hadoop_namenode_host = $namenode_host
> > -    $hadoop_namenode_port = $namenode_port
> > -    $hadoop_jobtracker_host = $jobtracker_host
> > -    $hadoop_jobtracker_port = $jobtracker_port
> > -    $hadoop_security_authentication = $auth
> > +  class client {
> > +    include common_mapred_app
> > +
> >      $hadoop_client_packages = $operatingsystem ? {
> >        /(OracleLinux|CentOS|RedHat|Fedora)/ => [ "hadoop-doc", "hadoop-hdfs-fuse", "hadoop-client", "hadoop-libhdfs", "hadoop-debuginfo" ],
> >        /(SLES|OpenSuSE)/                    => [ "hadoop-doc", "hadoop-hdfs-fuse", "hadoop-client", "hadoop-libhdfs" ],
> > @@ -609,8 +673,6 @@ class hadoop {
> >        default                              => [ "hadoop-doc", "hadoop-hdfs-fuse", "hadoop-client" ],
> >      }
> >
> > -    include common-mapred-app
> > -
> >      package { $hadoop_client_packages:
> >        ensure => latest,
> >        require => [Package["jdk"], Package["hadoop"], Package["hadoop-hdfs"], Package["hadoop-mapreduce"]],
> > diff --git a/bigtop-deploy/puppet/modules/hadoop/templates/hadoop-env.sh b/bigtop-deploy/puppet/modules/hadoop/templates/hadoop-env.sh
> > index 6b28bdd..f2e355b 100644
> > --- a/bigtop-deploy/puppet/modules/hadoop/templates/hadoop-env.sh
> > +++ b/bigtop-deploy/puppet/modules/hadoop/templates/hadoop-env.sh
> > @@ -15,7 +15,7 @@
> >
> >  <% def shell_config(shell_var, *puppet_var)
> >       puppet_var = puppet_var[0] || shell_var.downcase
> > -     if has_variable? puppet_var
> > +     if scope.lookupvar(puppet_var)
> >        return "export #{shell_var}=#{scope.lookupvar(puppet_var)}"
> >      else
> >        return "#export #{shell_var}="
> > diff --git a/bigtop-deploy/puppet/modules/hadoop/templates/hdfs-site.xml b/bigtop-deploy/puppet/modules/hadoop/templates/hdfs-site.xml
> > index 351508d..11c59be 100644
> > --- a/bigtop-deploy/puppet/modules/hadoop/templates/hdfs-site.xml
> > +++ b/bigtop-deploy/puppet/modules/hadoop/templates/hdfs-site.xml
> > @@ -30,7 +30,7 @@
> >  <% end -%>
> >
> >    <property>
> > -    <name>dfs.federation.nameservices</name>
> > +    <name>dfs.nameservices</name>
> >      <value><%= @nameservice_id %></value>
> >    </property>
> >
> > @@ -47,7 +47,12 @@
> >
> >    <property>
> >      <name>dfs.namenode.http-address.<%= @nameservice_id %>.nn<%= idx+1 %></name>
> > -    <value><%= host %>:50070</value>
> > +    <value><%= host %>:<%= @hadoop_namenode_http_port %></value>
> > +  </property>
> > +
> > +  <property>
> > +    <name>dfs.namenode.https-address.<%= @nameservice_id %>.nn<%= idx+1 %></name>
> > +    <value><%= host %>:<%= @hadoop_namenode_https_port %></value>
> >    </property>
> >
> >  <% end -%>
> > @@ -249,7 +254,28 @@
> >  <% end -%>
> >    <property>
> >      <name>dfs.webhdfs.enabled</name>
> > -    <value>true</value>
> > +    <value><%= @hdfs_webhdfs_enabled %></value>
> > +  </property>
> > +
> > +<% if @hdfs_datanode_fsdataset_volume_choosing_policy -%>
> > +  <property>
> > +    <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
> > +    <value><%= @hdfs_datanode_fsdataset_volume_choosing_policy %></value>
> > +  </property>
> > +
> > +<% end -%>
> > +<% if @hdfs_replication -%>
> > +  <property>
> > +    <name>dfs.replication</name>
> > +    <value><%= @hdfs_replication %></value>
> > +  </property>
> > +
> > +<% end -%>
> > +<% if @journalnode_edits_dir -%>
> > +  <property>
> > +    <name>dfs.journalnode.edits.dir</name>
> > +    <value><%= @journalnode_edits_dir %></value>
> >    </property>
> >
> > +<% end -%>
> >  </configuration>
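
(One correction folded into the hadoop-env.sh hunk: the posted patch
replaced the deprecated "has_variable?" with "if @puppet_var", which tests a
literal instance variable named puppet_var rather than the variable whose
name is held in the local; "scope.lookupvar(puppet_var)" is what the export
line below it already uses, so I've made the check match.)

The new conditional properties above mean per-site hdfs tuning should become
a few lines of yaml, e.g. (common_hdfs key prefix assumed, since that class
isn't in this excerpt; the policy class is the stock Hadoop
AvailableSpaceVolumeChoosingPolicy):

    hadoop::common_hdfs::hdfs_replication: "3"
    hadoop::common_hdfs::hdfs_webhdfs_enabled: "true"
    hadoop::common_hdfs::hdfs_datanode_fsdataset_volume_choosing_policy:
      "org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy"
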
> > diff --git a/bigtop-deploy/puppet/modules/hadoop/templates/yarn-site.xml b/bigtop-deploy/puppet/modules/hadoop/templates/yarn-site.xml
> > index 0713d97..57ce85a 100644
> > --- a/bigtop-deploy/puppet/modules/hadoop/templates/yarn-site.xml
> > +++ b/bigtop-deploy/puppet/modules/hadoop/templates/yarn-site.xml
> > @@ -17,6 +17,7 @@
> >  -->
> >  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> >
> > +<% resourcemanager_hosts = Array(@hadoop_rm_host) -%>
> >  <configuration>
> >  <% if @hadoop_security_authentication == "kerberos" %>
> >    <!-- JobTracker security configs -->
> > @@ -61,6 +62,47 @@
> >      <value><%= @hadoop_ps_host %>:<%= @hadoop_ps_port %></value>
> >    </property>
> >
> > +<% if @yarn_resourcemanager_ha_enabled -%>
> > +
> > +  <property>
> > +    <name>yarn.resourcemanager.ha.enabled</name>
> > +    <value><%= @yarn_resourcemanager_ha_enabled %></value>
> > +  </property>
> > +
> > +  <property>
> > +    <name>yarn.resourcemanager.cluster-id</name>
> > +    <value><%= @yarn_resourcemanager_cluster_id %></value>
> > +  </property>
> > +
> > +  <property>
> > +    <name>yarn.resourcemanager.ha.rm-ids</name>
> > +    <value><%= (1..resourcemanager_hosts.length).map { |n| "rm#{n}" }.join(",") %></value>
> > +  </property>
> > +
> > +<% resourcemanager_hosts.each_with_index do |host,idx| -%>
> > +  <property>
> > +    <name>yarn.resourcemanager.resource-tracker.address.rm<%= idx+1 %></name>
> > +    <value><%= host %>:<%= @hadoop_rt_port %></value>
> > +  </property>
> > +
> > +  <property>
> > +    <name>yarn.resourcemanager.address.rm<%= idx+1 %></name>
> > +    <value><%= host %>:<%= @hadoop_rm_port %></value>
> > +  </property>
> > +
> > +  <property>
> > +    <name>yarn.resourcemanager.scheduler.address.rm<%= idx+1 %></name>
> > +    <value><%= host %>:<%= @hadoop_sc_port %></value>
> > +  </property>
> > +<% end -%>
> > +<% if @yarn_resourcemanager_zk_address -%>
> > +
> > +  <property>
> > +    <name>yarn.resourcemanager.zk-address</name>
> > +    <value><%= @yarn_resourcemanager_zk_address %></value>
> > +  </property>
> > +<% end -%>
> > +<% else -%>
> >    <property>
> >      <name>yarn.resourcemanager.resource-tracker.address</name>
> >      <value><%= @hadoop_rm_host %>:<%= @hadoop_rt_port %></value>
> > @@ -75,6 +117,7 @@
> >      <name>yarn.resourcemanager.scheduler.address</name>
> >      <value><%= @hadoop_rm_host %>:<%= @hadoop_sc_port %></value>
> >    </property>
> > +<% end -%>
> >
> >    <property>
> >      <name>yarn.nodemanager.aux-services</name>
> > @@ -125,4 +168,32 @@
> >        $HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*
> >      </value>
> >    </property>
> > +<% if @yarn_scheduler_minimum_allocation_mb -%>
> > +
> > +  <property>
> > +    <name>yarn.scheduler.minimum-allocation-mb</name>
> > +    <value><%= @yarn_scheduler_minimum_allocation_mb %></value>
> > +  </property>
> > +<% end -%>
> > +<% if @yarn_scheduler_maximum_allocation_mb -%>
> > +
> > +  <property>
> > +    <name>yarn.scheduler.maximum-allocation-mb</name>
> > +    <value><%= @yarn_scheduler_maximum_allocation_mb %></value>
> > +  </property>
> > +<% end -%>
> > +<% if @yarn_nodemanager_resource_memory_mb -%>
> > +
> > +  <property>
> > +    <name>yarn.nodemanager.resource.memory-mb</name>
> > +    <value><%= @yarn_nodemanager_resource_memory_mb %></value>
> > +  </property>
> > +<% end -%>
> > +<% if @yarn_resourcemanager_scheduler_class -%>
> > +
> > +  <property>
> > +    <name>yarn.resourcemanager.scheduler.class</name>
> > +    <value><%= @yarn_resourcemanager_scheduler_class %></value>
> > +  </property>
> > +<% end -%>
> >  </configuration>
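
RM HA looks pleasingly symmetric: hand hadoop_rm_host a list and the
template numbers the rm-ids itself. The common_yarn key prefix below is my
guess, since that class isn't in this excerpt, and the hosts are invented:

    hadoop::common_yarn::hadoop_rm_host:
      - rm01.example.com
      - rm02.example.com
    hadoop::common_yarn::yarn_resourcemanager_ha_enabled: "true"
    hadoop::common_yarn::yarn_resourcemanager_cluster_id: "bigtop-yarn"
    hadoop::common_yarn::yarn_resourcemanager_zk_address: "zk01.example.com:2181,zk02.example.com:2181"
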
> > diff --git a/bigtop-deploy/puppet/modules/hcatalog/manifests/init.pp b/bigtop-deploy/puppet/modules/hcatalog/manifests/init.pp
> > index f9c07aa..6585dd3 100644
> > --- a/bigtop-deploy/puppet/modules/hcatalog/manifests/init.pp
> > +++ b/bigtop-deploy/puppet/modules/hcatalog/manifests/init.pp
> > @@ -14,7 +14,7 @@
> >  # limitations under the License.
> >
> >  class hcatalog {
> > -  define server($port = "9083", $kerberos_realm = "") {
> > +  class server($port = "9083", $kerberos_realm = "") {
> >      package { "hcatalog-server":
> >        ensure => latest,
> >      }
> > @@ -33,7 +33,7 @@ class hcatalog {
> >    }
> >
> >    class webhcat {
> > -    define server($port = "50111", $kerberos_realm = "") {
> > +    class server($port = "50111", $kerberos_realm = "") {
> >        package { "webhcat-server":
> >          ensure => latest,
> >        }
> > diff --git a/bigtop-deploy/puppet/modules/hue/manifests/init.pp b/bigtop-deploy/puppet/modules/hue/manifests/init.pp
> > index f4c6f95..e5c7762 100644
> > --- a/bigtop-deploy/puppet/modules/hue/manifests/init.pp
> > +++ b/bigtop-deploy/puppet/modules/hue/manifests/init.pp
> > @@ -14,7 +14,7 @@
> >  # limitations under the License.
> >
> >  class hue {
> > -  define server($sqoop_url, $solr_url, $hbase_thrift_url,
> > +  class server($sqoop_url, $solr_url, $hbase_thrift_url,
> >        $webhdfs_url, $rm_host, $rm_port, $oozie_url, $rm_url, $rm_proxy_url, $history_server_url,
> >        $hue_host = "0.0.0.0", $hue_port = "8888", $default_fs = "hdfs://localhost:8020",
> >        $kerberos_realm = "") {
> > diff --git a/bigtop-deploy/puppet/modules/kerberos/manifests/init.pp b/bigtop-deploy/puppet/modules/kerberos/manifests/init.pp
> > index 5476235..dd83500 100644
> > --- a/bigtop-deploy/puppet/modules/kerberos/manifests/init.pp
> > +++ b/bigtop-deploy/puppet/modules/kerberos/manifests/init.pp
> > @@ -14,23 +14,12 @@
> >  # limitations under the License.
> >
> >  class kerberos {
> > -  class site {
> > -    # The following is our interface to the world. This is what we allow
> > -    # users to tweak from the outside (see tests/init.pp for a complete
> > -    # example) before instantiating target classes.
> > -    # Once we migrate to Puppet 2.6 we can potentially start using
> > -    # parametrized classes instead.
> > -    $domain     = $kerberos_domain ? { '' => inline_template('<%= domain %>'),
> > -                                       default => $kerberos_domain }
> > -    $realm      = $kerberos_realm ? { '' => inline_template('<%= domain.upcase %>'),
> > -                                      default => $kerberos_realm }
> > -    $kdc_server = $kerberos_kdc_server ? { '' => 'localhost',
> > -                                           default => $kerberos_kdc_server }
> > -    $kdc_port   = $kerberos_kdc_port ? { '' => '88',
> > -                                         default => $kerberos_kdc_port }
> > -    $admin_port = 749 /* BUG: linux daemon packaging doesn't let us tweak this */
> > -
> > -    $keytab_export_dir = "/var/lib/bigtop_keytabs"
> > +  class site ($domain = inline_template('<%= domain %>'),
> > +      $realm = inline_template('<%= domain.upcase %>'),
> > +      $kdc_server = 'localhost',
> > +      $kdc_port = '88',
> > +      $admin_port = 749,
> > +      $keytab_export_dir = "/var/lib/bigtop_keytabs") {
> >
> >      case $operatingsystem {
> >        'ubuntu': {
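
Nice cleanup of kerberos::site, too: the old global-variable selectors
become ordinary class parameters, so a realm override is just a pair of
hiera keys (example values):

    kerberos::site::realm: "EXAMPLE.COM"
    kerberos::site::kdc_server: "kdc01.example.com"
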
> > diff --git a/bigtop-deploy/puppet/modules/mahout/manifests/init.pp b/bigtop-deploy/puppet/modules/mahout/manifests/init.pp
> > index 9f10b17..0d9bd8c 100644
> > --- a/bigtop-deploy/puppet/modules/mahout/manifests/init.pp
> > +++ b/bigtop-deploy/puppet/modules/mahout/manifests/init.pp
> > @@ -14,7 +14,7 @@
> >  # limitations under the License.
> >
> >  class mahout {
> > -  define client {
> > +  class client {
> >      package { "mahout":
> >        ensure => latest,
> >        require => Package["hadoop"],
> > diff --git a/bigtop-deploy/puppet/modules/solr/manifests/init.pp b/bigtop-deploy/puppet/modules/solr/manifests/init.pp
> > index 22c4d9e..119fbd1 100644
> > --- a/bigtop-deploy/puppet/modules/solr/manifests/init.pp
> > +++ b/bigtop-deploy/puppet/modules/solr/manifests/init.pp
> > @@ -14,7 +14,7 @@
> >  # limitations under the License.
> >
> >  class solr {
> > -  define server($port = "1978", $port_admin = "1979", $zk = "localhost:2181", $root_url = "hdfs://localhost:8020/solr", $kerberos_realm = "") {
> > +  class server($port = "1978", $port_admin = "1979", $zk = "localhost:2181", $root_url = "hdfs://localhost:8020/solr", $kerberos_realm = "") {
> >      package { "solr-server":
> >        ensure => latest,
> >      }
> > diff --git a/bigtop-deploy/puppet/modules/spark/manifests/init.pp b/bigtop-deploy/puppet/modules/spark/manifests/init.pp
> > index 1281ff4..d7a9360 100644
> > --- a/bigtop-deploy/puppet/modules/spark/manifests/init.pp
> > +++ b/bigtop-deploy/puppet/modules/spark/manifests/init.pp
> > @@ -14,7 +14,7 @@
> >  # limitations under the License.
> >
> >  class spark {
> > -  class common {
> > +  class common ($master_host = $fqdn, $master_port = "7077", $master_ui_port = "18080") {
> >      package { "spark-core":
> >        ensure => latest,
> >      }
> > @@ -25,7 +25,7 @@ class spark {
> >      }
> >    }
> >
> > -  define master($master_host = $fqdn, $master_port = "7077", $master_ui_port = "18080") {
> > +  class master {
> >      include common
> >
> >      package { "spark-master":
> > @@ -43,7 +43,7 @@ class spark {
> >      }
> >    }
> >
> > -  define worker($master_host = $fqdn, $master_port = "7077", $master_ui_port = "18080") {
> > +  class worker {
> >      include common
> >
> >      package { "spark-worker":
> > diff --git a/bigtop-deploy/puppet/modules/tachyon/manifests/init.pp b/bigtop-deploy/puppet/modules/tachyon/manifests/init.pp
> > index 55fb34a..fe1a7b6 100644
> > --- a/bigtop-deploy/puppet/modules/tachyon/manifests/init.pp
> > +++ b/bigtop-deploy/puppet/modules/tachyon/manifests/init.pp
> > @@ -10,7 +10,7 @@
> >  # See the License for the specific language governing permissions and
> >  # limitations under the License.
> >  class tachyon {
> > -  class common {
> > +  class common ($master_host){
> >      package { "tachyon":
> >        ensure => latest,
> >      }
> > @@ -29,7 +29,7 @@ class tachyon {
> >      }
> >    }
> >
> > -  define master($master_host) {
> > +  class master {
> >      include common
> >
> >      exec {
> > @@ -49,10 +49,10 @@ class tachyon {
> >
> >    }
> >
> > -  define worker($master_host) {
> > +  class worker {
> >      include common
> >
> > -    if ( $fqdn == $master_host ) {
> > +    if ( $fqdn == $tachyon::common::master_host ) {
> >        notice("tachyon ---> master host")
> >        # We want master to run first in all cases
> >        Service["tachyon-master"] ~> Service["tachyon-worker"]
> > --
> > 2.1.4
> >
> >
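
Overall the define-to-class conversion reads well to me. One last example of
what it buys us: pointing spark and tachyon workers at their masters no
longer needs resource parameters threaded through cluster.pp, just hiera
keys derived from the parametrised common classes above (host names
invented):

    spark::common::master_host: "spark01.example.com"
    tachyon::common::master_host: "tachyon01.example.com"
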
