BIGTOP-2435 Add Juju charms for hadoop component Signed-off-by: Konstantin Boudnik <[email protected]>
Project: http://git-wip-us.apache.org/repos/asf/bigtop/repo Commit: http://git-wip-us.apache.org/repos/asf/bigtop/commit/d639645e Tree: http://git-wip-us.apache.org/repos/asf/bigtop/tree/d639645e Diff: http://git-wip-us.apache.org/repos/asf/bigtop/diff/d639645e Branch: refs/heads/BIGTOP-2253 Commit: d639645e7726577e5898f5da3722c6be65b2c9bd Parents: f2d91af Author: Kevin W Monroe <[email protected]> Authored: Mon May 9 16:05:57 2016 -0700 Committer: Konstantin Boudnik <[email protected]> Committed: Fri Jun 3 15:06:13 2016 -0700 ---------------------------------------------------------------------- NOTICE | 3 +- bigtop-packages/src/charm/README.md | 75 +++++++ .../hadoop/layer-hadoop-namenode/.gitignore | 4 + .../hadoop/layer-hadoop-namenode/README.md | 106 +++++++++ .../hadoop/layer-hadoop-namenode/actions.yaml | 2 + .../layer-hadoop-namenode/actions/smoke-test | 62 +++++ .../hadoop/layer-hadoop-namenode/copyright | 16 ++ .../hadoop/layer-hadoop-namenode/layer.yaml | 25 +++ .../hadoop/layer-hadoop-namenode/metadata.yaml | 17 ++ .../layer-hadoop-namenode/reactive/namenode.py | 206 +++++++++++++++++ .../tests/01-basic-deployment.py | 39 ++++ .../layer-hadoop-namenode/tests/tests.yaml | 3 + .../hadoop/layer-hadoop-namenode/wheelhouse.txt | 1 + .../charm/hadoop/layer-hadoop-plugin/README.md | 92 ++++++++ .../charm/hadoop/layer-hadoop-plugin/copyright | 16 ++ .../charm/hadoop/layer-hadoop-plugin/layer.yaml | 8 + .../hadoop/layer-hadoop-plugin/metadata.yaml | 20 ++ .../reactive/apache_bigtop_plugin.py | 149 ++++++++++++ .../tests/01-basic-deployment.py | 46 ++++ .../hadoop/layer-hadoop-plugin/tests/tests.yaml | 3 + .../layer-hadoop-resourcemanager/.gitignore | 4 + .../layer-hadoop-resourcemanager/README.md | 155 +++++++++++++ .../layer-hadoop-resourcemanager/actions.yaml | 134 +++++++++++ .../actions/mrbench | 75 +++++++ .../actions/nnbench | 76 +++++++ .../actions/parseNNBench.py | 45 ++++ .../actions/parseTerasort.py | 45 ++++ .../actions/smoke-test | 80 +++++++ 
.../actions/teragen | 57 +++++ .../actions/terasort | 80 +++++++ .../actions/testdfsio | 76 +++++++ .../layer-hadoop-resourcemanager/copyright | 16 ++ .../layer-hadoop-resourcemanager/layer.yaml | 27 +++ .../layer-hadoop-resourcemanager/metadata.yaml | 19 ++ .../reactive/resourcemanager.py | 225 +++++++++++++++++++ .../tests/01-basic-deployment.py | 39 ++++ .../tests/tests.yaml | 3 + .../layer-hadoop-resourcemanager/wheelhouse.txt | 1 + .../charm/hadoop/layer-hadoop-slave/README.md | 116 ++++++++++ .../charm/hadoop/layer-hadoop-slave/copyright | 16 ++ .../charm/hadoop/layer-hadoop-slave/layer.yaml | 2 + .../hadoop/layer-hadoop-slave/metadata.yaml | 8 + .../reactive/hadoop_status.py | 55 +++++ .../tests/01-basic-deployment.py | 39 ++++ .../hadoop/layer-hadoop-slave/tests/tests.yaml | 3 + build.gradle | 3 + pom.xml | 3 + 47 files changed, 2294 insertions(+), 1 deletion(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/NOTICE ---------------------------------------------------------------------- diff --git a/NOTICE b/NOTICE index 76730a9..dc7c948 100644 --- a/NOTICE +++ b/NOTICE @@ -1,5 +1,6 @@ Apache Bigtop Copyright 2014, The Apache Software Foundation +Portions Copyright 2015-2016 Canonical Ltd. This product includes software developed at The Apache Software Foundation (http://www.apache.org/). 
@@ -10,4 +11,4 @@ In addition, this product includes files licensed under: https://www.freebsd.org/copyright/freebsd-doc-license.html * The MIT License - https://opensource.org/licenses/MIT \ No newline at end of file + https://opensource.org/licenses/MIT http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/README.md ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/README.md b/bigtop-packages/src/charm/README.md new file mode 100644 index 0000000..1290d4b --- /dev/null +++ b/bigtop-packages/src/charm/README.md @@ -0,0 +1,75 @@ +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--> +# Juju Charms for Deploying Bigtop + +## Overview + +These are the charm layers used to build Juju charms for deploying Bigtop +components. The charms are also published to the [Juju charm store][] and +can be deployed directly from there using [bundles][], or they can be +built from these layers and deployed locally. + +Charms allow you to deploy, configure, and connect an Apache Bigtop cluster +on any supported cloud, which can be easily scaled to meet workload demands. 
+You can also easily connect other, non-Bigtop components from the +[Juju charm store][] that support common interfaces. + + +[Juju charm store]: https://jujucharms.com/ +[bundles]: https://jujucharms.com/u/bigdata-dev/hadoop-processing + + +## Building the Bigtop Charms + +To build these charms, you will need [charm-tools][]. You should also read +over the developer [Getting Started][] page for an overview of charms and +building them. Then, in any of the charm layer directories, use `charm build`. +For example: + + export JUJU_REPOSITORY=$HOME/charms + mkdir $HOME/charms + + cd bigtop-packages/src/charm/hadoop/layer-hadoop-namenode + charm build + +This will build the NameNode charm, pulling in the appropriate base and +interface layers from [interfaces.juju.solutions][]. You can get local copies +of those layers as well using `charm pull-source`: + + export LAYER_PATH=$HOME/layers + export INTERFACE_PATH=$HOME/interfaces + mkdir $HOME/{layers,interfaces} + + charm pull-source layer:apache-bigtop-base + charm pull-source interface:dfs + +You can then deploy the locally built charms individually: + + juju deploy local:trusty/hadoop-namenode + +You can also use the local version of a bundle: + + juju deploy bigtop-deploy/juju/hadoop-processing/bundle-local.yaml + +> Note: With Juju versions < 2.0, you will need to use [juju-deployer][] to +deploy the local bundle. 
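`charm build` resolves the base and interface layers named in the target layer's `layer.yaml`. As a point of reference, the namenode layer in this commit declares the following includes (abridged from its `layer.yaml`; the `repo` and `options` keys are omitted here):

```yaml
includes:
  - 'layer:apache-bigtop-base'
  - 'interface:dfs'
  - 'interface:dfs-slave'
  - 'interface:benchmark'
```

Each `interface:` entry is fetched from interfaces.juju.solutions (or `INTERFACE_PATH`) at build time, which is why pulling local copies with `charm pull-source` speeds up iterative builds.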
+ + +[charm-tools]: https://jujucharms.com/docs/stable/tools-charm-tools +[Getting Started]: https://jujucharms.com/docs/devel/developer-getting-started +[interfaces.juju.solutions]: http://interfaces.juju.solutions/ +[juju-deployer]: https://pypi.python.org/pypi/juju-deployer/ http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/.gitignore ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/.gitignore b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/.gitignore new file mode 100644 index 0000000..749ccda --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/.gitignore @@ -0,0 +1,4 @@ +# Byte-compiled / optimized / DLL files +__pycache__/ +*.py[cod] +*$py.class http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/README.md ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/README.md b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/README.md new file mode 100644 index 0000000..bf46bf7 --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/README.md @@ -0,0 +1,106 @@ +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. 
You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--> +## Overview + +The Apache Hadoop software library is a framework that allows for the +distributed processing of large data sets across clusters of computers +using a simple programming model. + +This charm deploys the NameNode component of the Apache Bigtop platform +to provide HDFS master resources. + + +## Usage + +This charm is intended to be deployed via one of the +[apache bigtop bundles](https://jujucharms.com/u/bigdata-dev/#bundles). +For example: + + juju deploy hadoop-processing + +> Note: With Juju versions < 2.0, you will need to use [juju-deployer][] to +deploy the bundle. + +This will deploy the Apache Bigtop platform with a workload node +preconfigured to work with the cluster. + +You can also manually load and run map-reduce jobs via the plugin charm +included in the bundles linked above: + + juju scp my-job.jar plugin/0: + juju ssh plugin/0 + hadoop jar my-job.jar + + +[juju-deployer]: https://pypi.python.org/pypi/juju-deployer/ + + +## Status and Smoke Test + +Apache Bigtop charms provide extended status reporting to indicate when they +are ready: + + juju status --format=tabular + +This is particularly useful when combined with `watch` to track the on-going +progress of the deployment: + + watch -n 0.5 juju status --format=tabular + +The message for each unit will provide information about that unit's state. +Once they all indicate that they are ready, you can perform a "smoke test" +to verify HDFS or YARN services are working as expected. 
Trigger the +`smoke-test` action by: + + juju action do namenode/0 smoke-test + juju action do resourcemanager/0 smoke-test + +After a few seconds or so, you can check the results of the smoke test: + + juju action status + +You will see `status: completed` if the smoke test was successful, or +`status: failed` if it was not. You can get more information on why it failed +via: + + juju action fetch <action-id> + + +## Deploying in Network-Restricted Environments + +Charms can be deployed in environments with limited network access. To deploy +in this environment, you will need a local mirror to serve required packages. + + +### Mirroring Packages + +You can setup a local mirror for apt packages using squid-deb-proxy. +For instructions on configuring juju to use this, see the +[Juju Proxy Documentation](https://juju.ubuntu.com/docs/howto-proxies.html). + + +## Contact Information + +- <[email protected]> + + +## Hadoop + +- [Apache Bigtop](http://bigtop.apache.org/) home page +- [Apache Bigtop issue tracking](http://bigtop.apache.org/issue-tracking.html) +- [Apache Bigtop mailing lists](http://bigtop.apache.org/mail-lists.html) +- [Apache Bigtop charms](https://jujucharms.com/q/apache/bigtop) http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/actions.yaml ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/actions.yaml b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/actions.yaml new file mode 100644 index 0000000..ee93b4c --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/actions.yaml @@ -0,0 +1,2 @@ +smoke-test: + description: Verify that HDFS is working by creating and removing a test directory. 
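The `smoke-test` action defined above round-trips a test directory through HDFS and scans `hdfs dfs -ls` output for the expected owner and permission bits. That core check can be sketched as a pure function (a hypothetical helper for illustration only, not part of the charm):

```python
def listing_ok(ls_output, path, owner='ubuntu', mode='drwxrwxrwx'):
    """Return True if `path` appears in `hdfs dfs -ls` output with the
    expected owner and permission string, mirroring the smoke-test checks."""
    for line in ls_output.split('\n'):
        if path in line:
            # found the entry; both owner and mode must match
            return owner in line and mode in line
    # path never appeared in the listing
    return False
```

In the action itself, a False result maps to `hookenv.action_fail(...)` followed by an early exit.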
http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/actions/smoke-test ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/actions/smoke-test b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/actions/smoke-test new file mode 100755 index 0000000..58ffce2 --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/actions/smoke-test @@ -0,0 +1,63 @@ +#!/usr/bin/env python3 + +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import sys + +from charmhelpers.core import hookenv +from jujubigdata.utils import run_as +from charms.reactive import is_state + +if not is_state('apache-bigtop-namenode.ready'): + hookenv.action_fail('NameNode service not yet ready') + sys.exit() + + +# verify the hdfs-test directory does not already exist +output = run_as('ubuntu', 'hdfs', 'dfs', '-ls', '/tmp', capture_output=True) +if '/tmp/hdfs-test' in output: + run_as('ubuntu', 'hdfs', 'dfs', '-rm', '-R', '/tmp/hdfs-test') + output = run_as('ubuntu', 'hdfs', 'dfs', '-ls', '/tmp', capture_output=True) + if 'hdfs-test' in output: + hookenv.action_fail('Unable to remove existing hdfs-test directory') + sys.exit() + +# create the directory +run_as('ubuntu', 'hdfs', 'dfs', '-mkdir', '-p', '/tmp/hdfs-test') +run_as('ubuntu', 'hdfs', 'dfs', '-chmod', '-R', '777', '/tmp/hdfs-test') + +# verify the newly created hdfs-test subdirectory exists +output = run_as('ubuntu', 'hdfs', 'dfs', '-ls', '/tmp', capture_output=True) +for line in output.split('\n'): + if '/tmp/hdfs-test' in line: + if 'ubuntu' not in line or 'drwxrwxrwx' not in line: + hookenv.action_fail('Permissions incorrect for hdfs-test directory') + sys.exit() + break +else: + hookenv.action_fail('Unable to create hdfs-test directory') + sys.exit() + +# remove the directory +run_as('ubuntu', 'hdfs', 'dfs', '-rm', '-R', '/tmp/hdfs-test') + +# verify the hdfs-test subdirectory has been removed +output = run_as('ubuntu', 'hdfs', 'dfs', '-ls', '/tmp', capture_output=True) +if '/tmp/hdfs-test' in output: + hookenv.action_fail('Unable to remove hdfs-test directory') + sys.exit() + +hookenv.action_set({'outcome': 'success'}) http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/copyright ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/copyright b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/copyright new file mode 
100644 index 0000000..52de50a --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/copyright @@ -0,0 +1,16 @@ +Format: http://dep.debian.net/deps/dep5/ + +Files: * +Copyright: Copyright 2015, Canonical Ltd., All Rights Reserved, The Apache Software Foundation +License: Apache License 2.0 + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + . + http://www.apache.org/licenses/LICENSE-2.0 + . + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/layer.yaml ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/layer.yaml b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/layer.yaml new file mode 100644 index 0000000..332a6e3 --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/layer.yaml @@ -0,0 +1,25 @@ +repo: [email protected]:juju-solutions/layer-hadoop-namenode.git +includes: + - 'layer:apache-bigtop-base' + - 'interface:dfs' + - 'interface:dfs-slave' + - 'interface:benchmark' +options: + apache-bigtop-base: + groups: + - 'mapred' + - 'yarn' + users: + mapred: + groups: ['hadoop', 'mapred'] + ubuntu: + groups: ['hadoop', 'mapred'] + yarn: + groups: ['hadoop', 'yarn'] + ports: + namenode: + port: 8020 + exposed_on: 'namenode' + nn_webapp_http: + port: 50070 + exposed_on: 'namenode' http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/metadata.yaml 
---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/metadata.yaml b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/metadata.yaml new file mode 100644 index 0000000..ab51ce4 --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/metadata.yaml @@ -0,0 +1,17 @@ +name: hadoop-namenode +summary: HDFS master (NameNode) for Apache Bigtop platform +maintainer: Juju Big Data <[email protected]> +description: > + Hadoop is a software platform that lets one easily write and + run applications that process vast amounts of data. + + This charm manages the HDFS master node (NameNode). +tags: ["applications", "bigdata", "bigtop", "hadoop", "apache"] +provides: + namenode: + interface: dfs + benchmark: + interface: benchmark +requires: + datanode: + interface: dfs-slave http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/reactive/namenode.py ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/reactive/namenode.py b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/reactive/namenode.py new file mode 100644 index 0000000..c39a609 --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/reactive/namenode.py @@ -0,0 +1,206 @@ + +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. 
You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from charms.reactive import is_state, remove_state, set_state, when, when_not +from charms.layer.apache_bigtop_base import Bigtop, get_layer_opts, get_fqdn +from charmhelpers.core import hookenv, host +from jujubigdata import utils +from path import Path + + +############################################################################### +# Utility methods +############################################################################### +def send_early_install_info(remote): + """Send clients/slaves enough relation data to start their install. + + If slaves or clients join before the namenode is installed, we can still provide enough + info to start their installation. This will help parallelize installation across the + cluster. + + Note that slaves can safely install early, but should not start until the + 'namenode.ready' state is set by the dfs-slave interface. 
+ """ + fqdn = get_fqdn() + hdfs_port = get_layer_opts().port('namenode') + webhdfs_port = get_layer_opts().port('nn_webapp_http') + + remote.send_namenodes([fqdn]) + remote.send_ports(hdfs_port, webhdfs_port) + + +############################################################################### +# Core methods +############################################################################### +@when('bigtop.available') +@when_not('apache-bigtop-namenode.installed') +def install_namenode(): + hookenv.status_set('maintenance', 'installing namenode') + bigtop = Bigtop() + bigtop.render_site_yaml( + hosts={ + 'namenode': get_fqdn(), + }, + roles=[ + 'namenode', + 'mapred-app', + ], + ) + bigtop.trigger_puppet() + + # /etc/hosts entries from the KV are not currently used for bigtop, + # but a hosts_map attribute is required by some interfaces (eg: dfs-slave) + # to signify NN's readiness. Set our NN info in the KV to fulfill this + # requirement. + utils.initialize_kv_host() + + # make our namenode listen on all interfaces + hdfs_site = Path('/etc/hadoop/conf/hdfs-site.xml') + with utils.xmlpropmap_edit_in_place(hdfs_site) as props: + props['dfs.namenode.rpc-bind-host'] = '0.0.0.0' + props['dfs.namenode.servicerpc-bind-host'] = '0.0.0.0' + props['dfs.namenode.http-bind-host'] = '0.0.0.0' + props['dfs.namenode.https-bind-host'] = '0.0.0.0' + + # We need to create the 'mapred' user/group since we are not installing + # hadoop-mapreduce. This is needed so the namenode can access yarn + # job history files in hdfs. Also add our ubuntu user to the hadoop + # and mapred groups. 
+ get_layer_opts().add_users() + + set_state('apache-bigtop-namenode.installed') + hookenv.status_set('maintenance', 'namenode installed') + + +@when('apache-bigtop-namenode.installed') +@when_not('apache-bigtop-namenode.started') +def start_namenode(): + hookenv.status_set('maintenance', 'starting namenode') + # NB: service should be started by install, but this may be handy in case + # we have something that removes the .started state in the future. Also + # note we restart here in case we modify conf between install and now. + host.service_restart('hadoop-hdfs-namenode') + for port in get_layer_opts().exposed_ports('namenode'): + hookenv.open_port(port) + set_state('apache-bigtop-namenode.started') + hookenv.status_set('maintenance', 'namenode started') + + +############################################################################### +# Slave methods +############################################################################### +@when('datanode.joined') +@when_not('apache-bigtop-namenode.installed') +def send_dn_install_info(datanode): + """Send datanodes enough relation data to start their install.""" + send_early_install_info(datanode) + + +@when('apache-bigtop-namenode.started', 'datanode.joined') +def send_dn_all_info(datanode): + """Send datanodes all dfs-slave relation data. + + At this point, the namenode is ready to serve datanodes. Send all + dfs-slave relation data so that our 'namenode.ready' state becomes set. + """ + bigtop = Bigtop() + fqdn = get_fqdn() + hdfs_port = get_layer_opts().port('namenode') + webhdfs_port = get_layer_opts().port('nn_webapp_http') + + datanode.send_spec(bigtop.spec()) + datanode.send_namenodes([fqdn]) + datanode.send_ports(hdfs_port, webhdfs_port) + + # hosts_map, ssh_key, and clustername are required by the dfs-slave + # interface to signify NN's readiness. Send them, even though they are not + # utilized by bigtop. 
+ # NB: update KV hosts with all datanodes prior to sending the hosts_map + # because dfs-slave gates readiness on a DN's presence in the hosts_map. + utils.update_kv_hosts(datanode.hosts_map()) + datanode.send_hosts_map(utils.get_kv_hosts()) + datanode.send_ssh_key('invalid') + datanode.send_clustername(hookenv.service_name()) + + # update status with slave count and report ready for hdfs + num_slaves = len(datanode.nodes()) + hookenv.status_set('active', 'ready ({count} datanode{s})'.format( + count=num_slaves, + s='s' if num_slaves > 1 else '', + )) + set_state('apache-bigtop-namenode.ready') + + +@when('apache-bigtop-namenode.started', 'datanode.departing') +def remove_dn(datanode): + """Handle a departing datanode. + + This simply logs a message about a departing datanode and removes + the entry from our KV hosts_map. The hosts_map is not used by bigtop, but + it is required for the 'namenode.ready' state, so we may as well keep it + accurate. + """ + slaves_leaving = datanode.nodes() # only returns nodes in "departing" state + hookenv.log('Datanodes leaving: {}'.format(slaves_leaving)) + utils.remove_kv_hosts(slaves_leaving) + datanode.dismiss() + + +@when('apache-bigtop-namenode.started') +@when_not('datanode.joined') +def wait_for_dn(): + remove_state('apache-bigtop-namenode.ready') + # NB: we're still active since a user may be interested in our web UI + # without any DNs, but let them know hdfs is kaput without a DN relation. 
+ hookenv.status_set('active', 'hdfs requires a datanode relation') + + +############################################################################### +# Client methods +############################################################################### +@when('namenode.clients') +@when_not('apache-bigtop-namenode.installed') +def send_client_install_info(client): + """Send clients enough relation data to start their install.""" + send_early_install_info(client) + + +@when('apache-bigtop-namenode.started', 'namenode.clients') +def send_client_all_info(client): + """Send clients (plugin, RM, non-DNs) all dfs relation data. + + At this point, the namenode is ready to serve clients. Send all + dfs relation data so that our 'namenode.ready' state becomes set. + """ + bigtop = Bigtop() + fqdn = get_fqdn() + hdfs_port = get_layer_opts().port('namenode') + webhdfs_port = get_layer_opts().port('nn_webapp_http') + + client.send_spec(bigtop.spec()) + client.send_namenodes([fqdn]) + client.send_ports(hdfs_port, webhdfs_port) + # namenode.ready implies we have at least 1 datanode, which means hdfs + # is ready for use. Inform clients of that with send_ready(). + if is_state('apache-bigtop-namenode.ready'): + client.send_ready(True) + else: + client.send_ready(False) + + # hosts_map and clustername are required by the dfs interface to signify + # NN's readiness. Send it, even though they are not utilized by bigtop. 
+ client.send_hosts_map(utils.get_kv_hosts()) + client.send_clustername(hookenv.service_name()) http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/tests/01-basic-deployment.py ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/tests/01-basic-deployment.py b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/tests/01-basic-deployment.py new file mode 100755 index 0000000..15c00c9 --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/tests/01-basic-deployment.py @@ -0,0 +1,39 @@ +#!/usr/bin/env python3 + +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import unittest +import amulet + + +class TestDeploy(unittest.TestCase): + """ + Trivial deployment test for Apache Hadoop NameNode. + + This charm cannot do anything useful by itself, so integration testing + is done in the bundle. 
+ """ + + def test_deploy(self): + self.d = amulet.Deployment(series='trusty') + self.d.add('namenode', 'hadoop-namenode') + self.d.setup(timeout=900) + self.d.sentry.wait(timeout=1800) + self.unit = self.d.sentry['namenode'][0] + + +if __name__ == '__main__': + unittest.main() http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/tests/tests.yaml ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/tests/tests.yaml b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/tests/tests.yaml new file mode 100644 index 0000000..3b6ce3e --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/tests/tests.yaml @@ -0,0 +1,3 @@ +reset: false +packages: + - amulet http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/wheelhouse.txt ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/wheelhouse.txt b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/wheelhouse.txt new file mode 100644 index 0000000..183242f --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-namenode/wheelhouse.txt @@ -0,0 +1 @@ +charms.benchmark>=1.0.0,<2.0.0 http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/README.md ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/README.md b/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/README.md new file mode 100644 index 0000000..cbea7f0 --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/README.md @@ -0,0 +1,92 @@ +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. 
See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--> +## Overview + +The Apache Hadoop software library is a framework that allows for the +distributed processing of large data sets across clusters of computers +using a simple programming model. + +This charm facilitates communication between core Apache Bigtop cluster +components and workload charms. + + +## Usage + +This charm is intended to be deployed via one of the +[apache bigtop bundles](https://jujucharms.com/u/bigdata-dev/#bundles). +For example: + + juju deploy hadoop-processing + +> Note: With Juju versions < 2.0, you will need to use [juju-deployer][] to +deploy the bundle. + +This will deploy the Apache Bigtop platform with a workload node +preconfigured to work with the cluster. + +You could extend this deployment, for example, to analyze data using Apache Pig. 
+Simply deploy Pig and attach it to the same plugin: + + juju deploy apache-pig pig + juju add-relation plugin pig + + +[juju-deployer]: https://pypi.python.org/pypi/juju-deployer/ + + +## Status and Smoke Test + +Apache Bigtop charms provide extended status reporting to indicate when they +are ready: + + juju status --format=tabular + +This is particularly useful when combined with `watch` to track the on-going +progress of the deployment: + + watch -n 0.5 juju status --format=tabular + +The message for each unit will provide information about that unit's state. +Once they all indicate that they are ready, you can perform a "smoke test" +to verify HDFS or YARN services are working as expected. Trigger the +`smoke-test` action by: + + juju action do namenode/0 smoke-test + juju action do resourcemanager/0 smoke-test + +After a few seconds or so, you can check the results of the smoke test: + + juju action status + +You will see `status: completed` if the smoke test was successful, or +`status: failed` if it was not. 
You can get more information on why it failed +via: + + juju action fetch <action-id> + + +## Contact Information + +- <[email protected]> + + +## Resources + +- [Apache Bigtop](http://bigtop.apache.org/) home page +- [Apache Bigtop issue tracking](http://bigtop.apache.org/issue-tracking.html) +- [Apache Bigtop mailing lists](http://bigtop.apache.org/mail-lists.html) +- [Apache Bigtop charms](https://jujucharms.com/q/apache/bigtop) http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/copyright ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/copyright b/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/copyright new file mode 100644 index 0000000..52de50a --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/copyright @@ -0,0 +1,16 @@ +Format: http://dep.debian.net/deps/dep5/ + +Files: * +Copyright: Copyright 2015, Canonical Ltd., All Rights Reserved, The Apache Software Foundation +License: Apache License 2.0 + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + . + http://www.apache.org/licenses/LICENSE-2.0 + . + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
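Editor's note: the benchmark parse helpers added later in this commit (parseNNBench.py, parseTerasort.py) share one piece of logic worth seeing in isolation: they scan the tab-indented `key=value` counter lines that the MapReduce jars print and repack them as the JSON blob surfaced by `juju action fetch` under `results.raw`. A minimal standalone sketch of that logic, with the `charmhelpers.core.hookenv.action_set` call replaced by a plain `print` so it runs outside a charm hook:

```python
import json
import re


def parse_counters(lines):
    """Extract tab-indented 'key=value' counter lines (as emitted by the
    Hadoop benchmark jars) into a dict, using the same regex as the charm's
    parse scripts."""
    results = {}
    regex = re.compile(r'\t+(.*)=(.*)')
    for line in lines:
        m = regex.match(line)
        if m:
            results[m.group(1)] = m.group(2)
    return results


# Hypothetical sample of benchmark output; lines without a leading tab
# (job progress, log noise) are ignored by the regex.
sample = [
    "\tMap input records=12\n",
    "\tReduce output records=8\n",
    "INFO mapreduce.Job: Job job_123 completed successfully\n",
]
print(json.dumps(parse_counters(sample)))
```

In the real actions this JSON string is handed to `hookenv.action_set({"results.raw": ...})`, which is what makes it appear in the `juju action fetch` output shown above.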
http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/layer.yaml ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/layer.yaml b/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/layer.yaml new file mode 100644 index 0000000..5ddc2c9 --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/layer.yaml @@ -0,0 +1,8 @@ +repo: [email protected]:juju-solutions/layer-hadoop-plugin.git +includes: ['layer:apache-bigtop-base', 'interface:hadoop-plugin', 'interface:dfs', 'interface:mapred'] +options: + basic: + use_venv: true +metadata: + deletes: + - requires.java http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/metadata.yaml ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/metadata.yaml b/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/metadata.yaml new file mode 100644 index 0000000..a5fd453 --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/metadata.yaml @@ -0,0 +1,20 @@ +name: hadoop-plugin +summary: Simplified connection point for Apache Bigtop platform +maintainer: Juju Big Data <[email protected]> +description: > + Hadoop is a software platform that lets one easily write and + run applications that process vast amounts of data. + + This charm provides a simplified connection point for client / workload + services which require access to Apache Hadoop. This connection is established + via the Apache Bigtop gateway. 
+tags: ["applications", "bigdata", "hadoop", "apache"] +subordinate: true +requires: + namenode: + interface: dfs + resourcemanager: + interface: mapred + hadoop-plugin: + interface: hadoop-plugin + scope: container http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/reactive/apache_bigtop_plugin.py ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/reactive/apache_bigtop_plugin.py b/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/reactive/apache_bigtop_plugin.py new file mode 100644 index 0000000..e5b1275 --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/reactive/apache_bigtop_plugin.py @@ -0,0 +1,149 @@ + +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from charms.reactive import is_state, remove_state, set_state, when, when_any, when_none, when_not +from charmhelpers.core import hookenv +from charms.layer.apache_bigtop_base import Bigtop, get_hadoop_version + + +@when('hadoop-plugin.joined') +@when_not('namenode.joined') +def blocked(principal): + hookenv.status_set('blocked', 'missing required namenode relation') + + +@when('bigtop.available', 'hadoop-plugin.joined', 'namenode.joined') +@when_not('apache-bigtop-plugin.hdfs.installed') +def install_hadoop_client_hdfs(principal, namenode): + """Install if the namenode has sent its FQDN. + + We only need the namenode FQDN to perform the plugin install, so poll for + namenodes() data whenever we have a namenode relation. This allows us to + install asap, even if 'namenode.ready' is not set yet. + """ + if namenode.namenodes(): + hookenv.status_set('maintenance', 'installing plugin (hdfs)') + nn_host = namenode.namenodes()[0] + bigtop = Bigtop() + hosts = {'namenode': nn_host} + bigtop.render_site_yaml(hosts=hosts, roles='hadoop-client') + bigtop.trigger_puppet() + set_state('apache-bigtop-plugin.hdfs.installed') + hookenv.status_set('maintenance', 'plugin (hdfs) installed') + else: + hookenv.status_set('waiting', 'waiting for namenode fqdn') + + +@when('apache-bigtop-plugin.hdfs.installed') +@when('hadoop-plugin.joined', 'namenode.joined') +@when_not('namenode.ready') +def send_nn_spec(principal, namenode): + """Send our plugin spec so the namenode can become ready.""" + bigtop = Bigtop() + # Send plugin spec (must match NN spec for 'namenode.ready' to be set) + namenode.set_local_spec(bigtop.spec()) + + +@when('apache-bigtop-plugin.hdfs.installed') +@when('hadoop-plugin.joined', 'namenode.ready') +@when_not('apache-bigtop-plugin.hdfs.ready') +def send_principal_hdfs_info(principal, namenode): + """Send HDFS data when the namenode becomes ready.""" + principal.set_installed(get_hadoop_version()) + principal.set_hdfs_ready(namenode.namenodes(), namenode.port()) + 
set_state('apache-bigtop-plugin.hdfs.ready') + + +@when('apache-bigtop-plugin.hdfs.ready') +@when('hadoop-plugin.joined') +@when_not('namenode.ready') +def clear_hdfs_ready(principal): + principal.clear_hdfs_ready() + remove_state('apache-bigtop-plugin.hdfs.ready') + remove_state('apache-bigtop-plugin.hdfs.installed') + + +@when('bigtop.available', 'hadoop-plugin.joined', 'namenode.joined', 'resourcemanager.joined') +@when_not('apache-bigtop-plugin.yarn.installed') +def install_hadoop_client_yarn(principal, namenode, resourcemanager): + if namenode.namenodes() and resourcemanager.resourcemanagers(): + hookenv.status_set('maintenance', 'installing plugin (yarn)') + nn_host = namenode.namenodes()[0] + rm_host = resourcemanager.resourcemanagers()[0] + bigtop = Bigtop() + hosts = {'namenode': nn_host, 'resourcemanager': rm_host} + bigtop.render_site_yaml(hosts=hosts, roles='hadoop-client') + bigtop.trigger_puppet() + set_state('apache-bigtop-plugin.yarn.installed') + hookenv.status_set('active', 'ready (HDFS & YARN)') + else: + hookenv.status_set('waiting', 'waiting for master fqdns') + + +@when('apache-bigtop-plugin.yarn.installed') +@when('hadoop-plugin.joined', 'resourcemanager.joined') +@when_not('resourcemanager.ready') +def send_rm_spec(principal, resourcemanager): + """Send our plugin spec so the resourcemanager can become ready.""" + bigtop = Bigtop() + resourcemanager.set_local_spec(bigtop.spec()) + + +@when('apache-bigtop-plugin.yarn.installed') +@when('hadoop-plugin.joined', 'resourcemanager.ready') +@when_not('apache-bigtop-plugin.yarn.ready') +def send_principal_yarn_info(principal, resourcemanager): + """Send YARN data when the resourcemanager becomes ready.""" + version = get_hadoop_version() + principal.set_installed(version) + principal.set_yarn_ready( + resourcemanager.resourcemanagers(), resourcemanager.port(), + resourcemanager.hs_http(), resourcemanager.hs_ipc()) + set_state('apache-bigtop-plugin.yarn.ready') + + 
+@when('apache-bigtop-plugin.yarn.ready') +@when('hadoop-plugin.joined') +@when_not('resourcemanager.ready') +def clear_yarn_ready(principal): + principal.clear_yarn_ready() + remove_state('apache-bigtop-plugin.yarn.ready') + remove_state('apache-bigtop-plugin.yarn.installed') + + +@when_any('apache-bigtop-plugin.hdfs.installed', 'apache-bigtop-plugin.yarn.installed') +@when('hadoop-plugin.joined') +@when_none('namenode.spec.mismatch', 'resourcemanager.spec.mismatch') +def update_status(principal): + hdfs_rel = is_state('namenode.joined') + yarn_rel = is_state('resourcemanager.joined') + hdfs_ready = is_state('namenode.ready') + yarn_ready = is_state('resourcemanager.ready') + + if not (hdfs_rel or yarn_rel): + hookenv.status_set('blocked', + 'missing namenode and/or resourcemanager relation') + elif hdfs_rel and not hdfs_ready: + hookenv.status_set('waiting', 'waiting for hdfs') + elif yarn_rel and not yarn_ready: + hookenv.status_set('waiting', 'waiting for yarn') + else: + ready = [] + if hdfs_ready: + ready.append('hdfs') + if yarn_ready: + ready.append('yarn') + hookenv.status_set('active', 'ready ({})'.format(' & '.join(ready))) http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/tests/01-basic-deployment.py ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/tests/01-basic-deployment.py b/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/tests/01-basic-deployment.py new file mode 100755 index 0000000..512630d --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/tests/01-basic-deployment.py @@ -0,0 +1,46 @@ +#!/usr/bin/env python3 + +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. 
+# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import unittest +import amulet + + +class TestDeploy(unittest.TestCase): + """ + Trivial deployment test for Apache Bigtop Plugin. + + This charm cannot do anything useful by itself, so integration testing + is done in the bundle. However, because it's a subordinate, it requires + a principal to confirm that it deploys. + """ + + def test_deploy(self): + self.d = amulet.Deployment(series='trusty') + self.d.load({ + 'services': { + 'client': {'charm': 'hadoop-client'}, + 'plugin': {'charm': 'hadoop-plugin'}, + }, + 'relations': [('client', 'plugin')], + }) + self.d.setup(timeout=900) + self.d.sentry.wait(timeout=1800) + self.unit = self.d.sentry['plugin'][0] + + +if __name__ == '__main__': + unittest.main() http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/tests/tests.yaml ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/tests/tests.yaml b/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/tests/tests.yaml new file mode 100644 index 0000000..3b6ce3e --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-plugin/tests/tests.yaml @@ -0,0 +1,3 @@ +reset: false +packages: + - amulet http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/.gitignore
---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/.gitignore b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/.gitignore new file mode 100644 index 0000000..749ccda --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/.gitignore @@ -0,0 +1,4 @@ +# Byte-compiled / optimized / DLL files +__pycache__/ +*.py[cod] +*$py.class http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/README.md ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/README.md b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/README.md new file mode 100644 index 0000000..0250881 --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/README.md @@ -0,0 +1,155 @@ +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--> +## Overview + +The Apache Hadoop software library is a framework that allows for the +distributed processing of large data sets across clusters of computers +using a simple programming model. 
+ +This charm deploys the ResourceManager component of the Apache Bigtop platform +to provide YARN master resources. + + +## Usage + +This charm is intended to be deployed via one of the +[apache bigtop bundles](https://jujucharms.com/u/bigdata-dev/#bundles). +For example: + + juju deploy hadoop-processing + +> Note: With Juju versions < 2.0, you will need to use [juju-deployer][] to +deploy the bundle. + +This will deploy the Apache Bigtop platform with a workload node +preconfigured to work with the cluster. + +You can also manually load and run map-reduce jobs via the plugin charm +included in the bundles linked above: + + juju scp my-job.jar plugin/0: + juju ssh plugin/0 + hadoop jar my-job.jar + + +[juju-deployer]: https://pypi.python.org/pypi/juju-deployer/ + + +## Status and Smoke Test + +Apache Bigtop charms provide extended status reporting to indicate when they +are ready: + + juju status --format=tabular + +This is particularly useful when combined with `watch` to track the on-going +progress of the deployment: + + watch -n 0.5 juju status --format=tabular + +The message for each unit will provide information about that unit's state. +Once they all indicate that they are ready, you can perform a "smoke test" +to verify HDFS or YARN services are working as expected. Trigger the +`smoke-test` action by: + + juju action do namenode/0 smoke-test + juju action do resourcemanager/0 smoke-test + +After a few seconds or so, you can check the results of the smoke test: + + juju action status + +You will see `status: completed` if the smoke test was successful, or +`status: failed` if it was not. You can get more information on why it failed +via: + + juju action fetch <action-id> + + +## Benchmarking + +This charm provides several benchmarks to gauge the performance of your +environment. + +The easiest way to run the benchmarks on this service is to relate it to the +[Benchmark GUI][]. 
You will likely also want to relate it to the +[Benchmark Collector][] to have machine-level information collected during the +benchmark, for a more complete picture of how the machine performed. + +[Benchmark GUI]: https://jujucharms.com/benchmark-gui/ +[Benchmark Collector]: https://jujucharms.com/benchmark-collector/ + +However, each benchmark is also an action that can be called manually: + + $ juju action do resourcemanager/0 nnbench + Action queued with id: 55887b40-116c-4020-8b35-1e28a54cc622 + $ juju action fetch --wait 0 55887b40-116c-4020-8b35-1e28a54cc622 + + results: + meta: + composite: + direction: asc + units: secs + value: "128" + start: 2016-02-04T14:55:39Z + stop: 2016-02-04T14:57:47Z + results: + raw: '{"BAD_ID": "0", "FILE: Number of read operations": "0", "Reduce input groups": + "8", "Reduce input records": "95", "Map output bytes": "1823", "Map input records": + "12", "Combine input records": "0", "HDFS: Number of bytes read": "18635", "FILE: + Number of bytes written": "32999982", "HDFS: Number of write operations": "330", + "Combine output records": "0", "Total committed heap usage (bytes)": "3144749056", + "Bytes Written": "164", "WRONG_LENGTH": "0", "Failed Shuffles": "0", "FILE: + Number of bytes read": "27879457", "WRONG_MAP": "0", "Spilled Records": "190", + "Merged Map outputs": "72", "HDFS: Number of large read operations": "0", "Reduce + shuffle bytes": "2445", "FILE: Number of large read operations": "0", "Map output + materialized bytes": "2445", "IO_ERROR": "0", "CONNECTION": "0", "HDFS: Number + of read operations": "567", "Map output records": "95", "Reduce output records": + "8", "WRONG_REDUCE": "0", "HDFS: Number of bytes written": "27412", "GC time + elapsed (ms)": "603", "Input split bytes": "1610", "Shuffled Maps ": "72", "FILE: + Number of write operations": "0", "Bytes Read": "1490"}' + status: completed + timing: + completed: 2016-02-04 14:57:48 +0000 UTC + enqueued: 2016-02-04 14:55:14 +0000 UTC + started: 2016-02-04 
14:55:27 +0000 UTC + + +## Deploying in Network-Restricted Environments + +Charms can be deployed in environments with limited network access. To deploy +in this environment, you will need a local mirror to serve required packages. + + +### Mirroring Packages + +You can setup a local mirror for apt packages using squid-deb-proxy. +For instructions on configuring juju to use this, see the +[Juju Proxy Documentation](https://juju.ubuntu.com/docs/howto-proxies.html). + + +## Contact Information + +- <[email protected]> + + +## Hadoop + +- [Apache Bigtop](http://bigtop.apache.org/) home page +- [Apache Bigtop issue tracking](http://bigtop.apache.org/issue-tracking.html) +- [Apache Bigtop mailing lists](http://bigtop.apache.org/mail-lists.html) +- [Apache Bigtop charms](https://jujucharms.com/q/apache/bigtop) http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions.yaml ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions.yaml b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions.yaml new file mode 100644 index 0000000..da4fc08 --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions.yaml @@ -0,0 +1,134 @@ +smoke-test: + description: > + Verify that YARN is working as expected by running a small (1MB) terasort. 
+mrbench: + description: Mapreduce benchmark for small jobs + params: + basedir: + description: DFS working directory + type: string + default: "/benchmarks/MRBench" + numruns: + description: Number of times to run the job + type: integer + default: 1 + maps: + description: number of maps for each run + type: integer + default: 2 + reduces: + description: number of reduces for each run + type: integer + default: 1 + inputlines: + description: number of input lines to generate + type: integer + default: 1 + inputtype: + description: 'Type of input to generate, one of [ascending, descending, random]' + type: string + default: "ascending" + enum: [ascending,descending,random] +nnbench: + description: Load test the NameNode hardware and configuration + params: + maps: + description: number of map jobs + type: integer + default: 12 + reduces: + description: number of reduces + type: integer + default: 6 + blocksize: + description: block size + type: integer + default: 1 + bytes: + description: bytes to write + type: integer + default: 0 + numfiles: + description: number of files + type: integer + default: 0 + repfactor: + description: replication factor per file + type: integer + default: 3 + basedir: + description: DFS working directory + type: string + default: "/benchmarks/NNBench" +testdfsio: + description: DFS IO Testing + params: + mode: + description: read or write IO test + type: string + default: "write" + enum: [read,write] + numfiles: + description: number of files + type: integer + default: 10 + filesize: + description: filesize in MB + type: integer + default: 1000 + resfile: + description: Results file name + type: string + default: "/tmp/TestDFSIO_results.log" + buffersize: + description: Buffer size in bytes + type: integer + default: 1000000 +teragen: + description: Generate data with teragen + params: + size: + description: The number of 100 byte rows, default to 1GB of data to generate + type: integer + default: 10000000 + indir: + description: HDFS 
directory where generated data is stored + type: string + default: 'tera_demo_in' +terasort: + description: Runs teragen to generate sample data, and then runs terasort to sort that data + params: + indir: + description: HDFS directory where generated data is stored + type: string + default: 'tera_demo_in' + outdir: + description: HDFS directory where sorted data is stored + type: string + default: 'tera_demo_out' + size: + description: The number of 100 byte rows, default to 1GB of data to generate and sort + type: integer + default: 10000000 + maps: + description: The default number of map tasks per job. 1-20 + type: integer + default: 1 + reduces: + description: The default number of reduce tasks per job. Typically set to 99% of the cluster's reduce capacity, so that if a node fails the reduces can still be executed in a single wave. Try 1-20 + type: integer + default: 1 + numtasks: + description: How many tasks to run per jvm. If set to -1, there is no limit. + type: integer + default: 1 + compression: + description: > + Enable or Disable mapred output (intermediate) compression. + LocalDefault will run with your current local hadoop configuration. + Default means default hadoop deflate codec. + One of: Gzip, BZip2, Snappy, Lzo, Default, Disable, LocalDefault + These are all case sensitive. 
+ type: string + default: "LocalDefault" + enum: [Gzip, BZip2, Snappy, Lzo, Default, Disable, LocalDefault] http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/mrbench ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/mrbench b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/mrbench new file mode 100755 index 0000000..1ef70cc --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/mrbench @@ -0,0 +1,75 @@ +#!/bin/bash + +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +set -ex + +if ! 
charms.reactive is_state 'apache-bigtop-resourcemanager.ready'; then + action-fail 'ResourceManager not yet ready' + exit +fi + +BASE_DIR=`action-get basedir`/`hostname -s` +OPTIONS='' + +MAPS=`action-get maps` +REDUCES=`action-get reduces` +NUMRUNS=`action-get numruns` +INPUTLINES=`action-get inputlines` +INPUTTYPE=`action-get inputtype` + +OPTIONS="${OPTIONS} -maps ${MAPS}" +OPTIONS="${OPTIONS} -reduces ${REDUCES}" +OPTIONS="${OPTIONS} -numRuns ${NUMRUNS}" +OPTIONS="${OPTIONS} -inputLines ${INPUTLINES}" +OPTIONS="${OPTIONS} -inputType ${INPUTTYPE}" +OPTIONS="${OPTIONS} -baseDir ${BASE_DIR}" + +# create dir to store results +RUN=`date +%s` +RESULT_DIR=/opt/mrbench-results +RESULT_LOG=${RESULT_DIR}/${RUN}.log +mkdir -p ${RESULT_DIR} +chown -R ubuntu:ubuntu ${RESULT_DIR} + +# clean out any previous data (must be run as ubuntu) +su ubuntu << EOF +if hadoop fs -stat ${BASE_DIR} &> /dev/null; then + hadoop fs -rm -r -skipTrash ${BASE_DIR} || true +fi +EOF + +benchmark-start +START=`date +%s` +# NB: Escaped vars in the block below (e.g., \${HADOOP_MAPRED_HOME}) come from +# the environment while non-escaped vars (e.g., ${RESULT_LOG}) are parameterized +# from this outer scope +su ubuntu << EOF +. 
/etc/default/hadoop +echo 'running benchmark' +hadoop jar \${HADOOP_MAPRED_HOME}/hadoop-mapreduce-*test*.jar mrbench $OPTIONS &> ${RESULT_LOG} +EOF +STOP=`date +%s` +benchmark-finish + +su ubuntu << EOF +`cat ${RESULT_LOG} | $CHARM_DIR/actions/parseTerasort.py` +EOF + +# More mapreduce benchmark results logged to: ${RESULT_LOG} + +DURATION=`expr $STOP - $START` +benchmark-composite "${DURATION}" 'secs' 'asc' http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/nnbench ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/nnbench b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/nnbench new file mode 100755 index 0000000..f965936 --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/nnbench @@ -0,0 +1,76 @@ +#!/bin/bash + +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +set -ex + +if ! 
charms.reactive is_state 'apache-bigtop-resourcemanager.ready'; then + action-fail 'ResourceManager not yet ready' + exit +fi + +BASE_DIR=`action-get basedir`/`hostname -s` +OPTIONS='' + +MAPS=`action-get maps` +REDUCES=`action-get reduces` +BLOCKSIZE=`action-get blocksize` +BYTES=`action-get bytes` +NUMFILES=`action-get numfiles` +REPFACTOR=`action-get repfactor` + +OPTIONS="${OPTIONS} -maps ${MAPS}" +OPTIONS="${OPTIONS} -reduces ${REDUCES}" +OPTIONS="${OPTIONS} -blockSize ${BLOCKSIZE}" +OPTIONS="${OPTIONS} -bytesToWrite ${BYTES}" +OPTIONS="${OPTIONS} -numberOfFiles ${NUMFILES}" +OPTIONS="${OPTIONS} -replicationFactorPerFile ${REPFACTOR}" +OPTIONS="${OPTIONS} -baseDir ${BASE_DIR}" + +# create dir to store results +RUN=`date +%s` +RESULT_DIR=/opt/nnbench-results +RESULT_LOG=${RESULT_DIR}/${RUN}.log +mkdir -p ${RESULT_DIR} +chown -R ubuntu:ubuntu ${RESULT_DIR} + +# clean out any previous data (must be run as ubuntu) +su ubuntu << EOF +if hadoop fs -stat ${BASE_DIR} &> /dev/null; then + hadoop fs -rm -r -skipTrash ${BASE_DIR} || true +fi +EOF + +benchmark-start +START=`date +%s` +# NB: Escaped vars in the block below (e.g., \${HADOOP_MAPRED_HOME}) come from +# the environment while non-escaped vars (e.g., ${RESULT_LOG}) are parameterized +# from this outer scope +su ubuntu << EOF +. 
/etc/default/hadoop +echo 'running benchmark' +cd /tmp/ +hadoop jar \${HADOOP_MAPRED_HOME}/hadoop-mapreduce-*test*.jar nnbench -operation create_write -readFileAfterOpen true $OPTIONS &> ${RESULT_LOG} +EOF +STOP=`date +%s` +benchmark-finish + +su ubuntu << EOF +`cat ${RESULT_LOG} | $CHARM_DIR/actions/parseNNBench.py` +EOF + +DURATION=`expr $STOP - $START` +benchmark-composite "${DURATION}" 'secs' 'asc' http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/parseNNBench.py ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/parseNNBench.py b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/parseNNBench.py new file mode 100755 index 0000000..0374dbe --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/parseNNBench.py @@ -0,0 +1,45 @@ +#!/usr/bin/env python3 + +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License.
+ +""" +Simple script to parse nnbench transaction results +and reformat them as JSON for sending back to juju +""" +import sys +import json +from charmhelpers.core import hookenv +import re + + +def parse_nnbench_output(): + """ + Parse the output from nnbench and set the action results: + + """ + + results = {} + + # Find all of the interesting things + regex = re.compile('\t+(.*)=(.*)') + for line in sys.stdin.readlines(): + m = regex.match(line) + if m: + results[m.group(1)] = m.group(2) + hookenv.action_set({"results.raw": json.dumps(results)}) + +if __name__ == "__main__": + parse_nnbench_output() http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/parseTerasort.py ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/parseTerasort.py b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/parseTerasort.py new file mode 100755 index 0000000..dbf59ba --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/parseTerasort.py @@ -0,0 +1,45 @@ +#!/usr/bin/env python3 + +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Simple script to parse terasort transaction results +and reformat them as JSON for sending back to juju +""" +import sys +import json +from charmhelpers.core import hookenv +import re + + +def parse_terasort_output(): + """ + Parse the output from terasort and set the action results: + + """ + + results = {} + + # Find all of the interesting things + regex = re.compile('\t+(.*)=(.*)') + for line in sys.stdin.readlines(): + m = regex.match(line) + if m: + results[m.group(1)] = m.group(2) + hookenv.action_set({"results.raw": json.dumps(results)}) + +if __name__ == "__main__": + parse_terasort_output() http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/smoke-test ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/smoke-test b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/smoke-test new file mode 100755 index 0000000..9ef33a9 --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/smoke-test @@ -0,0 +1,80 @@ +#!/bin/bash + +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +set -ex + +if ! charms.reactive is_state 'apache-bigtop-resourcemanager.ready'; then + action-fail 'ResourceManager not yet ready' + exit +fi + +IN_DIR='/tmp/smoke_test_in' +OUT_DIR='/tmp/smoke_test_out' +SIZE=10000 +OPTIONS='' + +MAPS=1 +REDUCES=1 +NUMTASKS=1 +COMPRESSION='LocalDefault' + +OPTIONS="${OPTIONS} -D mapreduce.job.maps=${MAPS}" +OPTIONS="${OPTIONS} -D mapreduce.job.reduces=${REDUCES}" +OPTIONS="${OPTIONS} -D mapreduce.job.jvm.numtasks=${NUMTASKS}" +if [ $COMPRESSION == 'Disable' ] ; then + OPTIONS="${OPTIONS} -D mapreduce.map.output.compress=false" +elif [ $COMPRESSION == 'LocalDefault' ] ; then + OPTIONS="${OPTIONS}" +else + OPTIONS="${OPTIONS} -D mapreduce.map.output.compress=true -D mapred.map.output.compress.codec=org.apache.hadoop.io.compress.${COMPRESSION}Codec" +fi + +# create dir to store results +RUN=`date +%s` +RESULT_DIR=/opt/terasort-results +RESULT_LOG=${RESULT_DIR}/${RUN}.$$.log +mkdir -p ${RESULT_DIR} +chown -R hdfs ${RESULT_DIR} + +# clean out any previous data (must be run as the hdfs user) +su hdfs << EOF +if hadoop fs -stat ${IN_DIR} &> /dev/null; then + hadoop fs -rm -r -skipTrash ${IN_DIR} || true +fi +if hadoop fs -stat ${OUT_DIR} &> /dev/null; then + hadoop fs -rm -r -skipTrash ${OUT_DIR} || true +fi +EOF + +START=`date +%s` +# NB: Escaped vars in the block below (e.g., \${HADOOP_MAPRED_HOME}) come from +# the environment while non-escaped vars (e.g., ${IN_DIR}) are parameterized +# from this outer scope +su hdfs << EOF +. /etc/default/hadoop +echo 'generating data' +hadoop jar \${HADOOP_MAPRED_HOME}/hadoop-mapreduce-examples-*.jar teragen ${SIZE} ${IN_DIR} &>/dev/null +echo 'sorting data' +hadoop jar \${HADOOP_MAPRED_HOME}/hadoop-mapreduce-examples-*.jar terasort ${OPTIONS} ${IN_DIR} ${OUT_DIR} &> ${RESULT_LOG} +EOF +STOP=`date +%s` + +if ! 
grep -q 'Bytes Written=1000000' ${RESULT_LOG}; then + action-fail 'smoke-test failed' + action-set log="$(cat ${RESULT_LOG})" +fi +DURATION=`expr $STOP - $START` http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/teragen ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/teragen b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/teragen new file mode 100755 index 0000000..fb7c79e --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/teragen @@ -0,0 +1,57 @@ +#!/bin/bash + +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +set -ex + +if ! 
charms.reactive is_state 'apache-bigtop-resourcemanager.ready'; then + action-fail 'ResourceManager not yet ready' + exit +fi + +SIZE=`action-get size` +IN_DIR=`action-get indir` + +# create dir to store results +RUN=`date +%s` +RESULT_DIR=/opt/teragen-results +RESULT_LOG=${RESULT_DIR}/${RUN}.log +mkdir -p ${RESULT_DIR} +chown -R hdfs ${RESULT_DIR} + +# clean out any previously generated data (must be run as hdfs) +su hdfs << EOF +if hadoop fs -stat ${IN_DIR} &> /dev/null; then + hadoop fs -rm -r -skipTrash ${IN_DIR} || true +fi +EOF + +benchmark-start +START=`date +%s` +# NB: Escaped vars in the block below (e.g., \${HADOOP_MAPRED_HOME}) come from +# /etc/default/hadoop while non-escaped vars (e.g., ${IN_DIR}) are parameterized +# from this outer scope +su hdfs << EOF +. /etc/default/hadoop +echo 'generating data' +hadoop jar \${HADOOP_MAPRED_HOME}/hadoop-mapreduce-examples-*.jar teragen ${SIZE} ${IN_DIR} &> ${RESULT_LOG} +EOF +STOP=`date +%s` +benchmark-finish + +`cat ${RESULT_LOG} | $CHARM_DIR/actions/parseTerasort.py` +DURATION=`expr $STOP - $START` +benchmark-composite "${DURATION}" 'secs' 'asc' http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/terasort ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/terasort b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/terasort new file mode 100755 index 0000000..aaf7da5 --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/terasort @@ -0,0 +1,80 @@ +#!/bin/bash + +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +set -ex + +if ! charms.reactive is_state 'apache-bigtop-resourcemanager.ready'; then + action-fail 'ResourceManager not yet ready' + exit +fi + +IN_DIR=`action-get indir` +OUT_DIR=`action-get outdir` +SIZE=`action-get size` +OPTIONS='' + +MAPS=`action-get maps` +REDUCES=`action-get reduces` +NUMTASKS=`action-get numtasks` +COMPRESSION=`action-get compression` + +OPTIONS="${OPTIONS} -D mapreduce.job.maps=${MAPS}" +OPTIONS="${OPTIONS} -D mapreduce.job.reduces=${REDUCES}" +OPTIONS="${OPTIONS} -D mapreduce.job.jvm.numtasks=${NUMTASKS}" +if [ $COMPRESSION == 'Disable' ] ; then + OPTIONS="${OPTIONS} -D mapreduce.map.output.compress=false" +elif [ $COMPRESSION == 'LocalDefault' ] ; then + OPTIONS="${OPTIONS}" +else + OPTIONS="${OPTIONS} -D mapreduce.map.output.compress=true -D mapred.map.output.compress.codec=org.apache.hadoop.io.compress.${COMPRESSION}Codec" +fi + +# create dir to store results +RUN=`date +%s` +RESULT_DIR=/opt/terasort-results +RESULT_LOG=${RESULT_DIR}/${RUN}.log +mkdir -p ${RESULT_DIR} +chown -R hdfs ${RESULT_DIR} + +# clean out any previous data (must be run as hdfs) +su hdfs << EOF +if hadoop fs -stat ${IN_DIR} &> /dev/null; then + hadoop fs -rm -r -skipTrash ${IN_DIR} || true +fi +if hadoop fs -stat ${OUT_DIR} &> /dev/null; then + hadoop fs -rm -r -skipTrash ${OUT_DIR} || true +fi +EOF + +benchmark-start +START=`date +%s` +# NB: Escaped vars in the block below (e.g., 
\${HADOOP_MAPRED_HOME}) come from +# /etc/default/hadoop while non-escaped vars (e.g., ${IN_DIR}) are parameterized +# from this outer scope +su hdfs << EOF +. /etc/default/hadoop +echo 'generating data' +hadoop jar \${HADOOP_MAPRED_HOME}/hadoop-mapreduce-examples-*.jar teragen ${SIZE} ${IN_DIR} &>/dev/null +echo 'sorting data' +hadoop jar \${HADOOP_MAPRED_HOME}/hadoop-mapreduce-examples-*.jar terasort ${OPTIONS} ${IN_DIR} ${OUT_DIR} &> ${RESULT_LOG} +EOF +STOP=`date +%s` +benchmark-finish + +`cat ${RESULT_LOG} | $CHARM_DIR/actions/parseTerasort.py` +DURATION=`expr $STOP - $START` +benchmark-composite "${DURATION}" 'secs' 'asc' http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/testdfsio ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/testdfsio b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/testdfsio new file mode 100755 index 0000000..beeb38c --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/testdfsio @@ -0,0 +1,76 @@ +#!/bin/bash + +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +set -ex + +if !
charms.reactive is_state 'apache-bigtop-resourcemanager.ready'; then + action-fail 'ResourceManager not yet ready' + exit +fi + +RESULTS_FILE=`action-get resfile` +OPTIONS='' + +MODE=`action-get mode` +NUMFILES=`action-get numfiles` +FILESIZE=`action-get filesize` +BUFFERSIZE=`action-get buffersize` + +if [ $MODE == 'read' ] ; then + OPTIONS="${OPTIONS} -read" +else + OPTIONS="${OPTIONS} -write" +fi + +OPTIONS="${OPTIONS} -nrFiles ${NUMFILES}" +OPTIONS="${OPTIONS} -fileSize ${FILESIZE}" +OPTIONS="${OPTIONS} -bufferSize ${BUFFERSIZE}" +OPTIONS="${OPTIONS} -resFile ${RESULTS_FILE}" + +# create dir to store results +RUN=`date +%s` +RESULT_DIR=/opt/TestDFSIO-results +RESULT_LOG=${RESULT_DIR}/${RUN}.log +mkdir -p ${RESULT_DIR} +chown -R ubuntu:ubuntu ${RESULT_DIR} + + +# clean out any previous data (must be run as ubuntu) +su ubuntu << EOF +. /etc/default/hadoop +echo 'cleaning data' +hadoop jar \${HADOOP_MAPRED_HOME}/hadoop-mapreduce-*test*.jar TestDFSIO -clean +EOF + +benchmark-start +START=`date +%s` +# NB: Escaped vars in the block below (e.g., \${HADOOP_MAPRED_HOME}) come from +# /etc/default/hadoop while non-escaped vars (e.g., ${RESULT_LOG}) are parameterized +# from this outer scope +su ubuntu << EOF +.
/etc/default/hadoop +echo 'running benchmark' +cd /tmp/ +hadoop jar \${HADOOP_MAPRED_HOME}/hadoop-mapreduce-*test*.jar TestDFSIO $OPTIONS &> ${RESULT_LOG} +EOF +STOP=`date +%s` +benchmark-finish + +`cat ${RESULTS_FILE} ${RESULT_LOG} | $CHARM_DIR/actions/parseNNBench.py` + +DURATION=`expr $STOP - $START` +benchmark-composite "${DURATION}" 'secs' 'asc' http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/copyright ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/copyright b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/copyright new file mode 100644 index 0000000..52de50a --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/copyright @@ -0,0 +1,16 @@ +Format: http://dep.debian.net/deps/dep5/ + +Files: * +Copyright: Copyright 2015, Canonical Ltd., All Rights Reserved, The Apache Software Foundation +License: Apache License 2.0 + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + . + http://www.apache.org/licenses/LICENSE-2.0 + . + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/layer.yaml ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/layer.yaml b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/layer.yaml new file mode 100644 index 0000000..ad0b569 --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/layer.yaml @@ -0,0 +1,27 @@ +repo: [email protected]:juju-solutions/layer-hadoop-resourcemanager.git +includes: + - 'layer:apache-bigtop-base' + - 'interface:dfs' + - 'interface:mapred' + - 'interface:mapred-slave' + - 'interface:benchmark' +options: + apache-bigtop-base: + users: + ubuntu: + groups: ['hadoop', 'mapred'] + ports: + resourcemanager: + port: 8032 + rm_webapp_http: + port: 8088 + exposed_on: 'resourcemanager' + # TODO: support SSL + #rm_webapp_https: + # port: 8090 + # exposed_on: 'yarn-master' + jobhistory: + port: 10020 + jh_webapp_http: + port: 19888 + exposed_on: 'resourcemanager' http://git-wip-us.apache.org/repos/asf/bigtop/blob/d639645e/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/metadata.yaml ---------------------------------------------------------------------- diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/metadata.yaml b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/metadata.yaml new file mode 100644 index 0000000..82b82cd --- /dev/null +++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/metadata.yaml @@ -0,0 +1,19 @@ +name: hadoop-resourcemanager +summary: YARN master (ResourceManager) for Apache Bigtop platform +maintainer: Juju Big Data <[email protected]> +description: > + Hadoop is a software platform that lets one easily write and + run applications that process vast amounts of data. + + This charm manages the YARN master node (ResourceManager). 
+tags: ["applications", "bigdata", "bigtop", "hadoop", "apache"] +provides: + resourcemanager: + interface: mapred + benchmark: + interface: benchmark +requires: + namenode: + interface: dfs + nodemanager: + interface: mapred-slave
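Editor's note for reviewers: `parseNNBench.py` and `parseTerasort.py` above share the same core logic — scan stdin for tab-indented `key=value` counter lines and emit them as JSON via `hookenv.action_set`. A standalone sketch of that logic (the sample input is a hypothetical excerpt of Hadoop job output, and a plain `print` stands in for `hookenv.action_set`, which only works on a deployed Juju unit):

```python
import json
import re


def parse_counters(lines):
    """Extract tab-indented 'key=value' counter lines from Hadoop job
    output, using the same regex as the charm's parse scripts."""
    regex = re.compile(r'\t+(.*)=(.*)')
    results = {}
    for line in lines:
        m = regex.match(line)
        if m:
            results[m.group(1)] = m.group(2)
    return results


# Hypothetical excerpt of terasort stderr; real output has many more counters
sample = [
    "File System Counters\n",
    "\t\tFILE: Number of bytes read=1000000\n",
    "\t\tHDFS: Number of bytes written=1000000\n",
]

# On a unit, this dict would be passed to hookenv.action_set(...)
print(json.dumps(parse_counters(sample)))
```

Note the greedy first group: for a line like `\t\tMap input records=10`, everything up to the last `=` becomes the key, so counter names containing `=` would be split at the final occurrence.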
