Repository: bigtop Updated Branches: refs/heads/master 56deaa7c0 -> f4d023b4c
BIGTOP-2561: add juju bundle for hadoop-spark (closes #166) Signed-off-by: Kevin W Monroe <[email protected]> Project: http://git-wip-us.apache.org/repos/asf/bigtop/repo Commit: http://git-wip-us.apache.org/repos/asf/bigtop/commit/f4d023b4 Tree: http://git-wip-us.apache.org/repos/asf/bigtop/tree/f4d023b4 Diff: http://git-wip-us.apache.org/repos/asf/bigtop/diff/f4d023b4 Branch: refs/heads/master Commit: f4d023b4c505efbb3c5b52cb0aa7ceb9dc20cc60 Parents: 56deaa7 Author: Kevin W Monroe <[email protected]> Authored: Wed Sep 21 20:46:24 2016 -0500 Committer: Kevin W Monroe <[email protected]> Committed: Fri Dec 2 16:46:24 2016 -0600 ---------------------------------------------------------------------- bigtop-deploy/juju/hadoop-spark/.gitignore | 2 + bigtop-deploy/juju/hadoop-spark/README.md | 356 +++++++++++++++++++ bigtop-deploy/juju/hadoop-spark/bundle-dev.yaml | 138 +++++++ .../juju/hadoop-spark/bundle-local.yaml | 138 +++++++ bigtop-deploy/juju/hadoop-spark/bundle.yaml | 138 +++++++ bigtop-deploy/juju/hadoop-spark/copyright | 16 + .../juju/hadoop-spark/tests/01-bundle.py | 137 +++++++ .../juju/hadoop-spark/tests/tests.yaml | 7 + 8 files changed, 932 insertions(+) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/bigtop/blob/f4d023b4/bigtop-deploy/juju/hadoop-spark/.gitignore ---------------------------------------------------------------------- diff --git a/bigtop-deploy/juju/hadoop-spark/.gitignore b/bigtop-deploy/juju/hadoop-spark/.gitignore new file mode 100644 index 0000000..a295864 --- /dev/null +++ b/bigtop-deploy/juju/hadoop-spark/.gitignore @@ -0,0 +1,2 @@ +*.pyc +__pycache__ http://git-wip-us.apache.org/repos/asf/bigtop/blob/f4d023b4/bigtop-deploy/juju/hadoop-spark/README.md ---------------------------------------------------------------------- diff --git a/bigtop-deploy/juju/hadoop-spark/README.md b/bigtop-deploy/juju/hadoop-spark/README.md new file mode 100644 index 0000000..b2b936b --- /dev/null 
+++ b/bigtop-deploy/juju/hadoop-spark/README.md @@ -0,0 +1,356 @@ +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--> +# Overview + +The Apache Hadoop software library is a framework that allows for the +distributed processing of large data sets across clusters of computers +using a simple programming model. + +Hadoop is designed to scale from a few servers to thousands of machines, +each offering local computation and storage. Rather than rely on hardware +to deliver high-availability, Hadoop can detect and handle failures at the +application layer. This provides a highly-available service on top of a cluster +of machines, each of which may be prone to failure. + +Spark is a fast and general engine for large-scale data processing. + +This bundle provides a complete deployment of Hadoop and Spark components from +[Apache Bigtop][] that performs distributed data processing at scale. Ganglia +and rsyslog applications are also provided to monitor cluster health and syslog +activity. 
+ +[Apache Bigtop]: http://bigtop.apache.org/ + +## Bundle Composition + +The applications that comprise this bundle are spread across 9 units as +follows: + + * NameNode (HDFS) + * ResourceManager (YARN) + * Colocated on the NameNode unit + * Slave (DataNode and NodeManager) + * 3 separate units + * Spark + * Plugin (Facilitates communication with the Hadoop cluster) + * Colocated on the Spark unit + * Client (Hadoop endpoint) + * Colocated on the Spark unit + * Zookeeper + * 3 separate units + * Ganglia (Web interface for monitoring cluster metrics) + * Rsyslog (Aggregate cluster syslog events in a single location) + * Colocated on the Ganglia unit + +Deploying this bundle results in a fully configured Apache Bigtop +cluster on any supported cloud, which can be scaled to meet workload +demands. + + +# Deploying + +A working Juju installation is assumed to be present. If Juju is not yet set +up, please follow the [getting-started][] instructions prior to deploying this +bundle. + +> **Note**: This bundle requires hardware resources that may exceed limits +of Free-tier or Trial accounts on some clouds. To deploy to these +environments, modify a local copy of [bundle.yaml][] to set +`services: 'X': num_units: 1` and `machines: 'X': constraints: mem=3G` as +needed to satisfy account limits. + +Deploy this bundle from the Juju charm store with the `juju deploy` command: + + juju deploy hadoop-spark + +> **Note**: The above assumes Juju 2.0 or greater. If using an earlier version +of Juju, use [juju-quickstart][] with the following syntax: `juju quickstart +hadoop-spark`. + +Alternatively, deploy a locally modified `bundle.yaml` with: + + juju deploy /path/to/bundle.yaml + +> **Note**: The above assumes Juju 2.0 or greater. If using an earlier version +of Juju, use [juju-quickstart][] with the following syntax: `juju quickstart +/path/to/bundle.yaml`. + +The charms in this bundle can also be built from their source layers in the +[Bigtop charm repository][]. 
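The bundle trimming described in the resource-limit note in the Deploying section above can also be scripted instead of edited by hand. The sketch below is illustrative only and not part of the bundle: the `shrink_bundle` helper is hypothetical, and the inline dict stands in for a local copy of `bundle.yaml`, which a real run would load and dump with PyYAML (already a test dependency of this bundle).

```python
# Hypothetical helper: shrink a local copy of bundle.yaml to fit
# free-tier or trial account limits, per the note above.
def shrink_bundle(bundle, mem='3G'):
    """Set every multi-unit service to one unit and relax machine memory."""
    for service in bundle.get('services', {}).values():
        if service.get('num_units', 0) > 1:
            service['num_units'] = 1
    for machine in bundle.get('machines', {}).values():
        machine['constraints'] = 'mem=' + mem
    return bundle

# Stand-in for yaml.safe_load(open('bundle.yaml')); a real run would
# load the full bundle with PyYAML and write the result back out.
bundle = {
    'services': {
        'slave': {'charm': 'cs:xenial/hadoop-slave-6', 'num_units': 3},
        'plugin': {'charm': 'cs:xenial/hadoop-plugin-6'},
    },
    'machines': {
        '0': {'constraints': 'mem=7G root-disk=32G', 'series': 'xenial'},
    },
}
shrink_bundle(bundle)
print(bundle['services']['slave']['num_units'])  # 1
print(bundle['machines']['0']['constraints'])    # mem=3G
```

The trimmed copy can then be deployed with `juju deploy /path/to/bundle.yaml` as described above.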
See the [Bigtop charm README][] for instructions +on building and deploying these charms locally. + +## Network-Restricted Environments +Charms can be deployed in environments with limited network access. To deploy +in such an environment, configure a Juju model with appropriate proxy and/or +mirror options. See [Configuring Models][] for more information. + +[getting-started]: https://jujucharms.com/docs/stable/getting-started +[bundle.yaml]: https://github.com/apache/bigtop/blob/master/bigtop-deploy/juju/hadoop-spark/bundle.yaml +[juju-quickstart]: https://launchpad.net/juju-quickstart +[Bigtop charm repository]: https://github.com/apache/bigtop/tree/master/bigtop-packages/src/charm +[Bigtop charm README]: https://github.com/apache/bigtop/blob/master/bigtop-packages/src/charm/README.md +[Configuring Models]: https://jujucharms.com/docs/stable/models-config + + +# Verifying + +## Status +The applications that make up this bundle provide status messages to indicate +when they are ready: + + juju status + +This is particularly useful when combined with `watch` to track the ongoing +progress of the deployment: + + watch -n 2 juju status + +The message for each unit will provide information about that unit's state. +Once they all indicate that they are ready, perform application smoke tests +to verify that the bundle is working as expected. + +## Smoke Test +The charms for each core component (namenode, resourcemanager, slave, spark, +and zookeeper) provide a `smoke-test` action that can be used to verify the +application is functioning as expected. Note that the 'slave' component runs +extensive tests provided by Apache Bigtop and may take up to 30 minutes to +complete. Run the smoke-test actions as follows: + + juju run-action namenode/0 smoke-test + juju run-action resourcemanager/0 smoke-test + juju run-action slave/0 smoke-test + juju run-action spark/0 smoke-test + juju run-action zookeeper/0 smoke-test + +> **Note**: The above assumes Juju 2.0 or greater.
If using an earlier version +of Juju, the syntax is `juju action do <application>/0 smoke-test`. + +Watch the progress of the smoke test actions with: + + watch -n 2 juju show-action-status + +> **Note**: The above assumes Juju 2.0 or greater. If using an earlier version +of Juju, the syntax is `juju action status`. + +Eventually, all of the actions should settle to `status: completed`. If +any report `status: failed`, that application is not working as expected. Get +more information about a specific smoke test with: + + juju show-action-output <action-id> + +> **Note**: The above assumes Juju 2.0 or greater. If using an earlier version +of Juju, the syntax is `juju action fetch <action-id>`. + +## Utilities +Applications in this bundle include command line and web utilities that +can be used to verify information about the cluster. + +From the command line, show the HDFS dfsadmin report and view the current list +of YARN NodeManager units with the following: + + juju run --application namenode "su hdfs -c 'hdfs dfsadmin -report'" + juju run --application resourcemanager "su yarn -c 'yarn node -list'" + +Show the list of Zookeeper nodes with the following: + + juju run --unit zookeeper/0 'echo "ls /" | /usr/lib/zookeeper/bin/zkCli.sh' + +To access the HDFS web console, find the `PUBLIC-ADDRESS` of the namenode +application and expose it: + + juju status namenode + juju expose namenode + +The web interface will be available at the following URL: + + http://NAMENODE_PUBLIC_IP:50070 + +Similarly, to access the Resource Manager web consoles, find the +`PUBLIC-ADDRESS` of the resourcemanager application and expose it: + + juju status resourcemanager + juju expose resourcemanager + +The YARN and Job History web interfaces will be available at the following URLs: + + http://RESOURCEMANAGER_PUBLIC_IP:8088 + http://RESOURCEMANAGER_PUBLIC_IP:19888 + +Finally, to access the Spark web console, find the `PUBLIC-ADDRESS` of the +spark application and expose it: + + juju status 
spark + juju expose spark + +The web interface will be available at the following URL: + + http://SPARK_PUBLIC_IP:8080 + + +# Monitoring + +This bundle includes Ganglia for system-level monitoring of the namenode, +resourcemanager, slave, spark, and zookeeper units. Metrics are sent to a +centralized ganglia unit for easy viewing in a browser. To view the ganglia web +interface, find the `PUBLIC-ADDRESS` of the Ganglia application and expose it: + + juju status ganglia + juju expose ganglia + +The web interface will be available at: + + http://GANGLIA_PUBLIC_IP/ganglia + + +# Logging + +This bundle includes rsyslog to collect syslog data from the namenode, +resourcemanager, slave, spark, and zookeeper units. These logs are sent to a +centralized rsyslog unit for easy syslog analysis. One method of viewing this +log data is to simply cat syslog from the rsyslog unit: + + juju run --unit rsyslog/0 'sudo cat /var/log/syslog' + +Logs may also be forwarded to an external rsyslog processing service. See +the *Forwarding logs to a system outside of the Juju environment* section of +the [rsyslog README](https://jujucharms.com/rsyslog/) for more information. + + +# Benchmarking + +The `resourcemanager` charm in this bundle provides several benchmarks to gauge +the performance of the Hadoop cluster. Each benchmark is an action that can be +run with `juju run-action`: + + $ juju actions resourcemanager + ACTION DESCRIPTION + mrbench Mapreduce benchmark for small jobs + nnbench Load test the NameNode hardware and configuration + smoke-test Run an Apache Bigtop smoke test.
+ teragen Generate data with teragen + terasort Runs teragen to generate sample data, and then runs terasort to sort that data + testdfsio DFS IO Testing + + $ juju run-action resourcemanager/0 nnbench + Action queued with id: 55887b40-116c-4020-8b35-1e28a54cc622 + + $ juju show-action-output 55887b40-116c-4020-8b35-1e28a54cc622 + results: + meta: + composite: + direction: asc + units: secs + value: "128" + start: 2016-02-04T14:55:39Z + stop: 2016-02-04T14:57:47Z + results: + raw: '{"BAD_ID": "0", "FILE: Number of read operations": "0", "Reduce input groups": + "8", "Reduce input records": "95", "Map output bytes": "1823", "Map input records": + "12", "Combine input records": "0", "HDFS: Number of bytes read": "18635", "FILE: + Number of bytes written": "32999982", "HDFS: Number of write operations": "330", + "Combine output records": "0", "Total committed heap usage (bytes)": "3144749056", + "Bytes Written": "164", "WRONG_LENGTH": "0", "Failed Shuffles": "0", "FILE: + Number of bytes read": "27879457", "WRONG_MAP": "0", "Spilled Records": "190", + "Merged Map outputs": "72", "HDFS: Number of large read operations": "0", "Reduce + shuffle bytes": "2445", "FILE: Number of large read operations": "0", "Map output + materialized bytes": "2445", "IO_ERROR": "0", "CONNECTION": "0", "HDFS: Number + of read operations": "567", "Map output records": "95", "Reduce output records": + "8", "WRONG_REDUCE": "0", "HDFS: Number of bytes written": "27412", "GC time + elapsed (ms)": "603", "Input split bytes": "1610", "Shuffled Maps ": "72", "FILE: + Number of write operations": "0", "Bytes Read": "1490"}' + status: completed + timing: + completed: 2016-02-04 14:57:48 +0000 UTC + enqueued: 2016-02-04 14:55:14 +0000 UTC + started: 2016-02-04 14:55:27 +0000 UTC + +The `spark` charm in this bundle also provides several benchmarks to gauge +the performance of the Spark cluster. 
Each benchmark is an action that can be +run with `juju run-action`: + + $ juju actions spark | grep Bench + connectedcomponent Run the Spark Bench ConnectedComponent benchmark. + decisiontree Run the Spark Bench DecisionTree benchmark. + kmeans Run the Spark Bench KMeans benchmark. + linearregression Run the Spark Bench LinearRegression benchmark. + logisticregression Run the Spark Bench LogisticRegression benchmark. + matrixfactorization Run the Spark Bench MatrixFactorization benchmark. + pagerank Run the Spark Bench PageRank benchmark. + pca Run the Spark Bench PCA benchmark. + pregeloperation Run the Spark Bench PregelOperation benchmark. + shortestpaths Run the Spark Bench ShortestPaths benchmark. + sql Run the Spark Bench SQL benchmark. + stronglyconnectedcomponent Run the Spark Bench StronglyConnectedComponent benchmark. + svdplusplus Run the Spark Bench SVDPlusPlus benchmark. + svm Run the Spark Bench SVM benchmark. + + $ juju run-action spark/0 svdplusplus + Action queued with id: 339cec1f-e903-4ee7-85ca-876fb0c3d28e + + $ juju show-action-output 339cec1f-e903-4ee7-85ca-876fb0c3d28e + results: + meta: + composite: + direction: asc + units: secs + value: "200.754000" + raw: | + SVDPlusPlus,2016-11-02-03:08:26,200.754000,85.974071,.428255,0,SVDPlusPlus-MLlibConfig,,,,,10,,,50000,4.0,1.3, + start: 2016-11-02T03:08:26Z + stop: 2016-11-02T03:11:47Z + results: + duration: + direction: asc + units: secs + value: "200.754000" + throughput: + direction: desc + units: MB/sec + value: ".428255" + status: completed + timing: + completed: 2016-11-02 03:11:48 +0000 UTC + enqueued: 2016-11-02 03:08:21 +0000 UTC + started: 2016-11-02 03:08:26 +0000 UTC + + +# Scaling + +By default, three Hadoop slave and three zookeeper units are deployed. Scaling +these applications is as simple as adding more units. To add one unit: + + juju add-unit slave + juju add-unit zookeeper + +Multiple units may be added at once. 
For example, add four more slave units: + + juju add-unit -n4 slave + + +# Contact Information + +- <[email protected]> + + +# Resources + +- [Apache Bigtop](http://bigtop.apache.org/) home page +- [Apache Bigtop issue tracking](http://bigtop.apache.org/issue-tracking.html) +- [Apache Bigtop mailing lists](http://bigtop.apache.org/mail-lists.html) +- [Juju Bigtop charms](https://jujucharms.com/q/apache/bigtop) +- [Juju mailing list](https://lists.ubuntu.com/mailman/listinfo/juju) +- [Juju community](https://jujucharms.com/community) http://git-wip-us.apache.org/repos/asf/bigtop/blob/f4d023b4/bigtop-deploy/juju/hadoop-spark/bundle-dev.yaml ---------------------------------------------------------------------- diff --git a/bigtop-deploy/juju/hadoop-spark/bundle-dev.yaml b/bigtop-deploy/juju/hadoop-spark/bundle-dev.yaml new file mode 100644 index 0000000..35623fd --- /dev/null +++ b/bigtop-deploy/juju/hadoop-spark/bundle-dev.yaml @@ -0,0 +1,138 @@ +services: + namenode: + charm: "cs:~bigdata-dev/xenial/hadoop-namenode" + num_units: 1 + annotations: + gui-x: "500" + gui-y: "800" + to: + - "0" + resourcemanager: + charm: "cs:~bigdata-dev/xenial/hadoop-resourcemanager" + num_units: 1 + annotations: + gui-x: "500" + gui-y: "0" + to: + - "0" + slave: + charm: "cs:~bigdata-dev/xenial/hadoop-slave" + num_units: 3 + annotations: + gui-x: "0" + gui-y: "400" + to: + - "1" + - "2" + - "3" + plugin: + charm: "cs:~bigdata-dev/xenial/hadoop-plugin" + annotations: + gui-x: "1000" + gui-y: "400" + client: + charm: "cs:xenial/hadoop-client-2" + num_units: 1 + annotations: + gui-x: "1250" + gui-y: "400" + to: + - "4" + spark: + charm: "cs:~bigdata-dev/xenial/spark" + num_units: 1 + options: + spark_execution_mode: "yarn-client" + annotations: + gui-x: "1000" + gui-y: "0" + to: + - "4" + zookeeper: + charm: "cs:xenial/zookeeper-10" + num_units: 3 + annotations: + gui-x: "500" + gui-y: "400" + to: + - "5" + - "6" + - "7" + ganglia: + charm: "cs:~bigdata-dev/xenial/ganglia-5" + num_units: 
1 + annotations: + gui-x: "0" + gui-y: "800" + to: + - "8" + ganglia-node: + charm: "cs:~bigdata-dev/xenial/ganglia-node-6" + annotations: + gui-x: "250" + gui-y: "400" + rsyslog: + charm: "cs:~bigdata-dev/xenial/rsyslog-6" + num_units: 1 + annotations: + gui-x: "1000" + gui-y: "800" + to: + - "8" + rsyslog-forwarder-ha: + charm: "cs:~bigdata-dev/xenial/rsyslog-forwarder-ha-7" + annotations: + gui-x: "750" + gui-y: "400" +series: xenial +relations: + - [resourcemanager, namenode] + - [namenode, slave] + - [resourcemanager, slave] + - [plugin, namenode] + - [plugin, resourcemanager] + - [client, plugin] + - [spark, plugin] + - [spark, zookeeper] + - ["ganglia-node:juju-info", "client:juju-info"] + - ["ganglia-node:juju-info", "namenode:juju-info"] + - ["ganglia-node:juju-info", "resourcemanager:juju-info"] + - ["ganglia-node:juju-info", "slave:juju-info"] + - ["ganglia-node:juju-info", "spark:juju-info"] + - ["ganglia-node:juju-info", "zookeeper:juju-info"] + - ["ganglia:node", "ganglia-node:node"] + - ["rsyslog-forwarder-ha:juju-info", "client:juju-info"] + - ["rsyslog-forwarder-ha:juju-info", "namenode:juju-info"] + - ["rsyslog-forwarder-ha:juju-info", "resourcemanager:juju-info"] + - ["rsyslog-forwarder-ha:juju-info", "slave:juju-info"] + - ["rsyslog-forwarder-ha:juju-info", "spark:juju-info"] + - ["rsyslog-forwarder-ha:juju-info", "zookeeper:juju-info"] + - ["rsyslog:aggregator", "rsyslog-forwarder-ha:syslog"] +machines: + "0": + constraints: "mem=7G root-disk=32G" + series: "xenial" + "1": + constraints: "mem=7G root-disk=32G" + series: "xenial" + "2": + constraints: "mem=7G root-disk=32G" + series: "xenial" + "3": + constraints: "mem=7G root-disk=32G" + series: "xenial" + "4": + constraints: "mem=7G root-disk=32G" + series: "xenial" + "5": + constraints: "mem=3G root-disk=32G" + series: "xenial" + "6": + constraints: "mem=3G root-disk=32G" + series: "xenial" + "7": + constraints: "mem=3G root-disk=32G" + series: "xenial" + "8": + constraints: "mem=3G" + 
series: "xenial" http://git-wip-us.apache.org/repos/asf/bigtop/blob/f4d023b4/bigtop-deploy/juju/hadoop-spark/bundle-local.yaml ---------------------------------------------------------------------- diff --git a/bigtop-deploy/juju/hadoop-spark/bundle-local.yaml b/bigtop-deploy/juju/hadoop-spark/bundle-local.yaml new file mode 100644 index 0000000..160683a --- /dev/null +++ b/bigtop-deploy/juju/hadoop-spark/bundle-local.yaml @@ -0,0 +1,138 @@ +services: + namenode: + charm: "/home/ubuntu/charms/xenial/hadoop-namenode" + num_units: 1 + annotations: + gui-x: "500" + gui-y: "800" + to: + - "0" + resourcemanager: + charm: "/home/ubuntu/charms/xenial/hadoop-resourcemanager" + num_units: 1 + annotations: + gui-x: "500" + gui-y: "0" + to: + - "0" + slave: + charm: "/home/ubuntu/charms/xenial/hadoop-slave" + num_units: 3 + annotations: + gui-x: "0" + gui-y: "400" + to: + - "1" + - "2" + - "3" + plugin: + charm: "/home/ubuntu/charms/xenial/hadoop-plugin" + annotations: + gui-x: "1000" + gui-y: "400" + client: + charm: "cs:xenial/hadoop-client-2" + num_units: 1 + annotations: + gui-x: "1250" + gui-y: "400" + to: + - "4" + spark: + charm: "/home/ubuntu/charms/xenial/spark" + num_units: 1 + options: + spark_execution_mode: "yarn-client" + annotations: + gui-x: "1000" + gui-y: "0" + to: + - "4" + zookeeper: + charm: "cs:xenial/zookeeper-10" + num_units: 3 + annotations: + gui-x: "500" + gui-y: "400" + to: + - "5" + - "6" + - "7" + ganglia: + charm: "cs:~bigdata-dev/xenial/ganglia-5" + num_units: 1 + annotations: + gui-x: "0" + gui-y: "800" + to: + - "8" + ganglia-node: + charm: "cs:~bigdata-dev/xenial/ganglia-node-6" + annotations: + gui-x: "250" + gui-y: "400" + rsyslog: + charm: "cs:~bigdata-dev/xenial/rsyslog-6" + num_units: 1 + annotations: + gui-x: "1000" + gui-y: "800" + to: + - "8" + rsyslog-forwarder-ha: + charm: "cs:~bigdata-dev/xenial/rsyslog-forwarder-ha-7" + annotations: + gui-x: "750" + gui-y: "400" +series: xenial +relations: + - [resourcemanager, namenode] + - 
[namenode, slave] + - [resourcemanager, slave] + - [plugin, namenode] + - [plugin, resourcemanager] + - [client, plugin] + - [spark, plugin] + - [spark, zookeeper] + - ["ganglia-node:juju-info", "client:juju-info"] + - ["ganglia-node:juju-info", "namenode:juju-info"] + - ["ganglia-node:juju-info", "resourcemanager:juju-info"] + - ["ganglia-node:juju-info", "slave:juju-info"] + - ["ganglia-node:juju-info", "spark:juju-info"] + - ["ganglia-node:juju-info", "zookeeper:juju-info"] + - ["ganglia:node", "ganglia-node:node"] + - ["rsyslog-forwarder-ha:juju-info", "client:juju-info"] + - ["rsyslog-forwarder-ha:juju-info", "namenode:juju-info"] + - ["rsyslog-forwarder-ha:juju-info", "resourcemanager:juju-info"] + - ["rsyslog-forwarder-ha:juju-info", "slave:juju-info"] + - ["rsyslog-forwarder-ha:juju-info", "spark:juju-info"] + - ["rsyslog-forwarder-ha:juju-info", "zookeeper:juju-info"] + - ["rsyslog:aggregator", "rsyslog-forwarder-ha:syslog"] +machines: + "0": + constraints: "mem=7G root-disk=32G" + series: "xenial" + "1": + constraints: "mem=7G root-disk=32G" + series: "xenial" + "2": + constraints: "mem=7G root-disk=32G" + series: "xenial" + "3": + constraints: "mem=7G root-disk=32G" + series: "xenial" + "4": + constraints: "mem=7G root-disk=32G" + series: "xenial" + "5": + constraints: "mem=3G root-disk=32G" + series: "xenial" + "6": + constraints: "mem=3G root-disk=32G" + series: "xenial" + "7": + constraints: "mem=3G root-disk=32G" + series: "xenial" + "8": + constraints: "mem=3G" + series: "xenial" http://git-wip-us.apache.org/repos/asf/bigtop/blob/f4d023b4/bigtop-deploy/juju/hadoop-spark/bundle.yaml ---------------------------------------------------------------------- diff --git a/bigtop-deploy/juju/hadoop-spark/bundle.yaml b/bigtop-deploy/juju/hadoop-spark/bundle.yaml new file mode 100644 index 0000000..67b9bb7 --- /dev/null +++ b/bigtop-deploy/juju/hadoop-spark/bundle.yaml @@ -0,0 +1,138 @@ +services: + namenode: + charm: "cs:xenial/hadoop-namenode-6" + num_units: 
1 + annotations: + gui-x: "500" + gui-y: "800" + to: + - "0" + resourcemanager: + charm: "cs:xenial/hadoop-resourcemanager-6" + num_units: 1 + annotations: + gui-x: "500" + gui-y: "0" + to: + - "0" + slave: + charm: "cs:xenial/hadoop-slave-6" + num_units: 3 + annotations: + gui-x: "0" + gui-y: "400" + to: + - "1" + - "2" + - "3" + plugin: + charm: "cs:xenial/hadoop-plugin-6" + annotations: + gui-x: "1000" + gui-y: "400" + client: + charm: "cs:xenial/hadoop-client-2" + num_units: 1 + annotations: + gui-x: "1250" + gui-y: "400" + to: + - "4" + spark: + charm: "cs:xenial/spark-15" + num_units: 1 + options: + spark_execution_mode: "yarn-client" + annotations: + gui-x: "1000" + gui-y: "0" + to: + - "4" + zookeeper: + charm: "cs:xenial/zookeeper-10" + num_units: 3 + annotations: + gui-x: "500" + gui-y: "400" + to: + - "5" + - "6" + - "7" + ganglia: + charm: "cs:~bigdata-dev/xenial/ganglia-5" + num_units: 1 + annotations: + gui-x: "0" + gui-y: "800" + to: + - "8" + ganglia-node: + charm: "cs:~bigdata-dev/xenial/ganglia-node-6" + annotations: + gui-x: "250" + gui-y: "400" + rsyslog: + charm: "cs:~bigdata-dev/xenial/rsyslog-6" + num_units: 1 + annotations: + gui-x: "1000" + gui-y: "800" + to: + - "8" + rsyslog-forwarder-ha: + charm: "cs:~bigdata-dev/xenial/rsyslog-forwarder-ha-7" + annotations: + gui-x: "750" + gui-y: "400" +series: xenial +relations: + - [resourcemanager, namenode] + - [namenode, slave] + - [resourcemanager, slave] + - [plugin, namenode] + - [plugin, resourcemanager] + - [client, plugin] + - [spark, plugin] + - [spark, zookeeper] + - ["ganglia-node:juju-info", "client:juju-info"] + - ["ganglia-node:juju-info", "namenode:juju-info"] + - ["ganglia-node:juju-info", "resourcemanager:juju-info"] + - ["ganglia-node:juju-info", "slave:juju-info"] + - ["ganglia-node:juju-info", "spark:juju-info"] + - ["ganglia-node:juju-info", "zookeeper:juju-info"] + - ["ganglia:node", "ganglia-node:node"] + - ["rsyslog-forwarder-ha:juju-info", "client:juju-info"] + - 
["rsyslog-forwarder-ha:juju-info", "namenode:juju-info"] + - ["rsyslog-forwarder-ha:juju-info", "resourcemanager:juju-info"] + - ["rsyslog-forwarder-ha:juju-info", "slave:juju-info"] + - ["rsyslog-forwarder-ha:juju-info", "spark:juju-info"] + - ["rsyslog-forwarder-ha:juju-info", "zookeeper:juju-info"] + - ["rsyslog:aggregator", "rsyslog-forwarder-ha:syslog"] +machines: + "0": + constraints: "mem=7G root-disk=32G" + series: "xenial" + "1": + constraints: "mem=7G root-disk=32G" + series: "xenial" + "2": + constraints: "mem=7G root-disk=32G" + series: "xenial" + "3": + constraints: "mem=7G root-disk=32G" + series: "xenial" + "4": + constraints: "mem=7G root-disk=32G" + series: "xenial" + "5": + constraints: "mem=3G root-disk=32G" + series: "xenial" + "6": + constraints: "mem=3G root-disk=32G" + series: "xenial" + "7": + constraints: "mem=3G root-disk=32G" + series: "xenial" + "8": + constraints: "mem=3G" + series: "xenial" http://git-wip-us.apache.org/repos/asf/bigtop/blob/f4d023b4/bigtop-deploy/juju/hadoop-spark/copyright ---------------------------------------------------------------------- diff --git a/bigtop-deploy/juju/hadoop-spark/copyright b/bigtop-deploy/juju/hadoop-spark/copyright new file mode 100644 index 0000000..e900b97 --- /dev/null +++ b/bigtop-deploy/juju/hadoop-spark/copyright @@ -0,0 +1,16 @@ +Format: http://dep.debian.net/deps/dep5/ + +Files: * +Copyright: Copyright 2015, Canonical Ltd., All Rights Reserved. +License: Apache License 2.0 + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + . + http://www.apache.org/licenses/LICENSE-2.0 + . + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ See the License for the specific language governing permissions and + limitations under the License. http://git-wip-us.apache.org/repos/asf/bigtop/blob/f4d023b4/bigtop-deploy/juju/hadoop-spark/tests/01-bundle.py ---------------------------------------------------------------------- diff --git a/bigtop-deploy/juju/hadoop-spark/tests/01-bundle.py b/bigtop-deploy/juju/hadoop-spark/tests/01-bundle.py new file mode 100755 index 0000000..ba292bc --- /dev/null +++ b/bigtop-deploy/juju/hadoop-spark/tests/01-bundle.py @@ -0,0 +1,137 @@ +#!/usr/bin/env python3 + +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import amulet +import os +import re +import unittest +import yaml + + +class TestBundle(unittest.TestCase): + bundle_file = os.path.join(os.path.dirname(__file__), '..', 'bundle.yaml') + + @classmethod + def setUpClass(cls): + # classmethod inheritance doesn't work quite right with + # setUpClass / tearDownClass, so subclasses have to manually call this + cls.d = amulet.Deployment(series='xenial') + with open(cls.bundle_file) as f: + bun = f.read() + bundle = yaml.safe_load(bun) + + # NB: strip machine ('to') placement out. 
amulet loses our machine spec + # somewhere between yaml and json; without that spec, charms specifying + # machine placement will not deploy. This is ok for now because all + # charms in this bundle are using 'reset: false' so we'll already + # have our deployment just the way we want it by the time this test + # runs. However, it's bad. Remove once this is fixed: + # https://github.com/juju/amulet/issues/148 + for service, service_config in bundle['services'].items(): + if 'to' in service_config: + del service_config['to'] + + cls.d.load(bundle) + cls.d.setup(timeout=3600) + + # we need units reporting ready before we attempt our smoke tests + cls.d.sentry.wait_for_messages({'client': re.compile('ready'), + 'namenode': re.compile('ready'), + 'resourcemanager': re.compile('ready'), + 'slave': re.compile('ready'), + 'spark': re.compile('ready'), + }, timeout=3600) + cls.hdfs = cls.d.sentry['namenode'][0] + cls.yarn = cls.d.sentry['resourcemanager'][0] + cls.slave = cls.d.sentry['slave'][0] + cls.spark = cls.d.sentry['spark'][0] + + def test_components(self): + """ + Confirm that all of the required components are up and running. 
+ """ + hdfs, retcode = self.hdfs.run("pgrep -a java") + yarn, retcode = self.yarn.run("pgrep -a java") + slave, retcode = self.slave.run("pgrep -a java") + spark, retcode = self.spark.run("pgrep -a java") + + assert 'NameNode' in hdfs, "NameNode not started" + assert 'NameNode' not in slave, "NameNode should not be running on slave" + + assert 'ResourceManager' in yarn, "ResourceManager not started" + assert 'ResourceManager' not in slave, "ResourceManager should not be running on slave" + + assert 'JobHistoryServer' in yarn, "JobHistoryServer not started" + assert 'JobHistoryServer' not in slave, "JobHistoryServer should not be running on slave" + + assert 'NodeManager' in slave, "NodeManager not started" + assert 'NodeManager' not in yarn, "NodeManager should not be running on resourcemanager" + assert 'NodeManager' not in hdfs, "NodeManager should not be running on namenode" + + assert 'DataNode' in slave, "DataNode not started" + assert 'DataNode' not in yarn, "DataNode should not be running on resourcemanager" + assert 'DataNode' not in hdfs, "DataNode should not be running on namenode" + + assert 'Master' in spark, "Spark Master not started" + + def test_hdfs(self): + """ + Validates mkdir, ls, chmod, and rm HDFS operations. + """ + uuid = self.hdfs.run_action('smoke-test') + result = self.d.action_fetch(uuid, timeout=600, full_output=True) + # action status=completed on success + if (result['status'] != "completed"): + self.fail('HDFS smoke-test did not complete: %s' % result) + + def test_yarn(self): + """ + Validates YARN using the Bigtop 'yarn' smoke test. + """ + uuid = self.yarn.run_action('smoke-test') + # 'yarn' smoke takes a while (bigtop tests download lots of stuff) + result = self.d.action_fetch(uuid, timeout=1800, full_output=True) + # action status=completed on success + if (result['status'] != "completed"): + self.fail('YARN smoke-test did not complete: %s' % result) + + def test_spark(self): + """ + Validates Spark with a simple sparkpi test. 
+ """ + uuid = self.spark.run_action('smoke-test') + result = self.d.action_fetch(uuid, timeout=600, full_output=True) + # action status=completed on success + if (result['status'] != "completed"): + self.fail('Spark smoke-test did not complete: %s' % result) + + @unittest.skip( + 'Skipping slave smoke tests; they are too inconsistent and long running for CWR.') + def test_slave(self): + """ + Validates slave using the Bigtop 'hdfs' and 'mapred' smoke test. + """ + uuid = self.slave.run_action('smoke-test') + # 'hdfs+mapred' smoke takes a long while (bigtop tests are slow) + result = self.d.action_fetch(uuid, timeout=3600, full_output=True) + # action status=completed on success + if (result['status'] != "completed"): + self.fail('Slave smoke-test did not complete: %s' % result) + + +if __name__ == '__main__': + unittest.main() http://git-wip-us.apache.org/repos/asf/bigtop/blob/f4d023b4/bigtop-deploy/juju/hadoop-spark/tests/tests.yaml ---------------------------------------------------------------------- diff --git a/bigtop-deploy/juju/hadoop-spark/tests/tests.yaml b/bigtop-deploy/juju/hadoop-spark/tests/tests.yaml new file mode 100644 index 0000000..c9325b0 --- /dev/null +++ b/bigtop-deploy/juju/hadoop-spark/tests/tests.yaml @@ -0,0 +1,7 @@ +reset: false +deployment_timeout: 7200 +sources: + - 'ppa:juju/stable' +packages: + - amulet + - python3-yaml
