Repository: incubator-toree-website
Updated Branches:
  refs/heads/OverhaulSite 0b721c876 -> 9b329ef1f

Added Installation, Quick Start, and FAQ pages

Project: http://git-wip-us.apache.org/repos/asf/incubator-toree-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-toree-website/commit/9b329ef1
Tree: http://git-wip-us.apache.org/repos/asf/incubator-toree-website/tree/9b329ef1
Diff: http://git-wip-us.apache.org/repos/asf/incubator-toree-website/diff/9b329ef1

Branch: refs/heads/OverhaulSite
Commit: 9b329ef1fa052a4624b79ccb5d84f09a985a0a88
Parents: 0b721c8
Author: Corey A. Stubbs <cstu...@us.ibm.com>
Authored: Mon Jun 13 13:39:45 2016 -0500
Committer: Corey A. Stubbs <cstu...@us.ibm.com>
Committed: Mon Jun 13 13:39:45 2016 -0500

----------------------------------------------------------------------
 assets/image/toree-quick-start-notebook.gif     | Bin 0 -> 280479 bytes
 assets/image/toree-quick-start-spark.gif        | Bin 0 -> 129767 bytes
 documentation/user/faq.md                       |  38 +++++-
 documentation/user/installation.md              | 127 ++++++++++++++++++-
 documentation/user/quick-start.md               |  53 +++++++-
 .../user/using-with-jupyter-notebooks.md        |   5 +-
 6 files changed, 206 insertions(+), 17 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-toree-website/blob/9b329ef1/assets/image/toree-quick-start-notebook.gif
----------------------------------------------------------------------
diff --git a/assets/image/toree-quick-start-notebook.gif b/assets/image/toree-quick-start-notebook.gif
new file mode 100644
index 0000000..c842614
Binary files /dev/null and b/assets/image/toree-quick-start-notebook.gif differ

http://git-wip-us.apache.org/repos/asf/incubator-toree-website/blob/9b329ef1/assets/image/toree-quick-start-spark.gif
----------------------------------------------------------------------
diff --git a/assets/image/toree-quick-start-spark.gif b/assets/image/toree-quick-start-spark.gif
new file mode 100644
index 0000000..ed44c40
Binary files /dev/null and b/assets/image/toree-quick-start-spark.gif differ

http://git-wip-us.apache.org/repos/asf/incubator-toree-website/blob/9b329ef1/documentation/user/faq.md
----------------------------------------------------------------------
diff --git a/documentation/user/faq.md b/documentation/user/faq.md
index 230fc1e..bf464dd 100644
--- a/documentation/user/faq.md
+++ b/documentation/user/faq.md
@@ -9,6 +9,40 @@ tagline: Apache Project !
 {% include JB/setup %}
-- How to visualize data
-- How to create dashboards with interactive widgets
+# FAQ
+## How do I access Apache Spark?
+
+You can access Spark through a `SparkContext` which is created by Apache Toree when the kernel starts. You can access
+the context through the `sc` variable.
+
+## How do I add a jar?
+Jars are added through the `AddJar` magic. You simply need to supply a URL for the jar to be added.
+
+```
+%AddJar http://myproject.com/myproject/my.jar
+```
+
+For more information about the `AddJar` magic see the [Magic Tutorial Notebook][1].
+
+## How do I add a library/dependency?
+
+Dependencies stored in repositories can be added through the `AddDeps` magic. An example usage would be:
+
+```
+%AddDeps my.company artifact-id version
+```
+
+If the dependency you are trying to add has transitive dependencies, you can add the `--transitive` flag to add those dependencies as well.
+For more information about the `AddDeps` magic see the [Magic Tutorial Notebook][1].
+
+## How do I visualize data?
+The most straightforward way to add data visualization with Apache Toree is through the [Jupyter Declarative Widgets][2] project.
+
+## How do I create dashboards with interactive widgets?
+Notebooks can be changed into dashboards through the [Jupyter Dashboards][3] project. This project allows you to use
+[Jupyter Declarative Widgets][2] in your dashboards.
+
+[1]: https://github.com/apache/incubator-toree/blob/master/etc/examples/notebooks/magic-tutorial.ipynb
+[2]: https://github.com/jupyter-incubator/declarativewidgets
+[3]: https://github.com/jupyter-incubator/dashboards
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-toree-website/blob/9b329ef1/documentation/user/installation.md
----------------------------------------------------------------------
diff --git a/documentation/user/installation.md b/documentation/user/installation.md
index eebdce3..5a9096c 100644
--- a/documentation/user/installation.md
+++ b/documentation/user/installation.md
@@ -9,10 +9,125 @@ tagline: Apache Project !
 {% include JB/setup %}
-- Explain the different options
-- `SPARK_OPTS`
-- CMD line options for Toree
-- Docker options (all-spark-notebook)
-- Multiple kernels with default languages
-- Requirement (based on language)
+# Installation
+
+## Setup
+
+An Apache Spark distribution must be installed before installing Apache Toree. You can download a copy of Apache Spark [here](http://spark.apache.org/downloads.html). Throughout the rest of this guide we will assume you have downloaded and extracted the Apache Spark distribution to `/usr/local/bin/apache-spark/`.
+
+## Installing Toree via Pip
+
+The quickest way to install Apache Toree is through the toree pip package.
+
+```
+pip install toree
+```
+
+This will install a jupyter application called `toree`, which can be used to install and configure different Apache Toree kernels.
+
+```
+jupyter toree install --spark_home=/usr/local/bin/apache-spark/
+```
+
+You can confirm the installation by verifying the `apache_toree_scala` kernel is listed in the output of the following command:
+
+```
+jupyter kernelspec list
+```
+
+## Options
+Arguments that take values are actually convenience aliases to full
+Configurables, whose aliases are listed on the help line. For more information
+on full configurables, see '--help-all'.
+
+```
+--user
+    Install to the per-user kernel registry
+--debug
+    set log level to logging.DEBUG (maximize logging output)
+--replace
+    Replace any existing kernel spec with this name.
+--sys-prefix
+    Install to Python's sys.prefix. Useful in conda/virtual environments.
+--interpreters=<Unicode> (ToreeInstall.interpreters)
+    Default: 'Scala'
+    A comma separated list of the interpreters to install. The names of the
+    interpreters are case sensitive.
+--toree_opts=<Unicode> (ToreeInstall.toree_opts)
+    Default: ''
+    Specify command line arguments for Apache Toree.
+--python_exec=<Unicode> (ToreeInstall.python_exec)
+    Default: 'python'
+    Specify the python executable. Defaults to "python"
+--kernel_name=<Unicode> (ToreeInstall.kernel_name)
+    Default: 'Apache Toree'
+    Install the kernel spec with this name. This is also used as the base of the
+    display name in jupyter.
+--log-level=<Enum> (Application.log_level)
+    Default: 30
+    Choices: (0, 10, 20, 30, 40, 50, 'DEBUG', 'INFO', 'WARN', 'ERROR', 'CRITICAL')
+    Set the log level by value or name.
+--config=<Unicode> (JupyterApp.config_file)
+    Default: ''
+    Full path of a config file.
+--spark_home=<Unicode> (ToreeInstall.spark_home)
+    Default: '/usr/local/spark'
+    Specify where the spark files can be found.
+--spark_opts=<Unicode> (ToreeInstall.spark_opts)
+    Default: ''
+    Specify command line arguments to proxy for spark config.
+```
+
+# Configuring Spark
+
+There are two ways to set configuration options for Spark.
+
+The first is at install time with the `--spark_opts` command line option.
+
+```
+jupyter toree install --spark_opts='--master=local[4]'
+```
+
+The second option is configured at run time through the `SPARK_OPTS` environment variable.
+
+```
+SPARK_OPTS='--master=local[4]' jupyter notebook
+```
+
+__Note:__ There is an order of precedence to the configuration options. `SPARK_OPTS` overrides any values configured with `--spark_opts`.
+
+
+## Installing Multiple Kernels
+
+Apache Toree provides support for multiple languages. To enable this, pass the interpreters you want to install
+as a comma-separated list to the `--interpreters` flag:
+
+```
+jupyter toree install --interpreters=Scala,PySpark,SparkR,SQL
+```
+
+The available interpreters and their supported languages are:
+
+| Language | Spark Implementation | Value to provide to Apache Toree |
+|----------|----------------------|----------------------------------|
+| Scala | Scala with Spark | Scala |
+| Python | Python with PySpark | PySpark |
+| R | R with SparkR | SparkR |
+| SQL | Spark SQL | SQL |
+
+### Interpreter Requirements
+* R version 3.2+
+* Make sure that the packages directory used by R when installing packages is writable. This is necessary to install the modified SparkR library, which is done automatically before any R code is run.
+
+If the package directory is not writable by Apache Toree, then you should see an error similar to the following:
+
+```
+Installing package into ‘/usr/local/lib/R/site-library’
+(as ‘lib’ is unspecified)
+Warning in install.packages("sparkr_bundle.tar.gz", repos = NULL, type = "source") :
+'lib = "/usr/local/lib/R/site-library"' is not writable
+Error in install.packages("sparkr_bundle.tar.gz", repos = NULL, type = "source") :
+unable to install packages
+Execution halted
+```

http://git-wip-us.apache.org/repos/asf/incubator-toree-website/blob/9b329ef1/documentation/user/quick-start.md
----------------------------------------------------------------------
diff --git a/documentation/user/quick-start.md b/documentation/user/quick-start.md
index 7e69796..2560e58 100644
--- a/documentation/user/quick-start.md
+++ b/documentation/user/quick-start.md
@@ -9,10 +9,53 @@ tagline: Apache Project !
 {% include JB/setup %}
-- What is Toree
-- Installing as kernel in Jupyter
-- Pick your language
-- Your Hello World example
-- Where to try Toree?
+# Quick Start
+## What is Apache Toree
+Apache Toree has one main goal: provide the foundation for interactive applications to connect to and use [Apache Spark][1].
+The project intends to provide applications with the ability to send both packaged jars and code snippets. As it
+implements the latest Jupyter message protocol, Apache Toree can easily plug into the Jupyter ecosystem for quick, interactive data exploration.
+
+## Installing as kernel in Jupyter
+
+This requires you to have a distribution of [Apache Spark][1] downloaded to the system where Apache Toree will run. The
+following commands will install Apache Toree.
+
+```
+pip install toree
+jupyter toree install --spark_home=/usr/local/bin/apache-spark/
+```
+
+## Your Hello World example
+
+One of the most common ways to use Apache Toree is for interactive data exploration in a Jupyter Notebook. You will
+first need to install the notebook and get the notebook server running:
+
+```
+pip install notebook
+jupyter notebook
+```
+
+The following clip shows a simple notebook running Scala code to print `Hello, World!`. Each of the code cells can be
+run by pressing `Shift-Enter` on your keyboard.
+
+<img src="/assets/image/toree-quick-start-notebook.gif" alt="Drawing" style="width: 100%;"/>
+
+A key feature of Apache Toree is that it will automatically create a `SparkContext` binding for you. This can be accessed
+through the variable `sc`. The following clip shows code accessing the `SparkContext` and returning a value.
+
+<img src="/assets/image/toree-quick-start-spark.gif" alt="Drawing" style="width: 100%;"/>
+
+
+## Where to try Apache Toree?
+* [![Binder](http://mybinder.org/badge.svg)][2]
+* [Try Jupyter][3] (_Spark With Scala Notebook_)
+* [IBM Bluemix][4]
+
+
+
+[1]: https://spark.apache.org/
+[2]: http://mybinder.org/repo/apache/incubator-toree
+[3]: http://try.jupyter.org
+[4]: https://console.ng.bluemix.net/catalog/services/apache-spark
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-toree-website/blob/9b329ef1/documentation/user/using-with-jupyter-notebooks.md
----------------------------------------------------------------------
diff --git a/documentation/user/using-with-jupyter-notebooks.md b/documentation/user/using-with-jupyter-notebooks.md
index 2957e5f..307e517 100644
--- a/documentation/user/using-with-jupyter-notebooks.md
+++ b/documentation/user/using-with-jupyter-notebooks.md
@@ -12,8 +12,5 @@ tagline: Apache Project !
 - Create a notebook with Toree
 - Intro to magics
 - Intro to kernel API
-- How Tos
-  - How to add a jar
-  - How to add a library
-  - How to access Apache Spark
+