This is an automated email from the ASF dual-hosted git repository.

baunsgaard pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemds.git

commit ffffc59bc5febb96bd523134dbc75817e15876f4
Author: baunsgaard <[email protected]>
AuthorDate: Mon Jun 7 09:50:03 2021 +0200

    [MINOR] Remove notebooks
---
 notebooks/databricks/MLContext.scala | 205 ------------
 notebooks/databricks/README.md       |   9 -
 notebooks/systemds_dev.ipynb         | 582 -----------------------------------
 3 files changed, 796 deletions(-)

diff --git a/notebooks/databricks/MLContext.scala b/notebooks/databricks/MLContext.scala
deleted file mode 100644
index 55b6536..0000000
--- a/notebooks/databricks/MLContext.scala
+++ /dev/null
@@ -1,205 +0,0 @@
-// Databricks notebook source
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *   http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing,
- * software distributed under the License is distributed on an
- * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- * KIND, either express or implied.  See the License for the
- * specific language governing permissions and limitations
- * under the License.
- */
-
-
-// COMMAND ----------
-
-// MAGIC %md # Apache SystemDS on Databricks
-
-// COMMAND ----------
-
-// MAGIC %md ## Create a quickstart cluster
-// MAGIC 
-// MAGIC 1. In the sidebar, right-click the **Clusters** button and open the link in a new window.
-// MAGIC 1. On the Clusters page, click **Create Cluster**.
-// MAGIC 1. Name the cluster **Quickstart**.
-// MAGIC 1. In the Databricks Runtime Version drop-down, select **6.4 (Scala 2.11, Spark 2.4.5)**.
-// MAGIC 1. Click **Create Cluster**.
-// MAGIC 1. Attach `SystemDS.jar` file to the libraries
-
-// COMMAND ----------
-
-// MAGIC %md ## Attach the notebook to the cluster and run all commands in the notebook
-// MAGIC 
-// MAGIC 1. Return to this notebook. 
-// MAGIC 1. In the notebook menu bar, select **<img src="http://docs.databricks.com/_static/images/notebooks/detached.png"/></a> > Quickstart**.
-// MAGIC 1. When the cluster changes from <img src="http://docs.databricks.com/_static/images/clusters/cluster-starting.png"/></a> to <img src="http://docs.databricks.com/_static/images/clusters/cluster-running.png"/></a>, click **<img src="http://docs.databricks.com/_static/images/notebooks/run-all.png"/></a> Run All**.
-
-// COMMAND ----------
-
-// MAGIC %md ## Load SystemDS MLContext API
-
-// COMMAND ----------
-
-import org.apache.sysds.api.mlcontext._
-import org.apache.sysds.api.mlcontext.ScriptFactory._
-val ml = new MLContext(spark)
-
-// COMMAND ----------
-
-val habermanUrl = "http://archive.ics.uci.edu/ml/machine-learning-databases/haberman/haberman.data"
-val habermanList = scala.io.Source.fromURL(habermanUrl).mkString.split("\n")
-val habermanRDD = sc.parallelize(habermanList)
-val habermanMetadata = new MatrixMetadata(306, 4)
-val typesRDD = sc.parallelize(Array("1.0,1.0,1.0,2.0"))
-val typesMetadata = new MatrixMetadata(1, 4)
-val scriptUrl = "https://raw.githubusercontent.com/apache/systemds/master/scripts/algorithms/Univar-Stats.dml"
-val uni = dmlFromUrl(scriptUrl).in("A", habermanRDD, habermanMetadata).in("K", typesRDD, typesMetadata).in("$CONSOLE_OUTPUT", true)
-ml.execute(uni)
-
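For reference, outputs registered with .out(...) can be read back from the returned MLResults through the same API; a minimal sketch, with an illustrative script and names that are not from the notebook:

    val stats = dml("s = sum(A)").in("A", habermanRDD, habermanMetadata).out("s")
    val s = ml.execute(stats).getScalarObject("s").getDoubleValue()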
-// COMMAND ----------
-
-// MAGIC %md ### Create a neural network layer with (R-like) DML language
-
-// COMMAND ----------
-
-val s = """
-  source("scripts/nn/layers/relu.dml") as relu;
-  X = rand(rows=100, cols=10, min=-1, max=1);
-  R1 = relu::forward(X);
-  R2 = max(X, 0);
-  R = sum(R1==R2);
-  """
-
-val ret = ml.execute(dml(s).out("R")).getScalarObject("R").getDoubleValue();
-
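Since X is 100x10, R counts matching entries and should evaluate to 1000.0 whenever relu::forward(X) agrees with the elementwise max(X, 0).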
-// COMMAND ----------
-
-// MAGIC %md ### Recommendation with Amazon review dataset
-
-// COMMAND ----------
-
-import java.net.URL
-import java.io.File
-import org.apache.commons.io.FileUtils
-
-FileUtils.copyURLToFile(new URL("http://snap.stanford.edu/data/amazon0601.txt.gz"), new File("/tmp/amazon0601.txt.gz"))
-
-// COMMAND ----------
-
-// MAGIC %sh
-// MAGIC gunzip -d /tmp/amazon0601.txt.gz
-
-// COMMAND ----------
-
-// To list files on the file system; for more, see https://docs.databricks.com/data/filestore.html
-// File system: display(dbutils.fs.ls("file:/tmp"))
-// DBFS: display(dbutils.fs.ls("."))
-
-dbutils.fs.mv("file:/tmp/amazon0601.txt", "dbfs:/tmp/amazon0601.txt")
-
-// COMMAND ----------
-
-display(dbutils.fs.ls("/tmp"))
-// display(dbutils.fs.ls("file:/tmp"))
-
-// COMMAND ----------
-
-// move temporary files to databricks file system (DBFS)
-// dbutils.fs.mv("file:/databricks/driver/amazon0601.txt", 
"dbfs:/tmp/amazon0601.txt") 
-val df = spark.read.format("text").option("inferSchema", "true").option("header","true").load("dbfs:/tmp/amazon0601.txt")
-display(df)
-
-// COMMAND ----------
-
-// MAGIC %py
-// MAGIC 
-// MAGIC # The scala data processing pipeline can also be
-// MAGIC # implemented in python as shown in this block
-// MAGIC 
-// MAGIC # 
-// MAGIC # import pyspark.sql.functions as F
-// MAGIC # # https://spark.apache.org/docs/latest/sql-ref.html
-// MAGIC 
-// MAGIC # dataPath = "dbfs:/tmp/amazon0601.txt"
-// MAGIC 
-// MAGIC # X_train = (sc.textFile(dataPath)
-// MAGIC #     .filter(lambda l: not l.startswith("#"))
-// MAGIC #     .map(lambda l: l.split("\t"))
-// MAGIC #     .map(lambda prods: (int(prods[0]), int(prods[1]), 1.0))
-// MAGIC #     .toDF(("prod_i", "prod_j", "x_ij"))
-// MAGIC #     .filter("prod_i < 500 AND prod_j < 500") # Filter for memory constraints
-// MAGIC #     .cache())
-// MAGIC 
-// MAGIC # max_prod_i = X_train.select(F.max("prod_i")).first()[0]
-// MAGIC # max_prod_j = X_train.select(F.max("prod_j")).first()[0]
-// MAGIC # numProducts = max(max_prod_i, max_prod_j) + 1 # 0-based indexing
-// MAGIC # print("Total number of products: {}".format(numProducts))
-
-// COMMAND ----------
-
-// Reference: https://spark.apache.org/docs/latest/rdd-programming-guide.html
-val X_train = (sc.textFile("dbfs:/tmp/amazon0601.txt").filter(l => !(l.startsWith("#"))).map(l => l.split("\t"))
-                  .map(prods => (prods(0).toLong, prods(1).toLong, 1.0))
-                  .toDF("prod_i", "prod_j", "x_ij")
-                  .filter("prod_i < 500 AND prod_j < 500") // filter for memory constraints
-                  .cache())
-
-display(X_train)
-
-// COMMAND ----------
-
-// MAGIC %md #### Poisson Nonnegative Matrix Factorization
-
-// COMMAND ----------
-
-// Poisson Nonnegative Matrix Factorization
-
-val pnmf = """
-# data & args
-X = X+1 # change product IDs to be 1-based, rather than 0-based
-V = table(X[,1], X[,2])
-size = ifdef($size, -1)
-if(size > -1) {
-    V = V[1:size,1:size]
-}
-
-n = nrow(V)
-m = ncol(V)
-range = 0.01
-W = Rand(rows=n, cols=rank, min=0, max=range, pdf="uniform")
-H = Rand(rows=rank, cols=m, min=0, max=range, pdf="uniform")
-losses = matrix(0, rows=max_iter, cols=1)
-
-# run PNMF
-i=1
-while(i <= max_iter) {
-  # update params
-  H = (H * (t(W) %*% (V/(W%*%H))))/t(colSums(W)) 
-  W = (W * ((V/(W%*%H)) %*% t(H)))/t(rowSums(H))
-  
-  # compute loss
-  losses[i,] = -1 * (sum(V*log(W%*%H)) - as.scalar(colSums(W)%*%rowSums(H)))
-  i = i + 1;
-}
-  """
-
-val ret = ml.execute(dml(pnmf).in("X", X_train).in("max_iter", 100).in("rank", 10).out("W").out("H").out("losses"));
-
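The losses line computes the Poisson negative log-likelihood of V under the model W %*% H, up to the constant sum(log(V!)): loss = sum(W%*%H) - sum(V*log(W%*%H)). It uses the identity sum(W%*%H) = colSums(W) %*% rowSums(H) (summing each factor along the shared rank dimension), which avoids materializing W%*%H yet again per iteration; hence the term as.scalar(colSums(W)%*%rowSums(H)).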
-// COMMAND ----------
-
-val W = ret.getMatrix("W")
-val H = ret.getMatrix("H")
-val losses = ret.getMatrix("losses")
-
-// COMMAND ----------
-
-val lossesDF = losses.toDF().sort("__INDEX")
-display(lossesDF)
diff --git a/notebooks/databricks/README.md b/notebooks/databricks/README.md
deleted file mode 100644
index f4ce275..0000000
--- a/notebooks/databricks/README.md
+++ /dev/null
@@ -1,9 +0,0 @@
-#### Setup Apache SystemDS on Databricks platform
-
-1. Create a new account at [databricks cloud](https://community.cloud.databricks.com/)
-2. In left-side navbar select **Clusters** > **`+ Create Cluster`** > Name the cluster! > **`Create Cluster`**
-3. Navigate to the created cluster configuration.
-    1. Select **Libraries**
-    2. Select **Install New** > **Library Source [`Upload`]** and **Library Type [`Jar`]**
-    3. Upload the `SystemDS.jar` file! > **`Install`**
-4. Attach a notebook to the cluster above.
diff --git a/notebooks/systemds_dev.ipynb b/notebooks/systemds_dev.ipynb
deleted file mode 100644
index 9ba218f..0000000
--- a/notebooks/systemds_dev.ipynb
+++ /dev/null
@@ -1,582 +0,0 @@
-{
-  "nbformat": 4,
-  "nbformat_minor": 0,
-  "metadata": {
-    "colab": {
-      "name": "SystemDS on Colaboratory.ipynb",
-      "provenance": [],
-      "collapsed_sections": [],
-      "toc_visible": true,
-      "include_colab_link": true
-    },
-    "kernelspec": {
-      "name": "python3",
-      "display_name": "Python 3"
-    }
-  },
-  "cells": [
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "XX60cA7YuZsw"
-      },
-      "source": [
-        "##### Copyright &copy; 2020 The Apache Software Foundation."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "8GEGDZ9GuZGp",
-        "cellView": "form"
-      },
-      "source": [
-        "# @title Apache Version 2.0 (The \"License\");\n",
-        "#-------------------------------------------------------------\n",
-        "#\n",
-        "# Licensed to the Apache Software Foundation (ASF) under one\n",
-        "# or more contributor license agreements.  See the NOTICE file\n",
-        "# distributed with this work for additional information\n",
-        "# regarding copyright ownership.  The ASF licenses this file\n",
-        "# to you under the Apache License, Version 2.0 (the\n",
-        "# \"License\"); you may not use this file except in compliance\n",
-        "# with the License.  You may obtain a copy of the License at\n",
-        "#\n",
-        "#   http://www.apache.org/licenses/LICENSE-2.0\n";,
-        "#\n",
-        "# Unless required by applicable law or agreed to in writing,\n",
-        "# software distributed under the License is distributed on an\n",
-        "# \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\n",
-        "# KIND, either express or implied.  See the License for the\n",
-        "# specific language governing permissions and limitations\n",
-        "# under the License.\n",
-        "#\n",
-        "#-------------------------------------------------------------"
-      ],
-      "execution_count": null,
-      "outputs": []
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "_BbCdLjRoy2A"
-      },
-      "source": [
-        "### Developer notebook for Apache SystemDS"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "zhdfvxkEq1BX"
-      },
-      "source": [
-        "Run this notebook online at [Google Colab 
↗](https://colab.research.google.com/github/apache/systemds/blob/master/notebooks/systemds_dev.ipynb).\n",
-        "\n",
-        "\n"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "efFVuggts1hr"
-      },
-      "source": [
-        "This Jupyter/Colab-based tutorial will interactively walk through 
development setup and running SystemDS in both the\n",
-        "\n",
-        "A. standalone mode \\\n",
-        "B. with Apache Spark.\n",
-        "\n",
-        "Flow of the notebook:\n",
-        "1. Download and Install the dependencies\n",
-        "2. Go to section **A** or **B**"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "vBC5JPhkGbIV"
-      },
-      "source": [
-        "#### Download and Install the dependencies\n",
-        "\n",
-        "1. **Runtime:** Java (OpenJDK 8 is preferred)\n",
-        "2. **Build:** Apache Maven\n",
-        "3. **Backend:** Apache Spark (optional)"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "VkLasseNylPO"
-      },
-      "source": [
-        "##### Setup\n",
-        "\n",
-        "A custom function to run OS commands."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "4Wmf-7jfydVH"
-      },
-      "source": [
-        "# Run and print a shell command.\n",
-        "def run(command):\n",
-        "  print('>> {}'.format(command))\n",
-        "  !{command}\n",
-        "  print('')"
-      ],
-      "execution_count": null,
-      "outputs": []
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "kvD4HBMi0ohY"
-      },
-      "source": [
-        "##### Install Java\n",
-        "Let us install OpenJDK 8. More about [OpenJDK 
↗](https://openjdk.java.net/install/)."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "8Xnb_ePUyQIL"
-      },
-      "source": [
-        "!apt-get update\n",
-        "!apt-get install openjdk-8-jdk-headless -qq > /dev/null\n",
-        "\n",
-        "# run the below command to replace the existing installation\n",
-        "!update-alternatives --set java 
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java\n",
-        "\n",
-        "import os\n",
-        "os.environ[\"JAVA_HOME\"] = \"/usr/lib/jvm/java-8-openjdk-amd64\"\n",
-        "\n",
-        "!java -version"
-      ],
-      "execution_count": null,
-      "outputs": []
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "BhmBWf3u3Q0o"
-      },
-      "source": [
-        "##### Install Apache Maven\n",
-        "\n",
-        "SystemDS uses Apache Maven to build and manage the project. More 
about [Apache Maven ↗](http://maven.apache.org/).\n",
-        "\n",
-        "Maven builds SystemDS using its project object model (POM) and a set 
of plugins. One would find `pom.xml` find the codebase!"
-      ]
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "I81zPDcblchL"
-      },
-      "source": [
-        "# Download the maven source.\n",
-        "maven_version = 'apache-maven-3.6.3'\n",
-        "maven_path = f\"/opt/{maven_version}\"\n",
-        "\n",
-        "if not os.path.exists(maven_path):\n",
-        "  run(f\"wget -q -nc -O apache-maven.zip 
https://downloads.apache.org/maven/maven-3/3.6.3/binaries/{maven_version}-bin.zip\";)\n",
-        "  run('unzip -q -d /opt apache-maven.zip')\n",
-        "  run('rm -f apache-maven.zip')\n",
-        "\n",
-        "# Let's choose the absolute path instead of $PATH environment 
variable.\n",
-        "def maven(args):\n",
-        "  run(f\"{maven_path}/bin/mvn {args}\")\n",
-        "\n",
-        "maven('-v')"
-      ],
-      "execution_count": null,
-      "outputs": []
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "Xphbe3R43XLw"
-      },
-      "source": [
-        "##### Install Apache Spark (Optional, if you want to work with spark 
backend)\n"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "_WgEa00pTs3w"
-      },
-      "source": [
-        "NOTE: If spark is not downloaded. Let us make sure the version we are 
trying to download is officially supported at\n",
-        "https://spark.apache.org/downloads.html";
-      ]
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "3zdtkFkLnskx"
-      },
-      "source": [
-        "# Spark and Hadoop version\n",
-        "spark_version = 'spark-2.4.7'\n",
-        "hadoop_version = 'hadoop2.7'\n",
-        "spark_path = f\"/opt/{spark_version}-bin-{hadoop_version}\"\n",
-        "if not os.path.exists(spark_path):\n",
-        "  run(f\"wget -q -nc -O apache-spark.tgz 
https://downloads.apache.org/spark/{spark_version}/{spark_version}-bin-{hadoop_version}.tgz\";)\n",
-        "  run('tar zxfv apache-spark.tgz -C /opt')\n",
-        "  run('rm -f apache-spark.tgz')\n",
-        "\n",
-        "os.environ[\"SPARK_HOME\"] = spark_path\n",
-        "os.environ[\"PATH\"] += \":$SPARK_HOME/bin\"\n"
-      ],
-      "execution_count": null,
-      "outputs": []
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "91pJ5U8k3cjk"
-      },
-      "source": [
-        "#### Get Apache SystemDS\n",
-        "\n",
-        "Apache SystemDS development happens on GitHub at [apache/systemds 
↗](https://github.com/apache/systemds)"
-      ]
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "SaPIprmg3lKE"
-      },
-      "source": [
-        "!git clone https://github.com/apache/systemds systemds --depth=1\n",
-        "%cd systemds"
-      ],
-      "execution_count": null,
-      "outputs": []
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "40Fo9tPUzbWK"
-      },
-      "source": [
-        "##### Build the project"
-      ]
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "s0Iorb0ICgHa"
-      },
-      "source": [
-        "# Logging flags: -q only for ERROR; -X for DEBUG; -e for ERROR\n",
-        "# Option 1: Build only the java codebase\n",
-        "maven('clean package -q')\n",
-        "\n",
-        "# Option 2: For building along with python distribution\n",
-        "# maven('clean package -P distribution')"
-      ],
-      "execution_count": null,
-      "outputs": []
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "SUGac5w9ZRBQ"
-      },
-      "source": [
-        "### A. Working with SystemDS in **standalone** mode\n",
-        "\n",
-        "NOTE: Let's pay attention to *directories* and *relative paths*. 
:)\n",
-        "\n"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "g5Nk2Bb4UU2O"
-      },
-      "source": [
-        "##### 1. Set SystemDS environment variables\n",
-        "\n",
-        "These are useful for the `./bin/systemds` script."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "2ZnSzkq8UT32"
-      },
-      "source": [
-        "!export SYSTEMDS_ROOT=$(pwd)\n",
-        "!export PATH=$SYSTEMDS_ROOT/bin:$PATH"
-      ],
-      "execution_count": null,
-      "outputs": []
-    },
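Note: each `!` command runs in its own subshell, so these exports do not persist into later cells. A sketch of a persistent alternative using Python's os module (assuming the working directory is the systemds checkout):

    import os
    os.environ["SYSTEMDS_ROOT"] = os.getcwd()  # checkout root
    os.environ["PATH"] = os.environ["SYSTEMDS_ROOT"] + "/bin:" + os.environ["PATH"]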
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "zyLmFCv6ZYk5"
-      },
-      "source": [
-        "##### 2. Download Haberman data\n",
-        "\n",
-        "Data source: 
https://archive.ics.uci.edu/ml/datasets/Haberman's+Survival\n",
-        "\n",
-        "About: The survival of patients who had undergone surgery for breast 
cancer.\n",
-        "\n",
-        "Data Attributes:\n",
-        "1. Age of patient at time of operation (numerical)\n",
-        "2. Patient's year of operation (year - 1900, numerical)\n",
-        "3. Number of positive axillary nodes detected (numerical)\n",
-        "4. Survival status (class attribute)\n",
-        "    - 1 = the patient survived 5 years or longer\n",
-        "    - 2 = the patient died within 5 year"
-      ]
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "ZrQFBQehV8SF"
-      },
-      "source": [
-        "!mkdir ../data"
-      ],
-      "execution_count": null,
-      "outputs": []
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "E1ZFCTFmXFY_"
-      },
-      "source": [
-        "!wget -P ../data/ 
https://web.archive.org/web/20200725014530/https://archive.ics.uci.edu/ml/machine-learning-databases/haberman/haberman.data";
-      ],
-      "execution_count": null,
-      "outputs": []
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "FTo8Py_vOGpX"
-      },
-      "source": [
-        "# Display first 10 lines of the dataset\n",
-        "# Notice that the test is plain csv with no headers!\n",
-        "!sed -n 1,10p ../data/haberman.data"
-      ],
-      "execution_count": null,
-      "outputs": []
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "Oy2kgVdkaeWK"
-      },
-      "source": [
-        "##### 2.1 Set `metadata` for the data\n",
-        "\n",
-        "The data does not have any info on the value types. So, `metadata` 
for the data\n",
-        "helps know the size and format for the matrix data as `.mtd` file 
with the same\n",
-        "name and location as `.data` file."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "vfypIgJWXT6K"
-      },
-      "source": [
-        "# generate metadata file for the dataset\n",
-        "!echo '{\"rows\": 306, \"cols\": 4, \"format\": \"csv\"}' > 
../data/haberman.data.mtd\n",
-        "\n",
-        "# generate type description for the data\n",
-        "!echo '1,1,1,2' > ../data/types.csv\n",
-        "!echo '{\"rows\": 1, \"cols\": 4, \"format\": \"csv\"}' > 
../data/types.csv.mtd"
-      ],
-      "execution_count": null,
-      "outputs": []
-    },
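In types.csv, 1 marks a scale (numerical) attribute and 2 a categorical one, matching the three numerical attributes and the survival-status class attribute above; the exact encoding expected for TYPES is documented in the header of Univar-Stats.dml, printed a few cells below.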
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "7Vis3V31bA53"
-      },
-      "source": [
-        "##### 3. Find the algorithm to run with `systemds`"
-      ]
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "L_0KosFhbhun"
-      },
-      "source": [
-        "# Inspect the directory structure of systemds code base\n",
-        "!ls"
-      ],
-      "execution_count": null,
-      "outputs": []
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "R7C5DVM7YfTb"
-      },
-      "source": [
-        "# List all the scripts (also called top level algorithms!)\n",
-        "!ls scripts/algorithms"
-      ],
-      "execution_count": null,
-      "outputs": []
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "5PrxwviWJhNd"
-      },
-      "source": [
-        "# Lets choose univariate statistics script.\n",
-        "# Output the algorithm documentation\n",
-        "# start from line no. 22 onwards. Till 35th line the command looks 
like\n",
-        "!sed -n 22,35p ./scripts/algorithms/Univar-Stats.dml"
-      ],
-      "execution_count": null,
-      "outputs": []
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "zv_7wRPFSeuJ"
-      },
-      "source": [
-        "!./bin/systemds ./scripts/algorithms/Univar-Stats.dml -nvargs 
X=../data/haberman.data TYPES=../data/types.csv STATS=../data/univarOut.mtx 
CONSOLE_OUTPUT=TRUE"
-      ],
-      "execution_count": null,
-      "outputs": []
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "IqY_ARNnavrC"
-      },
-      "source": [
-        "##### 3.1 Let us inspect the output data"
-      ]
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "k-_eQg9TauPi"
-      },
-      "source": [
-        "# output first 10 lines only.\n",
-        "!sed -n 1,10p ../data/univarOut.mtx"
-      ],
-      "execution_count": null,
-      "outputs": []
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "o5VCCweiDMjf"
-      },
-      "source": [
-        "#### B. Run SystemDS with Apache Spark"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "6gJhL7lc1vf7"
-      },
-      "source": [
-        "#### Playground for DML scripts\n",
-        "\n",
-        "DML - A custom language designed for SystemDS with R-like syntax."
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "zzqeSor__U6M"
-      },
-      "source": [
-        "##### A test `dml` script to prototype algorithms\n",
-        "\n",
-        "Modify the code in the below cell and run to work develop data 
science tasks\n",
-        "in a high level language."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "t59rTyNbOF5b"
-      },
-      "source": [
-        "%%writefile ../test.dml\n",
-        "\n",
-        "# This code code acts as a playground for dml code\n",
-        "X = rand (rows = 20, cols = 10)\n",
-        "y = X %*% rand(rows = ncol(X), cols = 1)\n",
-        "lm(X = X, y = y)"
-      ],
-      "execution_count": null,
-      "outputs": []
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "VDfeuJYE1JfK"
-      },
-      "source": [
-        "Submit the `dml` script to Spark with `spark-submit`.\n",
-        "More about [Spark Submit 
↗](https://spark.apache.org/docs/latest/submitting-applications.html)"
-      ]
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "YokktyNE1Cig"
-      },
-      "source": [
-        "!$SPARK_HOME/bin/spark-submit \\\n",
-        "    ./target/SystemDS.jar -f ../test.dml"
-      ],
-      "execution_count": null,
-      "outputs": []
-    },
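spark-submit also accepts the standard Spark resource flags; a variant of the same cell with illustrative values (the flags are stock spark-submit options, not from the notebook):

    !$SPARK_HOME/bin/spark-submit \
        --master local[*] --driver-memory 4g \
        ./target/SystemDS.jar -f ../test.dml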
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "gCMkudo_-8_8"
-      },
-      "source": [
-        "##### Run a binary classification example with sample data\n",
-        "\n",
-        "One would notice that no other script than simple dml is used in this 
example completely."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "metadata": {
-        "id": "OSLq2cZb_SUl"
-      },
-      "source": [
-        "# Example binary classification task with sample data.\n",
-        "# !$SPARK_HOME/bin/spark-submit ./target/SystemDS.jar -f 
./scripts/nn/examples/fm-binclass-dummy-data.dml"
-      ],
-      "execution_count": null,
-      "outputs": []
-    }
-  ]
-}
\ No newline at end of file
