bzablocki commented on code in PR #27284:
URL: https://github.com/apache/beam/pull/27284#discussion_r1529348432
##########
examples/notebooks/get-started/try-apache-beam-yaml.ipynb:
##########
@@ -0,0 +1,671 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "colab_type": "text",
+    "id": "view-in-github"
+   },
+   "source": [
+    "<a href=\"https://colab.research.google.com/github/apache/beam/blob/master/examples/notebooks/get-started/try-apache-beam-yaml.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "cellView": "form"
+   },
+   "outputs": [],
+   "source": [
+    "#@title ###### Licensed to the Apache Software Foundation (ASF), Version 2.0 (the \"License\")\n",
+    "\n",
+    "# Licensed to the Apache Software Foundation (ASF) under one\n",
+    "# or more contributor license agreements. See the NOTICE file\n",
+    "# distributed with this work for additional information\n",
+    "# regarding copyright ownership. The ASF licenses this file\n",
+    "# to you under the Apache License, Version 2.0 (the\n",
+    "# \"License\"); you may not use this file except in compliance\n",
+    "# with the License. You may obtain a copy of the License at\n",
+    "#\n",
+    "# http://www.apache.org/licenses/LICENSE-2.0\n",
+    "#\n",
+    "# Unless required by applicable law or agreed to in writing,\n",
+    "# software distributed under the License is distributed on an\n",
+    "# \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\n",
+    "# KIND, either express or implied. See the License for the\n",
+    "# specific language governing permissions and limitations\n",
+    "# under the License."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "colab_type": "text",
+    "id": "lNKIMlEDZ_Vw"
+   },
+   "source": [
+    "# Try Apache Beam - YAML\n",
+    "\n",
+    "While Beam provides powerful APIs for authoring sophisticated data processing pipelines, it still has a high barrier for getting started and authoring simple pipelines. Even setting up the environment, installing the dependencies, and setting up the project can be a challenge.\n",
+    "\n",
+    "Here we provide a simple declarative syntax for describing pipelines that does not require coding experience or learning how to use an SDK—any text editor will do. Some installation may be required to actually *execute* a pipeline, but we envision various services (such as Dataflow) to accept yaml pipelines directly obviating the need for even that in the future. We also anticipate the ability to generate code directly from these higher-level yaml descriptions, should one want to graduate to a full Beam SDK (and possibly the other direction as well as far as possible).\n",
+    "\n",
+    "It should be noted that everything here is still under development, but any features already included are considered stable. Feedback is welcome at d...@apache.beam.org.\n",
+    "\n",
+    "In this notebook, you set up your development environment and write a simple pipeline using YAML. Then you run it locally, using the [DirectRunner](https://beam.apache.org/documentation/runners/direct/). You can explore other runners with the [Beam Capability Matrix](https://beam.apache.org/documentation/runners/capability-matrix/).\n",
+    "\n",
+    "To navigate through different sections, use the table of contents. From **View** drop-down list, select **Table of contents**.\n",
+    "\n",
+    "To run a code cell, click the **Run cell** button at the top left of the cell, or select it and press **`Shift+Enter`**. Try modifying a code cell and re-running it to see what happens.\n",
+    "\n",
+    "To learn more about Colab, see [Welcome to Colaboratory!](https://colab.sandbox.google.com/notebooks/welcome.ipynb)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "colab_type": "text",
+    "id": "Fz6KSQ13_3Rr"
+   },
+   "source": [
+    "# Setup\n",
+    "\n",
+    "First, you need to set up your environment. The following code installs `apache-beam` and creates directories for your data, pipelines and results."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 170
+    },
+    "colab_type": "code",
+    "id": "GOOk81Jj_yUy",
+    "outputId": "d283dfb2-4f51-4fec-816b-f57b0cb9b71c"
+   },
+   "outputs": [],
+   "source": [
+    "def save_to_file(content, file_name):\n",
+    "    with open(file_name, 'w') as f:\n",
+    "        f.write(content)\n",
+    "\n",

Review Comment:
   Decided to go with the '%%writefile'
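
For context, `%%writefile <file>` is an IPython cell magic that writes the rest of the cell to the named file, overwriting it if it already exists, so a cell starting with `%%writefile pipeline.yaml` makes the `save_to_file` helper above unnecessary. The sketch below is not the PR's actual cell. It shows roughly what the magic does in plain Python; the file name `pipeline.yaml`, the helper `write_file`, and the pipeline body are illustrative assumptions.

```python
from pathlib import Path

# Hypothetical pipeline text. In the notebook this would be the body of a
# cell whose first line is `%%writefile pipeline.yaml`.
PIPELINE_YAML = """\
pipeline:
  type: chain
  transforms:
    - type: Create
      config:
        elements: [1, 2, 3]
    - type: LogForTesting
"""


def write_file(file_name: str, content: str) -> Path:
    """Write content to file_name, replacing any existing file.

    This mirrors the default (non-append) behaviour of %%writefile.
    """
    path = Path(file_name)
    path.write_text(content)
    return path
```

Calling `write_file("pipeline.yaml", PIPELINE_YAML)` leaves the same file on disk that the magic cell would, which keeps the notebook free of a hand-rolled helper while staying easy to reproduce outside Colab.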