petermarshallio commented on code in PR #14726:
URL: https://github.com/apache/druid/pull/14726#discussion_r1281524000
##########
examples/quickstart/jupyter-notebooks/notebooks/03-query/04-UnionOperations.ipynb:
##########
@@ -0,0 +1,644 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "557e06e8-9b35-4b34-8322-8a8ede6de709",
+ "metadata": {},
+ "source": [
+ "# Performing set operations\n",
+ "\n",
+ "Users often call for a way to concatenate results into a single list. In
this tutorial, work through some examples of different techniques that are
available."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "cf4554ae-6516-4e76-b202-d6e2fdf31603",
+ "metadata": {},
+ "source": [
+ "## Prerequisites\n",
+ "\n",
+ "This tutorial works with Druid 26.0.0 or later.\n",
+ "\n",
+ "Launch this tutorial and all prerequisites using the `druid-jupyter`
profile of the Docker Compose file for Jupyter-based Druid tutorials. For more
information, see [Docker for Jupyter Notebook
tutorials](https://druid.apache.org/docs/latest/tutorials/tutorial-jupyter-docker.html).\n",
+ "\n",
+ "<details><summary> \n",
+ "<b>Run without Docker Compose</b> \n",
+ "</summary>\n",
+ "\n",
+ "If you do not use the Docker Compose environment, you need the
following:\n",
+ "\n",
+ "* A running Druid instance.\n",
+ "*
[druidapi](https://github.com/apache/druid/blob/master/examples/quickstart/jupyter-notebooks/druidapi/README.md),
a Python client for Apache Druid. Follow the instructions in the Install
section of the README file.\n",
+ "* [matplotlib](https://matplotlib.org/), a library for creating
visualizations in Python.\n",
+ "* [pandas](https://pandas.pydata.org/), a data analysis and manipulation
tool.\n",
+ "* Jupyter notebook or Jupyter Lab. See
[jupyter.org](https://jupyter.org/) for installation instructions.\n",
+ "\n",
+ "</details>"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ee0c3171-def8-4ad9-9c56-d3a67f309631",
+ "metadata": {},
+ "source": [
+ "### Initialization\n",
+ "\n",
+ "Run the next cell to attempt a connection to Druid services. If
successful, the Druid version number will be shown in the output."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "9fa4abfe-f878-4031-88f2-94c13e922279",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import druidapi\n",
+ "import os\n",
+ "\n",
+ "if 'DRUID_HOST' not in os.environ:\n",
+ " druid_host=\"http://localhost:8888\"\n",
+ "else:\n",
+ " druid_host=f\"http://{os.environ['DRUID_HOST']}:8888\"\n",
+ " \n",
+ "print(f\"Opening a connection to {druid_host}.\")\n",
+ "druid = druidapi.jupyter_client(druid_host)\n",
+ "\n",
+ "display = druid.display\n",
+ "sql_client = druid.sql\n",
+ "status_client = druid.status\n",
+ "\n",
+ "status_client.version"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d948743d-10cf-4df9-bff2-92a79535ec89",
+ "metadata": {},
+ "source": [
+ "### Load example flight data\n",
+ "\n",
+ "Once your Druid environment is up and running, ingest the sample data for
this tutorial.\n",
+ "\n",
+ "Open the Druid console:\n",
+ "\n",
+ "1. Load data\n",
+ "2. Batch - SQL\n",
+ "3. Example data\n",
+ "4. Select \"FlightCarrierOnTime (1 month)\"\n",
+ "\n",
+ "For the purposes of this notebook, use all the defaults suggested by the
console, including the default datasource name: \n",
+ "\n",
+ "`On_Time_Reporting_Carrier_On_Time_Performance_(1987_present)_2005_11`"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3b3c7fc0-cb5c-43a0-aa53-2c62053181b0",
+ "metadata": {},
+ "source": [
+ "When the ingestion completes, run the following cell for the final part of the
initialization. It provides some methods to call as you explore set
operations."
Review Comment:
Thanks! I have added a `display.table()` call to the notebook. Would this be a
good practice for every notebook that relies on a specific table?
@sergioferragut @techdocsmith @writer-jill
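
As a rough sketch of that practice (not the code added in this PR), a notebook could check up front that the datasource it relies on exists before running any queries. The helper name `missing_tables` is hypothetical, and the list of available tables is assumed to come from something like a query against `INFORMATION_SCHEMA.TABLES`:

```python
# Hypothetical helper: not part of druidapi or of this PR.
REQUIRED_TABLE = "On_Time_Reporting_Carrier_On_Time_Performance_(1987_present)_2005_11"


def missing_tables(required, available):
    """Return the names in `required` that do not appear in `available`, in order."""
    available_set = set(available)
    return [name for name in required if name not in available_set]


# In a notebook, `available` would be the table names reported by Druid.
# Here it is a hard-coded list so the sketch runs on its own.
available = ["wikipedia"]
missing = missing_tables([REQUIRED_TABLE], available)
if missing:
    print(f"Ingest these tables before continuing: {missing}")
```

Failing early like this gives the reader a clear message instead of a query error several cells later.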