ngachung commented on code in PR #185:
URL: https://github.com/apache/incubator-sdap-nexus/pull/185#discussion_r988383401


##########
docs/quickstart.rst:
##########
@@ -64,181 +80,238 @@ The network we will be using for this quickstart will be called ``sdap-net``. Cr
 
 .. _quickstart-step3:
 
-Download Sample Data
----------------------
+Start Ingester Components and Ingest Some Science Data
+========================================================
 
-The data we will be downloading is part of the `AVHRR OI dataset <https://podaac.jpl.nasa.gov/dataset/AVHRR_OI-NCEI-L4-GLOB-v2.0>`_ which measures sea surface temperature. We will download 1 month of data and ingest it into a local Solr and Cassandra instance.
+Create Data Directory
+------------------------
+
+Let's start by creating the directory to hold the science data to ingest.
 
 Choose a location that is mountable by Docker (typically needs to be under the User's home directory) to download the data files to.
 
 .. code-block:: bash
 
-  export DATA_DIRECTORY=~/nexus-quickstart/data/avhrr-granules
-  mkdir -p ${DATA_DIRECTORY}
+    export DATA_DIRECTORY=~/nexus-quickstart/data/avhrr-granules
+    mkdir -p ${DATA_DIRECTORY}
 
-Then go ahead and download 1 month worth of AVHRR netCDF files.
+Now we can start up the data storage components. We will be using Solr and Cassandra to store the tile metadata and data, respectively.
 
-.. code-block:: bash
+.. _quickstart-step4:
 
-  cd $DATA_DIRECTORY
+Start Zookeeper
+---------------
 
-  export URL_LIST="https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/305/20151101120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/306/20151102120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/307/20151103120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/308/20151104120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/309/20151105120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/310/20151106120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/311/20151107120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/312/20151108120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/313/20151109120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/314/20151110120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/315/20151111120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/316/20151112120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/317/20151113120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/318/20151114120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/319/20151115120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/320/20151116120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/321/20151117120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/322/20151118120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/323/20151119120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/324/20151120120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/325/20151121120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/326/20151122120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/327/20151123120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/328/20151124120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/329/20151125120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/330/20151126120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/331/20151127120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/332/20151128120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/333/20151129120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/334/20151130120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc"
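As an aside, the 30 hard-coded URLs in the removed download step follow a regular date pattern, so they could be generated instead. A sketch, assuming GNU ``date`` (the ``-d`` option is not portable to BSD/macOS ``date``):

```shell
# Generate the 30 AVHRR OI granule URLs for November 2015 rather than
# hard-coding them. Assumes GNU date (for the -d option).
BASE="https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2"
URL_LIST=""
for day in $(seq -w 1 30); do
  doy=$(date -d "2015-11-${day}" +%j)                # day of year: 305..334
  stamp="$(date -d "2015-11-${day}" +%Y%m%d)120000"  # e.g. 20151101120000
  URL_LIST="${URL_LIST} ${BASE}/2015/${doy}/${stamp}-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc"
done
echo "${URL_LIST}" | wc -w   # 30 URLs
```

The generated list can then be fed to the same ``for url in ${URL_LIST}; do curl -O "${url}"; done`` loop shown in the removed block.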
+In order to run Solr in cloud mode, we must first run Zookeeper.
 
-  for url in ${URL_LIST}; do
-    curl -O "${url}"
-  done
+.. code-block:: bash
 
-You should now have 30 files downloaded to your data directory, one for each day in November 2015.
+    docker run --name zookeeper -dp 2181:2181 zookeeper:${ZK_VERSION}
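Zookeeper can take a few seconds to begin accepting connections. A small helper like the following (a generic sketch, not part of SDAP; requires bash for ``/dev/tcp``) can poll the port before you create the znode:

```shell
# Generic helper (not part of SDAP): poll until a TCP port accepts
# connections. Requires bash (/dev/tcp). Returns non-zero on timeout.
wait_for_port() {
  local host=$1 port=$2 tries=${3:-30}
  local i
  for i in $(seq 1 "$tries"); do
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# e.g. wait until Zookeeper's client port is reachable before continuing:
# wait_for_port localhost 2181
```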
 
-Start Data Storage Containers
-==============================
+We then need to ensure the ``/solr`` znode is present.
 
-We will use Solr and Cassandra to store the tile metadata and data respectively.
+.. code-block:: bash
 
-.. _quickstart-step4:
+  docker exec zookeeper bash -c "bin/zkCli.sh create /solr"
+
+.. _quickstart-step5:
 
 Start Solr
 -----------
 
-SDAP is tested with Solr version 7.x with the JTS topology suite add-on installed. The SDAP docker image is based off of the official Solr image and simply adds the JTS topology suite and the nexustiles core.
+SDAP is tested with Solr version 8.11.1.
 
-.. note:: Mounting a volume is optional but if you choose to do it, you can start and stop the Solr container without having to reingest your data every time. If you do not mount a volume, every time you stop your Solr container the data will be lost.
+.. note:: Mounting a volume is optional, but if you use one you can start and stop the Solr container without having to reingest your data every time. If you do not mount a volume, the data will be lost every time you stop your Solr container. If you don't want a volume, leave off the ``-v`` option in the following ``docker run`` command.
 
 To start Solr using a volume mount and expose the admin webapp on port 8983:
 
 .. code-block:: bash
 
   export SOLR_DATA=~/nexus-quickstart/solr
-  docker run --name solr --network sdap-net -v ${SOLR_DATA}:/opt/solr/server/solr/nexustiles/data -p 8983:8983 -d sdap/solr-singlenode:${VERSION}
+  mkdir -p ${SOLR_DATA}
+  docker run --name solr --network sdap-net -v ${SOLR_DATA}/:/opt/solr/server/solr/nexustiles/data -p 8983:8983 -e ZK_HOST="host.docker.internal:2181/solr" -d nexusjpl/solr:${SOLR_VERSION}
 
-If you don't want to use a volume, leave off the ``-v`` option.
+This will start an instance of Solr. To initialize it, we need to run the ``solr-cloud-init`` image.
 
+.. code-block:: bash
 
-.. _quickstart-step5:
+  docker run -it --rm --name solr-init --network sdap-net -e SDAP_ZK_SOLR="host.docker.internal:2181/solr" -e SDAP_SOLR_URL="http://host.docker.internal:8983/solr/" -e CREATE_COLLECTION_PARAMS="name=nexustiles&numShards=1&waitForFinalState=true" nexusjpl/solr-cloud-init:${SOLR_CLOUD_INIT_VERSION}
 
-Start Cassandra
-----------------
+When the init script finishes, stop the container by typing ``Ctrl + C``.
 
-SDAP is tested with Cassandra version 2.2.x. The SDAP docker image is based off of the official Cassandra image and simply mounts the schema DDL script into the container for easy initialization.
+.. _quickstart-step6:
 
-.. note:: Similar to the Solr container, using a volume is recommended but not required.
+Start Cassandra
+-------------------
+
+SDAP is tested with Cassandra version 3.11.6.
 
-To start cassandra using a volume mount and expose the connection port 9042:
+.. note:: Similar to the Solr container, using a volume is recommended but not required. Be aware that the second ``-v`` option (mounting the init script) is required.
+
+Before starting Cassandra, we need to prepare a script to initialize the database.
+
+.. code-block:: bash
+
+  export CASSANDRA_INIT=~/nexus-quickstart/init
+  mkdir -p ${CASSANDRA_INIT}
+  cat << EOF > ${CASSANDRA_INIT}/initdb.cql
+  CREATE KEYSPACE IF NOT EXISTS nexustiles WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor': 1 };
+
+  CREATE TABLE IF NOT EXISTS nexustiles.sea_surface_temp (
+  tile_id      uuid PRIMARY KEY,
+  tile_blob    blob
+  );
+  EOF
+
+Now we can start the image and run the initialization script.
 
 .. code-block:: bash
 
   export CASSANDRA_DATA=~/nexus-quickstart/cassandra
-  docker run --name cassandra --network sdap-net -p 9042:9042 -v ${CASSANDRA_DATA}:/var/lib/cassandra -d sdap/cassandra:${VERSION}
+  mkdir -p ${CASSANDRA_DATA}
+  docker run --name cassandra --network sdap-net -p 9042:9042 -v ${CASSANDRA_DATA}/cassandra/:/var/lib/cassandra -v "${CASSANDRA_INIT}/initdb.cql:/scripts/initdb.cql" -d bitnami/cassandra:${CASSANDRA_VERSION}
 
-.. _quickstart-step6:
+Wait a few moments for the database to start, then run the initialization script.
+
+.. code-block:: bash
+
+  docker exec cassandra bash -c "cqlsh -u cassandra -p cassandra -f /scripts/initdb.cql"
+
+With Solr and Cassandra started and initialized, we can now start the collection manager and granule ingester(s).
+
+.. _quickstart-step7:
+
+Start RabbitMQ
+----------------
+
+The collection manager and granule ingester(s) use RabbitMQ to communicate, so we need to start that up first.
+
+.. code-block:: bash
+
+  docker run -dp 5672:5672 -p 15672:15672 --name rmq --network sdap-net bitnami/rabbitmq:${RMQ_VERSION}
 
-Ingest Data
-============
+.. _quickstart-step8:
+
+Start the Granule Ingester(s)
+-----------------------------
+
+The granule ingester(s) read new granules from the message queue and process them into tiles. For the set of granules we will be using in this guide, we recommend using two ingester containers to speed up the process.
+
+.. code-block:: bash
+
+  docker run --name granule-ingester-1 --network sdap-net -e RABBITMQ_HOST="host.docker.internal:5672" -e RABBITMQ_USERNAME="user" -e RABBITMQ_PASSWORD="bitnami" -d -e CASSANDRA_CONTACT_POINTS=host.docker.internal -e CASSANDRA_USERNAME=cassandra -e CASSANDRA_PASSWORD=cassandra -e SOLR_HOST_AND_PORT="http://host.docker.internal:8983" -v ${DATA_DIRECTORY}:/data/granules/ nexusjpl/granule-ingester:${GRANULE_INGESTER_VERSION}
+  docker run --name granule-ingester-2 --network sdap-net -e RABBITMQ_HOST="host.docker.internal:5672" -e RABBITMQ_USERNAME="user" -e RABBITMQ_PASSWORD="bitnami" -d -e CASSANDRA_CONTACT_POINTS=host.docker.internal -e CASSANDRA_USERNAME=cassandra -e CASSANDRA_PASSWORD=cassandra -e SOLR_HOST_AND_PORT="http://host.docker.internal:8983" -v ${DATA_DIRECTORY}:/data/granules/ nexusjpl/granule-ingester:${GRANULE_INGESTER_VERSION}
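The two ``docker run`` commands above differ only in the container name, so starting more ingesters can be scripted. A sketch (the ``echo`` makes this a dry run that only prints each command; remove it to actually start the containers):

```shell
# Sketch: emit one docker run command per ingester instead of copy-pasting.
# The echo makes this a dry run; remove it to actually start the containers.
start_ingesters() {
  local count=$1 i
  for i in $(seq 1 "$count"); do
    echo docker run --name "granule-ingester-$i" --network sdap-net \
      -e RABBITMQ_HOST="host.docker.internal:5672" \
      -e RABBITMQ_USERNAME="user" -e RABBITMQ_PASSWORD="bitnami" \
      -e CASSANDRA_CONTACT_POINTS=host.docker.internal \
      -e CASSANDRA_USERNAME=cassandra -e CASSANDRA_PASSWORD=cassandra \
      -e SOLR_HOST_AND_PORT="http://host.docker.internal:8983" \
      -v "${DATA_DIRECTORY}:/data/granules/" \
      -d "nexusjpl/granule-ingester:${GRANULE_INGESTER_VERSION}"
  done
}

start_ingesters 2
```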
+
+.. _quickstart-optional-step:
+
+[OPTIONAL] Run Message Queue Monitor
+-------------------------------------
 
-Now that Solr and Cassandra have both been started and configured, we can ingest some data. NEXUS ingests data using the ningester docker image. This image is designed to read configuration and data from volume mounts and then tile the data and save it to the datastores. More information can be found in the :ref:`ningester` section.
+The granule ingestion process can take some time. To monitor its progress, we wrote a simple Python script that watches the message queue. It waits until some granules show up, then exits once they have all been ingested.
 
-Ningester needs 3 things to run:
+The script only needs the ``requests`` module, which can be installed with ``pip install requests`` if you do not have it.
+
+To download the script:
+
+.. code-block:: bash
+
+  curl -O https://raw.githubusercontent.com/RKuttruff/rmq-monitor/pub/monitor.py
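If you prefer not to depend on an external script, a similar check can be sketched in shell against RabbitMQ's management HTTP API. The queue name, vhost, and ``user``/``bitnami`` credentials below are assumptions based on the containers started earlier; adjust them to match your setup. ``FETCH`` is injectable so the parsing can be exercised without a live broker:

```shell
# Sketch: read the message count for a queue from RabbitMQ's management API.
# FETCH is injectable for testing; by default it uses curl with the
# user/bitnami credentials assumed from the bitnami/rabbitmq container above.
queue_depth() {
  # $1: management API URL, e.g. http://localhost:15672/api/queues/%2F/<queue>
  ${FETCH:-curl -su user:bitnami} "$1" \
    | grep -o '"messages": *[0-9]*' \
    | grep -o '[0-9]*$'
}

# Poll until the queue drains (the queue name here is an assumption):
# while [ "$(queue_depth http://localhost:15672/api/queues/%2F/nexus)" -gt 0 ]; do
#   sleep 5
# done
```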

Review Comment:
   We probably want to include this monitor.py with the quickstart rather than in @RKuttruff's GitHub.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@sdap.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
