This is an automated email from the ASF dual-hosted git repository.
eamonford pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-sdap-nexus.git
The following commit(s) were added to refs/heads/master by this push:
new 698930b SDAP-298: Helm Chart 1.0.0 (#113)
698930b is described below
commit 698930b1c670898a67014f5bf2088976e53f7adc
Author: Eamon Ford <[email protected]>
AuthorDate: Mon Dec 21 10:06:43 2020 -0800
SDAP-298: Helm Chart 1.0.0 (#113)
* updated helm chart for zookeeper
use solr and zk helm charts
change .Release.Namespace to .Release.Name
add rabbitmq storageclass
fix rbac
add max_concurrency
add solr_host arg
add solr port
always deploy solr
add solr-host option
read cli args for cass and solr hosts
pass cassandra host
add support for cassandra username and password
cassandra helm chart included
fix arguments sent to spark driver, add logging in cassandraproxy
pass factory method to nexuscalchandlers to create tile service in spark
nodes
fix namespace
fix bad argument order
fix cass url for granule ingester
change solr-create-collection to a deployment
make solr history default
pr
enable external solr/zk/cass hosts
rabbitmq.enabled
revert doms
revert
update images
only deploy config operator if it is enabled
remove http:// from solr hardcoded endpoint
turn off configmap by default
* revert doms
* upgrade images
* use new nexusjpl/solr image, update solr-create-collection image tag
* Add flag to disable crd creation
* Fixed typo
* Only create gitcfg if configMap not set
* Only create rbac if configmap disabled
* update nexus version
* Add S3 support to helm chart
* Update Rabbitmq version
* Update versions
* version bumps
* Cleanup
* Configurable cassandra parameters
* wip: helm readme
* Helm readme changes
* Typo
* Typo
* Readme updates
* Readme
* Readme
* Readme updates
* Readme
* Readme
* Fix typos
* Readme
* Readme
* Readme
* Readme
* Readme
* Readme
* More docs
* Cleanup
Co-authored-by: Eamon Ford <[email protected]>
---
.gitignore | 1 +
helm/Chart.yaml | 4 +-
helm/README.md | 444 ++++++++++++++++++++----
helm/requirements.yaml | 15 +-
helm/templates/_helpers.tpl | 21 +-
helm/templates/cassandra.yml | 107 ------
helm/templates/collection-manager.yml | 23 +-
helm/templates/collections-config-gitcfg.yml | 4 +
helm/templates/config-operator-rbac.yml | 7 +-
helm/templates/config-operator.yml | 3 +-
helm/{crds => templates}/gitbasedconfig-crd.yml | 3 +-
helm/templates/granule-ingester.yml | 24 +-
helm/templates/history-pvc.yml | 7 +-
helm/templates/init-cassandra-configmap.yml | 13 +
helm/templates/solr-create-collection.yml | 34 ++
helm/templates/solr.yml | 129 -------
helm/templates/webapp.yml | 7 +-
helm/templates/zookeeper.yml | 144 --------
helm/values.yaml | 149 +++++---
19 files changed, 610 insertions(+), 529 deletions(-)
diff --git a/.gitignore b/.gitignore
index 3e29626..12ab2d6 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,3 +1,4 @@
+*.pytest_cache
*.vscode
*.code-workspace
*.idea
diff --git a/helm/Chart.yaml b/helm/Chart.yaml
index 2a2e9d1..27cbdd6 100644
--- a/helm/Chart.yaml
+++ b/helm/Chart.yaml
@@ -1,5 +1,5 @@
apiVersion: v1
-appVersion: "0.1.5"
+appVersion: "0.2.2"
description: Science Data Analytics Platform
name: nexus
-version: 0.2.0
+version: 1.0.0
diff --git a/helm/README.md b/helm/README.md
index f099765..aeb15d8 100644
--- a/helm/README.md
+++ b/helm/README.md
@@ -8,16 +8,34 @@ NEXUS is an earth science data analytics application, and a
component of the [Ap
The helm chart deploys all the required components of the NEXUS application
(Spark webapp, Solr, Cassandra, Zookeeper, and optionally ingress components).
## Table of Contents
-- [Prerequisites](#prerequisites)
- - [Spark Operator](#spark-operator)
- - [Persistent Volume Provisioner](#persistent-volume-provisioner)
-- [Installing the Chart](#installing-the-chart)
-- [Verifying Successful Installation](#verifying-successful-installation)
- - [Local Deployment with Ingress
Enabled](#option-1-local-deployment-with-ingress-enabled)
- - [No Ingress Enabled](#option-2-no-ingress-enabled)
-- [Uninstalling the Chart](#uninstalling-the-chart)
-- [Configuration](#configuration)
-- [Restricting Pods to Specific Nodes](#restricting-pods-to-specific-nodes)
+- [NEXUS](#nexus)
+ - [Introduction](#introduction)
+ - [Table of Contents](#table-of-contents)
+ - [Prerequisites](#prerequisites)
+ - [Spark Operator](#spark-operator)
+ - [Persistent Volume Provisioner](#persistent-volume-provisioner)
+ - [Installing the Chart](#installing-the-chart)
+ - [Verifying Successful Installation](#verifying-successful-installation)
+ - [Option 1: Local deployment with ingress
enabled](#option-1-local-deployment-with-ingress-enabled)
+ - [Option 2: No ingress enabled](#option-2-no-ingress-enabled)
+ - [Uninstalling the Chart](#uninstalling-the-chart)
+ - [Parameters](#parameters)
+ - [SDAP Webapp (Analysis) Parameters](#sdap-webapp-analysis-parameters)
+ - [SDAP Ingestion Parameters](#sdap-ingestion-parameters)
+ - [Cassandra Parameters](#cassandra-parameters)
+ - [Solr/Zookeeper Parameters](#solrzookeeper-parameters)
+ - [RabbitMQ Parameters](#rabbitmq-parameters)
+ - [Ingress Parameters](#ingress-parameters)
+ - [The Collections Config](#the-collections-config)
+ - [Option 1: Manually Create a
ConfigMap](#option-1-manually-create-a-configmap)
+ - [Option 2: Store a File in Git](#option-2-store-a-file-in-git)
+ - [Ingestion Sources](#ingestion-sources)
+ - [Ingesting from a Local Directory](#ingesting-from-a-local-directory)
+ - [Ingesting from S3](#ingesting-from-s3)
+ - [Ingesting from an NFS Host](#ingesting-from-an-nfs-host)
+ - [Other Configuration Examples](#other-configuration-examples)
+ - [Restricting Pods to Specific Nodes](#restricting-pods-to-specific-nodes)
+ - [Persistence](#persistence)
## Prerequisites
@@ -34,7 +52,7 @@ Follow their instructions to install the Helm chart, or
simply run:
$ helm install incubator/sparkoperator --generate-name
--namespace=spark-operator
#### Persistent Volume Provisioner
-NEXUS stores data in Cassandra and Solr. In order to have persistent storage,
you need to have a Storage Class defined and have Persistent Volumes
provisioned either manually or dynamically. See [Persistent
Volumes](https://kubernetes.io/docs/concepts/storage/persistent-volumes/).
+The RabbitMQ, Solr, Zookeeper, Cassandra, and Collection Manager (ingestion)
components of SDAP need to be able to store data. In order to have persistent
storage, you need to have a Storage Class defined and have Persistent Volumes
provisioned either manually or dynamically. See [Persistent
Volumes](https://kubernetes.io/docs/concepts/storage/persistent-volumes/).
> **Tip**: If you are using an NFS server as storage, you can use
> [nfs-client-provisioner](https://github.com/helm/charts/tree/master/stable/nfs-client-provisioner)
> to dynamically provision persistent volumes on your NFS server.
@@ -90,9 +108,9 @@ To uninstall/delete the `nexus` deployment:
The command removes all the Kubernetes components associated with the chart
and deletes the release.
-## Configuration
+## Parameters
-There are two ways to override configuration values for the chart. The first
is to use the `--set` flag when installing the chart, for example:
+There are two ways to override configuration parameters for the chart. The
first is to use the `--set` flag when installing the chart, for example:
$ helm install nexus incubator-sdap-nexus/helm --namespace=sdap
--dependency-update --set cassandra.replicas=3 --set solr.replicas=3
@@ -102,21 +120,25 @@ The second way is to create a yaml file with overridden
configuration values and
# overridden-values.yml
cassandra:
-  replicas: 2
+  cluster:
+    replicaCount: 2
solr:
-  replicas: 2
+  replicaCount: 2
```
```
$ helm install nexus incubator-sdap-nexus/helm --namespace=sdap
--dependency-update -f ~/overridden-values.yml
```
-The following table lists the configurable parameters of the NEXUS chart and
their default values. You can also look at `helm/values.yaml` to see the
available options.
+The following tables list the configurable parameters of the NEXUS chart and
their default values. You can also look at `helm/values.yaml` to see the
available options.
> **Note**: The default configuration values are tuned to run NEXUS in a local
> environment. Setting `ingressEnabled=true` in addition will create a load
> balancer and expose NEXUS at `localhost`.
+### SDAP Webapp (Analysis) Parameters
| Parameter | Description |
Default |
|---------------------------------------|------------------------------------|---------------------------------------------|
-| `storageClass` | Storage class to use for Cassandra,
Solr, and Zookeeper. (Note that `hostpath` should only be used in local
deployments.) |`hostpath`|
-| `webapp.distributed.image` | Docker image and tag for the webapp|
`nexusjpl/nexus-webapp:distributed.0.1.5` |
+| `onEarthProxyIP` | IP or hostname to proxy `/onearth`
to (leave blank to disable the proxy)| `""` |
+| `rootWebpage.enabled` | Whether to deploy the root webpage
(just returns HTTP 200) | `true` |
+| `webapp.enabled` | Whether to deploy the webapp |
`true` |
+| `webapp.distributed.image` | Docker image and tag for the webapp|
`nexusjpl/nexus-webapp:distributed.0.2.2` |
| `webapp.distributed.driver.cores` | Number of cores on Spark driver |
`1` |
| `webapp.distributed.driver.coreLimit` | Maximum cores on Spark driver, in
millicpus| `1200m` |
| `webapp.distributed.driver.memory` | Memory on Spark driver |
`512m` |
@@ -127,60 +149,287 @@ The following table lists the configurable parameters of
the NEXUS chart and the
| `webapp.distributed.executor.memory` | Memory on Spark workers |
`512m` |
| `webapp.distributed.executor.tolerations`| Tolerations for Spark workers |
`nil` |
| `webapp.distributed.executor.affinity`| Affinity (node or pod) for Spark
workers| `nil` |
-| `cassandra.replicas` | Number of Cassandra replicas |
`2` |
-| `cassandra.storage` | Storage per Cassandra replica |
`13Gi` |
-| `cassandra.requests.cpu` | CPUs to request per Cassandra
replica| `1` |
-| `cassandra.requests.memory` | Memory to request per Cassandra
replica| `3Gi` |
-| `cassandra.limits.cpu` | CPU limit per Cassandra replica |
`1` |
-| `cassandra.limits.memory` | Memory limit per Cassandra replica |
`3Gi` |
-| `cassandra.tolerations` | Tolerations for Cassandra instances|
`[]` |
-| `cassandra.nodeSelector` | Node selector for Cassandra
instances| `nil` |
-| `solr.replicas` | Number of Solr replicas (this should
not be less than 2, or else solr-cloud will not be happy)| `2`|
-| `solr.storage` | Storage per Solr replica |
`10Gi` |
-| `solr.heap` | Heap per Solr replica |
`4g` |
-| `solr.requests.memory` | Memory to request per Solr replica |
`5Gi` |
-| `solr.requests.cpu` | CPUs to request per Solr replica |
`1` |
-| `solr.limits.memory` | Memory limit per Solr replica |
`5Gi` |
-| `solr.limits.cpu` | CPU limit per Solr replica |
`1` |
-| `solr.tolerations` | Tolerations for Solr instances |
`nil` |
-| `solr.nodeSelector` | Node selector for Solr instances |
`nil` |
-| `zookeeper.replicas` | Number of zookeeper replicas. This
should be an odd number greater than or equal to 3 in order to form a valid
quorum.|`3`|
-| `zookeeper.memory` | Memory per zookeeper replica |
`1Gi` |
-| `zookeeper.cpu` | CPUs per zookeeper replica |
`0.5` |
-| `zookeeper.storage` | Storage per zookeeper replica |
`8Gi` |
-| `zookeeper.tolerations` | Tolerations for Zookeeper instances|
`nil` |
-| `zookeeper.nodeSelector` | Node selector for Zookeeper
instances| `nil` |
-| `onEarthProxyIP` | IP or hostname to proxy `/onearth`
to (leave blank to disable the proxy)| `""` |
-| `ingressEnabled` | Enable nginx-ingress |
`false` |
-| `nginx-ingress.controller.scope.enabled`|Limit the scope of the ingress
controller to this namespace | `true` |
-| `nginx-ingress.controller.kind` | Install ingress controller as
Deployment, DaemonSet or Both | `DaemonSet` |
-| `nginx-ingress.controller.service.enabled`| Create a front-facing controller
service (this might be used for local or on-prem deployments) | `true` |
-| `nginx-ingress.controller.service.type`|Type of controller service to
create| `LoadBalancer` |
-| `nginx-ingress.defaultBackend.enabled`| Use default backend component
| `false` |
-| `rabbitmq.replicaCount` | Number of RabbitMQ replicas |
`2` |
-| `rabbitmq.auth.username` | RabbitMQ username |
`guest` |
-| `rabbitmq.auth.password` | RabbitMQ password |
`guest` |
-| `rabbitmq.ingress.enabled` | Enable ingress resource for RabbitMQ
Management console | `true` |
-| `ingestion.enabled` | Enable ingestion by deploying the
Config Operator, Collection Manager, Granule Ingestion, and RabbitMQ | `true` |
+
+
+### SDAP Ingestion Parameters
+| Parameter | Description |
Default |
+|---------------------------------------|------------------------------------|---------------------------------------------|
+| `ingestion.enabled` | Enable ingestion by deploying the
Config Operator, Collection Manager, Granule Ingestion| `true` |
| `ingestion.granuleIngester.replicas` | Number of Granule Ingester replicas
| `2` |
-| `ingestion.granuleIngester.image` | Docker image and tag for Granule
Ingester| `nexusjpl/granule-ingester:0.0.1` |
+| `ingestion.granuleIngester.image` | Docker image and tag for Granule
Ingester| `nexusjpl/granule-ingester:0.1.2` |
| `ingestion.granuleIngester.cpu` | CPUs (request and limit) for each
Granule Ingester replica| `1` |
| `ingestion.granuleIngester.memory` | Memory (request and limit) for each
Granule Ingester replica| `1Gi` |
-| `ingestion.collectionManager.image` | Docker image and tag for Collection
Manager| `nexusjpl/collection-manager:0.0.2` |
+| `ingestion.collectionManager.image` | Docker image and tag for Collection
Manager| `nexusjpl/collection-manager:0.1.2` |
| `ingestion.collectionManager.cpu` | CPUs (request and limit) for the
Collection Manager | `0.5` |
-| `ingestion.collectionManager.memory` | Memory (request and limit) for the
Collection Manager | `0.5Gi` |
+| `ingestion.collectionManager.memory` | Memory (request and limit) for the
Collection Manager | `1Gi` |
| `ingestion.configOperator.image` | Docker image and tag for Config
Operator | `nexusjpl/config-operator:0.0.1` |
| `ingestion.granules.nfsServer` | An optional URL to an NFS server
containing a directory where granule files are stored. If set, this NFS server
will be mounted in the Collection Manager and Granule Ingester pods.| `nil` |
-| `ingestion.granules.mountPath` | The path in the Collection Manager
and Granule Ingester pods where granule files will be mounted. *Important:* the
`path` property on all collections in the Collections Config file should match
this value. | `/data` |
+| `ingestion.granules.mountPath` | The path in the Collection Manager
and Granule Ingester pods where granule files will be mounted. *Important:* the
`path` property on all collections in the Collections Config file should match
this value.| `/data` |
| `ingestion.granules.path` | Directory on either the local
filesystem or an NFS mount where granule files are located. This directory will
be mounted onto the Collection Manager and Granule Ingester at
`ingestion.granules.mountPath`. | `/var/lib/sdap/granules` |
+| `ingestion.granules.s3.bucket` | An optional S3 bucket from which to
download granules for ingestion. If this is set, `ingestion.granules.nfsServer`
and `ingestion.granules.path` will be ignored.|`nil`|
+| `ingestion.granules.awsCredsEnvs` | Environment variables containing AWS
credentials. This should be populated if `ingestion.granules.s3.bucket` is set.
See https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html
for possible options.|`nil`|
+| `ingestion.collections.createCrd` | Whether to automatically create the
`GitBasedConfig` CRD (custom resource definition). This CRD is only needed if
loading the Collections Config from a Git repository is enabled (i.e., only if
`ingestion.collections.git.url` is set).
| `ingestion.collections.git.url` | URL to a Git repository containing a
[Collections
Config](https://github.com/apache/incubator-sdap-ingester/tree/dev/collection_manager#the-collections-configuration-file)
file. The file should be at the root of the repository. The repository URL
should be of the form `https://github.com/username/repo.git`. This property
must be configured if ingestion is enabled! | `nil`|
| `ingestion.collections.git.branch` | Branch to use when loading a
Collections Config file from a Git repository.| `master`|
-| `ingestion.history.url` | An optional URL to a Solr database
in which to store ingestion history. If this is not set, ingestion history will
be stored in a directory instead, with the storage class configured by
`storageClass` above.| `nil`|
+| `ingestion.history.solrEnabled` | Whether to store ingestion history
in Solr, instead of in a filesystem directory. If this is set to `true`,
`ingestion.history.storageClass` will be ignored. | `true`|
+| `ingestion.history.storageClass` | The storage class to use for
storing ingestion history files. This will only be used if
`ingestion.history.solrEnabled` is set to `false`. | `hostpath`|
+
+
+### Cassandra Parameters
+
+See the [Cassandra Helm chart
docs](https://github.com/bitnami/charts/tree/master/bitnami/cassandra) for the
full list of options.
+| Parameter | Description |
Default |
+|---------------------------------------|------------------------------------|---------------------------------------------|
+| `cassandra.enabled` | Whether to deploy Cassandra |
`true` |
+| `cassandra.initDBConfigMap` | Configmap for initialization CQL
commands (done in the first node) | `init-cassandra`|
+| `cassandra.dbUser.user` | Cassandra admin user |
`cassandra` |
+| `cassandra.dbUser.password` | Password for `dbUser.user`. Randomly
generated if empty| `cassandra` |
+| `cassandra.cluster.replicaCount` | Number of Cassandra replicas |
`1` |
+| `cassandra.persistence.storageClass` | PVC Storage Class for Cassandra data
volume| `hostpath` |
+| `cassandra.persistence.size` | PVC Storage Request for Cassandra
data volume| `8Gi` |
+| `cassandra.resources.requests.cpu` | CPUs to request per Cassandra
replica| `1` |
+| `cassandra.resources.requests.memory` | Memory to request per Cassandra
replica| `8Gi` |
+| `cassandra.resources.limits.cpu` | CPU limit per Cassandra replica |
`1` |
+| `cassandra.resources.limits.memory` | Memory limit per Cassandra replica |
`8Gi` |
+| `external.cassandraHost`             | External Cassandra host to use if
`cassandra.enabled` is set to `false`. This should be set if connecting SDAP to
a Cassandra database that is not deployed by the SDAP Helm chart. | `nil`|
+| `external.cassandraUsername` | Optional Cassandra username, only
applies if `external.cassandraHost` is set.| `nil`|
+| `external.cassandraPassword` | Optional Cassandra password, only
applies if `external.cassandraHost` is set.| `nil`|
+
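+A minimal sketch of connecting SDAP to an existing Cassandra cluster, using only the parameters from the table above. The host name and credentials are placeholders, not values shipped with this chart:
+
+```yaml
+cassandra:
+  enabled: false                        # do not deploy the bundled Cassandra chart
+
+external:
+  cassandraHost: cassandra.example.com  # placeholder host
+  cassandraUsername: my-user            # optional, placeholder
+  cassandraPassword: my-password        # optional, placeholder
+```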
+
+### Solr/Zookeeper Parameters
+
+See the [Solr Helm chart
docs](https://github.com/helm/charts/tree/master/incubator/solr) and [Zookeeper
Helm chart
docs](https://github.com/helm/charts/tree/master/incubator/zookeeper) for the
full set of options.
+| Parameter | Description |
Default |
+|---------------------------------------|------------------------------------|---------------------------------------------|
+| `solr.enabled` | Whether to deploy Solr and
Zookeeper| `true` |
+| `solr.initPodEnabled`                | Whether to deploy a pod which
initializes the Solr database for SDAP (does nothing if the database is already
initialized)| `true`|
+| `solr.image.repository` | The repository to pull the Solr
docker image from| `nexusjpl/solr` |
+| `solr.image.tag` | The tag on the Solr repository to
pull| `8.4.0` |
+| `solr.replicaCount` | The number of replicas in the Solr
statefulset| `3` |
+| `solr.volumeClaimTemplates.storageClassName`| The name of the storage class
for the Solr PVC| `hostpath` |
+| `solr.volumeClaimTemplates.storageSize`| The size of the PVC |
`10Gi` |
+| `solr.resources.requests.memory` | Memory to request per Solr replica |
`2Gi` |
+| `solr.resources.requests.cpu` | CPUs to request per Solr replica |
`1` |
+| `solr.resources.limits.memory` | Memory limit per Solr replica |
`2Gi` |
+| `solr.resources.limits.cpu` | CPU limit per Solr replica |
`1` |
+| `solr.zookeeper.replicaCount` | The number of replicas in the
Zookeeper statefulset (this should be an odd number)| `3`|
+| `solr.zookeeper.persistence.storageClass`| The name of the storage class for
the Zookeeper PVC| `hostpath` |
+| `solr.zookeeper.resources.requests.memory`| Memory to request per Zookeeper
replica| `1Gi` |
+| `solr.zookeeper.resources.requests.cpu`| CPUs to request per Zookeeper
replica| `0.5` |
+| `solr.zookeeper.resources.limits.memory`| Memory limit per Zookeeper
replica| `1Gi` |
+| `solr.zookeeper.resources.limits.cpu` | CPU limit per Zookeeper replica |
`0.5` |
+| `external.solrHostAndPort`           | External Solr host to use if
`solr.enabled` is set to `false`. This should be set if connecting SDAP to a
Solr database that is not deployed by the SDAP Helm chart. | `nil`|
+| `external.zookeeperHostAndPort`      | External Zookeeper host to use if
`solr.enabled` is set to `false`. This should be set if connecting SDAP to a
Solr database and Zookeeper that are not deployed by the SDAP Helm chart. |
`nil`|
+
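+A minimal sketch of pointing SDAP at an externally managed Solr and Zookeeper, using the parameters from the table above. Both host names are placeholders. The `http://host:port` form of the Solr value is an assumption modeled on the chart's own default in `helm/templates/_helpers.tpl`, and `2181` is simply Zookeeper's usual client port:
+
+```yaml
+solr:
+  enabled: false   # do not deploy the bundled Solr and Zookeeper charts
+
+external:
+  solrHostAndPort: http://solr.example.com:8983   # placeholder, assumed format
+  zookeeperHostAndPort: zk.example.com:2181       # placeholder, assumed format
+```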
+
+### RabbitMQ Parameters
+
+See the [RabbitMQ Helm chart
docs](https://github.com/bitnami/charts/tree/master/bitnami/rabbitmq) for the
full set of options.
+| Parameter | Description |
Default |
+|---------------------------------------|------------------------------------|---------------------------------------------|
+| `rabbitmq.enabled` | Whether to deploy RabbitMQ |
`true` |
+| `rabbitmq.persistence.storageClass` | Storage class to use for RabbitMQ |
`hostpath` |
+| `rabbitmq.replicaCount` | Number of RabbitMQ replicas |
`1` |
+| `rabbitmq.auth.username` | RabbitMQ username |
`guest` |
+| `rabbitmq.auth.password` | RabbitMQ password |
`guest` |
+| `rabbitmq.ingress.enabled` | Enable ingress resource for RabbitMQ
Management console | `true` |
+
+
+### Ingress Parameters
+
+See the [nginx-ingress Helm chart
docs](https://github.com/helm/charts/tree/master/stable/nginx-ingress) for the
full set of options.
+| Parameter | Description |
Default |
+|---------------------------------------|------------------------------------|---------------------------------------------|
+| `nginx-ingress.enabled` | Whether to deploy nginx ingress
controllers| `false` |
+| `nginx-ingress.controller.scope.enabled`|Limit the scope of the ingress
controller to this namespace | `true` |
+| `nginx-ingress.controller.kind` | Install ingress controller as
Deployment, DaemonSet or Both | `DaemonSet` |
+| `nginx-ingress.controller.service.enabled`| Create a front-facing controller
service (this might be used for local or on-prem deployments) | `true` |
+| `nginx-ingress.controller.service.type`|Type of controller service to
create| `LoadBalancer` |
+| `nginx-ingress.defaultBackend.enabled`| Use default backend component
| `false` |
+
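+A minimal sketch of turning on the ingress controller, assuming the other defaults in the table above are acceptable for your environment:
+
+```yaml
+nginx-ingress:
+  enabled: true   # deploy the nginx ingress controller (default is false)
+```
+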
+## The Collections Config
+
+In order to ingest data into SDAP, you must write a Collections Config. This
is a YAML-formatted configuration which defines
+what granules to ingest into which collections (or "datasets"), and how. See
the [Collections Manager
docs](https://github.com/apache/incubator-sdap-ingester/tree/dev/collection_manager#the-collections-configuration-file)
for information on the proper content and format of the Collections Config.
+
+There are two ways to manage the Collections Config:
+
+### Option 1: Manually Create a ConfigMap
+Create a
[ConfigMap](https://kubernetes.io/docs/concepts/configuration/configmap/) by
hand, containing the collections config YAML under a key called
`collections.yml`. Then set the Chart configuration option
`ingestion.collections.configMap` to the name of the ConfigMap.
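+
+A minimal sketch of this option, assuming a ConfigMap named `my-collections-config` (a hypothetical name) created in the namespace where the chart is installed. The collection entry is borrowed from the examples later in this README:
+
+```yaml
+# my-collections-config.yml (hypothetical file name)
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: my-collections-config   # hypothetical name
+data:
+  # The key must be called collections.yml
+  collections.yml: |
+    collections:
+      - id: "CSR-RL06-Mascons_LAND"
+        path: "/data/CSR-RL06-Mascons-land/CSR_GRACE_RL06_Mascons_v01-land.nc"
+        priority: 1
+        projection: Grid
+        dimensionNames:
+          latitude: lat
+          longitude: lon
+          time: time
+          variable: lwe_thickness
+        slices:
+          time: 1
+          lat: 60
+          lon: 60
+```
+
+The chart would then be pointed at that ConfigMap through its values:
+
+```yaml
+ingestion:
+  collections:
+    configMap: my-collections-config   # must match the ConfigMap name above
+```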
+
+### Option 2: Store a File in Git
+Write a Collections Config YAML file, save it as `collections.yml`, check it
into a Git repository under the root directory, and let the [Config
Operator](https://github.com/apache/incubator-sdap-ingester/tree/dev/config_operator)
create the ConfigMap for you.
+The Config Operator will periodically read the YAML file from Git, and create
or update a ConfigMap with the contents of the file.
+
+To enable this, set `ingestion.collections.git.url` to the Git URL of the
repository containing the Collections Config file.
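+
+A minimal sketch of the corresponding values, assuming the Collections Config lives on the `master` branch. The repository URL is a placeholder in the form the parameter table asks for:
+
+```yaml
+ingestion:
+  collections:
+    git:
+      url: https://github.com/username/repo.git   # placeholder repository
+      branch: master                               # the chart's default branch
+```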
+
+## Ingestion Sources
+
+SDAP supports ingesting granules from a local directory, an AWS S3
bucket, or an NFS server. (It is not yet possible to configure SDAP to ingest
from more than one of these sources simultaneously.)
+
+### Ingesting from a Local Directory
+
+To ingest granules that are stored on the local filesystem, you must provide
the path to the directory where the granules are stored. This directory will be
mounted as a volume in the ingestion pods.
+> **Note**: if you are ingesting granules that live on the local filesystem,
the granule files must be accessible at the same location on every Kubernetes
node
+> that the collections-manager and granule-ingester pods are running on.
Because of this, it usually only makes sense to use local directory ingestion
if a) your Kubernetes cluster consists of a single node (as in the case of
running Kubernetes on a local computer), or b) you have configured nodeAffinity
to force
+> the collections-manager and granule-ingester pods to run on only one node
(see [Restricting Pods to Specific Nodes](#restricting-pods-to-specific-nodes)).
-## Restricting Pods to Specific Nodes
+The following is an example configuration for ingesting granules from a local
directory:
+
+```yaml
+ingestion:
+  granules:
+    path: /share/granules
+    mountPath: /data
+```
+
+The `ingestion.granules.mountPath` property sets the mount path in the
ingestion pods where the granule directory will be mounted. **The root
directory of the `path` property of all collection entries in the collections
config must match this value.** This is because the `path` property of
collections in the collections config tells the
+ingestion pods where to find the mounted granules.
+
+The following is an example of a collections config to be used with the local
directory ingestion configuration above:
+
+```yaml
+# collections.yml
+
+collections:
+  - id: "CSR-RL06-Mascons_LAND"
+    path: "/data/CSR-RL06-Mascons-land/CSR_GRACE_RL06_Mascons_v01-land.nc"
+    priority: 1
+    projection: Grid
+    dimensionNames:
+      latitude: lat
+      longitude: lon
+      time: time
+      variable: lwe_thickness
+    slices:
+      time: 1
+      lat: 60
+      lon: 60
+  - id: "TELLUS_GRAC-GRFO_MASCON_CRI_GRID_RL06_V2_LAND"
+    path: "/data/grace-fo-land/"
+    priority: 1
+    projection: Grid
+    dimensionNames:
+      latitude: lat
+      longitude: lon
+      time: time
+      variable: lwe_thickness
+    slices:
+      time: 1
+      lat: 60
+      lon: 60
+```
+
+### Ingesting from S3
+
+To ingest granules that are stored in an S3 bucket, you must provide the name
of the S3 bucket to read from, as well as the S3 credentials as environment
variables.
+(See the [AWS
docs](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html)
for the list of possible AWS credentials environment variables.)
+
+The following is an example configuration that enables ingestion from S3:
+
+```yaml
+ingestion:
+ granules:
+ s3:
+ bucket: my-nexus-bucket
+ awsCredsEnvs:
+ AWS_ACCESS_KEY_ID: my-secret
+ AWS_SECRET_ACCESS_KEY: my-secret
+ AWS_DEFAULT_REGION: us-west-2
+```
+
+When S3 ingestion is enabled, the `path` property of all collection entries in
the collections config must be an S3 path or prefix. (Due to S3 limitations,
wildcards are not supported.) The following
+is an example of a collections config to be used with the S3 ingestion
configuration above:
+
+```yaml
+# collections.yml
+
+collections:
+  - id: "CSR-RL06-Mascons_LAND"
+    path: "s3://my-nexus-bucket/CSR-RL06-Mascons-land/CSR_GRACE_RL06_Mascons_v01-land.nc" # full S3 path
+    priority: 1
+    projection: Grid
+    dimensionNames:
+      latitude: lat
+      longitude: lon
+      time: time
+      variable: lwe_thickness
+    slices:
+      time: 1
+      lat: 60
+      lon: 60
+  - id: "TELLUS_GRAC-GRFO_MASCON_CRI_GRID_RL06_V2_LAND"
+    path: "s3://my-nexus-bucket/grace-fo-land/" # S3 prefix
+    priority: 1
+    projection: Grid
+    dimensionNames:
+      latitude: lat
+      longitude: lon
+      time: time
+      variable: lwe_thickness
+    slices:
+      time: 1
+      lat: 60
+      lon: 60
+```
+
+### Ingesting from an NFS Host
+
+To ingest granules that are stored on an NFS host, you must provide the NFS
host URL and the path to the directory on the NFS server where the granules are
located.
+
+The following is an example configuration that enables ingestion from an NFS
host:
+
+```yaml
+ingestion:
+  granules:
+    nfsServer: nfsserver.example.com
+    path: /share/granules
+    mountPath: /data
+```
+
+The `ingestion.granules.mountPath` property sets the mount path in the
ingestion pods where the granule directory will be mounted. **The root
directory of the `path` property of all collection entries in the collections
config must match this value.** This is because the `path` property of
collections in the collections config tells the
+ingestion pods where to find the mounted granules.
+
+The following is an example of a collections config to be used with the NFS
ingestion configuration above:
+
+```yaml
+# collections.yml
+
+collections:
+  - id: "CSR-RL06-Mascons_LAND"
+    path: "/data/CSR-RL06-Mascons-land/CSR_GRACE_RL06_Mascons_v01-land.nc"
+    priority: 1
+    projection: Grid
+    dimensionNames:
+      latitude: lat
+      longitude: lon
+      time: time
+      variable: lwe_thickness
+    slices:
+      time: 1
+      lat: 60
+      lon: 60
+  - id: "TELLUS_GRAC-GRFO_MASCON_CRI_GRID_RL06_V2_LAND"
+    path: "/data/grace-fo-land/"
+    priority: 1
+    projection: Grid
+    dimensionNames:
+      latitude: lat
+      longitude: lon
+      time: time
+      variable: lwe_thickness
+    slices:
+      time: 1
+      lat: 60
+      lon: 60
+```
+
+## Other Configuration Examples
+
+### Restricting Pods to Specific Nodes
Sometimes you may wish to restrict pods to run on specific nodes, for example
if you have "UAT" and "SIT" nodes within the same cluster. You can configure
-node selectors and tolerations for all the components, as in the following
example:
+affinity and tolerations for all the components, as in the following example:
```yaml
webapp:
@@ -222,8 +471,15 @@ cassandra:
operator: Equal
value: uat
effect: NoExecute
- nodeSelector:
- environment: uat
+ affinity:
+ nodeAffinity:
+ requiredDuringSchedulingIgnoredDuringExecution:
+ nodeSelectorTerms:
+ - matchExpressions:
+ - key: environment
+ operator: In
+ values:
+ - uat
solr:
tolerations:
@@ -231,16 +487,58 @@ solr:
operator: Equal
value: uat
effect: NoExecute
- nodeSelector:
- environment: uat
+ affinity:
+ nodeAffinity:
+ requiredDuringSchedulingIgnoredDuringExecution:
+ nodeSelectorTerms:
+ - matchExpressions:
+ - key: environment
+ operator: In
+ values:
+ - uat
+ zookeeper:
+ tolerations:
+ - key: environment
+ operator: Equal
+ value: uat
+ effect: NoExecute
+ affinity:
+ nodeAffinity:
+ requiredDuringSchedulingIgnoredDuringExecution:
+ nodeSelectorTerms:
+ - matchExpressions:
+ - key: environment
+ operator: In
+ values:
+ - uat
+```
-zookeeper:
- tolerations:
- - key: environment
- operator: Equal
- value: uat
- effect: NoExecute
- nodeSelector:
- environment: uat
+### Persistence
+
+The SDAP Helm chart uses [persistent
volumes](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) for
RabbitMQ, Solr, Zookeeper, Cassandra, and optionally the Collection Manager
ingestion component (if Solr ingestion history is disabled).
+In most use cases you will want to use the same storage class for all of these
components.
+
+For example, if you are deploying SDAP on AWS and you want to use [EBS
gp2](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volume-types.html#EBSVolumeTypes_gp2)
volumes for persistent storage, you would need to set the following
+configuration values for the SDAP Helm chart:
+
+```yaml
+rabbitmq:
+  persistence:
+    storageClass: gp2
+
+cassandra:
+  persistence:
+    storageClass: gp2
+
+solr:
+  volumeClaimTemplates:
+    storageClassName: gp2
+  zookeeper:
+    persistence:
+      storageClass: gp2
+
+ingestion:
+  history:
+    storageClass: gp2 # This is only needed if Solr ingestion history is disabled, as follows:
+    solrEnabled: false
```
->**Note**: The webapp supports `affinity` instead of `nodeSelector` because the Spark Operator has deprecated `nodeSelector` in favor of `affinity`.
\ No newline at end of file
diff --git a/helm/requirements.yaml b/helm/requirements.yaml
index 7970f29..ffd1db1 100644
--- a/helm/requirements.yaml
+++ b/helm/requirements.yaml
@@ -2,10 +2,17 @@ dependencies:
- name: nginx-ingress
version: 1.28.2
repository: https://kubernetes-charts.storage.googleapis.com
- condition: ingressEnabled
+ condition: nginx-ingress.enabled
- name: rabbitmq
- version: 7.1.0
+ version: 8.0.1
repository: https://charts.bitnami.com/bitnami
- condition: ingestion.enabled
-
+ condition: rabbitmq.enabled
+ - name: solr
+ version: 1.5.2
+ repository: http://storage.googleapis.com/kubernetes-charts-incubator
+ condition: solr.enabled
+ - name: cassandra
+ version: 5.5.3
+ repository: https://charts.bitnami.com/bitnami
+ condition: cassandra.enabled
diff --git a/helm/templates/_helpers.tpl b/helm/templates/_helpers.tpl
index b697c17..56dcb4a 100644
--- a/helm/templates/_helpers.tpl
+++ b/helm/templates/_helpers.tpl
@@ -4,7 +4,7 @@
Name of the generated configmap containing the contents of the collections config file.
*/}}
{{- define "nexus.collectionsConfig.configmapName" -}}
-collections-config
+{{ .Values.ingestion.collections.configMap | default "collections-config" }}
{{- end -}}
{{/*
@@ -45,3 +45,22 @@ The data volume mount which is used in both the Collection Manager and the Granu
mountPath: {{ .Values.ingestion.granules.mountPath }}
{{- end -}}
+{{- define "nexus.urls.solr" -}}
+{{ .Values.external.solrHostAndPort | default (print "http://" .Release.Name "-solr-svc:8983") }}
+{{- end -}}
+
+{{- define "nexus.urls.zookeeper" -}}
+{{ .Values.external.zookeeperHostAndPort | default (print .Release.Name "-zookeeper:2181") }}
+{{- end -}}
+
+{{- define "nexus.urls.cassandra" -}}
+{{ .Values.external.cassandraHost | default (print .Release.Name "-cassandra") }}
+{{- end -}}
+
+{{- define "nexus.credentials.cassandra.username" -}}
+{{ .Values.external.cassandraUsername | default "cassandra" }}
+{{- end -}}
+
+{{- define "nexus.credentials.cassandra.password" -}}
+{{ .Values.external.cassandraPassword | default "cassandra" }}
+{{- end -}}
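
Note on the helpers above: each `nexus.urls.*` and `nexus.credentials.cassandra.*` helper falls back to an in-cluster default only when the matching `external.*` value is empty. A values override along the following lines should therefore point SDAP at services that were not deployed by this chart. This is a sketch under stated assumptions, not a tested configuration: every hostname and credential is a hypothetical placeholder, and including the `http://` scheme in `solrHostAndPort` is an assumption made by analogy with the default that the helper prints.

```yaml
# Hypothetical values override: use externally managed Solr, ZooKeeper and Cassandra.
solr:
  enabled: false          # skip the bundled Solr/ZooKeeper subchart (requirements.yaml condition)
  initPodEnabled: false   # assumption: the nexustiles collection already exists externally
cassandra:
  enabled: false          # skip the bundled Cassandra subchart (requirements.yaml condition)
external:
  solrHostAndPort: http://solr.example.internal:8983      # placeholder host
  zookeeperHostAndPort: zookeeper.example.internal:2181   # placeholder host
  cassandraHost: cassandra.example.internal               # placeholder host
  cassandraUsername: sdap_user                            # placeholder
  cassandraPassword: change-me                            # placeholder
```

If the `external.*` keys are left empty, the templates resolve to `<release-name>-solr-svc:8983`, `<release-name>-zookeeper:2181` and `<release-name>-cassandra`, with `cassandra`/`cassandra` as the credentials.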
diff --git a/helm/templates/cassandra.yml b/helm/templates/cassandra.yml
deleted file mode 100644
index 6023e55..0000000
--- a/helm/templates/cassandra.yml
+++ /dev/null
@@ -1,107 +0,0 @@
-apiVersion: v1
-kind: Service
-metadata:
- name: sdap-cassandra
-spec:
- clusterIP: None
- ports:
- - name: cql
- port: 9042
- targetPort: cql
- selector:
- app: sdap-cassandra
-
----
-
-apiVersion: apps/v1
-kind: StatefulSet
-metadata:
- name: cassandra-set
-spec:
- serviceName: sdap-cassandra
- replicas: {{ .Values.cassandra.replicas }}
- selector:
- matchLabels:
- app: sdap-cassandra
- template:
- metadata:
- labels:
- app: sdap-cassandra
- spec:
- terminationGracePeriodSeconds: 120
- {{ if .Values.cassandra.tolerations }}
- tolerations:
-{{ .Values.cassandra.tolerations | toYaml | indent 6 }}
- {{ end }}
- {{ if .Values.cassandra.nodeSelector }}
- nodeSelector:
-{{ .Values.cassandra.nodeSelector | toYaml | indent 8 }}
- {{ end }}
- affinity:
- podAntiAffinity:
- # Prefer spreading over all hosts
- preferredDuringSchedulingIgnoredDuringExecution:
- - weight: 100
- podAffinityTerm:
- labelSelector:
- matchExpressions:
- - key: "app"
- operator: In
- values:
- - sdap-cassandra
- topologyKey: "kubernetes.io/hostname"
- containers:
- - name: cassandra
- image: nexusjpl/cassandra:1.0.0-rc1
- imagePullPolicy: Always
- ports:
- - containerPort: 7000
- name: intra-node
- - containerPort: 7001
- name: tls-intra-node
- - containerPort: 7199
- name: jmx
- - containerPort: 9042
- name: cql
- resources:
- requests:
- cpu: {{ .Values.cassandra.requests.cpu }}
- memory: {{ .Values.cassandra.requests.memory }}
- limits:
- cpu: {{ .Values.cassandra.limits.cpu }}
- memory: {{ .Values.cassandra.limits.memory }}
- securityContext:
- capabilities:
- add:
- - IPC_LOCK
- lifecycle:
- preStop:
- exec:
- command:
- - /bin/sh
- - -c
- - nodetool drain
- env:
- - name: MAX_HEAP_SIZE
- value: 2G
- - name: HEAP_NEWSIZE
- value: 200M
- - name: CASSANDRA_SEEDS
- value: "cassandra-set-0.sdap-cassandra"
- - name: POD_IP
- valueFrom:
- fieldRef:
- fieldPath: status.podIP
- volumeMounts:
- - name: cassandra-data
- mountPath: /var/lib/cassandra
-
- volumeClaimTemplates:
- - metadata:
- name: cassandra-data
- spec:
- accessModes: [ "ReadWriteOnce" ]
- storageClassName: {{ .Values.storageClass }}
- resources:
- requests:
- storage: {{ .Values.cassandra.storage }}
diff --git a/helm/templates/collection-manager.yml b/helm/templates/collection-manager.yml
index 6708b13..993e71e 100644
--- a/helm/templates/collection-manager.yml
+++ b/helm/templates/collection-manager.yml
@@ -1,5 +1,4 @@
{{- if .Values.ingestion.enabled }}
-{{- $history := .Values.ingestion.history | default dict }}
apiVersion: apps/v1
kind: Deployment
@@ -30,13 +29,21 @@ spec:
value: {{ .Values.rabbitmq.fullnameOverride }}
- name: COLLECTIONS_PATH
value: {{ include "nexus.collectionsConfig.mountPath" . }}/collections.yml
- {{- if $history.url }}
+ {{- if .Values.ingestion.history.solrEnabled }}
- name: HISTORY_URL
- value: {{ .Values.ingestion.history.url}}
+ value: {{ include "nexus.urls.solr" . }}
{{- else }}
- name: HISTORY_PATH
value: {{ include "nexus.history.mountPath" . }}
{{- end }}
+ {{- if .Values.ingestion.granules.s3.bucket }}
+ - name: S3_BUCKET
+ value: {{ .Values.ingestion.granules.s3.bucket }}
+ {{- end }}
+ {{- range $name, $value := .Values.ingestion.granules.s3.awsCredsEnvs }}
+ - name: {{ $name }}
+ value: {{ $value }}
+ {{- end }}
resources:
requests:
cpu: {{ .Values.ingestion.collectionManager.cpu }}
@@ -45,19 +52,23 @@ spec:
cpu: {{ .Values.ingestion.collectionManager.cpu }}
memory: {{ .Values.ingestion.collectionManager.memory }}
volumeMounts:
-{{ include "nexus.ingestion.dataVolumeMount" . | indent 12 }}
- {{- if not $history.url }}
+ {{- if not .Values.ingestion.history.solrEnabled }}
- name: history-volume
mountPath: {{ include "nexus.history.mountPath" . }}
{{- end }}
- name: collections-config-volume
mountPath: {{ include "nexus.collectionsConfig.mountPath" . }}
+{{- if not .Values.ingestion.granules.s3.bucket }}
+{{ include "nexus.ingestion.dataVolumeMount" . | indent 12 }}
+{{- end }}
volumes:
+{{- if not .Values.ingestion.granules.s3.bucket }}
{{ include "nexus.ingestion.dataVolume" . | indent 8 }}
+{{- end }}
- name: collections-config-volume
configMap:
name: {{ include "nexus.collectionsConfig.configmapName" . }}
- {{- if not $history.url }}
+ {{- if not .Values.ingestion.history.solrEnabled }}
- name: history-volume
persistentVolumeClaim:
claimName: history-volume-claim
diff --git a/helm/templates/collections-config-gitcfg.yml b/helm/templates/collections-config-gitcfg.yml
index e4b7294..ea78f9a 100644
--- a/helm/templates/collections-config-gitcfg.yml
+++ b/helm/templates/collections-config-gitcfg.yml
@@ -1,3 +1,5 @@
+{{ if .Values.ingestion.enabled }}
+{{ if not .Values.ingestion.collections.configMap }}
apiVersion: sdap.apache.org/v1
kind: GitBasedConfig
metadata:
@@ -11,3 +13,5 @@ spec:
local-dir: {{ .Values.ingestion.collections.localDir }}
{{ end }}
config-map: {{ include "nexus.collectionsConfig.configmapName" . }}
+{{ end }}
+{{ end }}
\ No newline at end of file
diff --git a/helm/templates/config-operator-rbac.yml b/helm/templates/config-operator-rbac.yml
index 54064d5..b295430 100644
--- a/helm/templates/config-operator-rbac.yml
+++ b/helm/templates/config-operator-rbac.yml
@@ -1,3 +1,5 @@
+{{ if .Values.ingestion.enabled }}
+{{ if not .Values.ingestion.collections.configMap }}
apiVersion: v1
kind: ServiceAccount
metadata:
@@ -6,7 +8,7 @@ metadata:
---
apiVersion: rbac.authorization.k8s.io/v1
-kind: RoleBinding
+kind: ClusterRoleBinding
metadata:
name: config-operator-role-binding
roleRef:
@@ -16,4 +18,7 @@ roleRef:
subjects:
- kind: ServiceAccount
name: config-operator
+ namespace: {{ .Release.Namespace }}
+{{ end }}
+{{ end }}
diff --git a/helm/templates/config-operator.yml b/helm/templates/config-operator.yml
index 3f56f44..298095e 100644
--- a/helm/templates/config-operator.yml
+++ b/helm/templates/config-operator.yml
@@ -1,4 +1,5 @@
{{ if .Values.ingestion.enabled }}
+{{ if not .Values.ingestion.collections.configMap }}
apiVersion: apps/v1
kind: Deployment
metadata:
@@ -21,4 +22,4 @@ spec:
image: {{ .Values.ingestion.configOperator.image }}
imagePullPolicy: Always
{{ end }}
-
+{{ end }}
diff --git a/helm/crds/gitbasedconfig-crd.yml b/helm/templates/gitbasedconfig-crd.yml
similarity index 92%
rename from helm/crds/gitbasedconfig-crd.yml
rename to helm/templates/gitbasedconfig-crd.yml
index 6143752..8c1dd4c 100644
--- a/helm/crds/gitbasedconfig-crd.yml
+++ b/helm/templates/gitbasedconfig-crd.yml
@@ -1,3 +1,4 @@
+{{ if .Values.ingestion.collections.createCrd }}
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
@@ -31,4 +32,4 @@ spec:
type: string
config-map:
type: string
-
+{{ end }}
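
Note on the guards above: when `ingestion.collections.configMap` is set, the GitBasedConfig resource, its RBAC objects and the config-operator Deployment are all skipped, and `nexus.collectionsConfig.configmapName` resolves to the supplied name. A minimal values sketch follows. It assumes a ConfigMap named `my-collections` (a hypothetical name) already exists in the release namespace and holds a `collections.yml` key, which is the file name the Collection Manager reads through `COLLECTIONS_PATH`.

```yaml
ingestion:
  enabled: true
  collections:
    configMap: my-collections   # hypothetical ConfigMap name; expected to hold a collections.yml key
    createCrd: false            # optional: skip installing the GitBasedConfig CRD as well
```

Leaving `configMap` blank keeps the Git-based path, in which case `ingestion.collections.git.url` has to be set instead.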
diff --git a/helm/templates/granule-ingester.yml b/helm/templates/granule-ingester.yml
index 2ce03b6..405edb8 100644
--- a/helm/templates/granule-ingester.yml
+++ b/helm/templates/granule-ingester.yml
@@ -1,4 +1,5 @@
{{- if .Values.ingestion.enabled }}
+
apiVersion: apps/v1
kind: Deployment
metadata:
@@ -17,6 +18,7 @@ spec:
spec:
containers:
- image: {{ .Values.ingestion.granuleIngester.image }}
+ imagePullPolicy: Always
name: granule-ingester
env:
- name: RABBITMQ_USERNAME
@@ -26,9 +28,21 @@ spec:
- name: RABBITMQ_HOST
value: {{ .Values.rabbitmq.fullnameOverride }}
- name: CASSANDRA_CONTACT_POINTS
- value: sdap-cassandra
- - name: SOLR_HOST_AND_PORT
- value: http://sdap-solr:8983
+ value: {{ include "nexus.urls.cassandra" . }}
+ - name: CASSANDRA_USERNAME
+ value: {{ include "nexus.credentials.cassandra.username" . }}
+ - name: CASSANDRA_PASSWORD
+ value: {{ include "nexus.credentials.cassandra.password" . }}
+ - name: ZK_HOST_AND_PORT
+ value: {{ include "nexus.urls.zookeeper" . }}
+ {{ if .Values.ingestion.granuleIngester.maxConcurrency }}
+ - name: MAX_CONCURRENCY
+ value: "{{ .Values.ingestion.granuleIngester.maxConcurrency }}"
+ {{ end }}
+ {{- range $name, $value := .Values.ingestion.granules.s3.awsCredsEnvs }}
+ - name: {{ $name }}
+ value: {{ $value }}
+ {{- end }}
resources:
requests:
cpu: {{ .Values.ingestion.granuleIngester.cpu }}
@@ -37,9 +51,13 @@ spec:
cpu: {{ .Values.ingestion.granuleIngester.cpu }}
memory: {{ .Values.ingestion.granuleIngester.memory }}
volumeMounts:
+{{- if not .Values.ingestion.granules.s3.bucket }}
{{ include "nexus.ingestion.dataVolumeMount" . | indent 12 }}
+{{- end }}
volumes:
+{{- if not .Values.ingestion.granules.s3.bucket }}
{{ include "nexus.ingestion.dataVolume" . | indent 8 }}
+{{- end }}
restartPolicy: Always
{{- end }}
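
Note on the S3 branch above: it is driven entirely by `ingestion.granules.s3`. A non-empty `bucket` drops the data volume and its mount from both ingestion Deployments, and each entry of `awsCredsEnvs` is rendered as a container environment variable. The values below are a sketch, assuming the standard AWS SDK variable names are what the ingester images read (the commit does not list them); the bucket name and credentials are placeholders.

```yaml
ingestion:
  granules:
    s3:
      bucket: my-granule-bucket               # hypothetical bucket name
      awsCredsEnvs:
        AWS_ACCESS_KEY_ID: AKIAEXAMPLE        # placeholder, not a real key
        AWS_SECRET_ACCESS_KEY: example-secret # placeholder
        AWS_DEFAULT_REGION: us-west-2         # placeholder region
```

Because the templates render `value: {{ $value }}` without quoting, entries that YAML would parse as numbers or booleans may be rejected by Kubernetes, which expects environment values to be strings; the string values above avoid that.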
diff --git a/helm/templates/history-pvc.yml b/helm/templates/history-pvc.yml
index 3ecabe9..50e9249 100644
--- a/helm/templates/history-pvc.yml
+++ b/helm/templates/history-pvc.yml
@@ -1,12 +1,15 @@
+{{- if not .Values.ingestion.history.solrEnabled }}
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: history-volume-claim
+ annotations:
+ helm.sh/resource-policy: "keep"
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
- storageClassName: {{ .Values.storageClass }}
-
+ storageClassName: {{ .Values.ingestion.history.storageClass }}
+{{- end }}
diff --git a/helm/templates/init-cassandra-configmap.yml b/helm/templates/init-cassandra-configmap.yml
new file mode 100644
index 0000000..3e7ed3c
--- /dev/null
+++ b/helm/templates/init-cassandra-configmap.yml
@@ -0,0 +1,13 @@
+apiVersion: v1
+data:
+ init.cql: |
+ CREATE KEYSPACE IF NOT EXISTS nexustiles WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor': 1 };
+
+ CREATE TABLE IF NOT EXISTS nexustiles.sea_surface_temp (
+ tile_id uuid PRIMARY KEY,
+ tile_blob blob
+ );
+kind: ConfigMap
+metadata:
+ name: init-cassandra
+ namespace: {{ .Release.Namespace }}
diff --git a/helm/templates/solr-create-collection.yml b/helm/templates/solr-create-collection.yml
new file mode 100644
index 0000000..717cb42
--- /dev/null
+++ b/helm/templates/solr-create-collection.yml
@@ -0,0 +1,34 @@
+{{ if .Values.solr.initPodEnabled }}
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: solr-create-collection
+spec:
+ selector:
+ matchLabels:
+ app: solr-create-collection # has to match .spec.template.metadata.labels
+ replicas: 1
+ template:
+ metadata:
+ labels:
+ app: solr-create-collection
+ spec:
+ containers:
+ - name: solr-create-collection
+ imagePullPolicy: Always
+ image: nexusjpl/solr-cloud-init:1.0.2
+ resources:
+ requests:
+ memory: "0.5Gi"
+ cpu: "0.25"
+ env:
+ - name: MINIMUM_NODES
+ value: "{{ .Values.solr.replicaCount }}"
+ - name: SDAP_SOLR_URL
+ value: {{ include "nexus.urls.solr" . }}/solr/
+ - name: SDAP_ZK_SOLR
+ value: {{ include "nexus.urls.zookeeper" . }}/solr
+ - name: CREATE_COLLECTION_PARAMS
+ value: "name=nexustiles&numShards=$(MINIMUM_NODES)&waitForFinalState=true"
+ restartPolicy: Always
+{{ end }}
\ No newline at end of file
diff --git a/helm/templates/solr.yml b/helm/templates/solr.yml
deleted file mode 100644
index c8d0f9b..0000000
--- a/helm/templates/solr.yml
+++ /dev/null
@@ -1,129 +0,0 @@
-apiVersion: v1
-kind: Service
-metadata:
- name: sdap-solr
-spec:
- ports:
- - port: 8983
- clusterIP: None
- selector:
- app: sdap-solr
-
----
-
-apiVersion: apps/v1
-kind: StatefulSet
-metadata:
- name: solr-set
-spec:
- selector:
- matchLabels:
- app: sdap-solr # has to match .spec.template.metadata.labels
- serviceName: "sdap-solr"
- replicas: {{.Values.solr.replicas }} # by default is 1
- podManagementPolicy: Parallel
- template:
- metadata:
- labels:
- app: sdap-solr # has to match .spec.selector.matchLabels
- spec:
- terminationGracePeriodSeconds: 10
- {{ if .Values.solr.tolerations }}
- tolerations:
-{{ .Values.solr.tolerations | toYaml | indent 6 }}
- {{ end }}
- {{ if .Values.solr.nodeSelector }}
- nodeSelector:
-{{ .Values.solr.nodeSelector | toYaml | indent 8 }}
- {{ end }}
- affinity:
- podAntiAffinity:
- # Prefer spreading over all hosts
- preferredDuringSchedulingIgnoredDuringExecution:
- - weight: 100
- podAffinityTerm:
- labelSelector:
- matchExpressions:
- - key: "app"
- operator: In
- values:
- - sdap-solr
- topologyKey: "kubernetes.io/hostname"
- securityContext:
- runAsUser: 8983
- fsGroup: 8983
- containers:
- - name: solr-create-collection
- imagePullPolicy: Always
- image: nexusjpl/solr-cloud-init:1.0.0-rc1
- resources:
- requests:
- memory: "1Gi"
- cpu: "0.25"
- env:
- - name: MINIMUM_NODES
- value: "2" # MINIMUM_NODES should be the same as spec.replicas
- - name: SOLR_HOST
- valueFrom:
- fieldRef:
- fieldPath: status.podIP
- - name: SDAP_SOLR_URL
- value: http://$(SOLR_HOST):8983/solr/
- - name: SDAP_ZK_SOLR
- value: "zk-hs:2181/solr"
- - name: CREATE_COLLECTION_PARAMS
- value: "name=nexustiles&collection.configName=nexustiles&numShards=$(MINIMUM_NODES)&waitForFinalState=true"
- - name: solr-cloud
- imagePullPolicy: Always
- image: nexusjpl/solr-cloud:1.0.0-rc1
- resources:
- requests:
- memory: {{ .Values.solr.requests.memory }}
- cpu: {{ .Values.solr.requests.cpu }}
- limits:
- memory: {{ .Values.solr.limits.memory }}
- cpu: {{ .Values.solr.limits.cpu }}
- env:
- - name: SOLR_HEAP
- value: {{ .Values.solr.heap }}
- - name: SOLR_HOST
- valueFrom:
- fieldRef:
- fieldPath: status.podIP
- - name: SDAP_ZK_SERVICE_HOST
- value: "zk-hs"
- ports:
- - containerPort: 8983
- name: http
- volumeMounts:
- - name: solr-data
- mountPath: /opt/solr/server/solr/
- readinessProbe:
- exec:
- command:
- - solr
- - healthcheck
- - -c
- - nexustiles
- - -z
- - zk-hs:2181/solr
- initialDelaySeconds: 10
- timeoutSeconds: 5
- livenessProbe:
- exec:
- command:
- - solr
- - assert
- - -s
- - http://localhost:8983/solr/
- initialDelaySeconds: 10
- timeoutSeconds: 5
- volumeClaimTemplates:
- - metadata:
- name: solr-data
- spec:
- accessModes: [ "ReadWriteOnce" ]
- storageClassName: {{ .Values.storageClass }}
- resources:
- requests:
- storage: {{ .Values.solr.storage }}
diff --git a/helm/templates/webapp.yml b/helm/templates/webapp.yml
index d77496f..d14f877 100644
--- a/helm/templates/webapp.yml
+++ b/helm/templates/webapp.yml
@@ -9,8 +9,13 @@ spec:
pythonVersion: "2"
mode: cluster
image: {{ .Values.webapp.distributed.image }}
- imagePullPolicy: Always
+ imagePullPolicy: Always
mainApplicationFile: local:///incubator-sdap-nexus/analysis/webservice/webapp.py
+ arguments:
+ - --cassandra-host={{ include "nexus.urls.cassandra" . }}
+ - --cassandra-username={{ include "nexus.credentials.cassandra.username" . }}
+ - --cassandra-password={{ include "nexus.credentials.cassandra.password" . }}
+ - --solr-host={{ include "nexus.urls.solr" . }}
sparkVersion: "2.4.4"
restartPolicy:
type: OnFailure
diff --git a/helm/templates/zookeeper.yml b/helm/templates/zookeeper.yml
deleted file mode 100644
index bdc3925..0000000
--- a/helm/templates/zookeeper.yml
+++ /dev/null
@@ -1,144 +0,0 @@
-apiVersion: v1
-kind: Service
-metadata:
- name: zk-hs
- labels:
- app: zk
-spec:
- ports:
- - port: 2888
- name: server
- - port: 3888
- name: leader-election
- clusterIP: None
- selector:
- app: zk
----
-apiVersion: v1
-kind: Service
-metadata:
- name: zk-cs
- labels:
- app: zk
-spec:
- ports:
- - port: 2181
- name: client
- selector:
- app: zk
----
-apiVersion: policy/v1beta1
-kind: PodDisruptionBudget
-metadata:
- name: zk-pdb
-spec:
- selector:
- matchLabels:
- app: zk
- maxUnavailable: 1
----
-apiVersion: apps/v1
-kind: StatefulSet
-metadata:
- name: zk
-spec:
- selector:
- matchLabels:
- app: zk
- serviceName: zk-hs
- replicas: {{ .Values.zookeeper.replicas }}
- updateStrategy:
- type: RollingUpdate
- podManagementPolicy: Parallel
- template:
- metadata:
- labels:
- app: zk
- spec:
- {{ if .Values.zookeeper.tolerations }}
- tolerations:
-{{ .Values.zookeeper.tolerations | toYaml | indent 6 }}
- {{ end }}
- {{ if .Values.zookeeper.nodeSelector }}
- nodeSelector:
-{{ .Values.zookeeper.nodeSelector | toYaml | indent 8 }}
- {{ end }}
- affinity:
- podAntiAffinity:
- preferredDuringSchedulingIgnoredDuringExecution:
- - weight: 100
- podAffinityTerm:
- labelSelector:
- matchExpressions:
- - key: "app"
- operator: In
- values:
- - zk
- topologyKey: "kubernetes.io/hostname"
- containers:
- - name: kubernetes-zookeeper
- imagePullPolicy: Always
- image: "k8s.gcr.io/kubernetes-zookeeper:1.0-3.4.10"
- resources:
- requests:
- memory: {{ .Values.zookeeper.memory }}
- cpu: {{ .Values.zookeeper.cpu }}
- ports:
- - containerPort: 2181
- name: client
- - containerPort: 2888
- name: server
- - containerPort: 3888
- name: leader-election
- command:
- - sh
- - -c
- - "start-zookeeper \
- --servers={{ .Values.zookeeper.replicas }} \
- --data_dir=/var/lib/zookeeper/data \
- --data_log_dir=/var/lib/zookeeper/data/log \
- --conf_dir=/opt/zookeeper/conf \
- --client_port=2181 \
- --election_port=3888 \
- --server_port=2888 \
- --tick_time=2000 \
- --init_limit=10 \
- --sync_limit=5 \
- --heap=512M \
- --max_client_cnxns=60 \
- --snap_retain_count=3 \
- --purge_interval=12 \
- --max_session_timeout=40000 \
- --min_session_timeout=4000 \
- --log_level=INFO"
- readinessProbe:
- exec:
- command:
- - sh
- - -c
- - "zookeeper-ready 2181"
- initialDelaySeconds: 10
- timeoutSeconds: 5
- livenessProbe:
- exec:
- command:
- - sh
- - -c
- - "zookeeper-ready 2181"
- initialDelaySeconds: 10
- timeoutSeconds: 5
- volumeMounts:
- - name: zkdatadir
- mountPath: /var/lib/zookeeper
- securityContext:
- runAsUser: 1000
- fsGroup: 1000
- volumeClaimTemplates:
- - metadata:
- name: zkdatadir
- spec:
- accessModes: [ "ReadWriteOnce" ]
- storageClassName: {{ .Values.storageClass }}
- resources:
- requests:
- storage: {{ .Values.zookeeper.storage }}
diff --git a/helm/values.yaml b/helm/values.yaml
index c012e6e..657fb6a 100644
--- a/helm/values.yaml
+++ b/helm/values.yaml
@@ -1,6 +1,4 @@
-## This is the StorageClass that will be used for Cassandra, Solr, Zookeeper,
-## and ingestion history (if ingestion history is not configured to use Solr)
-storageClass: hostpath
+onEarthProxyIP: ""
rootWebpage:
enabled: true
@@ -8,7 +6,7 @@ rootWebpage:
webapp:
enabled: true
distributed:
- image: nexusjpl/nexus-webapp:distributed.0.1.5
+ image: nexusjpl/nexus-webapp:distributed.0.2.2
## Use any of the driver configuration options available at
## https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/docs/user-guide.md
@@ -27,11 +25,12 @@ webapp:
## This section deals with the ingestion components of SDAP
ingestion:
+ # If ingestion.enabled=true, collections-ingester and granule-ingester will be deployed
enabled: true
granuleIngester:
replicas: 2
- image: nexusjpl/granule-ingester:0.0.1
+ image: nexusjpl/granule-ingester:0.1.2
## cpu refers to both request and limit
cpu: 1
@@ -40,7 +39,7 @@ ingestion:
memory: 1Gi
collectionManager:
- image: nexusjpl/collection-manager:0.0.2
+ image: nexusjpl/collection-manager:0.1.2
## cpu refers to both request and limit
cpu: 0.5
@@ -53,87 +52,105 @@ ingestion:
## How to mount the granule files to ingest
granules:
- ## Enable nfsServer if you want to mount the granules from an NFS server,
- ## otherwise they will be loaded from the local filesystem.
- ## mountPath and path should be set whether or not nfsServer is enabled.
- # nfsServer: nfs-server.com
## mountPath is the path in the Collection Manager and Granule Ingester pods
## where the granule files will be mounted.
## IMPORTANT: the `path` property on all collections in the Collections Config file
- ## should match this value.
- ## Example: if mountPath is set to /data, then every collection in the Collections
- ## Config file should have something like
+ ## should have mountPath as the root.
+ ## Example: if mountPath = /data, then every collection in the Collections
+ ## Config file should have something like:
## path: /data/<some-directory>/<some-file-pattern>
mountPath: /data
+ ## Set nfsServer to an NFS host URL if you want to mount the granules from an NFS server.
+ ## For S3 or local filesystem ingestion, leave nfsServer blank.
+ nfsServer:
+
## path is the path on either local filesystem or NFS mount at which
- ## the granule files are stored.
- path: /var/lib/sdap/granules
+ ## the granule files are stored. This will be ignored if S3 ingestion is enabled.
+ path:
+
+ s3:
+ ## If bucket has a value, S3 ingestion will be enabled (and nfsServer will be ignored even if it has a value).
+ bucket:
+
+ ## awsCredsEnvs can include any environment variables that contain AWS credentials
+ awsCredsEnvs: {}
## Where to find the Collections Config file
## ref: https://github.com/apache/incubator-sdap-ingester/tree/dev/collection_manager#the-collections-configuration-file
## Either localDir should be set, or the git options, but not both.
collections:
+ createCrd: true
- ## Load the Collections Config file from a local path
- ## This is a future option that is not yet supported!
- # localDir: /Users/edford/Desktop/collections.yml
+ ## Name of a ConfigMap containing the Collections Config YAML.
+ ## Leave this blank if Git is enabled below.
+ configMap:
- ## Load the Collections Config file from a git repository
- ## Until localDir is supported, this configuration is mandatory
+ ## Load the Collections Config file from a git repository.
git:
-
## This should be an https repository url of the form https://github.com/username/repo.git
url:
-
branch: master
-
- ## token is not yet supported!
# token: someToken
## Where to store ingestion history
- ## Defaults to a using a history directory, stored on a PVC using the storageClass defined in this file above
+ ## Defaults to Solr for ingestion history storage
history:
- ## Store ingestion history in a solr database instead of a filesystem directory
- # url: http://history-solr
+ ## Whether to store ingestion history in a solr database instead of a filesystem directory
+ solrEnabled: true
-cassandra:
- replicas: 2
- storage: 13Gi
- requests:
- cpu: 1
- memory: 3Gi
- limits:
- cpu: 1
- memory: 3Gi
+ ## storage class to use for ingestion history file only if solrEnabled = false
+ storageClass: hostpath
-solr:
- replicas: 2
- storage: 10Gi
- heap: 4g
- requests:
- memory: 5Gi
- cpu: 1
- limits:
- memory: 5Gi
- cpu: 1
-zookeeper:
- replicas: 3
- memory: 1Gi
- cpu: 0.5
- storage: 8Gi
+## The values in this section are relevant if using Solr, Zookeeper, or Cassandra that were not deployed from this Helm chart
+external:
+ solrHostAndPort:
+ zookeeperHostAndPort:
+ cassandraHost:
+ cassandraUsername:
+ cassandraPassword:
-ingressEnabled: false
+## Configuration values for the Solr and Zookeeper dependencies
+## ref: https://github.com/helm/charts/tree/master/incubator/solr
+## ref: https://github.com/helm/charts/tree/master/incubator/zookeeper
+solr:
+ enabled: true
+ initPodEnabled: true
+ image:
+ repository: nexusjpl/solr
+ tag: 8.4.0
+ replicaCount: 3
+ volumeClaimTemplates:
+ storageClassName: hostpath
+ storageSize: 10Gi
+ resources:
+ requests:
+ memory: 2Gi
+ cpu: 1
+ limits:
+ memory: 2Gi
+ cpu: 1
+ zookeeper:
+ replicaCount: 3
+ persistence:
+ storageClass: hostpath
+ resources:
+ limits:
+ memory: 1Gi
+ cpu: 0.5
+ requests:
+ memory: 1Gi
+ cpu: 0.5
-onEarthProxyIP: ""
## Configuration values for the nginx-ingress dependency
## ref: https://github.com/helm/charts/tree/master/stable/nginx-ingress
nginx-ingress:
+ enabled: false
controller:
scope:
enabled: true
@@ -150,10 +167,34 @@ nginx-ingress:
rabbitmq:
## fullnameOverride sets the name of the RabbitMQ service
## with which the ingestion components will communicate.
+ enabled: true
+ persistence:
+ storageClass: hostpath
fullnameOverride: rabbitmq
replicaCount: 1
auth:
username: guest
password: guest
ingress:
- enabled: true
\ No newline at end of file
+ enabled: true
+
+## Configuration values for the cassandra dependency
+## ref: https://github.com/bitnami/charts/tree/master/bitnami/cassandra
+cassandra:
+ enabled: true
+ initDBConfigMap: init-cassandra
+ dbUser:
+ user: cassandra
+ password: cassandra
+ cluster:
+ replicaCount: 1
+ persistence:
+ storageClass: hostpath
+ size: 8Gi
+ resources:
+ requests:
+ cpu: 1
+ memory: 8Gi
+ limits:
+ cpu: 1
+ memory: 8Gi