This is an automated email from the ASF dual-hosted git repository.

suvasude pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-gobblin.git


The following commit(s) were added to refs/heads/master by this push:
     new 570f3d7  [GOBBLIN-932] Create deployment for Azure, clean up existing deployments
570f3d7 is described below

commit 570f3d7129e708fd9583ef47676333cc486a97a5
Author: William Lo <[email protected]>
AuthorDate: Fri Dec 6 15:47:57 2019 -0800

    [GOBBLIN-932] Create deployment for Azure, clean up existing deployments
    
    Closes #2799 from Will-Lo/azure-deploy
---
 .../user-guide/Azure-Kubernetes-Deployment.md      | 83 ++++++++++++++++++++++
 .../user-guide/Building-Gobblin-as-a-Service.md    | 22 +++++-
 .../gobblin-service/azure-cluster/ingress.yaml     | 13 ++++
 .../azure-cluster/kustomization.yaml               |  4 ++
 .../{application.yaml => deployment.yaml}          | 24 ++-----
 .../flowconfig-templates/distcp.template           | 51 +++++++++++++
 .../gobblin-service/base-cluster/ingress.yaml      |  4 +-
 .../base-cluster/kustomization.yaml                | 11 +++
 .../gobblin-service/base-cluster/service.yaml      | 14 ++++
 .../gobblin-service/base-cluster/storage.yaml      | 28 --------
 .../{application.yaml => deployment.yaml}          | 22 ++----
 .../mysql-cluster/kustomization.yaml               |  5 +-
 .../mysql-cluster/mysql-deployment.yaml            |  2 +-
 .../gobblin-service/mysql-cluster/mysql-pv.yaml    |  2 -
 14 files changed, 214 insertions(+), 71 deletions(-)

diff --git a/gobblin-docs/user-guide/Azure-Kubernetes-Deployment.md b/gobblin-docs/user-guide/Azure-Kubernetes-Deployment.md
new file mode 100644
index 0000000..61b789c
--- /dev/null
+++ b/gobblin-docs/user-guide/Azure-Kubernetes-Deployment.md
@@ -0,0 +1,83 @@
+# GaaS on Azure Deployment Steps
+
+## Create Azure Container Registry [Optional]
+
+1\) Log into Azure Container Registry
+
+```bash
+$ az acr login --name gobblintest
+```
+
+2\) Tag docker images to container registry
+
+```bash
+$ docker tag <gaas_image_id> gobblintest.azurecr.io/gobblin-service
+$ docker tag <standalone_image_id> gobblintest.azurecr.io/gobblin-standalone
+```
+
+3\) Push the images
+
+```bash
+$ docker push gobblintest.azurecr.io/gobblin-service
+$ docker push gobblintest.azurecr.io/gobblin-standalone
+```
+
+The images should now be hosted on azure with the tag:latest
+
+## Deploy the base K8s cluster
+
+1\) Create a resource group on Azure
+
+2\) Create a cluster and deploy it onto the resource group
+
+```bash
+az aks create --resource-group <resource_group_name> --name GaaS-cluster-test --node-count 1 --enable-addons monitoring --generate-ssh-keys
+```
+
+3\) Switch kubectl to use azure
+
+4\) Check status of cluster
+
+```bash
+$ kubectl get pods
+```
+
+## Install the nginx ingress to connect to the Azure Cluster
+
+1\) Install helm if you don't currently have it
+
+```bash
+brew install helm
+helm init
+```
+
+2\) Deploy the nginx helm chart to create the ingress
+
+```bash
+helm install stable/nginx-ingress
+```
+
+If this is the first time deploying helm (v2.0), you will need to set up the tiller, which is a helm serviceaccount with sudo permissions that lives inside of the cluster. Otherwise you'll run into this [issue](https://github.com/helm/helm/issues/2224).
+
+> Error: configmaps is forbidden: User "system:serviceaccount:kube-system:default" cannot list configmaps in the namespace "kube-system"
+
+To set up the tiller \(steps are also found in the issue link\)
+
+```bash
+kubectl create serviceaccount --namespace kube-system tiller
+kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
+kubectl edit deploy --namespace kube-system tiller-deploy #and add the line serviceAccount: tiller to spec/template/spec
+```
+
+3\) Deploy the ingress controller in `gobblin-kubernetes/gobblin-service/azure-cluster`
+
+4\) Run `kubectl get services`, and the output should look something like this:
+
+```text
+gaas-svc                                        ClusterIP      10.0.176.58    <none>           6956/TCP                     16h
+honorary-possum-nginx-ingress-controller        LoadBalancer   10.0.182.255   <EXTERNAL_IP>    80:30488/TCP,443:31835/TCP   6m13s
+honorary-possum-nginx-ingress-default-backend   ClusterIP      10.0.236.153   <none>           80/TCP                       6m13s
+kubernetes                                      ClusterIP      10.0.0.1       <none>           443/TCP                      10d
+```
+
+5\) Send a request to the IP for the `honorary-possum-nginx-ingress-controller`
diff --git a/gobblin-docs/user-guide/Building-Gobblin-as-a-Service.md b/gobblin-docs/user-guide/Building-Gobblin-as-a-Service.md
index 8cb6fcf..5661b18 100644
--- a/gobblin-docs/user-guide/Building-Gobblin-as-a-Service.md
+++ b/gobblin-docs/user-guide/Building-Gobblin-as-a-Service.md
@@ -31,4 +31,24 @@ To run the full docker compose:
 4. `docker compose -f gobblin-docker/gobblin-service/alpine-gaas-latest/docker-compose.yml build`
 5. `docker compose -f gobblin-docker/gobblin-service/alpine-gaas-latest/docker-compose.yml up`
  
-The docker container exposes the endpoints from Gobblin as a Service which can be accessed on `localhost:6956`
\ No newline at end of file
+The docker container exposes the endpoints from Gobblin as a Service which can be accessed on `localhost:6956`
+
+# Running Gobblin as a Service with Kubernetes
+Gobblin as a service also has a kubernetes cluster, which can be deployed to any K8s environment.
+
+Currently, the yamls use [Kustomize](https://kubernetes.io/docs/tasks/manage-kubernetes-objects/kustomization/) for configuration management. In the future, we may utilise Helm instead.
+
+The cluster is split into 3 environments
+1) base-cluster (deploys one pod of GaaS and Gobblin standalone, where GaaS writes jobSpecs to a folder tracked by the standalone instance)
+2) mysql-cluster (utilises MySQL for storing specStores instead of FS, future work may involve writing to a job queue to be picked by gobblin standalone)
+3) azure-cluster (deploys Dev on Microsoft Azure), more docs [here](./Azure-Kubernetes-Deployment.md)
+
+To add any flow config template for GaaS to use, add the `.template` file to `gobblin-kubernetes/gobblin-service/base-cluster/` and add the file to the configmap.
+For production purposes, flow config templates should be stored in a proper file system or a database instead of being added to the configmap.
+
+To deploy any of these clusters, run the following command from the repository root.
+```
+kubectl apply -k gobblin-kubernetes/gobblin-service/<ENV>/
+```
+
+There, find the external IP of the cluster and start sending requests.
diff --git a/gobblin-kubernetes/gobblin-service/azure-cluster/ingress.yaml b/gobblin-kubernetes/gobblin-service/azure-cluster/ingress.yaml
new file mode 100644
index 0000000..a419a5a
--- /dev/null
+++ b/gobblin-kubernetes/gobblin-service/azure-cluster/ingress.yaml
@@ -0,0 +1,13 @@
+apiVersion: extensions/v1beta1
+kind: Ingress
+metadata:
+  name: gaas-ingress
+  annotations:
+    # utilize an nginx ingress as default, to set up read file at incubator-gobblin/gobblin-docs/user-guide/Azure-Kubernetes-Deployment.md
+    kubernetes.io/ingress.class: nginx
+    nginx.ingress.kubernetes.io/ssl-redirect: "false"
+    nginx.ingress.kubernetes.io/rewrite-target: /$1
+spec:
+  backend:
+    serviceName: gaas-svc
+    servicePort: 6956
diff --git a/gobblin-kubernetes/gobblin-service/azure-cluster/kustomization.yaml b/gobblin-kubernetes/gobblin-service/azure-cluster/kustomization.yaml
new file mode 100644
index 0000000..dd4abf1
--- /dev/null
+++ b/gobblin-kubernetes/gobblin-service/azure-cluster/kustomization.yaml
@@ -0,0 +1,4 @@
+bases:
+  - ../mysql-cluster
+patchesStrategicMerge:
+  - ingress.yaml
diff --git a/gobblin-kubernetes/gobblin-service/base-cluster/application.yaml b/gobblin-kubernetes/gobblin-service/base-cluster/deployment.yaml
similarity index 79%
rename from gobblin-kubernetes/gobblin-service/base-cluster/application.yaml
rename to gobblin-kubernetes/gobblin-service/base-cluster/deployment.yaml
index c50a4b7..57bec9d 100644
--- a/gobblin-kubernetes/gobblin-service/base-cluster/application.yaml
+++ b/gobblin-kubernetes/gobblin-service/base-cluster/deployment.yaml
@@ -22,18 +22,17 @@ spec:
         - name: 'shared-jobs'
           persistentVolumeClaim:
             claimName: shared-jobs-claim
-        - name: 'shared-template-catalogs'
-          persistentVolumeClaim:
-            claimName: shared-template-catalogs-claim
+        - name: flowconfig-templates
+          configMap:
+            name: flowconfig-templates
       containers:
         - name: gobblin-service
           image: will97/gobblin-as-a-service:latest
           volumeMounts:
             - name: shared-jobs
               mountPath: /tmp/gobblin-as-service/jobs
-            - name: shared-template-catalogs
+            - name: flowconfig-templates
               mountPath: /tmp/templateCatalog
-
 ---
 apiVersion: apps/v1
 kind: Deployment
@@ -62,18 +61,3 @@ spec:
           volumeMounts:
             - name: shared-jobs
               mountPath: /tmp/gobblin-standalone/jobs
----
-apiVersion: v1
-kind: Service
-metadata:
-  name: gaas-svc
-  labels:
-    app: gobblin-service
-spec:
-  type: ClusterIP
-  ports:
-    - protocol: TCP
-      port: 6956
-      targetPort: 6956
-  selector:
-    app: gaas
diff --git a/gobblin-kubernetes/gobblin-service/base-cluster/flowconfig-templates/distcp.template b/gobblin-kubernetes/gobblin-service/base-cluster/flowconfig-templates/distcp.template
new file mode 100644
index 0000000..1626abb
--- /dev/null
+++ b/gobblin-kubernetes/gobblin-service/base-cluster/flowconfig-templates/distcp.template
@@ -0,0 +1,51 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# ====================================================================
+# Job configurations
+# ====================================================================
+
+gobblin.template.required_attributes="from,to"
+
+job.name=Distcp
+job.description="Distributed copy"
+
+# target location for copy
+data.publisher.final.dir=${gobblin.flow.output.dataset.descriptor.path}
+gobblin.dataset.pattern=${gobblin.flow.input.dataset.descriptor.path}
+
+gobblin.dataset.profile.class=org.apache.gobblin.data.management.copy.CopyableGlobDatasetFinder
+
+# ====================================================================
+# Distcp configurations
+# ====================================================================
+
+extract.namespace=org.apache.gobblin.copy
+data.publisher.type=org.apache.gobblin.data.management.copy.publisher.CopyDataPublisher
+source.class=org.apache.gobblin.data.management.copy.CopySource
+writer.builder.class=org.apache.gobblin.data.management.copy.writer.FileAwareInputStreamDataWriterBuilder
+converter.classes=org.apache.gobblin.converter.IdentityConverter
+
+task.maxretries=0
+workunit.retry.enabled=false
+
+distcp.persist.dir=/tmp/distcp-persist-dir
+
+cleanup.staging.data.per.task=false
+gobblin.trash.skip.trash=true
+state.store.enabled=false
+job.commit.parallelize=true
diff --git a/gobblin-kubernetes/gobblin-service/base-cluster/ingress.yaml b/gobblin-kubernetes/gobblin-service/base-cluster/ingress.yaml
index 7c8f99c..c50c50b 100644
--- a/gobblin-kubernetes/gobblin-service/base-cluster/ingress.yaml
+++ b/gobblin-kubernetes/gobblin-service/base-cluster/ingress.yaml
@@ -1,8 +1,8 @@
-apiVersion: networking.k8s.io/v1beta1
+apiVersion: extensions/v1beta1
 kind: Ingress
 metadata:
   name: gaas-ingress
 spec:
   backend:
     serviceName: gaas-svc
-    servicePort: 6956
\ No newline at end of file
+    servicePort: 6956
diff --git a/gobblin-kubernetes/gobblin-service/base-cluster/kustomization.yaml b/gobblin-kubernetes/gobblin-service/base-cluster/kustomization.yaml
new file mode 100644
index 0000000..eeb1f1e
--- /dev/null
+++ b/gobblin-kubernetes/gobblin-service/base-cluster/kustomization.yaml
@@ -0,0 +1,11 @@
+resources:
+  - deployment.yaml
+  - storage.yaml
+  - service.yaml
+  - ingress.yaml
+configMapGenerator:
+  # only used for development purposes to allow an easy way to expose template files to GaaS
+  # add flow templates here
+  - name: flowconfig-templates
+    files:
+      - flowconfig-templates/distcp.template
diff --git a/gobblin-kubernetes/gobblin-service/base-cluster/service.yaml b/gobblin-kubernetes/gobblin-service/base-cluster/service.yaml
new file mode 100644
index 0000000..9d163f1
--- /dev/null
+++ b/gobblin-kubernetes/gobblin-service/base-cluster/service.yaml
@@ -0,0 +1,14 @@
+apiVersion: v1
+kind: Service
+metadata:
+  name: gaas-svc
+  labels:
+    app: gobblin-service
+spec:
+  type: ClusterIP
+  ports:
+    - protocol: TCP
+      port: 6956
+      targetPort: 6956
+  selector:
+    app: gaas
diff --git a/gobblin-kubernetes/gobblin-service/base-cluster/storage.yaml b/gobblin-kubernetes/gobblin-service/base-cluster/storage.yaml
index 0765f98..3e98769 100644
--- a/gobblin-kubernetes/gobblin-service/base-cluster/storage.yaml
+++ b/gobblin-kubernetes/gobblin-service/base-cluster/storage.yaml
@@ -24,31 +24,3 @@ spec:
   resources:
     requests:
       storage: 100Mi
----
-apiVersion: v1
-kind: PersistentVolume
-metadata:
-  name: shared-template-catalogs-volume
-spec:
-  capacity:
-    storage: 50Mi
-  volumeMode: Filesystem
-  accessModes:
-    - ReadWriteOnce
-  persistentVolumeReclaimPolicy: Delete
-  storageClassName: manual
-  hostPath:
-    path: "/tmp/templateCatalog"
----
-kind: PersistentVolumeClaim
-apiVersion: v1
-metadata:
-  name: shared-template-catalogs-claim
-spec:
-  accessModes:
-    - ReadWriteOnce
-  storageClassName: manual
-  resources:
-    requests:
-      storage: 50Mi
-
diff --git a/gobblin-kubernetes/gobblin-service/mysql-cluster/application.yaml b/gobblin-kubernetes/gobblin-service/mysql-cluster/deployment.yaml
similarity index 89%
rename from gobblin-kubernetes/gobblin-service/mysql-cluster/application.yaml
rename to gobblin-kubernetes/gobblin-service/mysql-cluster/deployment.yaml
index 20a3226..c71a4ad 100644
--- a/gobblin-kubernetes/gobblin-service/mysql-cluster/application.yaml
+++ b/gobblin-kubernetes/gobblin-service/mysql-cluster/deployment.yaml
@@ -22,6 +22,9 @@ spec:
         - name: shared-jobs
           persistentVolumeClaim:
             claimName: shared-jobs-claim
+        - name: flowconfig-templates
+          configMap:
+            name: flowconfig-templates
         - name: gaas-config
           configMap:
             name: gaas-config
@@ -44,6 +47,8 @@ spec:
           volumeMounts:
             - name: shared-jobs
               mountPath: /tmp/gobblin-as-service/jobs
+            - name: flowconfig-templates
+              mountPath: /tmp/templateCatalog
             - name: gaas-config
               mountPath: /home/gobblin/conf/gobblin-as-service/application.conf
               subPath: gaas-application.conf
@@ -51,7 +56,7 @@ spec:
       initContainers:
         - name: init-mysql
           image: busybox:1.28
-          command: ["sh", "-c", "until nslookup mysql; do echo waiting for mysql; sleep 2; done;"]
+          command: ['sh', '-c', 'until nslookup mysql; do echo waiting for mysql; sleep 2; done;']
 
 
 ---
@@ -88,18 +93,3 @@ spec:
             - name: standalone-config
               mountPath: /home/gobblin/conf/standalone/application.conf
               subPath: standalone-application.conf
----
-apiVersion: v1
-kind: Service
-metadata:
-  name: gaas-svc
-  labels:
-    app: gobblin-service
-spec:
-  type: NodePort
-  ports:
-    - port: 6956
-      protocol: TCP
-      targetPort: 6956
-  selector:
-    app: gaas
diff --git a/gobblin-kubernetes/gobblin-service/mysql-cluster/kustomization.yaml b/gobblin-kubernetes/gobblin-service/mysql-cluster/kustomization.yaml
index 9899123..cd3f446 100644
--- a/gobblin-kubernetes/gobblin-service/mysql-cluster/kustomization.yaml
+++ b/gobblin-kubernetes/gobblin-service/mysql-cluster/kustomization.yaml
@@ -1,7 +1,10 @@
+bases:
+  - ../base-cluster
 resources:
-  - application.yaml
   - mysql-deployment.yaml
   - mysql-pv.yaml
+patchesStrategicMerge:
+  - deployment.yaml
 configMapGenerator:
   - name: gaas-config
     files:
diff --git a/gobblin-kubernetes/gobblin-service/mysql-cluster/mysql-deployment.yaml b/gobblin-kubernetes/gobblin-service/mysql-cluster/mysql-deployment.yaml
index a949979..ff11411 100644
--- a/gobblin-kubernetes/gobblin-service/mysql-cluster/mysql-deployment.yaml
+++ b/gobblin-kubernetes/gobblin-service/mysql-cluster/mysql-deployment.yaml
@@ -30,7 +30,7 @@ spec:
           persistentVolumeClaim:
             claimName: mysql-pv-claim
       containers:
-        - image: mysql:5.6
+        - image: mysql:5.6.45
           name: mysql
           env:
           - name: MYSQL_RANDOM_ROOT_PASSWORD
diff --git a/gobblin-kubernetes/gobblin-service/mysql-cluster/mysql-pv.yaml b/gobblin-kubernetes/gobblin-service/mysql-cluster/mysql-pv.yaml
index 77d58d9..7f498d2 100644
--- a/gobblin-kubernetes/gobblin-service/mysql-cluster/mysql-pv.yaml
+++ b/gobblin-kubernetes/gobblin-service/mysql-cluster/mysql-pv.yaml
@@ -5,7 +5,6 @@ metadata:
   labels:
     type: local
 spec:
-  storageClassName: manual
   capacity:
     storage: 1Gi
   accessModes:
@@ -18,7 +17,6 @@ kind: PersistentVolumeClaim
 metadata:
   name: mysql-pv-claim
 spec:
-  storageClassName: manual
   accessModes:
     - ReadWriteOnce
   resources:
