hkoosha opened a new pull request #3283:
URL: https://github.com/apache/gobblin/pull/3283


   This is what workers on k8s could look like. I followed the MapReduce approach:
   - Have a PersistentVolume shared among workers (playing the same role as HDFS).
   - Copy the jars and job.state there.
   - Have a Docker container available to k8s. The container's startup script takes one argument: the classpath. It then launches the JVM with a simple Java application, which in turn invokes the main(String[]) method of the class passed as an argument.
   - The KuberJobLauncher uses the k8s API to create a pod with the container described above (after first copying the jars/files to the PersistentVolume's location).
   - Wait for the pod to exit.
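
The "simple Java application" inside the container could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the names `MainClassRunner` and `Echo` are hypothetical, and `Echo` only stands in for a real worker entry point.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;

// Hypothetical runner: the first argument is the class to launch, the rest are its arguments.
final class MainClassRunner {

    // Loads className from the JVM classpath and calls its public static main(String[]).
    static void run(String className, String[] args) {
        try {
            Class<?> target = Class.forName(className);
            Method main = target.getMethod("main", String[].class);
            if (!Modifier.isStatic(main.getModifiers())) {
                throw new IllegalArgumentException(className + ".main is not static");
            }
            // Cast to Object so the array is passed as one argument instead of being spread as varargs.
            main.invoke(null, (Object) args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not launch " + className, e);
        }
    }

    // Stand-in for a worker entry point (assumption); it only records its arguments.
    static final class Echo {
        static String[] lastArgs;

        public static void main(String[] args) {
            lastArgs = args;
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            throw new IllegalArgumentException("usage: MainClassRunner <main-class> [args...]");
        }
        run(args[0], Arrays.copyOfRange(args, 1, args.length));
    }
}
```

With a runner like this, the container startup script only needs to put the shared jars on the classpath and pass the target class name through.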
   
   Instead of a pod I should have used a k8s Job, but for some reason PVs and PVCs do not work with Jobs on my machine.
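
For orientation, a worker pod that mounts the shared volume through a PersistentVolumeClaim could be described by a manifest like the one rendered below. This is only a sketch: `WorkerPodManifest`, the container name `worker`, the volume name `shared`, and the way the classpath and main class are passed as container args are all assumptions, not taken from this PR.

```java
// Hypothetical helper that renders a Pod manifest as YAML text.
final class WorkerPodManifest {

    static String render(String podName, String image, String classpath,
                         String mainClass, String claimName, String mountPath) {
        return String.join("\n",
            "apiVersion: v1",
            "kind: Pod",
            "metadata:",
            "  name: " + podName,
            "spec:",
            "  restartPolicy: Never",          // run once; the launcher waits for exit
            "  containers:",
            "    - name: worker",
            "      image: " + image,
            "      args: [\"" + classpath + "\", \"" + mainClass + "\"]",
            "      volumeMounts:",
            "        - name: shared",
            "          mountPath: " + mountPath,
            "  volumes:",
            "    - name: shared",
            "      persistentVolumeClaim:",
            "        claimName: " + claimName,
            "");
    }
}
```

A k8s Job would wrap the same pod spec under `spec.template`, so the volume and claim sections would carry over unchanged.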
   
   k3d (https://github.com/rancher/k3d) is used to launch the cluster.
   
   And finally, the MRJobLauncher is (ab)used to launch a k8s-based job.
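
The "wait for the pod to exit" part of the launcher can be pictured as a polling loop over the pod phase. The sketch below is an assumption about how this might be done, not the PR's code: `PodWaiter` is a hypothetical name, and the `Supplier<String>` stands in for whatever k8s client call reports the pod's phase. `Succeeded` and `Failed` are the terminal pod phases in Kubernetes.

```java
import java.util.function.Supplier;

// Hypothetical helper: polls a phase source until the pod reaches a terminal phase.
final class PodWaiter {

    static String awaitTerminalPhase(Supplier<String> phaseSource, int maxPolls, long sleepMillis) {
        for (int i = 0; i < maxPolls; i++) {
            String phase = phaseSource.get();
            if ("Succeeded".equals(phase) || "Failed".equals(phase)) {
                return phase;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for pod", e);
            }
        }
        throw new IllegalStateException("Pod did not finish after " + maxPolls + " polls");
    }
}
```

A watch-based approach would avoid polling, but a bounded loop like this is simpler to reason about for a first prototype.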
   
   Steps to reproduce the build and run the jobs:
   ```bash
   # This acts as the HDFS path; everything is put here. It is hardcoded in a lot of places.
   sudo mkdir -p /a
   # Spawn a cluster and have the docker image ready
   cd gobblin-kubernetes-support/docker-java-runner/docker-local/
   make k3d-reg k3d-cluster
   make clean build push
   cd -
   # Create PV and PVC
   cd gobblin-kubernetes-support/k8s/
   make apply
   # Launch gobblin
   cd -
   # Build gobblin; the last step (cp) fails, which you can ignore
   make build
   make run
   make build
   ```



