Hi,

Thanks for the pointers. I got Submarine up and running.

*Submarine Setup Question:*

The one thing I find a little suspect is that execution fails unless container-executor.cfg and the container-executor binary are placed in a file path in which every directory is owned by root. By default, we place container-executor.cfg and the container-executor binary in $HADOOP_CONF_DIR, which sits under a user home directory (= ~/user/hadoop-binary-extraction-location/conf), and to avoid the permission errors we end up changing the ownership of the user home directory (~/user) to root. *Is there a way to enable execution without container-executor.cfg and the container-executor binary being owned by root?*

Also, for security reasons, we are not allowed to enable Docker in our production cluster. *Is there a way to get Submarine running without Docker-based container execution?*

*Implementing Pause/Resume Feature Question:*

I've been going through the source code, trying to understand how to implement a simple pause and resume feature. From what I understand, the submarine-client invokes a YARN Service AM via a request to the RM and submits a JSON spec to this AM to begin execution of a job. To implement pause/resume, I'm considering making changes in the YARN Service AM, and maybe using the Flex or Upgrade functionality. *Does anyone have pointers on how to go about implementing a pause/resume feature for ML training jobs executed with Submarine?*

Thanks,
Kshiteej

On Sun, Mar 24, 2019 at 1:21 AM Sunil G <[email protected]> wrote:

> Hi Kshiteej
>
> Thanks a lot for trying Submarine and for your interest in this project.
> Unfortunately, we missed your mail earlier; thanks for figuring it out.
>
> Quick updates:
> 1. A simple Hadoop setup with the Docker runtime enabled is needed. You can
>    refer to the blog post below for this.
>
>    https://hortonworks.com/blog/trying-containerized-applications-apache-hadoop-yarn-3-1/
> 2. Once these are enabled, you could use Calico etc. if needed. Otherwise,
>    try running a simple job as given in the Submarine documentation.
>
>    https://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/Index.html
>
> Please send us feedback based on your tests.
>
> - Sunil
>
> On Sun, Mar 24, 2019 at 7:59 AM Kshiteej Mahajan <[email protected]>
> wrote:
>
> > Hi all,
> >
> > I realized the details in my previous mail were sparse. From my
> > understanding of
> >
> > https://github.com/hadoopsubmarine/submarine-installer/blob/master/InstallationGuide.md
> >
> > it seems that two things are essential to get Submarine working:
> >
> > 1. Docker runtime in YARN
> > 2. Some non-trivial Docker networking setup/configuration so that Docker
> >    containers launched by Submarine can seamlessly communicate
> >
> > Step 2 involves some non-trivial configuration so that etcd, calico, and
> > yarn-registry-dns-service can seamlessly interact.
> >
> > Is my understanding correct? I've set up vanilla Hadoop before and it is
> > easy. However, after trying the Submarine setup for two days I'm still
> > struggling. As someone who is new to Docker networking and
> > yarn-dns-registry-service, could someone please point me to the minimal
> > essential steps that I should follow to get Submarine working?
> >
> > I wish to contribute to Submarine. I'm a graduate student and my aim is to
> > try different scheduling schemes (implemented in YARN's RM) and see how
> > different ML training workloads (submitted using Submarine) perform. In
> > doing this, I hope to make some contributions to Submarine, perhaps by
> > developing the pause/resume feature.
> >
> > Any help will be much appreciated.
> >
> > Thanks,
> > Kshiteej
> >
> > On Sat, Mar 23, 2019 at 8:35 PM Kshiteej Mahajan <[email protected]>
> > wrote:
> >
> > > Hi all,
> > >
> > > I've been trying to follow the guide at
> > >
> > > https://github.com/hadoopsubmarine/submarine-installer/blob/master/InstallationGuide.md
> > >
> > > but there are a lot of prerequisites like Calico, HBase, and Zookeeper
> > > that are needed with that guide. Are all these prerequisites essential
> > > to a Submarine setup?
> > >
> > > Could someone please point me to a bare-minimum installation guide for
> > > setting up Submarine?
> > >
> > > Thanks,
> > > Kshiteej
