Hi Robert, I also encourage you to check out https://github.com/linkedin/TonY (TensorFlow on YARN) which is a platform built for this purpose.
Jonathan

________________________________
From: Sunil G <sun...@apache.org>
Sent: Tuesday, November 6, 2018 10:05:14 PM
To: Robert Grandl
Cc: yarn-...@hadoop.apache.org; yarn-dev-h...@hadoop.apache.org; General
Subject: Re: Run Distributed TensorFlow on YARN

Hi Robert,

The {Submarine} project helps to run distributed TensorFlow on top of YARN with ease. YARN-8220 <https://issues.apache.org/jira/browse/YARN-8220> was an early attempt to do the same with some scripts, but Submarine avoids all such custom scripts: you can simply run TensorFlow like a distributed shell command line by using the Submarine jar.

Please refer to the doc below for a deep dive:
https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7

Submarine will be released as part of the Hadoop 3.2.0 release, which will be out officially very soon (in the coming weeks). You are free to use Hadoop trunk to run it if you need it sooner.

For now you can refer to the Submarine docs in the Hadoop repo (trunk) under
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/src/site/markdown/
or
https://github.com/apache/hadoop/tree/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/src/site/markdown

Thanks
Sunil

On Wed, Nov 7, 2018 at 10:34 AM Robert Grandl <rgra...@yahoo.com.invalid> wrote:

> Hi all,
>
> I am wondering if there is any stable support to run distributed
> TensorFlow atop YARN at the moment.
> I found this blog post from Hortonworks. It seems this is possible
> starting with YARN 3.1.0.
> https://hortonworks.com/blog/distributed-tensorflow-assembly-hadoop-yarn/
>
> Also I found some more recent JIRAs:
> https://issues.apache.org/jira/browse/YARN-8220
> https://issues.apache.org/jira/browse/YARN-8135
> which suggest using something called Submarine.
>
> However, I could not find any proper documentation or instructions to use
> any of these.
>
> Can someone help me with this?
> Otherwise, is there any better support to run any other machine learning
> framework with YARN?
>
> Thank you in advance,
> - Robert