Re: yarn usercache dir not resolved properly when running an example application

2019-02-21 Thread Vinay Kashyap
Yes Jeff, thanks again. I could successfully run a standalone TF training application with TensorBoard in a Docker container. Will definitely take care of silent ssh once I start with distributed TF. On Tue, Feb 19, 2019 at 9:44 PM Jeff Hubbs wrote: > Great, Vinay - I'm glad that made a

Re: yarn usercache dir not resolved properly when running an example application

2019-02-19 Thread Jeff Hubbs
Great, Vinay - I'm glad that made a difference. When you get to the point where you are running a cluster, the same sort of thing will have to carry over to all nodes, with the added issue that ssh and keys must be configured such that each of those users can shell to other nodes without
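
A minimal sketch of how the ssh requirement above could be checked from one node, assuming that "silent ssh" (as the follow-up calls it) means key-based login with no interactive prompt. The user name `yarn` and host name `node2` in the usage note are hypothetical; `BatchMode` and `ConnectTimeout` are standard OpenSSH client options.

```python
import subprocess


def ssh_check_command(user, host, timeout_s=5):
    """Build an ssh command that fails instead of prompting for input."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                # never ask for a password or passphrase
        "-o", f"ConnectTimeout={timeout_s}",  # give up quickly on unreachable hosts
        f"{user}@{host}",
        "true",                               # trivial remote command
    ]


def can_ssh_without_prompt(user, host, timeout_s=5):
    """Return True if the login succeeds with no interaction."""
    try:
        result = subprocess.run(
            ssh_check_command(user, host, timeout_s),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # The ssh client is not installed or could not be started.
        return False
    return result.returncode == 0
```

Calling `can_ssh_without_prompt("yarn", "node2")` as each service user, against each other node, would show which user and host pairs still need keys set up.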

Re: yarn usercache dir not resolved properly when running an example application

2019-02-18 Thread Vinay Kashyap
Perfect Jeff, I clearly understand. After changing the setup to the appropriate users and folder permissions, I can see some progress. Cheers. On Fri, Feb 15, 2019 at 10:05 AM Jeff Hubbs wrote: > On 2/14/19 11:09 PM, Vinay Kashyap wrote: > > I am running hadoop on my mac and all the folders

Re: yarn usercache dir not resolved properly when running an example application

2019-02-14 Thread Jeff Hubbs
On 2/14/19 11:09 PM, Vinay Kashyap wrote: I am running hadoop on my mac and all the folders have *myuser:staff* as the owner. I have verified the permissions for the local dirs to be 755. This doesn't sound right. By-the-book, there are supposed to be separate "users" for hdfs, yarn, and

Re: yarn usercache dir not resolved properly when running an example application

2019-02-14 Thread Vinay Kashyap
I am running hadoop on my mac and all the folders have *myuser:staff* as the owner. I have verified the permissions for the local dirs to be 755. I run all hadoop services with myuser and I have configured *yarn.nodemanager.linux-container-executor.group=staff* accordingly both in
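
The ownership and mode check described above can be scripted. This is a sketch only, assuming a Unix-like system; the path in the usage note is an assumption taken from the `/data/yarn/local` directory mentioned elsewhere in the thread.

```python
import grp
import os
import pwd
import stat


def describe_dir(path):
    """Return (owner, group, permission bits) for a directory."""
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)  # uid has no passwd entry
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)  # gid has no group entry
    return owner, group, stat.S_IMODE(st.st_mode)


def has_mode(path, expected=0o755):
    """True if the permission bits match exactly (755 by default)."""
    return describe_dir(path)[2] == expected
```

For example, `describe_dir("/data/yarn/local")` would return the owner and group names together with the mode, which can be printed with `oct()` for comparison against 755.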

Re: yarn usercache dir not resolved properly when running an example application

2019-02-14 Thread Prabhu Josephraj
In the case of a Distributed Shell job, the ApplicationMaster runs in a normal Linux container and the subsequent shell command runs inside a Docker container. The job fails even before launching the AM, that is, before starting the Docker container. I think the Distributed Shell job will fail even without Docker

Re: yarn usercache dir not resolved properly when running an example application

2019-02-14 Thread Vinay Kashyap
Hi Prabhu, thanks for your reply. I tried the configurations as per your suggestion, but I get the same error. Is this related to container localization by any chance? Also, is there any log or out information which says that the docker container runtime has been picked up? On Thu, Feb 14,

Re: yarn usercache dir not resolved properly when running an example application

2019-02-14 Thread Prabhu Josephraj
Hi Vinay, can you try specifying the configs below under the Docker section in container-executor.cfg, which will allow Docker containers to use the NM local dirs? docker.allowed.ro-mounts=/data/yarn/local,,/usr/jdk64/jdk1.8.0_112/bin
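
A rough way to sanity-check such a setting is to parse the file and test whether a given directory is covered by the listed mounts. The sketch below assumes the Docker settings sit in a `[docker]` section and that a path counts as allowed when it equals, or lies under, one of the listed entries. The helper names are hypothetical, and this only approximates whatever check the container-executor itself performs.

```python
import configparser
from pathlib import PurePosixPath


def allowed_mounts(cfg_text, key="docker.allowed.ro-mounts"):
    """Return the non-empty mount entries listed under [docker] for `key`."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    # Keys at the top of container-executor.cfg come before any section
    # header, which configparser rejects, so prepend a placeholder section.
    parser.read_string("[top]\n" + cfg_text)
    raw = parser.get("docker", key, fallback="")
    # Empty items (for example from a doubled comma) are skipped.
    return [PurePosixPath(item.strip()) for item in raw.split(",") if item.strip()]


def is_mount_allowed(cfg_text, path, key="docker.allowed.ro-mounts"):
    """True if `path` equals or is nested under one of the allowed mounts."""
    candidate = PurePosixPath(path)
    return any(
        candidate == mount or mount in candidate.parents
        for mount in allowed_mounts(cfg_text, key)
    )
```

Passing the file contents and a usercache path under the NM local dir to `is_mount_allowed` shows whether that directory is covered by the read-only mount list. The same function accepts other keys, such as `docker.allowed.rw-mounts`.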

yarn usercache dir not resolved properly when running an example application

2019-02-14 Thread Vinay Kashyap
I am using Hadoop 3.2.0 and trying to run a simple application in a docker container and I have made the required configuration changes both in *yarn-site.xml* and *container-executor.cfg* to choose LinuxContainerExecutor and docker runtime. I use the example of distributed shell in one of the
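
For reference, here is a small sketch that reads a yarn-site.xml and reports which of the settings usually involved in selecting LinuxContainerExecutor and the Docker runtime are missing or different. The property names are the ones documented for Hadoop 3.x, but the expected values are assumptions for illustration, not the poster's actual configuration.

```python
import xml.etree.ElementTree as ET

# Expected values are illustrative assumptions, not taken from the original post.
EXPECTED = {
    "yarn.nodemanager.container-executor.class":
        "org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor",
    "yarn.nodemanager.runtime.linux.allowed-runtimes": "default,docker",
}


def read_properties(xml_text):
    """Parse Hadoop-style <property><name/><value/></property> entries."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_or_different(xml_text, expected=EXPECTED):
    """Return {name: actual value or None} for settings that do not match."""
    props = read_properties(xml_text)
    return {
        name: props.get(name)
        for name, value in expected.items()
        if props.get(name) != value
    }
```

An empty result means both properties are present with the expected values. A `None` value means the property is absent from the file.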