Thanks Brian. Here's how I have it set up. The "install directory" is NFS mounted read-only. I did this on purpose, as I was hoping to share it. I'm using ZEPPELIN_HOME (read-only), ZEPPELIN_CONF_DIR (NFS mounted, per user, read/write), ZEPPELIN_LOG_DIR (set to a local directory in the container, /tmp), ZEPPELIN_PID_DIR (set to a local directory in the container, /tmp), and ZEPPELIN_NOTEBOOK_DIR (NFS mounted, read/write).
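
In case it helps anyone reproduce this, here is a rough sketch of how the write access on those locations could be checked from inside the container, running as the same user as Zeppelin. The variable names are the ones listed above; the helper names `can_write` and `report` are made up for illustration, and probing with a temporary file is just my assumption about a reasonable test (permission bits alone can be misleading on NFS or read-only mounts). It is not something Zeppelin itself provides.

```python
import os
import tempfile

# Environment variables named in this message; their values are whatever
# the container sets at run time.
ZEPPELIN_DIR_VARS = [
    "ZEPPELIN_HOME",
    "ZEPPELIN_CONF_DIR",
    "ZEPPELIN_LOG_DIR",
    "ZEPPELIN_PID_DIR",
    "ZEPPELIN_NOTEBOOK_DIR",
]


def can_write(path):
    """Return True if a temporary file can actually be created in `path`.

    Hypothetical helper: a real file is created (and removed again) rather
    than trusting os.access(), which only looks at permission bits.
    """
    try:
        with tempfile.NamedTemporaryFile(dir=path):
            return True
    except OSError:
        return False


def report(env=None):
    """Map each variable to 'unset', 'missing', 'read/write' or 'not writable'."""
    env = os.environ if env is None else env
    result = {}
    for name in ZEPPELIN_DIR_VARS:
        path = env.get(name)
        if not path:
            result[name] = "unset"
        elif not os.path.isdir(path):
            result[name] = "missing"
        elif can_write(path):
            result[name] = "read/write"
        else:
            result[name] = "not writable"
    return result


if __name__ == "__main__":
    for name, status in report().items():
        print(f"{name}: {status}")
```

With the setup described above, I would expect ZEPPELIN_HOME to come back as not writable and the other four as read/write.
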
Basically, those locations are mounted into the container at run time, and the proper permissions are in place there. In the container, I add a user with the same UID as the user on the NFS system and run the Zeppelin process as that user, so that should be good too. Other than ./conf and the log/pid directories, are there any other directories in the install that require read/write access? I can try mounting them read/write rather than read-only and see what happens (I'll do that next).

On Tue, Jun 16, 2015 at 8:40 AM, Brian McDevitt <[email protected]> wrote:

> I'd first check that the user that's running zeppelin has ownership of the
> zeppelin installation and has appropriate rights to read any additional
> files you might need.
>
> Hope that helps,
> Brian
>
> Thanks,
> Brian McDevitt
> Software Engineer
> The Nerdery
>
> On Tue, Jun 16, 2015 at 8:25 AM, John Omernik <[email protected]> wrote:
>
>> Hey all, I am running into an interesting problem and I think I am
>> getting to the end of my ability to troubleshoot so I thought I'd list
>> things out here and see if anyone has more ideas for next steps in
>> troubleshooting.
>>
>> I am running a Docker Container I built in Mesos. I can get things up
>> and running, things seem happy and healthy until I try to run a command
>> with an interpreter. At that point I am getting strange errors about
>> connections refused. I put the errors below (from the Notebook, log files
>> and from Std Err) for clarity. But the basic thing I saw was "connection
>> refused". So I put tcpdump on the container and went to trouble shoot what
>> was happening. (TCPdump below too) it looks like it's trying to connect to
>> localhost 36365 which is the port the interpreter was started on, but after
>> the initial syn, it's getting a rst-ack. I've validated in netstat, and
>> that port IS listening on all interfaces, so I am not sure why it's
>> providing the rst ack.
>>
>> One hunch is around the hostname that interpreter is listening. The
>> Hostname I connect to in the webui is zeppelin.marathon.mesos (I am using
>> mesos dns and haproxy-bridge) however perhaps that is causing the thrift
>> server to deny something that said, it's connecting to local host, and it's
>> not even getting to the app level (just SYN -> RST/ACK) so I am not sure
>> how or why that would be occurring.
>>
>> I guess based on what I have seen, this SHOULD work. i.e. even though
>> I've only exposed the UI and the web sockets port to the client, the docker
>> container should be able to connect locally to any newly opened ports. The
>> interpreter is starting fine.. so I guess are there any other steps I
>> should take to try and trouble shoot?
>>
>> Thanks
>>
>> John
>>
>> Only thing in hive interpreter log:
>>
>> INFO [2015-06-16 13:07:07,150] ({Thread-0} RemoteInterpreterServer.java[run]:95) - Starting remote interpreter server on port 36365
>>
>> tcpdump from container:
>>
>> 13:06:50.967696 IP 127.0.0.1.38133 > 127.0.0.1.36365: Flags [S], seq 340951329, win 65535, options [mss 65495,sackOK,TS val 300975824 ecr 0,nop,wscale 7], length 0
>> .R.!.........0.........
>> ............
>>
>> 13:06:50.967716 IP 127.0.0.1.36365 > 127.0.0.1.38133: Flags [R.], seq 0, ack 340951330, win 0, length 0
>> .......R."P....V.....
>>
>> 13:06:51.468191 IP 127.0.0.1.38137 > 127.0.0.1.36365: Flags [S], seq 3821372812, win 65535, options [mss 65495,sackOK,TS val 300975949 ecr 0,nop,wscale 7], length 0
>> .............0.........
>> ...M........
>>
>> 13:06:51.468216 IP 127.0.0.1.36365 > 127.0.0.1.38137: Flags [R.], seq 0, ack 3821372813, win 0, length 0
>> ..........P...%t.....
>>
>> 13:06:51.968677 IP 127.0.0.1.38142 > 127.0.0.1.36365: Flags [S], seq 2630719687, win 65535, options [mss 65495,sackOK,TS val 300976074 ecr 0,nop,wscale 7], length 0
>> .............0.........
>> ............
>>
>> 13:06:51.968693 IP 127.0.0.1.36365 > 127.0.0.1.38142: Flags [R.], seq 0, ack 2630719688, win 0, length 0
>> ..........P...Y,.....
>>
>> 13:06:52.469035 IP 127.0.0.1.38146 > 127.0.0.1.36365: Flags [S], seq 976891692, win 65535, options [mss 65495,sackOK,TS val 300976199 ecr 0,nop,wscale 7], length 0
>> ::/,.........0.........
>> ...G........
>>
>> 13:06:52.469052 IP 127.0.0.1.36365 > 127.0.0.1.38146: Flags [R.], seq 0, ack 976891693, win 0, length 0
>> ......::/-P...%W.....
>>
>> Error in Logs:
>>
>> INFO [2015-06-16 13:06:50,953] ({pool-1-thread-2} SchedulerFactory.java[jobStarted]:132) - Job paragraph_1434047295030_-1730740540 started by scheduler remoteinterpreter_236878590
>> INFO [2015-06-16 13:06:50,954] ({pool-1-thread-2} Paragraph.java[jobRun]:194) - run paragraph 20150611-132815_546208121 using hive org.apache.zeppelin.interpreter.LazyOpenInterpreter@12aa010c
>> INFO [2015-06-16 13:06:50,966] ({pool-1-thread-2} RemoteInterpreterProcess.java[reference]:107) - Run interpreter process /zeppelin/bin/interpreter.sh -d /zeppelin/interpreter/hive -p 36365
>> ERROR [2015-06-16 13:06:56,023] ({Thread-35} RemoteScheduler.java[getStatus]:226) - Can't get status information
>> org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
>>     at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:53)
>>     at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:37)
>>     at org.apache.commons.pool2.BasePooledObjectFactory.makeObject(BasePooledObjectFactory.java:60)
>>     at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)
>>     at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
>>     at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
>>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.getClient(RemoteInterpreterProcess.java:138)
>>     at org.apache.zeppelin.scheduler.RemoteScheduler$JobStatusPoller.getStatus(RemoteScheduler.java:224)
>>     at org.apache.zeppelin.scheduler.RemoteScheduler$JobStatusPoller.run(RemoteScheduler.java:183)
>> Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
>>     at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
>>     at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:51)
>>     ... 8 more
>>
>> Error in Std Err:
>>
>> org.apache.zeppelin.interpreter.InterpreterException: org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
>>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.init(RemoteInterpreter.java:135)
>>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:249)
>>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:104)
>>     at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:202)
>>     at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
>>     at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:296)
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:745)
>> Caused by: org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
>>     at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:53)
>>     at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:37)
>>     at org.apache.commons.pool2.BasePooledObjectFactory.makeObject(BasePooledObjectFactory.java:60)
>>     at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)
>>     at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
>>     at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
>>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.getClient(RemoteInterpreterProcess.java:138)
>>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.init(RemoteInterpreter.java:133)
>>     ... 12 more
>> Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
>>     at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
>>     at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:51)
>>     ... 19 more
>> Caused by: java.net.ConnectException: Connection refused
>>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
>>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>     at java.net.Socket.connect(Socket.java:579)
>>     at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
>>     ... 20 more
>>
>> Error in the Notebook:
>>
>> org.apache.zeppelin.interpreter.remote.RemoteInterpreter.init(RemoteInterpreter.java:135)
>> org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:249)
>> org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:104)
>> org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:202)
>> org.apache.zeppelin.scheduler.Job.run(Job.java:170)
>> org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:296)
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>> java.util.concurrent.FutureTask.run(FutureTask.java:262)
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> java.lang.Thread.run(Thread.java:745)
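
As a follow-up to the tcpdump output quoted above: the SYN -> RST/ACK exchange can be reproduced outside of Zeppelin with a small probe run inside the container. This is only a sketch. The function names `probe` and `wait_for_port` and the retry settings are made up for illustration, and 36365 is simply the port from the interpreter log in the quoted message. A "refused" result only says that nothing accepted the connection at that moment, so polling for a while may show whether the port starts accepting connections later.

```python
import socket
import time


def probe(host, port, timeout=1.0):
    """Try one TCP connection; return 'open', 'refused', or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer answered the SYN with a reset, as in the tcpdump output.
        return "refused"
    except OSError as exc:
        # Timeouts, unreachable hosts, name resolution failures, etc.
        return f"error: {exc}"


def wait_for_port(host, port, attempts=20, interval=0.5):
    """Poll until the port accepts a connection.

    Returns the number of seconds waited, or None if every attempt failed.
    """
    start = time.monotonic()
    for _ in range(attempts):
        if probe(host, port) == "open":
            return time.monotonic() - start
        time.sleep(interval)
    return None


if __name__ == "__main__":
    # Port taken from the interpreter log; adjust to the port actually in use.
    print(probe("127.0.0.1", 36365))
    print(wait_for_port("127.0.0.1", 36365))
```

Comparing the time at which the probe first reports "open" with the timestamps in the interpreter log might help narrow down when the port actually becomes reachable.
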
