So I think I ran across an issue with Zeppelin/Docker, but I wanted to describe it here and see what people thought.
Basically I have a YARN cluster (well, Myriad on Mesos, but for pyspark it works like YARN). I have set up a Docker container with everything I need, including Spark. When I try to run pyspark, my Application Master in YARN tries to connect back to the driver, which is inconveniently located inside the Docker container, and the only ports exposed are those for Zeppelin. My YARN Spark application is accepted, but it never "runs"; I just see this in the logs:

    15/06/17 11:04:29 ERROR yarn.ApplicationMaster: Failed to connect to driver at 172.17.0.16:59601, retrying ...
    15/06/17 11:04:32 ERROR yarn.ApplicationMaster: Failed to connect to driver at 172.17.0.16:59601, retrying ...

Ok, so yarn-client mode won't work (easily). However, yarn-cluster mode doesn't work either, because you can't run a shell in yarn-cluster mode. All of this is just from trying to run $SPARK_HOME/bin/pyspark --master yarn-client from inside my Zeppelin Docker container (not even running Zeppelin yet). I am guessing this is the root of the trouble I am having getting the Spark interpreter running in Zeppelin, because Zeppelin is doing exactly that: running pyspark in yarn-client mode.

Ok, so how have people running Zeppelin in Docker dealt with this? What other ideas should I look into here?
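For what it's worth, one workaround I am considering (just a sketch, untested; the port numbers and DOCKER_HOST_IP are placeholders I made up, though spark.driver.port, spark.driver.host, and spark.blockManager.port are real Spark configs): pin the driver's normally-random ports to fixed values, publish those ports from the container alongside Zeppelin's, and point spark.driver.host at the Docker host's routable address so the AM has something reachable to connect back to:

```shell
# Publish fixed driver ports when starting the container (in addition to Zeppelin's).
# DOCKER_HOST_IP stands in for the host's address as seen from the YARN nodes.
docker run -p 51000:51000 -p 51001:51001 my-zeppelin-image

# Inside the container, pin Spark's driver ports to the published ones:
$SPARK_HOME/bin/pyspark --master yarn-client \
  --conf spark.driver.host=$DOCKER_HOST_IP \
  --conf spark.driver.port=51000 \
  --conf spark.blockManager.port=51001
```

I am not sure whether the driver will bind correctly inside the container when spark.driver.host points at the host's IP; if it doesn't, running the container with --net=host would sidestep the port-mapping problem entirely, at the cost of network isolation.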