So I think I ran across an issue with Zeppelin/Docker but I wanted to
describe here and see what people thought.

Basically I have a YARN cluster (well, Myriad on Mesos, but for pyspark it
behaves like YARN).

I have set up a Docker container with everything I need, including Spark.

When I try to run pyspark, my ApplicationMaster in YARN tries to connect
back to the driver, which is inconveniently located inside a Docker
container where the only exposed ports are Zeppelin's. (My YARN Spark
application is accepted, but I see this in the logs, and it never "runs".)
15/06/17 11:04:29 ERROR yarn.ApplicationMaster: Failed to connect to
driver at 172.17.0.16:59601, retrying ...
15/06/17 11:04:32 ERROR yarn.ApplicationMaster: Failed to connect to
driver at 172.17.0.16:59601, retrying .
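One workaround I've been considering (a sketch, assuming you control how the container is started, and `zeppelin-image` standing in for the actual image name) is to run the container on the host's network stack so the driver binds to an address the YARN ApplicationMaster can actually reach back to, sidestepping port mapping entirely:

```shell
# Sketch: host networking means the driver's ephemeral callback port is
# opened directly on the Docker host, so the AM's connect-back can succeed.
# "zeppelin-image" is a placeholder for the real image name.
docker run --net=host -d zeppelin-image
```

The downside, of course, is losing Docker's network isolation for that container.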


Ok, so yarn-client mode won't work (easily). However, yarn-cluster mode
doesn't work either, because you can't run an interactive shell in
yarn-cluster mode.

So this is all just from trying to run $SPARK_HOME/bin/pyspark --master
yarn-client from inside my Zeppelin Docker container (without even running
Zeppelin yet).
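The other approach I can think of (again just a sketch, with `zeppelin-image`, `DOCKER_HOST_IP`, and the port numbers all placeholders I made up) would be to pin the driver and block-manager ports via Spark conf, publish those ports, and advertise the Docker host's IP so the AM connects back through the mapping. I'm not sure how well Spark tolerates advertising an address the container doesn't own, so treat this as an experiment rather than a fix:

```shell
# Sketch: publish the Zeppelin UI plus fixed driver/block-manager ports.
# Port numbers and "zeppelin-image" are placeholder assumptions.
docker run -d -p 8080:8080 -p 51000:51000 -p 51001:51001 zeppelin-image

# Then, inside the container, pin the driver to those published ports and
# advertise an address the cluster can route to (the Docker host's IP):
$SPARK_HOME/bin/pyspark --master yarn-client \
  --conf spark.driver.host=$DOCKER_HOST_IP \
  --conf spark.driver.port=51000 \
  --conf spark.blockManager.port=51001
```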

I am guessing this is at the root of the issues I am having getting the
Spark interpreter running in Zeppelin, because Zeppelin does exactly that:
it runs pyspark in yarn-client mode.

Ok, so how have people running Zeppelin in Docker dealt with this? What
other ideas should I look into here?
