Hello,

I have Spark 1.3.1 running well on EC2 with ephemeral HDFS, set up using the
spark-ec2 script, and I'm quite happy with it.

I want to switch to persistent-hdfs so that data survives cluster
stop/starts. Unfortunately, spark-ec2 stop/start causes Spark to revert from
persistent to ephemeral HDFS: it changes the HDFS_URL environment variable
and several others back to the ephemeral values.

I managed to get Spark running with persistent-hdfs once by grep'ing for the
ephemeral HDFS address and port in all files and changing them to the
persistent port (9000 to 9010). Everything worked. Then I stopped the cluster
and started it again, and now I can't get persistent-hdfs to work anymore...
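
The search step described above could be sketched in Python roughly as
follows. This is only an illustration, not the exact commands that were run:
the function name find_port_references is hypothetical, and it only reports
lines that mention a given port so they can be reviewed before anything is
changed.

```python
import os
import re


def find_port_references(root, port):
    """Return (path, line_number, line) for each line under root mentioning ':<port>'."""
    # Match ':9000' but not ':90001' (no further digit may follow the port).
    pattern = re.compile(r":%d(?!\d)" % port)
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="ignore") as handle:
                    for number, line in enumerate(handle, start=1):
                        if pattern.search(line):
                            hits.append((path, number, line.rstrip("\n")))
            except OSError:
                # Unreadable files (permissions, broken symlinks) are skipped.
                continue
    return hits
```

For example, find_port_references("/root", 9000) would list the remaining
references to the ephemeral port, assuming the configuration lives under
/root as on this cluster.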

Here are some of the settings I've configured:

env: HDFS_HOME=/root/persistent-hdfs
env: HDFS_URL=hdfs://xxx.ec2.internal:9010
mapreduce/conf/core-site.xml:   
<value>hdfs://ec2-xx.compute-1.amazonaws.com:9010</value>
persistent-hdfs/conf/core-site.xml:   
<value>hdfs://ec2-xxx.compute-1.amazonaws.com:9010</value>
spark/conf/core-site.xml:   
<value>hdfs://ec2-xxx.compute-1.amazonaws.com:9010</value>
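
One way to cross-check files like these is to read the default filesystem
property out of each core-site.xml and compare the values. The sketch below
assumes the standard Hadoop layout of <property> elements with <name> and
<value> children, and that the relevant key is fs.default.name (Hadoop 1.x)
or fs.defaultFS (Hadoop 2.x). The function name and the sample host are
hypothetical.

```python
import xml.etree.ElementTree as ET

# Property names that hold the default filesystem URL:
# fs.default.name in Hadoop 1.x, fs.defaultFS in Hadoop 2.x.
DEFAULT_FS_KEYS = ("fs.default.name", "fs.defaultFS")


def read_default_fs(xml_text):
    """Return the default filesystem URL from core-site.xml text, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in DEFAULT_FS_KEYS:
            return (prop.findtext("value") or "").strip()
    return None


# Made-up sample content; the host name is a placeholder.
sample = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example:9010</value>
  </property>
</configuration>"""

print(read_default_fs(sample))
```

Running the same function over the contents of each of the three
core-site.xml files would show whether any of them still points at a
different host or port.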

I've restarted the daemons using persistent-hdfs/bin/stop-all.sh followed by
start-all.sh.

I can use the "hadoop" command to interact with persistent-hdfs - "hadoop fs
-ls" works, as do other hadoop fs commands.

However, when I start the Python or Scala shell and try to access HDFS, I run
into the following error:

Py4JJavaError: An error occurred while calling o25.load.
: java.lang.RuntimeException: java.net.ConnectException: Call to
xxx.compute-1.amazonaws.com/12.12.12.133:9000 failed on connection
exception: java.net.ConnectException: Connection refused

Note the port: it's 9000, as in ephemeral HDFS, instead of 9010 for
persistent.
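
To see which filesystem the shell itself has picked up, one option is to ask
the SparkContext for its Hadoop configuration. The sketch below is only an
illustration under stated assumptions: it relies on PySpark's internal _jsc
handle to the underlying JavaSparkContext, the helper name
effective_default_fs is hypothetical, and it is meant to be called inside
the pyspark shell, where sc already exists.

```python
def effective_default_fs(sc):
    """Return the default filesystem URL seen by a running SparkContext.

    sc is expected to be a pyspark SparkContext. _jsc is a private
    attribute (the underlying JavaSparkContext), so this may change
    between Spark versions.
    """
    conf = sc._jsc.hadoopConfiguration()
    # Hadoop 1.x uses fs.default.name; Hadoop 2.x uses fs.defaultFS.
    for key in ("fs.default.name", "fs.defaultFS"):
        value = conf.get(key)
        if value is not None:
            return value
    return None


# Inside the pyspark shell:
# print(effective_default_fs(sc))
```

If this reports port 9000 while the files on disk say 9010, that would
suggest the shell is reading its Hadoop configuration from somewhere other
than the edited files.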

Any ideas? What configuration am I missing to get the Python and Scala shells
to use persistent instead of ephemeral HDFS?

Best,

Tony



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Required-settings-for-permanent-HDFS-Spark-on-EC2-tp22860.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
