Thanks. This worked :). I am thinking I should add this to spark-env.sh so
that spark-shell always connects to the master by default.
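Something like the following is what I have in mind (untested; it just exports
the same setting that the command-line prefix sets, using the master address
and port from the config further down):

# Proposed addition to conf/spark-env.sh so spark-shell defaults to the standalone master
export MASTER=spark://x.x.x.x:7077

With that in place, sc.master inside the shell should report the spark:// URL
rather than a local master.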
On Aug 6, 2014 12:04 AM, "Akhil Das" <ak...@sigmoidanalytics.com> wrote:

> You can always start your spark-shell by specifying the master as
>
> MASTER=spark://*whatever*:7077 $SPARK_HOME/bin/spark-shell
>
> Then it will connect to that *whatever* master.
>
>
> Thanks
> Best Regards
>
>
> On Tue, Aug 5, 2014 at 8:51 PM, Aniket Bhatnagar <
> aniket.bhatna...@gmail.com> wrote:
>
>> Hi
>>
>> Apologies if this is a noob question. I have set up Spark 1.0.1 on EMR
>> using a slightly modified version of the script
>> @ s3://elasticmapreduce/samples/spark/1.0.0/install-spark-shark-yarn.rb. It
>> seems to be running fine, with the master logs stating:
>>
>> 14/08/05 14:36:56 INFO Master: I have been elected leader! New state:
>> ALIVE
>> 14/08/05 14:37:21 INFO Master: Registering worker
>> ip-10-0-2-80.ec2.internal:52029 with 2 cores, 6.3 GB RAM
>>
>> The script has also created spark-env.sh under conf, which has the
>> following content:
>>
>> export SPARK_MASTER_IP=x.x.x.x
>> export SCALA_HOME=/home/hadoop/.versions/scala-2.10.3
>> export SPARK_LOCAL_DIRS=/mnt/spark/
>> export
>> SPARK_CLASSPATH="/usr/share/aws/emr/emr-fs/lib/*:/usr/share/aws/emr/lib/*:/home/hadoop/share/hadoop/common/lib/*:/home/hadoop/.versions/2.4.0/share/hadoop/common/lib/hadoop-lzo.jar"
>> export SPARK_DAEMON_JAVA_OPTS="-verbose:gc -XX:+PrintGCDetails
>> -XX:+PrintGCTimeStamps"
>> export
>> SPARK_ASSEMBLY_JAR=/home/hadoop/spark/lib/spark-assembly-1.0.1-hadoop2.4.0.jar
>>
>> However, when I run spark-shell, sc.isLocal returns true. Also, no
>> matter how many RDDs I cache, the used memory in the master UI
>> (x.x.x.x:7077) shows 0B used. This leads me to believe that spark-shell
>> isn't connecting to the Spark master and has instead started a local
>> instance of Spark. Is there something I am missing in my setup that would
>> allow spark-shell to connect to the master?
>>
>> Thanks,
>> Aniket
>>
>
>
