Hi, I configured a Spark 0.8.1 cluster on AWS with one master node and 3 worker nodes. It was set up as a standalone cluster following http://spark.incubator.apache.org/docs/latest/spark-standalone.html
The distribution was generated, and the master was started on the master host with:

    ./bin/start-master.sh

Then, on each of the worker nodes, I changed into the Spark distribution directory and ran:

    ./spark-class org.apache.spark.deploy.worker.Worker spark://IPxxxx:7077

In the browser, on port 8080 of the master, I can see the 3 worker nodes listed as ALIVE.

Next, I started a Spark shell on the master node with:

    MASTER=spark://IPxxx:7077 ./spark-shell

In it, I create a simple RDD from a local text file with a few lines and call countByKey(). The shell hangs. Pressing Ctrl-C gives:

    scala> credit.countByKey()
    java.lang.InterruptedException
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:485)
        at org.apache.spark.scheduler.JobWaiter.awaitResult(JobWaiter.scala:73)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:318)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:840)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:909)
        at org.apache.spark.rdd.RDD.reduce(RDD.scala:654)
        at org.apache.spark.rdd.RDD.countByValue(RDD.scala:752)
        at org.apache.spark.rdd.PairRDDFunctions.countByKey(PairRDDFunctions.scala:198)

Note: the same works in a local shell (without a master).

Any pointers? Do I have to set up any other network access or logins?

Note that I am NOT starting the slaves from the master node (using bin/start-slaves.sh), and so I have not set up passwordless SSH login etc.