Re: Deploying a python code on a spark EC2 cluster

2014-04-25 Thread Shubhabrata
This is the error from stderr:


Spark Executor Command: java -cp
:/root/ephemeral-hdfs/conf:/root/ephemeral-hdfs/conf:/root/ephemeral-hdfs/conf:/root/spark/conf:/root/spark/assembly/target/scala-2.10/spark-assembly_2.10-0.9.1-hadoop1.0.4.jar
-Djava.library.path=/root/ephemeral-hdfs/lib/native/
-Dspark.local.dir=/mnt/spark -Dspark.local.dir=/mnt/spark
-Dspark.local.dir=/mnt/spark -Dspark.local.dir=/mnt/spark -Xms2048M
-Xmx2048M org.apache.spark.executor.CoarseGrainedExecutorBackend
akka.tcp://spark@192.168.122.1:44577/user/CoarseGrainedScheduler 1
ip-10-84-7-178.eu-west-1.compute.internal 1
akka.tcp://sparkwor...@ip-10-84-7-178.eu-west-1.compute.internal:57839/user/Worker
app-20140425133749-


14/04/25 13:39:37 INFO slf4j.Slf4jLogger: Slf4jLogger started
14/04/25 13:39:38 INFO Remoting: Starting remoting
14/04/25 13:39:38 INFO Remoting: Remoting started; listening on addresses
:[akka.tcp://sparkexecu...@ip-10-84-7-178.eu-west-1.compute.internal:36800]
14/04/25 13:39:38 INFO Remoting: Remoting now listens on addresses:
[akka.tcp://sparkexecu...@ip-10-84-7-178.eu-west-1.compute.internal:36800]
14/04/25 13:39:38 INFO worker.WorkerWatcher: Connecting to worker
akka.tcp://sparkwor...@ip-10-84-7-178.eu-west-1.compute.internal:57839/user/Worker
14/04/25 13:39:38 INFO executor.CoarseGrainedExecutorBackend: Connecting to
driver: akka.tcp://spark@192.168.122.1:44577/user/CoarseGrainedScheduler
14/04/25 13:39:39 INFO worker.WorkerWatcher: Successfully connected to
akka.tcp://sparkwor...@ip-10-84-7-178.eu-west-1.compute.internal:57839/user/Worker
14/04/25 13:41:19 ERROR executor.CoarseGrainedExecutorBackend: Driver
Disassociated
[akka.tcp://sparkexecu...@ip-10-84-7-178.eu-west-1.compute.internal:36800]
- [akka.tcp://spark@192.168.122.1:44577] disassociated! Shutting down.
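[Editor's note] One detail in this log stands out: the executor is told to reach the driver at 192.168.122.1, which is private (RFC 1918) address space — it happens to be the default address of libvirt's virbr0 bridge — so it is very unlikely to be reachable from an EC2 worker, which would explain the disassociation once the connection attempt gives up. A quick sanity check on the address an executor was handed (a sketch; the URL below is copied from the log above):

```python
import ipaddress
from urllib.parse import urlparse

def driver_host(akka_url):
    """Extract the host from the akka.tcp driver URL handed to an executor.

    akka.tcp URLs follow the usual scheme://user@host:port/path shape,
    so urlparse can pick the pieces apart."""
    return urlparse(akka_url).hostname

url = "akka.tcp://spark@192.168.122.1:44577/user/CoarseGrainedScheduler"
host = driver_host(url)
# 192.168.122.1 is private address space, unroutable from an EC2 worker:
print(host, ipaddress.ip_address(host).is_private)  # → 192.168.122.1 True
```

If the check flags a private or loopback address here, the fix belongs on the driver side (see SPARK_LOCAL_IP later in this thread), not on the workers.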



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Deploying-a-python-code-on-a-spark-EC2-cluster-tp4758p4828.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.


Re: Deploying a python code on a spark EC2 cluster

2014-04-25 Thread Shubhabrata
app-20140425160713-0002/7 is now RUNNING
14/04/25 17:07:13 INFO AppClient$ClientActor: Executor updated:
app-20140425160713-0002/7 is now FAILED (class java.io.IOException: Cannot
run program /mnt/work/spark/bin/compute-classpath.sh (in directory .):
error=2, No such file or directory)
14/04/25 17:07:13 INFO SparkDeploySchedulerBackend: Executor
app-20140425160713-0002/7 removed: class java.io.IOException: Cannot run
program /mnt/work/spark/bin/compute-classpath.sh (in directory .):
error=2, No such file or directory
14/04/25 17:07:13 INFO AppClient$ClientActor: Executor added:
app-20140425160713-0002/8 on
worker-20140425133348-ip-10-84-7-178.eu-west-1.compute.internal-57839
(ip-10-84-7-178.eu-west-1.compute.internal:57839) with 1 cores
14/04/25 17:07:13 INFO SparkDeploySchedulerBackend: Granted executor ID
app-20140425160713-0002/8 on hostPort
ip-10-84-7-178.eu-west-1.compute.internal:57839 with 1 cores, 512.0 MB RAM
14/04/25 17:07:13 INFO AppClient$ClientActor: Executor updated:
app-20140425160713-0002/8 is now RUNNING
14/04/25 17:07:13 INFO AppClient$ClientActor: Executor updated:
app-20140425160713-0002/8 is now FAILED (class java.io.IOException: Cannot
run program /mnt/work/spark/bin/compute-classpath.sh (in directory .):
error=2, No such file or directory)
14/04/25 17:07:13 INFO SparkDeploySchedulerBackend: Executor
app-20140425160713-0002/8 removed: class java.io.IOException: Cannot run
program /mnt/work/spark/bin/compute-classpath.sh (in directory .):
error=2, No such file or directory
14/04/25 17:07:13 INFO AppClient$ClientActor: Executor added:
app-20140425160713-0002/9 on
worker-20140425133348-ip-10-84-7-178.eu-west-1.compute.internal-57839
(ip-10-84-7-178.eu-west-1.compute.internal:57839) with 1 cores
14/04/25 17:07:13 INFO SparkDeploySchedulerBackend: Granted executor ID
app-20140425160713-0002/9 on hostPort
ip-10-84-7-178.eu-west-1.compute.internal:57839 with 1 cores, 512.0 MB RAM
14/04/25 17:07:13 INFO AppClient$ClientActor: Executor updated:
app-20140425160713-0002/9 is now RUNNING
14/04/25 17:07:13 INFO AppClient$ClientActor: Executor updated:
app-20140425160713-0002/9 is now FAILED (class java.io.IOException: Cannot
run program /mnt/work/spark/bin/compute-classpath.sh (in directory .):
error=2, No such file or directory)
14/04/25 17:07:13 INFO SparkDeploySchedulerBackend: Executor
app-20140425160713-0002/9 removed: class java.io.IOException: Cannot run
program /mnt/work/spark/bin/compute-classpath.sh (in directory .):
error=2, No such file or directory
14/04/25 17:07:13 ERROR AppClient$ClientActor: Master removed our
application: FAILED; stopping client
14/04/25 17:07:13 WARN SparkDeploySchedulerBackend: Disconnected from Spark
cluster! Waiting for reconnection...
14/04/25 17:07:28 WARN TaskSchedulerImpl: Initial job has not accepted any
resources; check your cluster UI to ensure that workers are registered and
have sufficient memory
14/04/25 17:07:43 WARN TaskSchedulerImpl: Initial job has not accepted any
resources; check your cluster UI to ensure that workers are registered and
have sufficient memory
14/04/25 17:07:58 WARN TaskSchedulerImpl: Initial job has not accepted any
resources; check your cluster UI to ensure that workers are registered and
have sufficient memory
14/04/25 17:08:13 WARN TaskSchedulerImpl: Initial job has not accepted any
resources; check your cluster UI to ensure that workers are registered and
have sufficient memory
14/04/25 17:08:28 WARN TaskSchedulerImpl: Initial job has not accepted any
resources; check your cluster UI to ensure that workers are registered and
have sufficient memory
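[Editor's note] The repeated IOException means the workers are trying to launch executors out of a Spark home of /mnt/work/spark, where no compute-classpath.sh exists (the spark-ec2 scripts install Spark under /root/spark on every node). That usually points at a Spark home configured on the driver that does not match the workers' layout. A sketch of the path each worker tries to exec — /root/spark is an assumption, adjust to your cluster:

```python
import os.path

def executor_launcher(spark_home):
    """Script a worker execs to build an executor's classpath; the
    IOException above means this file is missing under /mnt/work/spark."""
    return os.path.join(spark_home, "bin", "compute-classpath.sh")

# Where the spark-ec2 scripts put Spark (assumption for illustration):
print(executor_launcher("/root/spark"))
# → /root/spark/bin/compute-classpath.sh
```

If the driver passes a Spark home explicitly (e.g. the sparkHome argument to PySpark's SparkContext in the 0.9-era API), it must be the path where Spark actually lives on the workers, and that file should exist on every node.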



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Deploying-a-python-code-on-a-spark-EC2-cluster-tp4758p4833.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.


Re: Deploying a python code on a spark EC2 cluster

2014-04-24 Thread Shubhabrata
Moreover, it seems all the workers are registered and have sufficient memory
(2.7 GB, whereas I have asked for 512 MB). The UI also shows the jobs are
running on the slaves. But on the terminal it is still the same error:
"Initial job has not accepted any resources; check your cluster UI to ensure
that workers are registered and have sufficient memory"

Please see the screenshot. Thanks

http://apache-spark-user-list.1001560.n3.nabble.com/file/n4761/33.png 
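[Editor's note] "Initial job has not accepted any resources" is a generic symptom: it fires whenever no registered executor can satisfy the request — either because the request exceeds what the workers offer, or, as in the logs earlier in this thread, because every executor dies right after launch. When it really is a memory mismatch, the request has to fit under what each worker advertises in the Master UI (2.7 GB here). A small sketch of that sizing check (the 0.9 headroom factor is an arbitrary safety margin, not a Spark default):

```python
def fits_worker(requested_mb, worker_mb, headroom=0.9):
    """True if an executor memory request fits within a worker's
    advertised memory, leaving some headroom for the worker daemon."""
    return requested_mb <= worker_mb * headroom

# 512 MB requested vs. the 2.7 GB each worker advertises in the UI:
print(fits_worker(512, 2700))  # → True, so memory is not the bottleneck here
```

Since the request fits comfortably, the warning in this case points back at the failing executors, not at a resource shortfall.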



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Deploying-a-python-code-on-a-spark-EC2-cluster-tp4758p4761.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.


Re: Deploying a python code on a spark EC2 cluster

2014-04-24 Thread Matei Zaharia
Did you launch this using our EC2 scripts 
(http://spark.apache.org/docs/latest/ec2-scripts.html) or did you manually set 
up the daemons? My guess is that their hostnames are not being resolved 
properly on all nodes, so executor processes can’t connect back to your driver 
app. This error message indicates that:

14/04/24 09:00:49 WARN util.Utils: Your hostname, spark-node resolves to a
loopback address: 127.0.0.1; using 10.74.149.251 instead (on interface eth0)
14/04/24 09:00:49 WARN util.Utils: Set SPARK_LOCAL_IP if you need to bind to
another address

If you launch with your EC2 scripts, or don’t manually change the hostnames, 
this should not happen.
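[Editor's note] One way to act on this advice from a PySpark driver is to pin SPARK_LOCAL_IP to a routable address in the environment before the SparkContext (and its JVM) starts. A hedged sketch — the connect() to 8.8.8.8 on a UDP socket sends no packets, it only asks the kernel which local interface would route outward:

```python
import os
import socket

def routable_ip():
    """Return the local address the OS would use to reach an outside
    host; falls back to loopback when there is no route at all."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect(("8.8.8.8", 53))  # no traffic sent; just selects a route
        return s.getsockname()[0]
    except OSError:
        return "127.0.0.1"  # no route; Spark's loopback warning would fire
    finally:
        s.close()

# Must be exported before the SparkContext starts its JVM:
os.environ["SPARK_LOCAL_IP"] = routable_ip()
```

Setting the same variable in conf/spark-env.sh on the driver machine achieves the same thing without touching the application code.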

Matei

On Apr 24, 2014, at 11:36 AM, John King usedforprinting...@gmail.com wrote:

 Same problem.
 
 
 On Thu, Apr 24, 2014 at 10:54 AM, Shubhabrata mail2shu...@gmail.com wrote:
 Moreover, it seems all the workers are registered and have sufficient memory
 (2.7 GB, whereas I have asked for 512 MB). The UI also shows the jobs are
 running on the slaves. But on the terminal it is still the same error:
 "Initial job has not accepted any resources; check your cluster UI to ensure
 that workers are registered and have sufficient memory"
 
 Please see the screenshot. Thanks
 
 http://apache-spark-user-list.1001560.n3.nabble.com/file/n4761/33.png
 
 
 
 --
 View this message in context: 
 http://apache-spark-user-list.1001560.n3.nabble.com/Deploying-a-python-code-on-a-spark-EC2-cluster-tp4758p4761.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.
 



Re: Deploying a python code on a spark EC2 cluster

2014-04-24 Thread John King
This happens to me when using the EC2 scripts for the recent v1.0.0rc2
release. The Master connects and then disconnects immediately, eventually
saying "Master disconnected from cluster".


On Thu, Apr 24, 2014 at 4:01 PM, Matei Zaharia matei.zaha...@gmail.com wrote:

 Did you launch this using our EC2 scripts (
 http://spark.apache.org/docs/latest/ec2-scripts.html) or did you manually
 set up the daemons? My guess is that their hostnames are not being resolved
 properly on all nodes, so executor processes can’t connect back to your
 driver app. This error message indicates that:

 14/04/24 09:00:49 WARN util.Utils: Your hostname, spark-node resolves to a
 loopback address: 127.0.0.1; using 10.74.149.251 instead (on interface
 eth0)
 14/04/24 09:00:49 WARN util.Utils: Set SPARK_LOCAL_IP if you need to bind
 to
 another address

 If you launch with your EC2 scripts, or don’t manually change the
 hostnames, this should not happen.

 Matei

 On Apr 24, 2014, at 11:36 AM, John King usedforprinting...@gmail.com
 wrote:

 Same problem.

