Re: Trouble while running spark at ec2 cluster
Hi Hassaan,

Typically I log on to my master to submit my app:

[ec2-user@ip-172-31-11-222 bin]$ echo $SPARK_ROOT
/root/spark
[ec2-user@ip-172-31-11-222 bin]$ echo $MASTER_URL
spark://ec2-54-215-11-222.us-west-1.compute.amazonaws.com:7077
[ec2-user@ip-172-31-11-222 bin]$ $SPARK_ROOT/bin/spark-submit \
    --class "com.pws.sparkStreaming.collector.StreamingKafkaCollector" \
    --master $MASTER_URL

I think you might be trying to launch your application from a machine outside of your EC2 cluster. I do not think that is going to work when you submit to port 7077, because the driver is going to be on your local machine. You probably also have a firewall issue.

Andy

From: Hassaan Chaudhry <mhassaanchaud...@gmail.com>
Date: Friday, July 15, 2016 at 9:32 PM
To: "user @spark" <user@spark.apache.org>
Subject: Trouble while running spark at ec2 cluster

> Hi,
>
> I have launched my cluster and I am trying to submit my application to run on the cluster, but it is not letting me connect. It prompts the following error:
>
> "Master endpoint spark://ec2-54-187-59-117.us-west-2.compute.amazonaws.com:7077 was not a REST server."
>
> The command I use to run my application on the cluster is:
>
> /spark-1.6.1/bin/spark-submit --master spark://ec2-54-200-193-107.us-west-2.compute.amazonaws.com:7077 --deploy-mode cluster --class BFS target/scala-2.10/scalaexample_2.10-1.0.jar
>
> Am I missing something? Your help will be highly appreciated.
>
> P.S. I have even tried adding an inbound rule to my master node, but still no success.
>
> Thanks
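For reference, a minimal sketch of the two submission styles being discussed. The hostname, class name, and jar path are taken from the thread; the port 6066 REST submission endpoint is the usual default for standalone cluster deploy mode and is an assumption here, not something confirmed in the thread:

    # Client deploy mode, run from the master node (the driver runs where spark-submit is invoked):
    /spark-1.6.1/bin/spark-submit \
        --master spark://ec2-54-200-193-107.us-west-2.compute.amazonaws.com:7077 \
        --class BFS \
        target/scala-2.10/scalaexample_2.10-1.0.jar

    # Cluster deploy mode goes through the standalone REST submission gateway,
    # which normally listens on port 6066 rather than 7077:
    /spark-1.6.1/bin/spark-submit \
        --master spark://ec2-54-200-193-107.us-west-2.compute.amazonaws.com:6066 \
        --deploy-mode cluster \
        --class BFS \
        target/scala-2.10/scalaexample_2.10-1.0.jar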
Trouble while running spark at ec2 cluster
Hi,

I have launched my cluster and I am trying to submit my application to run on the cluster, but it is not letting me connect. It prompts the following error:

"Master endpoint spark://ec2-54-187-59-117.us-west-2.compute.amazonaws.com:7077 was not a REST server."

The command I use to run my application on the cluster is:

/spark-1.6.1/bin/spark-submit --master spark://ec2-54-200-193-107.us-west-2.compute.amazonaws.com:7077 --deploy-mode cluster --class BFS target/scala-2.10/scalaexample_2.10-1.0.jar

Am I missing something? Your help will be highly appreciated.

P.S. I have even tried adding an inbound rule to my master node, but still no success.

Thanks
Re: Upgrading Spark in EC2 clusters
Thanks for the info and the tip! I'll look into writing our own script based on the spark-ec2 scripts.

Best,
Augustus

On Thu, Nov 12, 2015 at 10:01 AM, Jason Rubenstein <jasondrubenst...@gmail.com> wrote:

> Hi,
>
> With some minor changes to spark-ec2/spark/init.sh and writing your own "upgrade-spark.sh" script, you can upgrade Spark in place.
>
> (Make sure to call not only spark/init.sh but also spark/setup.sh, because the latter uses copy-dir to get your new version of Spark to the slaves.)
>
> I wrote one so we could upgrade to a specific version of Spark (via commit hash) and used it to upgrade from 1.4.1 to 1.5.0.
>
> best,
> Jason
>
> On Thu, Nov 12, 2015 at 9:49 AM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
>
>> spark-ec2 does not offer a way to upgrade an existing cluster, and from what I gather, it wasn't intended to be used to manage long-lasting infrastructure. The recommended approach really is to just destroy your existing cluster and launch a new one with the desired configuration.
>>
>> If you want to upgrade the cluster in place, you'll probably have to do that manually. Otherwise, perhaps spark-ec2 is not the right tool, and instead you want one of those "grown-up" management tools like Ansible, which can be set up to allow in-place upgrades. That'll take a bit of work, though.
>>
>> Nick
>>
>> On Wed, Nov 11, 2015 at 6:01 PM Augustus Hong <augus...@branchmetrics.io> wrote:
>>
>>> Hey All,
>>>
>>> I have a Spark cluster (running version 1.5.0) on EC2 launched with the provided spark-ec2 scripts. If I want to upgrade Spark to 1.5.2 in the same cluster, what's the safest / recommended way to do that?
>>>
>>> I know I can spin up a new cluster running 1.5.2, but it doesn't seem efficient to spin up a new cluster every time we need to upgrade.
>>>
>>> Thanks,
>>> Augustus
>>>
>>> --
>>> Augustus Hong
>>> Data Analytics | Branch Metrics
>>> m 650-391-3369 | e augus...@branch.io

--
Augustus Hong
Data Analytics | Branch Metrics
m 650-391-3369 | e augus...@branch.io
Re: Upgrading Spark in EC2 clusters
spark-ec2 does not offer a way to upgrade an existing cluster, and from what I gather, it wasn't intended to be used to manage long-lasting infrastructure. The recommended approach really is to just destroy your existing cluster and launch a new one with the desired configuration.

If you want to upgrade the cluster in place, you'll probably have to do that manually. Otherwise, perhaps spark-ec2 is not the right tool, and instead you want one of those "grown-up" management tools like Ansible, which can be set up to allow in-place upgrades. That'll take a bit of work, though.

Nick

On Wed, Nov 11, 2015 at 6:01 PM Augustus Hong <augus...@branchmetrics.io> wrote:

> Hey All,
>
> I have a Spark cluster (running version 1.5.0) on EC2 launched with the provided spark-ec2 scripts. If I want to upgrade Spark to 1.5.2 in the same cluster, what's the safest / recommended way to do that?
>
> I know I can spin up a new cluster running 1.5.2, but it doesn't seem efficient to spin up a new cluster every time we need to upgrade.
>
> Thanks,
> Augustus
>
> --
> Augustus Hong
> Data Analytics | Branch Metrics
> m 650-391-3369 | e augus...@branch.io
Re: Upgrading Spark in EC2 clusters
Hi,

With some minor changes to spark-ec2/spark/init.sh and writing your own "upgrade-spark.sh" script, you can upgrade Spark in place.

(Make sure to call not only spark/init.sh but also spark/setup.sh, because the latter uses copy-dir to get your new version of Spark to the slaves.)

I wrote one so we could upgrade to a specific version of Spark (via commit hash) and used it to upgrade from 1.4.1 to 1.5.0.

best,
Jason

On Thu, Nov 12, 2015 at 9:49 AM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:

> spark-ec2 does not offer a way to upgrade an existing cluster, and from what I gather, it wasn't intended to be used to manage long-lasting infrastructure. The recommended approach really is to just destroy your existing cluster and launch a new one with the desired configuration.
>
> If you want to upgrade the cluster in place, you'll probably have to do that manually. Otherwise, perhaps spark-ec2 is not the right tool, and instead you want one of those "grown-up" management tools like Ansible, which can be set up to allow in-place upgrades. That'll take a bit of work, though.
>
> Nick
>
> On Wed, Nov 11, 2015 at 6:01 PM Augustus Hong <augus...@branchmetrics.io> wrote:
>
>> Hey All,
>>
>> I have a Spark cluster (running version 1.5.0) on EC2 launched with the provided spark-ec2 scripts. If I want to upgrade Spark to 1.5.2 in the same cluster, what's the safest / recommended way to do that?
>>
>> I know I can spin up a new cluster running 1.5.2, but it doesn't seem efficient to spin up a new cluster every time we need to upgrade.
>>
>> Thanks,
>> Augustus
>>
>> --
>> Augustus Hong
>> Data Analytics | Branch Metrics
>> m 650-391-3369 | e augus...@branch.io
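For reference, a rough sketch of what such an upgrade script could look like when run as root on the master of a spark-ec2 cluster. The /root/spark and /root/spark-ec2 paths are the spark-ec2 defaults; re-running the module's init.sh and setup.sh follows the message above, but the exact invocation (and any environment those scripts expect, such as /root/spark-ec2/ec2-variables.sh) is an assumption rather than a tested recipe:

    #!/bin/bash
    # upgrade-spark.sh -- hypothetical in-place upgrade on a spark-ec2 cluster
    set -e

    # Stop the running standalone daemons before replacing the installation.
    /root/spark/sbin/stop-all.sh

    # Keep a copy of the current configuration.
    cp -r /root/spark/conf /root/spark-conf.bak

    # Re-run the spark module scripts: init.sh (patched to fetch the desired
    # release or commit build) and setup.sh, which uses copy-dir to push the
    # new /root/spark out to the slaves.
    /root/spark-ec2/spark/init.sh
    /root/spark-ec2/spark/setup.sh

    # Restore the saved configuration and distribute it to the slaves.
    cp /root/spark-conf.bak/* /root/spark/conf/
    /root/spark-ec2/copy-dir /root/spark/conf

    # Bring the cluster back up on the new version.
    /root/spark/sbin/start-all.sh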
Upgrading Spark in EC2 clusters
Hey All,

I have a Spark cluster (running version 1.5.0) on EC2 launched with the provided spark-ec2 scripts. If I want to upgrade Spark to 1.5.2 in the same cluster, what's the safest / recommended way to do that?

I know I can spin up a new cluster running 1.5.2, but it doesn't seem efficient to spin up a new cluster every time we need to upgrade.

Thanks,
Augustus

--
Augustus Hong
Data Analytics | Branch Metrics
m 650-391-3369 | e augus...@branch.io
Re: Networking issues with Spark on EC2
Hi,

Nope. I was trying to use EC2 (due to a few constraints), and that is where I faced the problem. With EMR, it works flawlessly. But I would like to go back and use EC2 if I can fix this issue.

Has anybody set up a Spark cluster using plain EC2 machines? What steps did you follow?

Thanks and Regards,
Suraj Sheth

On Sat, Sep 26, 2015 at 10:36 AM, Natu Lauchande <nlaucha...@gmail.com> wrote:

> Hi,
>
> Are you using EMR?
>
> Natu
>
> On Sat, Sep 26, 2015 at 6:55 AM, SURAJ SHETH <shet...@gmail.com> wrote:
>
>> Hi Ankur,
>> Thanks for the reply. This is already done.
>> If I wait for a long amount of time (10 minutes), a few tasks get successful even on slave nodes. Sometimes, a fraction of the tasks (20%) are completed on all the machines in the initial 5 seconds and then it slows down drastically.
>>
>> Thanks and Regards,
>> Suraj Sheth
>>
>> On Fri, Sep 25, 2015 at 2:10 AM, Ankur Srivastava <ankur.srivast...@gmail.com> wrote:
>>
>>> Hi Suraj,
>>>
>>> Spark uses a lot of ports to communicate between nodes. Probably your security group is restrictive and does not allow instances to communicate on all networks. The easiest way to resolve it is to add a rule to allow all inbound traffic on all ports (0-65535) to instances in the same security group, like this:
>>>
>>> All TCP | TCP | 0 - 65535 | your security group
>>>
>>> Hope this helps!!
>>>
>>> Thanks
>>> Ankur
>>>
>>> On Thu, Sep 24, 2015 at 7:09 AM SURAJ SHETH <shet...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am using Spark 1.2 and facing network-related issues while performing simple computations.
>>>>
>>>> This is a custom cluster set up using EC2 machines and the Spark prebuilt binary from the Apache site. The problem occurs only when we have workers on other machines (networking involved). Having a single node for the master and the slave works correctly.
>>>>
>>>> The error log from the slave node is attached below. It is reading a textFile from the local FS (copied to each node) and counting it. The first 30 tasks get completed within 5 seconds. Then it takes several minutes to complete another 10 tasks and eventually dies.
>>>>
>>>> Sometimes, one of the workers completes all the tasks assigned to it. Different workers have different behavior at different times (non-deterministic).
>>>>
>>>> Is it related to something specific to EC2?
>>>> >>>> >>>> >>>> 15/09/24 13:04:40 INFO Executor: Running task 117.0 in stage 0.0 (TID >>>> 117) >>>> >>>> 15/09/24 13:04:41 INFO TorrentBroadcast: Started reading broadcast >>>> variable 1 >>>> >>>> 15/09/24 13:04:41 INFO SendingConnection: Initiating connection to >>>> [master_ip:56305] >>>> >>>> 15/09/24 13:04:41 INFO SendingConnection: Connected to >>>> [master_ip/master_ip_address:56305], 1 messages pending >>>> >>>> 15/09/24 13:05:41 INFO TorrentBroadcast: Started reading broadcast >>>> variable 1 >>>> >>>> 15/09/24 13:05:41 ERROR Executor: Exception in task 77.0 in stage 0.0 >>>> (TID 77) >>>> >>>> java.io.IOException: sendMessageReliably failed because ack was not >>>> received within 60 sec >>>> >>>> at >>>> org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:918) >>>> >>>> at >>>> org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:917) >>>> >>>> at scala.Option.foreach(Option.scala:236) >>>> >>>> at >>>> org.apache.spark.network.nio.ConnectionManager$$anon$13.run(ConnectionManager.scala:917) >>>> >>>> at >>>> io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:581) >>>> >>>> at >>>> io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:656) >>>> >>>> at >>>> io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:367) >>>> >>>>
Re: Networking issues with Spark on EC2
Hi, Are you using EMR ? Natu On Sat, Sep 26, 2015 at 6:55 AM, SURAJ SHETH <shet...@gmail.com> wrote: > Hi Ankur, > Thanks for the reply. > This is already done. > If I wait for a long amount of time(10 minutes), a few tasks get > successful even on slave nodes. Sometime, a fraction of the tasks(20%) are > completed on all the machines in the initial 5 seconds and then, it slows > down drastically. > > Thanks and Regards, > Suraj Sheth > > On Fri, Sep 25, 2015 at 2:10 AM, Ankur Srivastava < > ankur.srivast...@gmail.com> wrote: > >> Hi Suraj, >> >> Spark uses a lot of ports to communicate between nodes. Probably your >> security group is restrictive and does not allow instances to communicate >> on all networks. The easiest way to resolve it is to add a Rule to allow >> all Inbound traffic on all ports (0-65535) to instances in same security >> group like this. >> >> All TCP >> TCP >> 0 - 65535 >> your security group >> >> Hope this helps!! >> >> Thanks >> Ankur >> >> On Thu, Sep 24, 2015 at 7:09 AM SURAJ SHETH <shet...@gmail.com> wrote: >> >>> Hi, >>> >>> I am using Spark 1.2 and facing network related issues while performing >>> simple computations. >>> >>> This is a custom cluster set up using ec2 machines and spark prebuilt >>> binary from apache site. The problem is only when we have workers on other >>> machines(networking involved). Having a single node for the master and the >>> slave works correctly. >>> >>> The error log from slave node is attached below. It is reading textFile >>> from local FS(copied each node) and counting it. The first 30 tasks get >>> completed within 5 seconds. Then, it takes several minutes to complete >>> another 10 tasks and eventually dies. >>> >>> Sometimes, one of the workers completes all the tasks assigned to it. >>> Different workers have different behavior at different >>> times(non-deterministic). >>> >>> Is it related to something specific to EC2? 
>>> >>> >>> >>> 15/09/24 13:04:40 INFO Executor: Running task 117.0 in stage 0.0 (TID >>> 117) >>> >>> 15/09/24 13:04:41 INFO TorrentBroadcast: Started reading broadcast >>> variable 1 >>> >>> 15/09/24 13:04:41 INFO SendingConnection: Initiating connection to >>> [master_ip:56305] >>> >>> 15/09/24 13:04:41 INFO SendingConnection: Connected to >>> [master_ip/master_ip_address:56305], 1 messages pending >>> >>> 15/09/24 13:05:41 INFO TorrentBroadcast: Started reading broadcast >>> variable 1 >>> >>> 15/09/24 13:05:41 ERROR Executor: Exception in task 77.0 in stage 0.0 >>> (TID 77) >>> >>> java.io.IOException: sendMessageReliably failed because ack was not >>> received within 60 sec >>> >>> at >>> org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:918) >>> >>> at >>> org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:917) >>> >>> at scala.Option.foreach(Option.scala:236) >>> >>> at >>> org.apache.spark.network.nio.ConnectionManager$$anon$13.run(ConnectionManager.scala:917) >>> >>> at >>> io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:581) >>> >>> at >>> io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:656) >>> >>> at >>> io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:367) >>> >>> at java.lang.Thread.run(Thread.java:745) >>> >>> 15/09/24 13:05:41 INFO CoarseGrainedExecutorBackend: Got assigned task >>> 122 >>> >>> 15/09/24 13:05:41 INFO Executor: Running task 3.1 in stage 0.0 (TID 122) >>> >>> 15/09/24 13:06:41 ERROR Executor: Exception in task 113.0 in stage 0.0 >>> (TID 113) >>> >>> java.io.IOException: sendMessageReliably failed because ack was not >>> received within 60 sec >>> >>> at >>> org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:918) >>> >>> at >>> org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:917) >>> >>> at sc
Re: Networking issues with Spark on EC2
Hi Ankur,

Thanks for the reply. This is already done.

If I wait for a long amount of time (10 minutes), a few tasks get successful even on slave nodes. Sometimes, a fraction of the tasks (20%) are completed on all the machines in the initial 5 seconds and then it slows down drastically.

Thanks and Regards,
Suraj Sheth

On Fri, Sep 25, 2015 at 2:10 AM, Ankur Srivastava <ankur.srivast...@gmail.com> wrote:

> Hi Suraj,
>
> Spark uses a lot of ports to communicate between nodes. Probably your security group is restrictive and does not allow instances to communicate on all networks. The easiest way to resolve it is to add a rule to allow all inbound traffic on all ports (0-65535) to instances in the same security group, like this:
>
> All TCP | TCP | 0 - 65535 | your security group
>
> Hope this helps!!
>
> Thanks
> Ankur
>
> On Thu, Sep 24, 2015 at 7:09 AM SURAJ SHETH <shet...@gmail.com> wrote:
>
>> Hi,
>>
>> I am using Spark 1.2 and facing network-related issues while performing simple computations.
>>
>> This is a custom cluster set up using EC2 machines and the Spark prebuilt binary from the Apache site. The problem occurs only when we have workers on other machines (networking involved). Having a single node for the master and the slave works correctly.
>>
>> The error log from the slave node is attached below. It is reading a textFile from the local FS (copied to each node) and counting it. The first 30 tasks get completed within 5 seconds. Then it takes several minutes to complete another 10 tasks and eventually dies.
>>
>> Sometimes, one of the workers completes all the tasks assigned to it. Different workers have different behavior at different times (non-deterministic).
>>
>> Is it related to something specific to EC2?
>>
>> 15/09/24 13:04:40 INFO Executor: Running task 117.0 in stage 0.0 (TID 117)
>> 15/09/24 13:04:41 INFO TorrentBroadcast: Started reading broadcast variable 1
>> 15/09/24 13:04:41 INFO SendingConnection: Initiating connection to [master_ip:56305]
>> 15/09/24 13:04:41 INFO SendingConnection: Connected to [master_ip/master_ip_address:56305], 1 messages pending
>> 15/09/24 13:05:41 INFO TorrentBroadcast: Started reading broadcast variable 1
>> 15/09/24 13:05:41 ERROR Executor: Exception in task 77.0 in stage 0.0 (TID 77)
>> java.io.IOException: sendMessageReliably failed because ack was not received within 60 sec
>>         at org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:918)
>>         at org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:917)
>>         at scala.Option.foreach(Option.scala:236)
>>         at org.apache.spark.network.nio.ConnectionManager$$anon$13.run(ConnectionManager.scala:917)
>>         at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:581)
>>         at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:656)
>>         at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:367)
>>         at java.lang.Thread.run(Thread.java:745)
>> 15/09/24 13:05:41 INFO CoarseGrainedExecutorBackend: Got assigned task 122
>> 15/09/24 13:05:41 INFO Executor: Running task 3.1 in stage 0.0 (TID 122)
>> 15/09/24 13:06:41 ERROR Executor: Exception in task 113.0 in stage 0.0 (TID 113)
>> java.io.IOException: sendMessageReliably failed because ack was not received within 60 sec
>>         at org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:918)
>>         at org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:917)
>>         at scala.Option.foreach(Option.scala:236)
>>         at org.apache.spark.network.nio.ConnectionManager$$anon$13.run(ConnectionManager.scala:917)
>>         at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:581)
>>         at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:656)
>>         at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:367)
>>         at java.lang.Thread.run(T
Networking issues with Spark on EC2
Hi,

I am using Spark 1.2 and facing network-related issues while performing simple computations.

This is a custom cluster set up using EC2 machines and the Spark prebuilt binary from the Apache site. The problem occurs only when we have workers on other machines (networking involved). Having a single node for the master and the slave works correctly.

The error log from the slave node is attached below. It is reading a textFile from the local FS (copied to each node) and counting it. The first 30 tasks get completed within 5 seconds. Then it takes several minutes to complete another 10 tasks and eventually dies.

Sometimes, one of the workers completes all the tasks assigned to it. Different workers have different behavior at different times (non-deterministic).

Is it related to something specific to EC2?

15/09/24 13:04:40 INFO Executor: Running task 117.0 in stage 0.0 (TID 117)
15/09/24 13:04:41 INFO TorrentBroadcast: Started reading broadcast variable 1
15/09/24 13:04:41 INFO SendingConnection: Initiating connection to [master_ip:56305]
15/09/24 13:04:41 INFO SendingConnection: Connected to [master_ip/master_ip_address:56305], 1 messages pending
15/09/24 13:05:41 INFO TorrentBroadcast: Started reading broadcast variable 1
15/09/24 13:05:41 ERROR Executor: Exception in task 77.0 in stage 0.0 (TID 77)
java.io.IOException: sendMessageReliably failed because ack was not received within 60 sec
        at org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:918)
        at org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:917)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.network.nio.ConnectionManager$$anon$13.run(ConnectionManager.scala:917)
        at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:581)
        at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:656)
        at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:367)
        at java.lang.Thread.run(Thread.java:745)
15/09/24 13:05:41 INFO CoarseGrainedExecutorBackend: Got assigned task 122
15/09/24 13:05:41 INFO Executor: Running task 3.1 in stage 0.0 (TID 122)
15/09/24 13:06:41 ERROR Executor: Exception in task 113.0 in stage 0.0 (TID 113)
java.io.IOException: sendMessageReliably failed because ack was not received within 60 sec
        at org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:918)
        at org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:917)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.network.nio.ConnectionManager$$anon$13.run(ConnectionManager.scala:917)
        at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:581)
        at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:656)
        at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:367)
        at java.lang.Thread.run(Thread.java:745)
15/09/24 13:06:41 INFO TorrentBroadcast: Started reading broadcast variable 1
15/09/24 13:06:41 INFO SendingConnection: Initiating connection to [master_ip/master_ip_address:44427]
15/09/24 13:06:41 INFO SendingConnection: Connected to [master_ip/master_ip_address:44427], 1 messages pending
15/09/24 13:07:41 ERROR Executor: Exception in task 37.0 in stage 0.0 (TID 37)
java.io.IOException: sendMessageReliably failed because ack was not received within 60 sec
        at org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:918)
        at org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:917)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.network.nio.ConnectionManager$$anon$13.run(ConnectionManager.scala:917)
        at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:581)
        at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:656)
        at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:367)
        at java.lang.Thread.run(Thread.java:745)

I checked the network speed between the master and the slave and it is able to scp large files at a speed of 60 MB/s.

Any leads on how this can be fixed?

Thanks and Regards,
Suraj Sheth
Re: Networking issues with Spark on EC2
Hi Suraj,

Spark uses a lot of ports to communicate between nodes. Probably your security group is restrictive and does not allow instances to communicate on all networks. The easiest way to resolve it is to add a rule to allow all inbound traffic on all ports (0-65535) to instances in the same security group, like this:

All TCP | TCP | 0 - 65535 | your security group

Hope this helps!!

Thanks
Ankur

On Thu, Sep 24, 2015 at 7:09 AM SURAJ SHETH <shet...@gmail.com> wrote:

> Hi,
>
> I am using Spark 1.2 and facing network-related issues while performing simple computations.
>
> This is a custom cluster set up using EC2 machines and the Spark prebuilt binary from the Apache site. The problem occurs only when we have workers on other machines (networking involved). Having a single node for the master and the slave works correctly.
>
> The error log from the slave node is attached below. It is reading a textFile from the local FS (copied to each node) and counting it. The first 30 tasks get completed within 5 seconds. Then it takes several minutes to complete another 10 tasks and eventually dies.
>
> Sometimes, one of the workers completes all the tasks assigned to it. Different workers have different behavior at different times (non-deterministic).
>
> Is it related to something specific to EC2?
>
> 15/09/24 13:04:40 INFO Executor: Running task 117.0 in stage 0.0 (TID 117)
> 15/09/24 13:04:41 INFO TorrentBroadcast: Started reading broadcast variable 1
> 15/09/24 13:04:41 INFO SendingConnection: Initiating connection to [master_ip:56305]
> 15/09/24 13:04:41 INFO SendingConnection: Connected to [master_ip/master_ip_address:56305], 1 messages pending
> 15/09/24 13:05:41 INFO TorrentBroadcast: Started reading broadcast variable 1
> 15/09/24 13:05:41 ERROR Executor: Exception in task 77.0 in stage 0.0 (TID 77)
> java.io.IOException: sendMessageReliably failed because ack was not received within 60 sec
>         at org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:918)
>         at org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:917)
>         at scala.Option.foreach(Option.scala:236)
>         at org.apache.spark.network.nio.ConnectionManager$$anon$13.run(ConnectionManager.scala:917)
>         at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:581)
>         at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:656)
>         at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:367)
>         at java.lang.Thread.run(Thread.java:745)
> 15/09/24 13:05:41 INFO CoarseGrainedExecutorBackend: Got assigned task 122
> 15/09/24 13:05:41 INFO Executor: Running task 3.1 in stage 0.0 (TID 122)
> 15/09/24 13:06:41 ERROR Executor: Exception in task 113.0 in stage 0.0 (TID 113)
> java.io.IOException: sendMessageReliably failed because ack was not received within 60 sec
>         at org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:918)
>         at org.apache.spark.network.nio.ConnectionManager$$anon$13$$anonfun$run$19.apply(ConnectionManager.scala:917)
>         at scala.Option.foreach(Option.scala:236)
>         at org.apache.spark.network.nio.ConnectionManager$$anon$13.run(ConnectionManager.scala:917)
>         at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:581)
>         at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:656)
>         at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:367)
>         at java.lang.Thread.run(Thread.java:745)
> 15/09/24 13:06:41 INFO TorrentBroadcast: Started reading broadcast variable 1
> 15/09/24 13:06:41 INFO SendingConnection: Initiating connection to [master_ip/master_ip_address:44427]
> 15/09/24 13:06:41 INFO SendingConnection: Connected to [master_ip/master_ip_address:44427], 1 messages pending
> 15/09/24 13:07:41 ERROR Executor: Exception in task 37.0 in stage 0.0 (TID 37)
> java.io.IOException: sendMessageReliably failed because ack was not received within 60 sec
>         at org.apach
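As a concrete illustration of the rule described above, here is one way it could be added with the AWS CLI; the security-group ID is a placeholder, and the same rule can just as well be created from the "Inbound Rules" tab in the EC2 console:

    # Allow all TCP ports between instances that belong to security group sg-12345678.
    aws ec2 authorize-security-group-ingress \
        --group-id sg-12345678 \
        --protocol tcp \
        --port 0-65535 \
        --source-group sg-12345678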
Re: spark spark-ec2 credentials using aws_security_token
You refer to `aws_security_token`, but I'm not sure where you're specifying it. Can you elaborate? Is it an environment variable? On Mon, Jul 27, 2015 at 4:21 AM Jan Zikeš jan.zi...@centrum.cz wrote: Hi, I would like to ask if it is currently possible to use spark-ec2 script together with credentials that are consisting not only from: aws_access_key_id and aws_secret_access_key, but it also contains aws_security_token. When I try to run the script I am getting following error message: ERROR:boto:Caught exception reading instance data Traceback (most recent call last): File /Users/zikes/opensource/spark/ec2/lib/boto-2.34.0/boto/utils.py, line 210, in retry_url r = opener.open(req, timeout=timeout) File /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py, line 404, in open response = self._open(req, data) File /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py, line 422, in _open '_open', req) File /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py, line 382, in _call_chain result = func(*args) File /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py, line 1214, in http_open return self.do_open(httplib.HTTPConnection, req) File /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py, line 1184, in do_open raise URLError(err) URLError: urlopen error [Errno 64] Host is down ERROR:boto:Unable to read instance data, giving up No handler was ready to authenticate. 1 handlers were checked. ['QuerySignatureV2AuthHandler'] Check your credentials Does anyone has some idea what can be possibly wrong? Is aws_security_token the problem? I know that it seems more like a boto problem, but still I would like to ask if anybody has some experience with this? My launch command is: ./spark-ec2 -k my_key -i my_key.pem --additional-tags mytag:tag1,mytag2:tag2 --instance-profile-name profile1 -s 1 launch test Thank you in advance for any help. Best regards, Jan Note: I have also asked at http://stackoverflow.com/questions/31583513/spark-spark-ec2-credentials-using-aws-security-token?noredirect=1#comment51151822_31583513 without any success. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-spark-ec2-credentials-using-aws-security-token-tp24007.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
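For context on the question above: spark-ec2 talks to EC2 through the boto library (as the traceback shows), and boto 2 can generally pick credentials up from the environment or from a ~/.boto config file. A sketch of the environment-variable style of setup, assuming (not confirmed in the thread) that this is how the token was meant to be supplied; all values are placeholders:

    # Hypothetical credential setup before invoking spark-ec2:
    export AWS_ACCESS_KEY_ID=AKIAXXXXXXXXXXXXXXXX
    export AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    # Temporary (STS) credentials also carry a session token; boto 2 reads it
    # from AWS_SECURITY_TOKEN (newer AWS tools call the same value AWS_SESSION_TOKEN).
    export AWS_SECURITY_TOKEN=xxxxxxxx

    ./spark-ec2 -k my_key -i my_key.pem -s 1 launch test

Whether spark-ec2 actually forwards the token to every boto call it makes is exactly what the error in this thread suggests may not be happening.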
spark spark-ec2 credentials using aws_security_token
Hi,

I would like to ask whether it is currently possible to use the spark-ec2 script with credentials that consist not only of aws_access_key_id and aws_secret_access_key but also contain aws_security_token.

When I try to run the script I get the following error message:

ERROR:boto:Caught exception reading instance data
Traceback (most recent call last):
  File "/Users/zikes/opensource/spark/ec2/lib/boto-2.34.0/boto/utils.py", line 210, in retry_url
    r = opener.open(req, timeout=timeout)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 404, in open
    response = self._open(req, data)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 422, in _open
    '_open', req)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1214, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1184, in do_open
    raise URLError(err)
URLError: urlopen error [Errno 64] Host is down
ERROR:boto:Unable to read instance data, giving up
No handler was ready to authenticate. 1 handlers were checked. ['QuerySignatureV2AuthHandler'] Check your credentials

Does anyone have an idea what could possibly be wrong? Is aws_security_token the problem? I know that it seems more like a boto problem, but I would still like to ask if anybody has experience with this.

My launch command is:

./spark-ec2 -k my_key -i my_key.pem --additional-tags mytag:tag1,mytag2:tag2 --instance-profile-name profile1 -s 1 launch test

Thank you in advance for any help.

Best regards,
Jan

Note: I have also asked at http://stackoverflow.com/questions/31583513/spark-spark-ec2-credentials-using-aws-security-token?noredirect=1#comment51151822_31583513 without any success.
Re: Required settings for permanent HDFS Spark on EC2
If your problem is that stopping/starting the cluster resets configs, then you may be running into this issue: https://issues.apache.org/jira/browse/SPARK-4977 Nick On Thu, Jun 4, 2015 at 2:46 PM barmaley o...@solver.com wrote: Hi - I'm having similar problem with switching from ephemeral to persistent HDFS - it always looks for 9000 port regardless of options I set for 9010 persistent HDFS. Have you figured out a solution? Thanks -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Required-settings-for-permanent-HDFS-Spark-on-EC2-tp22860p23157.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
Re: Required settings for permanent HDFS Spark on EC2
Hi,

I'm having a similar problem switching from ephemeral to persistent HDFS: it always looks for port 9000, regardless of the options I set for port 9010 for persistent HDFS.

Have you figured out a solution?

Thanks
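Not a fix for the underlying config reset, but one way to check which filesystem a job is actually hitting is to spell the namenode port out in the URI instead of relying on the default. On spark-ec2 clusters the persistent HDFS conventionally listens on 9010 and the ephemeral one on 9000; the hostname and path below are placeholders, and the /root/persistent-hdfs layout is the usual spark-ec2 convention rather than something stated in this thread:

    # Talk to the persistent HDFS namenode explicitly, bypassing fs.default.name:
    /root/persistent-hdfs/bin/hadoop fs -ls hdfs://<master-hostname>:9010/

    # A Spark job can use the same fully qualified URI, e.g.:
    #   sc.textFile("hdfs://<master-hostname>:9010/path/to/data")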
Re: How to run customized Spark on EC2?
This is how I used to do it:

- Log in to the EC2 cluster (master).
- Make your changes to Spark and build it.
- Stop the old installation of Spark (sbin/stop-all.sh).
- Copy the old installation's conf/* to the modified version's conf/.
- Rsync the modified version to all slaves.
- Run sbin/start-all.sh from the modified version.

You can also simply replace the assembly jar (on the master and workers) with the newly built jar if the versions are all the same.

Thanks
Best Regards

On Tue, Apr 28, 2015 at 10:59 PM, Bo Fu <b...@uchicago.edu> wrote:

> Hi experts,
>
> I have an issue. I added some timestamps in the Spark source code and built it using:
>
> mvn package -DskipTests
>
> I checked the new version on my own computer and it works. However, when I ran Spark on EC2, the Spark code the EC2 machines ran was the original version.
>
> Anyone knows how to deploy the changed Spark source code into EC2?
>
> Thx a lot
>
> Bo Fu
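A sketch of those steps as shell commands on the master of a spark-ec2 cluster. The /root/spark and /root/spark-ec2 paths are the spark-ec2 defaults, and copy-dir (the helper that rsyncs a directory to all slaves) stands in for the manual rsync step; treat this as an outline under those assumptions rather than a tested script:

    cd /root/spark

    # 1. Apply your source changes, then rebuild.
    mvn package -DskipTests

    # 2. Stop the currently running standalone daemons.
    ./sbin/stop-all.sh

    # 3. If the rebuilt tree lives elsewhere, carry the existing conf/ over, e.g.:
    #    cp /root/spark.old/conf/* /root/spark/conf/

    # 4. Push the modified installation to every slave.
    /root/spark-ec2/copy-dir /root/spark

    # 5. Start the cluster from the modified version.
    ./sbin/start-all.sh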
Re: How to run self-build spark on EC2?
You can replace your cluster's assembly jar (on the master and workers) with your custom-built assembly jar.

Thanks
Best Regards

On Tue, Apr 28, 2015 at 9:45 PM, Bo Fu <b...@uchicago.edu> wrote:

> Hi all,
>
> I have an issue. I added some timestamps in the Spark source code and built it using:
>
> mvn package -DskipTests
>
> I checked the new version on my own computer and it works. However, when I ran Spark on EC2, the Spark code the EC2 machines ran was the original version.
>
> Anyone knows how to deploy the changed Spark source code into EC2?
>
> Thx a lot
>
> Bo Fu
How to run self-build spark on EC2?
Hi all,

I have an issue. I added some timestamps in the Spark source code and built it using:

mvn package -DskipTests

I checked the new version on my own computer and it works. However, when I ran Spark on EC2, the Spark code the EC2 machines ran was the original version.

Anyone knows how to deploy the changed Spark source code into EC2?

Thx a lot

Bo Fu
How to run customized Spark on EC2?
Hi experts,

I have an issue. I added some timestamps in the Spark source code and built it using:

mvn package -DskipTests

I checked the new version on my own computer and it works. However, when I ran Spark on EC2, the Spark code the EC2 machines ran was the original version.

Anyone knows how to deploy the changed Spark source code into EC2?

Thx a lot

Bo Fu
Spark on EC2
Hi all,

I just tried launching a Spark cluster on EC2 as described in http://spark.apache.org/docs/1.3.0/ec2-scripts.html

I got the following response:

Code: PendingVerification
Message: Your account is currently being verified. Verification normally takes less than 2 hours. Until your account is verified, you may not be able to launch additional instances or create additional volumes. If you are still receiving this message after more than 2 hours, please let us know by writing to aws-verificat...@amazon.com. We appreciate your patience...

However, I can see the EC2 instances in the AWS console as running.

Any thoughts on what's going on?

Thanks,
Vadim
Re: Spark on EC2
You're probably requesting more instances than allowed by your account, so the error gets generated for the extra instances. Try launching a smaller cluster.

On Wed, Apr 1, 2015 at 12:41 PM, Vadim Bichutskiy <vadim.bichuts...@gmail.com> wrote:

> Hi all,
>
> I just tried launching a Spark cluster on EC2 as described in http://spark.apache.org/docs/1.3.0/ec2-scripts.html
>
> I got the following response:
>
> Code: PendingVerification
> Message: Your account is currently being verified. Verification normally takes less than 2 hours. Until your account is verified, you may not be able to launch additional instances or create additional volumes. If you are still receiving this message after more than 2 hours, please let us know by writing to aws-verificat...@amazon.com. We appreciate your patience...
>
> However, I can see the EC2 instances in the AWS console as running.
>
> Any thoughts on what's going on?
>
> Thanks,
> Vadim

--
Dan Osipov
Shazam
2114 Broadway Street, Redwood City, CA 94063

Please consider the environment before printing this document

Shazam Entertainment Limited is incorporated in England and Wales under company number 3998831 and its registered office is at 26-28 Hammersmith Grove, London W6 7HA. Shazam Media Services Inc is a member of the Shazam Entertainment Limited group of companies.
Spark on EC2
Hi, I have just signed up for Amazon AWS because I learnt that it provides service for free for the first 12 months. I want to run Spark on EC2 cluster. Will they charge me for this? Thank You
Re: Spark on EC2
The free tier includes 750 hours of t2.micro instance time per month. http://aws.amazon.com/free/

That's basically a month of hours, so it's all free if you run only one instance at a time. If you run 4, you'll be able to run your cluster of 4 for about a week free.

A t2.micro has 1GB of memory, which is small but something you could possibly get work done with. However, it provides only burst CPU. You can only use about 10% of 1 vCPU continuously due to capping. Imagine this as about 1/10th of 1 core on your laptop. It would be incredibly slow. This is not to mention the network and I/O bottleneck you're likely to run into, as you don't get much provisioning with these free instances.

So, no, you really can't use this for anything that is at all CPU intensive. It's for, say, running a low-traffic web service.

On Tue, Feb 24, 2015 at 2:55 PM, Deep Pradhan <pradhandeep1...@gmail.com> wrote:

> Hi,
>
> I have just signed up for Amazon AWS because I learnt that it provides service for free for the first 12 months.
>
> I want to run Spark on an EC2 cluster. Will they charge me for this?
>
> Thank You
Re: Spark on EC2
Thank You Sean.

I was just trying to experiment with the performance of Spark applications with various worker instances (I hope you remember that we discussed the worker instances). I thought it would be a good one to try on EC2.

So, it doesn't work out, does it?

Thank You

On Tue, Feb 24, 2015 at 8:40 PM, Sean Owen <so...@cloudera.com> wrote:

> The free tier includes 750 hours of t2.micro instance time per month. http://aws.amazon.com/free/
>
> That's basically a month of hours, so it's all free if you run only one instance at a time. If you run 4, you'll be able to run your cluster of 4 for about a week free.
>
> A t2.micro has 1GB of memory, which is small but something you could possibly get work done with. However, it provides only burst CPU. You can only use about 10% of 1 vCPU continuously due to capping. Imagine this as about 1/10th of 1 core on your laptop. It would be incredibly slow. This is not to mention the network and I/O bottleneck you're likely to run into, as you don't get much provisioning with these free instances.
>
> So, no, you really can't use this for anything that is at all CPU intensive. It's for, say, running a low-traffic web service.
>
> On Tue, Feb 24, 2015 at 2:55 PM, Deep Pradhan <pradhandeep1...@gmail.com> wrote:
>
>> Hi,
>>
>> I have just signed up for Amazon AWS because I learnt that it provides service for free for the first 12 months.
>>
>> I want to run Spark on an EC2 cluster. Will they charge me for this?
>>
>> Thank You
Re: Spark on EC2
No, I think I am ok with the time it takes. Just that, with the increase in the partitions along with the increase in the number of workers, I want to see the improvement in the performance of an application. I just want to see this happen. Any comments? Thank You On Tue, Feb 24, 2015 at 8:52 PM, Sean Owen so...@cloudera.com wrote: You can definitely, easily, try a 1-node standalone cluster for free. Just don't be surprised when the CPU capping kicks in within about 5 minutes of any non-trivial computation and suddenly the instance is very s-l-o-w. I would consider just paying the ~$0.07/hour to play with an m3.medium, which ought to be pretty OK for basic experimentation. On Tue, Feb 24, 2015 at 3:14 PM, Deep Pradhan pradhandeep1...@gmail.com wrote: Thank You Sean. I was just trying to experiment with the performance of Spark Applications with various worker instances (I hope you remember that we discussed about the worker instances). I thought it would be a good one to try in EC2. So, it doesn't work out, does it? Thank You On Tue, Feb 24, 2015 at 8:40 PM, Sean Owen so...@cloudera.com wrote: The free tier includes 750 hours of t2.micro instance time per month. http://aws.amazon.com/free/ That's basically a month of hours, so it's all free if you run one instance only at a time. If you run 4, you'll be able to run your cluster of 4 for about a week free. A t2.micro has 1GB of memory, which is small but something you could possible get work done with. However it provides only burst CPU. You can only use about 10% of 1 vCPU continuously due to capping. Imagine this as about 1/10th of 1 core on your laptop. It would be incredibly slow. This is not to mention the network and I/O bottleneck you're likely to run into as you don't get much provisioning with these free instances. So, no you really can't use this for anything that is at all CPU intensive. It's for, say, running a low-traffic web service. On Tue, Feb 24, 2015 at 2:55 PM, Deep Pradhan pradhandeep1...@gmail.com wrote: Hi, I have just signed up for Amazon AWS because I learnt that it provides service for free for the first 12 months. I want to run Spark on EC2 cluster. Will they charge me for this? Thank You
Re: Spark on EC2
Hi,

I am sorry that I made a mistake about AWS pricing. You can read Sean Owen's email, which explains the strategies for running Spark on AWS better.

For your question: it means that you just download Spark and unzip it, then run the Spark shell with ./bin/spark-shell or ./bin/pyspark. It is useful for getting familiar with Spark. You can do this on your laptop as well as on EC2.

In fact, running ./ec2/spark-ec2 means launching Spark standalone mode on a cluster; you can find more details here: https://spark.apache.org/docs/latest/spark-standalone.html

Cheers
Gen

On Tue, Feb 24, 2015 at 4:07 PM, Deep Pradhan <pradhandeep1...@gmail.com> wrote:

> Kindly bear with my questions as I am new to this.
>
> "If you run spark on local mode on a ec2 machine" -- What does this mean? Is it that I launch the Spark cluster from my local machine, i.e., by running the shell script that is there in /spark/ec2?
>
> On Tue, Feb 24, 2015 at 8:32 PM, gen tang <gen.tan...@gmail.com> wrote:
>
>> Hi,
>>
>> As a real Spark cluster needs at least one master and one slave, you need to launch two machines. Therefore the second machine is not free. However, if you run Spark in local mode on one EC2 machine, it is free.
>>
>> The charge for AWS depends on how many machines you launch and their types, not on how much you use them.
>>
>> Hope it would help.
>>
>> Cheers
>> Gen
>>
>> On Tue, Feb 24, 2015 at 3:55 PM, Deep Pradhan <pradhandeep1...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I have just signed up for Amazon AWS because I learnt that it provides service for free for the first 12 months.
>>>
>>> I want to run Spark on an EC2 cluster. Will they charge me for this?
>>>
>>> Thank You
Re: Spark on EC2
Thank You Akhil. Will look into it. It's free, isn't it? I am still a student :)

On Tue, Feb 24, 2015 at 9:06 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> If you sign up for Google Compute Cloud, you will get free $300 credits for 3 months, and you can start a pretty good cluster for your testing purposes. :)
>
> Thanks
> Best Regards
>
> On Tue, Feb 24, 2015 at 8:25 PM, Deep Pradhan <pradhandeep1...@gmail.com> wrote:
>
>> Hi,
>>
>> I have just signed up for Amazon AWS because I learnt that it provides service for free for the first 12 months.
>>
>> I want to run Spark on an EC2 cluster. Will they charge me for this?
>>
>> Thank You
Re: Spark on EC2
You can definitely, easily, try a 1-node standalone cluster for free. Just don't be surprised when the CPU capping kicks in within about 5 minutes of any non-trivial computation and suddenly the instance is very s-l-o-w. I would consider just paying the ~$0.07/hour to play with an m3.medium, which ought to be pretty OK for basic experimentation. On Tue, Feb 24, 2015 at 3:14 PM, Deep Pradhan pradhandeep1...@gmail.com wrote: Thank You Sean. I was just trying to experiment with the performance of Spark Applications with various worker instances (I hope you remember that we discussed about the worker instances). I thought it would be a good one to try in EC2. So, it doesn't work out, does it? Thank You On Tue, Feb 24, 2015 at 8:40 PM, Sean Owen so...@cloudera.com wrote: The free tier includes 750 hours of t2.micro instance time per month. http://aws.amazon.com/free/ That's basically a month of hours, so it's all free if you run one instance only at a time. If you run 4, you'll be able to run your cluster of 4 for about a week free. A t2.micro has 1GB of memory, which is small but something you could possible get work done with. However it provides only burst CPU. You can only use about 10% of 1 vCPU continuously due to capping. Imagine this as about 1/10th of 1 core on your laptop. It would be incredibly slow. This is not to mention the network and I/O bottleneck you're likely to run into as you don't get much provisioning with these free instances. So, no you really can't use this for anything that is at all CPU intensive. It's for, say, running a low-traffic web service. On Tue, Feb 24, 2015 at 2:55 PM, Deep Pradhan pradhandeep1...@gmail.com wrote: Hi, I have just signed up for Amazon AWS because I learnt that it provides service for free for the first 12 months. I want to run Spark on EC2 cluster. Will they charge me for this? Thank You - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
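For reference, a sketch of launching (and tearing down) such a minimal paid cluster with the spark-ec2 script; the key name, region, and cluster name are placeholders, and --instance-type / -s are the spark-ec2 flags for instance type and slave count:

    ./spark-ec2 -k my_key -i my_key.pem \
        --region=us-east-1 \
        --instance-type=m3.medium \
        -s 1 \
        launch test-cluster

    # EC2 bills by the (partial) hour, so destroy the cluster as soon as you are done:
    ./spark-ec2 --region=us-east-1 destroy test-cluster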
Re: Spark on EC2
This should help you understand the cost of running a Spark cluster for a short period of time: http://www.ec2instances.info/ If you run an instance for even 1 second of a single hour you are charged for that complete hour. So before you shut down your miniature cluster make sure you really are done with what you want to do, as firing up the cluster again will be like using an extra hour's worth of time. The purpose of EC2's free tier is to get you to purchase into AWS services. At the free level its not terribly useful except for the most simplest of web applications (which you could host on Heroku - also uses AWS - for free) or simple long running but largely dormant shell processes. On Tue Feb 24 2015 at 10:16:56 AM Deep Pradhan pradhandeep1...@gmail.com wrote: Thank You Sean. I was just trying to experiment with the performance of Spark Applications with various worker instances (I hope you remember that we discussed about the worker instances). I thought it would be a good one to try in EC2. So, it doesn't work out, does it? Thank You On Tue, Feb 24, 2015 at 8:40 PM, Sean Owen so...@cloudera.com wrote: The free tier includes 750 hours of t2.micro instance time per month. http://aws.amazon.com/free/ That's basically a month of hours, so it's all free if you run one instance only at a time. If you run 4, you'll be able to run your cluster of 4 for about a week free. A t2.micro has 1GB of memory, which is small but something you could possible get work done with. However it provides only burst CPU. You can only use about 10% of 1 vCPU continuously due to capping. Imagine this as about 1/10th of 1 core on your laptop. It would be incredibly slow. This is not to mention the network and I/O bottleneck you're likely to run into as you don't get much provisioning with these free instances. So, no you really can't use this for anything that is at all CPU intensive. It's for, say, running a low-traffic web service. On Tue, Feb 24, 2015 at 2:55 PM, Deep Pradhan pradhandeep1...@gmail.com wrote: Hi, I have just signed up for Amazon AWS because I learnt that it provides service for free for the first 12 months. I want to run Spark on EC2 cluster. Will they charge me for this? Thank You
Re: Spark on EC2
If you signup for Google Compute Cloud, you will get free $300 credits for 3 months and you can start a pretty good cluster for your testing purposes. :) Thanks Best Regards On Tue, Feb 24, 2015 at 8:25 PM, Deep Pradhan pradhandeep1...@gmail.com wrote: Hi, I have just signed up for Amazon AWS because I learnt that it provides service for free for the first 12 months. I want to run Spark on EC2 cluster. Will they charge me for this? Thank You
Re: Spark on EC2
Yes it is :) Thanks Best Regards On Tue, Feb 24, 2015 at 9:09 PM, Deep Pradhan pradhandeep1...@gmail.com wrote: Thank You Akhil. Will look into it. Its free, isn't it? I am still a student :) On Tue, Feb 24, 2015 at 9:06 PM, Akhil Das ak...@sigmoidanalytics.com wrote: If you signup for Google Compute Cloud, you will get free $300 credits for 3 months and you can start a pretty good cluster for your testing purposes. :) Thanks Best Regards On Tue, Feb 24, 2015 at 8:25 PM, Deep Pradhan pradhandeep1...@gmail.com wrote: Hi, I have just signed up for Amazon AWS because I learnt that it provides service for free for the first 12 months. I want to run Spark on EC2 cluster. Will they charge me for this? Thank You
Re: Spark on EC2
Kindly bear with my questions as I am new to this.

"If you run spark on local mode on a ec2 machine" -- What does this mean? Is it that I launch the Spark cluster from my local machine, i.e., by running the shell script that is there in /spark/ec2?

On Tue, Feb 24, 2015 at 8:32 PM, gen tang <gen.tan...@gmail.com> wrote:

> Hi,
>
> As a real Spark cluster needs at least one master and one slave, you need to launch two machines. Therefore the second machine is not free. However, if you run Spark in local mode on one EC2 machine, it is free.
>
> The charge for AWS depends on how many machines you launch and their types, not on how much you use them.
>
> Hope it would help.
>
> Cheers
> Gen
>
> On Tue, Feb 24, 2015 at 3:55 PM, Deep Pradhan <pradhandeep1...@gmail.com> wrote:
>
>> Hi,
>>
>> I have just signed up for Amazon AWS because I learnt that it provides service for free for the first 12 months.
>>
>> I want to run Spark on an EC2 cluster. Will they charge me for this?
>>
>> Thank You
Re: Spark on EC2
Hi,

As a real Spark cluster needs at least one master and one slave, you need to launch two machines. Therefore the second machine is not free. However, if you run Spark in local mode on one EC2 machine, it is free.

The charge for AWS depends on how many machines you launch and their types, not on how much you use them.

Hope it would help.

Cheers
Gen

On Tue, Feb 24, 2015 at 3:55 PM, Deep Pradhan <pradhandeep1...@gmail.com> wrote:

> Hi,
>
> I have just signed up for Amazon AWS because I learnt that it provides service for free for the first 12 months.
>
> I want to run Spark on an EC2 cluster. Will they charge me for this?
>
> Thank You
Re: Spark on EC2
Thank You All. I think I will look into paying ~$0.7/hr as Sean suggested. On Tue, Feb 24, 2015 at 9:01 PM, gen tang gen.tan...@gmail.com wrote: Hi, I am sorry that I made a mistake about AWS pricing. You can read Sean Owen's email, which better explains the strategies for running spark on AWS. For your question: it means that you just download spark and unzip it, then run the spark shell with ./bin/spark-shell or ./bin/pyspark. It is useful for getting familiar with spark. You can do this on your laptop as well as on ec2. In fact, running ./ec2/spark-ec2 means launching spark in standalone mode on a cluster; you can find more details here: https://spark.apache.org/docs/latest/spark-standalone.html Cheers Gen On Tue, Feb 24, 2015 at 4:07 PM, Deep Pradhan pradhandeep1...@gmail.com wrote: Kindly bear with my questions as I am new to this. Regarding "If you run spark in local mode on an ec2 machine" - what does this mean? Is it that I launch a Spark cluster from my local machine, i.e., by running the shell script that is in /spark/ec2? On Tue, Feb 24, 2015 at 8:32 PM, gen tang gen.tan...@gmail.com wrote: Hi, As a real spark cluster needs at least one master and one slave, you need to launch two machines. Therefore the second machine is not free. However, if you run spark in local mode on a single ec2 machine, it is free. What AWS charges you depends on how many machines you launch and their types, not on how heavily you use them. Hope it helps. Cheers Gen On Tue, Feb 24, 2015 at 3:55 PM, Deep Pradhan pradhandeep1...@gmail.com wrote: Hi, I have just signed up for Amazon AWS because I learnt that it provides service for free for the first 12 months. I want to run Spark on an EC2 cluster. Will they charge me for this? Thank You
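To make Gen's suggestion concrete, here is a minimal local-mode sketch (the release and download URL are only illustrative; substitute whatever version you actually want to try):

    $ wget https://archive.apache.org/dist/spark/spark-1.2.0/spark-1.2.0-bin-hadoop2.4.tgz
    $ tar xzf spark-1.2.0-bin-hadoop2.4.tgz && cd spark-1.2.0-bin-hadoop2.4
    $ ./bin/spark-shell --master local[2]    # Scala shell using 2 local threads, no cluster required
    $ ./bin/pyspark --master local[2]        # or the Python shell

Run this way, everything executes on the single machine you are logged into, so on EC2 the only cost is that one instance's hours (which fall under the free tier on a t2.micro).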
Re: spark on ec2
I don't see anything that says you must explicitly restart them to load the new settings, but usually there is some sort of signal trapped [or brute force full restart] to get a configuration reload for most daemons. I'd take a guess and use the $SPARK_HOME/sbin/{stop,start}-slaves.sh scripts on your master node and see. ( http://spark.apache.org/docs/1.2.0/spark-standalone.html#cluster-launch-scripts ) I just tested this out on my integration EC2 cluster and got odd results for stopping the workers (no workers found) but the start script... seemed to work. My integration cluster was running and functioning after executing both scripts, but I also didn't make any changes to spark-env either. On Thu Feb 05 2015 at 9:49:49 PM Kane Kim kane.ist...@gmail.com wrote: Hi, I'm trying to change setting as described here: http://spark.apache.org/docs/1.2.0/ec2-scripts.html export SPARK_WORKER_CORES=6 Then I ran ~/spark-ec2/copy-dir /root/spark/conf to distribute to slaves, but without any effect. Do I have to restart workers? How to do that with spark-ec2? Thanks.
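For anyone hitting the same question, the whole sequence Charles describes looks roughly like this (a sketch only; the paths assume the usual spark-ec2 layout with Spark installed under /root/spark on the master):

    $ echo 'export SPARK_WORKER_CORES=6' >> /root/spark/conf/spark-env.sh   # change the setting on the master
    $ ~/spark-ec2/copy-dir /root/spark/conf                                 # distribute the conf directory to the slaves
    $ /root/spark/sbin/stop-slaves.sh                                       # stop the workers...
    $ /root/spark/sbin/start-slaves.sh                                      # ...and start them again so they pick up the change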
Re: spark on ec2
Oh yeah, they picked up changes after restart, thanks! On Thu, Feb 5, 2015 at 8:13 PM, Charles Feduke charles.fed...@gmail.com wrote: I don't see anything that says you must explicitly restart them to load the new settings, but usually there is some sort of signal trapped [or brute force full restart] to get a configuration reload for most daemons. I'd take a guess and use the $SPARK_HOME/sbin/{stop,start}-slaves.sh scripts on your master node and see. ( http://spark.apache.org/docs/1.2.0/spark-standalone.html#cluster-launch-scripts ) I just tested this out on my integration EC2 cluster and got odd results for stopping the workers (no workers found) but the start script... seemed to work. My integration cluster was running and functioning after executing both scripts, but I also didn't make any changes to spark-env either. On Thu Feb 05 2015 at 9:49:49 PM Kane Kim kane.ist...@gmail.com wrote: Hi, I'm trying to change setting as described here: http://spark.apache.org/docs/1.2.0/ec2-scripts.html export SPARK_WORKER_CORES=6 Then I ran ~/spark-ec2/copy-dir /root/spark/conf to distribute to slaves, but without any effect. Do I have to restart workers? How to do that with spark-ec2? Thanks.
spark on ec2
Hi, I'm trying to change setting as described here: http://spark.apache.org/docs/1.2.0/ec2-scripts.html export SPARK_WORKER_CORES=6 Then I ran ~/spark-ec2/copy-dir /root/spark/conf to distribute to slaves, but without any effect. Do I have to restart workers? How to do that with spark-ec2? Thanks.
RE: spark 1.2 ec2 launch script hang
We found the problem and already fixed it. Basically, spark-ec2 requires ec2 instances to have external ip addresses. You need to specify this in the AWS console. From: nicholas.cham...@gmail.com Date: Tue, 27 Jan 2015 17:19:21 + Subject: Re: spark 1.2 ec2 launch script hang To: charles.fed...@gmail.com; pzybr...@gmail.com; eyc...@hotmail.com CC: user@spark.apache.org For those who found that absolute vs. relative path for the pem file mattered, what OS and shell are you using? What version of Spark are you using? ~/ vs. absolute path shouldn’t matter. Your shell will expand the ~/ to the absolute path before sending it to spark-ec2. (i.e. tilde expansion.) Absolute vs. relative path (e.g. ../../path/to/pem) also shouldn’t matter, since we fixed that for Spark 1.2.0. Maybe there’s some case that we missed? Nick On Tue Jan 27 2015 at 10:10:29 AM Charles Feduke charles.fed...@gmail.com wrote: Absolute path means no ~ and also verify that you have the path to the file correct. For some reason the Python code does not validate that the file exists and will hang (this is the same reason why ~ hangs). On Mon, Jan 26, 2015 at 10:08 PM Pete Zybrick pzybr...@gmail.com wrote: Try using an absolute path to the pem file On Jan 26, 2015, at 8:57 PM, ey-chih chow eyc...@hotmail.com wrote: Hi, I used the spark-ec2 script of spark 1.2 to launch a cluster. I have modified the script according to https://github.com/grzegorz-dubicki/spark/commit/5dd8458d2ab9753aae939b3bb33be953e2c13a70 But the script was still hung at the following message: Waiting for cluster to enter 'ssh-ready' state. Any additional thing I should do to make it succeed? Thanks. Ey-Chih Chow -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-2-ec2-launch-script-hang-tp21381.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
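As a note for anyone fixing the same thing: if your instances launch into a VPC subnet without public IPs, one way to get the external addresses spark-ec2 needs is to enable auto-assignment of public IPs on that subnet. A hedged sketch using the AWS CLI (the subnet ID is just a placeholder); the equivalent setting in the console is the "Auto-assign Public IP" option:

    $ aws ec2 modify-subnet-attribute --subnet-id subnet-0123abcd --map-public-ip-on-launch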
Re: spark 1.2 ec2 launch script hang
Hmm, I can’t see why using ~ would be problematic, especially if you confirm that echo ~/path/to/pem expands to the correct path to your identity file. If you have a simple reproduction of the problem, please send it over. I’d love to look into this. When I pass paths with ~ to spark-ec2 on my system, it works fine. I’m using bash, but zsh handles tilde expansion the same as bash. Nick On Wed Jan 28 2015 at 3:30:08 PM Charles Feduke charles.fed...@gmail.com wrote: It was only hanging when I specified the path with ~ I never tried relative. Hanging on the waiting for ssh to be ready on all hosts. I let it sit for about 10 minutes then I found the StackOverflow answer that suggested specifying an absolute path, cancelled, and re-run with --resume and the absolute path and all slaves were up in a couple minutes. (I've stood up 4 integration clusters and 2 production clusters on EC2 since with no problems.) On Wed Jan 28 2015 at 12:05:43 PM Nicholas Chammas nicholas.cham...@gmail.com wrote: Ey-chih, That makes more sense. This is a known issue that will be fixed as part of SPARK-5242 https://issues.apache.org/jira/browse/SPARK-5242. Charles, Thanks for the info. In your case, when does spark-ec2 hang? Only when the specified path to the identity file doesn't exist? Or also when you specify the path as a relative path or with ~? Nick On Wed Jan 28 2015 at 9:29:34 AM ey-chih chow eyc...@hotmail.com wrote: We found the problem and already fixed it. Basically, spark-ec2 requires ec2 instances to have external ip addresses. You need to specify this in the ASW console. -- From: nicholas.cham...@gmail.com Date: Tue, 27 Jan 2015 17:19:21 + Subject: Re: spark 1.2 ec2 launch script hang To: charles.fed...@gmail.com; pzybr...@gmail.com; eyc...@hotmail.com CC: user@spark.apache.org For those who found that absolute vs. relative path for the pem file mattered, what OS and shell are you using? What version of Spark are you using? ~/ vs. absolute path shouldn’t matter. Your shell will expand the ~/ to the absolute path before sending it to spark-ec2. (i.e. tilde expansion.) Absolute vs. relative path (e.g. ../../path/to/pem) also shouldn’t matter, since we fixed that for Spark 1.2.0 https://issues.apache.org/jira/browse/SPARK-4137. Maybe there’s some case that we missed? Nick On Tue Jan 27 2015 at 10:10:29 AM Charles Feduke charles.fed...@gmail.com wrote: Absolute path means no ~ and also verify that you have the path to the file correct. For some reason the Python code does not validate that the file exists and will hang (this is the same reason why ~ hangs). On Mon, Jan 26, 2015 at 10:08 PM Pete Zybrick pzybr...@gmail.com wrote: Try using an absolute path to the pem file On Jan 26, 2015, at 8:57 PM, ey-chih chow eyc...@hotmail.com wrote: Hi, I used the spark-ec2 script of spark 1.2 to launch a cluster. I have modified the script according to https://github.com/grzegorz-dubicki/spark/commit/5dd8458d2ab 9753aae939b3bb33be953e2c13a70 But the script was still hung at the following message: Waiting for cluster to enter 'ssh-ready' state. Any additional thing I should do to make it succeed? Thanks. Ey-Chih Chow -- View this message in context: http://apache-spark-user-list. 1001560.n3.nabble.com/spark-1-2-ec2-launch-script-hang-tp21381.html Sent from the Apache Spark User List mailing list archive at Nabble.com. 
- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
Re: spark 1.2 ec2 launch script hang
Yeah, I agree ~ should work. And it could have been [read: probably was] the fact that one of the EC2 hosts was in my known_hosts (don't know, never saw an error message, but the behavior is no error message for that state), which I had fixed later with Pete's patch. But the second execution when things worked with an absolute path could have worked because the random hosts that came up on EC2 were never in my known_hosts. On Wed Jan 28 2015 at 3:45:36 PM Nicholas Chammas nicholas.cham...@gmail.com wrote: Hmm, I can’t see why using ~ would be problematic, especially if you confirm that echo ~/path/to/pem expands to the correct path to your identity file. If you have a simple reproduction of the problem, please send it over. I’d love to look into this. When I pass paths with ~ to spark-ec2 on my system, it works fine. I’m using bash, but zsh handles tilde expansion the same as bash. Nick On Wed Jan 28 2015 at 3:30:08 PM Charles Feduke charles.fed...@gmail.com wrote: It was only hanging when I specified the path with ~ I never tried relative. Hanging on the waiting for ssh to be ready on all hosts. I let it sit for about 10 minutes then I found the StackOverflow answer that suggested specifying an absolute path, cancelled, and re-run with --resume and the absolute path and all slaves were up in a couple minutes. (I've stood up 4 integration clusters and 2 production clusters on EC2 since with no problems.) On Wed Jan 28 2015 at 12:05:43 PM Nicholas Chammas nicholas.cham...@gmail.com wrote: Ey-chih, That makes more sense. This is a known issue that will be fixed as part of SPARK-5242 https://issues.apache.org/jira/browse/SPARK-5242. Charles, Thanks for the info. In your case, when does spark-ec2 hang? Only when the specified path to the identity file doesn't exist? Or also when you specify the path as a relative path or with ~? Nick On Wed Jan 28 2015 at 9:29:34 AM ey-chih chow eyc...@hotmail.com wrote: We found the problem and already fixed it. Basically, spark-ec2 requires ec2 instances to have external ip addresses. You need to specify this in the ASW console. -- From: nicholas.cham...@gmail.com Date: Tue, 27 Jan 2015 17:19:21 + Subject: Re: spark 1.2 ec2 launch script hang To: charles.fed...@gmail.com; pzybr...@gmail.com; eyc...@hotmail.com CC: user@spark.apache.org For those who found that absolute vs. relative path for the pem file mattered, what OS and shell are you using? What version of Spark are you using? ~/ vs. absolute path shouldn’t matter. Your shell will expand the ~/ to the absolute path before sending it to spark-ec2. (i.e. tilde expansion.) Absolute vs. relative path (e.g. ../../path/to/pem) also shouldn’t matter, since we fixed that for Spark 1.2.0 https://issues.apache.org/jira/browse/SPARK-4137. Maybe there’s some case that we missed? Nick On Tue Jan 27 2015 at 10:10:29 AM Charles Feduke charles.fed...@gmail.com wrote: Absolute path means no ~ and also verify that you have the path to the file correct. For some reason the Python code does not validate that the file exists and will hang (this is the same reason why ~ hangs). On Mon, Jan 26, 2015 at 10:08 PM Pete Zybrick pzybr...@gmail.com wrote: Try using an absolute path to the pem file On Jan 26, 2015, at 8:57 PM, ey-chih chow eyc...@hotmail.com wrote: Hi, I used the spark-ec2 script of spark 1.2 to launch a cluster. 
I have modified the script according to https://github.com/grzegorz-dubicki/spark/commit/5dd8458d2ab 9753aae939b3bb33be953e2c13a70 But the script was still hung at the following message: Waiting for cluster to enter 'ssh-ready' state. Any additional thing I should do to make it succeed? Thanks. Ey-Chih Chow -- View this message in context: http://apache-spark-user-list. 1001560.n3.nabble.com/spark-1-2-ec2-launch-script-hang-tp21381.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
Re: spark 1.2 ec2 launch script hang
It was only hanging when I specified the path with ~ I never tried relative. Hanging on the waiting for ssh to be ready on all hosts. I let it sit for about 10 minutes then I found the StackOverflow answer that suggested specifying an absolute path, cancelled, and re-run with --resume and the absolute path and all slaves were up in a couple minutes. (I've stood up 4 integration clusters and 2 production clusters on EC2 since with no problems.) On Wed Jan 28 2015 at 12:05:43 PM Nicholas Chammas nicholas.cham...@gmail.com wrote: Ey-chih, That makes more sense. This is a known issue that will be fixed as part of SPARK-5242 https://issues.apache.org/jira/browse/SPARK-5242. Charles, Thanks for the info. In your case, when does spark-ec2 hang? Only when the specified path to the identity file doesn't exist? Or also when you specify the path as a relative path or with ~? Nick On Wed Jan 28 2015 at 9:29:34 AM ey-chih chow eyc...@hotmail.com wrote: We found the problem and already fixed it. Basically, spark-ec2 requires ec2 instances to have external ip addresses. You need to specify this in the ASW console. -- From: nicholas.cham...@gmail.com Date: Tue, 27 Jan 2015 17:19:21 + Subject: Re: spark 1.2 ec2 launch script hang To: charles.fed...@gmail.com; pzybr...@gmail.com; eyc...@hotmail.com CC: user@spark.apache.org For those who found that absolute vs. relative path for the pem file mattered, what OS and shell are you using? What version of Spark are you using? ~/ vs. absolute path shouldn’t matter. Your shell will expand the ~/ to the absolute path before sending it to spark-ec2. (i.e. tilde expansion.) Absolute vs. relative path (e.g. ../../path/to/pem) also shouldn’t matter, since we fixed that for Spark 1.2.0 https://issues.apache.org/jira/browse/SPARK-4137. Maybe there’s some case that we missed? Nick On Tue Jan 27 2015 at 10:10:29 AM Charles Feduke charles.fed...@gmail.com wrote: Absolute path means no ~ and also verify that you have the path to the file correct. For some reason the Python code does not validate that the file exists and will hang (this is the same reason why ~ hangs). On Mon, Jan 26, 2015 at 10:08 PM Pete Zybrick pzybr...@gmail.com wrote: Try using an absolute path to the pem file On Jan 26, 2015, at 8:57 PM, ey-chih chow eyc...@hotmail.com wrote: Hi, I used the spark-ec2 script of spark 1.2 to launch a cluster. I have modified the script according to https://github.com/grzegorz-dubicki/spark/commit/5dd8458d2ab 9753aae939b3bb33be953e2c13a70 But the script was still hung at the following message: Waiting for cluster to enter 'ssh-ready' state. Any additional thing I should do to make it succeed? Thanks. Ey-Chih Chow -- View this message in context: http://apache-spark-user-list. 1001560.n3.nabble.com/spark-1-2-ec2-launch-script-hang-tp21381.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
Re: spark 1.2 ec2 launch script hang
Thanks for sending this over, Peter. What if you try this? (i.e. Remove the = after --identity-file.) ec2/spark-ec2 --key-pair=spark-streaming-kp --identity-file ~/.pzkeys/spark-streaming-kp.pem --region=us-east-1 login pz-spark-cluster If that works, then I think the problem in this case is simply that Bash cannot expand the tilde because it’s stuck to the --identity-file=. This isn’t a problem with spark-ec2. Bash sees the --identity-file=~/.pzkeys/spark-streaming-kp.pem as one big argument, so it can’t do tilde expansion. Nick On Wed Jan 28 2015 at 9:17:06 PM Peter Zybrick pzybr...@gmail.com wrote: Below is trace from trying to access with ~/path. I also did the echo as per Nick (see the last line), looks ok to me. This is my development box with Spark 1.2.0 running CentOS 6.5, Python 2.6.6 [pete.zybrick@pz-lt2-ipc spark-1.2.0]$ ec2/spark-ec2 --key-pair=spark-streaming-kp --identity-file=~/.pzkeys/spark-streaming-kp.pem --region=us-east-1 login pz-spark-cluster Searching for existing cluster pz-spark-cluster... Found 1 master(s), 3 slaves Logging into master ec2-54-152-95-129.compute-1.amazonaws.com... Warning: Identity file ~/.pzkeys/spark-streaming-kp.pem not accessible: No such file or directory. Permission denied (publickey). Traceback (most recent call last): File ec2/spark_ec2.py, line 1082, in module main() File ec2/spark_ec2.py, line 1074, in main real_main() File ec2/spark_ec2.py, line 1007, in real_main ssh_command(opts) + proxy_opt + ['-t', '-t', %s@%s % (opts.user, master)]) File /usr/lib64/python2.6/subprocess.py, line 505, in check_call raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '['ssh', '-o', 'StrictHostKeyChecking=no', '-i', '~/.pzkeys/spark-streaming-kp.pem', '-t', '-t', u'r...@ec2-54-152-95-129.compute-1.amazonaws.com']' returned non-zero exit status 255 [pete.zybrick@pz-lt2-ipc spark-1.2.0]$ echo ~/.pzkeys/spark-streaming-kp.pem /home/pete.zybrick/.pzkeys/spark-streaming-kp.pem On Wed, Jan 28, 2015 at 3:49 PM, Charles Feduke charles.fed...@gmail.com wrote: Yeah, I agree ~ should work. And it could have been [read: probably was] the fact that one of the EC2 hosts was in my known_hosts (don't know, never saw an error message, but the behavior is no error message for that state), which I had fixed later with Pete's patch. But the second execution when things worked with an absolute path could have worked because the random hosts that came up on EC2 were never in my known_hosts. On Wed Jan 28 2015 at 3:45:36 PM Nicholas Chammas nicholas.cham...@gmail.com wrote: Hmm, I can’t see why using ~ would be problematic, especially if you confirm that echo ~/path/to/pem expands to the correct path to your identity file. If you have a simple reproduction of the problem, please send it over. I’d love to look into this. When I pass paths with ~ to spark-ec2 on my system, it works fine. I’m using bash, but zsh handles tilde expansion the same as bash. Nick On Wed Jan 28 2015 at 3:30:08 PM Charles Feduke charles.fed...@gmail.com wrote: It was only hanging when I specified the path with ~ I never tried relative. Hanging on the waiting for ssh to be ready on all hosts. I let it sit for about 10 minutes then I found the StackOverflow answer that suggested specifying an absolute path, cancelled, and re-run with --resume and the absolute path and all slaves were up in a couple minutes. (I've stood up 4 integration clusters and 2 production clusters on EC2 since with no problems.) 
On Wed Jan 28 2015 at 12:05:43 PM Nicholas Chammas nicholas.cham...@gmail.com wrote: Ey-chih, That makes more sense. This is a known issue that will be fixed as part of SPARK-5242 https://issues.apache.org/jira/browse/SPARK-5242. Charles, Thanks for the info. In your case, when does spark-ec2 hang? Only when the specified path to the identity file doesn't exist? Or also when you specify the path as a relative path or with ~? Nick On Wed Jan 28 2015 at 9:29:34 AM ey-chih chow eyc...@hotmail.com wrote: We found the problem and already fixed it. Basically, spark-ec2 requires ec2 instances to have external ip addresses. You need to specify this in the ASW console. -- From: nicholas.cham...@gmail.com Date: Tue, 27 Jan 2015 17:19:21 + Subject: Re: spark 1.2 ec2 launch script hang To: charles.fed...@gmail.com; pzybr...@gmail.com; eyc...@hotmail.com CC: user@spark.apache.org For those who found that absolute vs. relative path for the pem file mattered, what OS and shell are you using? What version of Spark are you using? ~/ vs. absolute path shouldn’t matter. Your shell will expand the ~/ to the absolute path before sending it to spark-ec2. (i.e. tilde expansion.) Absolute vs. relative path (e.g. ../../path/to/pem) also shouldn’t matter, since we fixed that for Spark 1.2.0 https://issues.apache.org/jira/browse/SPARK
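Nick's point above about the = blocking tilde expansion is easy to verify directly in a bash shell; a quick sketch (the expanded home directory shown is from Pete's machine and will of course differ on yours):

    $ echo --identity-file=~/.pzkeys/spark-streaming-kp.pem
    --identity-file=~/.pzkeys/spark-streaming-kp.pem                      # tilde left as a literal ~
    $ echo --identity-file ~/.pzkeys/spark-streaming-kp.pem
    --identity-file /home/pete.zybrick/.pzkeys/spark-streaming-kp.pem     # tilde expanded by the shell

Bash only performs tilde expansion at the start of a word (or after the = of something that looks like a variable assignment), and --identity-file=~/... is neither, so spark-ec2 and ultimately ssh receive a literal ~ that never gets resolved.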
Re: spark 1.2 ec2 launch script hang
If that was indeed the problem, I suggest updating your answer on SO http://stackoverflow.com/a/28005151/877069 to help others who may run into this same problem. On Wed Jan 28 2015 at 9:40:39 PM Nicholas Chammas nicholas.cham...@gmail.com wrote: Thanks for sending this over, Peter. What if you try this? (i.e. Remove the = after --identity-file.) ec2/spark-ec2 --key-pair=spark-streaming-kp --identity-file ~/.pzkeys/spark-streaming-kp.pem --region=us-east-1 login pz-spark-cluster If that works, then I think the problem in this case is simply that Bash cannot expand the tilde because it’s stuck to the --identity-file=. This isn’t a problem with spark-ec2. Bash sees the --identity-file=~/.pzkeys/spark-streaming-kp.pem as one big argument, so it can’t do tilde expansion. Nick On Wed Jan 28 2015 at 9:17:06 PM Peter Zybrick pzybr...@gmail.com wrote: Below is trace from trying to access with ~/path. I also did the echo as per Nick (see the last line), looks ok to me. This is my development box with Spark 1.2.0 running CentOS 6.5, Python 2.6.6 [pete.zybrick@pz-lt2-ipc spark-1.2.0]$ ec2/spark-ec2 --key-pair=spark-streaming-kp --identity-file=~/.pzkeys/spark-streaming-kp.pem --region=us-east-1 login pz-spark-cluster Searching for existing cluster pz-spark-cluster... Found 1 master(s), 3 slaves Logging into master ec2-54-152-95-129.compute-1.amazonaws.com... Warning: Identity file ~/.pzkeys/spark-streaming-kp.pem not accessible: No such file or directory. Permission denied (publickey). Traceback (most recent call last): File ec2/spark_ec2.py, line 1082, in module main() File ec2/spark_ec2.py, line 1074, in main real_main() File ec2/spark_ec2.py, line 1007, in real_main ssh_command(opts) + proxy_opt + ['-t', '-t', %s@%s % (opts.user, master)]) File /usr/lib64/python2.6/subprocess.py, line 505, in check_call raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '['ssh', '-o', 'StrictHostKeyChecking=no', '-i', '~/.pzkeys/spark-streaming-kp.pem', '-t', '-t', u'r...@ec2-54-152-95-129.compute-1.amazonaws.com']' returned non-zero exit status 255 [pete.zybrick@pz-lt2-ipc spark-1.2.0]$ echo ~/.pzkeys/spark-streaming-kp. pem /home/pete.zybrick/.pzkeys/spark-streaming-kp.pem On Wed, Jan 28, 2015 at 3:49 PM, Charles Feduke charles.fed...@gmail.com wrote: Yeah, I agree ~ should work. And it could have been [read: probably was] the fact that one of the EC2 hosts was in my known_hosts (don't know, never saw an error message, but the behavior is no error message for that state), which I had fixed later with Pete's patch. But the second execution when things worked with an absolute path could have worked because the random hosts that came up on EC2 were never in my known_hosts. On Wed Jan 28 2015 at 3:45:36 PM Nicholas Chammas nicholas.cham...@gmail.com wrote: Hmm, I can’t see why using ~ would be problematic, especially if you confirm that echo ~/path/to/pem expands to the correct path to your identity file. If you have a simple reproduction of the problem, please send it over. I’d love to look into this. When I pass paths with ~ to spark-ec2 on my system, it works fine. I’m using bash, but zsh handles tilde expansion the same as bash. Nick On Wed Jan 28 2015 at 3:30:08 PM Charles Feduke charles.fed...@gmail.com wrote: It was only hanging when I specified the path with ~ I never tried relative. Hanging on the waiting for ssh to be ready on all hosts. 
I let it sit for about 10 minutes then I found the StackOverflow answer that suggested specifying an absolute path, cancelled, and re-run with --resume and the absolute path and all slaves were up in a couple minutes. (I've stood up 4 integration clusters and 2 production clusters on EC2 since with no problems.) On Wed Jan 28 2015 at 12:05:43 PM Nicholas Chammas nicholas.cham...@gmail.com wrote: Ey-chih, That makes more sense. This is a known issue that will be fixed as part of SPARK-5242 https://issues.apache.org/jira/browse/SPARK-5242 . Charles, Thanks for the info. In your case, when does spark-ec2 hang? Only when the specified path to the identity file doesn't exist? Or also when you specify the path as a relative path or with ~? Nick On Wed Jan 28 2015 at 9:29:34 AM ey-chih chow eyc...@hotmail.com wrote: We found the problem and already fixed it. Basically, spark-ec2 requires ec2 instances to have external ip addresses. You need to specify this in the ASW console. -- From: nicholas.cham...@gmail.com Date: Tue, 27 Jan 2015 17:19:21 + Subject: Re: spark 1.2 ec2 launch script hang To: charles.fed...@gmail.com; pzybr...@gmail.com; eyc...@hotmail.com CC: user@spark.apache.org For those who found that absolute vs. relative path for the pem file mattered, what OS and shell are you using? What version of Spark are you using? ~/ vs. absolute path shouldn’t matter
Re: spark 1.2 ec2 launch script hang
Below is trace from trying to access with ~/path. I also did the echo as per Nick (see the last line), looks ok to me. This is my development box with Spark 1.2.0 running CentOS 6.5, Python 2.6.6 [pete.zybrick@pz-lt2-ipc spark-1.2.0]$ ec2/spark-ec2 --key-pair=spark-streaming-kp --identity-file=~/.pzkeys/spark-streaming-kp.pem --region=us-east-1 login pz-spark-cluster Searching for existing cluster pz-spark-cluster... Found 1 master(s), 3 slaves Logging into master ec2-54-152-95-129.compute-1.amazonaws.com... Warning: Identity file ~/.pzkeys/spark-streaming-kp.pem not accessible: No such file or directory. Permission denied (publickey). Traceback (most recent call last): File ec2/spark_ec2.py, line 1082, in module main() File ec2/spark_ec2.py, line 1074, in main real_main() File ec2/spark_ec2.py, line 1007, in real_main ssh_command(opts) + proxy_opt + ['-t', '-t', %s@%s % (opts.user, master)]) File /usr/lib64/python2.6/subprocess.py, line 505, in check_call raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '['ssh', '-o', 'StrictHostKeyChecking=no', '-i', '~/.pzkeys/spark-streaming-kp.pem', '-t', '-t', u'r...@ec2-54-152-95-129.compute-1.amazonaws.com']' returned non-zero exit status 255 [pete.zybrick@pz-lt2-ipc spark-1.2.0]$ echo ~/.pzkeys/spark-streaming-kp.pem /home/pete.zybrick/.pzkeys/spark-streaming-kp.pem On Wed, Jan 28, 2015 at 3:49 PM, Charles Feduke charles.fed...@gmail.com wrote: Yeah, I agree ~ should work. And it could have been [read: probably was] the fact that one of the EC2 hosts was in my known_hosts (don't know, never saw an error message, but the behavior is no error message for that state), which I had fixed later with Pete's patch. But the second execution when things worked with an absolute path could have worked because the random hosts that came up on EC2 were never in my known_hosts. On Wed Jan 28 2015 at 3:45:36 PM Nicholas Chammas nicholas.cham...@gmail.com wrote: Hmm, I can’t see why using ~ would be problematic, especially if you confirm that echo ~/path/to/pem expands to the correct path to your identity file. If you have a simple reproduction of the problem, please send it over. I’d love to look into this. When I pass paths with ~ to spark-ec2 on my system, it works fine. I’m using bash, but zsh handles tilde expansion the same as bash. Nick On Wed Jan 28 2015 at 3:30:08 PM Charles Feduke charles.fed...@gmail.com wrote: It was only hanging when I specified the path with ~ I never tried relative. Hanging on the waiting for ssh to be ready on all hosts. I let it sit for about 10 minutes then I found the StackOverflow answer that suggested specifying an absolute path, cancelled, and re-run with --resume and the absolute path and all slaves were up in a couple minutes. (I've stood up 4 integration clusters and 2 production clusters on EC2 since with no problems.) On Wed Jan 28 2015 at 12:05:43 PM Nicholas Chammas nicholas.cham...@gmail.com wrote: Ey-chih, That makes more sense. This is a known issue that will be fixed as part of SPARK-5242 https://issues.apache.org/jira/browse/SPARK-5242. Charles, Thanks for the info. In your case, when does spark-ec2 hang? Only when the specified path to the identity file doesn't exist? Or also when you specify the path as a relative path or with ~? Nick On Wed Jan 28 2015 at 9:29:34 AM ey-chih chow eyc...@hotmail.com wrote: We found the problem and already fixed it. Basically, spark-ec2 requires ec2 instances to have external ip addresses. You need to specify this in the ASW console. 
-- From: nicholas.cham...@gmail.com Date: Tue, 27 Jan 2015 17:19:21 + Subject: Re: spark 1.2 ec2 launch script hang To: charles.fed...@gmail.com; pzybr...@gmail.com; eyc...@hotmail.com CC: user@spark.apache.org For those who found that absolute vs. relative path for the pem file mattered, what OS and shell are you using? What version of Spark are you using? ~/ vs. absolute path shouldn’t matter. Your shell will expand the ~/ to the absolute path before sending it to spark-ec2. (i.e. tilde expansion.) Absolute vs. relative path (e.g. ../../path/to/pem) also shouldn’t matter, since we fixed that for Spark 1.2.0 https://issues.apache.org/jira/browse/SPARK-4137. Maybe there’s some case that we missed? Nick On Tue Jan 27 2015 at 10:10:29 AM Charles Feduke charles.fed...@gmail.com wrote: Absolute path means no ~ and also verify that you have the path to the file correct. For some reason the Python code does not validate that the file exists and will hang (this is the same reason why ~ hangs). On Mon, Jan 26, 2015 at 10:08 PM Pete Zybrick pzybr...@gmail.com wrote: Try using an absolute path to the pem file On Jan 26, 2015, at 8:57 PM, ey-chih chow eyc...@hotmail.com wrote: Hi, I used the spark-ec2 script of spark 1.2 to launch a cluster. I have modified the script
Re: spark 1.2 ec2 launch script hang
Absolute path means no ~ and also verify that you have the path to the file correct. For some reason the Python code does not validate that the file exists and will hang (this is the same reason why ~ hangs). On Mon, Jan 26, 2015 at 10:08 PM Pete Zybrick pzybr...@gmail.com wrote: Try using an absolute path to the pem file On Jan 26, 2015, at 8:57 PM, ey-chih chow eyc...@hotmail.com wrote: Hi, I used the spark-ec2 script of spark 1.2 to launch a cluster. I have modified the script according to https://github.com/grzegorz-dubicki/spark/commit/ 5dd8458d2ab9753aae939b3bb33be953e2c13a70 But the script was still hung at the following message: Waiting for cluster to enter 'ssh-ready' state. Any additional thing I should do to make it succeed? Thanks. Ey-Chih Chow -- View this message in context: http://apache-spark-user-list. 1001560.n3.nabble.com/spark-1-2-ec2-launch-script-hang-tp21381.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
Re: spark 1.2 ec2 launch script hang
For those who found that absolute vs. relative path for the pem file mattered, what OS and shell are you using? What version of Spark are you using? ~/ vs. absolute path shouldn’t matter. Your shell will expand the ~/ to the absolute path before sending it to spark-ec2. (i.e. tilde expansion.) Absolute vs. relative path (e.g. ../../path/to/pem) also shouldn’t matter, since we fixed that for Spark 1.2.0 https://issues.apache.org/jira/browse/SPARK-4137. Maybe there’s some case that we missed? Nick On Tue Jan 27 2015 at 10:10:29 AM Charles Feduke charles.fed...@gmail.com wrote: Absolute path means no ~ and also verify that you have the path to the file correct. For some reason the Python code does not validate that the file exists and will hang (this is the same reason why ~ hangs). On Mon, Jan 26, 2015 at 10:08 PM Pete Zybrick pzybr...@gmail.com wrote: Try using an absolute path to the pem file On Jan 26, 2015, at 8:57 PM, ey-chih chow eyc...@hotmail.com wrote: Hi, I used the spark-ec2 script of spark 1.2 to launch a cluster. I have modified the script according to https://github.com/grzegorz-dubicki/spark/commit/5dd8458d2ab 9753aae939b3bb33be953e2c13a70 But the script was still hung at the following message: Waiting for cluster to enter 'ssh-ready' state. Any additional thing I should do to make it succeed? Thanks. Ey-Chih Chow -- View this message in context: http://apache-spark-user-list. 1001560.n3.nabble.com/spark-1-2-ec2-launch-script-hang-tp21381.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
Re: spark 1.2 ec2 launch script hang
Try using an absolute path to the pem file On Jan 26, 2015, at 8:57 PM, ey-chih chow eyc...@hotmail.com wrote: Hi, I used the spark-ec2 script of spark 1.2 to launch a cluster. I have modified the script according to https://github.com/grzegorz-dubicki/spark/commit/5dd8458d2ab9753aae939b3bb33be953e2c13a70 But the script was still hung at the following message: Waiting for cluster to enter 'ssh-ready' state. Any additional thing I should do to make it succeed? Thanks. Ey-Chih Chow -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-2-ec2-launch-script-hang-tp21381.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
spark 1.2 ec2 launch script hang
Hi, I used the spark-ec2 script of spark 1.2 to launch a cluster. I have modified the script according to https://github.com/grzegorz-dubicki/spark/commit/5dd8458d2ab9753aae939b3bb33be953e2c13a70 But the script was still hung at the following message: Waiting for cluster to enter 'ssh-ready' state. Any additional thing I should do to make it succeed? Thanks. Ey-Chih Chow -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-2-ec2-launch-script-hang-tp21381.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
Re: Cluster hangs in 'ssh-ready' state using Spark 1.2 EC2 launch script
Nathan, I posted a bunch of questions for you as a comment on your question http://stackoverflow.com/q/28002443/877069 on Stack Overflow. If you answer them (don't forget to @ping me) I may be able to help you. Nick On Sat Jan 17 2015 at 3:49:54 PM gen tang gen.tan...@gmail.com wrote: Hi, This is because ssh-ready is the ec2 scripy means that all the instances are in the status of running and all the instances in the status of OK, In another word, the instances is ready to download and to install software, just as emr is ready for bootstrap actions. Before, the script just repeatedly prints the information showing that we are waiting for every instance being launched.And it is quite ugly, so they change the information to print However, you can use ssh to connect the instance even if it is in the status of pending. If you wait patiently a little more,, the script will finish the launch of cluster. Cheers Gen On Sat, Jan 17, 2015 at 7:00 PM, Nathan Murthy nathan.mur...@gmail.com wrote: Originally posted here: http://stackoverflow.com/questions/28002443/cluster-hangs-in-ssh-ready-state-using-spark-1-2-ec2-launch-script I'm trying to launch a standalone Spark cluster using its pre-packaged EC2 scripts, but it just indefinitely hangs in an 'ssh-ready' state: ubuntu@machine:~/spark-1.2.0-bin-hadoop2.4$ ./ec2/spark-ec2 -k key-pair -i identity-file.pem -r us-west-2 -s 3 launch test Setting up security groups... Searching for existing cluster test... Spark AMI: ami-ae6e0d9e Launching instances... Launched 3 slaves in us-west-2c, regid = r-b___6 Launched master in us-west-2c, regid = r-0__0 Waiting for all instances in cluster to enter 'ssh-ready' state.. Yet I can SSH into these instances without compaint: ubuntu@machine:~$ ssh -i identity-file.pem root@master-ip Last login: Day MMM DD HH:mm:ss 20YY from c-AA-BBB--DDD.eee1.ff.provider.net __| __|_ ) _| ( / Amazon Linux AMI ___|\___|___| https://aws.amazon.com/amazon-linux-ami/2013.03-release-notes/ There are 59 security update(s) out of 257 total update(s) available Run sudo yum update to apply all updates. Amazon Linux version 2014.09 is available. root@ip-internal ~]$ I'm trying to figure out if this is a problem in AWS or with the Spark scripts. I've never had this issue before until recently. -- Nathan Murthy // 713.884.7110 (mobile) // @natemurthy
Cluster hangs in 'ssh-ready' state using Spark 1.2 EC2 launch script
Originally posted here: http://stackoverflow.com/questions/28002443/cluster-hangs-in-ssh-ready-state-using-spark-1-2-ec2-launch-script I'm trying to launch a standalone Spark cluster using its pre-packaged EC2 scripts, but it just indefinitely hangs in an 'ssh-ready' state: ubuntu@machine:~/spark-1.2.0-bin-hadoop2.4$ ./ec2/spark-ec2 -k key-pair -i identity-file.pem -r us-west-2 -s 3 launch test Setting up security groups... Searching for existing cluster test... Spark AMI: ami-ae6e0d9e Launching instances... Launched 3 slaves in us-west-2c, regid = r-b___6 Launched master in us-west-2c, regid = r-0__0 Waiting for all instances in cluster to enter 'ssh-ready' state.. Yet I can SSH into these instances without complaint: ubuntu@machine:~$ ssh -i identity-file.pem root@master-ip Last login: Day MMM DD HH:mm:ss 20YY from c-AA-BBB--DDD.eee1.ff.provider.net __| __|_ ) _| ( / Amazon Linux AMI ___|\___|___| https://aws.amazon.com/amazon-linux-ami/2013.03-release-notes/ There are 59 security update(s) out of 257 total update(s) available Run sudo yum update to apply all updates. Amazon Linux version 2014.09 is available. root@ip-internal ~]$ I'm trying to figure out if this is a problem in AWS or with the Spark scripts. I've never had this issue until recently. -- Nathan Murthy // 713.884.7110 (mobile) // @natemurthy
Re: Cluster hangs in 'ssh-ready' state using Spark 1.2 EC2 launch script
Hi, This is because 'ssh-ready' in the ec2 script means that all the instances are in the 'running' state and all of their status checks are OK. In other words, the instances are ready to download and install software, just as EMR is ready for its bootstrap actions. Previously, the script just repeatedly printed a message saying it was waiting for every instance to be launched, which was quite ugly, so the message that gets printed was changed. However, you can use ssh to connect to an instance even while it is still in the 'pending' state. If you wait patiently a little longer, the script will finish launching the cluster. Cheers Gen On Sat, Jan 17, 2015 at 7:00 PM, Nathan Murthy nathan.mur...@gmail.com wrote: Originally posted here: http://stackoverflow.com/questions/28002443/cluster-hangs-in-ssh-ready-state-using-spark-1-2-ec2-launch-script I'm trying to launch a standalone Spark cluster using its pre-packaged EC2 scripts, but it just indefinitely hangs in an 'ssh-ready' state: ubuntu@machine:~/spark-1.2.0-bin-hadoop2.4$ ./ec2/spark-ec2 -k key-pair -i identity-file.pem -r us-west-2 -s 3 launch test Setting up security groups... Searching for existing cluster test... Spark AMI: ami-ae6e0d9e Launching instances... Launched 3 slaves in us-west-2c, regid = r-b___6 Launched master in us-west-2c, regid = r-0__0 Waiting for all instances in cluster to enter 'ssh-ready' state.. Yet I can SSH into these instances without complaint: ubuntu@machine:~$ ssh -i identity-file.pem root@master-ip Last login: Day MMM DD HH:mm:ss 20YY from c-AA-BBB--DDD.eee1.ff.provider.net __| __|_ ) _| ( / Amazon Linux AMI ___|\___|___| https://aws.amazon.com/amazon-linux-ami/2013.03-release-notes/ There are 59 security update(s) out of 257 total update(s) available Run sudo yum update to apply all updates. Amazon Linux version 2014.09 is available. root@ip-internal ~]$ I'm trying to figure out if this is a problem in AWS or with the Spark scripts. I've never had this issue until recently. -- Nathan Murthy // 713.884.7110 (mobile) // @natemurthy
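If the wait drags on, one way to watch the same information Gen describes (instance state plus status checks) from outside the script is the AWS CLI; a sketch, with a placeholder instance ID:

    $ aws ec2 describe-instance-status --instance-ids i-0123456789abcdef0

Per Gen's explanation above, once the instances report 'running' with passing status checks the script should move on shortly afterwards, even though manual SSH may already work before that point.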
Spark 1.2.0 ec2 launch script hadoop native libraries not found warning
Hi, I'm facing this warning on a spark-ec2 cluster: when a job is submitted it says that the native hadoop libraries were not found. I have checked spark-env.sh and all the folders on the path but am unable to find the problem, even though the folders do contain the libraries. Are there any performance drawbacks if we use the built-in jars? Is anybody else facing this problem? Btw I'm using spark-1.2.0, hadoop major version = 2, scala version = 2.10.4. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-2-0-ec2-launch-script-hadoop-native-libraries-not-found-warning-tp21030.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
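For what it's worth, this warning is usually harmless: Hadoop simply falls back to its built-in Java classes, at some cost for things like native compression. If you want the native libraries picked up, a common approach is to point the library path at them in conf/spark-env.sh. A sketch only; the native-library location below is an assumption and depends on where Hadoop lives on your AMI:

    # in /root/spark/conf/spark-env.sh (the path to the native libs is a guess; adjust to your install)
    export LD_LIBRARY_PATH=/root/ephemeral-hdfs/lib/native:$LD_LIBRARY_PATH

Then copy the conf directory to the slaves (e.g. with ~/spark-ec2/copy-dir /root/spark/conf) and restart the workers.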
Spark on EC2
Hello, I am trying to run a python script that uses the KMeans MLlib and I'm not getting anywhere. I'm using a c3.xlarge instance as master, and 10 c3.large instances as slaves. In the code I map a 600MB csv file in S3, where each row has 128 integer columns. The problem is that around TID 7 my slave stops responding, and I cannot finish my processing. Could you help me with this problem? I am sending my script attached for review. Thank you, Gilberto

#!/usr/bin/env python
# coding: utf-8
from pyspark import SparkConf, SparkContext
from pyspark.mllib.clustering import KMeans
from numpy import array
from math import sqrt

conf = (SparkConf()
        .setMaster("spark://ec2-54-207-84-167.sa-east-1.compute.amazonaws.com:7077")
        .setAppName("Kmeans App")
        .set("spark.akka.frameSize", "20")
        .set("spark.executor.memory", "2048m"))
sc = SparkContext(conf = conf)

# Load and parse the data
data = sc.textFile("s3n://boomage-npc-production/general_files/features/geral.csv")
#data = sc.textFile("s3n://boo-kmeans-test/clustering/240x.csv")
parsedData = data.map(lambda line: array([int(x) for x in line.split(',')]))

# Build the model (cluster the data)
clusters = KMeans.train(parsedData, 1000, maxIterations=10, runs=10, initializationMode="random")
print "{0} = {1}".format("B", array(clusters.clusterCenters).shape)

- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
Re: Spark on EC2
Hi Gilberto, Could you please attach the driver logs as well, so that we can pinpoint what's going wrong? Could you also add the flag `--driver-memory 4g` while submitting your application and try that as well? Best, Burak - Original Message - From: Gilberto Lira g...@scanboo.com.br To: user@spark.apache.org Sent: Thursday, September 18, 2014 11:48:03 AM Subject: Spark on EC2 Hello, I am trying to run a python script that makes use of the kmeans MLIB and I'm not getting anywhere. I'm using an c3.xlarge instance as master, and 10 c3.large instances as slaves. In the code I make a map of a 600MB csv file in S3, where each row has 128 integer columns. The problem is that around the TID7 my slave stops responding, and I can not finish my processing. Could you help me with this problem? I sending my script attached for review. Thank you, Gilberto - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
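For reference, Burak's suggested flag just goes on the submit command line. A sketch (the application file name kmeans_app.py is a placeholder for whatever your script is called):

    $ ./bin/spark-submit --driver-memory 4g \
        --master spark://ec2-54-207-84-167.sa-east-1.compute.amazonaws.com:7077 \
        kmeans_app.py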
Re: Issue with Spark on EC2 using spark-ec2 script
I'm also running into the same issue and am blocked here. Were any of you able to get past this issue? I tried using both ephemeral and persistent-hdfs. I'm getting the same issue. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Issue-with-Spark-on-EC2-using-spark-ec2-script-tp11088p12232.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
Re: Down-scaling Spark on EC2 cluster
What about down-scaling when I use Mesos? Does that really deteriorate performance? Otherwise we would probably go for spark on mesos on ec2 :) -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Down-scaling-Spark-on-EC2-cluster-tp10494p12109.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
Re: Issue with Spark on EC2 using spark-ec2 script
It looked like you were running in standalone mode (master set to local[4]). That's how I ran it. Dean Wampler, Ph.D. Author: Programming Scala, 2nd Edition http://shop.oreilly.com/product/0636920033073.do (O'Reilly) Typesafe http://typesafe.com @deanwampler http://twitter.com/deanwampler http://polyglotprogramming.com On Thu, Jul 31, 2014 at 8:37 PM, ratabora ratab...@gmail.com wrote: Hey Dean! Thanks! Did you try running this on a local environment or one generated by the spark-ec2 script? The environment I am running on is a 4 data node 1 master spark cluster generated by the spark-ec2 script. I haven't modified anything in the environment except for adding data to the ephemeral hdfs. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Issue-with-Spark-on-EC2-using-spark-ec2-script-tp11088p7.html Sent from the Apache Spark User List mailing list archive at Nabble.com.
Issue with Spark on EC2 using spark-ec2 script
Hey all, I was able to spin up a cluster, but when I try to submit a simple jar via spark-submit it fails to run. I am trying to run the simple Standalone Application from the quickstart. Oddly enough, I could get another application running through the spark-shell. What am I doing wrong here? :( http://spark.apache.org/docs/latest/quick-start.html

Here's my setup:

$ ls
project simple.sbt src target
$ ls -R src
src:
main
src/main:
scala
src/main/scala:
SimpleApp.scala
$ cat src/main/scala/SimpleApp.scala
package main.scala
/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "/tmp/README.md"
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}
$ cat simple.sbt
name := "Simple Project"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.1"
resolvers += "Akka Repository" at "http://repo.akka.io/releases/"

Here's how I run the job:

$ /root/spark/bin/spark-submit --class main.scala.SimpleApp --master local[4] ./target/scala-2.10/simple-project_2.10-1.0.jar

Here is the error:

14/07/31 16:23:56 INFO scheduler.DAGScheduler: Failed to run count at SimpleApp.scala:14
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0:1 failed 1 times, most recent failure: Exception failure in TID 1 on host localhost: java.io.IOException: No such file or directory
java.io.UnixFileSystem.createFileExclusively(Native Method)
java.io.File.createNewFile(File.java:1006)
java.io.File.createTempFile(File.java:1989)
org.apache.spark.util.Utils$.fetchFile(Utils.scala:326)
org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$6.apply(Executor.scala:332)
org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$6.apply(Executor.scala:330)
scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:330)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:168)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at
Re: Issue with Spark on EC2 using spark-ec2 script
Hey Dean! Thanks! Did you try running this on a local environment or one generated by the spark-ec2 script? The environment I am running on is a 4 data node 1 master spark cluster generated by the spark-ec2 script. I haven't modified anything in the environment except for adding data to the ephemeral hdfs. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Issue-with-Spark-on-EC2-using-spark-ec2-script-tp11088p7.html Sent from the Apache Spark User List mailing list archive at Nabble.com.
Re: Down-scaling Spark on EC2 cluster
Any idea about the probable dates for this implementation. I believe it would be a wonderful (and essential) functionality to gain more acceptance in the community. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Down-scaling-Spark-on-EC2-cluster-tp10494p10639.html Sent from the Apache Spark User List mailing list archive at Nabble.com.
Re: Down-scaling Spark on EC2 cluster
No idea. Right now implementing this is up for grabs by the community. On Fri, Jul 25, 2014 at 5:40 AM, Shubhabrata mail2shu...@gmail.com wrote: Any idea about the probable dates for this implementation. I believe it would be a wonderful (and essential) functionality to gain more acceptance in the community. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Down-scaling-Spark-on-EC2-cluster-tp10494p10639.html Sent from the Apache Spark User List mailing list archive at Nabble.com.
Down-scaling Spark on EC2 cluster
Hello, We plan to use Spark on EC2 for our data science pipeline. We have successfully managed to set up clusters as well as launch and run applications on remote clusters. However, to enhance scalability we would like to implement auto-scaling in EC2 for Spark applications, and I did not find any proper reference on this. For example, when we launch training programs that use Matlab scripts on an EC2 cluster we do auto-scaling via SQS. Can anyone please suggest what the options are for Spark? This is especially important when down-scaling by removing a machine (how graceful can it be if the machine is in the middle of a task?). Thanks in advance. Shubhabrata -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Down-scaling-Spark-on-EC2-cluster-tp10494.html Sent from the Apache Spark User List mailing list archive at Nabble.com.
Re: Down-scaling Spark on EC2 cluster
Hi Currently this is not supported out of the box. But you can of course add/remove workers in a running cluster. A better option would be to use a Mesos cluster, where adding/removing nodes is quite simple. But again, I believe adding a new worker in the middle of a task won't give you better performance. Thanks Best Regards On Wed, Jul 23, 2014 at 6:36 PM, Shubhabrata mail2shu...@gmail.com wrote: Hello, We plan to use Spark on EC2 for our data science pipeline. We successfully manage to set up cluster as-well-as launch and run applications on remote-clusters. However, to enhance scalability we would like to implement auto-scaling in EC2 for Spark applications. However, I did not find any proper reference about this. For example when we launch training programs that use Matlab scripts on EC2 cluster we do auto scaling by SQS. Can anyone please suggest what are the options for Spark ? This is especially more important when we would downscaling by removing a machine (how graceful can it be if it is in the middle of a task). Thanks in advance. Shubhabrata -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Down-scaling-Spark-on-EC2-cluster-tp10494.html Sent from the Apache Spark User List mailing list archive at Nabble.com.
Re: Down-scaling Spark on EC2 cluster
There is a JIRA issue to track adding such functionality to spark-ec2: SPARK-2008 https://issues.apache.org/jira/browse/SPARK-2008 - Enhance spark-ec2 to be able to add and remove slaves to an existing cluster On Wed, Jul 23, 2014 at 10:12 AM, Akhil Das ak...@sigmoidanalytics.com wrote: Hi Currently this is not supported out of the Box. But you can of course add/remove workers in a running cluster. Better option would be to use a Mesos cluster where adding/removing nodes are quiet simple. But again, i believe adding new worker in the middle of a task won't give you better performance. Thanks Best Regards On Wed, Jul 23, 2014 at 6:36 PM, Shubhabrata mail2shu...@gmail.com wrote: Hello, We plan to use Spark on EC2 for our data science pipeline. We successfully manage to set up cluster as-well-as launch and run applications on remote-clusters. However, to enhance scalability we would like to implement auto-scaling in EC2 for Spark applications. However, I did not find any proper reference about this. For example when we launch training programs that use Matlab scripts on EC2 cluster we do auto scaling by SQS. Can anyone please suggest what are the options for Spark ? This is especially more important when we would downscaling by removing a machine (how graceful can it be if it is in the middle of a task). Thanks in advance. Shubhabrata -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Down-scaling-Spark-on-EC2-cluster-tp10494.html Sent from the Apache Spark User List mailing list archive at Nabble.com.
Spark 1.0.1 EC2 - Launching Applications
Hi All, I've used the spark-ec2 scripts to build a simple 1.0.1 standalone cluster on EC2. It appears that the spark-submit script is not bundled with a spark-ec2 install. Given that, what is the recommended way to execute Spark jobs on a standalone EC2 cluster? spark-submit provides extremely useful features that apply just as well to EC2 deployments. In the past we've used workarounds like modifying the Spark classpath and using run-example to run simple one-off EC2 jobs. The 'Running Applications' section of the EC2 scripts documentation does not mention how to actually submit jobs to the cluster either. Thanks! Josh
Re: Spark 1.0.1 EC2 - Launching Applications
The script should be there, in the spark/bin directory. What command did you use to launch the cluster? Matei
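For what it's worth, clusters launched with the stock spark-ec2 scripts normally end up with Spark under /root/spark on the master, so a minimal submission sketch run from the master looks like the following (the application class and jar path are made-up placeholders):

  # Run on the master node
  ls /root/spark/bin/spark-submit        # confirm the script is bundled
  /root/spark/bin/spark-submit \
    --master spark://<master-public-dns>:7077 \
    --class com.example.MyApp \
    /root/my-app-assembly.jar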
Re: Spark on EC2
Hmm... you've gotten further than me. Which AMIs are you using?

--
Jeremy Lee BCompSci(Hons)
The Unorthodox Engineers

On Sun, Jun 1, 2014 at 2:21 PM, superback <andrew.matrix.c...@gmail.com> wrote:
Re: Spark on EC2
I haven't set up an AMI yet. I am just trying to run a simple job on the EC2 cluster. So, is setting up an AMI a prerequisite for running a simple Spark example like org.apache.spark.examples.GroupByTest?
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-on-EC2-tp6638p6681.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Re: Spark on EC2
No, you don't have to set up your own AMI. Actually, it's probably simpler and less error-prone to let spark-ec2 manage that for you while you first get comfortable with Spark. Just spin up a cluster without any explicit mention of an AMI and it will do the right thing.
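For example, a bare-bones launch that leaves AMI selection entirely to the script (the key pair name, identity file, and cluster name are placeholders):

  # From the ec2/ directory of a Spark distribution on your local machine
  ./spark-ec2 -k my-keypair -i ~/.ssh/my-keypair.pem -s 2 launch my-spark-cluster

spark-ec2 should then pick its default AMI, install Spark on the master and slaves, and report the master's address when it finishes.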
Spark on EC2
Hi, I am trying to run an example on Amazon EC2 and have successfully set up a cluster with two nodes on EC2. However, when I was testing an example using the following command,

./run-example org.apache.spark.examples.GroupByTest spark://`hostname`:7077

I got the following warnings and errors. Can anyone help me solve this problem? Thanks very much!

46781 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
61544 [spark-akka.actor.default-dispatcher-3] ERROR org.apache.spark.deploy.client.AppClient$ClientActor - All masters are unresponsive! Giving up.
61544 [spark-akka.actor.default-dispatcher-3] ERROR org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend - Spark cluster looks dead, giving up.
61546 [spark-akka.actor.default-dispatcher-3] INFO org.apache.spark.scheduler.TaskSchedulerImpl - Remove TaskSet 0.0 from pool
61549 [main] INFO org.apache.spark.scheduler.DAGScheduler - Failed to run count at GroupByTest.scala:50
Exception in thread "main" org.apache.spark.SparkException: Job aborted: Spark cluster looks down
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1028)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1026)
  at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
  at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1026)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:619)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:619)
  at scala.Option.foreach(Option.scala:236)
  at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:619)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:207)
  at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
  at akka.actor.ActorCell.invoke(ActorCell.scala:456)
  at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
  at akka.dispatch.Mailbox.run(Mailbox.scala:219)
  at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
  at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
  at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
  at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
  at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-on-EC2-tp6638.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
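A frequent cause of the "All masters are unresponsive" error above is passing a master URL that does not exactly match the one the master registered with (for example an internal hostname versus the public DNS name). A rough way to check, assuming the standalone master web UI is reachable on port 8080 (hostnames are placeholders):

  # See which spark:// URL the master advertises
  curl -s http://<master-hostname>:8080 | grep -o 'spark://[^"< ]*'

  # Then use that exact URL; depending on the Spark version, run-example may expect
  # the master via the MASTER environment variable rather than as an argument
  MASTER=spark://<exact-master-hostname>:7077 ./bin/run-example org.apache.spark.examples.GroupByTest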
CDH5 Spark on EC2
I’ve been able to get CDH5 up and running on EC2, and according to Cloudera Manager, Spark is running healthy. But when I try to run spark-shell, I eventually get the error:

14/04/02 07:18:18 INFO client.AppClient$ClientActor: Connecting to master spark://ip-172-xxx-xxx-xxx:7077...
14/04/02 07:18:38 ERROR client.AppClient$ClientActor: All masters are unresponsive! Giving up.
14/04/02 07:18:38 ERROR cluster.SparkDeploySchedulerBackend: Spark cluster looks dead, giving up.
14/04/02 07:18:38 ERROR scheduler.TaskSchedulerImpl: Exiting due to error from cluster scheduler: Spark cluster looks down

Wondering which configurations I would need to change to get this to work? Thanks! Denny
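Not CDH-specific, but two quick things worth checking here, sketched with placeholder hostnames: that port 7077 on the master is reachable from the node running the shell, and that the shell points at exactly the master URL shown in the master's web UI. On the Spark 0.9 that ships with early CDH 5, the master is usually picked up from the MASTER environment variable or from /etc/spark/conf rather than a --master flag:

  # Is the standalone master port reachable from this node?
  nc -zv <master-hostname> 7077

  # Point the shell at the master explicitly (newer Spark versions accept
  # spark-shell --master spark://<master-hostname>:7077 instead)
  MASTER=spark://<master-hostname>:7077 spark-shell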