Dear PJ$,

If you are familiar with Puppet, you could try the Puppet module I wrote. It currently targets Spark 0.9.0, which I compiled myself because no Debian package was available when I started the project that needed it:
https://github.com/stefanvanwouw/puppet-spark

---
Kind regards,

Stefan van Wouw

On 02 Jun 2014, at 00:11, PJ$ <p...@chickenandwaffl.es> wrote:

> Running on a few m3.larges with the ami-848a6eec image (Debian 7). Haven't
> gotten any further. No clue what's wrong. I'd really appreciate any guidance
> y'all could offer.
>
> Best,
> PJ$
>
>
> On Sat, May 31, 2014 at 1:40 PM, Matei Zaharia <matei.zaha...@gmail.com>
> wrote:
> What instance types did you launch on?
>
> Sometimes you also get a bad individual machine from EC2. It might help to
> remove the node it’s complaining about from the conf/slaves file.
>
> Matei
>
> On May 30, 2014, at 11:18 AM, PJ$ <p...@chickenandwaffl.es> wrote:
>
>> Hey Folks,
>>
>> I'm really having quite a bit of trouble getting Spark running on EC2. I'm
>> not using the scripts at https://github.com/apache/spark/tree/master/ec2
>> because I'd like to know how everything works. But I'm going a little crazy.
>> I think that something about the networking configuration must be messed up,
>> but I'm at a loss. Shortly after starting the cluster, I get a lot of this:
>>
>> 14/05/30 18:03:22 INFO master.Master: Registering worker
>> ip-10-100-184-45.ec2.internal:7078 with 2 cores, 6.3 GB RAM
>> 14/05/30 18:03:22 INFO master.Master: Registering worker
>> ip-10-100-184-45.ec2.internal:7078 with 2 cores, 6.3 GB RAM
>> 14/05/30 18:03:23 INFO master.Master: Registering worker
>> ip-10-100-184-45.ec2.internal:7078 with 2 cores, 6.3 GB RAM
>> 14/05/30 18:03:23 INFO master.Master: Registering worker
>> ip-10-100-184-45.ec2.internal:7078 with 2 cores, 6.3 GB RAM
>> 14/05/30 18:05:54 INFO master.Master:
>> akka.tcp://spark@ip-10-100-75-70.ec2.internal:38485 got disassociated,
>> removing it.
>> 14/05/30 18:05:54 INFO actor.LocalActorRef: Message
>> [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying] from
>> Actor[akka://sparkMaster/deadLetters] to
>> Actor[akka://sparkMaster/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkMaster%4010.100.75.70%3A36725-25#847210246]
>> was not delivered. [5] dead letters encountered. This logging can be turned
>> off or adjusted with configuration settings 'akka.log-dead-letters' and
>> 'akka.log-dead-letters-during-shutdown'.
>> 14/05/30 18:05:54 INFO master.Master:
>> akka.tcp://spark@ip-10-100-75-70.ec2.internal:38485 got disassociated,
>> removing it.
>> 14/05/30 18:05:54 INFO master.Master:
>> akka.tcp://spark@ip-10-100-75-70.ec2.internal:38485 got disassociated,
>> removing it.
>> 14/05/30 18:05:54 ERROR remote.EndpointWriter: AssociationError
>> [akka.tcp://sparkMaster@ip-10-100-184-45.ec2.internal:7077] ->
>> [akka.tcp://spark@ip-10-100-75-70.ec2.internal:38485]: Error [Association
>> failed with [akka.tcp://spark@ip-10-100-75-70.ec2.internal:38485]] [
>> akka.remote.EndpointAssociationException: Association failed with
>> [akka.tcp://spark@ip-10-100-75-70.ec2.internal:38485]
>> Caused by:
>> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
>> Connection refused: ip-10-100-75-70.ec2.internal/10.100.75.70:38485
>> ]
>> 14/05/30 18:05:54 ERROR remote.EndpointWriter: AssociationError
>> [akka.tcp://sparkMaster@ip-10-100-184-45.ec2.internal:7077] ->
>> [akka.tcp://spark@ip-10-100-75-70.ec2.internal:38485]: Error [Association
>> failed with [akka.tcp://spark@ip-10-100-75-70.ec2.internal:38485]] [
>> akka.remote.EndpointAssociationException: Association failed with
>> [akka.tcp://spark@ip-10-100-75-70.ec2.internal:38485]
>> Caused by:
>> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
>> Connection refused: ip-10-100-75-70.ec2.internal/10.100.75.70:38485
>> ]
>> 14/05/30 18:05:54 INFO master.Master:
>> akka.tcp://spark@ip-10-100-75-70.ec2.internal:38485 got disassociated,
>> removing it.
>> 14/05/30 18:05:54 INFO master.Master:
>> akka.tcp://spark@ip-10-100-75-70.ec2.internal:38485 got disassociated,
>> removing it.
>> 14/05/30 18:05:54 ERROR remote.EndpointWriter: AssociationError
>> [akka.tcp://sparkMaster@ip-10-100-184-45.ec2.internal:7077] ->
>> [akka.tcp://spark@ip-10-100-75-70.ec2.internal:38485]: Error [Association
>> failed with [akka.tcp://spark@ip-10-100-75-70.ec2.internal:38485]] [
>> akka.remote.EndpointAssociationException: Association failed with
>> [akka.tcp://spark@ip-10-100-75-70.ec2.internal:38485]
>> Caused by:
>> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
>> Connection refused: ip-10-100-75-70.ec2.internal/10.100.75.70:38485
>
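
Matei's suggestion in the quoted thread was to remove the node the master complains about from conf/slaves. One way to see which endpoint that is, and how often it was refused, is to pull the "Connection refused" entries out of the master log. What follows is only a minimal Python sketch, assuming the log lines look like the ones quoted in this thread. The names REFUSED and refused_endpoints are made up for illustration and are not part of Spark.

```python
import re
import sys
from collections import Counter

# Matches the tail of the "Connection refused" lines quoted in this thread, e.g.
#   Connection refused: ip-10-100-75-70.ec2.internal/10.100.75.70:38485
# Assumption: the log prints the endpoint as host/ip:port on a single line.
REFUSED = re.compile(
    r"Connection refused: (?P<host>[^/\s]+)/(?P<ip>[\d.]+):(?P<port>\d+)"
)


def refused_endpoints(log_text):
    """Count 'Connection refused' occurrences per (host, ip, port)."""
    return Counter(
        (m.group("host"), m.group("ip"), int(m.group("port")))
        for m in REFUSED.finditer(log_text)
    )


if __name__ == "__main__":
    # Read a master log from standard input and print the most refused endpoints first.
    counts = refused_endpoints(sys.stdin.read())
    for (host, ip, port), n in counts.most_common():
        print(f"{host} ({ip}) port {port}: refused {n} time(s)")
```

Saved under a name of your choosing (say refused.py, also hypothetical), it could be run against the raw master log with something like `python refused.py < master.log`. It only summarizes what the log already says. It does not tell you why the connection was refused, so security group rules and firewall settings between the nodes would still need checking by hand.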