Having problems with the EC2 Python scripts

Richard Xia Mon, 16 Apr 2012 03:49:25 -0700

Hi,

I'm trying to go through the guide here (https://github.com/mesos/mesos/wiki/EC2-Scripts) and I'm running into a couple problems. I'm running the latest version of the trunk (r1310658) on Mac OS X 10.6 with the default Python (2.6).

The first problem that I run into is with the launch script. The default wait time of 60 seconds doesn't seem to be enough; I would consistently run into the error of the ssh connection being refused. When I set the wait time to 120 seconds (just to be safe, I'm sure a smaller value would work as well), it worked and would run to completion. I was just using the default settings suggested by the guide (1 slave, m1.large instance) and it took me a while to realize that the script just wasn't waiting long enough for the instances to start up. Is this the expected behavior? If it is, I think the guide needs to be updated to mention that the default wait time may not be long enough.
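
One possible alternative to a fixed wait, sketched here only as an illustration: poll the instance's ssh port until it accepts a TCP connection. The function name wait_for_ssh and its parameters are hypothetical and are not part of the mesos-ec2 script. An open port does not guarantee that sshd is fully ready, but it does rule out the "connection refused" error described above.

```python
import socket
import time


def wait_for_ssh(host, port=22, timeout=300, interval=5):
    """Poll host:port until a TCP connection succeeds or timeout expires.

    Hypothetical helper, not taken from mesos-ec2. Returns True once the
    port accepts a connection, False if the deadline passes first.
    """
    deadline = time.time() + timeout
    while True:
        try:
            # create_connection raises an error if the port refuses us
            sock = socket.create_connection((host, port), timeout=interval)
            sock.close()
            return True
        except (socket.error, socket.timeout):
            if time.time() >= deadline:
                return False
            time.sleep(interval)
```

With something like this, the launch step could call wait_for_ssh(master_hostname) before attempting to ssh in, where master_hostname stands for whatever address the script already has for the instance.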

The second problem I am having is with any of the scripts that target an existing cluster. For example, if I try running ./mesos-ec2 stop <cluster-name>, I get the error message "ERROR: Could not find any existing cluster". When debugging the script, I found that get_existing_cluster() wasn't working properly. On line 309, when it sets the variable group_names, it calls g.id where g is a security group. The following lines seem to check whether the security group name matches "<cluster-name>-master", "-slaves", or "-zoo". However, when running a debugger, I find that the security group's id is actually in the form "sg-6561c10d", not "<cluster-name>-slaves". Instead, it seems to me that line 309 should be group_names = [g.name for g in res.groups]. When I make this change myself, it seems to work.
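
To make the difference concrete, here is a small self-contained sketch. Group and Reservation are stand-in types with only the attributes mentioned above (id, name, groups), not the real boto classes, and cluster_roles is a hypothetical function that mimics the matching logic as described, not the actual body of get_existing_cluster().

```python
from collections import namedtuple

# Stand-ins for the reservation and security group objects; only the
# attributes discussed above are modelled (assumption, not boto's API).
Group = namedtuple("Group", ["id", "name"])
Reservation = namedtuple("Reservation", ["groups"])


def cluster_roles(res, cluster_name, use_name=True):
    """Return which cluster roles the reservation's groups match."""
    if use_name:
        group_names = [g.name for g in res.groups]  # proposed fix
    else:
        group_names = [g.id for g in res.groups]    # behavior described above
    roles = []
    for suffix in ("-master", "-slaves", "-zoo"):
        if cluster_name + suffix in group_names:
            roles.append(suffix[1:])
    return roles


res = Reservation(groups=[Group(id="sg-6561c10d", name="test-slaves")])
print(cluster_roles(res, "test", use_name=False))  # matches nothing
print(cluster_roles(res, "test", use_name=True))   # matches the slaves group
```

Comparing against ids never matches a "<cluster-name>-slaves" style string, which would be consistent with the "Could not find any existing cluster" error.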


Thanks,
Richard Xia
