[
https://issues.apache.org/jira/browse/SPARK-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057045#comment-14057045
]
Stephen M. Hopper commented on SPARK-2396:
------------------------------------------
Update: I attempted this again, this time on a machine running Ubuntu Server 14.04,
and it worked on the first try. I kept all of the steps the same, except that I used
a prebuilt version of Spark (1.0.0 for Hadoop 2) instead of the version I had
built myself from source with Maven.
> Spark EC2 scripts fail when trying to log in to EC2 instances
> -------------------------------------------------------------
>
> Key: SPARK-2396
> URL: https://issues.apache.org/jira/browse/SPARK-2396
> Project: Spark
> Issue Type: Bug
> Components: EC2
> Affects Versions: 1.0.0
> Environment: Windows 8, Cygwin and command prompt, Python 2.7
> Reporter: Stephen M. Hopper
> Labels: aws, ec2, ssh
>
> I cannot seem to successfully start up a Spark EC2 cluster using the
> spark-ec2 script.
> I'm using variations on the following command:
> ./spark-ec2 --instance-type=m1.small --region=us-west-1 --spot-price=0.05
> --spark-version=1.0.0 -k my-key-name -i my-key-name.pem -s 1 launch
> spark-test-cluster
> The script always allocates the EC2 instances without much trouble, but it can
> never seem to complete the SSH step that installs Spark on the cluster; it
> always complains about my SSH key. If I try to log in with my SSH key myself,
> like this:
> ssh -i my-key-name.pem root@<insert ip of my instance here>
> it fails. However, if I log in to the AWS console, click on my instance, and
> select "connect", it displays the instructions for SSHing into my instance
> (which are no different from the ssh command above). After that, if I rerun the
> SSH command, I'm able to log in.
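> From poking around, this looks like the usual behavior of a freshly booted
> instance refusing the first few SSH attempts. A minimal sketch of the kind of
> retry loop that would paper over it (the helper below is my own illustration,
> not the actual spark_ec2.py code):
>
> import subprocess
> import time
>
> def ssh_with_retries(host, key_file, command, max_retries=5, delay=30):
>     # A newly launched EC2 instance often rejects SSH for a minute or two
>     # while it boots, so retry with a pause instead of failing immediately.
>     for attempt in range(1, max_retries + 1):
>         try:
>             return subprocess.check_call([
>                 "ssh", "-i", key_file,
>                 "-o", "StrictHostKeyChecking=no",
>                 "root@%s" % host, command])
>         except subprocess.CalledProcessError:
>             if attempt == max_retries:
>                 raise
>             print("SSH attempt %d failed, retrying in %d seconds..."
>                   % (attempt, delay))
>             time.sleep(delay)
>
> # e.g. ssh_with_retries("ec2-<my-clusters-ip>.us-west-1.compute.amazonaws.com",
> #                       "my-key-name.pem", "echo connected")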
> Next, if I try to rerun the spark-ec2 command from above (replacing "launch"
> with "start"), the script logs in and starts installing Spark. However, it
> eventually errors out with the following output:
> Cloning into 'spark-ec2'...
> remote: Counting objects: 1465, done.
> remote: Compressing objects: 100% (697/697), done.
> remote: Total 1465 (delta 485), reused 1465 (delta 485)
> Receiving objects: 100% (1465/1465), 228.51 KiB | 287 KiB/s, done.
> Resolving deltas: 100% (485/485), done.
> Connection to ec2-<my-clusters-ip>.us-west-1.compute.amazonaws.com closed.
> Searching for existing cluster spark-test-cluster...
> Found 1 master(s), 1 slaves
> Starting slaves...
> Starting master...
> Waiting for instances to start up...
> Waiting 120 more seconds...
> Deploying files to master...
> Traceback (most recent call last):
>   File "./spark_ec2.py", line 823, in <module>
>     main()
>   File "./spark_ec2.py", line 815, in main
>     real_main()
>   File "./spark_ec2.py", line 806, in real_main
>     setup_cluster(conn, master_nodes, slave_nodes, opts, False)
>   File "./spark_ec2.py", line 450, in setup_cluster
>     deploy_files(conn, "deploy.generic", opts, master_nodes, slave_nodes, modules)
>   File "./spark_ec2.py", line 593, in deploy_files
>     subprocess.check_call(command)
>   File "E:\windows_programs\Python27\lib\subprocess.py", line 535, in check_call
>     retcode = call(*popenargs, **kwargs)
>   File "E:\windows_programs\Python27\lib\subprocess.py", line 522, in call
>     return Popen(*popenargs, **kwargs).wait()
>   File "E:\windows_programs\Python27\lib\subprocess.py", line 710, in __init__
>     errread, errwrite)
>   File "E:\windows_programs\Python27\lib\subprocess.py", line 958, in _execute_child
>     startupinfo)
> WindowsError: [Error 2] The system cannot find the file specified
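> For what it's worth, a WindowsError: [Error 2] raised from Popen like this
> means the program named in the command itself (not a file it operates on)
> couldn't be found on the Windows PATH. A rough sketch of the sort of
> pre-flight check that would surface a clearer error (checked_call and the
> rsync example below are just my illustration, not the actual spark_ec2.py
> code):
>
> from distutils.spawn import find_executable
> import subprocess
>
> def checked_call(command):
>     # Report a missing executable explicitly rather than letting Popen fail
>     # with the bare "[Error 2] The system cannot find the file specified".
>     exe = command[0]
>     if find_executable(exe) is None:
>         raise RuntimeError("'%s' is not on the PATH; under Cygwin it may "
>                            "need to be installed separately." % exe)
>     return subprocess.check_call(command)
>
> # e.g. checked_call(["rsync", "-rv", "deploy.generic/", "root@<master-ip>:~/"])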
> So, in short, am I missing something or is this a bug? Any help would be
> appreciated.
> Other notes:
> - I've tried both the us-west-1 and us-east-1 regions.
> - I've tried several different instance types.
> - I've tried playing with the permissions on the SSH key (600, 400, etc.), but to no avail.