Thanks for your response, Riccardo. I fixed the SSH key problem, but I
still get errors:

*Here's the error:*

$ elasticluster start gce
-n elasticluster.sh: WARNING: 
Command 'env' does not support null-terminated lines;
elasticluster.sh cannot properly sanitize the environment in this case.
If you get errors later on about Docker being unable to process environment
variables, you will need to install GNU coreutils' 'env'.

Starting cluster `gce` with:
* 1 frontend nodes.
* 1 compute nodes.
(This may take a while...)

2020-08-18 17:25:18 4b6111acbb85 elasticluster[1] *WARNING* UserWarning: Cannot access /Users/mahsa/.elasticluster/storage/429683943466-eclntgdphrfcbiio29sj7ekq5dceuoi2.apps.googleusercontent.com.oauth.dat: No such file or directory
No handlers could be found for logger "paramiko.transport"
2020-08-18 17:36:25 4b6111acbb85 elasticluster[1] *ERROR* Some nodes of the cluster were unreachable within the given 600-seconds timeout: frontend001, compute001

Configuring the cluster ...
(this too may take a while)

2020-08-18 17:36:25 4b6111acbb85 elasticluster[1] *WARNING* Ignoring node `frontend001`: No IP address.
2020-08-18 17:36:25 4b6111acbb85 elasticluster[1] *WARNING* Ignoring node `compute001`: No IP address.
2020-08-18 17:36:25 4b6111acbb85 elasticluster[1] *ERROR* The cluster hosts are up and running, but Ansible failed to set the cluster up: The cluster does not provide the minimum amount of nodes specified in the configuration. Some nodes are running, but the cluster will not be set up yet. Please change the minimum amount of nodes in the configuration or try to start a new cluster after checking the cloud provider settings.
2020-08-18 17:36:25 4b6111acbb85 elasticluster[1] *WARNING* Cluster `gce` not yet configured. Please, re-run `elasticluster setup gce` and/or check your configuration

WARNING: YOUR CLUSTER `gce` IS NOT READY YET!

Cluster name:     gce
Cluster template: gce
Default ssh to node: frontend001
- frontend nodes: 1
- compute nodes: 1

*Here's my config file (I have left out the first part of it):*


[login/google]
# Do not include @gmail (example: [email protected] -> monajemi)
image_user=ubuntu
image_user_sudo=root
image_sudo=True
user_key_name=elasticluster
user_key_private=~/.ssh/google_compute_engine
user_key_public=~/.ssh/google_compute_engine.pub

[setup/ansible-slurm]
provider=ansible
frontend_groups=slurm_master
compute_groups=slurm_worker,cuda
# allow restart of compute nodes
compute_var_allow_reboot=yes
worker_var_allow_reboot=yes
global_var_allow_reboot=yes
global_var_slurm_taskplugin=task/cgroup
global_var_slurm_proctracktype=proctrack/cgroup
global_var_slurm_jobacctgathertype=jobacct_gather/cgroup

[cluster/gce]
cloud=google
login=google
setup=ansible-slurm
security_group=default
frontend_nodes=1
compute_nodes=1
ssh_to=frontend
# Ask for 500G of disk
boot_disk_type=pd-standard
boot_disk_size=500

[cluster/gce/frontend]
flavor=n1-standard-8
image_id=ubuntu-1604-xenial-v20171107b

# add 2x GPUs (NVidia Tesla K80) to the compute nodes
# note that as of Nov. 2017, GPU-enabled VMs are available only in few zones
# use `gcloud compute accelerator-types list` to see what is available
[cluster/gce/compute]
flavor=n1-standard-8
#flavor=n1-highmem-8
image_id=ubuntu-1604-xenial-v20171107b
#accelerator_count=1
#accelerator_type=nvidia-tesla-v100
#accelerator_type=nvidia-tesla-k80



Could you help me with this? Thanks in advance!

On Friday, August 14, 2020 at 1:40:35 PM UTC-7 Riccardo Murri wrote:

> Hello Mahsa,
>
> Regarding the error you're seeing:
>
> > DEBUG Ignoring error connecting to compute001: Invalid key -- <class 'paramiko.ssh_exception.SSHException'>
>
> My first guess would be that you pointed ElastiCluster to some SSH key
> file that it cannot read.
>
> Check your configuration file; you should have some lines like the
> following ones:
>
> [login/google]
> image_user=riccardo.murri
> # ...
> user_key_private=~/.ssh/elasticluster
> user_key_public=~/.ssh/elasticluster.pub
>
> The lines influencing the SSH logins are the two `user_key_*` ones.
>
> Things to check:
>
> 1. *Both files must exist* on the machine where you run `elasticluster
> -vvvv start ...`
> 2. The file name is immaterial (could be `id_rsa` or `id_ed25519` or
> `google_cloud_sdk`) but, because of a bug, ElastiCluster can only use
> SSH keys of type RSA. To find out the type of the SSH key you're
> using, run this command (replace `~/.ssh/elasticluster-dev` with the
> path pointed to by `user_key_private` in your config file):
>
> $ ssh-keygen -l -f ~/.ssh/elasticluster-dev
> 4096 SHA256:9r/pBW5nB2mnFrGIFxuxs8HW4ZVWDUbS/AzMeU3tjRM riccardo.murri@dev (RSA)
>
> If instead of "(RSA)" you get a different code, you will need to
> generate an RSA key and use that instead:
>
> 1. Create a new RSA key for use with elasticluster:
>
> ssh-keygen -t rsa -b 4096 -o -a 100 -f ~/.ssh/elasticluster
>
> 2. Replace the `user_key_*` lines with the following:
>
> user_key_private=~/.ssh/elasticluster
> user_key_public=~/.ssh/elasticluster.pub
>
> If you are still running into issues after making these changes,
> please post your configuration file or email it to me (WARNING:
> remove all private data like passwords and access keys!!)
>
> Hope this helps,
> Riccardo
>
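
For reference, the two checks quoted above (both key files exist, and the key is of type RSA) could be scripted roughly as follows. This is only a sketch: the function name `check_key_pair` is made up for illustration, and it assumes the public key file is in the usual OpenSSH one-line format, where the first field is the key type (`ssh-rsa` for RSA keys). It does not verify that the private and public files actually belong together; `ssh-keygen -l -f` remains the authoritative check.

```python
import os


def check_key_pair(private_path, public_path):
    """Return a list of problems found with the key pair (empty if none).

    Hypothetical helper, not part of ElastiCluster.
    """
    problems = []
    private_path = os.path.expanduser(private_path)
    public_path = os.path.expanduser(public_path)

    # Check 1: both files must exist and be readable on this machine.
    for path in (private_path, public_path):
        if not os.path.isfile(path):
            problems.append("missing file: " + path)
        elif not os.access(path, os.R_OK):
            problems.append("unreadable file: " + path)

    # Check 2: the key must be RSA. Assumption: an OpenSSH public key
    # file starts with the key type, e.g. "ssh-rsa AAAA... comment".
    if not problems:
        with open(public_path) as f:
            fields = f.read().split()
        key_type = fields[0] if fields else "empty file"
        if key_type != "ssh-rsa":
            problems.append("not an RSA key: " + key_type)

    return problems


if __name__ == "__main__":
    # Paths taken from the `user_key_*` lines in the config above.
    for problem in check_key_pair("~/.ssh/google_compute_engine",
                                  "~/.ssh/google_compute_engine.pub"):
        print(problem)
```

If the script prints nothing, both files were found and the public key is labelled as RSA.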
