Re: [elasticluster] SLURM is not installed after cluster setup
Initially the permissions were like this:

    drwxrwxr-x 2 orhan orhan 4096 Şub 3 21:24 /home/orhan/.ansible
    drwxrwxr-x 3 orhan orhan 4096 Şub 4 16:15 /home/orhan/.elasticluster
    drwx------ 2 orhan orhan 4096 Oca 29 19:57 /home/orhan/.ssh

After running the commands they became:

    drwxrwxrwx 2 orhan orhan 4096 Şub 3 21:24 /home/orhan/.ansible
    drwxrwxrwx 3 orhan orhan 4096 Şub 4 16:15 /home/orhan/.elasticluster
    drwx---rwx 2 orhan orhan 4096 Oca 29 19:57 /home/orhan/.ssh

However, the Errno 13 is still there. The error message is as follows:

    'import sitecustomize' failed; use -v for traceback
    Traceback (most recent call last):
      File "/usr/local/bin/ansible-playbook", line 43, in <module>
        import ansible.constants as C
      File "/usr/local/lib/python2.7/site-packages/ansible/constants.py", line 202, in <module>
        DEFAULT_LOCAL_TMP = get_config(p, DEFAULTS, 'local_tmp', 'ANSIBLE_LOCAL_TEMP', '~/.ansible/tmp', value_type='tmppath')
      File "/usr/local/lib/python2.7/site-packages/ansible/constants.py", line 109, in get_config
        makedirs_safe(value, 0o700)
      File "/usr/local/lib/python2.7/site-packages/ansible/utils/path.py", line 71, in makedirs_safe
        raise AnsibleError("Unable to create local directories(%s): %s" % (to_native(rpath), to_native(e)))
    ansible.errors.AnsibleError: Unable to create local directories(/home/.ansible/tmp): [Errno 13] Permission denied: '/home/.ansible'
    2018-02-04 15:56:38 cfeda8a7b8b3 gc3.elasticluster[1] ERROR Command `ansible-playbook /home/elasticluster/share/playbooks/site.yml --inventory=/home/orhan/.elasticluster/storage/slurm-on-gce.inventory --become --become-user=root -vv` failed with exit code 1.
    2018-02-04 15:56:38 cfeda8a7b8b3 gc3.elasticluster[1] ERROR Check the output lines above for additional information on this error.
    2018-02-04 15:56:38 cfeda8a7b8b3 gc3.elasticluster[1] ERROR The cluster has likely *not* been configured correctly. You may need to re-run `elasticluster setup` or fix the playbooks.
    2018-02-04 15:56:38 cfeda8a7b8b3 gc3.elasticluster[1] WARNING Cluster `slurm-on-gce` not yet configured. Please, re-run `elasticluster setup slurm-on-gce` and/or check your configuration

Orhan

On Sun, Feb 4, 2018 at 3:36 PM, Riccardo Murri wrote:
> Dear Orxan,
>
> the following subdirectories of your home directory should be owned
> and writable by your Linux account (which is `rmurri` in my case):
>
>     $ ls -ld $HOME/.ansible $HOME/.elasticluster $HOME/.ssh
>     drwxrwxr-x 5 rmurri rmurri 4096 feb 2 2015 /home/rmurri/.ansible
>     drwxrwxr-x 3 rmurri rmurri 4096 feb 3 21:15 /home/rmurri/.elasticluster
>     drwxr-xr-x 3 rmurri rmurri 4096 gen 19 16:29 /home/rmurri/.ssh
>
> If they aren't, try running the following commands to fix the permissions:
>
>     sudo chown -v -R $(whoami) $HOME/.ansible $HOME/.elasticluster $HOME/.ssh
>     sudo chmod -v o+rwX $HOME/.ansible $HOME/.elasticluster $HOME/.ssh
>
> If it still doesn't work, please post the output of the above two
> commands along with the error message produced by ElastiCluster.
>
> Ciao,
> R

--
You received this message because you are subscribed to the Google Groups "elasticluster" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticluster+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
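The ownership and writability check described in the quoted message can also be scripted. The following is only a minimal sketch, assuming a bash-compatible shell; `check_dir` is a hypothetical helper name introduced here for illustration and is not part of ElastiCluster.

```shell
#!/usr/bin/env bash
# check_dir is a hypothetical helper (not part of ElastiCluster).
# It reports whether a directory exists, is owned by the current
# user, and is writable by the current user.
check_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "$dir: missing"
    elif [ ! -O "$dir" ]; then
        # -O: true if the file is owned by the effective user ID
        echo "$dir: not owned by $(id -un)"
    elif [ ! -w "$dir" ]; then
        echo "$dir: not writable"
    else
        echo "$dir: ok"
    fi
}

# The three directories named in the thread.
for d in "$HOME/.ansible" "$HOME/.elasticluster" "$HOME/.ssh"; do
    check_dir "$d"
done
```

Note that this only inspects the directories on the host; it says nothing about how they appear inside the Docker container that `elasticluster.sh` starts.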
Re: [elasticluster] SLURM is not installed after cluster setup
The `sudo` issue is solved, but [Errno 13] is still there. Output is attached.

Orhan

On Sun, Feb 4, 2018 at 2:31 PM, Riccardo Murri wrote:
> 2018-02-04 12:15 GMT+01:00 Orxan Shibliyev:
> > The second command gave:
> >
> > orhan@orhan-MS-7850:~$ ./elasticluster.sh -vvv start slurm-on-gce
> > docker: Got permission denied while trying to connect to the Docker daemon
> > socket at unix:///var/run/docker.sock: Post
> > http://%2Fvar%2Frun%2Fdocker.sock/v1.31/containers/create: dial unix
> > /var/run/docker.sock: connect: permission denied.
>
> Then you probably need to add yourself to the `docker` group:
>
>     sudo gpasswd -a $(whoami) docker
>
> Note: replace `docker` above with whatever group owns the socket
> `/var/run/docker.sock`
>
> You might need to log out and back in for the group change
> to be picked up; or run `newgrp docker` to get a shell with the
> correct permissions.
>
> Please let me know if it works, so I can automate this in the
> `elasticluster.sh` script.
>
> Ciao,
> R

orhan@orhan-MS-7850:~$ ./elasticluster.sh -vvv start slurm-on-gce
2018-02-04 14:54:34 41e0a6cea578 gc3.elasticluster[1] DEBUG Checking section `cluster/slurm-on-gce` ...
2018-02-04 14:54:34 41e0a6cea578 gc3.elasticluster[1] DEBUG Checking section `cluster/gridengine-on-gce` ...
2018-02-04 14:54:34 41e0a6cea578 gc3.elasticluster[1] DEBUG Checking section `login/google` ...
2018-02-04 14:54:34 41e0a6cea578 gc3.elasticluster[1] DEBUG Checking section `setup/gridengine` ...
2018-02-04 14:54:34 41e0a6cea578 gc3.elasticluster[1] DEBUG Checking section `setup/slurm` ...
2018-02-04 14:54:34 41e0a6cea578 gc3.elasticluster[1] DEBUG Checking section `setup/pbs` ...
2018-02-04 14:54:34 41e0a6cea578 gc3.elasticluster[1] DEBUG Checking section `cloud/google` ...
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] DEBUG Using class from module to instanciate provider 'google'
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] DEBUG Using class from module to instanciate provider 'ansible'
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] DEBUG setting variable multiuser_cluster=yes for node kind compute
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] DEBUG setting variable multiuser_cluster=yes for node kind frontend
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] DEBUG setting variable multiuser_cluster=yes for node kind submit
Starting cluster `slurm-on-gce` with:
* 1 frontend nodes.
* 2 compute nodes.
(This may take a while...)
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] INFO Starting cluster nodes ...
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] DEBUG Note: starting 3 nodes concurrently.
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] DEBUG _start_node: working on node `frontend001`
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] INFO Starting node `frontend001` from image `ubuntu-1604-xenial-v20180126` with flavor n1-standard-1 ...
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] DEBUG _start_node: working on node `compute002`
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] DEBUG _start_node: working on node `compute001`
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] INFO Starting node `compute002` from image `ubuntu-1604-xenial-v20180126` with flavor n1-standard-1 ...
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] INFO Starting node `compute001` from image `ubuntu-1604-xenial-v20180126` with flavor n1-standard-1 ...
2018-02-04 14:54:47 41e0a6cea578 gc3.elasticluster[1] DEBUG Node `compute002` has instance ID `slurm-on-gce-compute002`
2018-02-04 14:54:47 41e0a6cea578 gc3.elasticluster[1] INFO Node `compute002` has been started.
2018-02-04 14:55:16 41e0a6cea578 gc3.elasticluster[1] DEBUG Node `frontend001` has instance ID `slurm-on-gce-frontend001`
2018-02-04 14:55:16 41e0a6cea578 gc3.elasticluster[1] INFO Node `frontend001` has been started.
2018-02-04 14:55:20 41e0a6cea578 gc3.elasticluster[1] DEBUG Node `compute001` has instance ID `slurm-on-gce-compute001`
2018-02-04 14:55:20 41e0a6cea578 gc3.elasticluster[1] INFO Node `compute001` has been started.
2018-02-04 14:55:20 41e0a6cea578 gc3.elasticluster[1] DEBUG Getting information for instance slurm-on-gce-compute002
2018-02-04 14:55:20 41e0a6cea578 gc3.elasticluster[1] DEBUG node `compute002` (instance id slurm-on-gce-compute002) is up.
2018-02-04 14:55:21 41e0a6cea578 gc3.elasticluster[1] DEBUG Getting information for instance slurm-on-gce-frontend001
2018-02-04 14:55:21 41e0a6cea578 gc3.elasticluster[1] DEBUG node `frontend001` (instance id slurm-on-gce-frontend001) is up.
2018-02-04 14:55:21 41e0a6cea578
Re: [elasticluster] SLURM is not installed after cluster setup
2018-02-04 12:15 GMT+01:00 Orxan Shibliyev:
> The second command gave:
>
> orhan@orhan-MS-7850:~$ ./elasticluster.sh -vvv start slurm-on-gce
> docker: Got permission denied while trying to connect to the Docker daemon
> socket at unix:///var/run/docker.sock: Post
> http://%2Fvar%2Frun%2Fdocker.sock/v1.31/containers/create: dial unix
> /var/run/docker.sock: connect: permission denied.

Then you probably need to add yourself to the `docker` group:

    sudo gpasswd -a $(whoami) docker

Note: replace `docker` above with whatever group owns the socket
`/var/run/docker.sock`

You might need to log out and back in for the group change
to be picked up; or run `newgrp docker` to get a shell with the
correct permissions.

Please let me know if it works, so I can automate this in the
`elasticluster.sh` script.

Ciao,
R
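To find out which group owns the Docker socket and whether the current shell already has it, something like the following could be used. This is a sketch under assumptions: it relies on GNU coreutils `stat` (the `-c '%G'` format is not available on BSD/macOS), and `socket_group` and `in_group` are hypothetical helper names made up for this example.

```shell
#!/usr/bin/env bash
# socket_group and in_group are hypothetical helpers for illustration.

# Print the name of the group that owns the given path (GNU stat).
socket_group() {
    stat -c '%G' "$1"
}

# Succeed if the current process belongs to the given group.
# `id -nG` lists the groups of the running shell, so a freshly added
# group only shows up after logging in again or running `newgrp`.
in_group() {
    id -nG | tr ' ' '\n' | grep -qxF -- "$1"
}

sock=/var/run/docker.sock
if [ -e "$sock" ]; then
    grp=$(socket_group "$sock")
    if in_group "$grp"; then
        echo "already in group $grp"
    else
        echo "not in group $grp; try: sudo gpasswd -a $(id -un) $grp"
    fi
else
    echo "$sock not found"
fi
```

The script only prints a suggestion; it does not change any group membership itself.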
Re: [elasticluster] SLURM is not installed after cluster setup
Ah, no, wait! You should *not* run `elasticluster.sh` through `sudo`!
Otherwise you'll be running as root in your home directory, which
screws all permissions up...

Can you please run the following commands?

    # fix permissions
    chown -R $USER $HOME/.ansible $HOME/.ssh $HOME/.elasticluster

    # run elasticluster
    ./elasticluster.sh -vvv start slurm-on-gce

Ciao,
R
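Before or after running `chown -R`, it may help to see which entries were actually left behind with a different owner by an earlier `sudo` run. A minimal sketch follows; `foreign_owned` is a hypothetical helper name, not something ElastiCluster provides.

```shell
#!/usr/bin/env bash
# foreign_owned is a hypothetical helper for illustration.
# It lists every entry under the given directories that is NOT
# owned by the current user (for example, files created as root).
foreign_owned() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            # find accepts a numeric user ID for -user
            find "$dir" ! -user "$(id -u)" -print
        fi
    done
    return 0
}

foreign_owned "$HOME/.ansible" "$HOME/.ssh" "$HOME/.elasticluster"
```

Empty output means nothing under those directories is owned by another account.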
Re: [elasticluster] SLURM is not installed after cluster setup
Dear Orxan,

there seems to be an error with the Docker image; according to this
log line, the Ansible configuration system did not run at all:

    ansible.errors.AnsibleError: Unable to create local directories(/home/.ansible/tmp): [Errno 13] Permission denied: '/home/.ansible'

This is definitely a bug. I'll try to correct it later today or
tomorrow, and push a new Docker image.

Ciao,
R
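The path in that error is `/home/.ansible/tmp` rather than `/home/orhan/.ansible/tmp`, while the traceback shows the default value being `~/.ansible/tmp`. One plausible reading, which the thread itself does not confirm, is that `HOME` was set to `/home` inside the container. The sketch below only illustrates how tilde expansion follows `HOME`; `resolve_local_tmp` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# resolve_local_tmp is a hypothetical helper for illustration.
# It shows what `~/.ansible/tmp` expands to for a given HOME value,
# using the shell's own tilde expansion in a child process.
resolve_local_tmp() {
    HOME=$1 sh -c 'echo ~/.ansible/tmp'
}

resolve_local_tmp /home/orhan   # prints /home/orhan/.ansible/tmp
resolve_local_tmp /home         # prints /home/.ansible/tmp
```

If that reading is right, fixing permissions on the host directories would not help, since an unprivileged user normally cannot create `/home/.ansible`.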