Re: [elasticluster] SLURM is not installed after cluster setup

2018-02-04 Thread Orxan Shibliyev
Initially permissions were like this:

drwxrwxr-x 2 orhan orhan 4096 Şub  3 21:24 /home/orhan/.ansible
drwxrwxr-x 3 orhan orhan 4096 Şub  4 16:15 /home/orhan/.elasticluster
drwx------ 2 orhan orhan 4096 Oca 29 19:57 /home/orhan/.ssh

After the commands it became:

drwxrwxrwx 2 orhan orhan 4096 Şub  3 21:24 /home/orhan/.ansible
drwxrwxrwx 3 orhan orhan 4096 Şub  4 16:15 /home/orhan/.elasticluster
drwx---rwx 2 orhan orhan 4096 Oca 29 19:57 /home/orhan/.ssh

However, that Errno 13 is still there. Error message is as follows:

'import sitecustomize' failed; use -v for traceback
Traceback (most recent call last):
  File "/usr/local/bin/ansible-playbook", line 43, in <module>
    import ansible.constants as C
  File "/usr/local/lib/python2.7/site-packages/ansible/constants.py", line 202, in <module>
    DEFAULT_LOCAL_TMP = get_config(p, DEFAULTS, 'local_tmp', 'ANSIBLE_LOCAL_TEMP', '~/.ansible/tmp', value_type='tmppath')
  File "/usr/local/lib/python2.7/site-packages/ansible/constants.py", line 109, in get_config
    makedirs_safe(value, 0o700)
  File "/usr/local/lib/python2.7/site-packages/ansible/utils/path.py", line 71, in makedirs_safe
    raise AnsibleError("Unable to create local directories(%s): %s" % (to_native(rpath), to_native(e)))
ansible.errors.AnsibleError: Unable to create local directories(/home/.ansible/tmp): [Errno 13] Permission denied: '/home/.ansible'
2018-02-04 15:56:38 cfeda8a7b8b3 gc3.elasticluster[1] ERROR Command
`ansible-playbook /home/elasticluster/share/playbooks/site.yml
--inventory=/home/orhan/.elasticluster/storage/slurm-on-gce.inventory
--become --become-user=root -vv` failed with exit code 1.
2018-02-04 15:56:38 cfeda8a7b8b3 gc3.elasticluster[1] ERROR Check the
output lines above for additional information on this error.
2018-02-04 15:56:38 cfeda8a7b8b3 gc3.elasticluster[1] ERROR The cluster has
likely *not* been configured correctly. You may need to re-run
`elasticluster setup` or fix the playbooks.
2018-02-04 15:56:38 cfeda8a7b8b3 gc3.elasticluster[1] WARNING Cluster
`slurm-on-gce` not yet configured. Please, re-run `elasticluster setup
slurm-on-gce` and/or check your configuration

Orhan

On Sun, Feb 4, 2018 at 3:36 PM, Riccardo Murri wrote:

> Dear Orxan,
>
> the following subdirectories of your home directory should be owned
> and writable by your Linux account (which is `rmurri` in my case):
>
>  $ ls -ld $HOME/.ansible $HOME/.elasticluster $HOME/.ssh
> drwxrwxr-x 5 rmurri rmurri 4096 feb  2  2015 /home/rmurri/.ansible
> drwxrwxr-x 3 rmurri rmurri 4096 feb  3 21:15 /home/rmurri/.elasticluster
> drwxr-xr-x 3 rmurri rmurri 4096 gen 19 16:29 /home/rmurri/.ssh
>
> If they aren't, try running the following command to fix the permissions
>
> sudo chown -v -R $(whoami) $HOME/.ansible $HOME/.elasticluster $HOME/.ssh
> sudo chmod -v o+rwX $HOME/.ansible $HOME/.elasticluster $HOME/.ssh
>
> If it still doesn't work, please post the output of the above two
> commands along with error message produced by ElastiCluster.
>
> Ciao,
> R
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticluster" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticluster+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [elasticluster] SLURM is not installed after cluster setup

2018-02-04 Thread Orxan Shibliyev
The `sudo` issue is solved but [Errno 13] is still there. Output is
attached.

Orhan

On Sun, Feb 4, 2018 at 2:31 PM, Riccardo Murri wrote:

> 2018-02-04 12:15 GMT+01:00 Orxan Shibliyev:
> > The second command gave:
> >
> > orhan@orhan-MS-7850:~$ ./elasticluster.sh -vvv start slurm-on-gce
> > docker: Got permission denied while trying to connect to the Docker daemon
> > socket at unix:///var/run/docker.sock: Post
> > http://%2Fvar%2Frun%2Fdocker.sock/v1.31/containers/create: dial unix
> > /var/run/docker.sock: connect: permission denied.
> >
>
> Then you probably need to add yourself to the `docker` group:
>
> sudo gpasswd -a $(whoami) docker
>
> Note: replace `docker` above with whatever group owns the socket
> `/var/run/docker.sock`
>
> You might need to log out and back in for the group change
> to be picked up; or run `newgrp docker` to get a shell with the
> correct permissions.
>
> Please let me know if it works, so I can automate this in the
> `elasticluster.sh` script.
>
> Ciao,
> R
>

orhan@orhan-MS-7850:~$ ./elasticluster.sh -vvv start slurm-on-gce
2018-02-04 14:54:34 41e0a6cea578 gc3.elasticluster[1] DEBUG Checking section 
`cluster/slurm-on-gce` ...
2018-02-04 14:54:34 41e0a6cea578 gc3.elasticluster[1] DEBUG Checking section 
`cluster/gridengine-on-gce` ...
2018-02-04 14:54:34 41e0a6cea578 gc3.elasticluster[1] DEBUG Checking section 
`login/google` ...
2018-02-04 14:54:34 41e0a6cea578 gc3.elasticluster[1] DEBUG Checking section 
`setup/gridengine` ...
2018-02-04 14:54:34 41e0a6cea578 gc3.elasticluster[1] DEBUG Checking section 
`setup/slurm` ...
2018-02-04 14:54:34 41e0a6cea578 gc3.elasticluster[1] DEBUG Checking section 
`setup/pbs` ...
2018-02-04 14:54:34 41e0a6cea578 gc3.elasticluster[1] DEBUG Checking section 
`cloud/google` ...
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] DEBUG Using class  from module  to 
instanciate provider 'google'
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] DEBUG Using class  from module 
 to instanciate provider 'ansible'
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] DEBUG setting variable 
multiuser_cluster=yes for node kind compute
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] DEBUG setting variable 
multiuser_cluster=yes for node kind frontend
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] DEBUG setting variable 
multiuser_cluster=yes for node kind submit
Starting cluster `slurm-on-gce` with:
* 1 frontend nodes.
* 2 compute nodes.
(This may take a while...)
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] INFO Starting cluster 
nodes ...
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] DEBUG Note: starting 3 
nodes concurrently.
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] DEBUG _start_node: 
working on node `frontend001`
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] INFO Starting node 
`frontend001` from image `ubuntu-1604-xenial-v20180126` with flavor 
n1-standard-1 ...
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] DEBUG _start_node: 
working on node `compute002`
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] DEBUG _start_node: 
working on node `compute001`
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] INFO Starting node 
`compute002` from image `ubuntu-1604-xenial-v20180126` with flavor 
n1-standard-1 ...
2018-02-04 14:54:35 41e0a6cea578 gc3.elasticluster[1] INFO Starting node 
`compute001` from image `ubuntu-1604-xenial-v20180126` with flavor 
n1-standard-1 ...
2018-02-04 14:54:47 41e0a6cea578 gc3.elasticluster[1] DEBUG Node `compute002` 
has instance ID `slurm-on-gce-compute002`
2018-02-04 14:54:47 41e0a6cea578 gc3.elasticluster[1] INFO Node `compute002` 
has been started.
2018-02-04 14:55:16 41e0a6cea578 gc3.elasticluster[1] DEBUG Node `frontend001` 
has instance ID `slurm-on-gce-frontend001`
2018-02-04 14:55:16 41e0a6cea578 gc3.elasticluster[1] INFO Node `frontend001` 
has been started.
2018-02-04 14:55:20 41e0a6cea578 gc3.elasticluster[1] DEBUG Node `compute001` 
has instance ID `slurm-on-gce-compute001`
2018-02-04 14:55:20 41e0a6cea578 gc3.elasticluster[1] INFO Node `compute001` 
has been started.
2018-02-04 14:55:20 41e0a6cea578 gc3.elasticluster[1] DEBUG Getting information 
for instance slurm-on-gce-compute002
2018-02-04 14:55:20 41e0a6cea578 gc3.elasticluster[1] DEBUG node `compute002` 
(instance id slurm-on-gce-compute002) is up.
2018-02-04 14:55:21 41e0a6cea578 gc3.elasticluster[1] DEBUG Getting information 
for instance slurm-on-gce-frontend001
2018-02-04 14:55:21 41e0a6cea578 gc3.elasticluster[1] DEBUG node `frontend001` 
(instance id slurm-on-gce-frontend001) is up.
2018-02-04 14:55:21 41e0a6cea578 

Re: [elasticluster] SLURM is not installed after cluster setup

2018-02-04 Thread Riccardo Murri
2018-02-04 12:15 GMT+01:00 Orxan Shibliyev:
> The second command gave:
>
> orhan@orhan-MS-7850:~$ ./elasticluster.sh -vvv start slurm-on-gce
> docker: Got permission denied while trying to connect to the Docker daemon
> socket at unix:///var/run/docker.sock: Post
> http://%2Fvar%2Frun%2Fdocker.sock/v1.31/containers/create: dial unix
> /var/run/docker.sock: connect: permission denied.
>

Then you probably need to add yourself to the `docker` group:

sudo gpasswd -a $(whoami) docker

Note: replace `docker` above with whatever group owns the socket
`/var/run/docker.sock`

You might need to log out and back in for the group change
to be picked up; or run `newgrp docker` to get a shell with the
correct permissions.

Please let me know if it works, so I can automate this in the
`elasticluster.sh` script.
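
A minimal sketch for checking the socket's group before changing any
membership. It assumes GNU `stat` (for the `-c %G` option) and the default
socket path `/var/run/docker.sock`; the helper name `in_group` is made up
for illustration and is not part of Docker or ElastiCluster.

```shell
#!/bin/sh
# in_group NAME GROUP...: succeed if NAME appears among the remaining arguments.
# (Hypothetical helper, for illustration only.)
in_group() {
    wanted=$1
    shift
    for g in "$@"; do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

sock=/var/run/docker.sock   # assumed default location of the Docker socket
if [ -S "$sock" ]; then
    # GNU stat: print the name of the group that owns the socket
    owner_group=$(stat -c %G "$sock")
    # `id -nG` lists the groups of the *current* session, space-separated;
    # it is left unquoted on purpose so each group becomes one argument.
    if in_group "$owner_group" $(id -nG); then
        echo "already in group '$owner_group'"
    else
        echo "not in group '$owner_group'; try: sudo gpasswd -a $(whoami) $owner_group"
    fi
else
    echo "no socket at $sock"
fi
```

Because `id -nG` reports the groups of the running session, it should keep
saying "not in group" after `gpasswd` until you log in again or start a
`newgrp` shell.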

Ciao,
R



Re: [elasticluster] SLURM is not installed after cluster setup

2018-02-04 Thread Riccardo Murri
Ah, no, wait!  You should *not* run `elasticluster.sh` through `sudo`!
Otherwise you'll be running as root in your home directory, which
screws all permissions up...

Can you please run the following commands?

# fix permissions (`sudo` is needed if the earlier `sudo` run left root-owned files)
sudo chown -R $USER $HOME/.ansible $HOME/.ssh $HOME/.elasticluster

# run elasticluster
./elasticluster.sh -vvv start slurm-on-gce
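
To see whether an earlier `sudo` run left files behind that your account does
not own, something like the following sketch may help. The helper name
`not_owned_by` is hypothetical; it relies only on the standard `find ! -user`
test and skips directories that do not exist.

```shell
#!/bin/sh
# not_owned_by UID PATH...: print every file at or below each existing PATH
# that is NOT owned by the given user ID. (Hypothetical helper.)
not_owned_by() {
    owner=$1
    shift
    for p in "$@"; do
        [ -e "$p" ] || continue          # ignore paths that are not there
        find "$p" ! -user "$owner" -print
    done
}

# Check the three directories mentioned in this thread for the current user.
not_owned_by "$(id -u)" "$HOME/.ansible" "$HOME/.ssh" "$HOME/.elasticluster"
```

No output means everything under those directories already belongs to you;
any path that is printed is a candidate for the `chown` above.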

Ciao,
R



Re: [elasticluster] SLURM is not installed after cluster setup

2018-02-04 Thread Riccardo Murri
Dear Orxan,

there seems to be an error with the Docker image; according to this
log line, the Ansible configuration system did not run at all:

ansible.errors.AnsibleError: Unable to create local
directories(/home/.ansible/tmp): [Errno 13] Permission denied:
'/home/.ansible'

This is definitely a bug. I'll try to correct it later today or
tomorrow, and push a new Docker image.
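
Note that the path in the error is `/home/.ansible/tmp`, not
`/home/orhan/.ansible/tmp`. That is what the default `~/.ansible/tmp` becomes
when the home directory is taken to be `/home`. The sketch below only
illustrates that expansion in a plain shell; how `HOME` was actually set
inside the container is an assumption, not something this thread confirms,
and `expand_local_tmp` is a made-up helper that does not call Ansible.

```shell
#!/bin/sh
# expand_local_tmp HOME_VALUE: print what ~/.ansible/tmp expands to when
# HOME has the given value. (Hypothetical helper; mimics tilde expansion only.)
expand_local_tmp() {
    HOME=$1 sh -c 'echo ~/.ansible/tmp'
}

expand_local_tmp /home/orhan   # prints /home/orhan/.ansible/tmp
expand_local_tmp /home         # prints /home/.ansible/tmp (the path in the error)
```

If that is indeed the cause, fixing permissions on `/home/orhan/.ansible`
on the host would not change the outcome, which matches what was observed.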

Ciao,
R
