Re: [galaxy-dev] CloudMan launch error

2015-10-16 Thread Aaron Darling
Thanks for the reply, Enis,

I had some off-list discussion with Simon Gladman about this. Simon
indicated that OpenStack security groups do not seem to be propagating
to worker nodes at startup. Manually configuring the worker node
security group and then rebooting the master appears to be a workaround.
Hopefully this is only a temporary issue, and obviously one that's
specific to NeCTAR.
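
One way to check whether a security group is blocking the worker is to test plain TCP reachability of the master's AMQP port from the worker. The following is a minimal Python 3 sketch, not part of CloudMan; the function name is made up for illustration, and it assumes the master listens on the standard AMQP port 5672, which the thread does not confirm.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    A timeout (rather than an immediate refusal) usually means packets are
    being dropped on the way, which is what a missing security group rule
    tends to look like.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Run on the worker, `can_connect("<master private IP>", 5672)` returning False before the security group is fixed and True afterwards would be consistent with the explanation above. The IP placeholder and the port number are assumptions to be replaced with real values.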

Cheers,
-Aaron

On Thu, 2015-10-15 at 21:59 -0400, Enis Afgan wrote:
> Hi Aaron,
> 
> Does the "AMQP Connection Failure:" error continue indefinitely?
> It would be helpful to see the log files from the time a worker is
> being added as well as the CloudMan logs from a worker node. It
> appears you're launching GVL v3.04; however, the default is now 4.0.0.
> Have you tried keeping the defaults?
> 
> 
> 
> 
> PS
> I'm cc'ing h...@genome.edu.au as the default help mailing list for the
> GVL.
> 
> 
> On Wed, Oct 14, 2015 at 11:31 PM, Aaron Darling wrote:
> 
> Hi all, I'm new to CloudMan, and trying to launch a cluster
> via GVL (3 or 4) on NeCTAR.
> I'm able to get a head node running without trouble via
> launch.genome.edu.au, but launching worker nodes from the
> CloudMan interface appears to fail. CloudMan reboots the
> worker repeatedly before giving up. I logged into the worker
> to inspect log files and found the following, but it's not
> obvious to me what to do next. Hope this is something simple?
> 
> 
> ubuntu@server-fbbd9a10-fb58-48d8-89cd-5ddd22821648:~$
> cat /mnt/cm/paster.log
> Python version:  (2, 7)
> Image configuration suports: {'apps': ['cloudman', 'galaxy']}
> 2015-10-15 14:15:24,973 DEBUG app:73   Initializing app
> 2015-10-15 14:15:24,973 DEBUG ec2:109  Gathering instance zone, attempt 0
> 2015-10-15 14:15:25,140 DEBUG ec2:115  Instance zone is 'NCI'
> 2015-10-15 14:15:25,140 DEBUG ec2:44   Gathering instance ami, attempt 0
> 2015-10-15 14:15:25,459 DEBUG app:76   Running on 'openstack' type of cloud in zone 'NCI' using image 'ami-3484'.
> 2015-10-15 14:15:25,459 DEBUG app:98   Getting pd.yaml
> 2015-10-15 14:15:25,459 DEBUG  openstack:99   Establishing a boto Swift connection.
> 2015-10-15 14:15:25,459 DEBUG  openstack:109  Got boto Swift connection.
> 2015-10-15 14:15:26,112 DEBUG   misc:578  Retrieved file 'persistent_data.yaml' from bucket 'cm-45b53bf5024e962bd27e15fd81fcc07d' on host 'swift.rc.nectar.org.au' to 'pd.yaml'.
> 2015-10-15 14:15:26,118 INFO app:119  Worker starting
> 2015-10-15 14:15:26,136 DEBUG ec2:76   Gathering instance id, attempt 0
> 2015-10-15 14:15:26,338 DEBUG ec2:82   Instance ID is 'i-0019a2fc'
> 2015-10-15 14:16:29,488 DEBUG   comm:134  AMQP Connection Failure:  [Errno 110] Connection timed out
> 2015-10-15 14:16:29,492 DEBUG   base:57   Enabling 'root' controller, class: CM
> 2015-10-15 14:16:29,494 DEBUG   buildapp:93   Enabling 'httpexceptions' middleware
> 2015-10-15 14:16:29,496 DEBUG   buildapp:99   Enabling 'recursive' middleware
> 2015-10-15 14:16:29,499 DEBUG   buildapp:119  Enabling 'print debug' middleware
> 2015-10-15 14:16:29,506 DEBUG   buildapp:133  Enabling 'error' middleware
> 2015-10-15 14:16:29,507 DEBUG   buildapp:143  Enabling 'config' middleware
> 2015-10-15 14:16:29,508 DEBUG   buildapp:147  Enabling 'x-forwarded-host' middleware
> 2015-10-15 14:16:29,517 DEBUG   misc:768  'cp /etc/hosts /etc/hosts.orig' command OK
> 2015-10-15 14:16:29,528 DEBUG   misc:768  'cp /tmp/tmpuV3NTJ /etc/hosts' command OK
> Starting server in PID 2825.
> 2015-10-15 14:16:29,533 DEBUG   misc:768  'chmod 644 /etc/hosts' command OK
> 2015-10-15 14:16:29,533 DEBUG worker:558  Trying to setup AMQP connection; conn = ' object at 0x2743950>'
> serving on 0.0.0.0:42284 view at http://127.0.0.1:42284
> 2015-10-15 14:17:32,656 DEBUG   comm:134  AMQP Connection Failure:  [Errno 110] Connection timed out
> 2015-10-15 14:17:32,656 DEBUG worker:558  Trying to setup AMQP connection; conn = ' object at 0x2743950>'
> 2015-10-15 14:18:35,760 DEBUG   comm:134  AMQP Connection Failure:  [Errno 110] Connection timed out
> 2015-10-15 14:18:35,760 DEBUG worker:558  Trying to setup AMQP

[galaxy-dev] CloudMan launch error

2015-10-15 Thread Aaron Darling
Hi all, I'm new to CloudMan, and trying to launch a cluster via GVL (3
or 4) on NeCTAR.
I'm able to get a head node running without trouble via
launch.genome.edu.au, but launching worker nodes from the CloudMan
interface appears to fail. CloudMan reboots the worker repeatedly before
giving up. I logged into the worker to inspect log files and found the
following, but it's not obvious to me what to do next. Hope this is
something simple?


ubuntu@server-fbbd9a10-fb58-48d8-89cd-5ddd22821648:~$
cat /mnt/cm/paster.log
Python version:  (2, 7)
Image configuration suports: {'apps': ['cloudman', 'galaxy']}
2015-10-15 14:15:24,973 DEBUG app:73   Initializing app
2015-10-15 14:15:24,973 DEBUG ec2:109  Gathering instance zone, attempt 0
2015-10-15 14:15:25,140 DEBUG ec2:115  Instance zone is 'NCI'
2015-10-15 14:15:25,140 DEBUG ec2:44   Gathering instance ami, attempt 0
2015-10-15 14:15:25,459 DEBUG app:76   Running on 'openstack' type of cloud in zone 'NCI' using image 'ami-3484'.
2015-10-15 14:15:25,459 DEBUG app:98   Getting pd.yaml
2015-10-15 14:15:25,459 DEBUG  openstack:99   Establishing a boto Swift connection.
2015-10-15 14:15:25,459 DEBUG  openstack:109  Got boto Swift connection.
2015-10-15 14:15:26,112 DEBUG   misc:578  Retrieved file 'persistent_data.yaml' from bucket 'cm-45b53bf5024e962bd27e15fd81fcc07d' on host 'swift.rc.nectar.org.au' to 'pd.yaml'.
2015-10-15 14:15:26,118 INFO app:119  Worker starting
2015-10-15 14:15:26,136 DEBUG ec2:76   Gathering instance id, attempt 0
2015-10-15 14:15:26,338 DEBUG ec2:82   Instance ID is 'i-0019a2fc'
2015-10-15 14:16:29,488 DEBUG   comm:134  AMQP Connection Failure:  [Errno 110] Connection timed out
2015-10-15 14:16:29,492 DEBUG   base:57   Enabling 'root' controller, class: CM
2015-10-15 14:16:29,494 DEBUG   buildapp:93   Enabling 'httpexceptions' middleware
2015-10-15 14:16:29,496 DEBUG   buildapp:99   Enabling 'recursive' middleware
2015-10-15 14:16:29,499 DEBUG   buildapp:119  Enabling 'print debug' middleware
2015-10-15 14:16:29,506 DEBUG   buildapp:133  Enabling 'error' middleware
2015-10-15 14:16:29,507 DEBUG   buildapp:143  Enabling 'config' middleware
2015-10-15 14:16:29,508 DEBUG   buildapp:147  Enabling 'x-forwarded-host' middleware
2015-10-15 14:16:29,517 DEBUG   misc:768  'cp /etc/hosts /etc/hosts.orig' command OK
2015-10-15 14:16:29,528 DEBUG   misc:768  'cp /tmp/tmpuV3NTJ /etc/hosts' command OK
Starting server in PID 2825.
2015-10-15 14:16:29,533 DEBUG   misc:768  'chmod 644 /etc/hosts' command OK
2015-10-15 14:16:29,533 DEBUG worker:558  Trying to setup AMQP connection; conn = ''
serving on 0.0.0.0:42284 view at http://127.0.0.1:42284
2015-10-15 14:17:32,656 DEBUG   comm:134  AMQP Connection Failure:  [Errno 110] Connection timed out
2015-10-15 14:17:32,656 DEBUG worker:558  Trying to setup AMQP connection; conn = ''
2015-10-15 14:18:35,760 DEBUG   comm:134  AMQP Connection Failure:  [Errno 110] Connection timed out
2015-10-15 14:18:35,760 DEBUG worker:558  Trying to setup AMQP connection; conn = ''
2015-10-15 14:19:38,864 DEBUG   comm:134  AMQP Connection Failure:  [Errno 110] Connection timed out
2015-10-15 14:19:38,864 DEBUG worker:558  Trying to setup AMQP connection; conn = ''
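
The spacing of the failures in the log above can be checked with a few lines of Python. This is only an illustrative sketch (the function names are made up, and the sample lines are the four failure lines from the log with each entry joined onto one line); it pulls out the timestamps of the "AMQP Connection Failure" entries and prints the gaps between them.

```python
import re
from datetime import datetime

# Timestamp at the start of a paster.log line, e.g. "2015-10-15 14:16:29,488".
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})")

# The four failure entries from the log, one entry per line.
SAMPLE_LINES = [
    "2015-10-15 14:16:29,488 DEBUG   comm:134  AMQP Connection Failure:  [Errno 110] Connection timed out",
    "2015-10-15 14:17:32,656 DEBUG   comm:134  AMQP Connection Failure:  [Errno 110] Connection timed out",
    "2015-10-15 14:18:35,760 DEBUG   comm:134  AMQP Connection Failure:  [Errno 110] Connection timed out",
    "2015-10-15 14:19:38,864 DEBUG   comm:134  AMQP Connection Failure:  [Errno 110] Connection timed out",
]


def amqp_failure_times(lines):
    """Return the timestamps of lines that report an AMQP connection failure."""
    times = []
    for line in lines:
        match = TIMESTAMP.match(line)
        # "AMQP Connection" (capital C) only appears on the failure lines,
        # not on the "Trying to setup AMQP connection" retries.
        if match and "AMQP Connection" in line:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f"))
    return times


def gaps_in_seconds(times):
    """Return the gaps between consecutive timestamps, in seconds."""
    return [round((b - a).total_seconds(), 3) for a, b in zip(times, times[1:])]


print(gaps_in_seconds(amqp_failure_times(SAMPLE_LINES)))
```

The failures come about 63 seconds apart, and Errno 110 is ETIMEDOUT on Linux, so each attempt waits out a full connect timeout instead of being refused. That pattern fits traffic being silently dropped between the worker and the master, though the log alone does not prove the cause.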




ubuntu@server-fbbd9a10-fb58-48d8-89cd-5ddd22821648:~$
cat /tmp/cm/cm_boot.py.log 
2015-10-15 14:23:43,713 DEBUG  cm_boot:430 - virtual-burrito seems to be installed
2015-10-15 14:23:44,037 DEBUG  cm_boot:25  - Successfully ran '/bin/bash -l -c 'VIRTUALENVWRAPPER_LOG_DIR=/tmp/; HOME=/home/ubuntu; . /home/ubuntu/.venvburrito/startup.sh; lsvirtualenv | grep CM''
2015-10-15 14:23:44,037 DEBUG  cm_boot:433 - 'CM' virtualenv found
2015-10-15 14:23:44,049 DEBUG  cm_boot:493 - Fixing /etc/hosts on NeCTAR
2015-10-15 14:23:44,930 INFO   cm_boot:244 - << Starting nginx >>
2015-10-15 14:23:44,931 DEBUG  cm_boot:169 - Reconfiguring nginx conf
2015-10-15 14:23:44,931 INFO   cm_boot:286 - Attempting to configure max_client_body_size in /usr/nginx/conf/nginx.conf
2015-10-15 14:23:44,934 DEBUG  cm_boot:25  - Successfully ran 'cp /usr/nginx/conf/nginx.conf /tmp/cm/original_nginx.conf'
2015-10-15 14:23:44,936 DEBUG  cm_boot:25  - Successfully ran 'uniq /tmp/cm/original_nginx.conf > /usr/nginx/conf/nginx.conf'
2015-10-15 14:23:44,937 DEBUG  cm_boot:25  - Successfully ran 'grep 'client_max_body_size' /usr/nginx/conf/nginx.conf'
2015-10-15 14:23:44,938 DEBUG  cm_boot:265 - Creating tmp dir for nginx /mnt/galaxy/upload_store
2015-10-15 14:23:44,938 DEBUG  cm_boot:68  - Checking /usr/local/sbin/nginx
2015-10-15 14:23:44,938 DEBUG  cm_boot:58  - /usr/local/sbin/nginx is file: False; it's executable: False
2015-10-15 14:23:44,938 DEBUG  cm_boot:68  - Checking /usr/local/bin/nginx
2015-10-15 14:23:44,938 DEBUG  cm_boot:58  - /usr/local/bin/nginx is file: False; 

Re: [galaxy-dev] cloudman launch

2015-07-02 Thread Alexander Vowinkel
Hi Enis,

thank you. I will try it next time I recreate the cluster.

I have some more things:

A) I wanted to tell you that I did the update of the galaxyFS
with the playbook, by using the role cm.filesystem only.
It worked -  thanks!

B) How could I mount another volume on the cluster at startup?

 - name: stuff
   snap_id: snap-xyz

Would that work?
E.g., when I want this snapshot to be mounted as /mnt/stuff

C) How can I use the dev branch of galaxy with cloudman?
There are some things only in dev I will need. How would
that be possible?

D) There is an email from me to the list from two days ago.
Could you look at that?
It's not critical for me, but might lead to a bug.


I think that's it so far.
If I have more, I'll come back to you ;)


2015-07-02 11:32 GMT-05:00 Enis Afgan enis.af...@irb.hr:

 Hi Alexander,
 I traced the code and tried it out and you can override what is defined in
 the flavor by providing the details in the 'Extra user data' field on the
 launch form. For example:

 filesystem_templates:
   - name: galaxy
     roles: galaxyTools,galaxyData
     archive_url: http://s3.amazonaws.com/cloudman/fs-archives/galaxyFS-dev-latest.tar.gz
     type: archive
     size: 10
   - name: galaxyIndices
     roles: galaxyIndices
     snap_id: snap-4b20f451
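
Because the nesting of this user data is easy to get wrong when pasting it into the launch form, here is a small Python 3 sketch that generates the block from plain dictionaries. The function name is hypothetical and this is not CloudMan code; the keys and values are the ones from the example above. It is a deliberately simple renderer, not a general YAML serializer, so values containing special characters would need quoting by hand.

```python
def render_filesystem_templates(templates):
    """Render a list of template dicts as a 'filesystem_templates' YAML block."""
    lines = ["filesystem_templates:"]
    for template in templates:
        if not template:
            raise ValueError("each template needs at least one key")
        items = list(template.items())
        # The first key opens the list item; the remaining keys align under it.
        first_key, first_value = items[0]
        lines.append("  - {}: {}".format(first_key, first_value))
        for key, value in items[1:]:
            lines.append("    {}: {}".format(key, value))
    return "\n".join(lines) + "\n"


# The two templates from the example above.
EXAMPLE = [
    {
        "name": "galaxy",
        "roles": "galaxyTools,galaxyData",
        "archive_url": "http://s3.amazonaws.com/cloudman/fs-archives/galaxyFS-dev-latest.tar.gz",
        "type": "archive",
        "size": 10,
    },
    {
        "name": "galaxyIndices",
        "roles": "galaxyIndices",
        "snap_id": "snap-4b20f451",
    },
]

print(render_filesystem_templates(EXAMPLE))
```

The printed text can be pasted into the 'Extra user data' field. Each list item starts with "- name:" and its remaining keys are indented to line up under "name".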

 If you want to use your custom bucket (i.e., if you have modified CloudMan
 code in some way), you can then specify it in the 'default bucket' form
 field and place cm.tar.gz and cm_boot.py in that bucket but no need to
 provide snaps.yaml.
 If you do not want to use a custom bucket, you can use the current
 'Standard' flavor, which defaults to cloudman-test bucket with CloudMan
 from June 19th (I will not be updating anything about it until after GCC).
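
Placing the two files in a custom bucket could be scripted roughly as follows. This is a hedged Python 3 sketch, not a CloudMan tool: it assumes an S3-compatible store and a boto3 S3 client (whose `upload_file(Filename, Bucket, Key)` method is the only library call used), and the function name and bucket name are hypothetical. Whether the objects need particular permissions is not stated in this thread.

```python
import os

# File names taken from the message above.
CLOUDMAN_FILES = ("cm.tar.gz", "cm_boot.py")


def upload_cloudman_files(s3_client, bucket, directory="."):
    """Upload cm.tar.gz and cm_boot.py from directory into the given bucket.

    s3_client is expected to behave like a boto3 S3 client, i.e. to offer
    upload_file(Filename, Bucket, Key).
    """
    uploaded = []
    for name in CLOUDMAN_FILES:
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            # Fail early rather than launching with an incomplete bucket.
            raise FileNotFoundError(path)
        s3_client.upload_file(path, bucket, name)
        uploaded.append(name)
    return uploaded
```

With boto3 installed, a call might look like `upload_cloudman_files(boto3.client("s3"), "my-cloudman-bucket", "/path/to/build")`, after which the bucket name goes into the 'default bucket' field on the launch form.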

 Hope this helps and let me know if you have more questions,
 Enis

 On Thu, Jul 2, 2015 at 9:29 AM, Enis Afgan enis.af...@irb.hr wrote:

 Hi Alexander,
 Let me investigate this a bit more and get back to you because I was
 under the impression I had fixed this. In the meantime, I updated your
 flavor to specify snap-6f0b211d.



 On Wed, Jul 1, 2015 at 4:44 PM, Alexander Vowinkel 
 vowinkel.alexan...@gmail.com wrote:

 Hi Enis,

 I am trying to launch my cloudman instance with this launcher again:
 https://launch.usegalaxy.org/launch

 You created the flavor for me because it wouldn't load my
 galaxyIndices from the snaps.yaml in my default bucket.

 I thought this problem was fixed, but it still creates a galaxyIndices
 from the default snapshot (600GB) when using the Standard flavor.

 When I use my flavor, it loads the snapshot, but it is a little outdated
 ;)

 So my question: Is this behaviour expected? If not, are you going to
 fix it soon? If you are not, could you update my flavor so that
 galaxyIndices gets created from snap-6f0b211d?

 Further:
 Should I update CM? I am still on the version from Jun 22.
 I am still not sure what I need to change (my snaps.yaml) in order
 to have my clean cluster creation still functioning.

 Thanks,
 Alexander




___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/

Re: [galaxy-dev] cloudman launch

2015-07-02 Thread Enis Afgan



 A) I wanted to tell you that I did the update of the galaxyFS
 with the playbook, by using the role cm.filesystem only.
 It worked -  thanks!


Excellent!


 B) How could I mount another volume on the cluster at startup?

  - name: stuff
    snap_id: snap-xyz

 Would that work?
 E.g., when I want this snapshot to be mounted as /mnt/stuff


With this commit
https://github.com/galaxyproject/cloudman/commit/b56a6cd77899ac762d34a747da51ac78a14d3da7,
you can (I guess that'll be a good reason to make a move on the next
question :) )


 C) How can I use the dev branch of galaxy with cloudman?
 There are some things only in dev I will need. How would
 that be possible?


You'll need to keep using a custom bucket and place cm.tar.gz and
cm_boot.py into that bucket. I'll send you a couple of scripts that will
make that super easy.


 D) There is an email from me to the list from two days ago.
 Could you look at that?
 It's not critical for me, but might lead to a bug.


Just replied. Thanks.

[galaxy-dev] cloudman launch

2015-07-01 Thread Alexander Vowinkel
Hi Enis,

I am trying to launch my cloudman instance with this launcher again:
https://launch.usegalaxy.org/launch

You created the flavor for me because it wouldn't load my
galaxyIndices from the snaps.yaml in my default bucket.

I thought this problem was fixed, but it still creates a galaxyIndices
from the default snapshot (600GB) when using the Standard flavor.

When I use my flavor, it loads the snapshot, but it is a little outdated ;)

So my question: Is this behaviour expected? If not, are you going to
fix it soon? If you are not, could you update my flavor so that
galaxyIndices gets created from snap-6f0b211d?

Further:
Should I update CM? I am still on the version from Jun 22.
I am still not sure what I need to change (my snaps.yaml) in order
to have my clean cluster creation still functioning.

Thanks,
Alexander