Re: [galaxy-dev] [galaxy-user] Problem using Galaxy on the cloud

2012-03-08 Thread Brad Chapman

Jan;
No problem, sorry I don't have a nicer immediate solution for
you. I appreciate the problem report and hopefully we'll figure out a
clean fix for this.

CloudBioLinux does have Velvet included by default, so you should be
able to use it without needing to compile. Were you having trouble
building, or running it? Velvet and other assemblers are memory hungry,
so the issue you were seeing may be due to running out of memory. 'top'
is a useful way to monitor memory usage from the console.
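
If you would rather log memory from a script than watch 'top', the following is a minimal sketch of one way to do it. It assumes a Linux machine that exposes /proc/meminfo with "Name: value kB" lines; the function names are made up for this example and are not part of Galaxy or CloudBioLinux.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: kilobytes}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name:   12345 kB" lines; skip unit-less counters.
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def percent_memory_free(meminfo):
    """Return MemFree as a percentage of MemTotal."""
    return 100.0 * meminfo["MemFree"] / meminfo["MemTotal"]


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the kernel's memory summary (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

Calling percent_memory_free(read_meminfo()) every few seconds while the assembler runs should show whether free memory drops toward zero before the job stalls.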

If it is a memory issue, AWS does have some high-memory instances you
could try:

http://aws.amazon.com/ec2/instance-types/

I don't have a lot of experience doing assembly so unfortunately don't
have good estimates for memory usage. It might be worth asking on the
Velvet mailing list if you are still running into issues.

Hope this helps,
Brad

 Thanks Brad,
 
 I will see if I can do what you suggest, even if you are talking
 Klingon now :). I really just want to use Velvet, so I may try to
 install it through CloudBioLinux, although last time I tried that I
 seemed to crash everything. I really need to get better at this
 stuff, or move to a place with bioinformatics support.
 
 Many Many Thanks,
 Jan
 
 On Mar 7, 2012, at 9:22 PM, Brad Chapman wrote:
 
 
 Jan;
 Thanks for getting back with all the detailed information. I dug into
 this further and understand what is happening:
 
 - tools/data_source/upload.py calls
  lib/galaxy/datatypes/sniff.py:stream_to_file
 - stream_to_file uses Python's tempfile module
 - tempfile defaults to using /tmp
 - As large files stream in the temporary space fills up, causing the
  issue you are seeing.
 
 The best way to work around this is to have the galaxy user on Amazon
 export TMPDIR to point at a temporary directory on /mnt/galaxyData
 instead of the root filesystem.
 
 I'm hoping that Enis or Dannon might be able to help out with the best
 place to set this in CloudMan to avoid the issue; I've cc'ed them.
 
 If you want to manually fix it to get some work done, you could create a
 directory /mnt/galaxyData/tmp and then symlink /tmp there:
 
 ln -s /mnt/galaxyData/tmp /tmp
 
 Hope this helps and we can come up with a more permanent fix. Thanks
 again,
 Brad
 
 
 Thanks Brad,
 Sorry it has taken me so long to respond.  I had a meeting sort of day. I 
 started another instance to recreate what is happening to me in more detail.  
 I hope this helps.  I tried to color code things so you could follow 
 everything more easily.
 
 This is what I have when I start up Galaxy through BioCloudCentral before I 
 import any files...
 
 Get cloud support with Ubuntu Advantage Cloud Guest
  http://www.ubuntu.com/business/services/cloud
 ubuntu@ip-10-44-78-218:~$ df -h
 Filesystem      Size  Used Avail Use% Mounted on
 /dev/xvda1       20G   13G  6.1G  68% /
 udev            8.4G  4.0K  8.4G   1% /dev
 tmpfs           3.4G  644K  3.4G   1% /run
 none            5.0M     0  5.0M   0% /run/lock
 none            8.4G     0  8.4G   0% /run/shm
 /dev/xvdb       404G  201M  383G   1% /mnt
 
 After starting to import data (pasted 
 https://s3.amazonaws.com/thunnus/BluefinAQ30_shuffled.fastq), it starts 
 filling up immediately
 
 Filesystem      Size  Used Avail Use% Mounted on
 /dev/xvda1       20G   19G     0 100% /
 udev            8.4G  4.0K  8.4G   1% /dev
 tmpfs           3.4G  660K  3.4G   1% /run
 none            5.0M     0  5.0M   0% /run/lock
 none            8.4G     0  8.4G   0% /run/shm
 /dev/xvdb       404G  201M  383G   1% /mnt
 /dev/xvdg1      700G  654G   47G  94% /mnt/galaxyIndices
 /dev/xvdg2       10G  1.7G  8.4G  17% /mnt/galaxyTools
 /dev/xvdg3      500G   81M  500G   1% /mnt/galaxyData
 
 
 Jan McDowell
 The Virginia Institute of Marine Science
 The College of William and Mary
 Department of Fisheries Science
 Phone: 804-684-7263
 Fax: 804-684-7157
 mcdow...@vims.edu
 
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/


Re: [galaxy-dev] [galaxy-user] Problem using Galaxy on the cloud

2012-03-07 Thread Brad Chapman

Jan;

 I appreciate the help; I am in no way a competent UNIX/Linux user. I
 just have 2 shuffled fastq files in an S3 bucket. I pasted the URL for
 these into Galaxy's upload file URL/Text box and that seems to be
 where the trouble started. On my first try loading the data files, I
 got a message saying there was no disk space.

I'm trying to reproduce here but can't seem to figure out how the disk
is filling up. A new filesystem should look like:

$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1       20G   13G  6.1G  68% /
udev            3.7G  4.0K  3.7G   1% /dev
tmpfs           1.5G  660K  1.5G   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            3.7G     0  3.7G   0% /run/shm
/dev/xvdb       414G  201M  393G   1% /mnt
/dev/xvdg1      700G  654G   47G  94% /mnt/galaxyIndices
/dev/xvdg2       10G  1.7G  8.4G  17% /mnt/galaxyTools
/dev/xvdg3      5.0G   81M  5.0G   2% /mnt/galaxyData

with about 6GB of free space on '/'. If you pasted a URL, like:

https://s3.amazonaws.com/chapmanb/example.fastq

into the text box, it should show up as:

/mnt/galaxyData/files/000/dataset_1.dat

in the /mnt/galaxyData mount, which has all of the space for storing
files. So I'm not sure exactly what is filling up your root
partition. Could you try checking a couple of places and see if any of
the filesystem usage is different from what we expect:

$ du -sh /home/ubuntu/
361M   /home/ubuntu/

$ sudo du -sh /var/log/
3.1M   /var/log/
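
The same two checks (how full a filesystem is, and how much a directory tree holds) can be scripted. This is only a rough sketch using the Python standard library; the function names are made up for the example, and the numbers will not match df and du exactly because df excludes reserved blocks and du counts disk blocks rather than file lengths.

```python
import os
import shutil


def percent_used(path):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def tree_size(root):
    """Sum the byte lengths of regular files under `root`, roughly 'du -s'."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):
                try:
                    total += os.path.getsize(full)
                except OSError:
                    pass  # file vanished or is unreadable; skip it
    return total
```

Running percent_used("/") alongside tree_size("/tmp") and tree_size("/home/ubuntu") before and during an upload should show which directory on the root filesystem is growing.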

I'm cc'ing Enis in case he has any other ideas where we might be
accidentally filling up the root filesystem.

Thanks again for the feedback and sorry about the problems,
Brad




 I tried again, and it
 worked.  Then, when I tried to run velveth, it got hung up again with
 the no disk space issue.  I browsed around in the /dev/xvda1 file
 system and there were quite a few files (I pasted them below).  I
 assumed they were files relating to CloudBioLinux and I was not sure
 if I could get rid of any of the files. It was nothing I put there.  I
 started the instance using BioCloudCentral.  Maybe it would be better
 to just start Galaxy the long way (choosing a public AMI under launch
 instance...)?
 
 
 
 ubuntu@ip-10-44-117-85:/$ ls
 bin   etc     initrd.img  lib64       mnt  proc  sbin     sys  var
 boot  export  lib         lost+found  opt  root  selinux  tmp  vmlinuz
 dev   home    lib32       media       pkg  run   srv      usr
 
 
 Thanks again,
 Jan
 
 On Mar 6, 2012, at 8:47 PM, Brad Chapman wrote:
 
  
  Jan;
  Glad to hear you got Galaxy running successfully. It sounds like
  everything is good to go once we sort out the disk space issue.
  
  However, when I try to use NX to get the virtual desktop going I get the
  message usr/bin/nxserver: line 381: echo: write error: No space left
  on device.
  
  Using the df -h command, I get:
  
  Filesystem      Size  Used Avail Use% Mounted on
  /dev/xvda1       20G   19G     0 100% /
  udev            8.4G  4.0K  8.4G   1% /dev
  tmpfs           3.4G  660K  3.4G   1% /run
  none            5.0M     0  5.0M   0% /run/lock
  none            8.4G     0  8.4G   0% /run/shm
  /dev/xvdb       404G  202M  383G   1% /mnt
  /dev/xvdg1      700G  654G   47G  94% /mnt/galaxyIndices
  /dev/xvdg2       10G  1.7G  8.4G  17% /mnt/galaxyTools
  /dev/xvdg3      200G   11G  190G   6% /mnt/galaxyData
  
  
  So, I guess my question as a new user is: How do I point Galaxy and
  CloudBioLinux to all of this unused space?
  
  By default CloudMan will put files into /mnt/galaxyData. However, as
  you noticed the main filesystem got filled up at some point. Could this
  have happened while transferring files over from S3? Are there files in
  your home directory that you could delete or move to /mnt/galaxyData to
  free up space?
  
  CloudBioLinux and CloudMan shouldn't put a large number of files in the
  root directory, but when the root filesystem is full it's going to be
  very unhappy. Once you manually clear up some room there hopefully
  things will run smoother.
  
  If that doesn't help let us know and we can dig into it further. Thanks,
  Brad
 
 


Re: [galaxy-dev] [galaxy-user] Problem using Galaxy on the cloud

2012-03-07 Thread Brad Chapman

Jan;
Thanks for getting back with all the detailed information. I dug into
this further and understand what is happening:

- tools/data_source/upload.py calls
  lib/galaxy/datatypes/sniff.py:stream_to_file
- stream_to_file uses Python's tempfile module
- tempfile defaults to using /tmp
- As large files stream in the temporary space fills up, causing the
  issue you are seeing.
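
To make the chain above concrete, here is a small sketch of streaming data into a temporary file. It is not Galaxy's stream_to_file; stream_to_temp and its tmp_dir parameter are hypothetical names used only to show where the tempfile default comes into play.

```python
import os
import tempfile


def stream_to_temp(chunks, tmp_dir=None):
    """Write an iterable of byte chunks to a new temp file; return its path.

    Hypothetical helper for illustration. With tmp_dir=None, tempfile
    falls back to its default directory, which is usually /tmp on Linux.
    """
    fd, path = tempfile.mkstemp(prefix="upload_", dir=tmp_dir)
    with os.fdopen(fd, "wb") as out:
        for chunk in chunks:
            out.write(chunk)
    return path
```

With no directory given, every byte of a large upload lands under the default temporary directory first, which is why a 20G root filesystem fills up even though /mnt/galaxyData has plenty of room.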

The best way to work around this is to have the galaxy user on Amazon
export TMPDIR to point at a temporary directory on /mnt/galaxyData
instead of the root filesystem.
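
As a quick check that TMPDIR has the intended effect, the sketch below asks Python's tempfile module which directory it would use for a given TMPDIR value. The helper name is made up for this example. Note that tempfile caches its choice in tempfile.tempdir, so the variable has to be set before the Galaxy process starts, or the cache cleared as done here.

```python
import os
import tempfile


def tempdir_with_env(tmpdir_value):
    """Return the directory tempfile would pick if TMPDIR were tmpdir_value."""
    saved_env = os.environ.get("TMPDIR")
    saved_cache = tempfile.tempdir
    try:
        os.environ["TMPDIR"] = tmpdir_value
        # tempfile caches its choice; clear it so TMPDIR is consulted again.
        tempfile.tempdir = None
        return tempfile.gettempdir()
    finally:
        # Restore the process state we changed.
        tempfile.tempdir = saved_cache
        if saved_env is None:
            os.environ.pop("TMPDIR", None)
        else:
            os.environ["TMPDIR"] = saved_env
```

If the directory named in TMPDIR does not exist or is not writable, tempfile silently moves on to its next candidate (such as /tmp), so it is worth creating /mnt/galaxyData/tmp and checking its permissions first.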

I'm hoping that Enis or Dannon might be able to help out with the best
place to set this in CloudMan to avoid the issue; I've cc'ed them.

If you want to manually fix it to get some work done, you could create a
directory /mnt/galaxyData/tmp and then symlink /tmp there:

ln -s /mnt/galaxyData/tmp /tmp
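
One caveat with the command above: if /tmp still exists as a directory, ln -s creates the link inside it (as /tmp/tmp) rather than replacing it, so the old directory has to be moved aside first. The following is a hedged sketch of that sequence; redirect_dir is a made-up helper, it would need root to operate on /tmp, and it is best done while no jobs are running.

```python
import os


def redirect_dir(link_path, target_dir):
    """Make link_path a symlink to target_dir.

    Any existing directory at link_path is moved aside to
    link_path + ".orig" first; an existing symlink is replaced.
    """
    os.makedirs(target_dir, exist_ok=True)
    # World-writable with the sticky bit, the usual mode for /tmp.
    os.chmod(target_dir, 0o1777)
    if os.path.islink(link_path):
        os.remove(link_path)
    elif os.path.exists(link_path):
        os.rename(link_path, link_path + ".orig")
    os.symlink(target_dir, link_path)
```

For the case in this thread the call would be redirect_dir("/tmp", "/mnt/galaxyData/tmp"). The sketch does not handle a leftover /tmp.orig from an earlier run.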

Hope this helps and we can come up with a more permanent fix. Thanks
again,
Brad


 Thanks Brad,
 Sorry it has taken me so long to respond.  I had a meeting sort of day. I 
 started another instance to recreate what is happening to me in more detail.  
 I hope this helps.  I tried to color code things so you could follow 
 everything more easily.
 
 This is what I have when I start up Galaxy through BioCloudCentral before I 
 import any files...
 
 Get cloud support with Ubuntu Advantage Cloud Guest
   http://www.ubuntu.com/business/services/cloud
 ubuntu@ip-10-44-78-218:~$ df -h
 Filesystem      Size  Used Avail Use% Mounted on
 /dev/xvda1       20G   13G  6.1G  68% /
 udev            8.4G  4.0K  8.4G   1% /dev
 tmpfs           3.4G  644K  3.4G   1% /run
 none            5.0M     0  5.0M   0% /run/lock
 none            8.4G     0  8.4G   0% /run/shm
 /dev/xvdb       404G  201M  383G   1% /mnt
 
 After starting to import data (pasted 
 https://s3.amazonaws.com/thunnus/BluefinAQ30_shuffled.fastq), it starts 
 filling up immediately
 
 Filesystem      Size  Used Avail Use% Mounted on
 /dev/xvda1       20G   19G     0 100% /
 udev            8.4G  4.0K  8.4G   1% /dev
 tmpfs           3.4G  660K  3.4G   1% /run
 none            5.0M     0  5.0M   0% /run/lock
 none            8.4G     0  8.4G   0% /run/shm
 /dev/xvdb       404G  201M  383G   1% /mnt
 /dev/xvdg1      700G  654G   47G  94% /mnt/galaxyIndices
 /dev/xvdg2       10G  1.7G  8.4G  17% /mnt/galaxyTools
 /dev/xvdg3      500G   81M  500G   1% /mnt/galaxyData
 


Re: [galaxy-dev] [galaxy-user] Problem using Galaxy on the cloud

2012-03-06 Thread Jennifer Jackson

Hi Jan,

Since this question is about a Cloud installation, I am going to forward 
your question over to the galaxy-...@bx.psu.edu mailing list so that the 
development community will have a better chance of seeing it and 
providing feedback.

http://wiki.g2.bx.psu.edu/Support#Mailing_Lists

Thanks!

Jen
Galaxy team

On 3/6/12 1:44 PM, Jan R McDowell wrote:

Hi all,
I have been trying to get an instance of Galaxy going on EC2.  I have no
problem going through BioCloudCentral and getting an instance going.  I can
also successfully load my data from an S3 bucket into Galaxy.  The problem
occurs when I try to use velveth.  It always says 'job waiting to run'.  As a
matter of curiosity, I then used SSH to get into CloudBioLinux, which worked.
However, when I try to use NX to get the virtual desktop going I get the
message usr/bin/nxserver: line 381: echo: write error: No space left on device.

Using the df -h command, I get:

Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1       20G   19G     0 100% /
udev            8.4G  4.0K  8.4G   1% /dev
tmpfs           3.4G  660K  3.4G   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            8.4G     0  8.4G   0% /run/shm
/dev/xvdb       404G  202M  383G   1% /mnt
/dev/xvdg1      700G  654G   47G  94% /mnt/galaxyIndices
/dev/xvdg2       10G  1.7G  8.4G  17% /mnt/galaxyTools
/dev/xvdg3      200G   11G  190G   6% /mnt/galaxyData


So, I guess my question as a new user is:  How do I point Galaxy and 
CloudBioLinux to all of this unused space?  I assume the problem is with the 
/dev/xvda1 that is 100% full. I am obviously doing something silly and/or 
missing a really big step.  Any help would be greatly appreciated.

Many thanks in advance,
Jan


___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using reply all in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

   http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

   http://lists.bx.psu.edu/


--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/wiki/Support


Re: [galaxy-dev] [galaxy-user] Problem using Galaxy on the cloud

2012-03-06 Thread Brad Chapman

Jan;
Glad to hear you got Galaxy running successfully. It sounds like
everything is good to go once we sort out the disk space issue.

  However, when I try to use NX to get the virtual desktop going I get the
  message usr/bin/nxserver: line 381: echo: write error: No space left
  on device.
 
  Using the df -h command, I get:
 
  Filesystem      Size  Used Avail Use% Mounted on
  /dev/xvda1       20G   19G     0 100% /
  udev            8.4G  4.0K  8.4G   1% /dev
  tmpfs           3.4G  660K  3.4G   1% /run
  none            5.0M     0  5.0M   0% /run/lock
  none            8.4G     0  8.4G   0% /run/shm
  /dev/xvdb       404G  202M  383G   1% /mnt
  /dev/xvdg1      700G  654G   47G  94% /mnt/galaxyIndices
  /dev/xvdg2       10G  1.7G  8.4G  17% /mnt/galaxyTools
  /dev/xvdg3      200G   11G  190G   6% /mnt/galaxyData
 
 
  So, I guess my question as a new user is: How do I point Galaxy and
  CloudBioLinux to all of this unused space?

By default CloudMan will put files into /mnt/galaxyData. However, as
you noticed the main filesystem got filled up at some point. Could this
have happened while transferring files over from S3? Are there files in
your home directory that you could delete or move to /mnt/galaxyData to
free up space?

CloudBioLinux and CloudMan shouldn't put a large number of files in the
root directory, but when the root filesystem is full it's going to be
very unhappy. Once you manually clear up some room there hopefully
things will run smoother.

If that doesn't help let us know and we can dig into it further. Thanks,
Brad