Re: [galaxy-dev] [galaxy-user] Problem using Galaxy on the cloud
Jan;

No problem, sorry I don't have a nicer immediate solution for you. I appreciate the problem report and hopefully we'll figure out a clean fix for this.

CloudBioLinux does have Velvet included by default, so you should be able to use it without needing to compile. Were you having trouble building it, or running it? Velvet and other assemblers are memory hungry, so the issue you were seeing may be due to running out of memory. 'top' is a useful way to monitor memory usage from the console. If it is a memory issue, AWS does have some high-memory instances you could try:

http://aws.amazon.com/ec2/instance-types/

I don't have a lot of experience doing assembly, so unfortunately I don't have good estimates for memory usage. It might be worth asking on the Velvet mailing list if you are still running into issues.

Hope this helps,
Brad

> Thanks Brad,
>
> I will see if I can do what you suggest, even if you are talking Klingon now :). I really just want to use Velvet, so I may try to install it through CloudBioLinux, although last time I tried that I seemed to crash everything. I really need to get better at this stuff, or move to a place with bioinformatics support.
>
> Many Many Thanks,
> Jan
>
> On Mar 7, 2012, at 9:22 PM, Brad Chapman wrote:
>
>> Jan;
>>
>> Thanks for getting back with all the detailed information. I dug into this further and understand what is happening:
>>
>> - tools/data_source/upload.py calls lib/galaxy/datatypes/sniff.py:stream_to_file
>> - stream_to_file uses Python's tempfile module
>> - tempfile defaults to using /tmp
>> - As large files stream in, the temporary space fills up, causing the issue you are seeing.
>>
>> The best way to work around this is to have the galaxy user on Amazon export TMPDIR to point at a temporary directory on /mnt/galaxyData instead of the root filesystem. I'm hoping that Enis or Dannon might be able to help out with the best place to set this in CloudMan to avoid the issue; I've cc'ed them in.
>>
>> If you want to manually fix it to get some work done, you could create a directory /mnt/galaxyData/tmp and then symlink /tmp there:
>>
>>     ln -s /mnt/galaxyData/tmp /tmp
>>
>> Hope this helps and we can come up with a more permanent fix. Thanks again,
>> Brad
>>
>>> Thanks Brad,
>>>
>>> Sorry it has taken me so long to respond. I had a meeting sort of day. I started another instance to recreate what is happening to me in more detail. I hope this helps. I tried to color code things so you could follow everything more easily.
>>>
>>> This is what I have when I start up Galaxy through BioCloudCentral before I import any files...
>>>
>>>     Get cloud support with Ubuntu Advantage Cloud Guest
>>>     http://www.ubuntu.com/business/services/cloud
>>>     ubuntu@ip-10-44-78-218:~$ df -h
>>>     Filesystem      Size  Used Avail Use% Mounted on
>>>     /dev/xvda1       20G   13G  6.1G  68% /
>>>     udev            8.4G  4.0K  8.4G   1% /dev
>>>     tmpfs           3.4G  644K  3.4G   1% /run
>>>     none            5.0M     0  5.0M   0% /run/lock
>>>     none            8.4G     0  8.4G   0% /run/shm
>>>     /dev/xvdb       404G  201M  383G   1% /mnt
>>>
>>> After starting to import data (pasted https://s3.amazonaws.com/thunnus/BluefinAQ30_shuffled.fastq), it starts filling up immediately:
>>>
>>>     Filesystem      Size  Used Avail Use% Mounted on
>>>     /dev/xvda1       20G   19G     0 100% /
>>>     udev            8.4G  4.0K  8.4G   1% /dev
>>>     tmpfs           3.4G  660K  3.4G   1% /run
>>>     none            5.0M     0  5.0M   0% /run/lock
>>>     none            8.4G     0  8.4G   0% /run/shm
>>>     /dev/xvdb       404G  201M  383G   1% /mnt
>>>     /dev/xvdg1      700G  654G   47G  94% /mnt/galaxyIndices
>>>     /dev/xvdg2       10G  1.7G  8.4G  17% /mnt/galaxyTools
>>>     /dev/xvdg3      500G   81M  500G   1% /mnt/galaxyData
>
> Jan McDowell
> The Virginia Institute of Marine Science
> The College of William and Mary
> Department of Fisheries Science
> Phone: 804-684-7263
> Fax: 804-684-7157
> mcdow...@vims.edu

___
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:

http://lists.bx.psu.edu/
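On Brad's point about watching memory while Velvet runs: 'top' shows this interactively, and the same numbers can be read from /proc/meminfo on Linux. Below is a minimal sketch, not part of Galaxy or CloudBioLinux; the function names are made up for illustration, and it assumes the usual "Name: value kB" layout of /proc/meminfo.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer value}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only well-formed "Name: <digits> [unit]" lines.
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def available_gib(values):
    """Estimate available memory in GiB from parsed meminfo values (kB).

    Uses MemAvailable when the kernel reports it; otherwise falls back to
    MemFree + Buffers + Cached as a rough approximation.
    """
    if "MemAvailable" in values:
        kib = values["MemAvailable"]
    else:
        kib = (values.get("MemFree", 0)
               + values.get("Buffers", 0)
               + values.get("Cached", 0))
    return kib / (1024 * 1024)
```

On the instance this could be called as available_gib(parse_meminfo(open("/proc/meminfo").read())) before and during an assembly run. If the figure drops toward zero while velveth or velvetg is working, a larger instance type is probably the next thing to try.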
Re: [galaxy-dev] [galaxy-user] Problem using Galaxy on the cloud
Jan;

> I appreciate the help, I am in no way a competent UNIX/Linux user. I just have 2 shuffled fastq files in an S3 bucket; I pasted the URL for these into Galaxy's upload file URL/Text box and that seems to be where the trouble started. On my first try loading the data files, I got a message saying there was no disk space.

I'm trying to reproduce this here but can't figure out how the disk is filling up. A new filesystem should look like:

    $ df -h
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/xvda1       20G   13G  6.1G  68% /
    udev            3.7G  4.0K  3.7G   1% /dev
    tmpfs           1.5G  660K  1.5G   1% /run
    none            5.0M     0  5.0M   0% /run/lock
    none            3.7G     0  3.7G   0% /run/shm
    /dev/xvdb       414G  201M  393G   1% /mnt
    /dev/xvdg1      700G  654G   47G  94% /mnt/galaxyIndices
    /dev/xvdg2       10G  1.7G  8.4G  17% /mnt/galaxyTools
    /dev/xvdg3      5.0G   81M  5.0G   2% /mnt/galaxyData

with 6 GB of free space on '/'. If you pasted a file, like:

    https://s3.amazonaws.com/chapmanb/example.fastq

into the text box, it should show up as:

    /mnt/galaxyData/files/000/dataset_1.dat

in the /mnt/galaxyData mount, which has all of the space for storing files. So I'm not sure exactly what is filling up your root partition. Could you try checking a couple of places and see if any of the filesystem usage is different from what we expect:

    $ du -sh /home/ubuntu/
    361M    /home/ubuntu/
    $ sudo du -sh /var/log/
    3.1M    /var/log/

I'm cc'ing Enis in case he has any other ideas where we might be accidentally filling up the root filesystem.

Thanks again for the feedback and sorry about the problems,
Brad

> I tried again, and it worked. Then, when I tried to run velveth, it got hung up again with the no disk space issue. I browsed around in the /dev/xvda1 file system and there were quite a few files (I pasted them below). I assumed they were files relating to CloudBioLinux and I was not sure if I could get rid of any of the files. It was nothing I put there. I started the instance using BioCloudCentral. Maybe it would be better to just start Galaxy the long way (choosing a public AMI under launch instance...)?
>
>     ubuntu@ip-10-44-117-85:/$ ls
>     bin   etc     initrd.img  lib64       mnt  proc  sbin     sys  var
>     boot  export  lib         lost+found  opt  root  selinux  tmp  vmlinuz
>     dev   home    lib32       media       pkg  run   srv      usr
>
> Thanks again,
> Jan
>
> On Mar 6, 2012, at 8:47 PM, Brad Chapman wrote:
>
>> Jan;
>>
>> Glad to hear you got Galaxy running successfully. It sounds like everything is good to go once we sort out the disk space issue.
>>
>>> However, when I try to use NX to get the virtual desktop going I get the message
>>>
>>>     usr/bin/nxserver: line 381: echo: write error: No space left on device.
>>>
>>> Using the df -h command, I get:
>>>
>>>     Filesystem      Size  Used Avail Use% Mounted on
>>>     /dev/xvda1       20G   19G     0 100% /
>>>     udev            8.4G  4.0K  8.4G   1% /dev
>>>     tmpfs           3.4G  660K  3.4G   1% /run
>>>     none            5.0M     0  5.0M   0% /run/lock
>>>     none            8.4G     0  8.4G   0% /run/shm
>>>     /dev/xvdb       404G  202M  383G   1% /mnt
>>>     /dev/xvdg1      700G  654G   47G  94% /mnt/galaxyIndices
>>>     /dev/xvdg2       10G  1.7G  8.4G  17% /mnt/galaxyTools
>>>     /dev/xvdg3      200G   11G  190G   6% /mnt/galaxyData
>>>
>>> So, I guess my question as a new user is: How do I point Galaxy and CloudBioLinux to all of this unused space?
>>
>> By default CloudMan will put files into /mnt/galaxyData. However, as you noticed, the main filesystem got filled up at some point. Could this have happened while transferring files over from S3? Are there files in your home directory that you could delete or move to /mnt/galaxyData to free up space?
>>
>> CloudBioLinux and CloudMan shouldn't put a large number of files in the root directory, but when the root filesystem is full it's going to be very unhappy. Once you manually clear up some room there, hopefully things will run smoother. If that doesn't help, let us know and we can dig into it further.
>>
>> Thanks,
>> Brad
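The 'du -sh' checks Brad asks for can be extended to rank everything under a directory, which helps when it is not obvious where the space went. This is a rough sketch only; directory_size and largest_children are hypothetical helper names, not Galaxy or CloudMan functions. Unlike du it sums apparent file sizes instead of disk blocks, and it does not stop at filesystem boundaries, so pointing it at '/' would also descend into /mnt.

```python
import os


def directory_size(path):
    """Return the total size in bytes of regular files under path.

    Symlinks are skipped and unreadable entries are ignored.
    """
    total = 0
    for root, _dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                pass  # file vanished or cannot be read; skip it
    return total


def largest_children(path, count=5):
    """Return up to count (size_in_bytes, child_path) pairs, largest first."""
    sizes = []
    for entry in os.scandir(path):
        if entry.is_symlink():
            continue
        if entry.is_dir():
            sizes.append((directory_size(entry.path), entry.path))
        elif entry.is_file():
            sizes.append((entry.stat().st_size, entry.path))
    return sorted(sizes, reverse=True)[:count]
```

Running largest_children("/tmp") or largest_children("/home/ubuntu") while an upload is in progress should show which directory is growing.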
Re: [galaxy-dev] [galaxy-user] Problem using Galaxy on the cloud
Jan;

Thanks for getting back with all the detailed information. I dug into this further and understand what is happening:

- tools/data_source/upload.py calls lib/galaxy/datatypes/sniff.py:stream_to_file
- stream_to_file uses Python's tempfile module
- tempfile defaults to using /tmp
- As large files stream in, the temporary space fills up, causing the issue you are seeing.

The best way to work around this is to have the galaxy user on Amazon export TMPDIR to point at a temporary directory on /mnt/galaxyData instead of the root filesystem. I'm hoping that Enis or Dannon might be able to help out with the best place to set this in CloudMan to avoid the issue; I've cc'ed them in.

If you want to manually fix it to get some work done, you could create a directory /mnt/galaxyData/tmp and then symlink /tmp there:

    ln -s /mnt/galaxyData/tmp /tmp

Hope this helps and we can come up with a more permanent fix. Thanks again,
Brad

> Thanks Brad,
>
> Sorry it has taken me so long to respond. I had a meeting sort of day. I started another instance to recreate what is happening to me in more detail. I hope this helps. I tried to color code things so you could follow everything more easily.
>
> This is what I have when I start up Galaxy through BioCloudCentral before I import any files...
>
>     Get cloud support with Ubuntu Advantage Cloud Guest
>     http://www.ubuntu.com/business/services/cloud
>     ubuntu@ip-10-44-78-218:~$ df -h
>     Filesystem      Size  Used Avail Use% Mounted on
>     /dev/xvda1       20G   13G  6.1G  68% /
>     udev            8.4G  4.0K  8.4G   1% /dev
>     tmpfs           3.4G  644K  3.4G   1% /run
>     none            5.0M     0  5.0M   0% /run/lock
>     none            8.4G     0  8.4G   0% /run/shm
>     /dev/xvdb       404G  201M  383G   1% /mnt
>
> After starting to import data (pasted https://s3.amazonaws.com/thunnus/BluefinAQ30_shuffled.fastq), it starts filling up immediately:
>
>     Filesystem      Size  Used Avail Use% Mounted on
>     /dev/xvda1       20G   19G     0 100% /
>     udev            8.4G  4.0K  8.4G   1% /dev
>     tmpfs           3.4G  660K  3.4G   1% /run
>     none            5.0M     0  5.0M   0% /run/lock
>     none            8.4G     0  8.4G   0% /run/shm
>     /dev/xvdb       404G  201M  383G   1% /mnt
>     /dev/xvdg1      700G  654G   47G  94% /mnt/galaxyIndices
>     /dev/xvdg2       10G  1.7G  8.4G  17% /mnt/galaxyTools
>     /dev/xvdg3      500G   81M  500G   1% /mnt/galaxyData
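To see why exporting TMPDIR works, here is a small sketch of how Python's tempfile module picks its directory. It is not the Galaxy code path itself, and use_temp_dir is a made-up helper name. The tempfile module checks the TMPDIR environment variable before falling back to /tmp, and it caches the result in tempfile.tempdir after the first lookup, so the sketch clears that cache to force a fresh lookup.

```python
import os
import tempfile


def use_temp_dir(path):
    """Point the tempfile module at path through TMPDIR.

    Returns the directory tempfile will now use for new temporary files.
    """
    os.makedirs(path, exist_ok=True)
    os.environ["TMPDIR"] = path
    # tempfile remembers its first choice; clearing the cached value makes
    # the next call re-read TMPDIR.
    tempfile.tempdir = None
    return tempfile.gettempdir()
```

On the instance the argument would be something like /mnt/galaxyData/tmp. Setting the variable this way only affects the current Python process and its children, which is why the suggestion above is to export TMPDIR in the galaxy user's environment before Galaxy starts.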
Re: [galaxy-dev] [galaxy-user] Problem using Galaxy on the cloud
Hi Jan,

Since this question is about a Cloud installation, I am going to forward your question over to the galaxy-...@bx.psu.edu mailing list so that the development community will have a better chance of seeing it and providing feedback.

http://wiki.g2.bx.psu.edu/Support#Mailing_Lists

Thanks!

Jen
Galaxy team

On 3/6/12 1:44 PM, Jan R McDowell wrote:

> Hi all,
>
> I have been trying to get an instance of Galaxy going on EC2. I have no problem going through BioCloudCentral and getting an instance going. I can also successfully load my data from an S3 bucket into Galaxy. The problem occurs when I try to use velveth. It always says 'job waiting to run'. As a matter of curiosity, I then used SSH to get into CloudBioLinux, which worked. However, when I try to use NX to get the virtual desktop going I get the message
>
>     usr/bin/nxserver: line 381: echo: write error: No space left on device.
>
> Using the df -h command, I get:
>
>     Filesystem      Size  Used Avail Use% Mounted on
>     /dev/xvda1       20G   19G     0 100% /
>     udev            8.4G  4.0K  8.4G   1% /dev
>     tmpfs           3.4G  660K  3.4G   1% /run
>     none            5.0M     0  5.0M   0% /run/lock
>     none            8.4G     0  8.4G   0% /run/shm
>     /dev/xvdb       404G  202M  383G   1% /mnt
>     /dev/xvdg1      700G  654G   47G  94% /mnt/galaxyIndices
>     /dev/xvdg2       10G  1.7G  8.4G  17% /mnt/galaxyTools
>     /dev/xvdg3      200G   11G  190G   6% /mnt/galaxyData
>
> So, I guess my question as a new user is: How do I point Galaxy and CloudBioLinux to all of this unused space? I assume the problem is with /dev/xvda1, which is 100% full. I am obviously doing something silly and/or missing a really big step. Any help would be greatly appreciated.
>
> Many thanks in advance,
> Jan
>
> ___
> The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
>
> http://lists.bx.psu.edu/listinfo/galaxy-dev
>
> To manage your subscriptions to this and other Galaxy lists, please use the interface at:
>
> http://lists.bx.psu.edu/

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/wiki/Support
Re: [galaxy-dev] [galaxy-user] Problem using Galaxy on the cloud
Jan;

Glad to hear you got Galaxy running successfully. It sounds like everything is good to go once we sort out the disk space issue.

> However, when I try to use NX to get the virtual desktop going I get the message
>
>     usr/bin/nxserver: line 381: echo: write error: No space left on device.
>
> Using the df -h command, I get:
>
>     Filesystem      Size  Used Avail Use% Mounted on
>     /dev/xvda1       20G   19G     0 100% /
>     udev            8.4G  4.0K  8.4G   1% /dev
>     tmpfs           3.4G  660K  3.4G   1% /run
>     none            5.0M     0  5.0M   0% /run/lock
>     none            8.4G     0  8.4G   0% /run/shm
>     /dev/xvdb       404G  202M  383G   1% /mnt
>     /dev/xvdg1      700G  654G   47G  94% /mnt/galaxyIndices
>     /dev/xvdg2       10G  1.7G  8.4G  17% /mnt/galaxyTools
>     /dev/xvdg3      200G   11G  190G   6% /mnt/galaxyData
>
> So, I guess my question as a new user is: How do I point Galaxy and CloudBioLinux to all of this unused space?

By default CloudMan will put files into /mnt/galaxyData. However, as you noticed, the main filesystem got filled up at some point. Could this have happened while transferring files over from S3? Are there files in your home directory that you could delete or move to /mnt/galaxyData to free up space?

CloudBioLinux and CloudMan shouldn't put a large number of files in the root directory, but when the root filesystem is full it's going to be very unhappy. Once you manually clear up some room there, hopefully things will run smoother. If that doesn't help, let us know and we can dig into it further.

Thanks,
Brad
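For keeping an eye on the root filesystem while files transfer, the Use% column of 'df -h' can be approximated from Python with shutil.disk_usage. The sketch below is illustrative only; usage_report and nearly_full are hypothetical names and the 90% threshold is an arbitrary choice. The percentage can differ slightly from df, which leaves blocks reserved for root out of its calculation.

```python
import shutil


def usage_report(mount_points):
    """Return {mount_point: percent_used} for every path that exists."""
    report = {}
    for mount in mount_points:
        try:
            usage = shutil.disk_usage(mount)
        except OSError:
            continue  # path is missing on this machine
        if usage.total == 0:
            continue  # pseudo-filesystem with no capacity to report
        report[mount] = round(100 * usage.used / usage.total, 1)
    return report


def nearly_full(report, threshold=90.0):
    """Return the mount points at or above threshold percent used, sorted."""
    return sorted(m for m, pct in report.items() if pct >= threshold)
```

With the df output quoted above, nearly_full(usage_report(["/", "/mnt", "/mnt/galaxyData", "/mnt/galaxyIndices"])) would be expected to flag '/' and /mnt/galaxyIndices, while /mnt/galaxyData still has plenty of room.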