[galaxy-dev] file upload/unzip issues
Hello, other members of my lab and I are encountering issues uploading (some) files to the Main public Galaxy server. We routinely upload from our local server to main.g2.bx.psu.edu via FTP using Cyberduck. In my case, on 12/6/12 I uploaded 12 gzipped fastq files. Of these, 9 were completely fine. For the remaining 3 files, the transfer through Cyberduck appeared to work fine, and the files appeared (with the correct file size) on the upload screen under Get data as usual. However, once the files were uploaded into a Galaxy history, they were empty, with a message saying "Problem decompressing gzipped data". An example is entry #45: '1PositiveRFP92112Pool41_ATCACG_L002_R1_001.fastq.gz' in my history called Tcell_120812. My account uses this email address as the login ID (mmo...@rockefeller.edu). All of these files are pretty large (from 5 to 10 GB once unzipped), but failure or success did not appear to correlate with the size of the file. I am not exceeding my space quota. In addition, I was able to successfully upload the file listed above (1PositiveRFP92112Pool41_ATCACG_L002_R1_001.fastq.gz) to our local Galaxy installation, so I don't think there is anything wrong with the file itself. However, it has failed to upload to the public Galaxy server multiple times. The experience of my lab mates is similar; some of their files are uploading correctly, and others are having the same problem I described above. We can't seem to find anything common among the files that are failing to upload. For instance, they span different sequencing runs and were deposited on our local server at different times. Has anyone else been encountering similar issues? Please let me know if I should direct this question elsewhere or if I can provide any further information. Thanks very much for your time. Michael Moore Rockefeller University ___ Please keep all replies on the list by using reply all in your mail client. 
To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
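One way to narrow this down before re-uploading is to read each .gz file end to end on the local server, since a matching file size does not rule out a damaged gzip stream. The following is only a minimal sketch, assuming the files are ordinary gzip archives; the function name gzip_is_readable and the chunk size are illustrative and not part of Galaxy or Cyberduck.

```python
import gzip
import zlib


def gzip_is_readable(path, chunk_size=1024 * 1024):
    """Return True if the whole gzip file at `path` decompresses cleanly."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read to the end so a truncated or corrupt tail is detected.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        return False
    return True
```

Running the same check on a copy fetched back from the FTP staging area, if that is possible, would show whether the damage happens in transit or during decompression on the server.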
[galaxy-dev] job terminating on warnings from software
I have a long chain of steps that dies on the last step because gnuplot sends a warning to the log, which Galaxy faithfully records 11 faithful times in my history files. The 10 other files have downloadable content, and in fact the plot works outside of Galaxy. The warning only concerns changing some intervals. How can I tell Galaxy to keep going unless I get an actual error? Michael
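A common workaround for this kind of problem is a small wrapper that decides success from the exit code rather than from the presence of stderr output. The sketch below assumes that Galaxy marks the job as failed because the tool wrote to stderr, as the description above suggests; the function name run_tolerating_warnings is hypothetical, and the wrapper is not a Galaxy feature.

```python
import subprocess
import sys


def run_tolerating_warnings(command):
    """Run `command`; route its stderr to stdout unless the exit code is non-zero."""
    result = subprocess.run(command, capture_output=True, text=True)
    sys.stdout.write(result.stdout)
    if result.returncode != 0:
        # A real failure: pass the messages through on stderr.
        sys.stderr.write(result.stderr)
    else:
        # Warnings only: keep them visible, but on stdout.
        sys.stdout.write(result.stderr)
    return result.returncode
```

A tool command line could then call a script that ends with sys.exit(run_tolerating_warnings(sys.argv[1:])), passing the gnuplot invocation as arguments, so that only a non-zero exit status produces stderr output.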
[galaxy-dev] Spooky behavior
OK: a brand-new data library. I upload one BAM file using Galaxy's Add Dataset. I start a new history, titling it "testing multiple use", and import the file to the current history. I run a job that reads the BAM file and produces some output. Everything works. I have 13 output datasets in the history, all showing green. Now I create a second history, called "2nd use same dataset same library". Again I check the box and click import to current history. Then I try to run the same job as before...

[bam_header_read] bgzf_check_EOF: Invalid argument
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[main_samview] fail to read the header from /.../galaxy/database/files/000/dataset_961.dat

(The ellipsis is mine, not Python's.) Looking in that file space, I discover that my BAM file is at dataset_962.dat. What gives? Michael Moore
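Since samtools reports that dataset_961.dat is not a BAM file, a quick check of which .dat files actually hold BAM data can confirm the mismatch. This is a minimal sketch that relies only on the fact that BAM files are BGZF (gzip-compatible) streams whose payload begins with the bytes BAM\1; the function name looks_like_bam is hypothetical, and the elided directory in the path above still has to be filled in by hand.

```python
import gzip
import zlib


def looks_like_bam(path):
    """Return True if `path` is gzip/BGZF data whose payload starts with the BAM magic."""
    try:
        with gzip.open(path, "rb") as handle:
            return handle.read(4) == b"BAM\x01"
    except (OSError, EOFError, zlib.error):
        # Missing file, not gzip at all, or a damaged stream.
        return False
```

Calling it on both dataset_961.dat and dataset_962.dat would show whether the second import points at a file that was never a BAM in the first place.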
Re: [galaxy-dev] /bin/sh: samtools: not found--WORKAROUND
Yeah, I was using ['printenv| mail myaddress samtools'] inside the subprocess.Popen(['samtools'],... call in upload.py. Right before the subprocess was called in upload.py, I had /usr/bin/ in my PATH, but inside the subprocess the story was different. I just used the symbolic link to reach samtools from a directory that remained in the PATH. I have no idea how this could happen, but I have noticed that Galaxy does things with input and output and seems to manipulate the environment heavily. Still, why should a child process have a different environment when none was specified? I'll figure that out later. I am still trying to convert some software to run with Galaxy.

On Mon, Apr 30, 2012 at 8:57 AM, Nate Coraor n...@bx.psu.edu wrote:

On Apr 24, 2012, at 8:36 PM, Michael Moore wrote:

There is apparently a persistent problem with samtools, which normally lives at /usr/bin/samtools. I encountered a similar problem in Python when uploading BAM files. I did not resolve the problem. I hacked for a while on binary.py in a lib/ subdirectory and used os.system to send myself mail describing the effective path at various points, and I added a missing logging.basicConfig() statement and scattered some log.WARNING statements strategically. All this told me nothing. So I made a few symlinks to samtools. The one that got things working was ln -s /usr/bin/samtools /home/galaxy/bin/samtools so: worked around but not resolved.

Hi Michael, For tools that output BAM, samtools needs to be in your $PATH, or has to be set up via the tool dependencies system. See the following for details: http://wiki.g2.bx.psu.edu/Admin/Config/Tool%20Dependencies For SGE, you can modify the $PATH used on the cluster in ~/.sge_request or the file specified in the 'environment_setup_file' galaxy config option.
--nate

Michael

On Tue, Apr 17, 2012 at 12:15 PM, zhengqiu cai caizhq2...@yahoo.com.cn wrote:

Hi All, I submitted a job to convert SAM to BAM, and the job ran forever without outputting the result. I then checked the log, and it read:

Traceback (most recent call last):
  File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py", line 336, in finish_job
    drm_job_state.job_wrapper.finish( stdout, stderr )
  File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/jobs/__init__.py", line 637, in finish
    dataset.set_meta( overwrite = False )
  File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/model/__init__.py", line 875, in set_meta
    return self.datatype.set_meta( self, **kwd )
  File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/datatypes/binary.py", line 179, in set_meta
    raise Exception, "Error Setting BAM Metadata: %s" % stderr
Exception: Error Setting BAM Metadata: /bin/sh: samtools: not found

It means that samtools is not in the PATH. I tried to set the PATH in several ways according to the Galaxy documentation:
1. put the path in the env.sh in the tool directory and symlink default to the tool directory, e.g. default - =/mnt/galaxyTools/tools/samtools/0.1.18
2. put -v PATH=/mnt/galaxyTools/tools/samtools/0.1.18 in ~/.sge_request
3. put -v PATH=/mnt/galaxyTools/tools/samtools/0.1.18 in /path/sge_request
None of them worked, and I got the same problem as above. Then I checked the job log file in the job_working_directory, and it read:

Samtools Version: 0.1.18 (r982:295)
SAM file converted to BAM

which shows that SGE knows the PATH of samtools. To double check, I added samtools index to Galaxy, and it worked well. I am very confused why SGE knows the tool path but cannot run the job correctly. The system I am using is Ubuntu on EC2. I checked out the code from galaxy-dist on Bitbucket. Other tools such as bwa and bowtie worked well using the same setup method (put env.sh in the tools directory to set the tool path). Thank you very much for any help or hints.
Cai
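The traceback in this thread fails inside set_meta, called from finish_job, which suggests (as an interpretation, not something confirmed in the thread) that the missing samtools is being looked up by the Galaxy server process rather than by the cluster job. A small diagnostic like the sketch below can show what a given process and its children actually see. The helper names resolve_tool and child_path are hypothetical and are not part of upload.py or binary.py.

```python
import os
import shutil
import subprocess
import sys


def resolve_tool(name, path=None):
    """Return the full path of `name` on `path` (default: this process's PATH), or None."""
    return shutil.which(name, path=path)


def child_path(env=None):
    """Return the PATH value seen by a child process started with `env`.

    With env=None the child inherits this process's environment unchanged.
    """
    code = "import os; print(os.environ.get('PATH', ''))"
    result = subprocess.run(
        [sys.executable, "-c", code], env=env, capture_output=True, text=True
    )
    return result.stdout.strip()
```

Comparing resolve_tool("samtools") with child_path() from inside the server process, and again from a job script, would show where /usr/bin drops out of the PATH. Passing an explicit env, for example dict(os.environ, PATH=...), is one way to make a child independent of whatever the parent inherited.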
[galaxy-dev] library_import_dir -- How is it supposed to work?
I set library_import_dir to a path and tried uploading a directory of BAM files. After fixing the situation so Galaxy could find samtools in that subshell, I was able to upload links to the history. But moving things to one directory did not appear to be terribly useful, so I tested what happens if subdirectories exist in the library import directory. Test 1: I used folders u1 and u2, each with data, plus some data in the root library_import_dir. After clearing the samtools error I was presented with a drop-down list with the choices 'None', 'u1' and 'u2'. Selecting 'None' did not result in seeking data but in a sharp reminder that I had to pick a directory. Selecting one of those directories led to uploads, with the non-copy correctly sized and even downloadable from the data library, but NOT usable in the history, because upload.py decided the file did not exist (probably the 'path' variable in os.path.exists(), line 99). It also became apparent that the upload looks one level down from the root directory and no further (tested by adding u3 with data and a subdirectory of u3 called v3, also with data). So the current state of affairs is that there is a single directory to which one must move files in order to upload links to a Galaxy library or folders thereof. Alternatively, one can make a directory of links called directory A, then another directory of links to the links in directory A at the library_import_dir, and then ask Galaxy to copy the data (not fully tested yet). But it is apparent from the UI that more utility was intended. If I have time, I will help with that. Michael Moore
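The difference between the observed one-level behavior and the recursive scan the UI seems to promise can be illustrated outside Galaxy. The sketch below is not Galaxy's implementation; the function names immediate_subdirs and all_data_files are hypothetical, and the directory layout (u1, u3, v3) only mirrors the test described above.

```python
import os


def immediate_subdirs(root):
    """List only the directories directly under `root` (one level, as observed)."""
    return sorted(
        entry for entry in os.listdir(root)
        if os.path.isdir(os.path.join(root, entry))
    )


def all_data_files(root):
    """Walk `root` recursively and return every file path relative to it."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            found.append(os.path.relpath(full, root))
    return sorted(found)
```

With the layout from the test, immediate_subdirs would offer u1, u2 and u3 but never v3, while all_data_files would also pick up the files in the root directory and in u3/v3.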
[galaxy-dev] How does one survive the updates?
I was running a single instance of Galaxy on my own machine, playing with library_import and figuring out why BAM files were getting errors on upload. (It is a path matter when one drops into the subshell, and the workaround is ln -s /usr/bin/samtools /home/galaxy/bin/samtools ) Anyway, I seemed to have it figured out last evening, so I shut down the notebook where Galaxy was running (on RHEL6) and went home. This morning, when I started Galaxy with sh run.sh, I got some messages about egg.ini and about replacing universe_wsgi. files from universe_wsgi.sample. files, and nothing worked. My registration was gone. My admin_user was overwritten. I re-registered, restored the file settings and restarted Galaxy, but now no login sticks. It does not complain about a registered user, but it does not appear to hold the session: the user tab does not show me as logged in. Is there some different way it is handling cookies? Michael Moore
Re: [galaxy-dev] /bin/sh: samtools: not found--WORKAROUND
There is apparently a persistent problem with samtools, which normally lives at /usr/bin/samtools. I encountered a similar problem in Python when uploading BAM files. I did not resolve the problem. I hacked for a while on binary.py in a lib/ subdirectory and used os.system to send myself mail describing the effective path at various points, and I added a missing logging.basicConfig() statement and scattered some log.WARNING statements strategically. All this told me nothing. So I made a few symlinks to samtools. The one that got things working was ln -s /usr/bin/samtools /home/galaxy/bin/samtools so: worked around but not resolved. Michael

On Tue, Apr 17, 2012 at 12:15 PM, zhengqiu cai caizhq2...@yahoo.com.cn wrote:

Hi All, I submitted a job to convert SAM to BAM, and the job ran forever without outputting the result. I then checked the log, and it read:

Traceback (most recent call last):
  File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py", line 336, in finish_job
    drm_job_state.job_wrapper.finish( stdout, stderr )
  File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/jobs/__init__.py", line 637, in finish
    dataset.set_meta( overwrite = False )
  File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/model/__init__.py", line 875, in set_meta
    return self.datatype.set_meta( self, **kwd )
  File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/datatypes/binary.py", line 179, in set_meta
    raise Exception, "Error Setting BAM Metadata: %s" % stderr
Exception: Error Setting BAM Metadata: /bin/sh: samtools: not found

It means that samtools is not in the PATH. I tried to set the PATH in several ways according to the Galaxy documentation:
1. put the path in the env.sh in the tool directory and symlink default to the tool directory, e.g. default - =/mnt/galaxyTools/tools/samtools/0.1.18
2. put -v PATH=/mnt/galaxyTools/tools/samtools/0.1.18 in ~/.sge_request
3. put -v PATH=/mnt/galaxyTools/tools/samtools/0.1.18 in /path/sge_request
None of them worked, and I got the same problem as above. Then I checked the job log file in the job_working_directory, and it read:

Samtools Version: 0.1.18 (r982:295)
SAM file converted to BAM

which shows that SGE knows the PATH of samtools. To double check, I added samtools index to Galaxy, and it worked well. I am very confused why SGE knows the tool path but cannot run the job correctly. The system I am using is Ubuntu on EC2. I checked out the code from galaxy-dist on Bitbucket. Other tools such as bwa and bowtie worked well using the same setup method (put env.sh in the tools directory to set the tool path). Thank you very much for any help or hints. Cai
Re: [galaxy-dev] Problem with cleaning up galaxy datasets
Yes, you do not have a URL like http://some_path or ssh://some_path defined in your config file. Look at the command-line options for your routines and figure out how they find your config file. The script could be looking in the wrong place, or you could have an error in the config file (a missing piece or an undefined variable, which would default to False, a boolean).

On 4/11/12, Klaus Metzeler m...@klaus-metzeler.de wrote:

Dear all, I have a problem with the cleanup scripts on my local Galaxy instance. I am using the updated cleanup_datasets.py, as per Nate's earlier reply here: http://gmod.827538.n3.nabble.com/Problem-running-purge-datasets-sh-cleanup-scripts-td3688016.html#none

This is the output I get when running cleanup_datasets.py:

~/ngs-bin/galaxy-dist $ scripts/cleanup_datasets/cleanup_datasets.py -d 2 -6 -r
Traceback (most recent call last):
  File "scripts/cleanup_datasets/cleanup_datasets.py", line 524, in <module>
    if __name__ == "__main__": main()
  File "scripts/cleanup_datasets/cleanup_datasets.py", line 82, in main
    ini_file = args[0]
IndexError: list index out of range

... and this if I call it via the shell script cleanup_datasets.sh:

~/ngs-bin/galaxy-dist $ scripts/cleanup_datasets/delete_datasets.sh
Traceback (most recent call last):
  File "./scripts/cleanup_datasets/cleanup_datasets.py", line 524, in <module>
    if __name__ == "__main__": main()
  File "./scripts/cleanup_datasets/cleanup_datasets.py", line 101, in main
    app = CleanupDatasetsApplication( config )
  File "./scripts/cleanup_datasets/cleanup_datasets.py", line 512, in __init__
    self.model = galaxy.model.mapping.init( config.file_path, config.database_connection, engine_options={}, create_tables=False, object_store=self.object_store )
  File "/home/klausmetzeler/ngs-bin/galaxy-dist/lib/galaxy/model/mapping.py", line 1818, in init
    load_egg_for_url( url )
  File "/home/klausmetzeler/ngs-bin/galaxy-dist/lib/galaxy/model/mapping.py", line 1798, in load_egg_for_url
    dialect = guess_dialect_for_url( url )
  File "/home/klausmetzeler/ngs-bin/galaxy-dist/lib/galaxy/model/mapping.py", line 1794, in guess_dialect_for_url
    return (url.split(':', 1))[0]
AttributeError: 'bool' object has no attribute 'split'

Any idea what might be wrong? Thanks a lot for your support, Klaus
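The second traceback ends in url.split(':', 1) being called on a bool, which fits the explanation above that the database connection setting was never read from the config file and fell back to False. The sketch below is a guarded variant of that one-line function, written only to illustrate the failure mode. It is not the code in mapping.py, and both the error message and the example URL in it are assumptions.

```python
def guess_dialect_for_url(url):
    """Return the scheme part of a database URL, e.g. 'sqlite' or 'postgres'."""
    if not isinstance(url, str) or ":" not in url:
        # An unset option typically arrives here as False rather than a string.
        raise ValueError(
            "database_connection is %r; expected a URL such as "
            "'sqlite:///./database/universe.sqlite'" % (url,)
        )
    return url.split(":", 1)[0]
```

With a check like this, guess_dialect_for_url(False) would name the missing setting directly instead of failing with an AttributeError several calls away from the cause.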
[galaxy-dev] Toolshed tribulations--certain types seem unsupported
I am testing Galaxy for wide use, and I have legacy text files that call algorithms, sorts, and displays. One such has a tool.xml file with 10 parameters: one select, three integer, and one float, with the rest text for the moment. (Some will be type=data later if the runs match the runs we do outside Galaxy.) The tool does not show up. Firefox, emacs and vim all agree that the XML is well-formed, and Galaxy has been properly bounced. I experimented with removing parameters and found that with 7 parameters I did not have the problem; then I noticed that all of them had been changed to text or select in my desperation to make the tool show up for placement on the workflow. I returned to 10 parameters, but this time all type=select and type=text, and everything worked. But slipping in even one integer, even with the (optional) min, max and default tags, made the tool disappear on restart. Am I looking at a bug, or is there something I need to be doing to make this tool visible with numeric parameters? MGM
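One thing worth ruling out, offered as an assumption rather than a diagnosis: numeric parameters in Galaxy tool XML carry their default in a value attribute, and a numeric parameter without one may be rejected when the tool is loaded. The sketch below only scans a tool file for integer and float parameters that lack a value attribute; the function name numeric_params_without_value is hypothetical, and the check says nothing about what Galaxy itself does with such a file.

```python
import xml.etree.ElementTree as ET


def numeric_params_without_value(tool_xml):
    """Return names of integer/float <param> elements that lack a `value` attribute."""
    root = ET.fromstring(tool_xml)
    missing = []
    for param in root.iter("param"):
        if param.get("type") in ("integer", "float") and param.get("value") is None:
            missing.append(param.get("name"))
    return missing
```

An empty result would rule this cause out, and the text-only version of the tool that does load would serve as a baseline for the next experiment.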