Re: [galaxy-dev] HOW TO RETRIEVE DATA FROM HISTORY??!!
On Thu, Aug 4, 2011 at 9:57 PM, colin molter colin.mol...@gmail.com wrote:
> Is there a way to directly move/copy data from your galaxy history to a given location in the filesystem of the same galaxy server?

2011/9/1 Edward Kirton eskir...@lbl.gov
> why not create a simple export tool? perhaps with the option to cp or symlink.

This is exactly what I would like to have. I checked the Tool Shed, but it seems that such a tool doesn't exist yet. Before trying to write one, I wanted to be sure that nobody already has a similar tool to share.

thx
colin
___
Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
[galaxy-dev] problem in workflow
Hi All,

I am using a local instance of Galaxy and I am not able to use workflows. When I click on Workflow, only the loading image is displayed and I never see the workflow editor. Could this depend on the browser?

Regards,
shashi
Re: [galaxy-dev] HOW TO RETRIEVE DATA FROM HISTORY??!!
Hi,

We have written a tool that we call gls (galaxy ls). Running it is similar to doing ls -ltr in that it lists the histories in reverse chronological order and shows the actual path of the .dat file on the file system. You can then symlink/copy the actual files.

Example output:

    2011-03-08 14:29:18 - Test 1
    2011-03-08 14:29:56 - CXXC.bed /galaxy/database/files/001/dataset_1749.dat
    2011-03-16 14:08:00 - BED-to-GFF on data 1 /galaxy/database/files/001/dataset_1750.dat
    2011-06-17 12:13:28 - Test 2
    2011-06-17 12:14:24 - UCSC Main on Chicken: refGene (genome) /galaxy/database/files/003/dataset_3085.dat
    2011-06-17 12:51:02 - UCSC Main on Chicken: refGene (chr2:57311158-57314247) /galaxy/database/files/003/dataset_3086.dat
    2011-07-27 07:12:30 - Test 3
    2011-07-27 07:15:44 - http://www.molbiol.ox.ac.uk/data/biopivot/example/small/example.gff3 /galaxy/database/files/003/dataset_3296.dat
    2011-07-27 07:16:27 - annotated gff3 on data 1 /galaxy/database/files/003/dataset_3297.dat
    2011-07-27 07:18:42 - UCSC Main on Human: eponine (genome) /galaxy/database/files/003/dataset_3298.dat
    2011-07-27 07:19:38 - annotated overlap gff3 on data 3 and data 2 /galaxy/database/files/003/dataset_3299.dat
    2011-08-10 07:35:20 - SLX-3645.591.s_4_Input_AB_peaks.txt /galaxy/database/files/004/dataset_4086.dat
    2011-08-10 07:55:48 - macs2gff3 on data 5 /galaxy/database/files/004/dataset_4088.dat
    2011-08-10 07:56:37 - annotated gff3 on data 7 /galaxy/database/files/004/dataset_4089.dat

It's written in Perl, is run from the command line and accesses the galaxy database. We are happy to make this available if there is interest. Is the Tool Shed the best place to put it given it isn't a wrapper?

Regards,
Steve

> On Thu, Aug 4, 2011 at 9:57 PM, colin molter colin.mol...@gmail.com wrote:
>> Is there a way to directly move/copy data from your galaxy history to a given location in the filesystem of the same galaxy server?
>
> 2011/9/1 Edward Kirton eskir...@lbl.gov
>> why not create a simple export tool? perhaps with the option to cp or symlink.
>
> This is exactly what I would like to have. I checked on Shed Tool. But it seems that that tool doesn't exist yet. Before to try to make that tool, I wanted to be sure that nobody had a similar tool for sharing.
> thx colin
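The sample paths in Steve's output suggest how a dataset id maps to a file on disk. What follows is only a minimal sketch in Python; gls itself is a Perl script that reads the Galaxy database and is not reproduced here. The function name `dataset_path` and the `files_root` default are illustrative, and the id-to-directory rule is inferred from the listed paths only, not taken from Galaxy's code.

```python
import posixpath


def dataset_path(dataset_id, files_root="/galaxy/database/files"):
    """Guess the on-disk path of a dataset from its numeric id.

    Assumption (inferred from the sample output only): the subdirectory is
    the id divided by 1000, zero-padded to three digits. Larger ids may be
    laid out differently, so they are rejected rather than guessed.
    """
    if not 0 <= dataset_id < 1000000:
        raise ValueError("layout only inferred for ids below 1,000,000")
    subdir = "%03d" % (dataset_id // 1000)
    return posixpath.join(files_root, subdir, "dataset_%d.dat" % dataset_id)


print(dataset_path(1749))
print(dataset_path(4086))
```

A path produced this way could then be copied or symlinked as Steve describes, but querying the database remains the reliable way to find the file for a given history item.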
[galaxy-dev] Problem with workflow
Hi,

I am not able to create a workflow in my local instance of Galaxy. The browser only displays the loading image. I am using an old version of Galaxy.

Regards,
shashi shekhar
[galaxy-dev] link file bug in the new version
Hi,

We found a little problem in the new Galaxy release. When we upload a dataset library from the filesystem:

Admin -> Manage data libraries -> create new data library -> Add dataset -> Upload files from filesystem path -> Link to files without copying

If the galaxy user is the owner of the file, or has write permission on it (on the filesystem), the file is deleted!! I do not think this is the normal behaviour...

Regards.
Re: [galaxy-dev] HOW TO RETRIEVE DATA FROM HISTORY??!!
Hello Steve,

Thanks for making this available. If you could email me the script, I'll include it in the ~/contrib directory in the Galaxy distribution.

Greg Von Kuster

On Sep 2, 2011, at 3:20 AM, Steve Taylor wrote:
> Hi, We have written a tool that we call gls (galaxy ls). Running it is similar to doing ls -ltr in that it lists the histories in reverse chronological order and shows the actual path of the .dat file on the file system. You can then symlink/copy the actual files.
> ...
> It's written in Perl, is run from the command line and accesses the galaxy database. We are happy to make this available if there is interest. Is the Tool Shed the best place to put it given it isn't a wrapper?
>
> Regards, Steve

Greg Von Kuster
Galaxy Development Team
g...@bx.psu.edu
[galaxy-dev] Installation error
Greetings,

I attempted to install Galaxy on a ROCKS 5.3 cluster. I checked out Galaxy to a location commonly accessible by all analysis nodes. I then set the path to Python 2.7 like so:

    export PATH=/opt/galaxy-python/python:$PATH

However, executing run.sh gives me the error at the end of this message. The strange thing is that Galaxy runs fine on a CentOS 5.6 virtual machine. Any help would be much appreciated.

Thanks!

Traceback (most recent call last):
  File "./scripts/paster.py", line 34, in ?
    command.run()
  File "/opt/galaxy-dist/eggs/PasteScript-1.7.3-py2.4.egg/paste/script/command.py", line 84, in run
    invoke(command, command_name, options, args[1:])
  File "/opt/galaxy-dist/eggs/PasteScript-1.7.3-py2.4.egg/paste/script/command.py", line 123, in invoke
    exit_code = runner.run(args)
  File "/opt/galaxy-dist/eggs/PasteScript-1.7.3-py2.4.egg/paste/script/command.py", line 218, in run
    result = self.command()
  File "/opt/galaxy-dist/eggs/PasteScript-1.7.3-py2.4.egg/paste/script/serve.py", line 276, in command
    relative_to=base, global_conf=vars)
  File "/opt/galaxy-dist/eggs/PasteScript-1.7.3-py2.4.egg/paste/script/serve.py", line 311, in loadapp
    return loadapp(
  File "/opt/galaxy-dist/eggs/PasteDeploy-1.3.3-py2.4.egg/paste/deploy/loadwsgi.py", line 204, in loadapp
    return loadobj(APP, uri, name=name, **kw)
  File "/opt/galaxy-dist/eggs/PasteDeploy-1.3.3-py2.4.egg/paste/deploy/loadwsgi.py", line 225, in loadobj
    return context.create()
  File "/opt/galaxy-dist/eggs/PasteDeploy-1.3.3-py2.4.egg/paste/deploy/loadwsgi.py", line 625, in create
    return self.object_type.invoke(self)
  File "/opt/galaxy-dist/eggs/PasteDeploy-1.3.3-py2.4.egg/paste/deploy/loadwsgi.py", line 110, in invoke
    return fix_call(context.object, context.global_conf, **context.local_conf)
  File "/opt/galaxy-dist/eggs/PasteDeploy-1.3.3-py2.4.egg/paste/deploy/util/fixtypeerror.py", line 57, in fix_call
    val = callable(*args, **kw)
  File "/opt/galaxy-dist/lib/galaxy/web/buildapp.py", line 90, in app_factory
    add_controllers( webapp, app )
  File "/opt/galaxy-dist/lib/galaxy/web/buildapp.py", line 39, in add_controllers
    module = __import__( module_name )
  File "/opt/galaxy-dist/lib/galaxy/web/controllers/admin.py", line 310
    link=( lambda item: (dict( operation="Manage users and groups", id=item.id, webapp="galaxy" ) if not item.default else dict( operation="Change amount", id=item.id, webapp="galaxy" )) ),
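One detail of the traceback stands out: every egg name ends in py2.4, although the message says Python 2.7 was put on the PATH. The flagged line in admin.py uses a conditional expression (`a if cond else b`), which Python accepts only from version 2.5 onward. Whether that is really the cause is not confirmed anywhere in this thread. The sketch below simply reports which interpreter is actually running, so it can be compared with the intended one. The helper name is illustrative.

```python
import sys


def supports_conditional_expressions(version_info):
    """Conditional expressions (a if cond else b) need Python 2.5 or newer."""
    return tuple(version_info[:2]) >= (2, 5)


# Report the interpreter that executes this script.
print("interpreter: %s" % sys.executable)
print("version:     %d.%d.%d" % tuple(sys.version_info[:3]))
print("conditional expressions supported: %s"
      % supports_conditional_expressions(sys.version_info))
```

Running it with the same `python` that run.sh picks up would show whether the PATH change took effect.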
Re: [galaxy-dev] contribute tools to galaxy?
Thanks a lot! It's really helpful! I'm looking forward to seeing new version of galaxy / galaxy tool shed soon!

Dongjun

On 8/31/2011 2:38 PM, Greg Von Kuster wrote:
> Hello Dongjun, see my answers to your question inline.
>
> On Aug 28, 2011, at 10:12 AM, Dongjun Chung wrote:
>> Hi All, I'm a newbie to galaxy and enjoying it a lot these days. Thanks for the great work. I have a question regarding contribution of software to galaxy. We developed a ChIP-seq peak calling algorithm and software (R package) and hope to contribute it to galaxy. I have read the wiki and prior mailing list about the contribution system but it is still somewhat confusing to me.
>>
>> 1. It seems that I can contribute our software to tool shed if I prepare appropriate code definition files. Then, users can download and use it with their locally installed galaxy. However, these files committed to tool shed will not appear in galaxy main or test servers. Am I correct?
>
> Tools from the Galaxy tool shed may or may not be available in the Galaxy test / main servers hosted at Penn State.
>
>> 2. What is clear relationship between main/test servers and tool shed?
>
> The Galaxy tool shed enables the Galaxy community to share tools. These tools are generally used in local Galaxy installations, but not available on the Penn State instances.
>
>> Can we contribute our software to main or test servers as well?
>
> The tools available on the Penn State servers are generally developed by the Galaxy core development team, and are available in the Galaxy distribution. If your tools complement the tools currently available in the distribution, the Galaxy core development team may agree to include them. However, see my comments 2 answers below regarding upcoming enhancements to Galaxy and the tool shed.
>
>> Or only galaxy core developers can add new tools to galaxy main or test servers?
>
> Yes.
>
>> If so, which software is considered to be added to servers? Are they chosen from tools contributed to tool shed?
>
> Currently, tools available in the distribution and on the Penn State instances are generally developed by the core Galaxy development team. However, upcoming enhancements to Galaxy and the Galaxy tool shed will enable automatic installation of tools from the tool shed into local Galaxy instances, eliminating the necessity to include tools in the distribution. This automation will be available fairly soon.
>
>> 3. If our software is a R package, then users need to download and install it first in their R system before they use our software within their galaxy system even in the case they have appropriate definition files. Am I right? Or is there any better solution for this?
>
> If your tools require R, then those that use your tools will need to install R in the Galaxy environment path so the tools will be functional. Tools that have dependencies like this should include a requirements tag set in the tool config. See http://wiki.g2.bx.psu.edu/Admin/Tools/Tool%20Config%20Syntax#A.3Crequirements.3E_tag_set
>
>> Thanks! Best, Dongjun
>
> Greg Von Kuster
> Galaxy Development Team
> g...@bx.psu.edu
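To make Greg's point about the requirements tag set more concrete, here is a small Python sketch that lists the requirements declared in a tool config. The embedded tool XML is hypothetical (the tool id, the name and the `type="package"` value are made up for illustration), and the wiki page linked above remains the reference for the real syntax. The script is written for a current Python 3 interpreter.

```python
import xml.etree.ElementTree as ET

# Hypothetical tool config used only for illustration.
EXAMPLE_TOOL_XML = """
<tool id="example_peak_caller" name="Example peak caller">
  <requirements>
    <requirement type="package">R</requirement>
  </requirements>
</tool>
"""


def list_requirements(tool_xml):
    """Return (type, name) pairs for each <requirement> in a tool config."""
    root = ET.fromstring(tool_xml.strip())
    return [(req.get("type"), (req.text or "").strip())
            for req in root.findall("./requirements/requirement")]


print(list_requirements(EXAMPLE_TOOL_XML))
```

A listing like this only shows what a tool declares; installing R itself on the Galaxy server is still up to the administrator, as described in the answer above.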
Re: [galaxy-dev] Suggestion / Request for Comments on Galaxy Best Practices - Gradual migration to standard indentation
Peter Cock wrote:
> On Thu, Sep 1, 2011 at 9:00 PM, Trevor Wennblom tre...@well.com wrote:
>> ...
>> given that python has syntactically significant whitespace, i also try to maintain the convention of indentation with four-spaces. i've noticed this isn't consistent within the codebase, but does seem to be the preferred style such as in `lib/galaxy/datatypes/`. python comes packaged with the script `reindent.py`:
>> ...
>> this is recommended practice per PEP 8: http://www.python.org/dev/peps/pep-0008/
>> ...
>> would anyone be opposed to me fixing up the current codebase to adhere to this? running `reindent.py` on the files is easy enough, i'm willing to step through the files (`opendiff` / `FileMerge.app`) and verify no unlikely syntactic changes have occurred. i can also deliver changes in gradual chunked pull requests to ease current developers getting possibly bit by merge issues.
>
> +1 on correcting any tabs to spaces in the Galaxy Python code. Doing this in chunked commits makes good sense too - although if you can get one of the Galaxy team to do this directly it might be quicker.

It is the Galaxy Team's intent to use four-space indents, anything else is a mistake and we try to fix 'em as we see 'em. Nobody has yet taken the time to fix them all (i.e. with reindent.py) but such fixes would be welcome.

> Personally I'd like to go further and fix the non-PEP8 white space in most of the Galaxy Python code, e.g. function ( argument ) rather than: function(argument).

There are a lot of parts of PEP 8 that I doubt we want to adhere to strictly. I agree it's annoying to find varying styles, but the space inside parentheses is not one I get too worked up about. FWIW, I prefer function( argument ) and I tend to see that style out of most of the rest of the team.

> Chatting to some of the Galaxy team at BOSC/ISMB 2011 there is some support for this internally. Again, there are automated tools to do this.
>
>> would anyone be willing to add the appropriate hooks to the central repository as well?
>
> As long as there are no false positives identified during the initial tab/space conversion that seems sensible to prevent new tabs creeping in. But not essential.

We can't add custom hooks to the bitbucket repository and we don't have an intermediate local repository that we all push changesets through.

--nate

> Peter
Re: [galaxy-dev] Suggestion / Request for Comments on Galaxy Best Practices - Gradual migration to standard indentation
On Sep 2, 2011, at 9:57 AM, Nate Coraor wrote:
> ...
>> Chatting to some of the Galaxy team at BOSC/ISMB 2011 there is some support for this internally. Again, there are automated tools to do this.
>>
>>> would anyone be willing to add the appropriate hooks to the central repository as well?
>>
>> As long as there are no false positives identified during the initial tab/space conversion that seems sensible to prevent new tabs creeping in. But not essential.
>
> We can't add custom hooks to the bitbucket repository and we don't have an intermediate local repository that we all push changesets through.
>
> --nate

The same applies for most public repos like bitbucket; github is the same (only post-receive is allowed there). I'm unsure whether repository hooks are stored on github, but they are definitely ignored (only specific post-receive hooks are allowed, and these are set up via the github admin API).

With git the only way I can think of to set something like this up is client-side (pre/post-commit, or pre-push), then maybe have a separate post-checkout hook to set everything up after a clone. I assume hg has a similar mechanism.

chris
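The client-side check discussed above could be as small as a script that reports tab-indented lines. The following Python sketch shows only that core check; it is an illustration, not an existing Galaxy, git or hg hook, and the function name is made up. Wiring it into a pre-commit hook is left out because the hook setup differs between git and hg.

```python
def tab_indented_lines(source):
    """Return the 1-based numbers of lines whose leading whitespace has a tab."""
    offenders = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Leading whitespace is whatever lstrip() would remove.
        indent = line[:len(line) - len(line.lstrip())]
        if "\t" in indent:
            offenders.append(number)
    return offenders


sample = "def f():\n\treturn 1\n\ndef g():\n    return 2\n"
print(tab_indented_lines(sample))
```

A hook built on this would read each changed file, call the function, and refuse the commit when the returned list is non-empty.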
[galaxy-dev] Galaxy egg fetching error? Mac OS X 10.7 (Lion)
Hi,

I've been having trouble running a local instance of Galaxy on a Mac Pro after upgrading to OS X 10.7 (Lion). My Python version is 2.7.1 and Mercurial is 1.9.1 for MacOS X 10.7. I don't have any MacPorts installed. The error I see after running the startup script is pasted at the end of this message. Thanks in advance for any help or advice.

jjw

[~/galaxy-dist] myuserid 10:11 AM ./run.sh
Some eggs are out of date, attempting to fetch...
Warning: MarkupSafe (a dependent egg of Mako) cannot be fetched
Warning: decorator (a dependent egg of sqlalchemy-migrate) cannot be fetched
Warning: simplejson (a dependent egg of WebHelpers) cannot be fetched
Traceback (most recent call last):
  File "./scripts/fetch_eggs.py", line 30, in <module>
    c.resolve() # Only fetch eggs required by the config
  File "/Users/myuserid/galaxy-dist/lib/galaxy/eggs/__init__.py", line 345, in resolve
    egg.resolve()
  File "/Users/myuserid/galaxy-dist/lib/galaxy/eggs/__init__.py", line 195, in resolve
    return self.version_conflict( e.args[0], e.args[1] )
  File "/Users/myuserid/galaxy-dist/lib/galaxy/eggs/__init__.py", line 226, in version_conflict
    r = pkg_resources.working_set.resolve( ( dist.as_requirement(), ), env, egg.fetch )
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 565, in resolve
    raise DistributionNotFound(req) # XXX put more info here
pkg_resources.DistributionNotFound: numpy==1.6.0
Fetch failed.
[~/galaxy-dist] myuserid 10:12 AM
Re: [galaxy-dev] rpy - No module named rpy CentoOs install
Joseph Hargitai wrote:
> additional info: it is possible on the same node to run manually ./gsummary.py with the header:
>
>     #!/usr/bin/env python
>     import sys, re, tempfile
>     from rpy_options import set_options
>     set_options(RHOME='/apps1/R/2.13.1/intel/lib64/R')
>     from rpy import *
>
> Where else can there be an env setting to prevent this app not finding the mod from within galaxy?

Hi Joe,

If you set RHOME in the environment and then run gsummary.py without the additions, does it work?

--nate

> j
>
> From: Joseph Hargitai
> Sent: Thursday, September 01, 2011 12:28 PM
> To: galaxy-dev@lists.bx.psu.edu
> Subject: rpy - No module named rpy CentoOs install
>
> Hi, On our Ubuntu install stat packages and all that require rpy work fine. On our CentOs install seeing this stubborn error that I did see from previous post to be difficult to fix. At first suspected the SGE issue - environment not transferring to compute nodes. After changing the app to run local had the same issue.
>
> CentOs: 2.6.18-92.1.13.el5
>
> rpy module is in: /apps1/python/2.6.6/intel/lib/python2.6/site-packages
>     _rpy2122.so _rpy2131.so
>
> version:
>     [galaxy@compute-0-65 galaxy-dist]$ python -c "import rpy; print rpy.__version__"
>     1.5.1
>
> path:
>     python -c 'import sys; print "\n".join( sys.path )'
>     /apps1/python/2.6.6/intel/lib/python2.6/site-packages/simplejson-2.0.9-py2.6-linux-x86_64.egg
>     /apps1/python/2.6.6/intel/lib/python2.6/site-packages/Sphinx-1.0.7-py2.6.egg
>     /apps1/python/2.6.6/intel/lib/python2.6/site-packages/docutils-0.7-py2.6.egg
>     /apps1/python/2.6.6/intel/lib/python2.6/site-packages/Jinja2-2.5.5-py2.6.egg
>     /apps1/python/2.6.6/intel/lib/python2.6/site-packages/Pygments-1.4-py2.6.egg
>     /apps1/python/2.6.6/intel/lib/python2.6/site-packages/nose-1.0.0-py2.6.egg
>     /apps1/python/2.6.6/intel/lib/python2.6/site-packages/Traits-3.5.0-py2.6-linux-x86_64.egg
>     /apps1/python/2.6.6/intel/lib/python2.6/site-packages/nibabel-1.0.0-py2.6.egg
>     /apps1/python/2.6.6/intel/lib/python2.6/site-packages/nipype-0.0.0-py2.6.egg
>     /apps1/python/2.6.6/intel/lib/python2.6/site-packages/setuptools-0.6c12dev_r88846-py2.6.egg
>     /apps1/python/2.6.6/intel/lib/python2.6/site-packages/birdsuite-1.0-py2.5.egg
>     /apps1/python/2.6.6/intel/lib/python2.6/site-packages/mpgutils-0.7-py2.5.egg
>     /apps1/python/2.6.6/intel/lib/python26.zip
>     /apps1/python/2.6.6/intel/lib/python2.6
>     /apps1/python/2.6.6/intel/lib/python2.6/plat-linux2
>     /apps1/python/2.6.6/intel/lib/python2.6/lib-tk
>     /apps1/python/2.6.6/intel/lib/python2.6/lib-old
>     /apps1/python/2.6.6/intel/lib/python2.6/lib-dynload
>     /apps1/python/2.6.6/intel/lib/python2.6/site-packages
>     /apps1/python/2.6.6/intel/lib/python2.6/site-packages/PIL
>
> compiled against /R/2.13.1
>
> env:
>     export PATH=.:\
>     /apps1/R/2.13.1/intel/bin:\
>     /apps1/python/2.6.6/intel/bin:\
>     /apps1/pipe/bowtie/0.12.7/intel:\
>     /apps1/pipe/bwa/0.5.9/intel:\
>     /apps1/samtools/0.1.13/intel/bin:\
>     /apps1/fastx_toolkit/0.0.13/intel/bin:\
>     /apps1/maq/maq-0.7.1:\
>     /apps1/maq/maq-0.7.1/scripts:\
>     /apps1/bfast/bfast-0.6.5a/butil:\
>     /apps1/bfast/bfast-0.6.5a/scripts:\
>     /apps1/abyss/1.2.7/intel/bin:\
>     /apps1/velvet/velvet_1.0.12:\
>     /apps1/pipe/tophat/1.3.0/intel/bin:\
>     /apps1/pipe/cufflinks/1.0.3/intel/bin:\
>     /apps1/blast/2.2.25/gnu/bin:\
>     /apps1/blast+/2.2.5/gnu/bin:\
>     /apps1/sputnik/intel/bin:\
>     /apps1/taxonomy/intel/bin:\
>     /apps1/add_scores/add_scores:\
>     /apps1/emboss/6.4.0/intel/bin:\
>     /apps1/hyphy/hyphy/HYPHY:\
>     /apps1/lastz/1.02.00:\
>     /apps1/perm/0.3.6/intel/bin:\
>     /apps1/beam2/intel/bin:\
>     /apps1/pass2/intel/bin:\
>     /apps1/plink/1.07/intel/bin:\
>     /apps1/fbat/2.0.3/bin:\
>     /apps1/eigensoft/3.0/intel/bin:\
>     /apps1/mosaik/Mosaik-1.1.0021-Linux-x64/bin:\
>     /apps1/freebayes/freebayes.git/bin:\
>     $PATH
>
>     export LD_LIBRARY_PATH=.:\
>     /apps1/python/2.6.6/intel/lib:\
>     /apps1/libgtextutils/0.6/intel/lib:\
>     /apps1/emboss/6.4.0/intel/lib:\
>     /apps1/intel/lib/intel64:\
>     /apps1/intel/mkl/lib/em64t:\
>     /apps1/tcltk/8.5.9/intel/lib:\
>     /apps1/zlib/1.2.5/intel/lib:\
>     /apps1/graphviz/2.26.3/intel/lib:\
>     /apps1/python/2.6.6/intel/lib/python2.6/site-packages/simtk/chem/openmm/OpenMM:\
>     /apps1/python/2.6.6/intel/lib/python2.6/site-packages:\
>     /apps1/libpng/1.5.0/intel/lib:\
>     /apps1/R/2.13.1/intel/lib64/R/lib:\
>     $LD_LIBRARY_PATH
>
>     export PKG_CONFIG_PATH=.:\
>     /apps1/R/2.13.1/intel/lib64/pkgconfig:\
>     /apps1/libgtextutils/0.6/intel/lib/pkgconfig:\
>     /apps1/sparsehash/1.11/intel/lib/pkgconfig:\
>     $PKG_CONFIG_PATH
>
>     export CLASSPATH=.:\
>     /apps1/gatk/gatk-git/dist:\
>     /apps1/gatk/gatk-git/lib:\
>     /apps1/srma/srma-0.1.13:\
>     /apps1/haploview/4.2:\
>     /apps1/picard/picard-tools-1.50:\
>     /apps1/fastqc/fastqc-0.9.5:\
>     $CLASSPATH
>
> best, joe
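Nate's question amounts to passing RHOME through the process environment instead of hard-coding it in the script. As a minimal sketch of that mechanism (written for a current Python 3 interpreter, not the Python 2.6 used in the thread), the helper below starts a child process with RHOME set and shows that the child sees it. The helper name is made up; the RHOME value is the one from Joe's message. Whether rpy then picks the variable up is exactly what Nate is asking Joe to try, so it is not asserted here.

```python
import os
import subprocess
import sys


def run_with_rhome(command, rhome):
    """Run `command` with RHOME added to a copy of the current environment."""
    env = dict(os.environ)
    env["RHOME"] = rhome
    return subprocess.run(command, env=env, capture_output=True, text=True)


# The child here is just a Python one-liner that echoes RHOME back.
result = run_with_rhome(
    [sys.executable, "-c", "import os; print(os.environ['RHOME'])"],
    "/apps1/R/2.13.1/intel/lib64/R",
)
print(result.stdout.strip())
```

The same idea applies to a shell: exporting RHOME before launching the script makes the variable visible to the script and to anything it starts.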
Re: [galaxy-dev] Modifying OpenID providers
Hi Nikolai,

It's best to send questions directly to the mailing list, so they reach the widest audience and the right people. There are some responses below:

Nikolai Vazov wrote:
> Hi, Nate,
>
> You helped me install Galaxy with a DB hosted on a remote server via SSL connection. Thanks a lot again for your help. I have been struggling to add OpenID providers to Galaxy, but have not been very successful. I have some questions:
>
> 1) Where are the variables (OpenID providers) in the dropdown menu stored? In a file, the DB? The template (login.mako) has a line
>
>     <%def name="render_openid_form( referer, auto_associate, openid_providers )">
>
> but where is the data for openid_providers? (I am a newbie in python ...)
>
> 2) In which files do you configure them?

The list is hardcoded in:

    lib/galaxy/web/controllers/user.py

These should ultimately be moved to a configuration file.

> 3) The provider I want to add uses SAML 2.0, which is the package I need for python? pysaml2.0?

I haven't worked with SAML before so unfortunately I can't be of much help here.

--nate

> Thank you for your help
> Nikolai
> --
> Nikolay Vazov, PhD
> Research Computing Centre - http://hpc.uio.no
> USIT, University of Oslo
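Nate mentions that the provider list should ultimately live in a configuration file. Purely as an illustration of that idea, and not as anything that exists in Galaxy, a provider list could be read from an INI-style file as sketched below. The section name, provider names and URLs are all hypothetical, and the code targets a current Python 3 interpreter.

```python
import configparser

# Hypothetical configuration; none of these names or URLs come from Galaxy.
EXAMPLE_CONFIG = """
[openid_providers]
Example Provider = https://openid.example.org/
Another Provider = https://login.example.net/openid
"""


def load_openid_providers(config_text):
    """Return a {provider name: URL} dict from the [openid_providers] section."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep provider names exactly as written
    parser.read_string(config_text)
    return dict(parser.items("openid_providers"))


providers = load_openid_providers(EXAMPLE_CONFIG)
for name in sorted(providers):
    print("%s -> %s" % (name, providers[name]))
```

A dictionary like this could then be handed to the template in place of the hardcoded list, though how user.py would consume it is not covered in the thread.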
Re: [galaxy-dev] link file bug in the new version
remy d1 wrote:
> Hi, We found a little problem in the new galaxy release. When we upload a dataset library from filesystem : Admin -> Manage data libraries -> create new data library -> Add dataset -> Upload files from filesystem path -> Link to files without copying
>
> If galaxy user is the owner of this file or if he has write permission on it (on the filesystem), the file is deleted !! I do not think it is the normal behaviour...

Hi Remy,

I'm unable to duplicate this behavior. Would it be possible for you to do some debugging on your end to determine when this is happening? If it's Galaxy's code it would probably be somewhere in tools/data_source/upload.py

--nate

> Regards.
Re: [galaxy-dev] Galaxy egg fetching error? Mac OS X 10.7 (Lion)
Whyte, Jeffrey wrote:
> Hi, I've been having trouble running a local instance of Galaxy on a Mac Pro after upgrading to OS X 10.7 (Lion). My Python version is 2.7.1 and Mercurial is 1.9.1 for MacOS X 10.7. I don't have any MacPorts installed. The error I see after running the startup script is pasted at the end of this message. Thanks in advance for any help or advice.
>
> jjw

Hi Jeffrey,

We haven't yet gotten our dependencies up to speed on Lion. Could you grab a copy of Python from python.org and use this? It'll install under /Library/Frameworks/Python.framework, just add the correct bin/ directory to the front of your $PATH and start Galaxy as normal.

Sorry for the inconvenience,
--nate

> [~/galaxy-dist] myuserid 10:11 AM ./run.sh
> Some eggs are out of date, attempting to fetch...
> ...
> pkg_resources.DistributionNotFound: numpy==1.6.0
> Fetch failed.
> [~/galaxy-dist] myuserid 10:12 AM
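Nate's advice depends on the python.org interpreter being found first on $PATH. One way to check which `python` a given PATH value resolves to is sketched below in Python 3, using the standard library's `shutil.which`. The exact bin directory is an assumption: the message above only says the installer puts Python under /Library/Frameworks/Python.framework, so adjust the path to whatever the installer actually created.

```python
import os
import shutil


def first_python_on_path(path_value, name="python"):
    """Return the first executable called `name` found on `path_value`, or None."""
    return shutil.which(name, path=path_value)


# Assumed location of the python.org build; not stated in the thread.
framework_bin = "/Library/Frameworks/Python.framework/Versions/2.7/bin"
candidate_path = os.pathsep.join([framework_bin, os.environ.get("PATH", "")])
print(first_python_on_path(candidate_path))
```

If the printed path is not inside the framework directory, the new interpreter is either missing or not at the front of the PATH that Galaxy's run.sh will see.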
Re: [galaxy-dev] Galaxy egg fetching error? Mac OS X 10.7 (Lion)
On Sep 2, 2011, at 1:16 PM, Nate Coraor wrote:
> We haven't yet gotten our dependencies up to speed on Lion. Could you grab a copy of Python from python.org and use this? It'll install under /Library/Frameworks/Python.framework, just add the correct bin/ directory to the front of your $PATH and start Galaxy as normal.

i'm assuming this is similar?
https://bitbucket.org/galaxy/galaxy-central/issue/616/bad-eggs-for-106
[galaxy-dev] handling galaxy updates
Hi,

I am curious to know which revision of the Galaxy code gets deployed on the main ( http://main.g2.bx.psu.edu/ ) and test ( http://test.g2.bx.psu.edu/ ) instances of Galaxy. I had assumed that the active development repository, galaxy-central, is deployed on the test instance and that the stable galaxy-dist code is deployed on the main instance. However, it seems that the main instance is updated more frequently than the galaxy-dist repository.

Should local Galaxy instances keep up with the revisions on PSU's main instance, or wait for stable code to be released in the galaxy-dist repository? We have been keeping up with the galaxy-dist repository and not with galaxy-central. Are other sites following a similar update model?

--
Thanks,
Shantanu.
Re: [galaxy-dev] rpy - No module named rpy CentoOs install
Nate, could we go to the beginning of the issue: where is the galaxy env set? I've seen a few post but I can only gather partial info. - it is NOT set from the galaxy user .bashrc or .profile - if it is indeed partially set from /etc/profile - using a Rocks cluster leaves you with many entries there to ponder - if it is using ld.so.conf.d as well - it will read /usr/lib64 entries etc... - is there a precise way to see what env is used for galaxy? The log script gives you a nice read on the python path but is there a way to see all envs? Looking at envs as the user galaxy does not equate what galaxy ends up using. multiple issues on the CentOS install: I found the setting or non-setting leading to the missing rpy module by looking at the runner log script - while it was loading python2.6.6 it was also loading the site-packages and other python parts from /usr/lib64...python2.4 Once I edited run.sh to use the correct python and correct R path and added the RHOME to the rpy dependent scripts - this problem went away seemingly only to produce an env looking issue: sh rm command not found when running rpy dependent applications. Did somehow the edit destroy the /bin and usr/bin path? Would these be set in run.sh as well? To your question: where do you set RHOME in the env? We'd prefer to set all path options in run.sh in case all above is true that you cannot set it in ~/.bash* best, joe From: Nate Coraor [n...@bx.psu.edu] Sent: Friday, September 02, 2011 1:40 PM To: Joseph Hargitai Cc: galaxy-dev@lists.bx.psu.edu Subject: Re: [galaxy-dev] rpy - No module named rpy CentoOs install Joseph Hargitai wrote: additional info: it is possible on the same node to run manually ./gsummary.py with the header: #!/usr/bin/env python import sys, re, tempfile from rpy_options import set_options set_options(RHOME='/apps1/R/2.13.1/intel/lib64/R') from rpy import * Where else can there be an env setting to prevent this app not finding the mod from within galaxy? 
Hi Joe,

If you set RHOME in the environment and then run gsummary.py without the additions, does it work?

--nate

j

From: Joseph Hargitai
Sent: Thursday, September 01, 2011 12:28 PM
To: galaxy-dev@lists.bx.psu.edu
Subject: rpy - No module named rpy CentoOs install

Hi,

On our Ubuntu install, the stat packages and everything else that requires rpy work fine. On our CentOS install we are seeing this stubborn error, which I saw in a previous post to be difficult to fix. At first I suspected the SGE issue: environment not transferring to compute nodes. After changing the app to run locally, I had the same issue.

CentOS: 2.6.18-92.1.13.el5

rpy module is in:
/apps1/python/2.6.6/intel/lib/python2.6/site-packages
_rpy2122.so
_rpy2131.so

version:
[galaxy@compute-0-65 galaxy-dist]$ python -c "import rpy; print rpy.__version__"
1.5.1

path:
python -c 'import sys; print "\n".join( sys.path )'
/apps1/python/2.6.6/intel/lib/python2.6/site-packages/simplejson-2.0.9-py2.6-linux-x86_64.egg
/apps1/python/2.6.6/intel/lib/python2.6/site-packages/Sphinx-1.0.7-py2.6.egg
/apps1/python/2.6.6/intel/lib/python2.6/site-packages/docutils-0.7-py2.6.egg
/apps1/python/2.6.6/intel/lib/python2.6/site-packages/Jinja2-2.5.5-py2.6.egg
/apps1/python/2.6.6/intel/lib/python2.6/site-packages/Pygments-1.4-py2.6.egg
/apps1/python/2.6.6/intel/lib/python2.6/site-packages/nose-1.0.0-py2.6.egg
/apps1/python/2.6.6/intel/lib/python2.6/site-packages/Traits-3.5.0-py2.6-linux-x86_64.egg
/apps1/python/2.6.6/intel/lib/python2.6/site-packages/nibabel-1.0.0-py2.6.egg
/apps1/python/2.6.6/intel/lib/python2.6/site-packages/nipype-0.0.0-py2.6.egg
/apps1/python/2.6.6/intel/lib/python2.6/site-packages/setuptools-0.6c12dev_r88846-py2.6.egg
/apps1/python/2.6.6/intel/lib/python2.6/site-packages/birdsuite-1.0-py2.5.egg
/apps1/python/2.6.6/intel/lib/python2.6/site-packages/mpgutils-0.7-py2.5.egg
/apps1/python/2.6.6/intel/lib/python26.zip
/apps1/python/2.6.6/intel/lib/python2.6
/apps1/python/2.6.6/intel/lib/python2.6/plat-linux2
/apps1/python/2.6.6/intel/lib/python2.6/lib-tk
/apps1/python/2.6.6/intel/lib/python2.6/lib-old
/apps1/python/2.6.6/intel/lib/python2.6/lib-dynload
/apps1/python/2.6.6/intel/lib/python2.6/site-packages
/apps1/python/2.6.6/intel/lib/python2.6/site-packages/PIL

compiled against /R/2.13.1

env:
export PATH=.:\
/apps1/R/2.13.1/intel/bin:\
/apps1/python/2.6.6/intel/bin:\
/apps1/pipe/bowtie/0.12.7/intel:\
/apps1/pipe/bwa/0.5.9/intel:\
/apps1/samtools/0.1.13/intel/bin:\
/apps1/fastx_toolkit/0.0.13/intel/bin:\
/apps1/maq/maq-0.7.1:\
/apps1/maq/maq-0.7.1/scripts:\
/apps1/bfast/bfast-0.6.5a/butil:\
/apps1/bfast/bfast-0.6.5a/scripts:\
/apps1/abyss/1.2.7/intel/bin:\
/apps1/velvet/velvet_1.0.12:\
/apps1/pipe/tophat/1.3.0/intel/bin:\
/apps1/pipe/cufflinks/1.0.3/intel/bin:\
/apps1/blast/2.2.25/gnu/bin:\
/apps1/blast+/2.2.5/gnu/bin:\
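On the question in this thread of how to see the environment a tool really runs with (as opposed to what a login shell for the galaxy user shows), one possible approach is to have the tool process report on itself. The following is only a sketch and not part of Galaxy: the function names `describe_environment` and `foreign_entries` are made up for illustration, and the checks are limited to the symptoms mentioned above (RHOME unset, sys.path entries from another Python release such as python2.4, and /bin or /usr/bin missing from PATH).

```python
import os
import re
import sys


def foreign_entries(path, running):
    """Return sys.path entries that name a Python release other than `running`."""
    found = []
    for entry in path:
        # e.g. "/usr/lib64/python2.4/site-packages" yields {"2.4"}
        versions = set(re.findall(r"python(\d+\.\d+)", entry))
        if versions and running not in versions:
            found.append(entry)
    return found


def describe_environment(environ=None, path=None, running=None):
    """Summarise the settings that decide whether `import rpy` can work."""
    environ = os.environ if environ is None else environ
    path = sys.path if path is None else path
    running = running or "%d.%d" % sys.version_info[:2]
    path_dirs = [d for d in environ.get("PATH", "").split(os.pathsep) if d]
    return {
        "executable": sys.executable,
        "running_python": running,
        "RHOME": environ.get("RHOME"),
        "PATH": path_dirs,
        "LD_LIBRARY_PATH": environ.get("LD_LIBRARY_PATH"),
        # "rm: command not found" suggests one of these was dropped from PATH
        "missing_from_PATH": [d for d in ("/bin", "/usr/bin") if d not in path_dirs],
        "foreign_sys_path": foreign_entries(path, running),
    }


if __name__ == "__main__":
    # Write to stderr so the report does not mix with the tool's normal output.
    for key, value in sorted(describe_environment().items()):
        sys.stderr.write("%s = %r\n" % (key, value))
```

Calling `describe_environment()` temporarily from the top of an rpy-dependent tool script, and comparing the result with the same call made from an interactive shell, should show which of the settings differ between the two.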
Re: [galaxy-dev] disk space and file formats
What, like a BAM file of unaligned reads? Uses gzip compression, and tracks the pairing information explicitly :) Some tools will already take this as an input format, but not all. ah, yes, precisely. i actually think illumina's pipeline produces files in this format now. wrappers which create a temporary fastq file would need to be created but that's easy enough. ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-dev] handling galaxy updates
Thanks for the reply, Kanwei. Is there an announcement or news feed that people can subscribe to in order to know when the main Galaxy instance is updated? This would help sites that want to keep up with PSU's main Galaxy instance. Do you announce it on the galaxy-user list?

-- Shantanu.

On Sep 2, 2011, at 1:57 PM, Kanwei Li wrote:

Hi Shantanu, Test usually tracks galaxy-central pretty closely, and we do update main more often than galaxy-dist (you can see the version on main at the index page). If nothing breaks on main for a while we do a galaxy-dist release so we are generally confident that it will be stable. Thanks, K

On Fri, Sep 2, 2011 at 2:22 PM, Shantanu Pavgi pa...@uab.edu wrote:

Hi, I am curious to know which revision of the Galaxy code gets deployed in the main ( http://main.g2.bx.psu.edu/ ) and test ( http://test.g2.bx.psu.edu/ ) instances of Galaxy. I was thinking that the active development repository, galaxy-central, gets deployed in the test instance and the stable galaxy-dist code gets deployed in the main instance. However, it seems that the main instance is updated more frequently than the galaxy-dist repository. Should local Galaxy instances keep up with the revisions on PSU's main instance, or wait for stable code to be released in the galaxy-dist repository? We have been keeping up with the galaxy-dist repository and not with the galaxy-central updates. Are other sites following a similar update model?

-- Thanks, Shantanu.
Re: [galaxy-dev] disk space and file formats
On Sep 2, 2011, at 3:02 PM, Edward Kirton wrote:

What, like a BAM file of unaligned reads? Uses gzip compression, and tracks the pairing information explicitly :) Some tools will already take this as an input format, but not all.

ah, yes, precisely. i actually think illumina's pipeline produces files in this format now. wrappers which create a temporary fastq file would need to be created but that's easy enough.

My argument against that is the cost of going from BAM -> temp fastq may be prohibitive, e.g. the need to generate very large temp fastq files on the fly as input for various applications may lead one back to just keeping a permanent FASTQ around anyway. One could probably get better performance out of a simpler format that removes most of the 'AM' parts of BAM. Or is the idea that the file itself is modified, like a database? And how would indexing work (BAM uses binning on the match to the reference seq), or does it matter?

I recall hdf5 was planned as an alternate format (PacBio uses it, IIRC), and of course there is NCBI's .sra format. Anyone using the latter two?

chris
Re: [galaxy-dev] Suggestion / Request for Comments on Galaxy Best Practices - Gradual migration to standard indentation
Hello all,

I've created a wiki page on coding best practices to record what is actually done, and the results of discussions like these: http://wiki.g2.bx.psu.edu/Develop/Best Practices

So far, it only lists 2 standards:

1. 4 spaces per indent level
2. Use spaces, not tabs.

I'll continue to watch this list and add best practices accordingly.

Dave C.

-- http://galaxyproject.org/ http://getgalaxy.org/ http://usegalaxy.org/ http://galaxyproject.org/wiki/
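For anyone wanting to check existing files against the two standards listed above, a small script can flag the offending lines. This is a rough sketch only, not an official Galaxy tool; the function name `indentation_problems` is invented for the example, and it will also flag continuation lines that are deliberately aligned under an opening bracket, so its output needs a human eye.

```python
def indentation_problems(source):
    """Return (line_number, reason) pairs for lines that break the two rules."""
    problems = []
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.lstrip(" \t")
        if not stripped:
            continue  # blank lines carry no indentation to check
        indent = line[: len(line) - len(stripped)]
        if "\t" in indent:
            problems.append((number, "tab in indentation"))
        elif len(indent) % 4 != 0:
            problems.append((number, "indent is not a multiple of 4 spaces"))
    return problems
```

Feeding it the text of a file (for example `indentation_problems(open(path).read())`) gives a list that can be worked through gradually, which fits the idea of a gradual migration rather than one large reformatting commit.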
[galaxy-dev] downloading bowtie indexed files
Hi all, Just wondering if there is a way to download the bowtie indexed files after indexing. It seems that the indexed output is simply a meta-file that points to the directory where the indexed files are kept but what if I want to download all of the indexed files themselves? I wrote something to do this... basically I created an html page that is the output of bowtie-build which then points to the created files... but I'm wondering if there is an easier way...? - Nik.
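One alternative to an HTML page of links would be to pack the index files into a single archive that can be downloaded as one dataset. The sketch below assumes the index was built by bowtie (version 1), whose index files are named `<prefix>.1.ebwt`, `<prefix>.rev.1.ebwt` and so on; the function name `bundle_index` and its arguments are hypothetical, and how the resulting archive is registered as a Galaxy output is left open.

```python
import glob
import os
import tarfile


def bundle_index(index_dir, prefix, archive_path):
    """Pack every <prefix>.*.ebwt file in index_dir into one .tar.gz archive."""
    pattern = os.path.join(glob.escape(index_dir), glob.escape(prefix) + ".*.ebwt")
    members = sorted(glob.glob(pattern))
    if not members:
        raise ValueError("no index files match %s" % pattern)
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in members:
            # Store bare file names so the archive unpacks into the current directory.
            archive.add(path, arcname=os.path.basename(path))
    return [os.path.basename(path) for path in members]
```

The .ebwt files are binary and may not shrink much under gzip, so opening the archive with mode "w" (plain tar) instead of "w:gz" may be the better trade-off for large genomes.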
Re: [galaxy-dev] disk space and file formats
i actually think illumina's pipeline produces files in this format (unaligned-bam) now.

Oh do they? - that's interesting. Do you have a reference/link?

i caught wind of this at the recent illumina user's conference but i asked someone in our sequencing team to confirm and he hadn't heard of this. it must be limited to the forthcoming miseq sequencer for the time being, but may make its way to the big sequencers later. apparently illumina is thinking about storage as well. i seem to recall the speaker saying they won't produce srf files anymore, but again, this was a talk about the miseq so may not apply to the other sequencers.

wrappers which create a temporary fastq file would need to be created but that's easy enough.

My argument against that is the cost of going from BAM -> temp fastq may be prohibitive, e.g. the need to generate very large temp fastq files on the fly as input for various applications may lead one back to just keeping a permanent FASTQ around anyway.

True - if you can't update the tools you need to take BAM. In some cases at least you can pipe the gzipped FASTQ into alignment tools which accept FASTQ on stdin, so there is no temp file per se.

the tools really do need to support the format; the tmpfile was simply a workaround. some tools already support bam, more currently support fastq.gz. (someone here made the wrong bet years ago and had adopted a site-wide fastq.bz2 standard which only recently changed to fastq.gz.) but if illumina does start producing bam files in the future, then we can expect more tools to support that format. until they do, probably fastq.gz is a safe bet.

of course there is a computational cost to compressing/uncompressing files but that's probably better than storing unnecessarily huge files. it's a trade-off.

similarly, there's a trade-off involved in limiting read qc tools to a single/few big tools which wrap several tools, with many options. users can't play around with read qc, but playing around may be too expensive (computationally and storage-wise). for the most part, a standard qc will do. one can spend a lot of time and effort to squeeze a bit more useful data out of a bad library, for example, when they probably should have just sequenced another library. i favor leaving the playing around to the r&d/development/qc team and just offering a canned/vetted qc solution to the average user.

I recall hdf5 was planned as an alternate format (PacBio uses it, IIRC), and of course there is NCBI's .sra format. Anyone using the latter two?

Moving from the custom BGZF modified gzip format used in BAM to HDF5 has been proposed on the samtools mailing list (as Chris knows), and there is a proof of principle implementation too in BioHDF, http://www.hdfgroup.org/projects/biohdf/ The SAM/BAM group didn't seem overly enthusiastic though. For the NCBI's .sra format, there is no open specification, just their public domain source code: http://seqanswers.com/forums/showthread.php?t=12054

i believe hdf5 is an indexed data structure which, as you mentioned, isn't required for unprocessed reads.

since i'm rapidly running out of storage, i think the best immediate solution for me is to deprecate all the fastq datatypes in favor of a new fastqsangergz and to bundle the read qc tools to eliminate intermediate files. sure, users won't be able to play around with their data as much, but my disk is 88% full and my cluster has been 100% occupied for 2-months straight, so less choice is probably better.
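To make the fastq.gz option concrete: a Python tool can read the compressed file directly, so no uncompressed copy has to be written to disk. The sketch below uses only the standard gzip module; the function name `read_fastq_gz` is invented for the example, and it assumes the common four-lines-per-record FASTQ layout (no wrapped sequence or quality lines).

```python
import gzip


def read_fastq_gz(path):
    """Yield (name, sequence, quality) from a gzipped FASTQ, decompressing on the fly."""
    with gzip.open(path, "rt") as handle:
        while True:
            header = handle.readline()
            if not header:
                return  # end of file
            if not header.strip():
                continue  # tolerate stray blank lines between records
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # the '+' separator line
            quality = handle.readline().rstrip("\n")
            yield header.rstrip("\n")[1:], sequence, quality
```

The cost is the decompression work on every read of the file, which is the compute-versus-storage trade-off discussed above.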
Re: [galaxy-dev] Problems with load_workflow_editor
No, it is happening only on my instance of Galaxy. I am using an older version of Galaxy. Could it be that some browsers do not support the old Galaxy version?

On Sat, Sep 3, 2011 at 12:30 AM, Kanwei Li kan...@gmail.com wrote:

Hi Shashi, Does this only happen on your instance or does it happen on our public instance as well? Thanks, K

On Fri, Sep 2, 2011 at 6:46 AM, shashi shekhar meshash...@gmail.com wrote:

Hi All, I am not able to create a workflow in my local instance of Galaxy. The browser displays only the loading picture when I click on the edit option of a workflow. How can we resolve this type of problem? I am using an old version of Galaxy.

192.168.60.115, 145.139.1.156 - - [02/Sep/2011:16:13:50 +0600] GET /workflow/get_datatypes?_=1314940546255 HTTP/1.1 200 - http://garu.ac.in/workflow/editor?id=df7a1f0c02a5b08e; Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.17) Gecko/20110420 Firefox/3.6.17
192.168.60.115, 145.139.1.156 - - [02/Sep/2011:16:13:51 +0600] GET /workflow/load_workflow?_=1314940546462id=df7a1f0c02a5b08e_=true HTTP/1.1 200 - http://garu.ac.in/workflow/editor?id=df7a1f0c02a5b08e; Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.17) Gecko/20110420 Firefox/3.6.17

Regards, shashi shekhar
Re: [galaxy-dev] disk space and file formats
On Saturday, September 3, 2011, Edward Kirton eskir...@lbl.gov wrote:

of course there is a computational cost to compressing/uncompressing files but that's probably better than storing unnecessarily huge files. it's a trade-off.

It may still be faster due to less IO; it probably depends on your hardware.

since i'm rapidly running out of storage, i think the best immediate solution for me is to deprecate all the fastq datatypes in favor of a new fastqsangergz and to bundle the read qc tools to eliminate intermediate files. sure, users won't be able to play around with their data as much, but my disk is 88% full and my cluster has been 100% occupied for 2-months straight, so less choice is probably better.

In your position I agree that is a pragmatic choice. You might be able to modify the file upload code to gzip any FASTQ files... that would prevent uncompressed FASTQ getting into new histories. I wonder if Galaxy would benefit from a new fastqsanger-gzip (etc) datatype? However this seems generally useful (not just for FASTQ) so perhaps a more general mechanism would be better where tool XML files can say which file types they accept and which of those can/must be compressed (possibly not just gzip format?).

Peter
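As a rough idea of what gzipping at upload time could look like, the sketch below compresses a file unless it already starts with the gzip magic bytes. It is not Galaxy's upload code, and the names `is_gzipped` and `gzip_in_place` are made up for the example; where such a step would hook into the upload tool, and how the datatype would be recorded, are assumptions left open here.

```python
import gzip
import os
import shutil

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzipped(path):
    """Report whether the file at path starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def gzip_in_place(path):
    """Compress path to path + '.gz' unless already gzipped; return the final path."""
    if is_gzipped(path):
        return path
    target = path + ".gz"
    with open(path, "rb") as source, gzip.open(target, "wb") as sink:
        shutil.copyfileobj(source, sink)  # streams in chunks, so memory use stays flat
    os.remove(path)  # drop the uncompressed original only after the copy succeeded
    return target
```

Checking the magic bytes rather than the file extension means an upload that is already compressed is left alone, whatever it happens to be called.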
Re: [galaxy-dev] disk space and file formats
On Sep 2, 2011, at 8:02 PM, Peter Cock wrote:

On Fri, Sep 2, 2011 at 9:27 PM, Fields, Christopher J cjfie...@illinois.edu wrote:

On Sep 2, 2011, at 3:02 PM, Edward Kirton wrote:

What, like a BAM file of unaligned reads? Uses gzip compression, and tracks the pairing information explicitly :) Some tools will already take this as an input format, but not all.

ah, yes, precisely. i actually think illumina's pipeline produces files in this format now.

Oh do they? - that's interesting. Do you have a reference/link?

wrappers which create a temporary fastq file would need to be created but that's easy enough.

My argument against that is the cost of going from BAM -> temp fastq may be prohibitive, e.g. the need to generate very large temp fastq files on the fly as input for various applications may lead one back to just keeping a permanent FASTQ around anyway.

True - if you can't update the tools you need to take BAM. In some cases at least you can pipe the gzipped FASTQ into alignment tools which accept FASTQ on stdin, so there is no temp file per se.

Some applications (Velvet for instance) accept gzipped FASTQ, though they may turn around and dump the data out uncompressed.

One could probably get better performance out of a simpler format that removes most of the 'AM' parts of BAM.

Yes, but that means inventing yet another file format. At least gzipped FASTQ is quite straightforward.

Yes.

Or is the idea that the file itself is modified, like a database?

That would be quite a dramatic change from the current Galaxy workflow system - I doubt that would be acceptable in general.

My thought as well.

And how would indexing work (BAM uses binning on the match to the reference seq), or does it matter?

BAM indexing as done in samtools/picard is only for the aligned reads - so no help for a BAM file of unaligned reads. You could use a different indexing system (e.g. by read name) and the same BAM BGZF block offset system (I've tried this as an experiment with Biopython's SQLite indexing of sequence files). However, for tasks taking unaligned reads as input, you generally just iterate over the reads in the order on disk.

I think, unless there is a demonstrable advantage to using unaligned BAM, fastq.gz is the easiest.

I recall hdf5 was planned as an alternate format (PacBio uses it, IIRC), and of course there is NCBI's .sra format. Anyone using the latter two?

Moving from the custom BGZF modified gzip format used in BAM to HDF5 has been proposed on the samtools mailing list (as Chris knows), and there is a proof of principle implementation too in BioHDF, http://www.hdfgroup.org/projects/biohdf/ The SAM/BAM group didn't seem overly enthusiastic though.

Probably not, as it is somewhat a competitor of SAM/BAM (a bit broader in scope, beyond just alignments). As Peter indicated, I know the BioHDF folks (they are here in town); however, my actual question was whether anyone is actually using HDF5 or SRA in production? I haven't seen adoption beyond PacBio, but I have seen some things popping up in Galaxy.

For the NCBI's .sra format, there is no open specification, just their public domain source code: http://seqanswers.com/forums/showthread.php?t=12054 Regards, Peter

Simply gzipping FASTQ seems to give better compression than an .lite.sra file (and I'm not a happy user of their SRA toolset). And of course there is parallel gzip...

chris
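The point made above about piping gzipped FASTQ into a tool that reads FASTQ on stdin, so that no temporary file is needed, could be wrapped roughly as follows. This is a sketch under assumptions: `pipe_fastq_gz` is a made-up name, and `STAND_IN_CONSUMER` is a tiny Python child process that only counts records, standing in for a real aligner whose command line is not shown in this thread.

```python
import gzip
import shutil
import subprocess
import sys

# Stand-in for an aligner that reads FASTQ on stdin: it just counts records.
STAND_IN_CONSUMER = [
    sys.executable,
    "-c",
    "import sys; print(sum(1 for _ in sys.stdin) // 4)",
]


def pipe_fastq_gz(path, command, output_path):
    """Decompress path on the fly into command's stdin; send its stdout to output_path."""
    with open(output_path, "wb") as out:
        # stdout goes straight to a file, so the child can never block on a full pipe
        child = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=out)
        with gzip.open(path, "rb") as reads:
            shutil.copyfileobj(reads, child.stdin)
        child.stdin.close()  # signals end of input to the child
        status = child.wait()
    if status != 0:
        raise subprocess.CalledProcessError(status, command)
    return output_path
```

Only the compressed file and the tool's output ever touch the disk; the uncompressed reads exist just in the pipe between the two processes.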