[galaxy-dev] GATK 3.1
Hi, I've seen there is a new version (3.1-1) of GATK available at http://www.broadinstitute.org/gatk/download . Are there any plans of getting this version into Galaxy in the near future?

Greetings, Thomas

--- Thomas Berner Julius Kühn-Institut (JKI) - Federal Research Centre for Cultivated Plants - Erwin Baur-Straße 27 06484 Quedlinburg - Germany - Phone: ++49 ( 0 ) 3946 47 562 EMail: thomas.ber...@jki.bund.de

___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/ To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
Re: [galaxy-dev] GATK 3.1
On 2014-03-18 14:41, Berner, Thomas wrote: [...]

Hi Thomas, as you probably know there is a wrapper for GATK v.2 on the Tool Shed: http://toolshed.g2.bx.psu.edu/view/iuc/gatk2 which is developed and maintained by Jim Johnson, Björn Grüning and me. GATK v.3 has some new tools and also some changes to logging which prevent the use of the gatk2 wrapper with v.3, so we are planning to create a new gatk3 Tool Shed repository sooner or later; no time frame has been decided yet. You can follow development, and contribute if you want, at this git repository: https://github.com/bgruening/galaxytools

Best, Nicola

-- Nicola Soranzo, Ph.D. Bioinformatics Program, CRS4 Loc. Piscina Manna, 09010 Pula (CA), Italy http://www.bioinformatica.crs4.it/
Re: [galaxy-dev] [CONTENT] Re: Unable to remove old datasets
Thanks, Ravi and Peter. I've added a card to get the allow_user_dataset_purge option into the client and to better show the viable options to the user: https://trello.com/c/RCPZ9zMF

On Fri, Mar 14, 2014 at 11:10 AM, Peter Cock wrote:
On Fri, Mar 14, 2014 at 11:24 AM, Peter Cock wrote:
On Thu, Mar 13, 2014 at 6:40 PM, Sanka, Ravi wrote:

I do not think so. Several individual datasets have been deleted (clicked the upper-right X on the history item box) but no History has been permanently deleted. Is there any indication in the database whether the target dataset or datasets were marked for permanent deletion? In the dataset table, I see fields deleted, purged, and purgable, but nothing that says permanently deleted.

I would welcome clarification from the Galaxy Team, here and on the wiki page, which might benefit from a flow diagram: https://wiki.galaxyproject.org/Admin/Config/Performance/Purge%20Histories%20and%20Datasets My assumption is that using permanently delete in the user interface marks an entry as purgable, and then it will be moved to purged (and the associated file on disk deleted) by the cleanup scripts - but I'm a bit hazy on this, and on why it takes a while for a user's usage figures to change.

Hmm. Right now I'm unable (via the web interface) to permanently delete a history - it stays stuck as deleted, and thus (presumably) won't get purged by the cleanup scripts. I've tried:

1. Load the problem history
2. Rename the history DIE DIE to avoid confusion
3. Top right menu, Delete permanently
4. Prompted "Really delete the current history permanently? This cannot be undone", OK
5. Told "History deleted, a new history is active"
6. Top right menu, Saved Histories
7. Click Advanced Search, status all
8. Observe the DIE DIE history is only "deleted" (while other older histories are "deleted permanently") (BAD)
9. Run the cleanup scripts:
$ sh scripts/cleanup_datasets/delete_userless_histories.sh
$ sh scripts/cleanup_datasets/purge_histories.sh
$ sh scripts/cleanup_datasets/purge_libraries.sh
$ sh scripts/cleanup_datasets/purge_folders.sh
$ sh scripts/cleanup_datasets/purge_datasets.sh
10. Reload the saved history list, no change.
11. Using the drop down menu, select Delete Permanently
12. Prompted "History contents will be removed from disk, this cannot be undone. Continue", OK
13. No change to history status (BAD)
14. Tick the check-box, and use the Delete Permanently button at the bottom of the page
15. Prompted "History contents will be removed from disk, this cannot be undone. Continue", OK
16. No change to history status (BAD)
17. Run the cleanup scripts, no change.

Note that in my universe_wsgi.ini I have not (yet) set: allow_user_dataset_purge = True

If this setting is important, then the interface seems confused - and if quotas are enforced, very frustrating :(

Peter
Re: [galaxy-dev] [CONTENT] Re: Unable to remove old datasets
I believe it's a (BAD) silent failure mode in the server code. If I understand correctly, the purge request isn't raising an error when it gets to the 'allow_user_dataset_purge' check and instead is silently marking (or re-marking) the datasets as deleted. I would rather it fail with a 403 error if purge is explicitly requested. That said, it would of course be better to remove the purge operation based on the configuration than to show an error after we've found you can't do the operation. The same holds true for the 'permanently remove this dataset' link in deleted datasets. I'll see if I can find out the answer to your question on the cleanup scripts.

On Tue, Mar 18, 2014 at 10:49 AM, Peter Cock wrote:
On Tue, Mar 18, 2014 at 2:14 PM, Carl Eberhard wrote:

Thanks, Ravi and Peter. I've added a card to get the allow_user_dataset_purge option into the client and to better show the viable options to the user: https://trello.com/c/RCPZ9zMF

Thanks Carl - so this was a user interface bug, showing the user non-functional permanent delete (purge) options. That's clearer now. In this situation can the user just 'delete', and wait N days for the cleanup scripts to actually purge the files and free the space? (It seems N=10 in scripts/cleanup/purge_*.sh at least; elsewhere, like the underlying Python script, the default looks like N=60.)

Regards, Peter
Re: [galaxy-dev] BugFix for dynamic_options ParamValueFilter to run IUC SnpEff in a workflow
Hey JJ, Thanks for the bug report. I can confirm the issue. I think the problem probably is that the other_values thing you are printing out there is very different when rendering tools (it is the value of things at that depth of the tool state tree) versus workflows (in which it is the global state of the tree from the top). The tool variant is probably the right approach since it works and has the nice advantage of avoiding ambiguities that arise otherwise. https://bitbucket.org/galaxy/galaxy-central/pull-request/349/bring-workflow-parameter-context/diff

I have opened a pull request with an attempt to bring workflows in line with tools - it seems to fix snpEff for me locally - can you confirm? Any chance this also solves your other problem with the XY plotting tool (Pull Request #336)? -John

On Fri, Feb 28, 2014 at 12:12 PM, Jim Johnson wrote:

The current code in lib/galaxy/tools/parameters/dynamic_options.py only searches the top layer of the dict to find the dependency value. A fix is provided in pull request #343: Need to traverse the other_value dict to find dependencies for ParamValueFilter in SnpEff tool_config inputs:

<inputs>
  <param format="vcf,tabular,pileup,bed" name="input" type="data" label="Sequence changes (SNPs, MNPs, InDels)"/>
  ...
<conditional name="snpDb">
  <param name="genomeSrc" type="select" label="Genome source">
    <option value="cached">Locally installed reference genome</option>
    <option value="history">Reference genome from your history</option>
    <option value="named">Named on demand</option>
  </param>
  <when value="cached">
    <param name="genomeVersion" type="select" label="Genome"> <!-- GENOMEDESCRIPTION -->
      <options from_data_table="snpeff_genomedb">
        <filter type="unique_value" column="0"/>
      </options>
    </param>
    <param name="extra_annotations" type="select" display="checkboxes" multiple="true" label="Additional Annotations">
      <help>These are available for only a few genomes</help>
      <options from_data_table="snpeff_annotations">
        <filter type="param_value" ref="genomeVersion" key="genome" column="0"/>
        <filter type="unique_value" column="1"/>
      </options>
    </param>

When running a workflow (input.vcf -> SnpEff), the values in the ParamValueFilter filter_options function are:

self.ref_name = 'genomeVersion'
other_values = {u'spliceSiteSize': '1', u'filterHomHet': 'no_filter', u'outputFormat': 'vcf', u'filterOut': None, u'inputFormat': 'vcf', u'filterIn': 'no_filter', u'udLength': '5000', u'generate_stats': True, u'noLog': True, u'chr': 'None', u'intervals': None, u'snpDb': {'extra_annotations': None, 'regulation': None, 'genomeVersion': 'GRCh37.71', 'genomeSrc': 'cached', '__current_case__': 0}, u'offset': '', u'input': <galaxy.tools.parameters.basic.DummyDataset object at 0x11451b8d0>, u'transcripts': None, u'annotations': ['-canon', '-lof', '-onlyReg']}

Since 'genomeVersion' isn't in the keys of other_values, but rather in other_values['snpDb'], this failed the assertion:

assert self.ref_name in other_values, "Required dependency '%s' not found in incoming values" % self.ref_name

Pull request #343:

$ hg diff lib/galaxy/tools/parameters/dynamic_options.py
diff -r 95517f976cca lib/galaxy/tools/parameters/dynamic_options.py
--- a/lib/galaxy/tools/parameters/dynamic_options.py    Thu Feb 27 16:56:25 2014 -0500
+++ b/lib/galaxy/tools/parameters/dynamic_options.py    Fri Feb 28 11:37:04 2014 -0600
@@ -177,8 +177,27 @@
         return self.ref_name
     def filter_options( self, options, trans, other_values ):
         if trans is not None and trans.workflow_building_mode:
             return []
-        assert self.ref_name in other_values, "Required dependency '%s' not found in incoming values" % self.ref_name
-        ref = other_values.get( self.ref_name, None )
+        ## Depth first traversal to find the value for a dependency
+        def get_dep_value(param_name, dep_name, cur_val, layer):
+            dep_val = cur_val
+            if isinstance(layer, dict):
+                if dep_name in layer:
+                    dep_val = layer[dep_name]
+                if param_name in layer:
+                    return dep_val
+                else:
+                    for l in layer.itervalues():
+                        dep_val = get_dep_value(param_name, dep_name, dep_val, l)
+                        if dep_val:
+                            break
+            elif isinstance(layer, list):
+                for l in layer:
+                    dep_val = get_dep_value(param_name, dep_name, dep_val, l)
+                    if dep_val:
+                        break
+            return None
+        ref =
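For readers skimming the thread, here is a standalone, runnable sketch of the depth-first dependency lookup JJ's patch is aiming for. It is a simplified illustration (it just returns the first match found depth-first, dropping the patch's cur_val bookkeeping), not the code that was eventually merged - pull request #349 took a different approach:

```python
def find_dependency(dep_name, layer):
    """Depth-first search of a nested tool-state structure (dicts and
    lists) for the value of the dependency parameter dep_name.
    Simplified illustration of the quoted patch, not the merged fix."""
    if isinstance(layer, dict):
        # The dependency may sit at this layer...
        if dep_name in layer:
            return layer[dep_name]
        # ...or somewhere deeper, e.g. inside a <conditional> sub-dict.
        for value in layer.values():
            found = find_dependency(dep_name, value)
            if found is not None:
                return found
    elif isinstance(layer, list):
        for item in layer:
            found = find_dependency(dep_name, item)
            if found is not None:
                return found
    return None

# A pared-down version of the other_values dump from the email above:
other_values = {
    'outputFormat': 'vcf',
    'snpDb': {'genomeVersion': 'GRCh37.71', 'genomeSrc': 'cached'},
}
print(find_dependency('genomeVersion', other_values))  # GRCh37.71
```

This finds 'genomeVersion' nested under 'snpDb' even though it is not a top-level key, which is exactly the case the original assertion rejected.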
Re: [galaxy-dev] Verifying test output datatypes, was: Problem with change_format and conditional inputs?
Merged. Thanks again for the input! I will look into the unicode issue and respond on the other thread. -John

On Sat, Mar 15, 2014 at 7:50 AM, Peter Cock wrote:
On Fri, Mar 14, 2014 at 9:04 PM, John Chilton wrote:
On Fri, Mar 14, 2014 at 10:24 AM, Peter Cock wrote:

Thanks John, I suggest making this test framework perform this check by default (the twill and API based frameworks) and seeing what - if anything - breaks as a result on the Test Tool Shed.

Hey Peter, I hope it is okay, but I do not want to make this change to the Twill driven variant of tool tests; I consider that code at end of life - new development would be a waste I think. Running all tools against a modified environment that switched all tests to target the APIs would be nice, but it sounds like there is not really the infrastructure in place for doing that right now. Upon further consideration I am not sure there are really any backward compatibility concerns anyway - or at least no more so than anything else when switching over to the API driven tests. I'll let the pull request sit open for a few days and then merge it as is.

Note that one area of fuzziness is subclasses, e.g. if the tool output was labelled fastqsanger, but the test just said fastq, I would say the test was broken. On the other hand, if the test used a specific datatype like fastqsanger but the tool produced a dataset tagged with a more generic datatype like fastq, I think that is again a real failure.

100% agreed on both points. I believe the implementation proposed in pull request #347 reflects this resolution of the fuzziness. Thanks and have a great weekend, -John

That sounds like a plan :) Hmm. I wonder if it would be trivial to tweak our TravisCI setup to run the functional tests twice, once with the old twill framework and once with the new API based framework? Seems doable (but would up the run time quite a bit).
Thanks, Peter
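The rule Peter and John converge on above can be stated compactly: the ftype declared in a test must match the datatype of the tool output exactly, and a mismatch in either direction (generic test vs. specific output, or vice versa) counts as a failure. A toy sketch of that check (the function name is mine, not Galaxy's):

```python
def ftype_check_passes(declared, produced):
    """True only when the test's declared datatype exactly matches the
    datatype the tool actually produced; subclass relationships in
    either direction are treated as failures, per the thread above."""
    return declared == produced

print(ftype_check_passes("fastqsanger", "fastqsanger"))  # True
print(ftype_check_passes("fastq", "fastqsanger"))        # False: test too generic
print(ftype_check_passes("fastqsanger", "fastq"))        # False: output too generic
```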
Re: [galaxy-dev] BugFix for dynamic_options ParamValueFilter to run IUC SnpEff in a workflow
Nice work John. This fixed issues for running workflows for both SnpEff and XYPlot. Please reject my pull requests #336 and #343 in favor of #349. Thanks, JJ

On 3/18/14, 11:44 AM, John Chilton wrote: [...]
Re: [galaxy-dev] Persistent jobs in cluster queue even after canceling job in galaxy
Erg... I am pretty ignorant about mercurial so I should probably not respond to this, but I will try. It is pretty common practice for the Galaxy team to push bug fixes for the last release to the stable branch of galaxy-central - which is very different from the default branch of galaxy-central, which contains active development. These don't go out to galaxy-dist automatically, to prevent the need to strip truly egregious stuff out of the stable branch that the galaxy-dev news says to target. A quirk of this, however, is that the stable branch of galaxy-central is actually a good deal more stable than the stable branch of galaxy-dist. It is what usegalaxy.org targets, and at least a few other high profile Galaxy maintainers have caught on to this trick as well.

I think you can update (or merge) the latest stable branch by doing something like the following:

hg pull https://bitbucket.org/galaxy/galaxy-central#stable
hg update stable

We should probably do a better job of keeping the stable branch of galaxy-dist up-to-date - but right now we just push out updates at releases and for major security issues, as far as I know. -John

On Fri, Mar 14, 2014 at 2:23 PM, Brian Claywell wrote:
On Fri, Mar 14, 2014 at 9:01 AM, John Chilton wrote:

I believe this problem was fixed by Nate after the latest dist release and pushed to the stable branch of galaxy-central. https://bitbucket.org/galaxy/galaxy-central/commits/1298d3f6aca59825d0eb3d32afd5686c4b1b9294 If you are eager for this bug fix, you can track the latest stable branch of galaxy-central instead of the galaxy-dist tag mentioned in the dev news. Right now it has some other good bug fixes not in the latest release.

Ah, got it, thanks! Is it unfeasible to push bug fixes like those back to galaxy-dist/stable so those of us who would prefer stable to bleeding-edge don't have to cherry-pick commits?
-- Brian Claywell, Systems Analyst/Programmer Fred Hutchinson Cancer Research Center bclay...@fhcrc.org
Re: [galaxy-dev] [CONTENT] Re: Unable to remove old datasets
The cleanup scripts enforce a sort of lifetime for the datasets. The first time they're run, they may mark a dataset as deleted and also reset the update time, and you'll have to wait N days for the next stage of the lifetime. The next time they're run, or if a dataset has already been marked as deleted, the actual file removal happens and purged is set to true (if it wasn't already). You can manually pass in '-d 0' to force removal of datasets recently marked as deleted. The purge scripts do not check 'allow_user_dataset_purge', of course.

On Tue, Mar 18, 2014 at 11:50 AM, Carl Eberhard wrote: [...]
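Carl's staged lifetime can be sketched as a toy model (illustrative only - the real logic lives in the scripts under scripts/cleanup_datasets/, and the field handling there differs in detail):

```python
from datetime import datetime, timedelta

def cleanup_pass(dataset, now, days=10):
    """One pass of a purge-style cleanup over a dataset record (a plain
    dict here). Toy model of the staged lifetime described above."""
    cutoff = now - timedelta(days=days)
    if not dataset["deleted"]:
        # Stage 1: mark deleted and reset the update time; the clock
        # for the N-day wait starts now.
        dataset["deleted"] = True
        dataset["update_time"] = now
    elif not dataset["purged"] and dataset["update_time"] <= cutoff:
        # Stage 2: the wait has elapsed; the file would be removed from
        # disk here, and the record is marked purged.
        dataset["purged"] = True

now = datetime(2014, 3, 18)
ds = {"deleted": False, "purged": False, "update_time": datetime(2014, 3, 1)}
cleanup_pass(ds, now)           # stage 1: marked deleted, update_time reset
cleanup_pass(ds, now)           # too recent: still not purged
cleanup_pass(ds, now, days=0)   # the '-d 0' case: purged immediately
print(ds["deleted"], ds["purged"])  # True True
```

The middle call shows why a freshly deleted dataset survives a second pass: its update time was reset in stage 1, so the N-day window starts over from there.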
Re: [galaxy-dev] GATK 3.1
Hi Thomas, we have some rough plans to do that, as Nicola already mentioned. But first we want to release one final GATK 2.8 wrapper. Help is very much appreciated. Cheers, Bjoern

On 18.03.2014 14:41, Berner, Thomas wrote: [...]
Re: [galaxy-dev] Tool Testing Enhancements
Hi Peter, Thanks for the bug report. It looks like if requests is available to the framework these unicode errors go away (it doesn't have to fall back on my poor attempt to provide a similar interface). https://github.com/jmchilton/pico_galaxy/commit/e7d37f31951d8e729cfdbda0ca9085feb7f2da73 I'll create a Trello card and try to add an egg for this - other developers on the team have likewise said in the past that they would like a requests dependency available, so I doubt there will be objections. -John

P.S. It looks like the ftype changeset has already caught some errors (your sample_seqs tool only produces fasta outputs but some outputs are labelled as sff): https://travis-ci.org/jmchilton/pico_galaxy/jobs/21036295

On Tue, Mar 18, 2014 at 6:46 AM, Peter Cock wrote:
On Fri, Mar 14, 2014 at 8:36 PM, John Chilton wrote:

Hello Tool Developers, Haven't known when to send this out, but I figure since you haven't received any e-mail from me today it might be a good time. tl;dr - Tool functional tests experienced a significant overhaul over the last release and will continue to change over the next couple releases, but in a backward compatible way, so hopefully you will not need to care unless you want to. ...

Thank you John for this detailed report - I knew bits and pieces from prior discussions, but still found this recap very useful.
I'm now taking advantage of the new environment variable to test with both GALAXY_TEST_DEFAULT_INTERACTOR=api (the new framework) and GALAXY_TEST_DEFAULT_INTERACTOR=twill (the old framework) under TravisCI, see: https://github.com/peterjc/galaxy_blast/commit/b9db5c9edc57314c5ab4122bce0b00fa2f9cfb94 https://github.com/peterjc/pico_galaxy/commit/ceed9e0698989b7a617d75d6c483fa28ae61b333 http://blastedbio.blogspot.co.uk/2013/09/using-travis-ci-for-testing-galaxy-tools.html http://lists.bx.psu.edu/pipermail/galaxy-dev/2014-March/018677.html

The BLAST+ tests are fine both ways, but there appears to be a unicode issue with the API based testing of pico_bio (twill is fine): https://travis-ci.org/peterjc/pico_galaxy/builds/21008616 UnicodeDecodeError: 'ascii' codec can't decode byte 0xbe in position 847: ordinal not in range(128)

Thanks, Peter
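The UnicodeDecodeError in Peter's traceback is the classic failure of Python 2's implicit ASCII decoding tripping over a non-ASCII byte. A minimal reproduction (written with an explicit decode so it behaves the same on Python 3; the 0xbe byte is taken from the traceback above):

```python
data = b"\x62\x6c\xbe"  # bytes ending in 0xbe, as in the traceback above
try:
    data.decode("ascii")
except UnicodeDecodeError as err:
    # Mirrors the reported failure: 'ascii' codec can't decode byte 0xbe ...
    print(err)
```

In Python 2, mixing such bytes with unicode strings triggers this decode implicitly, which is presumably what the test framework was doing before falling back to requests.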
Re: [galaxy-dev] Tool Shed install best practise : Precompiled binaries vs local compile
Hi Greg and Peter,

Am 06.03.2014 18:46, schrieb Peter Cock:

Hi Greg, I've retitled the thread, previously about a ToolShed nightly test failure. A brief recap: we're talking about the Galaxy ToolShed XML installation recipes for the NCBI BLAST+ packages and my MIRA4 wrapper in their tool_dependencies.xml files: http://toolshed.g2.bx.psu.edu/view/iuc/package_blast_plus_2_2_29 http://testtoolshed.g2.bx.psu.edu/view/iuc/package_blast_plus_2_2_29 http://testtoolshed.g2.bx.psu.edu/view/peterjc/mira4_assembler These use the pattern of having os/arch specific action tags (which download and install the tool author's precompiled binaries) and a fall back default action, which is to report an error that there are no ready made binaries available for that os/arch combination. Greg is instead advocating the fall back action be to download the source code and do a local compile. My reply is below...

On Thu, Mar 6, 2014 at 5:24 PM, Peter Cock wrote:
On Thu, Mar 6, 2014 at 4:53 PM, Greg Von Kuster wrote:

As we briefly discussed earlier, your mira4 recipe is not currently following best practices. Although you uncovered a problem in the framework which has now been corrected, your recipe's fall back actions tag set should be the recipe for installing mira4 from source ( http://sourceforge.net/projects/mira-assembler/ ) since there are no licensing issues for doing so. This would be a more ideal approach than echoing the error messages. Thanks very much for helping us discover this problem though! Greg Von Kuster

Hi Greg, No problem - I'm good at discovering problems ;) If the download approach failed, it is most likely due to a transient error (e.g. network issues with the download). Here I would much prefer Galaxy aborted and reported this as an error (and did not attempt the default action). Is that what you just fixed? As to best practice for the fall back action, I think that needs a new thread.
Regards, Peter

As to best practice, I do not agree that in cases like this (MIRA4, NCBI BLAST+) where there are provided binaries for the major platforms the fall back should be compiling from source. The NCBI provide BLAST+ binaries for 32 bit and 64 bit Linux and Mac OS X (which I believe covers all the mainstream platforms Galaxy runs on). Similarly, MIRA4 provides binaries for 64 bit Linux and Mac OS X. Note that 32 bit binaries are not provided, but would be very restricted in terms of the datasets they could be used on anyway - and I doubt many of the systems Galaxy runs on these days are 32 bit.

I also think that supporting 32 bit is not really needed, and in the case of a few libs it is really troublesome.

If the os/arch combination is exotic enough that precompiled binaries are not available, then it is likely compilation will be tricky anyway - or not supported for that tool, or Galaxy itself. Essentially I am arguing that where the precompiled tool binaries cover any mainstream system Galaxy might be used on, a local compile fall back is not needed.

Imho, that statement is too general. There might be some binaries that are done properly, but many of them still have some strange runtime dependencies. In these cases we need to have a compile time fallback.

Also, these are both complex tools which are relatively slow to compile, and have quite a large compile time dependency set (e.g. MIRA4 requires at least a quite recent GCC, BOOST, flex, expat, and strongly recommends TCmalloc). Here at least some of the dependencies have been packaged for the ToolShed (probably by Bjoern?) but in the case of MIRA4 and BLAST+ this is still a lot of effort for no practical gain.

I don't think compile time really matters; you only need to compile them once, and I think most of us can wait one hour.
I also feel there is an argument that the Galaxy goal of reproducibility should favour using precompiled binaries if available: a locally compiled binary will generally mean a different compiler version, perhaps with different optimisation flags, and different library versions. It will not necessarily give the same results as the tool author's provided precompiled binary.

Yes, that's a good point. On the other hand we should not forget that binaries are not necessarily usable over many years. As a really bad example take a look at the UCSC tools. You can't run the latest UCSC tools on an old Scientific Linux, because libc is too old. So you are totally lost. I'm not sure how good the MIRA binaries are, but I would like to point out that there are huge differences in how you can produce these binaries.

I'm in favour of having both options available wherever we can and letting the administrator choose the best way to install. Maybe with a default universe_wsgi.ini setting (preferred_toolshed_install = binary). I would not call it 'fallback'; it's really a different installation strategy, with different priorities. (There was/is a trello card for
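For readers not familiar with the recipes under discussion, the os/arch pattern looks roughly like this in a tool_dependencies.xml (tag and attribute names recalled from the Tool Shed schema of that era, and the package name and URLs are placeholders - treat this as a sketch, not a canonical recipe):

```xml
<tool_dependency>
  <package name="example_tool" version="4.0">
    <install version="1.0">
      <actions_group>
        <!-- os/arch specific actions: download the author's prebuilt binaries. -->
        <actions os="linux" architecture="x86_64">
          <action type="download_by_url">http://example.org/example_tool_linux_x86_64.tar.gz</action>
        </actions>
        <actions os="darwin" architecture="x86_64">
          <action type="download_by_url">http://example.org/example_tool_darwin_x86_64.tar.gz</action>
        </actions>
        <!-- Fallback for any other platform: either report an error, as the
             BLAST+/MIRA4 recipes do, or compile from source, as Greg advocates. -->
        <actions>
          <action type="shell_command">echo "No prebuilt binaries for this os/arch"; false</action>
        </actions>
      </actions_group>
    </install>
  </package>
</tool_dependency>
```

The disagreement in this thread is only about what belongs in that final unqualified actions block.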
[galaxy-dev] REFERENCE genome in Bowtie2
Dear Galaxy Representative, I'm trying to use Bowtie2 in Galaxy and I need to select a reference genome. It says "If your genome of interest is not listed, contact the Galaxy team", and that is why I am contacting you. My genome of interest is Maize (corn, Zea mays) and information about it can be found at http://www.maizegdb.org/ Thank you. Emir

-- Emir Islamovic Postdoctoral Research Assistant Plant and Microbial Biology Department 311 Koshland Hall University of California, Berkeley Berkeley, CA 94720-3102 Phone: 510-642-8058 Fax: 510-642-4995 Email: emirislamo...@berkeley.edu emirislamo...@yahoo.com