Re: [galaxy-dev] Nothing being tested on Test and main Tool Shed?
On Thu, Nov 6, 2014 at 11:08 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
Thanks Dave,
The good news is yes, the tests are running again on the Test Tool Shed (although not the main Tool Shed yet), and many of my tools now have successful test results from last night, e.g. my new basic mummer tool, which now has a full set of dependency packages thanks to Bjoern:
https://testtoolshed.g2.bx.psu.edu/view/peterjc/mummer
The bad news is there are many unexpected failures with:
Exception: History in error state.
I'm sure you'll learn more once you look over the logs,
Thank you,
Peter

Hi Dave,
Any progress? All the following seem to have been tested in the last few days on the TestToolShed, but failed with Exception: History in error state.
https://testtoolshed.g2.bx.psu.edu/view/peterjc/blast2go
https://testtoolshed.g2.bx.psu.edu/view/peterjc/blastxml_to_top_descr
https://testtoolshed.g2.bx.psu.edu/view/peterjc/clinod
https://testtoolshed.g2.bx.psu.edu/view/peterjc/fastq_paired_unpaired
https://testtoolshed.g2.bx.psu.edu/view/peterjc/get_orfs_or_cdss
https://testtoolshed.g2.bx.psu.edu/view/peterjc/mira_assembler
https://testtoolshed.g2.bx.psu.edu/view/peterjc/ncbi_blast_plus
https://testtoolshed.g2.bx.psu.edu/view/peterjc/nlstradamus
https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_filter_by_id
https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_primer_clip
https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_rename
https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_select_by_id
That's about half of my TestToolShed repositories - most of the others report their tests passed :)
I am also seeing unexpected problems with packages, e.g.
https://testtoolshed.g2.bx.psu.edu/view/iuc/package_blast_plus_2_2_29
Error getting revision e78bbab7933d of repository package_blast_plus_2_2_29 owned by iuc: An entry for the repository was not found in the database.
Thanks,
Peter
___ Please keep all replies on the list by using reply all in your mail client.
To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/ To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
Re: [galaxy-dev] Public toolshed giving internal server error
On Wed, Nov 19, 2014 at 9:36 AM, Peter Briggs peter.bri...@manchester.ac.uk wrote:
Hello
I'm trying to make a new repository on the public toolshed at https://toolshed.g2.bx.psu.edu/ but I keep getting the internal server error page. I was also unable to log out, or even to see the front page when trying to access it from a different browser. Is anyone else having this problem?
Thanks
best wishes
Peter

I just tried uploading an update on the test tool shed and got:
Internal Server Error
Galaxy was unable to successfully complete your request
...
IOError: [Errno 28] No space left on device
Perhaps a coincidence, but maybe the main ToolShed has the same issue?
Peter
Re: [galaxy-dev] Public toolshed giving internal server error
Thanks - uploading tar-balls to the Test Tool Shed is working again :)
Peter

On Wed, Nov 19, 2014 at 2:34 PM, Nate Coraor n...@bx.psu.edu wrote:
Hi all,
Sorry for the service interruption. The Tool Shed should be back now, and the underlying disk usage problem has been alleviated, so it shouldn't occur again.
--nate
Re: [galaxy-dev] Is anyone using composite datatype uploads?
Hi John, Sam,
I've not done it yet, but was hoping to implement uploading of BLAST databases at some point - mainly for use within the test framework, rather than expecting it to be useful for the end user.
Is the issue here uploading an archive (e.g. .zip or .tar.gz) or offering a way to pick multiple files to be treated together as a composite dataset?
Peter

On Wed, Nov 19, 2014 at 3:32 PM, John Chilton jmchil...@gmail.com wrote:
Well there is at least one person using this functionality - http://dev.list.galaxyproject.org/Problem-to-upload-data-to-Galaxy-when-using-pbed-file-format-td4666000.html.
Just to make this more concrete - Sam has swapped the upload file button to use the new upload widget this release cycle (targeted for December 1st). So barring negative feedback - uploading pbed or velvet report datatypes (or other similar composite datatypes) will no longer be possible via the GUI.
-John

On Thu, Aug 14, 2014 at 2:32 PM, Aysam Guerler aysam.guer...@gmail.com wrote:
Hello everyone,
We are considering disabling the deprecated upload tool form, which is currently accessible through Tool panel > Get Data > Upload file. The new upload feature (icon at the top of the Tool panel) covers all of its functionality except uploading composite datatypes, e.g. Velvet. Please let us know if you are using the composite file upload functionality of the former tool form.
Thanks,
Sam
Re: [galaxy-dev] Nothing being tested on Test and main Tool Shed?
Thanks John,
Fingers crossed we'll get some more detailed logs in a day or two :)
Peter

On Wed, Nov 19, 2014 at 5:48 PM, John Chilton jmchil...@gmail.com wrote:
Hey Peter,
Dave is out this week - so I have tried to fumble around and see if I could make some progress on this. I found some bugs in a recent commit and fixed them - that might help (https://bitbucket.org/galaxy/galaxy-central/commits/b81798f94dc0fd14de1d585ed7e57f820f998fae). I have also enabled more verbose logging that might help with those "History in error state" exceptions (https://bitbucket.org/galaxy/galaxy-central/commits/a799879a82c54c2d1afec6e33d8918479bbf2373) but I am not sure it will propagate through to the tool shed API - we will see I guess. If the install and test framework just picks up the latest central - these fixes will hopefully be reflected in the next run.
-John

On Wed, Nov 19, 2014 at 4:48 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
Hi Dave,
Any progress? All the following seem to have been tested in the last few days on the TestToolShed, but failed with Exception: History in error state.
https://testtoolshed.g2.bx.psu.edu/view/peterjc/blast2go
https://testtoolshed.g2.bx.psu.edu/view/peterjc/blastxml_to_top_descr
https://testtoolshed.g2.bx.psu.edu/view/peterjc/clinod
https://testtoolshed.g2.bx.psu.edu/view/peterjc/fastq_paired_unpaired
https://testtoolshed.g2.bx.psu.edu/view/peterjc/get_orfs_or_cdss
https://testtoolshed.g2.bx.psu.edu/view/peterjc/mira_assembler
https://testtoolshed.g2.bx.psu.edu/view/peterjc/ncbi_blast_plus
https://testtoolshed.g2.bx.psu.edu/view/peterjc/nlstradamus
https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_filter_by_id
https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_primer_clip
https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_rename
https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_select_by_id
That's about half of my TestToolShed repositories - most of the others report their tests passed :)
I am also seeing unexpected problems with packages, e.g.
https://testtoolshed.g2.bx.psu.edu/view/iuc/package_blast_plus_2_2_29
Error getting revision e78bbab7933d of repository package_blast_plus_2_2_29 owned by iuc: An entry for the repository was not found in the database.
Thanks,
Peter
Re: [galaxy-dev] MarkupSafe egg missing? e.args[1].key != e.args[0].key
On Mon, Nov 17, 2014 at 7:23 PM, John Chilton jmchil...@gmail.com wrote:
Ummm... I think it is that the VM started shipping with an incompatible paramiko. [...] Anyway - planemo's TravisCI integration tests Galaxy in a virtualenv and it works fine [...]

That's a useful workaround, but a virtualenv is not viable for existing Galaxy installations - which may also come to have this problem if their system copy of paramiko is updated?
I'm not familiar with paramiko or MarkupSafe, but if there is a conflict it would be good to have a direct fix.
Peter
Re: [galaxy-dev] MarkupSafe egg missing? e.args[1].key != e.args[0].key
On Nov 18, 2014 7:26 AM, John Chilton jmchil...@gmail.com wrote:
I will admit to not actually understanding Galaxy's dependency management but I think virtualenv is exactly the advice people who do understand it give http://dev.list.galaxyproject.org/Local-installation-problem-td4662627.html. It is a widely used tool precisely designed to solve such problems - I think it is the best way to go. I don't know why it would not be appropriate for existing installations - I think it is in fact something of a best practice for existing installations.

Our existing installation is not using a virtual env, and I fear switching to that could be disruptive.

Certainly that error message should be more helpful but I am not sure we should do anything to address this beyond that - do you have a particular idea in mind?

Not show the IndexError exception? :P
Here the user should be told something about a conflict between MarkupSafe and paramiko (assuming this is the real problem).
Peter
Re: [galaxy-dev] MarkupSafe egg missing? e.args[1].key != e.args[0].key
On Tue, Nov 18, 2014 at 1:24 PM, Peter Cock p.j.a.c...@googlemail.com wrote:
On Nov 18, 2014 7:26 AM, John Chilton jmchil...@gmail.com wrote:
I will admit to not actually understanding Galaxy's dependency management but I think virtualenv is exactly the advice people who do understand it give http://dev.list.galaxyproject.org/Local-installation-problem-td4662627.html. It is a widely used tool precisely designed to solve such problems - I think it is the best way to go. I don't know why it would not be appropriate for existing installations - I think it is in fact something of a best practice for existing installations.

Our existing installation is not using a virtual env, and I fear switching to that could be disruptive.

Getting back to TravisCI, using a virtual env wasn't too painful:
https://github.com/peterjc/pico_galaxy/commit/26489a65a9cd60f9d055488d003346eab87941b0
I can now get back to tweaking the tests I was working on :)
Thanks,
Peter
Re: [galaxy-dev] MarkupSafe egg missing? e.args[1].key != e.args[0].key
On Tue, Nov 18, 2014 at 5:51 PM, Nate Coraor n...@bx.psu.edu wrote:
Peter,
Unless you've made modifications to Galaxy that depend on external libraries, switching to a virtualenv for the server itself should be pretty safe. Tools themselves can still run without using the/any virtualenv, if desired.
--nate

OK - that sounds more straightforward than I had feared - but I will cross that bridge as needed ;)
Thanks,
Peter
[galaxy-dev] MarkupSafe egg missing? e.args[1].key != e.args[0].key
Hello all,
There looks to be an egg problem with the latest galaxy-central. Here are excerpts from a failed Galaxy install on my TravisCI test setup:
https://travis-ci.org/peterjc/pico_galaxy/builds/41233860
Here's another project build with the same error:
https://travis-ci.org/peterjc/galaxy_blast/builds/41235899

$ ./run.sh --stop-daemon || true
Initializing config/migrated_tools_conf.xml from migrated_tools_conf.xml.sample
...
Initializing static/welcome.html from welcome.html.sample
Some eggs are out of date, attempting to fetch...
Fetched http://eggs.galaxyproject.org/Mako/Mako-0.4.1-py2.7.egg
Fetched http://eggs.galaxyproject.org/importlib/importlib-1.0.3-py2.7.egg
Fetched http://eggs.galaxyproject.org/pysam/pysam-0.4.2_kanwei_b10f6e722e9a-py2.7-linux-x86_64-ucs4.egg
Fetched http://eggs.galaxyproject.org/ordereddict/ordereddict-1.1-py2.7.egg
Fetched http://eggs.galaxyproject.org/Fabric/Fabric-1.7.0-py2.7.egg
Fetched http://eggs.galaxyproject.org/Babel/Babel-1.3-py2.7.egg
Fetched http://eggs.galaxyproject.org/Whoosh/Whoosh-0.3.18-py2.7.egg
Fetched http://eggs.galaxyproject.org/Parsley/Parsley-1.1-py2.7.egg
Fetched http://eggs.galaxyproject.org/Cheetah/Cheetah-2.2.2-py2.7-linux-x86_64-ucs4.egg
Traceback (most recent call last):
  File "./scripts/fetch_eggs.py", line 46, in <module>
    c.resolve() # Only fetch eggs required by the config
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/lib/galaxy/eggs/__init__.py", line 347, in resolve
    egg.resolve()
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/lib/galaxy/eggs/__init__.py", line 192, in resolve
    if e.args[1].key != e.args[0].key:
IndexError: tuple index out of range
Fetch failed.
No PID file exists in paster.pid

$ python scripts/fetch_eggs.py
Warning: MarkupSafe (a dependent egg of Mako) cannot be fetched
Traceback (most recent call last):
  File "scripts/fetch_eggs.py", line 46, in <module>
    c.resolve() # Only fetch eggs required by the config
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/lib/galaxy/eggs/__init__.py", line 347, in resolve
    egg.resolve()
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/lib/galaxy/eggs/__init__.py", line 192, in resolve
    if e.args[1].key != e.args[0].key:
IndexError: tuple index out of range
The command "python scripts/fetch_eggs.py" failed and exited with 1

Looking at http://eggs.galaxyproject.org/MarkupSafe/ for Python 2.7 and Linux x86_64 there is a ucs4 egg:
MarkupSafe-0.12-py2.7-linux-x86_64-ucs4.egg    09-Jun-2011 03:09    30724
So, why is this failing? Note I am not explicitly installing MarkupSafe (so I do not expect there to be a conflicting version installed).
Also it would seem there is a bug in the resolve method, which assumes that e.args will always have (at least) two entries?
Thanks,
Peter
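The IndexError in the traceback suggests that the comparison on line 192 indexes `e.args` without first checking how many entries it holds. The following is a minimal sketch of a more defensive check, not Galaxy's actual code: `Requirement` and `conflicting_keys_differ` are hypothetical names invented for illustration, and the only assumption made about the real objects is that each one exposes a `.key` attribute.

```python
from collections import namedtuple

# Hypothetical stand-in for the objects carried in the exception's args;
# the only assumption is that each one exposes a `.key` attribute.
Requirement = namedtuple("Requirement", ["key"])


def conflicting_keys_differ(exc):
    """Compare the keys of the first two exception args.

    Returns True or False when two args are present, and None when
    fewer than two are available (instead of raising IndexError).
    """
    args = getattr(exc, "args", ())
    if len(args) < 2:
        return None  # not enough information to compare
    return args[1].key != args[0].key


# Two args: the comparison behaves like the expression in the traceback.
full = Exception(Requirement("markupsafe"), Requirement("mako"))
# One arg: indexing args[1] directly would raise IndexError here.
short = Exception("MarkupSafe cannot be fetched")

print(conflicting_keys_differ(full))
print(conflicting_keys_differ(short))
```

A caller could treat the `None` case as the point at which to print a clearer message about the failed egg, rather than letting the IndexError hide the underlying problem.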
Re: [galaxy-dev] Repeats shown upside down on galaxy-central
Hi Sam,

On Thu, Nov 13, 2014 at 6:25 AM, Aysam Guerler aysam.guer...@gmail.com wrote:
Hey Peter,
I have modified the data selector's appearance by adding a single select field as default option. It resembles the previous functionality now and I think it's an improvement.

Great. I won't be able to try this until next week at the earliest though.

Regarding the length of the select box I think that the tool form overall looks more organized now since almost all the input elements have the same length. However I will discuss this with the others and see what the consensus is.

Perhaps a confounding variable is how wide your screen is? In my screenshots the visual cue that this is a select box is on the extreme right and thus separated from the text it is connected to. I think that makes it a bad GUI design choice (as well as my somehow not liking the visual aesthetic, which is a more personal impression).
Thanks,
Peter
Re: [galaxy-dev] Nothing being tested on Test and main Tool Shed?
Thanks Dave,
The good news is yes, the tests are running again on the Test Tool Shed (although not the main Tool Shed yet), and many of my tools now have successful test results from last night, e.g. my new basic mummer tool, which now has a full set of dependency packages thanks to Bjoern:
https://testtoolshed.g2.bx.psu.edu/view/peterjc/mummer
The bad news is there are many unexpected failures with:
Exception: History in error state.
I'm sure you'll learn more once you look over the logs,
Thank you,
Peter

On Wed, Nov 5, 2014 at 6:21 PM, Dave Bouvier d...@bx.psu.edu wrote:
Peter,
This was due to a number of issues with the testing framework, the last of which appears to have been resolved, and I see tools being tested in the framework log. I'll check again in the morning to see if any of the tools you listed below were still not tested.
--Dave B.

On 11/03/2014 05:51 AM, Peter Cock wrote:
Hello all,
I am currently hoping to review the automated test results for some repositories which I have recently updated, in one case for dependency handling, the other functional changes:
https://testtoolshed.g2.bx.psu.edu/view/peterjc/mummer
https://testtoolshed.g2.bx.psu.edu/view/peterjc/ncbi_blast_plus
These have not yet been tested.
On further investigation of a sample of my other tools, it appears none of them have been tested on the Test Tool Shed since 2014-09-15, e.g.
https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_rename
https://testtoolshed.g2.bx.psu.edu/view/peterjc/sample_seqs
https://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_idxstats
https://testtoolshed.g2.bx.psu.edu/view/peterjc/effectivet3
https://testtoolshed.g2.bx.psu.edu/view/peterjc/clinod
Similarly, some of my tools on the Main Tool Shed appear not to have been tested since 2014-09-21, e.g.
https://toolshed.g2.bx.psu.edu/view/peterjc/seq_rename
https://toolshed.g2.bx.psu.edu/view/peterjc/effectivet3
https://toolshed.g2.bx.psu.edu/view/peterjc/clinod
or 2014-10-27,
https://toolshed.g2.bx.psu.edu/view/peterjc/sample_seqs
https://toolshed.g2.bx.psu.edu/view/peterjc/samtools_idxstats
Is there a known problem with the automated tool testing (previously every second night) on the Tool Sheds? Or have you had to further reduce the testing cycle?
Testing less frequently seems fine, say fortnightly, if this can be supplemented by testing updated tools every night. That would give Tool Authors prompt feedback on their updates, but also catch regressions where changes in Galaxy break a previously working tool.
Regards,
Peter
Re: [galaxy-dev] Repeats shown upside down on galaxy-central
Hi Sam,
I found the old approach (new repeat blocks at bottom) worked fine when adding blocks one by one and completing them as I went (which means once you have filled in the new block, you have scrolled down to the button ready to add another block if needed).
I find the new approach (new blocks inserted at top) visually confusing as I am used to filling in forms from top to bottom. I do concede the approach makes it easy to add several repeats with a few clicks, and then fill them in - but personally that isn't a common thing for me to do, and I do not think this change is worth the confusion.
There are a number of other visual changes on galaxy-central compared to the current release, some of which seem harmless, like the boolean parameters becoming a yes/no toggle in place of a tick box; others are more daunting/scary (e.g. collection related changes to the file picker for automatic batch jobs).
Is there any draft documentation on these changes being prepared to go into the next release notes? This is the sort of thing local Galaxy Admins would appreciate to anticipate local user queries.
Thanks,
Peter

On Thu, Nov 6, 2014 at 12:45 PM, Aysam Guerler aysam.guer...@gmail.com wrote:
Hi Peter,
Yes we changed this on purpose, however it is open to discussion. The advantage is that the user does not have to scroll down after adding a new repeat block. Additionally it enables users to easily add more than one repeat block quickly, since the insert button does not relocate on the screen after adding new repeat blocks.
Thanks,
Sam

On Mon, Nov 3, 2014 at 7:25 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
Hi all,
I'm running galaxy-central as my development server, and noticed what to me is a regression with repeat parameters, e.g.
https://github.com/peterjc/pico_galaxy/blob/master/tools/clc_assembly_cell/clc_mapper.xml

Read group:
[+ insert read group]
* 1: Read Group
* 2: Read Group

which on clicking becomes:

Read group:
[+ insert read group]
* 2: Read Group
* 1: Read Group

This to me is upside down, the current behaviour on galaxy-dist is more natural (and also pluralises the group heading):

Read Groups
* Read Group 1
[Add new Read Group]

which on clicking becomes:

Read Groups
* Read Group 1
* Read Group 2
[Add new Read Group]

Is this a deliberate change? If so, why?
Peter
Re: [galaxy-dev] Galaxy's dependency on old samtools vs tools wrapping later versions?
OK, so this should work then... :)
Thanks Dave,
Peter

On Mon, Nov 3, 2014 at 7:06 PM, Dave Bouvier d...@bx.psu.edu wrote:
Peter,
For the automated indexing of bam files, Galaxy uses the samtools version linked to as default under tool-dependencies/samtools/. This should normally be 0.1.19 or older, due to the not-yet-implemented handling of bam_index_build and other potential regressions that could be uncovered in the future.
--Dave B.

On 11/03/2014 01:52 PM, Peter Cock wrote:
Hello all,
Galaxy currently requires samtools on the $PATH in order to sort and index BAM files automatically, and samtools 0.1.19 works fine. Unfortunately later versions of samtools index have a regression:
https://github.com/samtools/samtools/issues/199
This has caught several people out already, e.g. https://biostar.usegalaxy.org/p/7928/ and https://biostar.usegalaxy.org/p/9335/
While eventually samtools will be fixed, right now this means we can't have samtools 1.1 as the first samtools on the $PATH used by Galaxy.
I am working on a wrapper for samtools bam2fq:
https://github.com/peterjc/pico_galaxy/tree/master/tools/samtools_bam2fq
https://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_bam2fq
The bam2fq command in samtools 0.1.19 has a number of bugs, so I want to target samtools 1.1. However this has complicated my testing since for my BAM input files Galaxy will call samtools index, and if it calls samtools 1.1 this will fail.
I'm not using the tool shed dependencies during development so instead came up with the following hack:
https://github.com/peterjc/picobio/blob/master/sambam/samtools_auto.py
My question is, what is expected to happen with a Tool Shed installed wrapper for samtools 1.1 and Galaxy's attempts to automatically call samtools to index any BAM output file? Would the tool environment put samtools 1.1 on the (local) $PATH which would then break setting the metadata as part of the same job?
Regards,
Peter
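The rule stated in this thread (Galaxy's automatic BAM indexing should see samtools 0.1.19 or older, while a tool wrapper may need 1.1) comes down to a version comparison. Below is a minimal sketch of that comparison only. It is not the linked samtools_auto.py script and not Galaxy's dependency resolution; `parse_version` and `safe_for_auto_index` are hypothetical helper names, and the example version strings are illustrative rather than exact samtools output.

```python
import re

# Newest samtools release the thread reports as safe for Galaxy's
# automatic BAM indexing (Dave's reply: "0.1.19 or older").
LAST_SAFE = (0, 1, 19)


def parse_version(text):
    """Extract a numeric version tuple from text such as '0.1.19-44428cd'."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def safe_for_auto_index(version_text):
    """True if this samtools version is at or below the last safe release."""
    return parse_version(version_text) <= LAST_SAFE


for candidate in ("0.1.19-44428cd", "1.1"):
    print(candidate, safe_for_auto_index(candidate))
```

Comparing tuples of integers avoids the trap of comparing version strings lexically, where "0.1.19" would sort before "0.1.9".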
Re: [galaxy-dev] Galaxy's dependency on old samtools vs tools wrapping later versions?
Fingers crossed - perhaps I jumped the gun uploading this to the main tool shed without seeing the test results on the Test Tool Shed:
https://toolshed.g2.bx.psu.edu/view/peterjc/samtools_bam2fq
https://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_bam2fq
I look forward to the automated test results from the Tool Sheds ...
http://lists.bx.psu.edu/pipermail/galaxy-dev/2014-November/020792.html
Thanks,
Peter

On Tue, Nov 4, 2014 at 8:38 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
OK, so this should work then... :)
Thanks Dave,
Peter

On Mon, Nov 3, 2014 at 7:06 PM, Dave Bouvier d...@bx.psu.edu wrote:
Peter,
For the automated indexing of bam files, Galaxy uses the samtools version linked to as default under tool-dependencies/samtools/. This should normally be 0.1.19 or older, due to the not-yet-implemented handling of bam_index_build and other potential regressions that could be uncovered in the future.
--Dave B.

On 11/03/2014 01:52 PM, Peter Cock wrote:
Hello all,
Galaxy currently requires samtools on the $PATH in order to sort and index BAM files automatically, and samtools 0.1.19 works fine. Unfortunately later versions of samtools index have a regression:
https://github.com/samtools/samtools/issues/199
This has caught several people out already, e.g. https://biostar.usegalaxy.org/p/7928/ and https://biostar.usegalaxy.org/p/9335/
While eventually samtools will be fixed, right now this means we can't have samtools 1.1 as the first samtools on the $PATH used by Galaxy.
I am working on a wrapper for samtools bam2fq:
https://github.com/peterjc/pico_galaxy/tree/master/tools/samtools_bam2fq
https://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_bam2fq
The bam2fq command in samtools 0.1.19 has a number of bugs, so I want to target samtools 1.1. However this has complicated my testing since for my BAM input files Galaxy will call samtools index, and if it calls samtools 1.1 this will fail.
I'm not using the tool shed dependencies during development so instead came up with the following hack:
https://github.com/peterjc/picobio/blob/master/sambam/samtools_auto.py
My question is, what is expected to happen with a Tool Shed installed wrapper for samtools 1.1 and Galaxy's attempts to automatically call samtools to index any BAM output file? Would the tool environment put samtools 1.1 on the (local) $PATH which would then break setting the metadata as part of the same job?
Regards,
Peter
[galaxy-dev] Can existing SAM/BAM filter tools give me mapped/unmapped pairs?
Hi all,
I'm looking for a little advice on the pre-existing SAM/BAM filtering tools already in the Galaxy Tool Shed (to avoid reinventing the wheel).
As I mentioned on another thread, I'm working on a wrapper for the samtools bam2fq command (targeting samtools 1.1 which fixed some bugs in this tool and added new functionality compared to samtools 0.1.19), see:
https://github.com/peterjc/pico_galaxy/tree/master/tools/samtools_bam2fq
https://toolshed.g2.bx.psu.edu/view/peterjc/samtools_bam2fq
https://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_bam2fq
One of my motivating use cases is a workflow like this:
1. Upload paired end FASTQ files.
2. Map them against a known contaminant genome giving a BAM file (note I need the mapper to report unmapped reads in the output).
3. Filter the BAM to get unmapped reads, plus reads whose partner is unmapped (conversely, remove reads where both partners are mapped).
4. Convert the filtered BAM back into FASTQ (with samtools bam2fq).
5. Proceed with analysis (e.g. de novo assembly).
Assuming I have understood samtools view, this filtering step has to be multiple parts.
This would get the unmapped reads:
$ samtools view -f 0x4 ...
This would get reads with an unmapped partner:
$ samtools view -f 0x8 ...
However this would only get unmapped reads with an unmapped partner (0x4 and 0x8 combined is 0xC, i.e. decimal 12):
$ samtools view -f 0xC ...
i.e. samtools view allows logical AND, not logical OR, when combining flag filters.
So, I believe using samtools directly, a two stage filter is needed followed by a merge (and sort), taking care not to duplicate reads, perhaps:
$ samtools view -f 4 ... > unmapped.bam
$ samtools view -f 8 -F 4 ... > mapped_with_partner_unmapped.bam
$ samtools merge unmapped.bam mapped_with_partner_unmapped.bam ...
That could be repeated within Galaxy but is surprisingly complicated with multiple steps in the history - so I do not want to go that route. Have I overlooked a simple ToolShed solution using samtools?
As far as I could tell, the only other option on the current Tool Shed is the Sambamba Filter tool (using unmapped or mate_is_unmapped), which has a very capable looking filter system:
https://toolshed.g2.bx.psu.edu/view/lomereiter/sambamba_filter
@Artem - have you explored updating your tool_dependencies.xml to download your pre-compiled binaries by default? That would make deployment far easier, since D compilers are still rare, and would mean we can see the test results on the Tool Shed :)
Please ask if you'd like advice on Tool Shed packaging.
Thanks,
Peter
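The AND versus OR behaviour described in this message can be checked with plain bit arithmetic. The sketch below models the `-f` (all listed bits required) and `-F` (no listed bit allowed) semantics of samtools view using the SAM flag bits 0x4 (read unmapped) and 0x8 (mate unmapped). The function names are hypothetical and it does not read BAM files; it only shows that the two-stage filter covers exactly the reads wanted in step 3, without duplicates.

```python
UNMAPPED = 0x4       # SAM flag bit: the read itself is unmapped
MATE_UNMAPPED = 0x8  # SAM flag bit: the read's partner is unmapped


def passes_f(flag, required):
    """Model 'samtools view -f': every bit in `required` must be set."""
    return (flag & required) == required


def passes_F(flag, excluded):
    """Model 'samtools view -F': no bit in `excluded` may be set."""
    return (flag & excluded) == 0


def keep_read(flag):
    """The filter wanted in step 3: read unmapped OR partner unmapped."""
    return bool(flag & (UNMAPPED | MATE_UNMAPPED))


def two_stage(flag):
    """Return which of the two samtools view passes would emit this read."""
    stage1 = passes_f(flag, UNMAPPED)                                      # -f 4
    stage2 = passes_f(flag, MATE_UNMAPPED) and passes_F(flag, UNMAPPED)    # -f 8 -F 4
    return stage1, stage2


# Check every possible 12-bit flag value: the union of the two stages
# matches the OR filter, and no read is emitted by both stages.
for flag in range(4096):
    stage1, stage2 = two_stage(flag)
    assert (stage1 or stage2) == keep_read(flag)
    assert not (stage1 and stage2)

# A single '-f 0xC' pass requires BOTH bits, so it misses flag 73
# (paired, mapped, mate unmapped, first in pair).
print(passes_f(73, 0xC), keep_read(73))
```

The `-F 4` on the second pass is what prevents pairs where both reads are unmapped from being written twice before the merge.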
Re: [galaxy-dev] Can existing SAM/BAM filter tools give me mapped/unmapped pairs?
On Tue, Nov 4, 2014 at 2:44 PM, Peter Cock p.j.a.c...@googlemail.com wrote:

Hi all, I'm looking for a little advice on the pre-existing SAM/BAM filtering tools already in the Galaxy Tool Shed (to avoid reinventing the wheel). As I mentioned on another thread, I'm working on a wrapper for the samtools bam2fq command (targeting samtools 1.1 which fixed some bugs in this tool and added new functionality compared to samtools 0.1.19), see:

https://github.com/peterjc/pico_galaxy/tree/master/tools/samtools_bam2fq
https://toolshed.g2.bx.psu.edu/view/peterjc/samtools_bam2fq
https://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_bam2fq

Going off topic, but I just hit a problem here:

https://github.com/samtools/samtools/issues/313

Depending on whether the reads have a QUAL value or not, samtools bam2fq will produce either FASTA or FASTQ output, and will happily give a mixture in one file. I know Heng Li has a parser that will take this kind of input, but Galaxy likes to have well defined file formats. I may have to fix samtools, perhaps by adding a strict FASTQ output mode?

Peter
[galaxy-dev] Nothing being tested on Test and main Tool Shed?
Hello all,

I am currently hoping to review the automated test results for some repositories which I have recently updated, in one case for dependency handling, in the other for functional changes:

https://testtoolshed.g2.bx.psu.edu/view/peterjc/mummer
https://testtoolshed.g2.bx.psu.edu/view/peterjc/ncbi_blast_plus

These have not yet been tested. On further investigation of a sample of my other tools, it appears none of them have been tested on the Test Tool Shed since 2014-09-15, e.g.

https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_rename
https://testtoolshed.g2.bx.psu.edu/view/peterjc/sample_seqs
https://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_idxstats
https://testtoolshed.g2.bx.psu.edu/view/peterjc/effectivet3
https://testtoolshed.g2.bx.psu.edu/view/peterjc/clinod

Similarly, some of my tools on the Main Tool Shed appear not to have been tested since 2014-09-21, e.g.

https://toolshed.g2.bx.psu.edu/view/peterjc/seq_rename
https://toolshed.g2.bx.psu.edu/view/peterjc/effectivet3
https://toolshed.g2.bx.psu.edu/view/peterjc/clinod

or 2014-10-27,

https://toolshed.g2.bx.psu.edu/view/peterjc/sample_seqs
https://toolshed.g2.bx.psu.edu/view/peterjc/samtools_idxstats

Is there a known problem with the automated tool testing (previously every second night) on the Tool Sheds? Or have you had to further reduce the testing cycle?

Testing less frequently seems fine, say fortnightly, if this can be supplemented by testing updated tools every night. That would give Tool Authors prompt feedback on their updates, but also catch regressions where changes in Galaxy break a previously working tool.

Regards,

Peter
[galaxy-dev] Repeats shown upside down on galaxy-central
Hi all, I'm running galaxy-central as my development server, and noticed what to me is a regression with repeat parameters, e.g. https://github.com/peterjc/pico_galaxy/blob/master/tools/clc_assembly_cell/clc_mapper.xml Read group: [+ insert read group] * 1: Read Group * 2: Read Group which on clicking becomes: Read group: [+ insert read group] * 2: Read Group * 1: Read Group This to me is upside down, the current behaviour on galaxy-dist is more natural (and also pluralises the group heading): Read Groups * Read Group 1 [Add new Read Group] which on clicking becomes: Read Groups * Read Group 1 * Read Group 2 [Add new Read Group] Is this a deliberate change? If so, why? Peter ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/ To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
[galaxy-dev] Galaxy's dependency on old samtools vs tools wrapping later versions?
Hello all,

Galaxy currently requires samtools on the $PATH in order to sort and index BAM files automatically, and samtools 0.1.19 works fine. Unfortunately later versions of samtools index have a regression:

https://github.com/samtools/samtools/issues/199

This has caught several people out already, e.g. https://biostar.usegalaxy.org/p/7928/ and https://biostar.usegalaxy.org/p/9335/

While eventually samtools will be fixed, right now this means we can't have samtools 1.1 as the first samtools on the $PATH used by Galaxy.

I am working on a wrapper for samtools bam2fq:

https://github.com/peterjc/pico_galaxy/tree/master/tools/samtools_bam2fq
https://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_bam2fq

The bam2fq command in samtools 0.1.19 has a number of bugs, so I want to target samtools 1.1. However this has complicated my testing, since for my BAM input files Galaxy will call samtools index, and if it calls samtools 1.1 this will fail. I'm not using the tool shed dependencies during development, so instead came up with the following hack:

https://github.com/peterjc/picobio/blob/master/sambam/samtools_auto.py

My question is, what is expected to happen with a Tool Shed installed wrapper for samtools 1.1 and Galaxy's attempts to automatically call samtools to index any BAM output file? Would the tool environment put samtools 1.1 on the (local) $PATH, which would then break setting the metadata as part of the same job?

Regards,

Peter
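
One way to picture a workaround of this kind is a small dispatcher placed first on the $PATH that routes each samtools subcommand to a suitable binary. The Python below is only a sketch under stated assumptions: it is not the linked samtools_auto.py script, and the install paths and the choice to route only `index` to the old release are hypothetical.

```python
# Hypothetical install locations; adjust for the local system.
LEGACY_SAMTOOLS = "/opt/samtools-0.1.19/samtools"
MODERN_SAMTOOLS = "/opt/samtools-1.1/samtools"

# Subcommands sent to the old binary (an assumption: just 'index',
# the call Galaxy makes when setting BAM metadata).
LEGACY_SUBCOMMANDS = {"index"}


def choose_samtools(args):
    """Return the binary path to use for the given samtools arguments."""
    subcommand = args[0] if args else ""
    if subcommand in LEGACY_SUBCOMMANDS:
        return LEGACY_SAMTOOLS
    return MODERN_SAMTOOLS


def build_command(args):
    """Return the full argument vector, binary first."""
    return [choose_samtools(args)] + list(args)


print(build_command(["index", "example.bam"]))
print(build_command(["bam2fq", "example.bam"]))
```

A real wrapper would then pass the result of `build_command(sys.argv[1:])` to `subprocess.call` and exit with its return code.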
Re: [galaxy-dev] Test failure, JSONDecodeError: Unpaired high surrogate
I have solved this by commenting out the apparently harmless test:

https://github.com/peterjc/pico_galaxy/commit/f3d4261846566a86f9c85a158fb95877ca8bc7c5

Peter

On Wed, Oct 29, 2014 at 5:39 PM, Peter Cock p.j.a.c...@googlemail.com wrote:

Hi all, I'm getting the following exception in a failing unit test:

https://travis-ci.org/peterjc/pico_galaxy/builds/39398677

Testing this tool (where two of the three near identical tests passed):

https://github.com/peterjc/pico_galaxy/blob/dd03346710e6a46cb6ec9dda1eed23d5fd301d03/tools/mummer/mummer.xml

```
Traceback (most recent call last):
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 116, in test_tool
    self.do_it( td )
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 35, in do_it
    self._verify_outputs( testdef, test_history, jobs, shed_tool_id, data_list, galaxy_interactor )
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 75, in _verify_outputs
    galaxy_interactor.verify_output( history, jobs, output_data, output_testdef=output_testdef, shed_tool_id=shed_tool_id, maxseconds=maxseconds )
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/interactor.py", line 89, in verify_output
    self._verify_metadata( history_id, hid, attributes )
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/interactor.py", line 102, in _verify_metadata
    dataset = self._get( "histories/%s/contents/%s" % ( history_id, hid ) ).json()
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/eggs/requests-2.2.1-py2.7.egg/requests/models.py", line 740, in json
    return json.loads(self.content.decode(encoding), **kwargs)
  File "/usr/lib/python2.7/dist-packages/simplejson/__init__.py", line 413, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/dist-packages/simplejson/decoder.py", line 402, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/dist-packages/simplejson/decoder.py", line 418, in raw_decode
    obj, end = self.scan_once(s, idx)
JSONDecodeError: Unpaired high surrogate: line 1 column 785 (char 785)
```

Probably relevant:

- https://github.com/simplejson/simplejson/issues/62
- http://bugs.python.org/issue11489

Any thoughts? What does Galaxy write to these job-associated JSON metadata files?

Thanks,

Peter
[galaxy-dev] Editing admin rights on an (empty) ToolShed repo
Hi all, It seems the ToolShed now uses roles for granting admin rights... but still has the old Grant authority to make changes feature? I just hit a possible glitch here - I wanted to create a new repo under the iuc user, edit the admin settings, then log in as my normal personal account and upload the first version of the tool. So: 1. Log into the Tool Shed as the iuc user. 2. Created https://toolshed.g2.bx.psu.edu/view/iuc/package_blast_plus_2_2_30 3. Attempted to add other administrators, e.g. IUC group or myself, but the top right menu only offered upload, and the old panel to do this was also missing: Grant authority to make changes If I do the first upload as iuc, then the menu changes to include Manage Repository Administrators, plus the panel on the main page appears Grant authority to make changes (which is what we used to use). Is this a transition stage, or are the change rights a subset of the admin role? Peter ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/ To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
[galaxy-dev] Test failure, JSONDecodeError: Unpaired high surrogate
Hi all,

I'm getting the following exception in a failing unit test:

https://travis-ci.org/peterjc/pico_galaxy/builds/39398677

Testing this tool (where two of the three near identical tests passed):

https://github.com/peterjc/pico_galaxy/blob/dd03346710e6a46cb6ec9dda1eed23d5fd301d03/tools/mummer/mummer.xml

```
Traceback (most recent call last):
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 116, in test_tool
    self.do_it( td )
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 35, in do_it
    self._verify_outputs( testdef, test_history, jobs, shed_tool_id, data_list, galaxy_interactor )
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 75, in _verify_outputs
    galaxy_interactor.verify_output( history, jobs, output_data, output_testdef=output_testdef, shed_tool_id=shed_tool_id, maxseconds=maxseconds )
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/interactor.py", line 89, in verify_output
    self._verify_metadata( history_id, hid, attributes )
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/interactor.py", line 102, in _verify_metadata
    dataset = self._get( "histories/%s/contents/%s" % ( history_id, hid ) ).json()
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/eggs/requests-2.2.1-py2.7.egg/requests/models.py", line 740, in json
    return json.loads(self.content.decode(encoding), **kwargs)
  File "/usr/lib/python2.7/dist-packages/simplejson/__init__.py", line 413, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/dist-packages/simplejson/decoder.py", line 402, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/dist-packages/simplejson/decoder.py", line 418, in raw_decode
    obj, end = self.scan_once(s, idx)
JSONDecodeError: Unpaired high surrogate: line 1 column 785 (char 785)
```

Probably relevant:

- https://github.com/simplejson/simplejson/issues/62
- http://bugs.python.org/issue11489

Any thoughts? What does Galaxy write to these job-associated JSON metadata files?

Thanks,

Peter
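
To track down which part of a JSON response triggers an error like this, it can help to scan the raw text for a `\uD800`-`\uDBFF` escape that is not immediately followed by a `\uDC00`-`\uDFFF` escape. The Python below is a diagnostic sketch only: the function name is made up, it is not how simplejson itself detects the problem, and it looks only at `\uXXXX` escapes rather than raw surrogate characters.

```python
import re

# Matches one JSON escape at a time: either \uXXXX or a backslash plus one
# other character. Consuming escapes in order means an escaped backslash
# (two backslashes) followed by 'ud83d' is not mistaken for a \u escape.
_ESCAPE = re.compile(r"\\(u[0-9a-fA-F]{4}|.)", re.DOTALL)


def find_unpaired_high_surrogates(json_text):
    """Return offsets of high-surrogate escapes lacking a low-surrogate partner."""
    offsets = []
    pending = None  # (start, end) of a high surrogate awaiting its partner
    for match in _ESCAPE.finditer(json_text):
        body = match.group(1)
        code = int(body[1:], 16) if len(body) == 5 else None
        is_low = code is not None and 0xDC00 <= code <= 0xDFFF
        if pending is not None:
            # Paired only if a low surrogate escape follows immediately.
            if not (is_low and match.start() == pending[1]):
                offsets.append(pending[0])
            pending = None
        if code is not None and 0xD800 <= code <= 0xDBFF:
            pending = (match.start(), match.end())
    if pending is not None:
        offsets.append(pending[0])
    return offsets


print(find_unpaired_high_surrogates(r'{"a": "\ud83d"}'))
print(find_unpaired_high_surrogates(r'{"a": "\ud83d\ude00"}'))
```

The offsets returned count characters from the start of the text, so they can be compared against the character position reported in the exception message.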
[galaxy-dev] Role of suite_config.xml in current Tool Shed?
Hi all, I have a suite_config.xml file in one of my Tool Shed packages, but I am unclear if this is still used, or simply a legacy from the old pre-hg-based Tool Shed? e.g. https://github.com/peterjc/pico_galaxy/blob/master/tools/protein_analysis/suite_config.xml My understanding was this allowed the tool author some control over the appearance/order that the tools will be shown in the Galaxy left hand pane. Thanks, Peter ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/ To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
Re: [galaxy-dev] Blast+ Wrapper: blastdbcmd: range parameter
Thanks, https://github.com/biopython/biopython/pull/385

You can't (yet) do this via the BLAST wrappers. You would have to pull out the full sequences using the blastdbcmd wrapper, then edit them with another Galaxy tool. Or work directly from the FASTA file if you have it.

Peter

On Wed, Oct 22, 2014 at 6:57 AM, Matthias Enders m.end...@german-seed-alliance.de wrote:

Hi Peter, I added a new Issue, hope everything is correct. Kind regards, Matthias Enders

-Original Message-
From: Peter Cock [mailto:p.j.a.c...@googlemail.com]
Sent: Tuesday, October 21, 2014 11:25 PM
To: Matthias Enders
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] Blast+ Wrapper: blastdbcmd: range parameter

Hi Matthias, Can you file an issue about adding this here please?

https://github.com/peterjc/galaxy_blast

Thanks! Peter

On Tue, Oct 21, 2014 at 10:36 AM, Matthias Enders m.end...@german-seed-alliance.de wrote:

Hello all, I use the ToolShed NCBI Blast+ Wrappers (https://toolshed.g2.bx.psu.edu/repository?repository_id=1d92ebdf7e8d466c) and I tried to retrieve sequence information from databases. The blastdbcmd comes with the feature to extract a given range of the sequence:

-Range string Range of sequence to extract (Format: start-stop)

Is this parameter / functionality also part of the wrapper, and how can I use it?

Thanks, Matthias
Re: [galaxy-dev] Blast+ Wrapper: blastdbcmd: range parameter
Hi Matthias,

Can you file an issue about adding this here please?

https://github.com/peterjc/galaxy_blast

Thanks!

Peter

On Tue, Oct 21, 2014 at 10:36 AM, Matthias Enders m.end...@german-seed-alliance.de wrote:

Hello all, I use the ToolShed NCBI Blast+ Wrappers (https://toolshed.g2.bx.psu.edu/repository?repository_id=1d92ebdf7e8d466c) and I tried to retrieve sequence information from databases. The blastdbcmd comes with the feature to extract a given range of the sequence:

-Range string Range of sequence to extract (Format: start-stop)

Is this parameter / functionality also part of the wrapper, and how can I use it?

Thanks, Matthias
Re: [galaxy-dev] question about GALAXY_SLOTS
On Thu, Oct 16, 2014 at 11:05 PM, Wolfgang Maier wolfgang.ma...@biologie.uni-freiburg.de wrote:

Hi, this is just to make sure: the GALAXY_SLOTS environment variable set by Galaxy when running tools will always be a number >= 1, with 1 being the default if nothing else is configured in the job runner settings? Correct? Thanks, Wolfgang

Hi Wolfgang,

I believe so, however it is possible it might be unset in a corner case (please report this as a bug if you see it happen), or a tool could change the value.

You can use the following bash syntax to set your own default in the tool's command template, e.g.

-num_threads \${GALAXY_SLOTS:-8}

Note the colon minus is special bash syntax: here the default value is 8 (not minus 8) if $GALAXY_SLOTS is unset or empty. Also note that in the command XML you must escape the dollar sign.

Peter
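
For a tool whose wrapper script is written in Python, the same fallback can be done inside the script rather than in the command template. This is a minimal sketch, assuming only that Galaxy exports GALAXY_SLOTS into the job environment as described above; the function name and the default of 8 are illustrative.

```python
import os


def galaxy_slots(default=8, environ=None):
    """Return the thread count from GALAXY_SLOTS, or a default.

    Like bash's ${GALAXY_SLOTS:-8}, the default applies both when the
    variable is unset and when it is set to an empty string.
    """
    env = os.environ if environ is None else environ
    raw = env.get("GALAXY_SLOTS", "")
    if not raw:
        return default
    return int(raw)


print(galaxy_slots(environ={}))
print(galaxy_slots(environ={"GALAXY_SLOTS": "4"}))
```

Passing the environment in as an argument keeps the function easy to test without touching the real process environment.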
Re: [galaxy-dev] Is the new tool repositories summary in the monthly newsletter useful?
On Wed, Oct 8, 2014 at 12:49 AM, Dave Clements cleme...@galaxyproject.org wrote:

Hi All, The October Galaxy newsletter went out a week ago. Buried at the bottom is this:

36 new ToolShed repos -- https://wiki.galaxyproject.org/GalaxyUpdates/2014_10#ToolShed_Contributions

which lists repositories that have been published in the Galaxy Project ToolShed in the previous month. I have two questions about this:

1. How useful is this summary? Compiling it is a manual process and it's kind of mind-numbing. Most months it takes around 2 hours (I think).

I find it moderately useful, so if most Galaxy Admins think the same, it probably is overall a good time investment.

2. If we keep the summary, should we put it in the Dev News Briefs instead? I'm kinda thinking this summary is a better match for the Dev News Briefs (every release) than it is for the general newsletter (every month).

I would suggest both (easy if it is just a link, a tiny bit of copy and paste if not), but that wasn't an option on the Google form.

Peter
[galaxy-dev] ToolShed tool preview broken (TestToolShed too)
Hi all,

From the new tools information Dave compiled for the last Galaxy Update

https://wiki.galaxyproject.org/GalaxyUpdates/2014_10#ToolShed_Contributions

I had a look at galaxyp's filter_by_fasta_ids: Extract sequences from a FASTA file based on a list of IDs tool:

https://toolshed.g2.bx.psu.edu/view/galaxyp/filter_by_fasta_ids

I wanted to see how it compared to my own similar tools (which handle FASTA, FASTQ, SFF and could cover more - they replaced my older single format filter tools):

https://toolshed.g2.bx.psu.edu/view/peterjc/seq_filter_by_id
https://toolshed.g2.bx.psu.edu/view/peterjc/seq_select_by_id

Now for the bug report: clicking on the button (under valid tools) which would normally give a preview of the tool form is failing, giving just Internal Server Error. I have tried a random selection of other tools and this seems to be universal. Moreover, the TestToolShed also seems to have the same problem.

Regards,

Peter
Re: [galaxy-dev] testtoolshed internal server error
On Wed, Oct 8, 2014 at 11:20 AM, Stef van Lieshout stefvanliesh...@fastmail.fm wrote: Anyone else getting this when trying to upload to a testtoolshed repos? I'm using the upload files to repository function in repository actions and get a blank page with internal server error.. Worked fine yesterday. Ciao, Stef There's a chance it is the same root problem as this issue which I hit a couple of hours ago (again internal server error): http://lists.bx.psu.edu/pipermail/galaxy-dev/2014-October/020614.html Peter ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/ To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
Re: [galaxy-dev] testtoolshed internal server error
OK good - my issue with the ToolShed works now too :)

On Wed, Oct 8, 2014 at 11:44 AM, Stef van Lieshout stefvanliesh...@fastmail.fm wrote:

Ok, works for me again. Just a little hiccup I guess... Stef

- Original message -
From: Peter Cock p.j.a.c...@googlemail.com
To: Stef van Lieshout stefvanliesh...@fastmail.fm
Cc: Galaxy Dev galaxy-...@bx.psu.edu
Subject: Re: [galaxy-dev] testtoolshed internal server error
Date: Wed, 8 Oct 2014 11:33:36 +0100

On Wed, Oct 8, 2014 at 11:20 AM, Stef van Lieshout stefvanliesh...@fastmail.fm wrote:

Anyone else getting this when trying to upload to a testtoolshed repos? I'm using the upload files to repository function in repository actions and get a blank page with internal server error.. Worked fine yesterday. Ciao, Stef

There's a chance it is the same root problem as this issue which I hit a couple of hours ago (again internal server error):

http://lists.bx.psu.edu/pipermail/galaxy-dev/2014-October/020614.html

Peter
Re: [galaxy-dev] ToolShed tool preview broken (TestToolShed too)
On Wed, Oct 8, 2014 at 9:22 AM, Peter Cock p.j.a.c...@googlemail.com wrote: Hi all, From the new tools information Dave Compiled for the last Galaxy Update https://wiki.galaxyproject.org/GalaxyUpdates/2014_10#ToolShed_Contributions I had a look at galaxyp's filter_by_fasta_ids: Extract sequences from a FASTA file based on a list of IDs tool: https://toolshed.g2.bx.psu.edu/view/galaxyp/filter_by_fasta_ids I wanted to see how it compared to my own similar tools (which handle FASTA, FASTQ, SFF and could cover more - they replaced my older single format filter tools): https://toolshed.g2.bx.psu.edu/view/peterjc/seq_filter_by_id https://toolshed.g2.bx.psu.edu/view/peterjc/seq_select_by_id Now for the bug report, clicking on the button (under valid tools) which would normally give a preview of the tool form is failing - giving just Internal Server Error. I have tried a random selection of other tools and this seems to be universal - moreover the TestToolShed also seems to have the same problem. Regards, Peter This is working again now. See also another possibly related Internal Server Error on upload which is also working again now: http://lists.bx.psu.edu/pipermail/galaxy-dev/2014-October/020619.html Peter ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/ To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
Re: [galaxy-dev] clustalomega from toolshed installation issue
On Tue, Oct 7, 2014 at 1:47 AM, Isabelle Phan isabelle.p...@seattlebiomed.org wrote: Hello, I installed clustalomega from the galaxy main toolshed using the admin interface of our local galaxy install. When I run the tool, I get this message: Dataset generation errors Dataset 55: co_alignment.fasta Tool execution generated the following error message: Error invoking command: clustalo --force --threads=1 --maxnumseq=30 --maxseqlen=15000 -o /opt/galaxy-dist/database/files/000/dataset_158.dat -l /opt/galaxy-dist/database/files/000/dataset_159.dat -v -i /opt/galaxy-dist/database/files/000/dataset_153.dat [Errno 2] No such file or directory I've tried reloading my input file, reloading the tool's configuration, resetting the metadata on the clustalomega repository, deleting the tool completely and re-installing. No luck. If I try to rerun the job again, I get non-sensical errors that seem to originate from other tools, as if the database was completely garbled. Each of those tools work fine. Only clustalomega behaves oddly. I have no access to our galaxy server, so any solution would have to be implemented from the admin interface. thanks for any hints, Isabelle Hi Isabelle, I am guessing you used this ToolShed repository: https://toolshed.g2.bx.psu.edu/view/clustalomega/clustalomega As far as I can see, this does not automatically install the clustalo binary, which would explain [Errno 2] No such file or directory. The (incomplete and out of date) README file suggests you are expected to manually compile it from the bundled copy of the Clustal Omega source code. If the only access you have to modify the server is via the Galaxy admin interface, you will not be able to fix this. As it stands right now, this ToolShed repository would not get a gold star approval rating :( Peter ___ Please keep all replies on the list by using reply all in your mail client. 
Re: [galaxy-dev] datatype directory
On Tue, Sep 30, 2014 at 4:34 PM, David Hoover hoove...@helix.nih.gov wrote: Why isn't there a datatype for a directory of files? This seems like such a simple thing. If an executable generates or expects a directory as its input or output, why must a fancy complicated composite datatype be created to handle this? David Hoover A directory of files is too broad to be of use in itself - in the same way that defining tools to take or produce a generic data file is unhelpful. If you had a directory of files all the same format, then try to use the Galaxy collections feature (e.g. a collection of FASTA files). If the directory has some structure then the composite datatype is probably most suitable, e.g. an HTML file with a collection of image files; or a BLAST database made up of several binary files. Peter ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/ To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
Re: [galaxy-dev] Set a new metadata attribute
On Fri, Sep 26, 2014 at 3:01 PM, Nikos Sidiropoulos nikos.sid...@gmail.com wrote:

Hi all, In a tool that I am writing I want to pass an input parameter value (string) into the output file's metadata. One of the tool parameters is a barcode signature, 'NNWTGXN' for example. I want that attribute to be stored somehow in the output file so that it can be read by a subsequent tool without the user having to set that parameter again. The files I'll be working with are in FASTQ, BAM and tabular format. Is it possible? Bests, Nikos

Your code can write the value directly into an output file (e.g. one of the SAM/BAM headers might work), but I don't think there is anything suitable within Galaxy for re-exporting the parameter value as an input parameter for a future tool.

However, at the workflow level you can set variables - might that be a way forward?

https://wiki.galaxyproject.org/Learn/AdvancedWorkflow/VariablesEdit

Peter
Re: [galaxy-dev] custom datatypes
Hi Calvin,

The extension is really the Galaxy datatype name, so put quikrdb here. The actual filename on disk will be *.dat once loaded into Galaxy.

More examples, e.g. https://github.com/peterjc/galaxy_blast/blob/master/datatypes/blast_datatypes/datatypes_conf.xml

Peter

On Fri, Sep 12, 2014 at 9:40 PM, Calvin Morrison mutanttur...@gmail.com wrote:

Hi, I just want a simple data type for my 'custom' datatype (it's just a trained matrix in a specific format generated by my database building tool), so that my tools which use this database will only see ones with that format in the dropdown select list.

In my datatypes.xml I have this:

<datatype extension="gz" type="galaxy.datatypes.quikrdb" mimetype="application/octet-stream" display_in_upload="true" description="A database trained with quikr." />

then in my tool xml files I have

<outputs> <data name="output" format="quikrdb" /> </outputs>

which seems to be fine, but my tools which use it still show me all my data, not just trained db's

<param name="dbname" type="data" format="quikrdb" label="custom trained database"/>

Am I doing something wrong?

Calvin Morrison
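
Since the extension attribute is the name that a tool's format attributes refer to, one quick sanity check is to compare the two sets of names. The Python below is a rough sketch using only the standard library: the XML snippets are cut-down, hypothetical stand-ins for a real datatypes_conf.xml and tool file, the tool id and the type value are assumptions, and Galaxy's own loading logic is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, cut-down datatype registration (type value is an assumption).
DATATYPES_CONF = """
<datatypes>
  <registration>
    <datatype extension="quikrdb" type="galaxy.datatypes.binary:Binary"
              mimetype="application/octet-stream" subclass="True"/>
  </registration>
</datatypes>
"""

# Hypothetical, cut-down tool definition using that datatype name.
TOOL_XML = """
<tool id="quikr_example" name="Example">
  <inputs>
    <param name="dbname" type="data" format="quikrdb" label="custom trained database"/>
  </inputs>
  <outputs>
    <data name="output" format="quikrdb"/>
  </outputs>
</tool>
"""


def registered_extensions(conf_xml):
    """Collect the extension names declared by datatype elements."""
    root = ET.fromstring(conf_xml)
    return {dt.get("extension") for dt in root.iter("datatype")}


def unknown_formats(tool_xml, extensions):
    """List format names used by the tool that are not registered."""
    root = ET.fromstring(tool_xml)
    used = set()
    for elem in list(root.iter("param")) + list(root.iter("data")):
        fmt = elem.get("format")
        if fmt:
            used.update(name.strip() for name in fmt.split(","))
    return sorted(used - extensions)


print(unknown_formats(TOOL_XML, registered_extensions(DATATYPES_CONF)))
```

With the datatype registered under extension gz instead, the same check would report quikrdb as a format name the tool uses but nothing declares.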
Re: [galaxy-dev] Tool Errors
On Fri, Sep 12, 2014 at 4:11 PM, Calvin Morrison mutanttur...@gmail.com wrote: The stderr and stdout is empty, according to galaxy. here is paster.log output for quikr when i run it. galaxy.jobs.runners DEBUG 2014-09-12 10:33:45,997 (86) command is: # if user == user quikr -v -k 0 -s /data/galaxy/galaxy-dist/database/files/000/dataset_103.dat -i /data/galaxy/galaxy-dist/database/files/000/dataset_55.dat -o /data/galaxy/galaxy-dist/database/files/000/dataset_104.dat # else quikr -v -k 0 -s /data/galaxy/galaxy-dist/database/files/000/dataset_103.dat.mat.gz -i /data/galaxy/galaxy-dist/database/files/000/dataset_55.dat -o /data/galaxy/galaxy-dist/database/files/000/dataset_104.dat# end if that doesn't really seem all that helpful though. It does help - that command isn't going to work at the shell - try it and see? The problem is your Cheetah if statement has not been processed, and I think it is as simple as you've used invalid syntax in your command tag. I think you need to remove the extra spaces to have: #if ... Not: # if ... Then it might work? Peter ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/ To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
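
Following the suggestion above that a space between the hash and the directive name stops the directive being processed, a simple check can flag such lines in a command template before the tool is run. This Python sketch is a heuristic only: the function name and the keyword list are made up for illustration, and it would also flag an ordinary comment line that happens to start with one of these words.

```python
import re

# A '#' followed by whitespace and then a directive keyword.
# The keyword list is a partial, illustrative selection.
_SPACED_DIRECTIVE = re.compile(r"^\s*#\s+(if|elif|else|end|for|set)\b")


def find_spaced_directives(template):
    """Return (line number, text) pairs for lines like '# if ...'."""
    hits = []
    for number, line in enumerate(template.splitlines(), start=1):
        if _SPACED_DIRECTIVE.match(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical template fragment with the suspect spacing.
SUSPECT = "# if $mode == 'user'\nquikr -v -k 0\n# else\nquikr -v -k 1\n# end if\n"
print(find_spaced_directives(SUSPECT))
```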
Re: [galaxy-dev] ToolShed test failure: NotFound: cannot find 'ucsc_display_sites' while searching for 'APP.config.ucsc_display_sites'
On Wed, Sep 10, 2014 at 7:55 PM, Nate Coraor n...@bx.psu.edu wrote:
Hi Peter, This was due to a bug I introduced last week, which I've just fixed in d1f6d05. Sorry for the trouble. --nate
Thanks - I'll check back in a day or two once the tests have run again.
Peter
[galaxy-dev] ToolShed test failure: NotFound: cannot find 'ucsc_display_sites' while searching for 'APP.config.ucsc_display_sites'
Hi all, I'm wondering why my samtools_depad repository tests have failed, and since I have not changed this recently presume this is due to a Galaxy change or general TestToolShed problem not specific to my tool: https://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_depad Tests that failed Tool id: samtools_depad Tool version: samtools_depad Test: test_tool_00 (functional.test_toolbox.TestForTool_testtoolshed.g2.bx.psu.edu/repos/peterjc/samtools_depad/samtools_depad/0.0.1) Stderr: Traceback: Traceback (most recent call last): File /var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py, line 114, in test_tool self.do_it( td ) File /var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py, line 35, in do_it self._verify_outputs( testdef, test_history, shed_tool_id, data_list, galaxy_interactor ) File /var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py, line 75, in _verify_outputs galaxy_interactor.verify_output( history, output_data, output_testdef=output_testdef, shed_tool_id=shed_tool_id, maxseconds=maxseconds ) File /var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py, line 82, in verify_output self._verify_metadata( history_id, hid, attributes ) File /var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py, line 103, in _verify_metadata raise Exception( msg ) Exception: Dataset metadata verification for [file_ext] failed, expected [bam] but found [None]. 
Traceback (most recent call last): File /var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/lib/galaxy/web/framework/decorators.py, line 243, in decorator rval = func( self, trans, *args, **kwargs) File /var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/lib/galaxy/webapps/galaxy/api/history_contents.py, line 188, in show return self.__show_dataset( trans, id, **kwd ) File /var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/lib/galaxy/webapps/galaxy/api/history_contents.py, line 214, in __show_dataset hda_dict[ 'display_apps' ] = self.get_display_apps( trans, hda ) File /var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/lib/galaxy/web/base/controller.py, line 855, in get_display_apps for display_app in hda.get_display_applications( trans ).itervalues(): File /var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/lib/galaxy/model/__init__.py, line 1754, in get_display_applications return self.datatype.get_display_applications_by_dataset( self, trans ) File /var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/lib/galaxy/datatypes/data.py, line 445, in get_display_applications_by_dataset value = value.filter_by_dataset( dataset, trans ) File /var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/lib/galaxy/datatypes/display_applications/application.py, line 200, in filter_by_dataset if link_value.filter_by_dataset( data, trans ): File /var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/lib/galaxy/datatypes/display_applications/application.py, line 78, in filter_by_dataset if fill_template( filter_elem.text, context = context ) != filter_elem.get( 'value', 'True' ): File /var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/lib/galaxy/util/template.py, line 9, in fill_template 
return str( Template( source=template_text, searchList=[context] ) ) File /var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/eggs/Cheetah-2.2.2-py2.7-linux-x86_64-ucs4.egg/Cheetah/Template.py, line 1004, in __str__ return getattr(self, mainMethName)() File cheetah_DynamicallyCompiledCheetahTemplate_1410263883_33_43576.py, line 82, in respond NotFound: cannot find 'ucsc_display_sites' while searching for 'APP.config.ucsc_display_sites' requests.packages.urllib3.connectionpool: DEBUG: GET /api/histories/993bad2fe35335db/contents/7fbe67cfae825002?key=edc04240db9605fb7edc7bab44d3404c HTTP/1.1 500 None requests.packages.urllib3.connectionpool: INFO: Starting new HTTP connection (1): 127.0.0.1 requests.packages.urllib3.connectionpool: DEBUG: GET /api/histories/993bad2fe35335db/contents/7fbe67cfae825002/provenance?key=edc04240db9605fb7edc7bab44d3404c HTTP/1.1 200 None requests.packages.urllib3.connectionpool: INFO: Starting new HTTP connection (1): 127.0.0.1 requests.packages.urllib3.connectionpool: DEBUG: GET /api/histories/993bad2fe35335db/contents/7fbe67cfae825002/provenance?key=edc04240db9605fb7edc7bab44d3404c HTTP/1.1 200 None Any thoughts? Thanks,
Re: [galaxy-dev] directory as an input file
You might be able to do this by accepting a collection of SAM/BAM files as input instead. This is quite a new feature in Galaxy, see: https://wiki.galaxyproject.org/News/2014_06_02_Galaxy_Distribution
Peter
On Wed, Sep 3, 2014 at 10:00 AM, Philippe Moncuquet philippe.m...@gmail.com wrote:
Hi, I am trying to write a wrapper for a tool that takes a directory containing SAM/BAM files as an input. I am not sure how to do that; is there another tool that implements this that I can have a look at? Any suggestions would be greatly appreciated.
Regards, Philip
Re: [galaxy-dev] when else in conditional ? RE: refresh_on_change : is this a valid attribute? Any other ideas/options??
On Fri, Aug 29, 2014 at 11:43 PM, Lukasse, Pieter pieter.luka...@wur.nl wrote:
So I need to refresh on change... I see that if I have a conditional item in my form, this causes a refresh of the page and a (re)evaluation of my dynamic_options methods... so I could misuse this “feature”.
This is deliberate, although there has been talk of updating the conditional code to do the dependent parameters dynamically rather than server-side with a page refresh. From your outline description, I think you should be using the Galaxy conditional tag.
However, it seems that when I have a conditional I must have a when entry for every item in my select box. There is no “when else” option?
I think you are right - I've asked about this in the past, e.g. in this discussion, which appears not to have been fully on the mailing list though: http://dev.list.galaxyproject.org/Multiple-values-in-lt-when-gt-tags-for-lt-conditiona-gt-parameters-tc4659704.html#none This probably deserves to be tracked with a Trello card...
Peter
Re: [galaxy-dev] Examples of Galaxy tools in the toolsheds that install and run JAR files properly?
On Sat, Aug 30, 2014 at 11:17 AM, Melissa Cline cl...@soe.ucsc.edu wrote:
Hi folks, I'm attempting something that should be straightforward, but it's not. I have a tool that runs a JAR file, which I have bundled with the tool. I simply want to run the JAR file. And to paraphrase Thomas Edison, I've tried several thousand things that do not work (at least for me), from setting the JAVA_JAR_PATH environment variable in the tool_dependencies.xml file to trying to copy the JAR file into the tool-data/shared/jars subdirectories (which is the closest thing I've got to working). So, at long last I'm doing the sensible thing and looking for one simple working example that I can use as a template. Who can suggest a good toolshed tool (either main or test) that involves running its own JAR file, and that works? Thanks! Melissa
Here are a couple of my wrappers for Java tools, but I would suggest you invoke Java with an absolute path to the JAR file: $ java -jar ...
Here are two examples done via a Python wrapper script (mainly used for pre- or post-processing the data files): https://github.com/peterjc/pico_galaxy/tree/master/tools/effectiveT3 https://github.com/peterjc/galaxy_blast/tree/master/tools/blast2go
For EffectiveT3, which is open source and can therefore be easily redistributed, I set an environment variable EFFECTIVET3 for the location of the JAR file, which is used to invoke it via Java: $ java -jar ...
For the Blast2GO wrapper, I require the person installing it to set up an environment variable B2G4PIPE pointing at the folder with the JAR file. Older versions of this tool could be launched with the same -jar approach, but the current release requires setting a class path instead: $ java -cp ...
I hope that helps; if not, there are bound to be other Java examples in the ToolShed.
Peter
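The environment variable approach described in the reply could look roughly like the following inside a Python wrapper script. This is a sketch under assumptions, not the code of either wrapper linked above: the helper name, the JAR filename and the example folder are hypothetical, and the variable is assumed to hold the folder containing the JAR.

```python
import os


def build_java_command(env_var, jar_name, args=()):
    """Build a 'java -jar' argument list using an absolute path to the JAR.

    env_var is assumed to name an environment variable holding the folder
    that contains the JAR; jar_name is a hypothetical filename.
    """
    folder = os.environ.get(env_var)
    if not folder:
        raise RuntimeError("Environment variable %s is not set" % env_var)
    jar_path = os.path.abspath(os.path.join(folder, jar_name))
    return ["java", "-jar", jar_path] + list(args)


if __name__ == "__main__":
    # Hypothetical install location, used only if the variable is unset.
    os.environ.setdefault("EFFECTIVET3", "/opt/effectivet3")
    command = build_java_command("EFFECTIVET3", "example_tool.jar", ["--help"])
    # A wrapper would pass this list to subprocess; here it is only printed.
    print(" ".join(command))
```

Passing the command as a list rather than a single string avoids shell quoting problems when dataset paths contain spaces.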
Re: [galaxy-dev] BAM to SAM tool no way to get an unsorted SAM?
Do you mean a SAM/BAM file sorted by read name? If so, try samtools sort -n ... instead.
Peter
On Thu, Aug 28, 2014 at 12:53 PM, Alistair Chilcott alistair.chilc...@utas.edu.au wrote:
Hello all, My users are trying to use a tool called bismark, and it requires an unsorted SAM file for one of its steps. Previous steps produce BAM files, so we would like to include the tool BAM-to-SAM as a step in the workflow. Unfortunately it seems to be set to sort by default and there is no way to alter this behaviour. Is there another tool that would convert a BAM to a SAM without sorting it, or can the BAM-to-SAM tool be altered to include a checkbox for an unsorted result?
Regards, Alistair
Re: [galaxy-dev] BAM to SAM tool no way to get an unsorted SAM?
Ah - I missed this was on the Galaxy list, sorry. I think that Galaxy automatically coordinate-sorts BAM files, which is generally a good thing bar cases like yours. This problem has undoubtedly come up before - an unsortedbam datatype may be needed...
Peter
On Thu, Aug 28, 2014 at 1:35 PM, Alistair Chilcott alistair.chilc...@utas.edu.au wrote:
I guess that is my point: at a command prompt that can be achieved (and gives the result they need), but as far as I can see the same is not true via the Galaxy GUI (unless there is a tool I am missing). They have 96 separate files to process, hence the desire to use a workflow.
Regards, Alistair
-Original Message- From: Peter Cock [mailto:p.j.a.c...@googlemail.com] Sent: Thursday, 28 August 2014 2:27 PM To: Alistair Chilcott Cc: galaxy-dev@lists.bx.psu.edu Subject: Re: [galaxy-dev] BAM to SAM tool no way to get an unsorted SAM?
Do you mean a SAM/BAM file sorted by read name? If so, try samtools sort -n ... instead.
Peter
On Thu, Aug 28, 2014 at 12:53 PM, Alistair Chilcott alistair.chilc...@utas.edu.au wrote:
Hello all, My users are trying to use a tool called bismark, and it requires an unsorted SAM file for one of its steps. Previous steps produce BAM files, so we would like to include the tool BAM-to-SAM as a step in the workflow. Unfortunately it seems to be set to sort by default and there is no way to alter this behaviour. Is there another tool that would convert a BAM to a SAM without sorting it, or can the BAM-to-SAM tool be altered to include a checkbox for an unsorted result?
Regards, Alistair
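For reference, the command-prompt conversion mentioned in this thread can be sketched as below. samtools view with -h writes SAM including the header and keeps records in the order they appear in the BAM file. The helper only builds the argument list; the function name and file names are hypothetical, and this is not how the Galaxy BAM-to-SAM tool is implemented.

```python
def build_bam_to_sam_command(bam_path, sam_path):
    """Argument list for converting BAM to SAM without re-sorting.

    -h includes the header in the output, -o names the output file.
    """
    return ["samtools", "view", "-h", "-o", sam_path, bam_path]


if __name__ == "__main__":
    # Hypothetical file names; a wrapper would hand this list to subprocess.
    print(" ".join(build_bam_to_sam_command("reads.bam", "reads.sam")))
```

For 96 files the same helper could be called in a loop, which is roughly what a Galaxy workflow would do once a tool exposing the unsorted conversion is available.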
Re: [galaxy-dev] Mira-Assembler: DOESN'T WORK ON GALAXY
On Fri, Aug 22, 2014 at 9:39 AM, Marija Atanaskovic ma...@unimelb.edu.au wrote:
Mira doesn’t work on Galaxy. This is the log message I receive. Tool: Assemble with MIRA v3.4 Name: MIRA log Created: Fri Aug 22 00:24:38 2014 (UTC) Filesize: 0 bytes Dbkey: ? Format: txt Galaxy Tool Version: 0.0.10 Tool Version: None Tool Standard Output: stdout Tool Standard Error: stderr Tool Exit Code: None API ID: d5f55b83db1f410a Full Path: /mnt/galaxy/files/000/dataset_326.dat ...
What was the stdout and stderr information? Did you install this from the main tool shed?: https://toolshed.g2.bx.psu.edu/view/peterjc/mira_assembler
Also I can’t install Mira 4. This is the message I receive. Any suggestions.
Getting Internal Server Error is unhelpful - I can't really guess what might be going wrong here :( I have had problems with the MIRA dependencies when Bastien has renamed folders on sourceforge... are you using the Test Tool Shed here (since I have not yet released the MIRA 4 wrapper on the main ToolShed)?: https://testtoolshed.g2.bx.psu.edu/view/peterjc/mira4_assembler
Regards, Peter
Re: [galaxy-dev] Mira-Assembler: DOESN'T WORK ON GALAXY
On Fri, Aug 22, 2014 at 8:18 PM, Marija Atanaskovic ma...@unimelb.edu.au wrote:
Hi Peter, I don't know what the stdout and stderr information was. I click on it but nothing comes up. I installed from the main toolshed: http://toolshed.g2.bx.psu.edu/view/peterjc/mira_assembler Yes, I did use the test toolshed for Mira 4. Regards, Marija
If stdout and stderr are empty, this is consistent with MIRA never even starting (as suggested by the next bit of information below).
On Fri, Aug 22, 2014 at 8:20 PM, Marija Atanaskovic ma...@unimelb.edu.au wrote:
Hi Peter, This is the error in history under MIRA log: tool error An error occurred with this dataset: Unable to run job due to a misconfiguration of the Galaxy job running system. Please contact a site administrator. Regards, Marija
That suggests your Galaxy job runner is not properly configured, and MIRA itself was never started. Are you the site administrator? Are other Galaxy tools working? Are other Galaxy tools installed from the ToolShed working?
On Fri, Aug 22, 2014 at 8:22 PM, Marija Atanaskovic ma...@unimelb.edu.au wrote:
Hi Peter, One more thing. The data that I am trying to analyse are fastq files of Ion Torrent reads. I have used them with other assemblers e.g., velvet, CLC. Marija
I have not personally used Ion Torrent, but that ought to be fine.
Peter
P.S. You forgot to CC the mailing list.
Re: [galaxy-dev] Mira-Assembler: DOESN'T WORK ON GALAXY
On Fri, Aug 22, 2014 at 8:00 PM, Peter Cock p.j.a.c...@googlemail.com wrote:
On Fri, Aug 22, 2014 at 9:39 AM, Marija Atanaskovic ma...@unimelb.edu.au wrote:
Also I can’t install Mira 4. This is the message I receive. Any suggestions.
Getting Internal Server Error is unhelpful - I can't really guess what might be going wrong here :(
This is a guess, but it could be you need to set the https_proxy in the /etc/environment file - see this thread: http://dev.list.galaxyproject.org/Internal-Server-Error-when-trying-to-install-a-tool-from-the-Tool-shed-tt4665361.html
Peter
[galaxy-dev] Test run frequency on TestToolShed
Hi all, Are the main and test tool-sheds currently meant to be running the tool functional tests every 48 hours? I created and updated these repositories last week, but they have yet to be tested: https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_composition https://testtoolshed.g2.bx.psu.edu/view/peterjc/blast_rbh Thanks, Peter P.S. it would be nice to be able to sort the Repositories I own lists etc by date (particularly for my typical workflow of posting an update to the TestToolShed, waiting for a green light from the tests, and then pushing this to the main ToolShed).
Re: [galaxy-dev] Test run frequency on TestToolShed
On Mon, Aug 18, 2014 at 11:16 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
Hi all, Are the main and test tool-sheds currently meant to be running the tool functional tests every 48 hours? I created and updated these repositories last week, but they have yet to be tested: https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_composition https://testtoolshed.g2.bx.psu.edu/view/peterjc/blast_rbh Thanks, Peter P.S. it would be nice to be able to sort the Repositories I own lists etc by date (particularly for my typical workflow of posting an update to the TestToolShed, waiting for a green light from the tests, and then pushing this to the main ToolShed).
Perhaps there is something deeper here, as there are older untested examples: Revised 2014-07-30, https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_filter_by_id https://testtoolshed.g2.bx.psu.edu/view/peterjc/blastxml_to_top_descr Revised 2014-07-31, https://testtoolshed.g2.bx.psu.edu/view/peterjc/blast2go
Peter
Re: [galaxy-dev] Question about composite blast datatypes
On Wed, Aug 13, 2014 at 7:22 PM, Eric Rasche rasche.e...@yandex.ru wrote:
Hi Peter, I'm working on composite datatypes now (for PacBio SMRT cells). In the datatype I know I'll have files with variable names (e.g. .1, .2, .3) and after using the blast datatype as reference material, I noticed that you had commented out files with variable names from the datatype. Is there a rationale behind that? Are we assured that the entire directory will always be transferred, so that it's not a problem to leave files unspecified in the datatype? Reference: https://github.com/peterjc/galaxy_blast/blob/master/datatypes/blast_datatypes/blast.py#L232 Cheers, Eric
Hi Eric, Edward Kirton wrote the original BLAST DB datatypes - I'm not sure how to nicely define open-ended file lists for datatypes, but also the makeblastdb wrapper output does not have this problem (yet). If and when we try to support partitioned BLAST databases (with a *.nal or *.pal alias file) then this would be needed. I would think that the HTML composite datatype might be a better guide here, since you will often get lots of child files (images etc). My guess is the whole directory may be transferred anyway, but having undefined files could be a problem at some point...
Regards, Peter
Re: [galaxy-dev] Providing BLAST db in a data library
On Mon, Jul 28, 2014 at 9:43 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
On Mon, Jul 28, 2014 at 8:28 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
Dear Nate, dear Peter, Sorry for the delay in replying. I can import both HTML and blastdb from a history to a data library. If I try to get the data out of the library into another history, I am successful for the html but not for the blastdb. The problem seems to be that the primary data file (the /path/dataset_12345.dat) is empty for the blastdb, while the html primary file has something in it.
OK. Can you tell where Galaxy thinks the library files are on disk, and check to see if the folder of BLAST database files is actually there?
When I try to import the blastdb (from library to history) there is a message along the lines of can't import empty file. I hypothesise (admittedly without having looked at a line of code) that there is a test for file size 0 somewhere that is either altogether unnecessary or, more likely, does not take into account that for composite datatypes it might be completely legitimate for the primary file to be empty.
This guess makes sense - but I've not yet tried to trace through the code either.
Or is my primary blastdb file not supposed to be empty in the first place? I can blast against it just fine.
The BLAST databases do not define/populate a primary file, so Galaxy seems to create a dummy empty file on its own. I have wondered about altering the BLAST database datatype definition to have a human readable text file as the primary file (i.e. the information currently saved as a text log file when creating a database).
Correction: I actually implemented this late last year (included in BLAST+ wrapper version v0.0.22 onwards, and the Galaxy BLAST datatypes version v0.0.18 onwards): https://github.com/peterjc/galaxy_blast/commit/9b3f65cddcc60de26de63272c362c6ca53f6559d https://github.com/peterjc/galaxy_blast/commit/2ebfb790d5a1bbe310c3d7ccc2b953c2c37bccf2
The makeblastdb wrapper will send the stdout (log information) to the dummy index file, see the end of the command tag in: https://github.com/peterjc/galaxy_blast/blob/master/tools/ncbi_blast_plus/ncbi_makeblastdb.xml
The display_data method for a BLAST database will show any makeblastdb log information held in the dummy index file, see https://github.com/peterjc/galaxy_blast/blob/master/datatypes/blast_datatypes/blast.py
i.e. Only older BLAST databases in histories should have empty dummy index files, which will mitigate the library problem: https://trello.com/c/bNEKfOWR
Peter
[galaxy-dev] Determining datatype inheritance in tool XML Cheetah
Hi all, I've just uploaded a simple sequence composition tool to the Test Tool Shed: https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_composition https://github.com/peterjc/pico_galaxy/commit/45669446f5a14fd90a8a0d9d7430499de2fb3493
This accepts multiple input files in FASTA, FASTQ, or SFF format - and allows a mixture of these:
<inputs>
<param name="input_file" type="data" format="fasta,fastq,sff" multiple="true" label="Sequence file" help="FASTA, FASTQ, or SFF format." />
</inputs>
In order to build the command line string, I am currently using this for loop:
<command interpreter="python">
seq_composition.py -o $output_file
##For loop over inputs
#for i in $input_file
--$i.ext ${i}
#end for
</command>
This results in things like this being run:
seq_composition.py -o XXX.dat --fastqsanger XXX.dat --sff XXX.dat
This works, but means my Python script has to know about not just the core data types that I specified in my input parameter XML (fasta,fastq,sff) but also any subclasses (e.g. fastqsanger). It seems what I want/need would be something along these lines in pseudo-code, to map any datatype which is a subclass of fastq to a single command line option:
<command interpreter="python">
seq_composition.py -o $output_file
##For loop over inputs
#for i in $input_file
#if isinstance($i.datatype, fastq):
--fastq ${i}
#else
--$i.ext ${i}
#end if
#end for
</command>
This mock example borrows from the Python isinstance function, but of course some Galaxy datatypes are defined as subclasses at the XML level rather than literally at the Python class level. This should result in getting the following regardless of which flavour of FASTQ the input dataset had assigned:
seq_composition.py -o XXX.dat --fastq XXX.dat --sff XXX.dat
Does anyone have any Tool XML examples probing an input file's datatype in this way?
Peter
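Until the datatype can be probed from the tool XML, one workaround is to normalise the extension inside the Python script itself. The sketch below is an assumption-laden illustration, not code from seq_composition.py: it relies on the naming convention that FASTQ variants such as fastqsanger start with "fastq", and it does not consult Galaxy's datatype registry, so a subclass with an unrelated name would not be recognised.

```python
def option_for_extension(ext, base_formats=("fasta", "fastq", "sff")):
    """Map a Galaxy extension to the option for its core format.

    Prefix matching is a naming heuristic (fastqsanger -> fastq); it is
    not a real subclass test against Galaxy's datatype registry.
    """
    for base in base_formats:
        if ext.startswith(base):
            return "--" + base
    return "--" + ext


def build_input_args(datasets):
    """Turn (extension, path) pairs into a flat list of option/path arguments."""
    args = []
    for ext, path in datasets:
        args.extend([option_for_extension(ext), path])
    return args


if __name__ == "__main__":
    # Hypothetical dataset paths standing in for the XXX.dat files above.
    inputs = [("fastqsanger", "reads.dat"), ("sff", "flowgrams.dat")]
    print(" ".join(["seq_composition.py", "-o", "out.dat"] + build_input_args(inputs)))
```

The same mapping could instead live on the script's argument parser, by accepting each known variant as an alias of the core option.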
Re: [galaxy-dev] testtoolshed : python-2.7 installation error
Hi Geert, Which tool is this?
Peter
On Thu, Aug 7, 2014 at 9:00 AM, Geert Vandeweyer geert.vandewey...@uantwerpen.be wrote:
Hi, I get an installation error on the python 2.7 package in the test toolshed. I used the 'contact owner' function, but wanted to mention it here too, as there hasn't been a reaction so far. Sorry for double posting if so. Error: tar (child): 5.2.tar.bz2: Cannot open: No such file or directory A similar error is in the Test run outputs. I believe it is related to the following (unnecessary) line in the tool_dependency.xml: <action type="change_directory">..</action> located just after the download_file action for the 5.2.tar.bz2 file.
Best, Geert -- Geert Vandeweyer, Ph.D. Department of Medical Genetics University of Antwerp Prins Boudewijnlaan 43 2650 Edegem Belgium Tel: +32 (0)3 275 97 56 E-mail: geert.vandewe...@ua.ac.be http://ua.ac.be/cognitivegenetics http://www.linkedin.com/in/geertvandeweyer
Re: [galaxy-dev] XLS TO CSV
Hi Mert, Most of the Galaxy tools dealing with tables of data use tabular format (tab separated variables), not csv (comma separated variables). CSV is a horrible, horrible mess of formats, see e.g. http://tburette.github.io/blog/2014/05/25/so-you-want-to-write-your-own-CSV-code/ Also beware that anything other than MS Excel could be confused by quirks in the Excel format, e.g. multiple ways to record dates: http://support.microsoft.com/kb/180162 I would personally save each tab of the Excel sheet as tab separated data, and import those into Galaxy.
Peter
On Mon, Aug 4, 2014 at 1:56 PM, Mert Mehnur KIRKALI mertmehnurkirk...@gmail.com wrote:
Hello, How can I convert an xls file to a csv file on Galaxy? Is that possible?
Best Regards, Mert.
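If the data is already in CSV form, converting it to tabular before upload is straightforward with Python's standard csv module, which handles quoted fields containing commas. This is a minimal sketch, assuming plain comma-separated input with standard double-quote quoting; the function name is made up for the example, and it does not read Excel files directly.

```python
import csv
import io


def csv_to_tabular(csv_text):
    """Convert comma separated text to tab separated text."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    # Tab delimiter with Unix line endings; fields are quoted only if needed.
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    sample = 'id,description\n1,"BLAST hit, top match"\n'
    print(csv_to_tabular(sample), end="")
```

A field that itself contains a tab would be quoted by the writer, which most tabular tools do not expect, so such input is best cleaned first.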
Re: [galaxy-dev] XLS TO CSV
On Mon, Aug 4, 2014 at 4:28 PM, Eric Rasche rasche.e...@yandex.ru wrote:
Hi Peter,
On 08/04/2014 09:25 AM, Peter Cock wrote:
Hi Mert, Most of the Galaxy tools dealing with tables of data use tabular format (tab separated variables), not csv (comma separated variables). CSV is a horrible, horrible mess of formats, see e.g. http://tburette.github.io/blog/2014/05/25/so-you-want-to-write-your-own-CSV-code/
This annoyed me when I was first starting out with Galaxy, I really wish it'd be labelled TSV. The labels all read CSV so I gave Galaxy CSV data and Galaxy didn't like it, much to my confusion.
Which labels say CSV at the moment? (And yes, I would also have preferred tsv to tabular as the datatype name in Galaxy, that way it would match the typical file extension).
Also, most (biologists) I work with use the term CSV very generically without regard to the differences between the two.
I've seen that too - but people saying CSV when they mean TSV will unavoidably cause confusion.
Also beware that anything other than MS Excel could be confused by quirks in the Excel format, e.g. multiple ways to record dates: http://support.microsoft.com/kb/180162 I would personally save each tab of the Excel sheet as tab separated data, and import those into Galaxy.
Would it not make sense to have an XLS - TSV datatype converter? I'm sure many biologists would appreciate being able to use the in-Galaxy version as opposed to having to open+re-save all of their data.
It makes sense to me to offer a tool mapping one Excel file to multiple tabular output files (one per sheet). How best to write this will depend on the platform and available dependencies (e.g. some of the R converters for this are Windows only IIRC).
Peter
Re: [galaxy-dev] TestToolShed failure, Exception: History in error state.
Hi Dave,

You are right that on closer inspection I've mixed tool_dependencies.xml and repository_dependencies.xml *again*. Evidently my mental model does not match Greg's here:

(*) I need to define a tool installation recipe for something not in the Tool Shed --> write an install script called tool_dependencies.xml
(*) I need to depend on a Python package by pointing at another repository in the Tool Shed --> repository_dependencies.xml
(*) I need to depend on a datatype package by pointing at another repository in the Tool Shed --> repository_dependencies.xml
(*) I need to depend on a binary package by pointing at another repository in the Tool Shed --> repository_dependencies.xml ? No. You need tool_dependencies.xml for this too.

But that aside, the test framework error here is completely unhelpful. Why is there no error message about missing a dependency? Was there an error from running my tool which was not shown?

Thanks,

Peter

On Wed, Jul 30, 2014 at 6:07 PM, Dave Bouvier d...@bx.psu.edu wrote:
> Peter,
>
> I believe part of the problem is that the install and test framework is unable to resolve the dependency on blast+ 2.2.29 because it is defined as a repository dependency, not a tool dependency. I would recommend replacing the repository dependency in the blast_rbh repository with a tool dependency definition that references package_blast_plus_2_2_29.
>
> --Dave B.
>
> On 07/30/2014 05:27 AM, Peter Cock wrote:
>> Hi all,
>>
>> I'm not sure when this started (having hardly looked at my Tool Shed test results since GCC2014), but I think this is a fairly recent problem with my BLAST RBH tests failing (which has held me back from posting this to the main Tool Shed).
>>
>> This could be some silly mistake in my tar-ball, but usually missing test files and the like get an explicit error. The tests are passing on my GitHub/TravisCI setup (using Twill and the API backend): e.g.
>> https://travis-ci.org/peterjc/galaxy_blast/builds/30592097
>>
>> Here is the current error (the same for the last few Test Tool Shed runs),
>> https://testtoolshed.g2.bx.psu.edu/view/peterjc/blast_rbh
>>
>> Traceback (most recent call last):
>>   File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py", line 108, in test_tool
>>     self.do_it( td )
>>   File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py", line 35, in do_it
>>     self._verify_outputs( testdef, test_history, shed_tool_id, data_list, galaxy_interactor )
>>   File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py", line 69, in _verify_outputs
>>     galaxy_interactor.verify_output( history, output_data, output_testdef=output_testdef, shed_tool_id=shed_tool_id, maxseconds=maxseconds )
>>   File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 53, in verify_output
>>     self.wait_for_history( history_id, maxseconds )
>>   File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 107, in wait_for_history
>>     self.twill_test_case.wait_for( lambda: not self.__history_ready( history_id ), maxseconds=maxseconds)
>>   File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/twilltestcase.py", line 2453, in wait_for
>>     result = func()
>>   File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 107, in <lambda>
>>     self.twill_test_case.wait_for( lambda: not self.__history_ready( history_id ), maxseconds=maxseconds)
>>   File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 257, in __history_ready
>>     return self._state_ready( state, error_msg="History in error state." )
>>   File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 316, in _state_ready
>>     raise Exception( error_msg )
>> Exception: History in error state.
>>
>> Is a more detailed log available which might help debug this?
>>
>> Thanks,
>>
>> Peter
>>
>> As an aside, this looks like the Test Tool Shed is still using the Twill backend for the functional tests?
Re: [galaxy-dev] TestToolShed failure, Exception: History in error state.
On Thu, Jul 31, 2014 at 5:21 PM, bjoern.gruen...@googlemail.com bjoern.gruen...@gmail.com wrote:
> Hi Peter,
>
> 2014-07-31 10:57 GMT+02:00 Peter Cock p.j.a.c...@googlemail.com:
>> Hi Dave,
>>
>> You are right that on closer inspection I've mixed tool_dependencies.xml and repository_dependencies.xml *again*. Evidently my mental model does not match Greg's here:
>>
>> (*) I need to define a tool installation recipe for something not in the Tool Shed --> write an install script called tool_dependencies.xml
>> (*) I need to depend on a Python package by pointing at another repository in the Tool Shed --> repository_dependencies.xml
>
> I might be wrong, but I think that also goes to tool_dependencies.xml

Correct, e.g. https://github.com/peterjc/pico_galaxy/tree/master/tools/seq_select_by_id

Thanks!

>> (*) I need to depend on a datatype package by pointing at another repository in the Tool Shed --> repository_dependencies.xml
>> (*) I need to depend on a binary package by pointing at another repository in the Tool Shed --> repository_dependencies.xml ? No. You need tool_dependencies.xml for this too
>
> As far as I understood, everything that is referenced in the tool.xml under the requirement section needs to be in a tool_dependencies.xml file. Any other dependencies are from the repository (data_types, data_manager, workflows ...).
>
> Ciao,
> Bjoern

Sure, there is a logic here - but it's a definition which I seem to still struggle with :(

>> But that aside, the test framework error here is completely unhelpful. Why is there no error message about missing a dependency? Was there an error from running my tool which was not shown?
>>
>> Thanks,
>>
>> Peter

I'd still like to get a more explicit error from the test suite than "History in error state" though ;)

Regards,

Peter
[galaxy-dev] TestToolShed failure, Exception: History in error state.
Hi all,

I'm not sure when this started (having hardly looked at my Tool Shed test results since GCC2014), but I think this is a fairly recent problem with my BLAST RBH tests failing (which has held me back from posting this to the main Tool Shed).

This could be some silly mistake in my tar-ball, but usually missing test files and the like get an explicit error. The tests are passing on my GitHub/TravisCI setup (using Twill and the API backend): e.g.
https://travis-ci.org/peterjc/galaxy_blast/builds/30592097

Here is the current error (the same for the last few Test Tool Shed runs),
https://testtoolshed.g2.bx.psu.edu/view/peterjc/blast_rbh

    Traceback (most recent call last):
      File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py", line 108, in test_tool
        self.do_it( td )
      File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py", line 35, in do_it
        self._verify_outputs( testdef, test_history, shed_tool_id, data_list, galaxy_interactor )
      File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py", line 69, in _verify_outputs
        galaxy_interactor.verify_output( history, output_data, output_testdef=output_testdef, shed_tool_id=shed_tool_id, maxseconds=maxseconds )
      File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 53, in verify_output
        self.wait_for_history( history_id, maxseconds )
      File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 107, in wait_for_history
        self.twill_test_case.wait_for( lambda: not self.__history_ready( history_id ), maxseconds=maxseconds)
      File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/twilltestcase.py", line 2453, in wait_for
        result = func()
      File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 107, in <lambda>
        self.twill_test_case.wait_for( lambda: not self.__history_ready( history_id ), maxseconds=maxseconds)
      File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 257, in __history_ready
        return self._state_ready( state, error_msg="History in error state." )
      File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 316, in _state_ready
        raise Exception( error_msg )
    Exception: History in error state.

Is a more detailed log available which might help debug this?

Thanks,

Peter

As an aside, this looks like the Test Tool Shed is still using the Twill backend for the functional tests?
[galaxy-dev] Uploads with embedded citations causing red error on Tool Shed
Hi John,

Following the work at the BOSC 2014 CodeFest to support embedded citations within Galaxy Tool XML files [1], and your work adding this to the BLAST tools as an example [2], I tried uploading a minor tool using this to the Tool Shed.

The upload seems to have worked, but there was a scary red error message about metadata... see below (both main and test toolsheds affected).

Regards,

Peter

[1] https://bitbucket.org/galaxy/galaxy-central/pull-request/440/
[2] https://github.com/peterjc/galaxy_blast/commit/9d2e3906915895765ecc3f48421b91fabf2ccd8b

--

Uploading on the Test Tool Shed, red error:

    Metadata may have been defined for some items in revision '796dc2ff8e8e'. Correct the following problems if necessary and reset metadata.
    blastxml_to_top_descr.xml - 'UniverseApplication' object has no attribute 'citations_manager'

https://testtoolshed.g2.bx.psu.edu/view/peterjc/blastxml_to_top_descr

--

Uploading on the Tool Shed, red error:

    Metadata may have been defined for some items in revision 'fe1ed74793c9'. Correct the following problems if necessary and reset metadata.
    blastxml_to_top_descr.xml - 'UniverseApplication' object has no attribute 'citations_manager'

https://toolshed.g2.bx.psu.edu/view/peterjc/blastxml_to_top_descr
Re: [galaxy-dev] Providing BLAST db in a data library
On Wed, Jul 30, 2014 at 11:52 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
> Dear Nate, dear Peter
>
> Again, sorry for the delay in replying. Yes I can. It looks like this
>
>     [galaxy@srv ~]$ cat /galaxy/database/files/081/dataset_81002.dat
>     [galaxy@srv ~]$ ls /galaxy/database/files/081/dataset_81002_files/
>     blastdb.nhd blastdb.nhi blastdb.nhr blastdb.nin blastdb.nog blastdb.nsd blastdb.nsi blastdb.nsq

Good. Thanks for confirming that.

> I think the simplest solution would be to put something in the primary file. Just a short string that gets the file size above 0.

That won't help with all the existing datasets out there - I think we rather need to fix something in the Galaxy code for composite files...

> I personally have followed your initial suggestion and made the dbs available globally via the .loc file.
>
> Thanks again
> Ulf

Great.

Peter
Re: [galaxy-dev] Uploads with embedded citations causing red error on Tool Shed
Thanks John - is there any point/benefit to re-uploading my tool once the fix is live on the Tool Shed? i.e. Was it a harmless warning?

Peter

On Wed, Jul 30, 2014 at 12:12 PM, John Chilton jmchil...@gmail.com wrote:
> Hey Peter,
>
> Oops, sorry about that and thanks for the bug report. The tool shed code should be fixed with https://bitbucket.org/galaxy/galaxy-central/commits/38ba45d6ba5be65b3b743fc08739e16cd6e0ac8f - it is in next-stable so I think the tool shed should pick up that fix at the next tool shed update.
>
> -John
>
> On Wed, Jul 30, 2014 at 5:43 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
>> Hi John,
>>
>> Following the work at the BOSC 2014 CodeFest to support embedded citations within Galaxy Tool XML files [1], and your work adding this to the BLAST tools as an example [2], I tried uploading a minor tool using this to the Tool Shed.
>>
>> The upload seems to have worked, but there was a scary red error message about metadata... see below (both main and test toolsheds affected).
>>
>> Regards,
>>
>> Peter
>>
>> [1] https://bitbucket.org/galaxy/galaxy-central/pull-request/440/
>> [2] https://github.com/peterjc/galaxy_blast/commit/9d2e3906915895765ecc3f48421b91fabf2ccd8b
>>
>> --
>>
>> Uploading on the Test Tool Shed, red error:
>>
>>     Metadata may have been defined for some items in revision '796dc2ff8e8e'. Correct the following problems if necessary and reset metadata.
>>     blastxml_to_top_descr.xml - 'UniverseApplication' object has no attribute 'citations_manager'
>>
>> https://testtoolshed.g2.bx.psu.edu/view/peterjc/blastxml_to_top_descr
>>
>> --
>>
>> Uploading on the Tool Shed, red error:
>>
>>     Metadata may have been defined for some items in revision 'fe1ed74793c9'. Correct the following problems if necessary and reset metadata.
>>     blastxml_to_top_descr.xml - 'UniverseApplication' object has no attribute 'citations_manager'
>>
>> https://toolshed.g2.bx.psu.edu/view/peterjc/blastxml_to_top_descr
Re: [galaxy-dev] Providing BLAST db in a data library
On Mon, Jul 28, 2014 at 8:28 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
> Dear Nate, dear Peter
>
> Sorry for the delay in replying.
>
> I can import both HTML and blastdb from a history to a data library. If I try to get the data out of the library into another history, I am successful for the html but not for the blastdb. The problem seems to be that the primary data file (the /path/dataset_12345.dat) is empty for the blastdb, while the html primary file has something in it.

OK. Can you tell where Galaxy thinks the library files are on disk, and check to see if the folder of BLAST database files is actually there?

> When I try to import the blastdb (from library to history) there is a message along the lines of "can't import empty file". I hypothesise (admittedly without having looked at a line of code) that there is a test for file size 0 somewhere that is either altogether unnecessary or, more likely, does not take into account that for composite datatypes it might be completely legitimate for the primary file to be empty.

This guess makes sense - but I've not yet tried to trace through the code either.

> Or is my primary blastdb file not supposed to be empty in the first place? I can blast against it just fine.

The BLAST databases do not define/populate a primary file, so Galaxy seems to create a dummy empty file on its own. I have wondered about altering the BLAST database datatype definition to have a human readable text file as the primary file (i.e. the information currently saved as a text log file when creating a database).

> Thanks a lot for your help
> Ulf

You too - you've found an interesting bug...

Peter
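To make the hypothesis in this message concrete, here is a minimal sketch of an emptiness check that also looks inside the extra files directory of a composite dataset. The function dataset_is_empty and its arguments are hypothetical names chosen for illustration; this is not Galaxy's actual import code, which nobody in the thread had yet inspected.

```python
import os
import tempfile


def dataset_is_empty(primary_path, extra_files_dir=None):
    """Return True only if neither the primary file nor any extra file has data.

    Hypothetical helper for illustration; not taken from the Galaxy code base.
    """
    if os.path.exists(primary_path) and os.path.getsize(primary_path) > 0:
        return False
    if extra_files_dir and os.path.isdir(extra_files_dir):
        for name in os.listdir(extra_files_dir):
            path = os.path.join(extra_files_dir, name)
            if os.path.isfile(path) and os.path.getsize(path) > 0:
                return False
    return True


# Mimic the layout reported in the thread: empty .dat plus a _files folder.
with tempfile.TemporaryDirectory() as tmp:
    primary = os.path.join(tmp, "dataset_12345.dat")
    extra = os.path.join(tmp, "dataset_12345_files")
    open(primary, "w").close()
    os.mkdir(extra)
    with open(os.path.join(extra, "blastdb.nsq"), "wb") as handle:
        handle.write(b"placeholder")
    print(dataset_is_empty(primary))         # size test on the primary file only
    print(dataset_is_empty(primary, extra))  # composite-aware test
```

A check that only measures the primary file would treat this BLAST database as empty, while the composite-aware variant would not.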
Re: [galaxy-dev] tool xml substitutes special characters in text parameter
On Mon, Jul 28, 2014 at 11:03 AM, Wolfgang Maier wolfgang.ma...@biologie.uni-freiburg.de wrote:
> Dear all,
>
> I noticed that with params of type text Galaxy seems to replace certain characters before passing them to the shell. As examples, it changes @ to __at__, } to __cc__ and \ to X.
>
> Is this the standard behavior or am I doing something wrong? And if it's standard, are there workarounds?
>
> Best,
> Wolfgang

This is standard behaviour to prevent special characters being used to construct malicious command lines. It can be configured within your tool definition using the sanitizer tag set:

https://wiki.galaxyproject.org/Admin/Tools/ToolConfigSyntax#A.3Csanitizer.3E_tag_set

Peter
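The substitutions described in this exchange can be sketched as a whitelist with a replacement table. This is an illustration only: the table below holds just the two mappings reported in the thread (@ and }), and the set of allowed characters and the function name sanitize are assumptions, not Galaxy's real defaults or API.

```python
import string

# Only the two mappings reported in the thread; Galaxy's real table is larger.
REPLACEMENTS = {"@": "__at__", "}": "__cc__"}

# Assumed whitelist for this sketch (letters, digits and a few safe symbols).
ALLOWED = set(string.ascii_letters + string.digits + " -_.")


def sanitize(value, replacements=REPLACEMENTS, allowed=ALLOWED, invalid_char="X"):
    """Keep allowed characters, map known ones, replace the rest with X."""
    out = []
    for ch in value:
        if ch in allowed:
            out.append(ch)
        elif ch in replacements:
            out.append(replacements[ch])
        else:
            out.append(invalid_char)
    return "".join(out)


print(sanitize("user@host}\\"))
```

Under these assumptions, relaxing the behaviour for one parameter amounts to widening the allowed set or changing the mapping for that parameter, which is the role the sanitizer tag set plays in a tool definition.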
Re: [galaxy-dev] tool xml substitutes special characters in text parameter
On Mon, Jul 28, 2014 at 12:23 PM, Wolfgang Maier wolfgang.ma...@biologie.uni-freiburg.de wrote:
> On 28.07.2014 12:22, Peter Cock wrote:
>> This is standard behaviour to prevent special characters being used to construct malicious command lines. It can be configured within your tool definition using the sanitizer tag set:
>>
>> https://wiki.galaxyproject.org/Admin/Tools/ToolConfigSyntax#A.3Csanitizer.3E_tag_set
>>
>> Peter
>
> Thanks a lot, Peter, that solved my problem!
>
> Unfortunately, with this one fixed I now run into an additional one: There is one free-text text field defined in my tool, which should accept characters outside the standard ascii code range (i.e. > 127), in particular, German Umlaute äöüÄÖÜ.

Hmm. This may be very complicated since a lot will depend on the local server/cluster's locale settings. Not everything will be UTF-8.

Peter
Re: [galaxy-dev] Providing BLAST db in a data library
On Wed, Jul 23, 2014 at 10:47 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
> Dear all
>
> I have several smallish BLAST databases that I would like to provide in a data library. I create them in a history with the makeblastdb tool and then try to add them to the library.
>
> I see that for each blast db there is an empty file created (like /path/dataset_12345.dat) and a folder with the same name (/path/dataset_12345_files/) that contains the actual db files (blastdb.n*).
>
> In my library the blastdb shows up empty and I cannot import it back to another history. It does not seem to be aware of the _files folder, despite it being the right data type (blastdbn).
>
> Any ideas what I am doing wrong?
>
> Thanks a lot for your help
> Ulf

Hi Ulf,

I've never tried that. It could be a bug in Galaxy importing composite datatypes into a library, or something in the BLAST database definition which needs fixing.

Does importing an HTML report (with child files like images) into a library work for you? (This is another composite datatype so a useful comparison).

Rather than using Data Libraries, we just list all the locally installed shared BLAST databases via the BLAST *.loc files instead. Note using the *.loc files makes the databases available to all the Galaxy users, while with a Data Library you can control access to specific groups/roles.

Regards,

Peter
Re: [galaxy-dev] Providing BLAST db in a data library
Interesting hypothesis - you may well be right.

Galaxy guys - who is the expert to talk to on this and/or where in the code should we be looking?

Thanks,

Peter

On Wed, Jul 23, 2014 at 11:22 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
> Dear Peter
>
> Thanks for your reply.
>
> I can import an html report (e.g. FastQC output) successfully into a new history from a data library. But the .dat file for the html is not empty like the one for the blastdb. Makes me think that I could do this with a blast db as well, if only it would not check for size 0 at the time of importing it.
>
> Thanks
> Ulf
Re: [galaxy-dev] Once-run galaxy archives
On Wed, Jul 23, 2014 at 2:42 PM, John Chilton jmchil...@gmail.com wrote:
> Problem with automation is I could create dozens of templates over the next several years and consume less time in aggregate than it would take me to automate this. Nonetheless, there is a documentation component here that is important so I did enough to document - if someone wants to automate from there feel free.
>
> The template is just a copy of the sqlite database after a fresh Galaxy is launched. I usually just do this against whatever development instance of Galaxy I am working on. For completeness though I have put together a script to automate this task against a fresh install (I think):
>
> https://github.com/jmchilton/galaxy-downloads/blob/master/build_sqlite_template.sh
>
> Good luck!
>
> -John

Perfect - documentation target achieved :)

In terms of speeding up things like TravisCI using these SQLite database templates, refreshing this every few schema bumps should be enough.

Thanks,

Peter
Re: [galaxy-dev] Once-run galaxy archives
On Mon, Jul 21, 2014 at 6:51 PM, Eric Rasche rasche.e...@yandex.ru wrote:
> Currently the checkout options consist of hg clones, and archives that mercurial produces.
>
> Having pulled or cloned galaxy a few times lately, I'm wondering if anyone would have a use for a once-run galaxy instance in an archive? I.e., I'd clone, run once to grab eggs and do the db migration, then re-tar result and store online.
>
> Might cut down on build/test times for those who are using travis or other CIs.
>
> Thoughts/opinions?

Hi Eric,

Given how close you can get now for minimal effort, this seems unnecessary.

http://blastedbio.blogspot.co.uk/2013/09/using-travis-ci-for-testing-galaxy-tools.html

My TravisCI setup fetches the latest Galaxy as a tar ball (from a GitHub mirror as it was faster than a git clone which was faster than getting the tar ball from BitBucket, which in turn was faster than using hg clone), and a pre-migrated SQLite database (using a bit of Galaxy functionality originally with $GALAXY_TEST_DB_TEMPLATE added to speed up running the functional tests).

Note this does not cache the eggs and all the other side effects of the first run like creating config files, so there is room for some speed up.

Regards,

Peter
Re: [galaxy-dev] Once-run galaxy archives
On Tue, Jul 22, 2014 at 1:15 PM, Eric Rasche rasche.e...@yandex.ru wrote:
> Hi Peter,
>
> On July 22, 2014 3:15:41 AM CDT, Peter Cock p.j.a.c...@googlemail.com wrote:
>> Given how close you can get now for minimal effort, this seems unnecessary.
>>
>> http://blastedbio.blogspot.co.uk/2013/09/using-travis-ci-for-testing-galaxy-tools.html
>>
>> My TravisCI setup fetches the latest Galaxy as a tar ball (from a GitHub mirror as it was faster than a git clone which was faster than getting the tar ball from BitBucket, which in turn was faster than using hg clone),
>
> Yes, that post was at least part of the thinking behind this. :)
>
>> .., and a pre-migrated SQLite database (using a bit of Galaxy functionality originally with $GALAXY_TEST_DB_TEMPLATE added to speed up running the functional tests).

Apologies for grammatical error - I pasted in the environment variable at the wrong point in the sentence.

> I know I've seen that used but was never able to get that working in practice (then again I didn't try that hard). If that's a working/usable feature, then that is already the majority of setup time.

Yes, the creation of the test-database and all the migrations was an obvious low-hanging fruit when we were looking at making running the tool functional tests faster - although originally in the context of running the tests on a local development Galaxy instance.

As to using this in practice, currently my TravisCI setup has:

    export GALAXY_TEST_DB_TEMPLATE=https://github.com/jmchilton/galaxy-downloads/raw/master/db_gx_rev_0117.sqlite

I also added that line at the start of my local copy of script run_functional_tests.sh to benefit from this while doing development. That should be all there is to it (but from memory, this is only for use with the SQLite backend).

John - could you add a current schema snapshot to https://github.com/jmchilton/galaxy-downloads/ please?

>> Note this does not cache the eggs and all the other side effects of the first run like creating config files, so there is room for some speed up.
>
> Eggs would be nice but not the biggest thing in the world.

Right. I do like your idea of automatically generated cutting-edge or each stable release Docker images though (even if I have no personal need for them at the moment).

Regards,

Peter
Re: [galaxy-dev] Basic Questions
Set yourself as an administrator, and you can import the files from disk (and link to them if you wish to avoid a copy) as part of a data library. See:

https://wiki.galaxyproject.org/Admin/DataLibraries/UploadingLibraryFiles

Peter

On Tue, Jul 22, 2014 at 3:52 PM, Mark Lindsay m.a.lind...@bath.ac.uk wrote:
> Apologies if this sounds like a basic question or if I am enquiring of the incorrect list.
>
> I have just had a local instance of galaxy installed on my MacPro. Could somebody inform me of the best options for loading large BAM files (5Gb) from the same hard drive into this instance of Galaxy. It states that it is not possible to load files larger than 2Gb and that you must use either a URL or FTP.
>
> My scripting knowledge is virtually non-existent... although I have access to people that do.
>
> Cheers
> Mark
Re: [galaxy-dev] Once-run galaxy archives
On Tue, Jul 22, 2014 at 7:41 PM, Eric Rasche rasche.e...@yandex.ru wrote:
> John,
>
> How are those generated? Would you be amenable to scripting that portion and running it once a month? (...say in a cron job, with a passwordless ssh key so you never have to touch it again)
>
> Cheers,
> Eric

How to generate it was going to be my next question too ;)

I'm impressed with Eric's zeal to automate things. Having a script for making the SQLite template would be good - under git in the same repository?

Peter

P.S. The schema version 120 template works great, thanks!:

https://travis-ci.org/peterjc/pico_galaxy/builds/30592828
https://travis-ci.org/peterjc/galaxy_blast/builds/30592097
Re: [galaxy-dev] writing datatypes
On Sun, Jul 20, 2014 at 6:23 PM, Björn Grüning bjo...@gruenings.eu wrote:
> Hi,
>
> single datatype definitions only work if you haven't defined any converters. Let's assume I have a datatype X and want to ship a X -> Y converter (Y -> X is also possible), we will end up with a dependency loop, or? The X repository will depend on the Y repository, but Y is depending on X, because we want to include a Y -> X converter.
>
> Any idea how to solve that?

Excellent example!

> How to handle versions of datatypes? Extra repositories for stockholm 1.0 and 1.1? If so ... the associated python file (sniffing, splitting ...) should be also versioned, or? What happens if I have two stockholm.py files in my system?

Potentially you might need/want to define those as two different Galaxy datatypes?

> @Peter, can we create a stripped-down, python only biopython egg? All parsers should be included, Bio.SeqIO should be sufficient I think.

Right now, yes in principle (and this is fine from the licence point of view), but in practice this is a fair chunk of work. However, we are looking at this - see https://github.com/biopython/biopython/issues/349

Peter
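The dependency loop raised in this message can be shown with a small cycle check over a dependency mapping. The repository names datatype_x and datatype_y and the function find_cycle are hypothetical, and this is not how the Tool Shed resolves dependencies; it is only a sketch of why two repositories that each ship a converter to the other's datatype cannot be ordered.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of names, or None if there is none.

    deps maps a repository name to the names it depends on (illustrative only).
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {}
    path = []

    def visit(node):
        state[node] = GREY
        path.append(node)
        for nxt in deps.get(node, ()):
            seen = state.get(nxt, WHITE)
            if seen == GREY:
                # nxt is already on the current path, so we closed a loop.
                return path[path.index(nxt):] + [nxt]
            if seen == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = BLACK
        return None

    for node in deps:
        if state.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# X ships an X -> Y converter and Y ships a Y -> X converter.
deps = {"datatype_x": ["datatype_y"], "datatype_y": ["datatype_x"]}
print(find_cycle(deps))
```

One conceivable way out, offered here only as an assumption, would be to move the converters into a third repository depending on both datatypes, which makes the graph acyclic again.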
Re: [galaxy-dev] datatype dependencies
On Fri, Jul 18, 2014 at 4:21 PM, Eric Rasche rasche.e...@yandex.ru wrote:
> On 07/18/2014 09:49 AM, John Chilton wrote:
>> My understanding of the code is that tool shed dependencies (or local dependencies) will not be available to tool shed datatypes (for sniffing for instance). Sorry.
>
> I figured as much, not very surprising at all. Dependencies notwithstanding, the idea has some modicum of merit. There are plenty of people who have already written great parsers that throw up errors, why should datatypes re-write them?

Exactly - Trello request for the toolshed to handle both Python and binary dependencies for datatypes? (e.g. samtools is a binary dependency of the SAM/BAM datatypes, used for conversion and indexing)

>> If you want to hack up your local instance to resolve dependencies during the sniffing process that may be possible - my guess is you could add requirement tags to tools/data_source/upload.xml and the __SET_METADATA__ tool definition embedded in lib/galaxy/datatypes/registry.py - though I have not tried this.
>
> Well heck, at that point I'd just use the fact that I know I'm in lib/galaxy/datatypes to locate the BioPython dependency that was installed through greps, globs, and finds. Though I'll hold off on that for a better solution.

I'd manually install the Python dependencies as part of the Python used to run Galaxy itself?

Peter
Re: [galaxy-dev] API v/s twill based testing
On Fri, Jul 18, 2014 at 5:14 PM, Dave Bouvier d...@bx.psu.edu wrote:
> John, Peter,
> The buildbot builders are already using the API interactor for both functional tests and the install and test framework.
> --Dave B.

Great news. When did that happen? Did it cause any regressions (and can/did you flag those to the repository authors to alert them)?

Assuming that changeover went smoothly, is the plan to change the default test back-end in the master branch of galaxy-central (and thus eventually galaxy-stable) for those running tool tests locally?

Thanks,

Peter
Re: [galaxy-dev] Multiple output tools in Workflow
On Thu, Jul 17, 2014 at 1:58 PM, Calogero Zarbo za...@fbk.eu wrote:
> Hello Peter,
> Thanks for your answer. I tried your code but I am still not able to make it work the way I want. I mean that in the workflow design page, when I switch the parameter, it doesn't change the graphical list of outputs of the tool. How can I fix it? I want something like the input, where it shows different outputs according to the parameter selected from the select.
>
> Here is the XML code:
>
> <inputs>
>     <param name="input_dataset" label="Input dataset (ShoweLab or FBK Format) to split" type="data" format="showelab-dataset,fbk-svm-dataset"/>
>     <conditional name="format_condition">
>         <param name="format_options" label="Choose the type of dataset you want to split" type="select">
>             <option value="fbk">FBK Format</option>
>             <option value="showelab" selected="True">ShoweLab Format</option>
>         </param>
>         <when value="fbk">
>             <param name="input_fbk_dataset_labels" label="Input dataset labels (FBK Format) to split" type="data" format="fbk-labels"/>
>         </when>
>         <when value="showelab" />
>     </conditional>
>     <param name="split_perc" label="Percentage of Training Set among complete dataset" type="float" min="0.05" max="0.95" value="0.75"/>
> </inputs>
> <outputs>
>     <data format="showelab-dataset" name="trainingDataset" label="Training Dataset extracted from ${input_dataset.name}">
>         <filter>format_condition["format_options"] == "showelab"</filter>
>     </data>
>     <data format="showelab-dataset" name="validationDataset" label="Validation Dataset extracted from ${input_dataset.name}">
>         <filter>format_condition["format_options"] == "showelab"</filter>
>     </data>
>     <data format="fbk-svm-dataset" name="trainingDataset" label="Training Dataset extracted from ${input_dataset.name}">
>         <filter>format_condition["format_options"] == "fbk"</filter>
>     </data>
>     <data format="fbk-labels" name="trainingLabels" label="Training Dataset Labels extracted from ${input_fbk_dataset_labels.name}">
>         <filter>format_condition["format_options"] == "fbk"</filter>
>     </data>
>     <data format="fbk-svm-dataset" name="validationDataset" label="Validation Dataset extracted from ${input_dataset.name}">
>         <filter>format_condition["format_options"] == "fbk"</filter>
>     </data>
>     <data format="fbk-labels" name="validationLabels" label="Validation Dataset Labels extracted from ${input_fbk_dataset_labels.name}">
>         <filter>format_condition["format_options"] == "fbk"</filter>
>     </data>
> </outputs>

You appear to have multiple defined datasets (three versions of trainingDataset) which may be the problem, as the name is meant to be unique. I think you should have ONE data tag for trainingDataset but set this up to switch output formats accordingly.

Peter
Re: [galaxy-dev] Multiple output tools in Workflow
On Thu, Jul 17, 2014 at 2:43 PM, Calogero Zarbo za...@fbk.eu wrote:
> Ok, thanks for the tip. I changed the XML to this:
>
> <outputs>
>     <data format="showelab-dataset" name="trainingDataset" label="Training Dataset extracted from ${input_dataset.name}">
>         <change_format>
>             <when input="format_options" value="fbk" format="fbk-svm-dataset"/>
>         </change_format>
>     </data>
>     <data format="showelab-dataset" name="validationDataset" label="Validation Dataset extracted from ${input_dataset.name}">
>         <change_format>
>             <when input="format_options" value="fbk" format="fbk-svm-dataset"/>
>         </change_format>
>     </data>
>     <data format="fbk-labels" name="trainingLabels" label="Training Dataset Labels extracted from ${input_fbk_dataset_labels.name}">
>         <filter>format_condition['format_options'] == "fbk"</filter>
>     </data>
>     <data format="fbk-labels" name="validationLabels" label="Validation Dataset Labels extracted from ${input_fbk_dataset_labels.name}">
>         <filter>format_condition['format_options'] == "fbk"</filter>
>     </data>
> </outputs>
>
> It is still not working; maybe the culprit is the <when input="format_options" value="fbk" format="fbk-svm-dataset"/> element. Maybe it has a problem because the format_options parameter is inside a conditional tag?
>
> Thanks a lot for your time.

I'm not sure off hand - is your complete wrapper in a public repository somewhere we can look at?

However, my general advice would be: First of all, get it working in the normal tool usage mode (tested by hand). Then I would get it working with functional tests. Finally I would test it by hand in the workflow editor, at which point any problem is probably Galaxy's fault ;)

Peter
Re: [galaxy-dev] writing datatypes
On Thu, Jul 17, 2014 at 4:31 PM, Eric Rasche rasche.e...@yandex.ru wrote:
> For those reading this thread from the future, there's a secret to adding completely new datatypes locally (and not through a toolshed). You have to manually edit lib/galaxy/datatypes/registry.py and import the module you've written at the top of the file. For instance, if you add a new gbk.py datatype, you'll need to add "import gbk" to the top of registry.py. This will cause your errors to go away and your datatype to be loaded on startup.
>
> Thanks to John Chilton for answering this on IRC.
>
> Cheers,
> Eric

Indeed - sorry I hadn't spotted that complication. The README files for these datatype extensions may help:

https://github.com/peterjc/galaxy_blast/tree/master/datatypes/blast_datatypes
https://github.com/peterjc/pico_galaxy/tree/master/datatypes/mira_datatypes

I have to do this manually with some sed magic in my TravisCI automated setup, see:
http://blastedbio.blogspot.co.uk/2013/09/using-travis-ci-for-testing-galaxy-tools.html

Peter
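The manual edit to registry.py described above can also be scripted. The following is only a minimal sketch in Python, not the sed command from the blog post: add_registry_import is a hypothetical helper, and both the placement of the new line (directly after the first plain top-level "import" statement) and the skip-if-present check are assumptions made for illustration.

```python
def add_registry_import(source, module_name):
    """Return `source` with `import <module_name>` added (hypothetical helper).

    `source` is the full text of a Python file such as registry.py.
    """
    new_line = "import %s" % module_name
    lines = source.splitlines()
    # Leave the text untouched if the import is already there.
    if new_line in lines:
        return source
    for index, line in enumerate(lines):
        # Assumption: the file contains at least one plain "import x" line,
        # and adding our import right after it is safe.
        if line.startswith("import "):
            lines.insert(index + 1, new_line)
            return "\n".join(lines) + "\n"
    raise ValueError("no top-level import statement found")
```

One would read registry.py, pass its text and the module name (for example "gbk") to the helper, and write the result back. Running it twice leaves the file unchanged the second time.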
Re: [galaxy-dev] writing datatypes
On Thu, Jul 17, 2014 at 5:45 PM, Björn Grüning bjoern.gruen...@gmail.com wrote:
> Hi,
> I think you are right John. Datatypes have many issues in that regard, as far as I can tell from a few bug reports. Imho datatypes should be handled like Tool dependency definitions. There should be only one installable revision.
>
> But that aside, emboss datatypes are already broken. For example asn1 was added into Galaxy but it still exists in emboss_datatypes. Moreover, how to add a proper genbank datatype with sniffer, split and merge functions? Ideally, every datatype should have its own repository, but that is an overhead I would like to omit ... any other ideas?
>
> I would love to discuss that issue further, maybe a hangout with Greg and Peter?
>
> Thanks John for your input,
> Bjoern

This could be high level, e.g. an "other sequence file formats" repository covering GenBank, EMBL, SwissProt plain text, UniProt XML, etc; one for multiple sequence alignments; one for EMBOSS' own output...

But it wouldn't be that much more work to do one ToolShed repo per additional file format, would it?

One reason I have been meaning to do some of these is familiarity with many of these formats from looking after/writing parsers in Biopython.

Having this done sooner rather than later ought to head off too many incompatible datatype names, which worries me. Is it too late to adopt something like the EDAM ontology for the datatypes within Galaxy?

Peter
Re: [galaxy-dev] writing datatypes
On Thu, Jul 17, 2014 at 5:55 PM, Eric Rasche rasche.e...@yandex.ru wrote:
> Not a problem Peter, it's a somewhat subtle bug to have, and there isn't a lot of documentation on the wiki about writing new datatypes (though I plan to fix that soon). That particular error message could stand to be a bit more explicit. (e.g., "Did you forget to add import mylib to registry.py?").
>
> Also, thanks for sharing the blog post. Since we develop all of our tools internally, I may adapt and publish your post with similar instructions for jenkins, if that's all right by you.
>
> Cheers,
> Eric

Please do :)

Peter

P.S. I know Saket is using this approach too now: https://github.com/saketkc/galaxy_tools
Re: [galaxy-dev] writing datatypes
On Thu, Jul 17, 2014 at 6:10 PM, Björn Grüning bjoern.gruen...@gmail.com wrote:
> ... but the problem will stay the same ... one [datatype definition] repository can have multiple versions ...

I like your idea that, like tool dependency definitions, this should be a special repository type on the ToolShed:

Earlier, Björn Grüning bjoern.gruen...@gmail.com wrote:
> Imho datatypes should be handled like Tool dependency definitions. There should be only one installable revision.

This is something Greg will have to comment on - there may be ramifications I'm not seeing.

Peter
Re: [galaxy-dev] writing datatypes
On Thu, Jul 17, 2014 at 6:28 PM, Eric Rasche rasche.e...@yandex.ru wrote:
> On 17.07.2014 18:51, Peter Cock wrote:
>> One reason I have been meaning to do some of these is familiarity with many of these formats from looking after/writing parsers in Biopython.
>
> Peter, similar case here with BioPerl. All of my tools can output the full range of Bio::SeqIO output formats, so having datatypes would be great. Happy to contribute there.

Sounds good. The EMBOSS, BioPerl and Biopython projects have tried to adopt consistent file format names (pre-dating the EDAM ontology), but unfortunately the names adopted in Galaxy sometimes diverge :(

Peter
Re: [galaxy-dev] writing datatypes
Good point Greg. Let's refine this slightly then: a new special ToolShed repository type for a *single* datatype definition. That avoids this problem :)

(This does not help with suites of very closely related datatypes - like different kinds of BLAST database.)

Peter

On Thu, Jul 17, 2014 at 6:35 PM, Greg Von Kuster g...@bx.psu.edu wrote:
> This would be easy to implement, but could adversely affect reproducibility. If a repository containing datatypes always had only a single installable revision (i.e., the changelog tip), then any datatypes defined in an early changeset revision that are removed in a later changeset revision would no longer be available.
>
> Greg
>
> On Jul 17, 2014, at 1:30 PM, Peter Cock p.j.a.c...@googlemail.com wrote:
>> On Thu, Jul 17, 2014 at 6:10 PM, Björn Grüning bjoern.gruen...@gmail.com wrote:
>>> ... but the problem will stay the same ... one [datatype definition] repository can have multiple versions ...
>>
>> I like your idea that, like tool dependency definitions, this should be a special repository type on the ToolShed:
>>
>> Earlier, Björn Grüning bjoern.gruen...@gmail.com wrote:
>>> Imho datatypes should be handled like Tool dependency definitions. There should be only one installable revision.
>>
>> This is something Greg will have to comment on - there may be ramifications I'm not seeing.
>>
>> Peter
Re: [galaxy-dev] Wiki datatypes tutorial
On Thu, Jul 17, 2014 at 7:19 PM, Eric Rasche rasche.e...@yandex.ru wrote:
> Thanks to everyone for their assistance in my adventure of custom local datatypes. In response to this, I've added a new wiki page with a basic MWE/tutorial on adding datatypes. A complete example is at the end, because most people like copy+paste code to get them started.
>
> https://wiki.galaxyproject.org/Admin/Datatypes/AddingCompleteDatatypes
>
> Please feel free to add to it/fix things I completely misunderstood. I'm not sure what 80% of the functions that get called in datatypes do, nor where they're called from, so I can't offer much more detail in this wiki page than I already have. (E.g., when is split called? If I write a split method, how can I test it? What other methods should I implement?)
>
> Cheers,
> Eric

Hi Eric,

Good work :)

Split and merge are used when a tool has a <parallelism .../> tag and this is enabled in your universe_wsgi.ini file. As an example, see the BLAST wrappers, e.g.
https://github.com/peterjc/galaxy_blast/blob/master/tools/ncbi_blast_plus/ncbi_blastn_wrapper.xml

This will split on the query FASTA file, and merge on the output file (which could be text, html, tabular, blastxml) using the output datatype's merge method.

I had to work out a lot of this from reading the code and queries on the mailing list.

Peter
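To make the split/merge idea concrete, here is a minimal standalone sketch in plain Python. It does not use Galaxy's datatype API: split_fasta and merge_text are hypothetical helper names, and chunking by a fixed number of records is an assumption. It only illustrates the principle of splitting the query by record and recombining the per-chunk outputs.

```python
def split_fasta(text, records_per_chunk):
    """Split FASTA text into chunks of at most `records_per_chunk` records."""
    if records_per_chunk < 1:
        raise ValueError("records_per_chunk must be at least 1")
    records = []
    for line in text.splitlines(True):  # True keeps the line endings
        if line.startswith(">"):
            records.append(line)        # each ">" header starts a new record
        elif records:
            records[-1] += line         # sequence line of the current record
        # anything before the first header is ignored
    return ["".join(records[i:i + records_per_chunk])
            for i in range(0, len(records), records_per_chunk)]


def merge_text(parts):
    """Merge per-chunk outputs of a line-oriented format by concatenation."""
    return "".join(parts)
```

Plain concatenation is only adequate for line-oriented outputs such as tabular files. A structured format such as BLAST XML would presumably need a format-aware merge, which is why the merge step is delegated to the output datatype.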
Re: [galaxy-dev] datatype dependencies
You could do something like that, and we already have Biopython packages in the ToolShed which can be listed as dependencies :)

However, some things like GenBank are tricky - in order to tolerate NCBI dumps the Biopython parser will ignore any free text before the first LOCUS line. A confusing side effect is that most text files are then treated as a GenBank file with zero records. But if it came back with some records it is probably OK :)

Basically Biopython also does not care to offer file format detection simply because it is a can of worms. Zen of Python - explicit is better than implicit. We want you to tell us which format you want to try parsing it as.

Sorry,

Peter
(Speaking as the Bio.SeqIO maintainer for Biopython)

On Thu, Jul 17, 2014 at 7:45 PM, Eric Rasche rasche.e...@yandex.ru wrote:
> Let's pretend for a second that I'm rather lazy (oh...wait), and I have ZERO interest in writing datatype parsers to sniff and validate whether or not a specific file is a specific datatype. I'm a sysadmin and bioinformatician, and I've worked with dozens of libraries that exist to parse file formats, and they all die in flames when I feed them bad data.
>
> Would it be possible to somehow define requirements for datatypes? I don't want to take on the burden of code I write saying "yes, I've sniffed+validated this and it is absolutely a genbank file". That's a lot of responsibility, especially if people have malformed genbank files and their tools fail as a result.
>
> I would like to do this with BioPython and turf the validation to another library that exists to parse genbank files, and that will raise an exception if they're invalid.
>
>     # Methods of the datatype class
>     def sniff(self, filename):
>         from Bio import SeqIO
>         try:
>             self.records = list(SeqIO.parse(filename, "genbank"))
>             return True
>         except Exception:
>             self.records = None
>             return False
>
>     def validate(self, dataset):
>         from Bio import SeqIO
>         errors = list()
>         try:
>             self.records = list(SeqIO.parse(dataset.file_name, "genbank"))
>         except Exception as e:
>             errors.append(e)
>         return errors
>
>     def set_meta(self, dataset, **kwd):
>         if self.records is not None:
>             dataset.metadata.number_of_sequences = len(self.records)
>
> So much easier! And I can shift the burden of validation and sniffing to upstream, rather than any failures being my fault and requiring maintenance of a complex sniffer.
>
> Cheers,
> Eric
>
> --
> Eric Rasche
> Programmer II
> Center for Phage Technology
> Texas A&M University
> College Station, TX 77843
> 404-692-2048
> e...@tamu.edu
> rasche.e...@yandex.ru
Re: [galaxy-dev] datatype dependencies
On Thu, Jul 17, 2014 at 8:20 PM, Eric Rasche rasche.e...@yandex.ru wrote:
> On 07/17/2014 02:11 PM, Peter Cock wrote:
>> You could do something like that, and we already have Biopython packages in the ToolShed which can be listed as dependencies :)
>
> If my module depends on the biopython from the toolshed, will that be accessible within a datatype? Would it be as simple as "from Bio import X"? Most of what I've seen of dependencies (and please forgive my lack of knowledge about them) consists of env.sh being sourced with paths to binaries, prior to tool run.

I don't know - this may well be a gap in the ToolShed framework, since thus far most of the datatypes defined have been self contained.

I have asked something similar before (in the context of defining automatic file format conversion like the way Galaxy can turn FASTA into tabular in input parameters expecting tabular), where there could be a binary dependency.

>> However, some things like GenBank are tricky - in order to tolerate NCBI dumps the Biopython parser will ignore any free text before the first LOCUS line. A confusing side effect is most text files are then treated as a GenBank file with zero records. But if it came back with some records it is probably OK :)
>
> Interesting, very good to know.
>
>> Basically Biopython also does not care to offer file format detection simply because it is a can of worms. Zen of Python - explicit is better than implicit. We want you to tell us which format you want to try parsing it as.
>
> Yes! Exactly! Which is why it's perfectly fine here:
>
>     SeqIO.parse( dataset.file_name, "genbank" )
>
> All I want to know is whether or not this parses as a genbank file (and has 1 or more records). BioPython may not do automatic format detection (yuck, agreed), but since I already know I'm looking for a genbank file, simply being able to parse it or not is good enough.

With those provisos, you should be OK :)

Peter
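The two provisos discussed in this thread (free text before the first LOCUS line is tolerated, and a file yielding zero records should not count as a match) can be illustrated without Biopython. This is a rough standalone sketch, not Biopython's parser and not Galaxy's sniffer interface: count_genbank_records and looks_like_genbank are hypothetical names, and pairing LOCUS lines with "//" terminator lines is a deliberately crude stand-in for real parsing.

```python
def count_genbank_records(lines):
    """Count records that start with a LOCUS line and end with a '//' line."""
    count = 0
    in_record = False
    for line in lines:
        if line.startswith("LOCUS"):
            in_record = True           # text before the first LOCUS is skipped
        elif in_record and line.rstrip() == "//":
            count += 1                 # the terminator closes the record
            in_record = False
    return count


def looks_like_genbank(filename):
    """Return True only if at least one complete record is found."""
    with open(filename) as handle:
        return count_genbank_records(handle) >= 1
```

An arbitrary text file gives a count of zero and is therefore rejected, which is the check that a bare "did it parse without an exception" test would miss.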
Re: [galaxy-dev] writing datatypes
Indeed - ideally (once working) we can upload under the IUC ToolShed as a community maintained resource rather than under a personal account which becomes a single point of failure (the bus factor).

We (the IUC) have previously discussed doing this so that the EMBOSS datatypes could become more of a meta-entry depending on other smaller specific datatype defining ToolShed repositories. But it hasn't reached the top of my personal TODO list yet ;)

Peter

On Wed, Jul 16, 2014 at 1:47 PM, Björn Grüning bjoern.gruen...@gmail.com wrote:
> Hi Eric,
> please have a look at:
> https://github.com/bgruening/galaxytools/blob/master/datatypes/msa_datatypes/datatypes_conf.xml
>
> You need something like:
>
>     <datatype extension="genbank" type="galaxy.datatypes.data:Text" subclass="True" />
>
> Let's try to split the EMBOSS datatypes a little bit into small chunks, e.g. sequences_datatypes, msa_datatypes ... and so on ...
>
> Cheers,
> Bjoern
>
> On 14.07.2014 20:31, Eric Rasche wrote:
>> I'm trying to add a new datatype to my galaxy instance for genbank files, however I'm running into various issues.
>>
>> I've followed the tutorial (https://wiki.galaxyproject.org/Admin/Datatypes/Adding%20Datatypes) however that example subclasses tabular, and I'd like to subclass Text as they're plain text files, and I'd like to be able to define a sniffer for them (not possible if your type="galaxy.datatypes.data:Text").
>>
>> I figured the call ought to be something like
>>
>>     <datatype extension="gb" type="galaxy.datatypes.data:Genbank" subclass="True" />
>>
>> however, everything I try fails with
>>
>>     Error importing datatype module galaxy.datatypes.data: 'module' object has no attribute 'Genbank'
>>
>> To avoid this particular issue, I tried writing a separate datatype just for genbank files (type="galaxy.datatypes.genbank:Genbank"), however that fails with the same error:
>>
>>     galaxy.datatypes.registry ERROR 2014-07-14 13:23:23,100 Error importing datatype module galaxy.datatypes.genbank: 'module' object has no attribute 'genbank'
>>     Traceback (most recent call last):
>>       File "/home/hxr/work/galaxy-central/lib/galaxy/datatypes/registry.py", line 206, in load_datatypes
>>         module = getattr( module, mod )
>>     AttributeError: 'module' object has no attribute 'genbank'
>>
>> Here's what my lib/galaxy/datatypes/genbank.py looks like:
>>
>>     import pkg_resources
>>     pkg_resources.require( "bx-python" )
>>     import logging
>>     from galaxy.datatypes import data
>>
>>     log = logging.getLogger(__name__)
>>
>>     class Genbank( data.Text ):
>>         file_ext = "gb"
>>
>>         def sniff( self, filename ):
>>             # A GenBank record starts with the keyword LOCUS
>>             with open(filename) as handle:
>>                 header = handle.read(5)
>>             return header == 'LOCUS'
>>
>> To debug this, I've tried copying the tabular data type completely, removed all the classes other than Tabular, and renamed it Genbank, however this fails too with the same error.
>>
>> Can anyone offer some insight?
>>
>> Cheers,
>> Eric
Re: [galaxy-dev] Multiple output tools in Workflow
On Wed, Jul 16, 2014 at 5:28 PM, Calogero Zarbo za...@fbk.eu wrote:
> Hello everybody,
> I'm developing my own tool that needs to switch the number of output files according to a parameter selected by the user from a list in the inputs tag. How can I do such a thing?
>
> Here is the XML code:
>
> <inputs>
>     <param name="input_dataset" label="Input dataset" type="data" format="showelab-dataset,fbk-svm-dataset"/>
>     <conditional name="format_condition">
>         <param name="format_options" label="Choose the type of dataset you want to split" type="select">
>             <option value="fbk">.fbk-svm-dataset Format</option>
>             <option value="showelab" selected="True">.showelab-dataset Format</option>
>         </param>
>         <when value="fbk">
>             <param name="input_fbk_dataset_labels" label="Input dataset labels" type="data" format="fbk-labels"/>
>         </when>
>     </conditional>
> </inputs>
> <outputs>
>     <data format="showelab-dataset" name="trainingDataset" label="Training Dataset extracted from ${input_dataset.name}">
>         <filter>format_options == "showelab"</filter>
>     </data>
>     <data format="showelab-dataset" name="validationDataset" label="Validation Dataset extracted from ${input_dataset.name}">
>         <filter>format_options == "showelab"</filter>
>     </data>
>     <data format="fbk-svm-dataset" name="trainingDataset" label="Training Dataset extracted from ${input_dataset.name}">
>         <filter>format_options == "fbk"</filter>
>     </data>
>     <data format="fbk-labels" name="trainingLabels" label="Training Dataset Labels extracted from ${input_fbk_dataset_labels.name}">
>         <filter>format_options == "fbk"</filter>
>     </data>
>     <data format="fbk-svm-dataset" name="validationDataset" label="Validation Dataset extracted from ${input_dataset.name}">
>         <filter>format_options == "fbk"</filter>
>     </data>
>     <data format="fbk-labels" name="validationLabels" label="Validation Dataset Labels extracted from ${input_fbk_dataset_labels.name}">
>         <filter>format_options == "fbk"</filter>
>     </data>
> </outputs>
>
> Basically I would like the outputs displayed in the Workflow Canvas to change according to the format_options select parameter.
>
> Thanks in advance.

Hi Calogero,

I think this tool of mine would serve as a working example:
https://github.com/peterjc/pico_galaxy/tree/master/tools/seq_filter_by_id

Peter
Re: [galaxy-dev] API v/s twill based testing
On Wed, Jul 16, 2014 at 7:44 PM, Saket Choudhary sake...@gmail.com wrote:
> Thanks Peter, I guess I should then rely on API based tests.

If it is just the order, make sure the order of the output files in the test is consistent with that in the outputs and it may be OK with Twill...

I wonder if I filed a Trello card on this, or just an email?

Peter
Re: [galaxy-dev] writing datatypes
Hi Eric,

There is already a genbank format in the EMBOSS datatypes (although there is talk of defining this and others in a set of smaller repositories defined as its dependencies for more modularity). Note it uses genbank not gb as the name!

https://toolshed.g2.bx.psu.edu/view/devteam/emboss_datatypes

However that doesn't answer your question :(

Peter

On Mon, Jul 14, 2014 at 7:31 PM, Eric Rasche rasche.e...@yandex.ru wrote:
> I'm trying to add a new datatype to my galaxy instance for genbank files, however I'm running into various issues.
>
> I've followed the tutorial (https://wiki.galaxyproject.org/Admin/Datatypes/Adding%20Datatypes) however that example subclasses tabular, and I'd like to subclass Text as they're plain text files, and I'd like to be able to define a sniffer for them (not possible if your type="galaxy.datatypes.data:Text").
>
> I figured the call ought to be something like
>
>     <datatype extension="gb" type="galaxy.datatypes.data:Genbank" subclass="True" />
>
> however, everything I try fails with
>
>     Error importing datatype module galaxy.datatypes.data: 'module' object has no attribute 'Genbank'
>
> To avoid this particular issue, I tried writing a separate datatype just for genbank files (type="galaxy.datatypes.genbank:Genbank"), however that fails with the same error:
>
>     galaxy.datatypes.registry ERROR 2014-07-14 13:23:23,100 Error importing datatype module galaxy.datatypes.genbank: 'module' object has no attribute 'genbank'
>     Traceback (most recent call last):
>       File "/home/hxr/work/galaxy-central/lib/galaxy/datatypes/registry.py", line 206, in load_datatypes
>         module = getattr( module, mod )
>     AttributeError: 'module' object has no attribute 'genbank'
>
> Here's what my lib/galaxy/datatypes/genbank.py looks like:
>
>     import pkg_resources
>     pkg_resources.require( "bx-python" )
>     import logging
>     from galaxy.datatypes import data
>
>     log = logging.getLogger(__name__)
>
>     class Genbank( data.Text ):
>         file_ext = "gb"
>
>         def sniff( self, filename ):
>             # A GenBank record starts with the keyword LOCUS
>             with open(filename) as handle:
>                 header = handle.read(5)
>             return header == 'LOCUS'
>
> To debug this, I've tried copying the tabular data type completely, removed all the classes other than Tabular, and renamed it Genbank, however this fails too with the same error.
>
> Can anyone offer some insight?
>
> Cheers,
> Eric
Re: [galaxy-dev] API v/s twill based testing
Hi Saket,

From memory the Twill tests are fragile with respect to the order of the output files in the XML. John was discussing switching the default from the Twill to the API backend, not sure when that is happening though...

Peter

On Tue, Jul 15, 2014 at 9:31 AM, Saket Choudhary sake...@gmail.com wrote:
> I recently updated tests for one of my wrappers and came across this strange behaviour:
>
> The Twill based testing reports a failure: https://travis-ci.org/saketkc/galaxy_tools/jobs/29956682#L1463
> whereas the API based testing shows success: https://travis-ci.org/saketkc/galaxy_tools/jobs/29956683
>
> Unfortunately I cannot run these tests locally since I am behind a system proxy [Refer: http://dev.list.galaxyproject.org/Functional-Tests-and-ftype-td4664233.html] and have to rely on Travis.
>
> The place where the Twill test fails shows that it is trying to diff 'chasm_output_genes.tabular' against 'chasm_output_variants.tabular' instead of against 'chasm_output_genes.tabular'. [See: https://travis-ci.org/saketkc/galaxy_tools/jobs/29956682#L1469]
>
> I tried running my tools locally and I did not come across any case where the 'variants' output gets replaced by the 'genes' output, thus possibly ruling out unexpected behavior from the tool's server end.
>
> Is this a possible bug or am I missing something?
>
> Saket
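To illustrate the kind of fragility Peter describes, here is a small sketch of two ways a test harness could pair reference files with tool outputs. This is not Galaxy's or Twill's actual code; the output names "genes" and "variants" and both function names are made up for the example. Pairing by position breaks as soon as the outputs appear in a different order from the test XML, while pairing by name does not:

```python
def pair_by_position(expected, produced):
    """Pair the i-th reference file with the i-th produced output name."""
    return [(ref, out) for (_, ref), out in zip(expected, produced)]


def pair_by_name(expected, produced):
    """Pair each reference file with the produced output of the same name."""
    available = set(produced)
    return [(ref, name) for name, ref in expected if name in available]


# Expected outputs in the order they are listed in the (hypothetical) test XML.
EXPECTED = [
    ("genes", "chasm_output_genes.tabular"),
    ("variants", "chasm_output_variants.tabular"),
]
# Output names in the order the harness happens to see them.
PRODUCED = ["variants", "genes"]
```

With these inputs, `pair_by_position` compares the genes reference against the variants output, which matches the symptom in the Travis log, whereas `pair_by_name` keeps each reference with its own output.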
Re: [galaxy-dev] Escaped text from input fields
Hi Renato,

Yes, Galaxy maps potentially problematic/dangerous characters to escaped versions. You can control this, see sanitizer on:

https://wiki.galaxyproject.org/Admin/Tools/ToolConfigSyntax

Peter

On Sat, Jul 5, 2014 at 9:04 AM, Renato Alves rjal...@igc.gulbenkian.pt wrote:
> Hi everyone,
>
> I'm writing a wrapper that includes one text field. I'm then passing the contents of this field to the underlying tool with a simple:
>
>     command $textfield
>
> However when a user inputs something like Test#1 the command ends up as:
>
>     command Test__pd__1
>
> I did a quick search on the web and it seems to be the name of some escaping function in Galaxy. Is there any way I can get the text field content across untouched?
>
> Thanks,
> Renato
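The supported route is the sanitizer setting in the tool XML mentioned above. Purely to show what the escaping looks like, here is a sketch of reversing it in a wrapper script. Only the `#` to `__pd__` pair is confirmed by this thread; the other table entries are assumptions about Galaxy's default mapping and should be checked against your Galaxy version. `ESCAPES` and `unescape` are hypothetical names:

```python
# Only '__pd__' -> '#' is confirmed by the thread above; the remaining
# entries are assumed and should be verified before relying on them.
ESCAPES = {
    "__pd__": "#",
    "__at__": "@",
    "__gt__": ">",
    "__lt__": "<",
}


def unescape(text):
    """Replace each escaped token in the text with its original character."""
    for token, char in ESCAPES.items():
        text = text.replace(token, char)
    return text
```

Undoing the escaping puts the potentially dangerous characters back on the command line, so quoting the value properly in the command remains the wrapper author's job.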
Re: [galaxy-dev] Per-tool configuration
On Fri, Jun 27, 2014 at 3:13 PM, John Chilton jmchil...@gmail.com wrote:
> On Fri, Jun 27, 2014 at 5:16 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
>> On Wed, Jun 18, 2014 at 12:14 PM, Peter Cock p.j.a.c...@googlemail.com wrote:
>>> John - that Trello issue you logged, https://trello.com/c/0XQXVhRz
>>> Generic infrastructure to let deployers specify limits for tools based on input metadata (number of sequences, file size, etc...)
>>
>> Would it be fair to say this is not likely to be implemented in the near future? i.e. Should we consider implementing the BLAST query limit approach as a short term hack?
>
> It would be good functionality - but I don't foresee myself or anyone on the core team getting to it in the next six months say.
>
> ...
>
> I am now angry with myself though because I realized that dynamic job destinations are a better way to implement this in the meantime (that environment stuff was very fresh when I responded so I think I just jumped there). You can build a flexible infrastructure locally that is largely decoupled from the tools and that may (?) work around the task splitting problem Peter brought up.
>
> Outline of the idea: snip

Hi John,

So the idea is to define a dynamic job mapper which checks the query input size, and if it is too big raises an error, and otherwise passes the job to the configured job handler (e.g. SGE cluster). See https://wiki.galaxyproject.org/Admin/Config/Jobs

It sounds like this ought to be possible right now, but you are suggesting that, since this seems quite a general use case, the code to help build a dynamic mapper using things like file size (in bytes or number of sequences) could be added to Galaxy?

This approach would need the Galaxy Admin to set up a custom job mapper for BLAST (which knows to look at the query file), but it taps into an existing Galaxy framework. By providing a reference implementation this ought to be fairly easy to set up, and can be extended to be more clever about the limits. e.g. For BLAST, we should consider both the number (and length) of the queries, plus the size of the database.

Regards,

Peter
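The core of such a mapper can be sketched without any Galaxy imports. This is a minimal illustration under stated assumptions, not a reference implementation: `QueryTooLarge`, `count_fasta_records`, `choose_destination`, the 20,000 default and the destination id "sge_cluster" are all made up here. In a real dynamic rule the error would presumably be raised with Galaxy's own job mapping exception (I believe `galaxy.jobs.mapper.JobMappingException`, but check the wiki page above), and the rule would be handed the job and tool rather than a list of lines:

```python
class QueryTooLarge(Exception):
    """Stand-in for the exception a real Galaxy dynamic rule would raise."""


def count_fasta_records(lines):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


def choose_destination(lines, max_queries=20000, destination="sge_cluster"):
    """Return a destination id, or raise if there are too many queries.

    `lines` is any iterable of text lines, e.g. an open FASTA file handle.
    """
    n = count_fasta_records(lines)
    if n > max_queries:
        raise QueryTooLarge(
            "Query has %i sequences, the limit is %i" % (n, max_queries)
        )
    return destination
```

Passing an open file handle as `lines` streams the query once, so counting stays cheap even for large FASTA files. Sequence length or database size could be folded into the same decision.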
Re: [galaxy-dev] Control on versioning in toolshed tools
Hi Eric,

Despite the fact that internal hg repositories are used, the idea is NOT to use them as development repositories - but ONLY to push releases to the ToolShed.

In the interests of reproducibility (other people might use your ToolShed entry in a workflow, or as a dependency), you should not be able to ever rewrite history or delete commits - something you can do with a git or hg repository but should generally avoid. i.e. Being allowed to clean up and start again is blocked by the Galaxy goal of reproducibility.

I personally prefer git to hg, and therefore use that for development tracking of my own ToolShed releases - but if you like hg then I would suggest using a BitBucket.org hosted hg repository for developing your tool.

You can see examples here - many of these tools do have explicit dependencies on other tools/packages in the ToolShed (either my own, or from 3rd parties):

https://github.com/peterjc/galaxy_blast
https://github.com/peterjc/pico_galaxy

Regards,

Peter

On Tue, Jun 24, 2014 at 1:15 PM, Eric Kuyt eric.ku...@wur.nl wrote:
> Hi All,
>
> I am playing around with putting a tool in the testtoolshed. Now when changes to dependency versions are detected, the toolshed detects a new version and a dropdown is created, but sometimes I do not want this behavior, for example when the first version was erroneous.
>
> I tried hg strip on the repository and pushing it back to the testtoolshed, but sadly it didn't result in a clean repository but a multi-headed mess.
>
> Is there a way to clean up the remote repository and start over? And what would be a cleaner way to develop tools on a toolshed while still making use of repository dependencies?
>
> Thanks,
> Eric
Re: [galaxy-dev] Fwd: specifying default file by name in workflows
Hi Evan,

Assuming you are talking about an input file from:

    <param type="data" ... />

I don't think you can set a default - the possible files will depend on the current history, so could be zero, one or many files. Also, how would you specify a specific file? They would have Galaxy assigned *.dat filenames on disk, while the names could have been set to anything by the user.

My guess is you may be better off defining a new datatype (a subclass of txt perhaps?).

Regards,

Peter

On Wed, Jun 18, 2014 at 10:28 PM, Evan Bollig boll0...@umn.edu wrote:
> To clarify, I want to specify the default selected file name for an Input Dataset block in the workflow, but I'd like to keep the option open to select other input names with the same type.
>
> When I specify a file type for a tool's input, it is not enough. The input dataset can end up finding dozens of files that are txt for example.
>
> I want to know all possible tool_state annotations for Input Dataset (tool_id: null). Where would I find this?
>
> Cheers,
> -Evan Bollig
> Research Associate | Application Developer | User Support Consultant
> Minnesota Supercomputing Institute
> 599 Walter Library
> 612 624 1447
> e...@msi.umn.edu
> boll0...@umn.edu
>
> -- Forwarded message --
> From: Evan Bollig boll0...@umn.edu
> Date: Wed, Jun 18, 2014 at 11:19 AM
> Subject: specifying default file by name in workflows
> To: galaxy-...@bx.psu.edu galaxy-...@bx.psu.edu
>
> I don't know all the subtleties of the Galaxy workflow syntax. My goal is to specify the default input file names for a number of tools in a workflow. Is this possible, or am I limited to only providing the file type or extension? If possible, can you provide an example?
>
> Thanks,
> -Evan Bollig
Re: [galaxy-dev] Per-tool configuration
On Wed, Jun 18, 2014 at 12:04 PM, Jan Kanis jan.c...@jankanis.nl wrote:
> I am not using job splitting, because I am implementing this for a client with a small (one machine) Galaxy setup.

Ah - this also explains why a job size limit is important for you.

> Implementing a query limit feature in Galaxy core would probably be the best idea, but that would also probably require an admin screen to edit those limits, and I don't think I can sell the required time to my boss under the contract we have with the client.

The wrapper script idea I outlined to you earlier would be the least invasive (although it might cause trouble if BLAST is run at the command line outside Galaxy), while your idea of inserting the check script into the Galaxy Tool XML just before running BLAST itself should also work well.

> I gave a quick try before on making the blast2html tool run in both python 2.6 and 3, but I gave up due to too many encoding issues. The client's machine has python 2.6. Maybe I should have another look.
>
> Jan

It gets easier with practice - a mixture of little syntax things, and the big pain about bytes versus unicode (and thus encodings, and raw versus text mode for file handles).

Peter
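As a small illustration of the bytes versus unicode point, here is one common pattern for code meant to run unmodified on Python 2.6, 2.7 and 3. It is a sketch only, not taken from blast2html; `read_text` and `to_bytes` are hypothetical helper names, and UTF-8 is an assumed default encoding:

```python
import io


def read_text(path, encoding="utf-8"):
    """Read a text file as unicode on both Python 2.6+ and Python 3."""
    # io.open behaves like Python 3's built-in open, also on Python 2.6/2.7,
    # so the result is always unicode text rather than raw bytes.
    with io.open(path, "r", encoding=encoding) as handle:
        return handle.read()


def to_bytes(text, encoding="utf-8"):
    """Encode unicode text to bytes; pass existing bytes through unchanged."""
    if isinstance(text, bytes):
        return text
    return text.encode(encoding)
```

The idea is to decode once at the boundary where data is read and encode once where it is written, keeping everything in between as unicode text.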
Re: [galaxy-dev] Gzipped input to functional tests with multiple=true
I've filed this bug in the Twill test framework on Trello:

https://trello.com/c/XG3KemZE/1732-gzipped-input-to-twill-functional-tests-fails-with-multiple-true

Peter

On Fri, Jun 13, 2014 at 10:12 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
> Hi all,
>
> I think I've found a bug in the Galaxy test framework :(
>
> With most file inputs, a gzipped input file works fine (Galaxy's upload code handles unzipping it). However, with multiple="true" this seems to break (with the Twill backend, the API test framework is OK), e.g.
>
>     <param name="filenames" type="data" format="fastq,mira" multiple="true" required="true" label="Read file(s)" help="Multiple files allowed, for example paired reads can be given as two files (MIRA looks at read names to identify pairs)." />
>
> Fails:
>
>     <param name="filenames" value="SRR639755_mito_pairs_sample.fastq.gz" ftype="fastqsanger" />
>
> e.g. https://travis-ci.org/peterjc/pico_galaxy/builds/27426318
>
> Excerpt from log:
>
>     ==
>     FAIL: test_tool_00 (functional.test_toolbox.TestForTool_mira_4_0_de_novo)
>     MIRA v4.0 de novo assember ( mira_4_0_de_novo ) Test-1
>     --
>     Traceback (most recent call last):
>       File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 108, in test_tool
>         self.do_it( td )
>       File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 32, in do_it
>         data_list = galaxy_interactor.run_tool( testdef, test_history )
>       File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/interactor.py", line 449, in run_tool
>         self.twill_test_case.run_tool( testdef.tool.id, repeat_name=repeat_name, **page_inputs )
>       File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/twilltestcase.py", line 1789, in run_tool
>         self.submit_form( **kwd )
>       File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/twilltestcase.py", line 1999, in submit_form
>         raise AssertionError( errmsg )
>     AssertionError: Attempting to set field 'read_group_0|filenames' to value '['SRR639755_mito_pairs_sample.fastq.gz']' in form 'tool_form' threw exception: id=None name=None label='SRR639755_mito_pairs_sample.fastq.gz'
>     control: SelectControl(read_group_0|filenames=[80])
>     If the above control is a DataToolparameter whose data type class does not include a sniff() method, make sure to include a proper 'ftype' attribute to the tag for the control within the test tag set.
>
> This works:
>
>     <param name="filenames" value="SRR639755_mito_pairs_sample.fastq" ftype="fastqsanger" />
>
> e.g. https://travis-ci.org/peterjc/pico_galaxy/builds/27426336
>
> See: https://github.com/peterjc/pico_galaxy/commit/e6967767535ca29debcdc19d7f0502d73276b6a0
>
> Regards,
>
> Peter
Re: [galaxy-dev] Per-tool configuration
On Tue, Jun 17, 2014 at 4:57 PM, Jan Kanis jan.c...@jankanis.nl wrote:
> Too bad there aren't any really good options. I will use the environment variable approach for the query size limit.

Are you using the optional job splitting (parallelism) feature in Galaxy? That seems to me to be a good place to insert a Galaxy level job size limit. e.g. BLAST+ jobs are split into 1000 query chunks, so you might wish to impose a 25 chunk limit?

Long term, being able to set limits on the input file parameters of each tool would be nicer - e.g. limit BLASTN to at most 20,000 queries, limit MIRA to at most 50GB FASTQ files, etc.

> For the GenBank links I guess modifying the .loc file is the least bad way. Maybe it can be merged into galaxy_blast, that would at least solve the interoperability problems.

It would have to be sufficiently general, and backward compatible. FYI other people have also looked at extending the blast *.loc files (e.g. adding a category column for helping filter down a very large BLAST database list).

> @Peter: One potential problem in merging my blast2html tool could be that I have written it in python3, and the current tool wrapper therefore installs python3 and a host of its dependencies, making for a quite large download.

Without seeing your code, it is hard to say, but actually writing Python code which works unmodified under Python 2.7 and Python 3 is quite doable (and under Python 2.6 with a few more provisos). Both NumPy and Biopython do this if you wanted some reassurance.

On the other hand, Galaxy itself will need to move to Python 3 at some point, and certainly individual tools will too. This will probably mean (as with Linux Python packages) having double entries on the ToolShed (one for Python 2, one for Python 3), e.g. a ToolShed package for NumPy under Python 2 (done) and under Python 3 (needed).

Peter
Re: [galaxy-dev] Per-tool configuration
On Mon, Jun 16, 2014 at 4:18 AM, John Chilton jmchil...@gmail.com wrote:
> Hello Jan,
>
> Thanks for the clarification. Not quite what I was expecting so I am glad I asked - I don't have great answers for either case so hopefully other people will have some ideas.
>
> For the first use case - I would just specify some default input to supply to the input wrapper - let's call this N - add a parameter to the tool wrapper --limit-size=N - test that and then allow it to be overridden via an environment variable - so in your command block use --limit-size=\${BLAST_QUERY_LIMIT:-N}. This will use N if no limit is set, but deployers can set limits. There are a number of ways to set such variables - DRM specific environment files, login rc files, etc. Just this last release I added the ability to define environment variables right in job_conf.xml (https://bitbucket.org/galaxy/galaxy-central/pull-request/378/allow-specification-of-environment/diff). I thought the tool shed might have a way to collect such definitions as well and insert them into package files - but Google failed to find this for me.

Hmm. Jan emailed me off list earlier about this. We could insert a pre-BLAST script to check the size of the query FASTA file, and abort if it is too large (e.g. number of queries, total sequence length, perhaps scaled according to the database size if we want to get clever?).

I was hoping there was a more general mechanism in Galaxy - after all, BLAST is by no means the only computationally expensive tool ;)

We have had query files of 20,000 and more genes against NR (both BLASTP and BLASTX), but our Galaxy has task-splitting enabled so this becomes 20 (or more) individual cluster jobs of 1000 queries each. This works fine apart from the occasional glitch with the network drive when the data is merged afterwards. (We know this failed once shortly after the underlying storage had been expanded, and would have been under heavy load rebalancing the data across the new disks.)

> Not sure about how to proceed with the second use case - extending the .loc file should work locally - I am not sure it is feasible within the context of the existing tool shed tools, data manager, etc. You could certainly duplicate this stuff with your modifications - this has downsides in terms of interoperability though.

Currently the BLAST wrappers use the *.loc files directly, but this is likely to switch to the newer Data Manager approach. That may or may not complicate local modifications like adding extra columns...

> Sorry I don't have great answers for either question,
>
> -John

Thanks John,

Peter
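The pre-BLAST check could look something like the following sketch. The variable name BLAST_QUERY_LIMIT comes from John's suggestion above; everything else is an assumption made for illustration: the helper names `query_limit` and `check_query`, the default of 1000 (standing in for "N"), and the choice to report both the number of queries and the total sequence length:

```python
import os


def query_limit(default=1000, environ=None):
    """Return the limit from BLAST_QUERY_LIMIT, or the default if unset."""
    environ = os.environ if environ is None else environ
    value = environ.get("BLAST_QUERY_LIMIT", "")
    return int(value) if value.strip() else default


def check_query(lines, limit):
    """Return (number of queries, total sequence length, within_limit).

    `lines` is any iterable of FASTA lines, e.g. an open file handle.
    """
    count = 0
    total_length = 0
    for line in lines:
        if line.startswith(">"):
            count += 1
        else:
            total_length += len(line.strip())
    return count, total_length, count <= limit
```

A wrapper script would call `check_query` on the query file and exit with a non-zero status and a clear message when the last element is False, so that Galaxy reports the job as failed before BLAST starts.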
Re: [galaxy-dev] What is the correct place under Galaxy for a database that's created by a tool?
Hi Melissa,

Galaxy expects history datasets to be read only, so the best option (in terms of this data model) might be a (read only) SQLite database (since it is just a single file on disk). They could have multiple such databases in their history or histories.

If you want the user to have just one database and update it, then things are rather different... I'll let one of the Galaxy team comment.

Peter

On Tue, Jun 17, 2014 at 12:07 AM, Melissa Cline cl...@soe.ucsc.edu wrote:
> Hi folks,
>
> Hopefully this is a quick question. I'm working on a set of tools that will fire off a VM from within Galaxy and will then communicate with the VM. The VM will create a local database. The vision is that this won't be a shared database; in a shared Galaxy instance, each user will have his or her own database. What is the best place to create this database under the Galaxy file system?
>
> Thanks!
>
> Melissa
[galaxy-dev] Gzipped input to functional tests with multiple=true
Hi all,

I think I've found a bug in the Galaxy test framework :(

With most file inputs, a gzipped input file works fine (Galaxy's upload code handles unzipping it). However, with multiple="true" this seems to break (with the Twill backend, the API test framework is OK), e.g.

    <param name="filenames" type="data" format="fastq,mira" multiple="true" required="true" label="Read file(s)" help="Multiple files allowed, for example paired reads can be given as two files (MIRA looks at read names to identify pairs)." />

Fails:

    <param name="filenames" value="SRR639755_mito_pairs_sample.fastq.gz" ftype="fastqsanger" />

e.g. https://travis-ci.org/peterjc/pico_galaxy/builds/27426318

Excerpt from log:

    ==
    FAIL: test_tool_00 (functional.test_toolbox.TestForTool_mira_4_0_de_novo)
    MIRA v4.0 de novo assember ( mira_4_0_de_novo ) Test-1
    --
    Traceback (most recent call last):
      File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 108, in test_tool
        self.do_it( td )
      File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 32, in do_it
        data_list = galaxy_interactor.run_tool( testdef, test_history )
      File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/interactor.py", line 449, in run_tool
        self.twill_test_case.run_tool( testdef.tool.id, repeat_name=repeat_name, **page_inputs )
      File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/twilltestcase.py", line 1789, in run_tool
        self.submit_form( **kwd )
      File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/twilltestcase.py", line 1999, in submit_form
        raise AssertionError( errmsg )
    AssertionError: Attempting to set field 'read_group_0|filenames' to value '['SRR639755_mito_pairs_sample.fastq.gz']' in form 'tool_form' threw exception: id=None name=None label='SRR639755_mito_pairs_sample.fastq.gz'
    control: SelectControl(read_group_0|filenames=[80])
    If the above control is a DataToolparameter whose data type class does not include a sniff() method, make sure to include a proper 'ftype' attribute to the tag for the control within the test tag set.

This works:

    <param name="filenames" value="SRR639755_mito_pairs_sample.fastq" ftype="fastqsanger" />

e.g. https://travis-ci.org/peterjc/pico_galaxy/builds/27426336

See: https://github.com/peterjc/pico_galaxy/commit/e6967767535ca29debcdc19d7f0502d73276b6a0

Regards,

Peter
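The workaround used in the commit above was to ship the test file uncompressed. For anyone who wants to keep compressed copies in their own repository and unpack them before running the tests, here is a sketch of detecting and decompressing gzip data by its magic bytes. This is not Galaxy's upload code; `is_gzipped` and `maybe_decompress` are hypothetical helpers operating on bytes already read from the file:

```python
import gzip
import io

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(data):
    """Return True if the bytes start with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def maybe_decompress(data):
    """Return decompressed bytes for gzip input, or the input unchanged."""
    if is_gzipped(data):
        with gzip.GzipFile(fileobj=io.BytesIO(data)) as handle:
            return handle.read()
    return data
```

Running the test data through `maybe_decompress` once, and pointing the test's value attribute at the plain file, sidesteps the Twill issue until the bug filed on Trello is fixed.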