Re: [galaxy-dev] Nothing being tested on Test and main Tool Shed?

2014-11-19 Thread Peter Cock
On Thu, Nov 6, 2014 at 11:08 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 Thanks Dave,

 The good news is yes, the tests are running again on the
 Test Tool Shed (although not the main Tool Shed yet), and
 many of my tools now have successful test results from
 last night.

 e.g. My new basic mummer tool which now has a full set
 of dependency packages thanks to Bjoern:
 https://testtoolshed.g2.bx.psu.edu/view/peterjc/mummer

 The bad news is there are many unexpected failures with:
 Exception: History in error state.

 I'm sure you'll learn more once you look over the logs,

 Thank you,

 Peter

Hi Dave,

Any progress? All the following seem to have been tested
in the last few days on the TestToolShed, but failed with
Exception: History in error state.

https://testtoolshed.g2.bx.psu.edu/view/peterjc/blast2go
https://testtoolshed.g2.bx.psu.edu/view/peterjc/blastxml_to_top_descr
https://testtoolshed.g2.bx.psu.edu/view/peterjc/clinod
https://testtoolshed.g2.bx.psu.edu/view/peterjc/fastq_paired_unpaired
https://testtoolshed.g2.bx.psu.edu/view/peterjc/get_orfs_or_cdss
https://testtoolshed.g2.bx.psu.edu/view/peterjc/mira_assembler
https://testtoolshed.g2.bx.psu.edu/view/peterjc/ncbi_blast_plus
https://testtoolshed.g2.bx.psu.edu/view/peterjc/nlstradamus
https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_filter_by_id
https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_primer_clip
https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_rename
https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_select_by_id

That's about half of my TestToolShed repositories - most
of the others report their tests passed :)

I am also seeing unexpected problems with packages, e.g.

https://testtoolshed.g2.bx.psu.edu/view/iuc/package_blast_plus_2_2_29

Error getting revision e78bbab7933d of repository
package_blast_plus_2_2_29 owned by iuc: An entry for the repository
was not found in the database.

Thanks,

Peter
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] Public toolshed giving internal server error

2014-11-19 Thread Peter Cock
On Wed, Nov 19, 2014 at 9:36 AM, Peter Briggs
peter.bri...@manchester.ac.uk wrote:
 Hello

 I'm trying to make a new repository on the public toolshed at
 https://toolshed.g2.bx.psu.edu/
 but I keep getting the internal server error page.

 I was also unable to log out, or even to see the front page when trying to
 access it from a different browser.

 Is anyone else having this problem?

 Thanks & best wishes

 Peter

I just tried uploading an update on the test tool shed and got:

Internal Server Error
Galaxy was unable to successfully complete your request
...
IOError: [Errno 28] No space left on device

Perhaps a coincidence, but maybe the main ToolShed has the
same issue?
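
For reference, Errno 28 is ENOSPC, the operating system's "no space left on device" error. A minimal Python sketch of how a script might recognise this condition and check the remaining space; the helper names `is_disk_full_error` and `free_fraction` are illustrative and are not part of Galaxy or the Tool Shed:

```python
import errno
import shutil


def is_disk_full_error(exc):
    """Return True if exc is an OS-level 'No space left on device' error."""
    return isinstance(exc, OSError) and exc.errno == errno.ENOSPC


def free_fraction(path="."):
    """Fraction of the filesystem holding `path` that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total
```

A periodic check of `free_fraction` on the upload directory would give a warning before uploads start failing.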

Peter


Re: [galaxy-dev] Public toolshed giving internal server error

2014-11-19 Thread Peter Cock
Thanks - uploading tar-balls to the Test Tool Shed is working again :)

Peter

On Wed, Nov 19, 2014 at 2:34 PM, Nate Coraor n...@bx.psu.edu wrote:

 Hi all,

 Sorry for the service interruption. The Tool Shed should be back now,
 and the underlying disk usage problem has been alleviated, so it
 shouldn't occur again.

 --nate


Re: [galaxy-dev] Is anyone using composite datatype uploads?

2014-11-19 Thread Peter Cock
Hi John, Sam,

I've not done it yet, but was hoping to implement uploading of
BLAST databases at some point - mainly for use within the
test framework, rather than expecting it to be useful for the
end user.

Is the issue here uploading an archive (e.g. .zip or .tar.gz) or
offering a way to pick multiple files to be treated together as
a composite dataset?

Peter

On Wed, Nov 19, 2014 at 3:32 PM, John Chilton jmchil...@gmail.com wrote:
 Well there is at least one person using this functionality -
 http://dev.list.galaxyproject.org/Problem-to-upload-data-to-Galaxy-when-using-pbed-file-format-td4666000.html.

 Just to make this more concrete - Sam has swapped the upload file
 button to use the new upload widget this release cycle (targeted for
 December 1st). So barring negative feedback - uploading pbed or velvet
 report datatypes (or other similar composite datatypes) will no longer
 be possible via the GUI.

 -John

 On Thu, Aug 14, 2014 at 2:32 PM, Aysam Guerler aysam.guer...@gmail.com 
 wrote:
 Hello everyone,

 We are considering disabling the deprecated upload tool form which is
 currently accessible through Tool panel > Get Data > Upload file. The new
 upload feature (icon at the top of the Tool panel) covers all of its
 functionality except uploading composite datatypes such as Velvet.

 Please let us know if you are using the composite file upload functionality
 of the former tool form.

 Thanks,
 Sam



Re: [galaxy-dev] Nothing being tested on Test and main Tool Shed?

2014-11-19 Thread Peter Cock
Thanks John,

Fingers crossed we'll get some more detailed logs in a day or two :)

Peter

On Wed, Nov 19, 2014 at 5:48 PM, John Chilton jmchil...@gmail.com wrote:
 Hey Peter,

 Dave is out this week - so I have tried to fumble around and see if I
 could make some progress on this. I found some bugs in a recent commit
 and fixed them - that might help
 (https://bitbucket.org/galaxy/galaxy-central/commits/b81798f94dc0fd14de1d585ed7e57f820f998fae).
 I also have enabled more verbose logging that might help those
 History in error state exceptions
 (https://bitbucket.org/galaxy/galaxy-central/commits/a799879a82c54c2d1afec6e33d8918479bbf2373)
 but I am not sure it will propagate through to the tool shed API - we
 will see I guess.

 If the install and test framework just picks up the latest central -
 these fixes will hopefully be reflected in the next run.

 -John




Re: [galaxy-dev] MarkupSafe egg missing? e.args[1].key != e.args[0].key

2014-11-18 Thread Peter Cock
On Mon, Nov 17, 2014 at 7:23 PM, John Chilton jmchil...@gmail.com wrote:
 Ummm... I think it is that the VM started shipping with an
 incompatible paramiko.

 [...]

 Anyway - planemo's TravisCI integration tests Galaxy in a virtualenv
 and it works fine [...]

That's a useful workaround, but a virtualenv is not viable for existing
Galaxy installations - which may also come to have this problem if
their system copy of paramiko is updated?

I'm not familiar with paramiko or markupsafe, but if there is a conflict
it would be good to have a direct fix.
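
One way to check whether a system-wide copy of either package is visible to the interpreter is to query the installed distribution metadata. This is only a sketch, assuming a modern Python (3.8 or later, where `importlib.metadata` is in the standard library) rather than the Python 2.7 that Galaxy was using at the time; `installed_version` is a hypothetical helper name:

```python
from importlib import metadata


def installed_version(name):
    """Return the installed version string for a distribution, or None."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


# Report what this interpreter can see for the two packages in question.
for package in ("paramiko", "MarkupSafe"):
    print(package, installed_version(package))
```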

Peter


Re: [galaxy-dev] MarkupSafe egg missing? e.args[1].key != e.args[0].key

2014-11-18 Thread Peter Cock
On Nov 18, 2014 7:26 AM, John Chilton jmchil...@gmail.com wrote:

 I will admit to not actually understanding Galaxy's dependency management
 but I think virtualenv is exactly the advice people who do understand it
 give
 http://dev.list.galaxyproject.org/Local-installation-problem-td4662627.html.
 It is a widely used tool precisely designed to solve such problems - I think
 it is the best way to go. I don't know why it would not be appropriate for
 existing installations - I think it is in fact something of a best
 practice for existing installations.

Our existing installation is not using a virtual env, and I fear switching to
that could be disruptive.

 Certainly that error message should be more helpful but I am not sure we
 should do anything to address this beyond that - do you have a particular
 idea in mind?

Not show the IndexError exception? :P

Here the user should be told something about a conflict between
MarkupSafe and paramiko (assuming this is the real problem).

Peter


Re: [galaxy-dev] MarkupSafe egg missing? e.args[1].key != e.args[0].key

2014-11-18 Thread Peter Cock
On Tue, Nov 18, 2014 at 1:24 PM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Nov 18, 2014 7:26 AM, John Chilton jmchil...@gmail.com wrote:

 I will admit to not actually understanding Galaxy's dependency management
 but I think virtualenv is exactly the advice people who do understand it
 give
 http://dev.list.galaxyproject.org/Local-installation-problem-td4662627.html.
 It is a widely used tool precisely designed to solve such problems - I think
 it is the best way to go. I don't know why it would not be appropriate for
 existing installations - I think it is in fact something of a best
 practice for existing installations.

 Our existing installation is not using a virtual env, and I fear switching to
 that could be disruptive.

Getting back to TravisCI, using a virtual env wasn't too painful:
https://github.com/peterjc/pico_galaxy/commit/26489a65a9cd60f9d055488d003346eab87941b0

I can now get back to tweaking the tests I was working on :)

Thanks,

Peter


Re: [galaxy-dev] MarkupSafe egg missing? e.args[1].key != e.args[0].key

2014-11-18 Thread Peter Cock
On Tue, Nov 18, 2014 at 5:51 PM, Nate Coraor n...@bx.psu.edu wrote:
 Peter,

 Unless you've made modifications to Galaxy that depend on external
 libraries, switching to a virtualenv for the server itself should be
 pretty safe. Tools themselves can still run without using the/any
 virtualenv, if desired.

 --nate

OK - that sounds more straightforward than I had feared - but I
will cross that bridge as needed ;)
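
To confirm which situation a given server process is in, the interpreter itself can be asked. A small sketch (the function name `in_virtualenv` is made up for illustration): the classic virtualenv tool sets `sys.real_prefix`, while the standard library venv module makes `sys.base_prefix` differ from `sys.prefix`.

```python
import sys


def in_virtualenv():
    """Return True if this interpreter is running inside a virtual environment."""
    # Classic virtualenv records the original prefix as sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # The stdlib venv module leaves sys.base_prefix pointing at the system Python.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("virtualenv active:", in_virtualenv())
```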

Thanks,

Peter


[galaxy-dev] MarkupSafe egg missing? e.args[1].key != e.args[0].key

2014-11-17 Thread Peter Cock
Hello all,

There looks to be an egg problem with the latest galaxy-central,
here are excerpts from a failed Galaxy install on my TravisCI test
setup,
https://travis-ci.org/peterjc/pico_galaxy/builds/41233860

Here's another project build with the same error:
https://travis-ci.org/peterjc/galaxy_blast/builds/41235899

$ ./run.sh --stop-daemon || true
Initializing config/migrated_tools_conf.xml from migrated_tools_conf.xml.sample
...
Initializing static/welcome.html from welcome.html.sample
Some eggs are out of date, attempting to fetch...
Fetched http://eggs.galaxyproject.org/Mako/Mako-0.4.1-py2.7.egg
Fetched http://eggs.galaxyproject.org/importlib/importlib-1.0.3-py2.7.egg
Fetched http://eggs.galaxyproject.org/pysam/pysam-0.4.2_kanwei_b10f6e722e9a-py2.7-linux-x86_64-ucs4.egg
Fetched http://eggs.galaxyproject.org/ordereddict/ordereddict-1.1-py2.7.egg
Fetched http://eggs.galaxyproject.org/Fabric/Fabric-1.7.0-py2.7.egg
Fetched http://eggs.galaxyproject.org/Babel/Babel-1.3-py2.7.egg
Fetched http://eggs.galaxyproject.org/Whoosh/Whoosh-0.3.18-py2.7.egg
Fetched http://eggs.galaxyproject.org/Parsley/Parsley-1.1-py2.7.egg
Fetched http://eggs.galaxyproject.org/Cheetah/Cheetah-2.2.2-py2.7-linux-x86_64-ucs4.egg
Traceback (most recent call last):
  File "./scripts/fetch_eggs.py", line 46, in <module>
    c.resolve() # Only fetch eggs required by the config
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/lib/galaxy/eggs/__init__.py", line 347, in resolve
    egg.resolve()
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/lib/galaxy/eggs/__init__.py", line 192, in resolve
    if e.args[1].key != e.args[0].key:
IndexError: tuple index out of range
Fetch failed.
No PID file exists in paster.pid


$ python scripts/fetch_eggs.py
Warning: MarkupSafe (a dependent egg of Mako) cannot be fetched
Traceback (most recent call last):
  File "scripts/fetch_eggs.py", line 46, in <module>
    c.resolve() # Only fetch eggs required by the config
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/lib/galaxy/eggs/__init__.py", line 347, in resolve
    egg.resolve()
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/lib/galaxy/eggs/__init__.py", line 192, in resolve
    if e.args[1].key != e.args[0].key:
IndexError: tuple index out of range

The command python scripts/fetch_eggs.py failed and exited with 1

Looking at http://eggs.galaxyproject.org/MarkupSafe/ for
Python 2.7 and Linux x86_64 there is a ucs4 egg:

MarkupSafe-0.12-py2.7-linux-x86_64-ucs4.egg    09-Jun-2011 03:09    30724

So, why is this failing? Note I am not explicitly installing MarkupSafe
(so I do not expect there to be a conflicting version installed).

Also it would seem there is a bug in the resolve method assuming that
e.args will always have (at least) two entries?
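
A defensive version of that comparison might look like the sketch below. It is not Galaxy's actual code: `conflicting_keys`, `describe` and the `FakeDist` stand-in are hypothetical, and the only assumption taken from the traceback is that the exception's `args` normally hold two objects with a `.key` attribute.

```python
from collections import namedtuple

# Stand-in for whatever objects normally populate the exception's args.
FakeDist = namedtuple("FakeDist", ["key"])


def conflicting_keys(exc):
    """Return a pair of keys from exc.args, or None if there are fewer than two."""
    args = getattr(exc, "args", ())
    if len(args) < 2:
        return None
    # Fall back to str() for args that do not carry a .key attribute.
    return tuple(getattr(arg, "key", str(arg)) for arg in args[:2])


def describe(exc):
    """Build a readable message instead of failing with an IndexError."""
    keys = conflicting_keys(exc)
    if keys is None:
        return "Dependency conflict with no details available: %r" % (exc,)
    if keys[0] != keys[1]:
        return "Conflict between %s and %s" % keys
    return "Conflicting versions of %s" % keys[0]
```

With a guard like this, an exception carrying a single argument would produce a message rather than a second, unrelated traceback.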

Thanks,

Peter


Re: [galaxy-dev] Repeats shown upside down on galaxy-central

2014-11-12 Thread Peter Cock
Hi Sam,


On Thu, Nov 13, 2014 at 6:25 AM, Aysam Guerler aysam.guer...@gmail.com
wrote:

 Hey Peter,

 I have modified the data selectors appearance by adding a single select
 field as default option. It resembles the previous functionality now and I
 think it's an improvement.


Great. I won't be able to try this until next week at the earliest though.

 Regarding the length of the select box I think that the tool form overall
 looks more organized now since almost all the input elements have the same
 length. However I will discuss this with the others and see what the
 consensus is.


Perhaps a confounding variable is how wide your screen is? In my
screenshots the visual cue that this is a select box is on the extreme
right and thus separated from the text it is connected to.  I think that
makes it a bad GUI design choice (as well as my somehow not liking the visual
aesthetic, which is a more personal impression).

Thanks,

Peter

Re: [galaxy-dev] Nothing being tested on Test and main Tool Shed?

2014-11-06 Thread Peter Cock
Thanks Dave,

The good news is yes, the tests are running again on the
Test Tool Shed (although not the main Tool Shed yet), and
many of my tools now have successful test results from
last night.

e.g. My new basic mummer tool which now has a full set
of dependency packages thanks to Bjoern:
https://testtoolshed.g2.bx.psu.edu/view/peterjc/mummer

The bad news is there are many unexpected failures with:
Exception: History in error state.

I'm sure you'll learn more once you look over the logs,

Thank you,

Peter

On Wed, Nov 5, 2014 at 6:21 PM, Dave Bouvier d...@bx.psu.edu wrote:
 Peter,

 This was due to a number of issues with the testing framework, the last of
 which appears to have been resolved, and I see tools being tested in the
 framework log. I'll check again in the morning to see if any of the tools
 you listed below were still not tested.

--Dave B.


 On 11/03/2014 05:51 AM, Peter Cock wrote:

 Hello all,

 I am currently hoping to review the automated test results for
 some repositories which I have recently updated, in one case
 for dependency handling, the other functional changes:

 https://testtoolshed.g2.bx.psu.edu/view/peterjc/mummer
 https://testtoolshed.g2.bx.psu.edu/view/peterjc/ncbi_blast_plus

 These have not yet been tested. On further investigation
 of a sample of my other tools, it appears none of them have
 been tested on the Test Tool Shed since 2014-09-15, e.g.

 https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_rename
 https://testtoolshed.g2.bx.psu.edu/view/peterjc/sample_seqs
 https://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_idxstats
 https://testtoolshed.g2.bx.psu.edu/view/peterjc/effectivet3
 https://testtoolshed.g2.bx.psu.edu/view/peterjc/clinod

 Similarly, some of my tools on the Main Tool Shed appear
 not to have been tested since 2014-09-21, e.g.

 https://toolshed.g2.bx.psu.edu/view/peterjc/seq_rename
 https://toolshed.g2.bx.psu.edu/view/peterjc/effectivet3
 https://toolshed.g2.bx.psu.edu/view/peterjc/clinod

 or 2014-10-27,

 https://toolshed.g2.bx.psu.edu/view/peterjc/sample_seqs
 https://toolshed.g2.bx.psu.edu/view/peterjc/samtools_idxstats

 Is there a known problem with the automated tool testing
 (previously every second night) on the Tool Sheds?
 Or have you had to further reduce the testing cycle?

 Testing less frequently seems fine, say fortnightly, if this can
 be supplemented by testing updated tools every night. That
 would give Tool Authors prompt feedback on their updates,
 but also catch regressions where changes in Galaxy break
 a previously working tool.
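
The proposed schedule could be expressed roughly as follows. This is a sketch of the idea only; the function name, the 14-day default and the example dates are hypothetical, and this is not how the Tool Shed test framework is implemented.

```python
from datetime import date, timedelta


def due_for_testing(last_tested, last_updated, today, full_cycle_days=14):
    """Decide whether a repository should be tested tonight.

    A repository is due if it has never been tested, if it was updated
    after its last test, or if the full (e.g. fortnightly) cycle has elapsed.
    """
    if last_tested is None:
        return True
    if last_updated > last_tested:
        return True
    return today - last_tested >= timedelta(days=full_cycle_days)


today = date(2014, 11, 3)
# Updated yesterday, last tested in September: test tonight.
print(due_for_testing(date(2014, 9, 15), date(2014, 11, 2), today))
```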

 Regards,

 Peter


Re: [galaxy-dev] Repeats shown upside down on galaxy-central

2014-11-06 Thread Peter Cock
Hi Sam,

I found the old approach (new repeat blocks at bottom) worked fine
when adding blocks one by one and completing them as I went
(which means once you have filled in the new block, you have
scrolled down to the button ready to add another block if needed).

I find the new approach (new blocks inserted at top) visually
confusing as I am used to filling in forms from top to bottom.

I do concede the approach makes it easy to add several repeats
with a few clicks, and then fill them in - but personally that isn't
a common thing for me to do, and I do not think this change is
worth the confusion.

There are a number of other visual changes on galaxy-central
compared to the current release, some of which seem harmless
like the boolean parameters becoming a yes/no toggle in place
of a tick box, others are more daunting/scary (e.g. collection
related changes to the file picker for automatic batch jobs).

Is there any draft documentation on these changes being prepared
to go into the next release notes? This is the sort of thing local
Galaxy Admins would appreciate to anticipate local user queries.

Thanks,

Peter

On Thu, Nov 6, 2014 at 12:45 PM, Aysam Guerler aysam.guer...@gmail.com wrote:
 Hi Peter,

 Yes we changed this on purpose, however it is open to discussion. The
 advantage is that the user does not have to scroll down after adding a new
 repeat block. Additionally it enables users to easily add more than one
 repeat block quickly, since the insert button does not relocate on the
 screen after adding new repeat blocks.

 Thanks,
 Sam

 On Mon, Nov 3, 2014 at 7:25 AM, Peter Cock p.j.a.c...@googlemail.com
 wrote:

 Hi all,

 I'm running galaxy-central as my development server, and noticed what
 to me is a regression with repeat parameters,

 e.g.
 https://github.com/peterjc/pico_galaxy/blob/master/tools/clc_assembly_cell/clc_mapper.xml

 Read group:
 [+ insert read group]
 * 1: Read Group
 * 2: Read Group

 which on clicking becomes:

 Read group:
 [+ insert read group]
 * 2: Read Group
 * 1: Read Group

 This to me is upside down, the current behaviour on galaxy-dist is
 more natural (and also pluralises the group heading):

 Read Groups
 * Read Group 1
 [Add new Read Group]

 which on clicking becomes:

 Read Groups
 * Read Group 1
 * Read Group 2
 [Add new Read Group]

 Is this a deliberate change? If so, why?

 Peter


Re: [galaxy-dev] Galaxy's dependency on old samtools vs tools wrapping later versions?

2014-11-04 Thread Peter Cock
OK, so this should work then... :)

Thanks Dave,

Peter

On Mon, Nov 3, 2014 at 7:06 PM, Dave Bouvier d...@bx.psu.edu wrote:
 Peter,

 For the automated indexing of bam files, Galaxy uses the samtools version
 linked to as default under tool-dependencies/samtools/

 This should normally be 0.1.19 or older, due to the not-yet-implemented
 handling of bam_index_build and other potential regressions that could be
 uncovered in the future.
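
As a sketch of that rule, a version string can be compared against 0.1.19 before it is trusted for automatic indexing. The helper names are hypothetical, and the parsing assumes dotted numeric versions, optionally followed by a suffix such as a commit hash:

```python
import re

# Newest samtools release considered safe for automatic BAM indexing.
MAX_INDEX_SAFE = (0, 1, 19)


def parse_version(text):
    """Turn '0.1.19' or '0.1.19-44428cd' into the tuple (0, 1, 19)."""
    match = re.match(r"(\d+(?:\.\d+)*)", text.strip())
    if match is None:
        raise ValueError("Unrecognised version string: %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def safe_for_auto_index(text):
    """True if this samtools version is 0.1.19 or older."""
    return parse_version(text) <= MAX_INDEX_SAFE
```

Comparing tuples rather than strings avoids the usual pitfall where "0.1.9" sorts after "0.1.19".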

--Dave B.


 On 11/03/2014 01:52 PM, Peter Cock wrote:

 Hello all,

 Galaxy currently requires samtools on the $PATH in order to sort
 and index BAM files automatically, and samtools 0.1.19 works fine.

 Unfortunately later versions of samtools index have a regression:
 https://github.com/samtools/samtools/issues/199

 This has caught several people out already,
 e.g. https://biostar.usegalaxy.org/p/7928/
 and https://biostar.usegalaxy.org/p/9335/

 While eventually samtools will be fixed, right now this means we
 can't have samtools 1.1 as the first samtools on the $PATH used
 by Galaxy.

 I am working on a wrapper for samtools bam2fq:
 https://github.com/peterjc/pico_galaxy/tree/master/tools/samtools_bam2fq
 https://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_bam2fq

 The bam2fq command in samtools 0.1.19 has a number of bugs,
 so I want to target samtools 1.1.  However this has complicated my
 testing since for my BAM input files Galaxy will call samtools index,
 and if it calls samtools 1.1 this will fail.

 I'm not using the tool shed dependencies during development
 so instead came up with the following hack:
 https://github.com/peterjc/picobio/blob/master/sambam/samtools_auto.py

 My question is, what is expected to happen with a Tool Shed installed
 wrapper for samtools 1.1 and Galaxy's attempts to automatically call
 samtools to index any BAM output file? Would the tool environment
 put samtools 1.1 on the (local) $PATH which would then break setting
 the metadata as part of the same job?

 Regards,

 Peter


Re: [galaxy-dev] Galaxy's dependency on old samtools vs tools wrapping later versions?

2014-11-04 Thread Peter Cock
Fingers crossed - perhaps I jumped the gun uploading this to the main
tool shed without seeing the test results on the Test Tool Shed:

https://toolshed.g2.bx.psu.edu/view/peterjc/samtools_bam2fq
https://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_bam2fq

I look forward to the automated test results from the Tool Sheds ...
http://lists.bx.psu.edu/pipermail/galaxy-dev/2014-November/020792.html

Thanks,

Peter



[galaxy-dev] Can existing SAM/BAM filter tools give me mapped/unmapped pairs?

2014-11-04 Thread Peter Cock
Hi all,

I'm looking for a little advice on the pre-existing SAM/BAM filtering
tools already in the Galaxy Tool Shed (to avoid reinventing the wheel).

As I mentioned on another thread, I'm working on a wrapper for the
samtools bam2fq command (targeting samtools 1.1 which fixed
some bugs in this tool and added new functionality compared to
samtools 0.1.19), see:

https://github.com/peterjc/pico_galaxy/tree/master/tools/samtools_bam2fq
https://toolshed.g2.bx.psu.edu/view/peterjc/samtools_bam2fq
https://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_bam2fq

One of my motivating use cases is a workflow like this:

1. Upload paired end FASTQ files.
2. Map them against a known contaminant genome giving a BAM file
(note I need the mapper to report unmapped reads in the output).
3. Filter the BAM to get unmapped reads, plus reads whose partner is
unmapped (conversely, remove reads where both partners are mapped).
4. Convert the filtered BAM back into FASTQ (with samtools bam2fq).
5. Proceed with analysis (e.g. de novo assembly).

Assuming I have understood samtools view, this filtering step
has to be multiple parts:

This would get the unmapped reads
$ samtools view -f 0x4 ...

This would get reads with an unmapped partner:
$ samtools view -f 0x8 ...

However this would only get unmapped reads with an unmapped partner:
$ samtools view -f 0xC ...

i.e. samtools view allows logical AND, not logical OR, when combining
flag filters.
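
The AND behaviour follows from how the flag mask is applied: `-f` keeps a read only if every bit in the mask is set. Combining 0x4 (read unmapped) and 0x8 (mate unmapped) gives 0xC, decimal 12. A small Python sketch of the same test, with illustrative function names:

```python
READ_UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def passes_required(flag, required):
    """Mimic `samtools view -f`: all bits in `required` must be set."""
    return (flag & required) == required


def either_unmapped(flag):
    """The logical OR that a single -f mask cannot express."""
    return bool(flag & (READ_UNMAPPED | MATE_UNMAPPED))


# A paired read that is mapped itself but whose mate is unmapped:
flag = 0x1 | MATE_UNMAPPED
print(passes_required(flag, READ_UNMAPPED | MATE_UNMAPPED), either_unmapped(flag))
```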

So, I believe using samtools directly, a two stage filter is needed followed
by a merge (and sort), taking care not to duplicate reads, perhaps:

$ samtools view -f 4 ... > unmapped.bam
$ samtools view -f 8 -F 4 ... > mapped_with_partner_unmapped.bam
$ samtools merge ... unmapped.bam mapped_with_partner_unmapped.bam
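
The two `samtools view` stages can be checked for overlap on paper. The sketch below models `-f` (bits required) and `-F` (bits excluded) in Python and confirms that the stages are disjoint and together cover every read where either partner is unmapped; the function names are illustrative only:

```python
def view_filter(flag, required=0, excluded=0):
    """Mimic `samtools view -f required -F excluded` on a single FLAG value."""
    return (flag & required) == required and (flag & excluded) == 0


def stage_one(flag):
    """First pass: -f 4 (read unmapped)."""
    return view_filter(flag, required=4)


def stage_two(flag):
    """Second pass: -f 8 -F 4 (mate unmapped, read itself mapped)."""
    return view_filter(flag, required=8, excluded=4)


# Exhaustively check all 12-bit FLAG values.
for flag in range(0x1000):
    # No read is selected by both stages, so the merge has no duplicates.
    assert not (stage_one(flag) and stage_two(flag))
    # Together the stages select exactly the reads with bit 0x4 or 0x8 set.
    assert (stage_one(flag) or stage_two(flag)) == bool(flag & 0xC)
```

The `-F 4` in the second stage is what prevents a pair with both reads unmapped from being emitted twice.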

That could be repeated within Galaxy but is surprisingly complicated
with multiple steps in the history - so I do not want to go that route.
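
For comparison, the logical OR is a one-pass test if the flags are inspected directly. The following is only a sketch working on SAM text lines in plain Python (the function name keep_unmapped_pairs is hypothetical, and a real tool would read BAM via a proper library):

```python
# Keep a read if it is unmapped (0x4) OR its partner is unmapped (0x8).
EITHER_UNMAPPED = 0x4 | 0x8


def keep_unmapped_pairs(sam_lines):
    """Yield header lines plus reads where either partner is unmapped."""
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        # FLAG is the second tab-separated column of a SAM record.
        flag = int(line.split("\t")[1])
        if flag & EITHER_UNMAPPED:
            yield line


example = [
    "@HD\tVN:1.4",
    "a\t99\tchr1",   # both partners mapped, dropped
    "a\t147\tchr1",  # both partners mapped, dropped
    "b\t73\tchr1",   # mapped, partner unmapped, kept
    "b\t133\t*",     # unmapped, partner mapped, kept
    "c\t77\t*",      # both unmapped, kept
    "c\t141\t*",     # both unmapped, kept
]
kept = list(keep_unmapped_pairs(example))
```

Because each read is tested once, no read can be duplicated and no merge step is needed.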

Have I overlooked a simple ToolShed solution using samtools?

As far as I could tell, the only other option on the current Tool Shed
is the Sambamba Filter tool (using unmapped or mate_is_unmapped),
which has a very capable looking filter system:
https://toolshed.g2.bx.psu.edu/view/lomereiter/sambamba_filter

@Artem - have you explored updating your tool_dependencies.xml
to download your pre-compiled binaries by default? That would
make deployment far easier, since D compilers are still rare, and
would mean we can see the test results on the Tool Shed :)
Please ask if you'd like advice on Tool Shed packaging.

Thanks,

Peter
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] Can existing SAM/BAM filter tools give me mapped/unmapped pairs?

2014-11-04 Thread Peter Cock
On Tue, Nov 4, 2014 at 2:44 PM, Peter Cock p.j.a.c...@googlemail.com wrote:
 Hi all,

 I'm looking for a little advice on the pre-existing SAM/BAM filtering
 tools already in the Galaxy Tool Shed (to avoid reinventing the wheel).

 As I mentioned on another thread, I'm working on a wrapper for the
 samtools bam2fq command (targeting samtools 1.1 which fixed
 some bugs in this tool and added new functionality compared to
 samtools 0.1.19), see:

 https://github.com/peterjc/pico_galaxy/tree/master/tools/samtools_bam2fq
 https://toolshed.g2.bx.psu.edu/view/peterjc/samtools_bam2fq
 https://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_bam2fq


Going off topic, but I just hit a problem here:
https://github.com/samtools/samtools/issues/313

Depending on if the reads have a QUAL value or not, samtools bam2fq
will produce either FASTA or FASTQ output - and will happily give
a mixture in one file. I know Heng Li has a parser that will take this
kind of input, but Galaxy likes to have well defined file formats.

I may have to fix samtools, perhaps by adding a strict FASTQ
output mode?
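
A rough Python sketch of how such a mixed file could be detected. It assumes unwrapped records (two lines per FASTA record, four lines per FASTQ record), which is an assumption made here rather than something checked against bam2fq output; the function name record_kinds is made up:

```python
def record_kinds(lines):
    """Return the set of record types seen in an iterable of lines.

    Assumes FASTA records are '>' header + one sequence line, and FASTQ
    records are '@' header, sequence, '+' line, quality line.
    """
    kinds = set()
    it = iter(lines)
    for header in it:
        if header.startswith(">"):
            kinds.add("fasta")
            next(it, None)  # skip the sequence line
        elif header.startswith("@"):
            kinds.add("fastq")
            for _ in range(3):  # skip sequence, '+' and quality lines
                next(it, None)
        elif header.strip():
            raise ValueError("Unexpected line: %r" % header)
    return kinds


mixed = ["@r1", "ACGT", "+", "@III", ">r2", "ACGT"]
print(record_kinds(mixed) == {"fasta", "fastq"})  # True
```

Skipping lines by count means a quality line that happens to start with @ is not mistaken for a header.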

Peter


[galaxy-dev] Nothing being tested on Test and main Tool Shed?

2014-11-03 Thread Peter Cock
Hello all,

I am currently hoping to review the automated test results for
some repositories which I have recently updated, in one case
for dependency handling, the other functional changes:

https://testtoolshed.g2.bx.psu.edu/view/peterjc/mummer
https://testtoolshed.g2.bx.psu.edu/view/peterjc/ncbi_blast_plus

These have not yet been tested. On further investigation
of a sample of my other tools, it appears none of them have
been tested on the Test Tool Shed since 2014-09-15, e.g.

https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_rename
https://testtoolshed.g2.bx.psu.edu/view/peterjc/sample_seqs
https://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_idxstats
https://testtoolshed.g2.bx.psu.edu/view/peterjc/effectivet3
https://testtoolshed.g2.bx.psu.edu/view/peterjc/clinod

Similarly, some of my tools on the Main Tool Shed appear
not to have been tested since 2014-09-21, e.g.

https://toolshed.g2.bx.psu.edu/view/peterjc/seq_rename
https://toolshed.g2.bx.psu.edu/view/peterjc/effectivet3
https://toolshed.g2.bx.psu.edu/view/peterjc/clinod

or 2014-10-27,

https://toolshed.g2.bx.psu.edu/view/peterjc/sample_seqs
https://toolshed.g2.bx.psu.edu/view/peterjc/samtools_idxstats

Is there a known problem with the automated tool testing
(previously every second night) on the Tool Sheds?
Or have you had to further reduce the testing cycle?

Testing less frequently seems fine, say fortnightly, if this can
be supplemented by testing updated tools every night. That
would give Tool Authors prompt feedback on their updates,
but also catch regressions where changes in Galaxy break
a previously working tool.

Regards,

Peter


[galaxy-dev] Repeats shown upside down on galaxy-central

2014-11-03 Thread Peter Cock
Hi all,

I'm running galaxy-central as my development server, and noticed what
to me is a regression with repeat parameters,

e.g. 
https://github.com/peterjc/pico_galaxy/blob/master/tools/clc_assembly_cell/clc_mapper.xml

Read group:
[+ insert read group]
* 1: Read Group
* 2: Read Group

which on clicking becomes:

Read group:
[+ insert read group]
* 2: Read Group
* 1: Read Group

This to me is upside down, the current behaviour on galaxy-dist is
more natural (and also pluralises the group heading):

Read Groups
* Read Group 1
[Add new Read Group]

which on clicking becomes:

Read Groups
* Read Group 1
* Read Group 2
[Add new Read Group]

Is this a deliberate change? If so, why?

Peter


[galaxy-dev] Galaxy's dependency on old samtools vs tools wrapping later versions?

2014-11-03 Thread Peter Cock
Hello all,

Galaxy currently requires samtools on the $PATH in order to sort
and index BAM files automatically, and samtools 0.1.19 works fine.

Unfortunately later versions of samtools index have a regression:
https://github.com/samtools/samtools/issues/199

This has caught several people out already,
e.g. https://biostar.usegalaxy.org/p/7928/
and https://biostar.usegalaxy.org/p/9335/

While eventually samtools will be fixed, right now this means we
can't have samtools 1.1 as the first samtools on the $PATH used
by Galaxy.

I am working on a wrapper for samtools bam2fq:
https://github.com/peterjc/pico_galaxy/tree/master/tools/samtools_bam2fq
https://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_bam2fq

The bam2fq command in samtools 0.1.19 has a number of bugs,
so I want to target samtools 1.1.  However this has complicated my
testing since for my BAM input files Galaxy will call samtools index,
and if it calls samtools 1.1 this will fail.

I'm not using the tool shed dependencies during development
so instead came up with the following hack:
https://github.com/peterjc/picobio/blob/master/sambam/samtools_auto.py

My question is, what is expected to happen with a Tool Shed installed
wrapper for samtools 1.1 and Galaxy's attempts to automatically call
samtools to index any BAM output file? Would the tool environment
put samtools 1.1 on the (local) $PATH which would then break setting
the metadata as part of the same job?
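
To make the idea of picking a samtools version per subcommand concrete, here is a minimal Python sketch. It is not the content of the linked script; the install paths, the environment variable names and the function pick_binary are all hypothetical:

```python
import os

# Hypothetical install locations; override via the (made up) variables.
LEGACY = os.environ.get("SAMTOOLS_LEGACY", "/opt/samtools-0.1.19/samtools")
MODERN = os.environ.get("SAMTOOLS_MODERN", "/opt/samtools-1.1/samtools")

# Subcommands routed to the old binary; "index" because of the
# regression described above.
USE_LEGACY = {"index"}


def pick_binary(args):
    """Return the samtools binary to use for the given argument list."""
    subcommand = args[0] if args else ""
    return LEGACY if subcommand in USE_LEGACY else MODERN


print(pick_binary(["index", "example.bam"]) == LEGACY)   # True
print(pick_binary(["bam2fq", "example.bam"]) == MODERN)  # True
```

A wrapper script named samtools placed first on the $PATH could then hand over to the chosen binary, e.g. with os.execv.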

Regards,

Peter


Re: [galaxy-dev] Test failure, JSONDecodeError: Unpaired high surrogate

2014-10-31 Thread Peter Cock
I have solved this by commenting out the apparently harmless test:

https://github.com/peterjc/pico_galaxy/commit/f3d4261846566a86f9c85a158fb95877ca8bc7c5

Peter

On Wed, Oct 29, 2014 at 5:39 PM, Peter Cock p.j.a.c...@googlemail.com wrote:
 Hi all,

 I'm getting the following exception in a failing unit test:
 https://travis-ci.org/peterjc/pico_galaxy/builds/39398677

 Testing this tool (where two of the three near identical tests passed):
 https://github.com/peterjc/pico_galaxy/blob/dd03346710e6a46cb6ec9dda1eed23d5fd301d03/tools/mummer/mummer.xml

 ```
 Traceback (most recent call last):
   File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 116, in test_tool
     self.do_it( td )
   File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 35, in do_it
     self._verify_outputs( testdef, test_history, jobs, shed_tool_id, data_list, galaxy_interactor )
   File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 75, in _verify_outputs
     galaxy_interactor.verify_output( history, jobs, output_data, output_testdef=output_testdef, shed_tool_id=shed_tool_id, maxseconds=maxseconds )
   File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/interactor.py", line 89, in verify_output
     self._verify_metadata( history_id, hid, attributes )
   File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/interactor.py", line 102, in _verify_metadata
     dataset = self._get( "histories/%s/contents/%s" % ( history_id, hid ) ).json()
   File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/eggs/requests-2.2.1-py2.7.egg/requests/models.py", line 740, in json
     return json.loads(self.content.decode(encoding), **kwargs)
   File "/usr/lib/python2.7/dist-packages/simplejson/__init__.py", line 413, in loads
     return _default_decoder.decode(s)
   File "/usr/lib/python2.7/dist-packages/simplejson/decoder.py", line 402, in decode
     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
   File "/usr/lib/python2.7/dist-packages/simplejson/decoder.py", line 418, in raw_decode
     obj, end = self.scan_once(s, idx)
 JSONDecodeError: Unpaired high surrogate: line 1 column 785 (char 785)
 ```

 Probably relevant:
  - https://github.com/simplejson/simplejson/issues/62
  - http://bugs.python.org/issue11489

 Any thoughts? What does Galaxy write to these job-associated JSON
 metadata files?

 Thanks,

 Peter


[galaxy-dev] Editing admin rights on an (empty) ToolShed repo

2014-10-30 Thread Peter Cock
Hi all,

It seems the ToolShed now uses roles for granting admin rights...
but still has the old "Grant authority to make changes" feature?

I just hit a possible glitch here - I wanted to create a new repo
under the iuc user, edit the admin settings, then log in as my
normal personal account and upload the first version of the tool.

So:

1. Logged into the Tool Shed as the iuc user.
2. Created https://toolshed.g2.bx.psu.edu/view/iuc/package_blast_plus_2_2_30
3. Attempted to add other administrators, e.g. IUC group or myself,
but the top right menu only offered upload, and the old panel to
do this was also missing: "Grant authority to make changes"

If I do the first upload as iuc, then the menu changes to include
"Manage Repository Administrators", plus the panel on the main
page appears, "Grant authority to make changes" (which is what
we used to use).

Is this a transition stage, or are the change rights a subset of
the admin role?

Peter


[galaxy-dev] Test failure, JSONDecodeError: Unpaired high surrogate

2014-10-29 Thread Peter Cock
Hi all,

I'm getting the following exception in a failing unit test:
https://travis-ci.org/peterjc/pico_galaxy/builds/39398677

Testing this tool (where two of the three near identical tests passed):
https://github.com/peterjc/pico_galaxy/blob/dd03346710e6a46cb6ec9dda1eed23d5fd301d03/tools/mummer/mummer.xml

```
Traceback (most recent call last):
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 116, in test_tool
    self.do_it( td )
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 35, in do_it
    self._verify_outputs( testdef, test_history, jobs, shed_tool_id, data_list, galaxy_interactor )
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 75, in _verify_outputs
    galaxy_interactor.verify_output( history, jobs, output_data, output_testdef=output_testdef, shed_tool_id=shed_tool_id, maxseconds=maxseconds )
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/interactor.py", line 89, in verify_output
    self._verify_metadata( history_id, hid, attributes )
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/interactor.py", line 102, in _verify_metadata
    dataset = self._get( "histories/%s/contents/%s" % ( history_id, hid ) ).json()
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/eggs/requests-2.2.1-py2.7.egg/requests/models.py", line 740, in json
    return json.loads(self.content.decode(encoding), **kwargs)
  File "/usr/lib/python2.7/dist-packages/simplejson/__init__.py", line 413, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/dist-packages/simplejson/decoder.py", line 402, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/dist-packages/simplejson/decoder.py", line 418, in raw_decode
    obj, end = self.scan_once(s, idx)
JSONDecodeError: Unpaired high surrogate: line 1 column 785 (char 785)
```

Probably relevant:
 - https://github.com/simplejson/simplejson/issues/62
 - http://bugs.python.org/issue11489

Any thoughts? What does Galaxy write to these job-associated JSON
metadata files?
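
For anyone wanting to locate the offending escape in such a JSON file, a rough Python sketch follows. It assumes the problem is a \uD800-\uDBFF escape that is not followed by a \uDC00-\uDFFF escape, which is what the error message suggests but has not been confirmed for this file. The function name is made up, and the scan is approximate since it does not account for an escaped backslash directly before the u:

```python
import re

# A high surrogate escape (\uD800-\uDBFF) that is not immediately
# followed by a low surrogate escape (\uDC00-\uDFFF).
UNPAIRED_HIGH = re.compile(
    r"\\u[dD][89abAB][0-9a-fA-F]{2}(?!\\u[dD][c-fC-F][0-9a-fA-F]{2})"
)


def find_unpaired_high_surrogates(json_text):
    """Return the character offsets of unpaired high surrogate escapes."""
    return [m.start() for m in UNPAIRED_HIGH.finditer(json_text)]


good = '"\\ud83d\\ude00"'  # a valid surrogate pair
bad = '"abc\\ud83d"'       # high surrogate with no partner
print(find_unpaired_high_surrogates(good))  # []
print(find_unpaired_high_surrogates(bad))   # [4]
```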

Thanks,

Peter


[galaxy-dev] Role of suite_config.xml in current Tool Shed?

2014-10-28 Thread Peter Cock
Hi all,

I have a suite_config.xml file in one of my Tool Shed packages,
but I am unclear if this is still used, or simply a legacy from the
old pre-hg-based Tool Shed? e.g.

https://github.com/peterjc/pico_galaxy/blob/master/tools/protein_analysis/suite_config.xml

My understanding was this allowed the tool author some control
over the appearance/order that the tools will be shown in the
Galaxy left hand pane.

Thanks,

Peter


Re: [galaxy-dev] Blast+ Wrapper: blastdbcmd: range parameter

2014-10-22 Thread Peter Cock
Thanks,
https://github.com/biopython/biopython/pull/385

You can't (yet) do this via the BLAST wrappers.

You would have to pull out the full sequences using the
blastdbcmd wrapper, then edit them with another Galaxy
tool. Or work directly from the FASTA file if you have it.

Peter


On Wed, Oct 22, 2014 at 6:57 AM, Matthias Enders
m.end...@german-seed-alliance.de wrote:
 Hi Peter,

 I added a new Issue, hope everything is correct.

 Kind regards,
 Matthias Enders


 -Original Message-
 From: Peter Cock [mailto:p.j.a.c...@googlemail.com]
 Sent: Tuesday, October 21, 2014 11:25 PM
 To: Matthias Enders
 Cc: galaxy-dev@lists.bx.psu.edu
 Subject: Re: [galaxy-dev] Blast+ Wrapper: blastdbcmd: range parameter

 Hi Matthias,

 Can you file an issue about adding this here please?
 https://github.com/peterjc/galaxy_blast

 Thanks!

 Peter

 On Tue, Oct 21, 2014 at 10:36 AM, Matthias Enders 
 m.end...@german-seed-alliance.de wrote:
 Hello all,

 I use the ToolShed NCBI Blast+ Wrappers 
 (https://toolshed.g2.bx.psu.edu/repository?repository_id=1d92ebdf7e8d466c) 
 and I tried to retrieve sequence information from databases.

 The blastdbcmd comes with the feature to extract a given range of the 
 sequence:

 -Range  string  Range of sequence to extract (Format: start-stop)

 Is this parameter / functionality also part of the wrapper, how can I use 
 this functionality?

 Thanks,
 Matthias



Re: [galaxy-dev] Blast+ Wrapper: blastdbcmd: range parameter

2014-10-21 Thread Peter Cock
Hi Matthias,

Can you file an issue about adding this here please?
https://github.com/peterjc/galaxy_blast

Thanks!

Peter

On Tue, Oct 21, 2014 at 10:36 AM, Matthias Enders
m.end...@german-seed-alliance.de wrote:
 Hello all,

 I use the ToolShed NCBI Blast+ Wrappers 
 (https://toolshed.g2.bx.psu.edu/repository?repository_id=1d92ebdf7e8d466c) 
 and I tried to retrieve sequence information from databases.

 The blastdbcmd comes with the feature to extract a given range of the 
 sequence:

 -Range  string  Range of sequence to extract (Format: start-stop)

 Is this parameter / functionality also part of the wrapper, how can I use 
 this functionality?

 Thanks,
 Matthias




Re: [galaxy-dev] question about GALAXY_SLOTS

2014-10-17 Thread Peter Cock
On Thu, Oct 16, 2014 at 11:05 PM, Wolfgang Maier
wolfgang.ma...@biologie.uni-freiburg.de wrote:
 Hi,

 this is just to make sure: the GALAXY_SLOTS environmental variable set by
 Galaxy when running tools will always be a number >= 1 with 1 being the
 default if nothing else is configured in the job runner settings ?

 Correct ?

 Thanks,
 Wolfgang

Hi Wolfgang,

I believe so, however it is possible it might be unset in a corner case
(please report this as a bug if you see it happen) or a tool could change
the value.

You can use the following bash syntax to set your own default in
the tool's command template, e.g.

-num_threads \${GALAXY_SLOTS:-8}

Note the colon minus is the special bash syntax, here the default
value is 8 (not minus 8) if $GALAXY_SLOTS is not set. Also note
in the command XML you must escape the dollar sign.
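
If the tool is a Python script, the same fallback can be done inside the script instead. A minimal sketch, where the function name galaxy_slots and the default of 8 are just for illustration:

```python
import os


def galaxy_slots(default=8):
    """Return the thread count from $GALAXY_SLOTS, or the default."""
    # `or` covers both "unset" and "set but empty", like ${VAR:-default}.
    return int(os.environ.get("GALAXY_SLOTS") or default)


threads = galaxy_slots()
```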

Peter


Re: [galaxy-dev] Is the new tool repositories summary in the monthly newsletter useful?

2014-10-08 Thread Peter Cock
On Wed, Oct 8, 2014 at 12:49 AM, Dave Clements
cleme...@galaxyproject.org wrote:
 Hi All,

 The October Galaxy newsletter went out a week ago.  Buried at the bottom is
 this

 36 new ToolShed repos

-- https://wiki.galaxyproject.org/GalaxyUpdates/2014_10#ToolShed_Contributions

 which lists repositories that have been published in the Galaxy Project
 ToolShed in the previous month.

 I have two questions about this:

 1. How useful is this summary?

 Compiling it is a manual process and it's kind of mind-numbing.  Most months
 it takes around 2 hours (I think).

I find it moderately useful, so if most Galaxy Admins think the same, it
probably is overall a good time investment.

 2. If we keep the summary, should we put it in the Dev News Briefs instead?

 I'm kinda thinking this summary is a better match for the Dev News Briefs
 (every release), than it is for the general newsletter (every month).

I would suggest both (easy if it is just a link, a tiny bit of copy and paste
if not), but that wasn't an option on the Google form.

Peter


[galaxy-dev] ToolShed tool preview broken (TestToolShed too)

2014-10-08 Thread Peter Cock
Hi all,

From the new tools information Dave compiled for the last Galaxy Update
https://wiki.galaxyproject.org/GalaxyUpdates/2014_10#ToolShed_Contributions
I had a look at galaxyp's filter_by_fasta_ids: Extract sequences from a
FASTA file based on a list of IDs tool:

https://toolshed.g2.bx.psu.edu/view/galaxyp/filter_by_fasta_ids

I wanted to see how it compared to my own similar tools (which handle
FASTA, FASTQ, SFF and could cover more - they replaced my older
single format filter tools):

https://toolshed.g2.bx.psu.edu/view/peterjc/seq_filter_by_id
https://toolshed.g2.bx.psu.edu/view/peterjc/seq_select_by_id

Now for the bug report, clicking on the button (under valid tools) which
would normally give a preview of the tool form is failing - giving just
Internal Server Error.

I have tried a random selection of other tools and this seems to be
universal - moreover the TestToolShed also seems to have
the same problem.

Regards,

Peter


Re: [galaxy-dev] testtoolshed internal server error

2014-10-08 Thread Peter Cock
On Wed, Oct 8, 2014 at 11:20 AM, Stef van Lieshout
stefvanliesh...@fastmail.fm wrote:
 Anyone else getting this when trying to upload to a testtoolshed repos?
 I'm using the upload files to repository function in repository
 actions and get a blank page with internal server error.. Worked fine
 yesterday.

 Ciao,
 Stef

There's a chance it is the same root problem as this issue which
I hit a couple of hours ago (again internal server error):
http://lists.bx.psu.edu/pipermail/galaxy-dev/2014-October/020614.html

Peter


Re: [galaxy-dev] testtoolshed internal server error

2014-10-08 Thread Peter Cock
OK good - my issue with the ToolShed works now too :)

On Wed, Oct 8, 2014 at 11:44 AM, Stef van Lieshout
stefvanliesh...@fastmail.fm wrote:
 Ok, works for me again. Just a little hiccup I guess...

 Stef

 - Original message -
 From: Peter Cock p.j.a.c...@googlemail.com
 To: Stef van Lieshout stefvanliesh...@fastmail.fm
 Cc: Galaxy Dev galaxy-...@bx.psu.edu
 Subject: Re: [galaxy-dev] testtoolshed internal server error
 Date: Wed, 8 Oct 2014 11:33:36 +0100

 On Wed, Oct 8, 2014 at 11:20 AM, Stef van Lieshout
 stefvanliesh...@fastmail.fm wrote:
 Anyone else getting this when trying to upload to a testtoolshed repos?
 I'm using the upload files to repository function in repository
 actions and get a blank page with internal server error.. Worked fine
 yesterday.

 Ciao,
 Stef

 There's a chance it is the same root problem as this issue which
 I hit a couple of hours ago (again internal server error):
 http://lists.bx.psu.edu/pipermail/galaxy-dev/2014-October/020614.html

 Peter


Re: [galaxy-dev] ToolShed tool preview broken (TestToolShed too)

2014-10-08 Thread Peter Cock
On Wed, Oct 8, 2014 at 9:22 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 Hi all,

 From the new tools information Dave compiled for the last Galaxy Update
 https://wiki.galaxyproject.org/GalaxyUpdates/2014_10#ToolShed_Contributions
 I had a look at galaxyp's filter_by_fasta_ids: Extract sequences from a
 FASTA file based on a list of IDs tool:

 https://toolshed.g2.bx.psu.edu/view/galaxyp/filter_by_fasta_ids

 I wanted to see how it compared to my own similar tools (which handle
 FASTA, FASTQ, SFF and could cover more - they replaced my older
 single format filter tools):

 https://toolshed.g2.bx.psu.edu/view/peterjc/seq_filter_by_id
 https://toolshed.g2.bx.psu.edu/view/peterjc/seq_select_by_id

 Now for the bug report, clicking on the button (under valid tools) which
 would normally give a preview of the tool form is failing - giving just
 Internal Server Error.

 I have tried a random selection of other tools and this seems to be
 universal - moreover the TestToolShed also seems to have
 the same problem.

 Regards,

 Peter

This is working again now.

See also another possibly related Internal Server Error on
upload which is also working again now:
http://lists.bx.psu.edu/pipermail/galaxy-dev/2014-October/020619.html

Peter


Re: [galaxy-dev] clustalomega from toolshed installation issue

2014-10-07 Thread Peter Cock
On Tue, Oct 7, 2014 at 1:47 AM, Isabelle Phan
isabelle.p...@seattlebiomed.org wrote:
 Hello,

 I installed clustalomega from the galaxy main toolshed using the admin
 interface of our local galaxy install.

 When I run the tool, I get this message:


 Dataset generation errors
 Dataset 55: co_alignment.fasta


 Tool execution generated the following error message:
 Error invoking command:
 clustalo --force --threads=1 --maxnumseq=30 --maxseqlen=15000 -o
 /opt/galaxy-dist/database/files/000/dataset_158.dat -l
 /opt/galaxy-dist/database/files/000/dataset_159.dat -v -i
 /opt/galaxy-dist/database/files/000/dataset_153.dat

 [Errno 2] No such file or directory



 I've tried reloading my input file, reloading the tool's configuration,
 resetting the metadata on the clustalomega repository, deleting the tool
 completely and re-installing. No luck. If I try to rerun the job again, I
 get non-sensical errors that seem to originate from other tools, as if the
 database was completely garbled. Each of those tools work fine. Only
 clustalomega behaves oddly.

 I have no access to our galaxy server, so any solution would have to be
 implemented from the admin interface.

 thanks for any hints,


 Isabelle

Hi Isabelle,

I am guessing you used this ToolShed repository:

https://toolshed.g2.bx.psu.edu/view/clustalomega/clustalomega

As far as I can see, this does not automatically install the clustalo
binary, which would explain [Errno 2] No such file or directory.
The (incomplete and out of date) README file suggests you are
expected to manually compile it from the bundled copy of the
Clustal Omega source code.
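
As a small illustration of where that message comes from, here is a Python 3 sketch showing that trying to launch a binary which is not on the $PATH fails with errno 2 (ENOENT). The helper names and the nonsense binary name are made up for this example:

```python
import errno
import shutil
import subprocess


def missing_binary_errno(name):
    """Try to run `name`; return the errno if it cannot be launched, else None."""
    try:
        subprocess.call(
            [name, "--version"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError as err:
        return err.errno
    return None


def on_path(name):
    """Report whether an executable called `name` can be found on the $PATH."""
    return shutil.which(name) is not None


code = missing_binary_errno("no-such-binary-for-this-sketch")
print(code == errno.ENOENT)  # True
```

Calling on_path("clustalo") on the Galaxy server would be a quick way to confirm the diagnosis, for someone with shell access.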

If the only access you have to modify the server is via the Galaxy
admin interface, you will not be able to fix this.

As it stands right now, this ToolShed repository would not get a
gold star approval rating :(

Peter


Re: [galaxy-dev] datatype directory

2014-09-30 Thread Peter Cock
On Tue, Sep 30, 2014 at 4:34 PM, David Hoover hoove...@helix.nih.gov wrote:
 Why isn't there a datatype for a directory of files?  This seems like
 such a simple thing.  If an executable generates or expects a
 directory as its input or output, why must a fancy complicated
 composite datatype be created to handle this?

 David Hoover

A directory of files is too broad to be of use in itself - in the same
way that defining tools to take or produce a generic data file
is unhelpful.

If you had a directory of files all the same format, then try to
use the Galaxy collections feature (e.g. a collection of FASTA
files).

If the directory has some structure then the composite
datatype is probably most suitable, e.g. an HTML file with
a collection of image files; or a BLAST database made up
of several binary files.

Peter


Re: [galaxy-dev] Set a new metadata attribute

2014-09-26 Thread Peter Cock
On Fri, Sep 26, 2014 at 3:01 PM, Nikos Sidiropoulos
nikos.sid...@gmail.com wrote:
 Hi all,

 In a tool that I am writing I want to pass an input parameter value
 (string) into the output file's metadata. Meaning that one of the tool
 parameters is a barcode signature, 'NNWTGXN' for example. I want that
 attribute to be stored somehow in the output file in order to be read by a
 subsequent tool without the user having to set that parameter again.

 The files I'll be working with are in FASTQ, BAM and tabular format.

 Is it possible?

 Bests,
 Nikos

Your code can write the value directly into an output file
(e.g. one of the SAM/BAM headers might work), but I
don't think there is anything suitable within Galaxy for
re-exporting the parameter value as an input parameter
for a future tool.
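
As a sketch of the first idea for SAM output, the value could go into a header comment (@CO) line and be read back by the next tool. This is plain Python on SAM text lines; the barcode_signature key and both function names are invented for this example, not an existing convention:

```python
# Hypothetical comment line prefix; @CO is the SAM header comment record.
PREFIX = "@CO\tbarcode_signature="


def add_barcode_comment(sam_lines, signature):
    """Return the lines with a @CO comment added at the end of the header."""
    out = []
    added = False
    for line in sam_lines:
        if not added and not line.startswith("@"):
            # First alignment record reached, so the header ends here.
            out.append(PREFIX + signature)
            added = True
        out.append(line)
    if not added:
        # Header-only (or empty) input.
        out.append(PREFIX + signature)
    return out


def read_barcode_comment(sam_lines):
    """Return the stored signature, or None if the header has none."""
    for line in sam_lines:
        if not line.startswith("@"):
            break
        if line.startswith(PREFIX):
            return line[len(PREFIX):]
    return None


sam = ["@HD\tVN:1.4", "r1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII"]
tagged = add_barcode_comment(sam, "NNWTGXN")
print(read_barcode_comment(tagged))  # NNWTGXN
```

The downstream tool would still need to be written to look for this line; Galaxy itself would not know about it.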

However, at the workflow level you can set variables -
might that be a way forward?

https://wiki.galaxyproject.org/Learn/AdvancedWorkflow/VariablesEdit

Peter


Re: [galaxy-dev] custom datatypes

2014-09-13 Thread Peter Cock
Hi Calvin,

The extension is really the Galaxy datatype name, so
put quikrdb here. The actual filename on disk will be *.dat
once loaded into Galaxy.

More examples, e.g.
https://github.com/peterjc/galaxy_blast/blob/master/datatypes/blast_datatypes/datatypes_conf.xml

Peter

On Fri, Sep 12, 2014 at 9:40 PM, Calvin Morrison mutanttur...@gmail.com wrote:
 Hi,

 I just want a simple data type for my 'custom' datatype (it's just a trained
 matrix in a specific format generated by my database building tool), so that
 my tools which use this database will only see ones with that format in the
 dropdown select list.

 in my datatypes.xml i have this:

 <datatype extension="gz" type="galaxy.datatypes.quikrdb"
 mimetype="application/octet-stream" display_in_upload="true" description="A
 database trained with quikr." />

 then in my tool xml files i have


 <outputs>
   <data name="output" format="quikrdb" />
 </outputs>

 which seems to be fine, but my tools which use it still show me all my data,
 not just trained db's

 <param name="dbname" type="data" format="quikrdb" label="custom trained
 database"/>

 Am i doing something wrong?

 Calvin Morrison



Re: [galaxy-dev] Tool Errors

2014-09-12 Thread Peter Cock
On Fri, Sep 12, 2014 at 4:11 PM, Calvin Morrison mutanttur...@gmail.com wrote:
 The stderr and stdout is empty, according to galaxy.

 here is paster.log output for quikr when i run it.


 galaxy.jobs.runners DEBUG 2014-09-12 10:33:45,997 (86) command is: # if user
 == user   quikr -v -k 0 -s
 /data/galaxy/galaxy-dist/database/files/000/dataset_103.dat -i
 /data/galaxy/galaxy-dist/database/files/000/dataset_55.dat -o
 /data/galaxy/galaxy-dist/database/files/000/dataset_104.dat   # else   quikr
 -v -k 0 -s
 /data/galaxy/galaxy-dist/database/files/000/dataset_103.dat.mat.gz -i
 /data/galaxy/galaxy-dist/database/files/000/dataset_55.dat -o
 /data/galaxy/galaxy-dist/database/files/000/dataset_104.dat# end if

 that doesn't really seem all that helpful though.

It does help - that command isn't going to work at the shell - try it and see?

The problem is your Cheetah if statement has not been processed,
and I think it is as simple as you've used invalid syntax in your
command tag. I think you need to remove the extra spaces to have:

#if ...

Not:

# if ...

Then it might work?
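
If that diagnosis is right, a quick check of the command template
would catch the same mistake elsewhere. The following is a minimal
sketch in Python, assuming the template text is available as a string;
the function name and the short keyword list are illustrative and do
not cover the full Cheetah grammar.

```python
import re

# Cheetah directives are expected to follow the "#" directly
# ("#if", not "# if"). The keyword list is a small illustrative subset.
SPACED_DIRECTIVE = re.compile(r"^\s*#\s+(if|else|elif|end|for|set)\b")


def find_spaced_directives(template_text):
    """Return (line_number, line) pairs that look like '# if ...'."""
    hits = []
    for number, line in enumerate(template_text.splitlines(), start=1):
        if SPACED_DIRECTIVE.match(line):
            hits.append((number, line.strip()))
    return hits
```

Lines starting with ## (Cheetah comments) are not reported, since the
pattern requires whitespace straight after the first #.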

Peter


Re: [galaxy-dev] ToolShed test failure: NotFound: cannot find 'ucsc_display_sites' while searching for 'APP.config.ucsc_display_sites'

2014-09-11 Thread Peter Cock
On Wed, Sep 10, 2014 at 7:55 PM, Nate Coraor n...@bx.psu.edu wrote:
 Hi Peter,

 This was due to a bug I introduced last week, which I've just fixed in
 d1f6d05. Sorry for the trouble.

 --nate

Thanks - I'll check back in a day or two once the tests have
run again.

Peter


[galaxy-dev] ToolShed test failure: NotFound: cannot find 'ucsc_display_sites' while searching for 'APP.config.ucsc_display_sites'

2014-09-09 Thread Peter Cock
Hi all,

I'm wondering why my samtools_depad repository tests have
failed, and since I have not changed this recently presume this
is due to a Galaxy change or general TestToolShed problem
not specific to my tool:

https://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_depad

Tests that failed
Tool id: samtools_depad
Tool version: samtools_depad
Test: test_tool_00
(functional.test_toolbox.TestForTool_testtoolshed.g2.bx.psu.edu/repos/peterjc/samtools_depad/samtools_depad/0.0.1)
Stderr:
Traceback:
Traceback (most recent call last):
  File 
/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py,
line 114, in test_tool
self.do_it( td )
  File 
/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py,
line 35, in do_it
self._verify_outputs( testdef, test_history, shed_tool_id,
data_list, galaxy_interactor )
  File 
/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py,
line 75, in _verify_outputs
galaxy_interactor.verify_output( history, output_data,
output_testdef=output_testdef, shed_tool_id=shed_tool_id,
maxseconds=maxseconds )
  File 
/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py,
line 82, in verify_output
self._verify_metadata( history_id, hid, attributes )
  File 
/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py,
line 103, in _verify_metadata
raise Exception( msg )
Exception: Dataset metadata verification for [file_ext] failed,
expected [bam] but found [None].
Traceback (most recent call last):
  File 
/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/lib/galaxy/web/framework/decorators.py,
line 243, in decorator
rval = func( self, trans, *args, **kwargs)
  File 
/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/lib/galaxy/webapps/galaxy/api/history_contents.py,
line 188, in show
return self.__show_dataset( trans, id, **kwd )
  File 
/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/lib/galaxy/webapps/galaxy/api/history_contents.py,
line 214, in __show_dataset
hda_dict[ 'display_apps' ] = self.get_display_apps( trans, hda )
  File 
/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/lib/galaxy/web/base/controller.py,
line 855, in get_display_apps
for display_app in hda.get_display_applications( trans ).itervalues():
  File 
/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/lib/galaxy/model/__init__.py,
line 1754, in get_display_applications
return self.datatype.get_display_applications_by_dataset( self, trans )
  File 
/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/lib/galaxy/datatypes/data.py,
line 445, in get_display_applications_by_dataset
value = value.filter_by_dataset( dataset, trans )
  File 
/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/lib/galaxy/datatypes/display_applications/application.py,
line 200, in filter_by_dataset
if link_value.filter_by_dataset( data, trans ):
  File 
/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/lib/galaxy/datatypes/display_applications/application.py,
line 78, in filter_by_dataset
if fill_template( filter_elem.text, context = context ) !=
filter_elem.get( 'value', 'True' ):
  File 
/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/lib/galaxy/util/template.py,
line 9, in fill_template
return str( Template( source=template_text, searchList=[context] ) )
  File 
/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/eggs/Cheetah-2.2.2-py2.7-linux-x86_64-ucs4.egg/Cheetah/Template.py,
line 1004, in __str__
return getattr(self, mainMethName)()
  File cheetah_DynamicallyCompiledCheetahTemplate_1410263883_33_43576.py,
line 82, in respond
NotFound: cannot find 'ucsc_display_sites' while searching for
'APP.config.ucsc_display_sites'
requests.packages.urllib3.connectionpool: DEBUG: GET
/api/histories/993bad2fe35335db/contents/7fbe67cfae825002?key=edc04240db9605fb7edc7bab44d3404c
HTTP/1.1 500 None
requests.packages.urllib3.connectionpool: INFO: Starting new HTTP
connection (1): 127.0.0.1
requests.packages.urllib3.connectionpool: DEBUG: GET
/api/histories/993bad2fe35335db/contents/7fbe67cfae825002/provenance?key=edc04240db9605fb7edc7bab44d3404c
HTTP/1.1 200 None
requests.packages.urllib3.connectionpool: INFO: Starting new HTTP
connection (1): 127.0.0.1
requests.packages.urllib3.connectionpool: DEBUG: GET
/api/histories/993bad2fe35335db/contents/7fbe67cfae825002/provenance?key=edc04240db9605fb7edc7bab44d3404c
HTTP/1.1 200 None

Any thoughts?

Thanks,


Re: [galaxy-dev] directory as an input file

2014-09-02 Thread Peter Cock
You might be able to do this by accepting a collection of
SAM/BAM files as input instead. This is quite a new feature
in Galaxy, see:

https://wiki.galaxyproject.org/News/2014_06_02_Galaxy_Distribution

Peter

On Wed, Sep 3, 2014 at 10:00 AM, Philippe Moncuquet
philippe.m...@gmail.com wrote:
 Hi,

 I am trying to write a wrapper for a tool that takes a directory containing
 SAM/BAM files as an input. I am not sure how to do that, is there another
 tool that implements this and that I can have a look at ? Any suggestions
 would be greatly appreciated.

 Regards,
 Philip



Re: [galaxy-dev] when else in conditional ? RE: refresh_on_change : is this a valid attribute? Any other ideas/options??

2014-08-29 Thread Peter Cock
On Fri, Aug 29, 2014 at 11:43 PM, Lukasse, Pieter pieter.luka...@wur.nl wrote:
 So I need to refresh on change... I see that if I have a conditional item in
 my form, this causes a refresh of the page and a (re)evaluation of my
 dynamic_options methods... so I could misuse this “feature”.

This is deliberate, although there has been talk of updating the
conditional code to do the dependent parameters dynamically
rather than server-side with a page refresh.

From your outline description, I think you should be using the
Galaxy conditional tag.

 However, it seems that when I have a conditional I must have
 a when entry for every item in my select box. There is no
 “when else” option?

I think you are right - I've asked in the past about this, e.g.
this discussion which appears not to have been fully on
the mailing list though:

http://dev.list.galaxyproject.org/Multiple-values-in-lt-when-gt-tags-for-lt-conditiona-gt-parameters-tc4659704.html#none

This probably deserves to be tracked with a Trello Card...

Peter


Re: [galaxy-dev] Examples of Galaxy tools in the toolsheds that install and run JAR files properly?

2014-08-29 Thread Peter Cock
On Sat, Aug 30, 2014 at 11:17 AM, Melissa Cline cl...@soe.ucsc.edu wrote:
 Hi folks,

 I'm attempting something that should be straightforward, but it's not.  I
 have a tool that runs a JAR file, which I have bundled with the tool.  I
 simply want to run the JAR file.  And to paraphrase Thomas Edison, I've
 tried several thousand things that do not work (at least for me), from
 setting the JAVA_JAR_PATH environment variable in the tool_dependencies.xml
 file to trying to copy the JAR file into the tool-data/shared/jars
 subdirectories (which is the closest thing I've got to working).  So, at
 long last I'm doing the sensible thing and looking for one simple working
 example that I can use as a template.  Who can suggest a good toolshed tool
 (either main or test) that involves running its own JAR file, and that
 works?

 Thanks!

 Melissa

Here are a couple of my wrappers for Java tools, but I would
suggest you invoke Java with an absolute path to the
JAR file:

$ java -jar ...

Here are two examples done via a Python wrapper script (mainly
used for pre or post processing the data files):

https://github.com/peterjc/pico_galaxy/tree/master/tools/effectiveT3
https://github.com/peterjc/galaxy_blast/tree/master/tools/blast2go

For EffectiveT3 which is open source and can therefore be easily
redistributed, I set an environment variable EFFECTIVET3 for the
location of the Jar file, which is used to invoke it via Java:

$ java -jar ...

For the Blast2GO wrapper, I require the person installing it to set up
an environment variable B2G4PIPE pointing at the folder with
the JAR file. Older versions of this tool could be launched with the
same -jar approach, but the current release requires setting a
class path instead:

$ java -cp ...
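
A minimal Python sketch of the environment variable approach, assuming
the variable names a folder holding the JAR file. The function names
are placeholders, and any JAR file name or main class passed in is up
to the caller; none of this is copied from either wrapper.

```python
import os


def java_jar_command(env_var, jar_name, args):
    """Build a 'java -jar' command using a folder named by env_var."""
    folder = os.environ.get(env_var)
    if not folder:
        raise RuntimeError("Environment variable %s is not set" % env_var)
    # Use an absolute path so the job's working directory does not matter.
    jar_path = os.path.abspath(os.path.join(folder, jar_name))
    return ["java", "-jar", jar_path] + list(args)


def java_classpath_command(env_var, main_class, args):
    """Build a 'java -cp' command for tools that need a class path."""
    folder = os.environ.get(env_var)
    if not folder:
        raise RuntimeError("Environment variable %s is not set" % env_var)
    # "folder/*" asks Java to put every JAR in that folder on the class path.
    class_path = os.path.join(os.path.abspath(folder), "*")
    return ["java", "-cp", class_path, main_class] + list(args)
```

The returned list can be handed to subprocess.call without a shell, in
which case the * in the class path is expanded by Java rather than by
the shell.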

I hope that helps, if not there are bound to be other Java examples
in the ToolShed.

Peter


Re: [galaxy-dev] BAM to SAM tool no way to get an unsorted SAM?

2014-08-27 Thread Peter Cock
Do you mean a SAM/BAM file sorted by read name?
If so, try samtools sort -n ... instead.

Peter

On Thu, Aug 28, 2014 at 12:53 PM, Alistair Chilcott
alistair.chilc...@utas.edu.au wrote:
 Hello all,

 My users are trying to use a tool called bismark and it requires an unsorted 
 SAM file for one of its steps

 Previous steps produce BAM files so we would like to include the tool 
 BAM-to-SAM as a step in the workflow unfortunately it seems to be set to sort 
 by default and there is no way to alter this behaviour.

 Is there another tool that would convert a BAM to a SAM without sorting it or 
 can the BAM-to-SAM tool be altered to include a checkbox for an unsorted 
 result?


 Regards,

 Alistair




Re: [galaxy-dev] BAM to SAM tool no way to get an unsorted SAM?

2014-08-27 Thread Peter Cock
Ah - I missed this was on the Galaxy list, sorry.

I think that Galaxy automatically coordinate-sorts BAM files,
which is generally a good thing bar cases like yours. This
problem has undoubtedly come up before - an unsortedbam
datatype may be needed...
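
To see what a given file claims about its own ordering, the SO tag on
the @HD header line can be inspected (the SAM specification allows
unknown, unsorted, queryname and coordinate). A small sketch, assuming
the header text has already been obtained, e.g. from samtools view -H;
the function name is made up here.

```python
def sam_sort_order(header_text):
    """Return the SO value from the @HD line, or None if absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Fields after the record type are TAG:VALUE pairs.
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
            return None
    return None
```

This only reports what the header declares; it does not verify that
the records really are in that order.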

Peter

On Thu, Aug 28, 2014 at 1:35 PM, Alistair Chilcott
alistair.chilc...@utas.edu.au wrote:
 I guess that is my point .. at a command prompt that can be
 achieved ( and gives the result they need ) but as far as I can
 see the same is not true via the Galaxy GUI ( unless there is
 a tool I am missing)

 They have 96 separate files to process hence the desire to use a workflow.

 Regards,

 Alistair




 -Original Message-
 From: Peter Cock [mailto:p.j.a.c...@googlemail.com]
 Sent: Thursday, 28 August 2014 2:27 PM
 To: Alistair Chilcott
 Cc: galaxy-dev@lists.bx.psu.edu
 Subject: Re: [galaxy-dev] BAM to SAM tool no way to get an unsorted SAM?

 Do you mean a SAM/BAM file sorted by read name?
 If so, try samtools sort -n ... instead.

 Peter

 On Thu, Aug 28, 2014 at 12:53 PM, Alistair Chilcott 
 alistair.chilc...@utas.edu.au wrote:
 Hello all,

 My users are trying to use a tool called bismark and it requires an
 unsorted SAM file for one of its steps

 Previous steps produce BAM files so we would like to include the tool 
 BAM-to-SAM as a step in the workflow unfortunately it seems to be set to 
 sort by default and there is no way to alter this behaviour.

 Is there another tool that would convert a BAM to a SAM without sorting it 
 or can the BAM-to-SAM tool be altered to include a checkbox for an unsorted 
 result?


 Regards,

 Alistair




Re: [galaxy-dev] Mira-Assembler: DOESN'T WORK ON GALAXY

2014-08-22 Thread Peter Cock
On Fri, Aug 22, 2014 at 9:39 AM, Marija Atanaskovic
ma...@unimelb.edu.au wrote:

 Mira doesn’t work on Galaxy. This is the log message I receive.

 Tool: Assemble with MIRA v3.4
 Name: MIRA log
 Created: Fri Aug 22 00:24:38 2014 (UTC)
 Filesize: 0 bytes
 Dbkey: ?
 Format: txt
 Galaxy Tool Version: 0.0.10
 Tool Version: None
 Tool Standard Output: stdout
 Tool Standard Error: stderr
 Tool Exit Code: None
 API ID: d5f55b83db1f410a
 Full Path: /mnt/galaxy/files/000/dataset_326.dat
 ...

What was the stdout and stderr information?

Did you install this from the main tool shed?:
https://toolshed.g2.bx.psu.edu/view/peterjc/mira_assembler

 Also I can’t install Mira 4. This is the message I receive.
 Any suggestions.

Getting Internal Server Error is unhelpful - I can't
really guess what might be going wrong here :(

I have had problems with the MIRA dependencies when
Bastien has renamed folders on sourceforge... are you
using the Test Tool Shed here (since I have not yet
released the MIRA 4 wrapper on the main ToolShed)?:
https://testtoolshed.g2.bx.psu.edu/view/peterjc/mira4_assembler

Regards,

Peter


Re: [galaxy-dev] Mira-Assembler: DOESN'T WORK ON GALAXY

2014-08-22 Thread Peter Cock
On Fri, Aug 22, 2014 at 8:18 PM, Marija Atanaskovic
ma...@unimelb.edu.au wrote:
 Hi Peter,

 I don't know what the stdout and stderr information was. I click on it but
 nothing comes up.

 I installed from the main toolshed:
 http://toolshed.g2.bx.psu.edu/view/peterjc/mira_assembler
 Yes, I did use the test toolshed for Mira 4.

 Regards,

 Marija

If stdout and stderr are empty, this is consistent with MIRA never
even starting (as suggested by the next bit of information below).

On Fri, Aug 22, 2014 at 8:20 PM, Marija Atanaskovic
ma...@unimelb.edu.au wrote:
 Hi Peter,

 This is the error in history under MIRA log:

 tool error
 An error occurred with this dataset:Unable to run job due to a
 misconfiguration of the Galaxy job running system.  Please contact a site
 administrator.

 Regards,

 Marija

That suggests your Galaxy job runner is not properly configured,
and MIRA itself was never started. Are you the site administrator?
Are other Galaxy tools working? Are other Galaxy tools installed
from the ToolShed working?

On Fri, Aug 22, 2014 at 8:22 PM, Marija Atanaskovic
ma...@unimelb.edu.au wrote:
 Hi Peter,

 One more thing. The data that I am trying to analyse are fastq files of
 Ion Torrent reads. I have used them with other assemblers e.g., velvet,
 CLC.

 Marija

I have not personally used Ion Torrent, but that ought to be fine.

Peter

P.S. You forgot to CC the mailing list.


Re: [galaxy-dev] Mira-Assembler: DOESN'T WORK ON GALAXY

2014-08-22 Thread Peter Cock
On Fri, Aug 22, 2014 at 8:00 PM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Fri, Aug 22, 2014 at 9:39 AM, Marija Atanaskovic
 ma...@unimelb.edu.au wrote:

 Also I can’t install Mira 4. This is the message I receive.
 Any suggestions.

 Getting Internal Server Error is unhelpful - I can't
 really guess what might be going wrong here :(


This is a guess, but it could be you need to set the
https_proxy in the /etc/environment file - see this thread:
http://dev.list.galaxyproject.org/Internal-Server-Error-when-trying-to-install-a-tool-from-the-Tool-shed-tt4665361.html
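
Since this is only a guess, it may be worth first checking whether the
account that runs Galaxy actually sees a proxy setting. A tiny sketch
to run as that user; the function name is made up, and checking the
upper-case spelling as well is an assumption on my part.

```python
import os


def https_proxy_seen(environ=None):
    """Return the HTTPS proxy visible in the environment, or None."""
    env = os.environ if environ is None else environ
    # Both spellings are in common use; report whichever is set.
    return env.get("https_proxy") or env.get("HTTPS_PROXY")


if __name__ == "__main__":
    print(https_proxy_seen())
```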

Peter


[galaxy-dev] Test run frequency on TestToolShed

2014-08-18 Thread Peter Cock
Hi all,

Are the main and test tool-sheds currently meant to
be running the tool functional tests every 48 hours?

I created and updated these repositories last week,
but they have yet to be tested:

https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_composition
https://testtoolshed.g2.bx.psu.edu/view/peterjc/blast_rbh

Thanks,

Peter

P.S. it would be nice to be able to sort the Repositories I
own lists etc by date (particularly for my typical workflow
of posting an update to the TestToolShed, waiting for a
green light from the tests, and then pushing this to the
main ToolShed).


Re: [galaxy-dev] Test run frequency on TestToolShed

2014-08-18 Thread Peter Cock
On Mon, Aug 18, 2014 at 11:16 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 Hi all,

 Are the main and test tool-sheds currently meant to
 be running the tool functional tests every 48 hours?

 I created and updated these repositories last week,
 but they have yet to be tested:

 https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_composition
 https://testtoolshed.g2.bx.psu.edu/view/peterjc/blast_rbh

 Thanks,

 Peter

 P.S. it would be nice to be able to sort the Repositories I
 own lists etc by date (particularly for my typical workflow
 of posting an update to the TestToolShed, waiting for a
 green light from the tests, and then pushing this to the
 main ToolShed).

Perhaps something deeper here, older untested examples:

Revised 2014-07-30,
https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_filter_by_id
https://testtoolshed.g2.bx.psu.edu/view/peterjc/blastxml_to_top_descr

Revised 2014-07-31,
https://testtoolshed.g2.bx.psu.edu/view/peterjc/blast2go

Peter


Re: [galaxy-dev] Question about composite blast datatypes

2014-08-14 Thread Peter Cock
On Wed, Aug 13, 2014 at 7:22 PM, Eric Rasche rasche.e...@yandex.ru wrote:
 Hi Peter,

 I'm working on composite datatypes now (for PacBio SMRT cells). In the
 datatype I know I'll have files with variable names (e.g. .1, .2, .3) and
 after using the blast datatype as reference material, I noticed that you
 had commented out files with variable names from the datatype. Is there
 rationale behind that? Are we ensured that the entire directory will always
 be transferred so it's not a problem to not specify files as part of the
 datatype?

 Reference:
 https://github.com/peterjc/galaxy_blast/blob/master/datatypes/blast_datatypes/blast.py#L232

 Cheers,
 Eric

Hi Eric,

Edward Kirton wrote the original BLAST DB datatypes - I'm not sure
how to nicely define open ended file lists for datatypes, but also the
makeblastdb wrapper output does not have this problem (yet).
If and when we try to support partitioned BLAST databases (with a
*.nal or *.pal alias file) then this would be needed.

I would think that the HTML composite datatype might be a better
guide here, since you will often get lots of child files (images etc).

My guess is the whole directory may be transferred anyway, but
having undefined files could be a problem at some point...

Regards,

Peter


Re: [galaxy-dev] Providing BLAST db in a data library

2014-08-14 Thread Peter Cock
On Mon, Jul 28, 2014 at 9:43 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Mon, Jul 28, 2014 at 8:28 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
 Dear Nate, dear Peter

 Sorry for the delay in replying.

 I can import both HTML and blastdb from a history to a data library. If
 I try to get the data out of the library into another history, I am
 successful for the html but not for the blastdb. The problem seems to be
 that the primary data file (the /path/dataset_12345.dat) is empty for
 the blastdb, while the html primary file has something in it.

 OK. Can you tell where Galaxy thinks the library files are on disk,
 and check to see if the folder of BLAST database files is actually
 there?

 When I try to import the blastdb (from library to history) there is a
 message along the lines of can't import empty file. I hypothesise
 (admittedly without having looked at a line of code) that there is a
 test for file size 0 somewhere that is either altogether unnecessary or,
 more likely, does not take into account that for composite datatypes it
 might be completely legitimate for the primary file to be empty.

 This guess makes sense - but I've not yet tried to trace through
 the code either.

 Or is my primary blastdb file not supposed to be empty in the first
 place? I can blast against it just fine.

 The BLAST databases do not define/populate a primary file, so
 Galaxy seems to create a dummy empty file on its own. I have
 wondered about altering the BLAST database datatype definition
 to have a human readable text file as the primary file (i.e. the
 information currently saved as a text log file when creating a
 database).

Correction: I actually implemented this late last year (included in
BLAST+ wrapper version v0.0.22 onwards, and the Galaxy
BLAST datatypes version v0.0.18 onwards):

https://github.com/peterjc/galaxy_blast/commit/9b3f65cddcc60de26de63272c362c6ca53f6559d
https://github.com/peterjc/galaxy_blast/commit/2ebfb790d5a1bbe310c3d7ccc2b953c2c37bccf2

The makeblastdb wrapper will send the stdout (log information)
to the dummy index file, see the end of the command tag in:
https://github.com/peterjc/galaxy_blast/blob/master/tools/ncbi_blast_plus/ncbi_makeblastdb.xml

The display_data method for a BLAST database will show any
makeblastdb log information held in the dummy index file, see
https://github.com/peterjc/galaxy_blast/blob/master/datatypes/blast_datatypes/blast.py

i.e. Only older BLAST databases in histories should have empty
dummy index files, which will mitigate the library problem:
https://trello.com/c/bNEKfOWR
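
For illustration only, here is the kind of emptiness test Ulf's
hypothesis would call for: one that also looks at the composite
datatype's extra files rather than just the primary file. This is not
Galaxy's code (I have not traced that); the function name and the two
path arguments are assumptions.

```python
import os


def dataset_has_content(primary_path, extra_files_dir):
    """Treat a dataset as non-empty if either part holds data."""
    if os.path.isfile(primary_path) and os.path.getsize(primary_path) > 0:
        return True
    # Composite datatypes keep their real data in a separate folder.
    if os.path.isdir(extra_files_dir):
        for name in os.listdir(extra_files_dir):
            path = os.path.join(extra_files_dir, name)
            if os.path.isfile(path) and os.path.getsize(path) > 0:
                return True
    return False
```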

Peter


[galaxy-dev] Determining datatype inheritance in tool XML Cheetah

2014-08-12 Thread Peter Cock
Hi all,

I've just uploaded a simple sequence composition tool to the
Test Tool Shed:

https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_composition
https://github.com/peterjc/pico_galaxy/commit/45669446f5a14fd90a8a0d9d7430499de2fb3493

This accepts multiple inputs in FASTA, FASTQ, or SFF format -
and allows a mixture of these:

<inputs>
<param name="input_file" type="data" format="fasta,fastq,sff"
multiple="true" label="Sequence file" help="FASTA, FASTQ, or SFF
format." />
</inputs>

In order to build the command line string, I am currently using this
for loop:

<command interpreter="python">
seq_composition.py -o $output_file
##For loop over inputs
#for i in $input_file
--$i.ext ${i}
#end for
</command>

This results in things like this being run:

seq_composition.py -o XXX.dat --fastqsanger XXX.dat --sff XXX.dat

This works, but means my Python script has to know about not just
the core data types that I specified in my input parameter XML
(fasta,fastq,sff) but also any subclasses (e.g. fastqsanger).
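
To make the cost of that concrete, the script-side workaround looks
roughly like the sketch below. The PARENT table is an illustrative
subset written by hand; the real hierarchy lives in Galaxy's datatype
configuration and may differ between versions, which is exactly why
hard coding it in the script is unattractive.

```python
# Illustrative subset of Galaxy's FASTQ subtypes; the real hierarchy
# is defined by Galaxy and may differ between versions.
PARENT = {
    "fastqsanger": "fastq",
    "fastqsolexa": "fastq",
    "fastqillumina": "fastq",
    "fastqcssanger": "fastq",
}
CORE = ("fasta", "fastq", "sff")


def core_format(ext):
    """Map a Galaxy extension to one of the core formats, or fail."""
    seen = set()
    while ext not in CORE:
        if ext in seen or ext not in PARENT:
            raise ValueError("Unsupported format: %s" % ext)
        seen.add(ext)
        ext = PARENT[ext]
    return ext
```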

It seems what I want/need would be something along these lines
in pseudo-code to map any datatype which is a subclass for fastq
to use a single command line option:

<command interpreter="python">
seq_composition.py -o $output_file
##For loop over inputs
#for i in $input_file
#if isinstance($i.datatype, fastq):
--fastq ${i}
#else
--$i.ext ${i}
#end if
#end for
</command>

This mock example borrows from the Python isinstance function,
but of course some Galaxy datatypes are defined as subclasses
at the XML level rather than literally at the Python class level.

This should result in getting the following regardless of which
flavour of FASTQ the input dataset had assigned:

seq_composition.py -o XXX.dat --fastq XXX.dat --sff XXX.dat

Does anyone have any Tool XML examples probing an input file's
datatype in this way?

Peter


Re: [galaxy-dev] testtoolshed : python-2.7 installation error

2014-08-07 Thread Peter Cock
Hi Geert,

Which tool is this?

Peter

On Thu, Aug 7, 2014 at 9:00 AM, Geert Vandeweyer
geert.vandewey...@uantwerpen.be wrote:
 Hi,

 I get an installation error on the python 2.7 package in the test toolshed.
 I used the 'contact owner' function, but wanted to mention it here too, as
 there hasn't been reaction so far. Sorry for double posting if so.

 Error:

 tar (child): 5.2.tar.bz2: Cannot open: No such file or directory

 A similar error is in the Test run outputs. I believe it is related to the
 following (unnecessary)  line in the tool_dependency.xml:

  <action type="change_directory">..</action>

 located just after the download_file action for the 5.2.tar.bz2 file.

 Best,

 Geert

 --

 Geert Vandeweyer, Ph.D.
 Department of Medical Genetics
 University of Antwerp
 Prins Boudewijnlaan 43
 2650 Edegem
 Belgium
 Tel: +32 (0)3 275 97 56
 E-mail: geert.vandewe...@ua.ac.be
 http://ua.ac.be/cognitivegenetics
 http://www.linkedin.com/in/geertvandeweyer



Re: [galaxy-dev] XLS TO CSV

2014-08-04 Thread Peter Cock
Hi Mert,

Most of the Galaxy tools dealing with tables of data use tabular
format (tab separated variables), not csv (comma separated
variables). CSV is a horrible horrible mess of formats, see e.g.
http://tburette.github.io/blog/2014/05/25/so-you-want-to-write-your-own-CSV-code/

Also beware that anything other than MS Excel could be
confused by quirks in the Excel format, e.g. multiple ways
to record dates: http://support.microsoft.com/kb/180162

I would personally save each tab of the Excel sheet as tab
separated data, and import those into Galaxy.
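
If the data is already in CSV, the conversion to tabular can be sketched
with the Python standard library. This is only a rough illustration: the
function name is made up, and replacing embedded tabs and newlines with
spaces is my assumption about what a sensible tabular file should look like.

```python
import csv
import io


def csv_to_tabular(csv_text):
    """Convert comma separated text to tab separated text.

    Uses the csv module so quoted fields containing commas survive intact.
    Tabs or newlines inside a field are replaced with spaces, on the
    assumption that the output should be unquoted with one record per line.
    """
    out = io.StringIO()
    for row in csv.reader(io.StringIO(csv_text)):
        cleaned = [
            field.replace("\t", " ").replace("\n", " ").replace("\r", " ")
            for field in row
        ]
        out.write("\t".join(cleaned) + "\n")
    return out.getvalue()


print(csv_to_tabular('id,description\n1,"kinase, putative"\n'), end="")
```

Splitting on commas by hand would break the quoted field in the second
row, which is exactly the kind of quirk the blog post above describes.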

Peter

On Mon, Aug 4, 2014 at 1:56 PM, Mert Mehnur KIRKALI
mertmehnurkirk...@gmail.com wrote:
 Hello,

 How can i convert xls file to csv file on galaxy ?

 Is that possible ?

 Best Regards,Mert.



Re: [galaxy-dev] XLS TO CSV

2014-08-04 Thread Peter Cock
On Mon, Aug 4, 2014 at 4:28 PM, Eric Rasche rasche.e...@yandex.ru wrote:
 Hi Peter,

 On 08/04/2014 09:25 AM, Peter Cock wrote:
 Hi Mert,

 Most of the Galaxy tools dealing with tables of data use tabular
 format (tab separated variables), not csv (comma separated
 variables). CSV is a horrible horrible mess of formats, see e.g.
 http://tburette.github.io/blog/2014/05/25/so-you-want-to-write-your-own-CSV-code/

 This annoyed me when I was first starting out with galaxy, I really wish
 it'd be labelled TSV. The labels all read CSV so I gave galaxy CSV
 data and galaxy didn't like it, much to my confusion.

Which labels say CSV at the moment?

(And yes, I would also have preferred tsv to tabular as the datatype
name in Galaxy, that way it would match the typical file extension).

 Also, most (biologists) I work with use the term CSV very generically
 without regard to the differences between the two.

I've seen that too - but people saying CSV when they mean TSV
will unavoidably cause confusion.

 Also beware that anything other than MS Excel could be
 confused by quirks in the Excel format, e.g. multiple ways
 to record dates: http://support.microsoft.com/kb/180162

 I would personally save each tab of the Excel sheet as tab
 separated data, and import those into Galaxy.

 Would it not make sense to have an XLS to TSV datatype converter? I'm
 sure many biologists would appreciate being able to use the in-galaxy
 version as opposed to having to open+re-save all of their data.

It makes sense to me to offer a tool mapping one Excel file to
multiple tabular output files (one per sheet). How best to write
this will depend on the platform and available dependencies
(e.g. some of the R converters for this are Windows only IIRC).

Peter


Re: [galaxy-dev] TestToolShed failure, Exception: History in error state.

2014-07-31 Thread Peter Cock
Hi Dave,

You are right that on closer inspection I've mixed tool_dependencies.xml
and repository_dependencies.xml *again*. Evidently my mental model
does not match Greg's here:

(*) I need to define a tool installation recipe for something not in the
Tool Shed -- write an install script called tool_dependencies.xml

(*) I need to depend on a Python package by pointing at another repository
in the Tool Shed -- repository_dependencies.xml

(*) I need to depend on a datatype package by pointing at another repository
in the Tool Shed -- repository_dependencies.xml

(*) I need to depend on a binary package by pointing at another repository
in the Tool Shed -- repository_dependencies.xml ? No. You need
tool_dependencies.xml for this too.

But that aside, the test framework error here is completely unhelpful.

Why is there no error message about missing a dependency?
Was there an error from running my tool which was not shown?

Thanks,

Peter

On Wed, Jul 30, 2014 at 6:07 PM, Dave Bouvier d...@bx.psu.edu wrote:
 Peter,

 I believe part of the problem is that the install and test framework is
 unable to resolve the dependency on blast+ 2.2.29 because it is defined as a
 repository dependency, not a tool dependency. I would recommend replacing
 the repository dependency in the blast_rbh repository with a tool dependency
 definition that references package_blast_plus_2_2_29.

 --Dave B.

 On 07/30/2014 05:27 AM, Peter Cock wrote:

 Hi all,

 I'm not sure when this started (having hardly looked at my Tool Shed
 test results since GCC2014), but I think this is a fairly recent problem
 with my BLAST RBH tests failing (which has held me back from posting
 this to the main Tool Shed).

 This could be some silly mistake in my tar-ball, but usually missing
 test files and the like get an explicit error. The tests are passing
 on my GitHub/TravisCI setup (using Twill and the API backend):
 e.g. https://travis-ci.org/peterjc/galaxy_blast/builds/30592097

 Here is the current error (the same for the last few Test Tool Shed
 runs), https://testtoolshed.g2.bx.psu.edu/view/peterjc/blast_rbh

 Traceback (most recent call last):
   File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py", line 108, in test_tool
     self.do_it( td )
   File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py", line 35, in do_it
     self._verify_outputs( testdef, test_history, shed_tool_id, data_list, galaxy_interactor )
   File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py", line 69, in _verify_outputs
     galaxy_interactor.verify_output( history, output_data, output_testdef=output_testdef, shed_tool_id=shed_tool_id, maxseconds=maxseconds )
   File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 53, in verify_output
     self.wait_for_history( history_id, maxseconds )
   File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 107, in wait_for_history
     self.twill_test_case.wait_for( lambda: not self.__history_ready( history_id ), maxseconds=maxseconds)
   File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/twilltestcase.py", line 2453, in wait_for
     result = func()
   File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 107, in <lambda>
     self.twill_test_case.wait_for( lambda: not self.__history_ready( history_id ), maxseconds=maxseconds)
   File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 257, in __history_ready
     return self._state_ready( state, error_msg="History in error state." )
   File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 316, in _state_ready
     raise Exception( error_msg )
 Exception: History in error state.

 Is a more detailed log available which might help debug this?

 Thanks,

 Peter

 As an aside, this looks like the Test Tool Shed is still using the
 Twill backend for the functional tests?

Re: [galaxy-dev] TestToolShed failure, Exception: History in error state.

2014-07-31 Thread Peter Cock
On Thu, Jul 31, 2014 at 5:21 PM, bjoern.gruen...@googlemail.com
bjoern.gruen...@gmail.com wrote:
 Hi Peter,


 2014-07-31 10:57 GMT+02:00 Peter Cock p.j.a.c...@googlemail.com:

 Hi Dave,

 You are right that on closer inspection I've mixed tool_dependencies.xml
 and repository_dependencies.xml *again*. Evidently my mental model
 does not match Greg's here:

 (*) I need to define a tool installation recipe for something not in the
 Tool Shed -- write an install script called tool_dependencies.xml

 (*) I need to depend on a Python package by pointing at another repository
 in the Tool Shed -- repository_dependencies.xml

 I might be wrong, but I think that also goes to tool_dependencies.xml

Correct, e.g.
https://github.com/peterjc/pico_galaxy/tree/master/tools/seq_select_by_id

Thanks!


 (*) I need to depend on a datatype package by pointing at another
 repository in the Tool Shed -- repository_dependencies.xml

 (*) I need to depend on a binary package by pointing at another repository
 in the Tool Shed -- repository_dependencies.xml ? No. You need
 tool_dependencies.xml for this too

 As far as I understood, everything that is referenced in the tool.xml under
 the requirement section, needs to be in a tool_dependencies.xml file. Any
 other dependency are from the repository (data_types, data_manager,
 workflows ...).

 Ciao,
 Bjoern

Sure, there is a logic here - but it's a definition which I seem to still
struggle with :(
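
Bjoern's rule of thumb can be checked mechanically. Below is a minimal
sketch which lists the package requirements declared in a tool wrapper,
i.e. the things which would each need an entry in tool_dependencies.xml.
The embedded XML is a cut-down hypothetical wrapper (the tool id and the
version numbers are made up), and the function name is mine.

```python
import xml.etree.ElementTree as ET

# A cut-down, hypothetical tool wrapper used only for illustration.
TOOL_XML = """<tool id="example" name="Example" version="0.0.1">
    <requirements>
        <requirement type="package" version="2.2.29">blast+</requirement>
        <requirement type="package" version="1.62">biopython</requirement>
    </requirements>
</tool>"""


def packages_needing_tool_dependencies(tool_xml):
    """Return (name, version) for each package requirement in a tool XML string."""
    root = ET.fromstring(tool_xml)
    return [
        (req.text.strip(), req.get("version"))
        for req in root.iter("requirement")
        if req.get("type") == "package"
    ]


for name, version in packages_needing_tool_dependencies(TOOL_XML):
    print("%s %s -> tool_dependencies.xml" % (name, version))
```

Anything the repository needs which is not listed this way (datatypes,
data managers, workflows) would then go in repository_dependencies.xml.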

 But that aside, the test framework error here is completely unhelpful.

 Why is there no error message about missing a dependency?
 Was there an error from running my tool which was not shown?

 Thanks,

 Peter

I'd still like to get a more explicit error from the test suite than
"History in error state" though ;)

Regards,

Peter


[galaxy-dev] TestToolShed failure, Exception: History in error state.

2014-07-30 Thread Peter Cock
Hi all,

I'm not sure when this started (having hardly looked at my Tool Shed
test results since GCC2014), but I think this is a fairly recent problem
with my BLAST RBH tests failing (which has held me back from posting
this to the main Tool Shed).

This could be some silly mistake in my tar-ball, but usually missing
test files and the like get an explicit error. The tests are passing
on my GitHub/TravisCI setup (using Twill and the API backend):
e.g. https://travis-ci.org/peterjc/galaxy_blast/builds/30592097

Here is the current error (the same for the last few Test Tool Shed
runs), https://testtoolshed.g2.bx.psu.edu/view/peterjc/blast_rbh

Traceback (most recent call last):
  File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py", line 108, in test_tool
    self.do_it( td )
  File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py", line 35, in do_it
    self._verify_outputs( testdef, test_history, shed_tool_id, data_list, galaxy_interactor )
  File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py", line 69, in _verify_outputs
    galaxy_interactor.verify_output( history, output_data, output_testdef=output_testdef, shed_tool_id=shed_tool_id, maxseconds=maxseconds )
  File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 53, in verify_output
    self.wait_for_history( history_id, maxseconds )
  File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 107, in wait_for_history
    self.twill_test_case.wait_for( lambda: not self.__history_ready( history_id ), maxseconds=maxseconds)
  File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/twilltestcase.py", line 2453, in wait_for
    result = func()
  File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 107, in <lambda>
    self.twill_test_case.wait_for( lambda: not self.__history_ready( history_id ), maxseconds=maxseconds)
  File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 257, in __history_ready
    return self._state_ready( state, error_msg="History in error state." )
  File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/interactor.py", line 316, in _state_ready
    raise Exception( error_msg )
Exception: History in error state.

Is a more detailed log available which might help debug this?

Thanks,

Peter

As an aside, this looks like the Test Tool Shed is still using the
Twill backend for the functional tests?


[galaxy-dev] Uploads with embedded citations causing red error on Tool Shed

2014-07-30 Thread Peter Cock
Hi John,

Following the work at the BOSC 2014 CodeFest to support
embedded citations within Galaxy Tool XML files [1], and your
work adding this to the BLAST tools as an example [2], I tried
uploading a minor tool using this to the Tool Shed.

The upload seems to have worked, but there was a scary
red error message about metadata... see below (both main
and test toolsheds affected).

Regards,

Peter


[1] https://bitbucket.org/galaxy/galaxy-central/pull-request/440/

[2] 
https://github.com/peterjc/galaxy_blast/commit/9d2e3906915895765ecc3f48421b91fabf2ccd8b

--

Uploading on the Test Tool Shed, red error:

Metadata may have been defined for some items in revision
'796dc2ff8e8e'. Correct the following problems if necessary and reset
metadata.
blastxml_to_top_descr.xml - 'UniverseApplication' object has no
attribute 'citations_manager'

https://testtoolshed.g2.bx.psu.edu/view/peterjc/blastxml_to_top_descr

--

Uploading on the Tool Shed, red error:

Metadata may have been defined for some items in revision
'fe1ed74793c9'. Correct the following problems if necessary and reset
metadata.
blastxml_to_top_descr.xml - 'UniverseApplication' object has no
attribute 'citations_manager'

https://toolshed.g2.bx.psu.edu/view/peterjc/blastxml_to_top_descr


Re: [galaxy-dev] Providing BLAST db in a data library

2014-07-30 Thread Peter Cock
On Wed, Jul 30, 2014 at 11:52 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
 Dear Nate, dear Peter

 Again, sorry for the delay in replying.

 Yes I can. It looks like this

 [galaxy@srv ~]$ cat /galaxy/database/files/081/dataset_81002.dat
 [galaxy@srv ~]$ ls /galaxy/database/files/081/dataset_81002_files/
 blastdb.nhd  blastdb.nhi  blastdb.nhr  blastdb.nin  blastdb.nog
 blastdb.nsd  blastdb.nsi  blastdb.nsq

Good. Thanks for confirming that.

 I think the simplest solution would be to put something in the primary
 file. Just a short string that gets the file size above 0.

That won't help with all the existing datasets out there - I think we
rather need to fix something in the Galaxy code for composite files...

 I personally have followed your initial suggestion and made the dbs
 available globally via the .loc file.

 Thanks again
 Ulf

Great.

Peter


Re: [galaxy-dev] Uploads with embedded citations causing red error on Tool Shed

2014-07-30 Thread Peter Cock
Thanks John - is there any point/benefit to re-uploading
my tool once the fix is live on the Tool Shed?

i.e. Was it a harmless warning?

Peter


On Wed, Jul 30, 2014 at 12:12 PM, John Chilton jmchil...@gmail.com wrote:
 Hey Peter,

 Oops, sorry about that and thanks for the bug report. The tool shed
 code should be fixed with
 https://bitbucket.org/galaxy/galaxy-central/commits/38ba45d6ba5be65b3b743fc08739e16cd6e0ac8f
 - it is in next-stable so I think the tool shed should pick up that
 fix at next tool shed update.

 -John

 On Wed, Jul 30, 2014 at 5:43 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 Hi John,

 Following the work at the BOSC 2014 CodeFest to support
 embedded citations within Galaxy Tool XML files [1], and your
 work adding this to the BLAST tools as an example [2], I tried
 uploading a minor tool using this to the Tool Shed.

 The upload seems to have worked, but there was a scary
 red error message about metadata... see below (both main
 and test toolsheds affected).

 Regards,

 Peter


 [1] https://bitbucket.org/galaxy/galaxy-central/pull-request/440/

 [2] 
 https://github.com/peterjc/galaxy_blast/commit/9d2e3906915895765ecc3f48421b91fabf2ccd8b

 --

 Uploading on the Test Tool Shed, red error:

 Metadata may have been defined for some items in revision
 '796dc2ff8e8e'. Correct the following problems if necessary and reset
 metadata.
 blastxml_to_top_descr.xml - 'UniverseApplication' object has no
 attribute 'citations_manager'

 https://testtoolshed.g2.bx.psu.edu/view/peterjc/blastxml_to_top_descr

 --

 Uploading on the Tool Shed, red error:

 Metadata may have been defined for some items in revision
 'fe1ed74793c9'. Correct the following problems if necessary and reset
 metadata.
 blastxml_to_top_descr.xml - 'UniverseApplication' object has no
 attribute 'citations_manager'

 https://toolshed.g2.bx.psu.edu/view/peterjc/blastxml_to_top_descr


Re: [galaxy-dev] Providing BLAST db in a data library

2014-07-28 Thread Peter Cock
On Mon, Jul 28, 2014 at 8:28 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
 Dear Nate, dear Peter

 Sorry for the delay in replying.

 I can import both HTML and blastdb from a history to a data library. If
 I try to get the data out of the library into another history, I am
 successful for the html but not for the blastdb. The problem seems to be
 that the primary data file (the /path/dataset_12345.dat) is empty for
 the blastdb, while the html primary file has something in it.

OK. Can you tell where Galaxy thinks the library files are on disk,
and check to see if the folder of BLAST database files is actually
there?

 When I try to import the blastdb (from library to history) there is a
 message along the lines of "can't import empty file". I hypothesise
 (admittedly without having looked at a line of code) that there is a
 test for file size 0 somewhere that is either altogether unnecessary or,
 more likely, does not take into account that for composite datatypes it
 might be completely legitimate for the primary file to be empty.

This guess makes sense - but I've not yet tried to trace through
the code either.

 Or is my primary blastdb file not supposed to be empty in the first
 place? I can blast against it just fine.

The BLAST databases do not define/populate a primary file, so
Galaxy seems to create a dummy empty file on its own. I have
wondered about altering the BLAST database datatype definition
to have a human readable text file as the primary file (i.e. the
information currently saved as a text log file when creating a
database).
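
For what it is worth, the kind of check Ulf is hypothesising about could
be made aware of composite datatypes along these lines. This is a sketch
only, and not taken from the Galaxy code base: the function name and the
idea of passing in the extra files folder are my assumptions.

```python
import os
import tempfile


def dataset_has_content(primary_path, extra_files_dir=None):
    """Return True if the primary file or the extra files folder holds any data."""
    if os.path.isfile(primary_path) and os.path.getsize(primary_path) > 0:
        return True
    if extra_files_dir and os.path.isdir(extra_files_dir):
        for name in os.listdir(extra_files_dir):
            path = os.path.join(extra_files_dir, name)
            if os.path.isfile(path) and os.path.getsize(path) > 0:
                return True
    return False


# Recreate the layout from this thread in a temporary folder: an empty
# primary file plus a *_files folder holding the real database files.
with tempfile.TemporaryDirectory() as tmp:
    primary = os.path.join(tmp, "dataset_12345.dat")
    extra = os.path.join(tmp, "dataset_12345_files")
    open(primary, "w").close()
    os.mkdir(extra)
    with open(os.path.join(extra, "blastdb.nsq"), "wb") as handle:
        handle.write(b"\x00")
    print(dataset_has_content(primary))         # primary file alone
    print(dataset_has_content(primary, extra))  # with the extra files folder
```

A test on the primary file size alone would reject this dataset, while
also looking in the extra files folder would accept it.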

 Thanks a lot for your help
 Ulf

You too - you've found an interesting bug...

Peter


Re: [galaxy-dev] tool xml substitutes special characters in text parameter

2014-07-28 Thread Peter Cock
On Mon, Jul 28, 2014 at 11:03 AM, Wolfgang Maier
wolfgang.ma...@biologie.uni-freiburg.de wrote:
 Dear all,

 I noticed that with params of type text Galaxy seems to replace certain
 characters before passing them to the shell. As examples, it changes @ to
 __at__, } to __cc__ and \ to X.
 Is this the standard behavior or am I doing something wrong ? And if it's
 standard, are there workarounds ?

 Best,
 Wolfgang

This is standard behaviour to prevent special characters being used
to construct malicious command lines. It can be configured within
your tool definition using the sanitizer tag set:

https://wiki.galaxyproject.org/Admin/Tools/ToolConfigSyntax#A.3Csanitizer.3E_tag_set
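
To illustrate the idea (this is not Galaxy's actual code), the
substitution Wolfgang describes behaves roughly like the sketch below.
The VALID set is my assumption, and MAPPING holds only the two named
replacements reported in this thread; anything else outside the valid
set becomes X.

```python
import string

# Assumed set of characters passed through unchanged; the real default
# set is defined by Galaxy and may differ.
VALID = set(string.ascii_letters + string.digits + " -=_.()/+*^,:?!")
# Only the two named replacements reported in this thread.
MAPPING = {"@": "__at__", "}": "__cc__"}


def sanitize(text, valid=VALID, mapping=MAPPING, invalid="X"):
    """Replace each character outside the valid set with its mapping or X."""
    out = []
    for char in text:
        if char in valid:
            out.append(char)
        else:
            out.append(mapping.get(char, invalid))
    return "".join(out)


print(sanitize("user@example\\path}"))
```

The sanitizer tag set lets a tool author change both the valid set and
the mapping for a given parameter.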

Peter


Re: [galaxy-dev] tool xml substitutes special characters in text parameter

2014-07-28 Thread Peter Cock
On Mon, Jul 28, 2014 at 12:23 PM, Wolfgang Maier
wolfgang.ma...@biologie.uni-freiburg.de wrote:
 On 28.07.2014 12:22, Peter Cock wrote:

 This is standard behaviour to prevent special characters being used
 to construct malicious command lines. It can be configured within
 your tool definition using the sanitizer tag set:

 https://wiki.galaxyproject.org/Admin/Tools/ToolConfigSyntax#A.3Csanitizer.3E_tag_set

 Peter


 Thanks a lot, Peter, that solved my problem !

 Unfortunately, with this one fixed I now run into an additional one:

 There is one free-text text field defined in my tool, which should accept
 characters outside the standard ascii code range (i.e. > 127), in
 particular, German Umlaute äöüÄÖÜ.

Hmm. This may be very complicated since a lot will depend on the
local server/cluster's locale settings. Not everything will be UTF-8.
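
A quick Python 3 illustration of why the encoding matters (a generic
sketch, nothing Galaxy specific):

```python
text = "äöüÄÖÜ"

# The same characters produce different bytes under different encodings.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")
print(len(utf8_bytes), len(latin1_bytes))

# Decoding UTF-8 bytes with the wrong codec silently gives mojibake;
# ascii() is used so the result prints on any terminal.
print(ascii(utf8_bytes.decode("latin-1")))

# A process restricted to ASCII cannot represent the text at all.
try:
    text.encode("ascii")
except UnicodeEncodeError as err:
    print("ASCII failed:", err.reason)
```

So the text has to be encoded and decoded consistently at every step
between the web form, the job script and the tool itself.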

Peter


Re: [galaxy-dev] Providing BLAST db in a data library

2014-07-23 Thread Peter Cock
On Wed, Jul 23, 2014 at 10:47 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
 Dear all

 I have several smallish BLAST databases that I would like to provide in
 a data library. I create them in a history with the makeblastdb tool and
 then try to add them to the library. I see that for each blast db there
 is an empty file created (like /path/dataset_12345.dat) and a folder
 with the same name (/path/dataset_12345_files/) that contains the actual
 db files (blastdb.n*).

 In my library the blastdb shows up empty and I cannot import it back to
 another history. It does not seem to be aware of the _files folder,
 despite it being the right data type (blastdbn).

 Any ideas what I am doing wrong?

 Thanks a lot for your help
 Ulf

Hi Ulf,

I've never tried that. It could be a bug in Galaxy importing
composite datatypes into a library, or something in the BLAST
database definition which needs fixing. Does importing an
HTML report (with child files like images) into a library work
for you? (This is another composite datatype so a useful
comparison).

Rather than using Data Libraries, we just list all the locally
installed shared BLAST databases via the BLAST *.loc
files instead.

Note using the *.loc files makes the databases available to
all the Galaxy users, while with a Data Library you can
control access to specific groups/roles.

Regards,

Peter


Re: [galaxy-dev] Providing BLAST db in a data library

2014-07-23 Thread Peter Cock
Interesting hypothesis - you may well be right.

Galaxy guys - who is the expert to talk to on this and/or where
in the code should we be looking?

Thanks,

Peter

On Wed, Jul 23, 2014 at 11:22 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
 Dear Peter

 Thanks for your reply.

 I can import an html report (e.g. FastQC output) successfully into a new
 history from a data library. But the .dat file for the html is not empty
 like the one for the blastdb. Makes me think that I could do this with a
 blast db as well, if only it would not check for size 0 at the time of
 importing it.

 Thanks
 Ulf


Re: [galaxy-dev] Once-run galaxy archives

2014-07-23 Thread Peter Cock
On Wed, Jul 23, 2014 at 2:42 PM, John Chilton jmchil...@gmail.com wrote:
 Problem with automation is I could create dozens of templates over the
 next several years and consume less time in aggregate than it would
 take me to automate this. Nonetheless, there is a documentation
 component here that is important so I did enough to document - if
 someone wants to automate from there feel free.

 The template is just a copy of the sqlite database after a fresh
 Galaxy is launched. I usually just do this against whatever
 development instance of Galaxy I am working on. For completeness
 though I have put together a script to automate this task against a
 fresh install (I think):

 https://github.com/jmchilton/galaxy-downloads/blob/master/build_sqlite_template.sh

 Good luck!

 -John

Perfect - documentation target achieved :)

In terms of speeding up things like TravisCI using these SQLite
database templates, refreshing this every few schema bumps
should be enough.

Thanks,

Peter


Re: [galaxy-dev] Once-run galaxy archives

2014-07-22 Thread Peter Cock
On Mon, Jul 21, 2014 at 6:51 PM, Eric Rasche rasche.e...@yandex.ru wrote:
 Currently the checkout options consist of hg clones, and archives that
 mercurial produces.

 Having pulled or cloned galaxy a few times lately, I'm wondering if anyone
 would have a use for a once-run galaxy instance in an archive? I.e., I'd
 clone, run once to grab eggs and do the db migration, then re-tar result and
 store online. Might cut down on build/test times for those who are using
 travis or other CIs. Thoughts/opinions?

Hi Eric,

Given how close you can get now for minimal effort,
this seems unnecessary.

http://blastedbio.blogspot.co.uk/2013/09/using-travis-ci-for-testing-galaxy-tools.html

My TravisCI setup fetches the latest Galaxy as
a tar ball (from a GitHub mirror as it was faster than
a git clone which was faster than getting the tar ball
from BitBucket, which in turn was faster than using
hg clone), and a pre-migrated SQLite database
(using a bit of Galaxy functionality originally with
$GALAXY_TEST_DB_TEMPLATE added to speed
up running the functional tests).

Note this does not cache the eggs and all the other
side effects of the first run like creating config files,
so there is room for some speed up.

Regards,

Peter


Re: [galaxy-dev] Once-run galaxy archives

2014-07-22 Thread Peter Cock
On Tue, Jul 22, 2014 at 1:15 PM, Eric Rasche rasche.e...@yandex.ru wrote:
 Hi Peter,

 On July 22, 2014 3:15:41 AM CDT, Peter Cock p.j.a.c...@googlemail.com wrote:

Given how close you can get now for minimal effort,
this seems unnecessary.

http://blastedbio.blogspot.co.uk/2013/09/using-travis-ci-for-testing-galaxy-tools.html

My TravisCI setup fetches the latest Galaxy as
a tar ball (from a GitHub mirror as it was faster than
a git clone which was faster than getting the tar ball
from BitBucket, which in turn was faster than using
hg clone),

 Yes, that post was at least part of the thinking behind this.

:)

 .., and a pre-migrated SQLite database
(using a bit of Galaxy functionality originally with
$GALAXY_TEST_DB_TEMPLATE added to speed
up running the functional tests).

Apologies for grammatical error - I pasted in the environment
variable at the wrong point in the sentence.

 I know I've seen that used but was never able to get that
 working in practice (then again I didn't try that hard). If
 that's a working/usable feature, then that is already the
 majority of setup time.

Yes, the creation of the test-database and all the migrations
was an obvious low-hanging fruit when we were looking at
making running the tool functional tests faster - although
originally in the context of running the tests on a local
development Galaxy instance.

As to using this in practice, currently my TravisCI setup has:

export 
GALAXY_TEST_DB_TEMPLATE=https://github.com/jmchilton/galaxy-downloads/raw/master/db_gx_rev_0117.sqlite

I also added that line at the start of my local copy of script
run_functional_tests.sh to benefit from this while doing
development. That should be all there is to it (but from
memory, this is only for use with the SQLite backend).
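
Conceptually, all the template does is seed the test database from an
existing file rather than building it from scratch. A rough sketch of
that idea follows; it is not the actual Galaxy implementation, and the
function name and the file names are hypothetical.

```python
import os
import shutil
import tempfile
import urllib.request


def prepare_test_db(db_path, template=None):
    """Seed db_path from a template if one is given; return True if seeded.

    template may be a local path or an http(s) URL. When it is None the
    caller would fall back to creating the database from scratch.
    """
    if not template:
        return False
    if template.startswith(("http://", "https://")):
        urllib.request.urlretrieve(template, db_path)
    else:
        shutil.copyfile(template, db_path)
    return True


# Local demonstration with a stand-in template file (no network needed).
with tempfile.TemporaryDirectory() as tmp:
    template = os.path.join(tmp, "template.sqlite")
    with open(template, "wb") as handle:
        handle.write(b"stand-in for a migrated SQLite database")
    target = os.path.join(tmp, "test.sqlite")
    print(prepare_test_db(target, template), os.path.exists(target))
```

In real use the template value would come from the
GALAXY_TEST_DB_TEMPLATE environment variable shown above.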

John - could you add a current schema snapshot to
https://github.com/jmchilton/galaxy-downloads/ please?

Note this does not cache the eggs and all the other
side effects of the first run like creating config files,
so there is room for some speed up.

 Eggs would be nice but not the biggest thing in the world.

Right. I do like your idea of automatically generated
Docker images for the cutting edge or for each stable
release though (even if I have no personal need for them
at the moment).

Regards,

Peter
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] Basic Questions

2014-07-22 Thread Peter Cock
Set yourself as an administrator, and you can import the files
from disk (and link to them if you wish to avoid a copy) as part
of a data library. See:

https://wiki.galaxyproject.org/Admin/DataLibraries/UploadingLibraryFiles

Peter

On Tue, Jul 22, 2014 at 3:52 PM, Mark Lindsay m.a.lind...@bath.ac.uk wrote:
 Apologies if this sounds like a basic question or if I am enquiring of the 
 incorrect list.

 I have just had a local instance of galaxy installed on my MacPro.

 Could somebody inform me of the best options for loading large BAM files 
 (5Gb) from the same hard drive into this instance of Galaxy. It states that
 it is not possible to load files > 2Gb and that you must use either a URL or
 FTP.

 My scripting knowledge is virtually non-existent... although I have access to
 people that do.

 Cheers

 Mark





Re: [galaxy-dev] Once-run galaxy archives

2014-07-22 Thread Peter Cock
On Tue, Jul 22, 2014 at 7:41 PM, Eric Rasche rasche.e...@yandex.ru wrote:
 John,

 How are those generated? Would you be amenable to scripting that
 portion and running it once a month? (...say in a cron job, with a
 passwordless ssh key so you never have to touch it again)

 Cheers,
 Eric

How to generate it was going to be my next question too ;)

I'm impressed with Eric's zeal to automate things. Having a script
for making the SQLite template would be good - under git in the
same repository?

Peter

P.S. The schema version 120 template works great, thanks!:

https://travis-ci.org/peterjc/pico_galaxy/builds/30592828
https://travis-ci.org/peterjc/galaxy_blast/builds/30592097


Re: [galaxy-dev] writing datatypes

2014-07-20 Thread Peter Cock
On Sun, Jul 20, 2014 at 6:23 PM, Björn Grüning bjo...@gruenings.eu wrote:
 Hi,

 single datatype definitions only work if you haven’t defined any converters.
 Let's assume I have a datatype X and want to ship an X -> Y converter (Y -> X
 is also possible); won't we end up with a dependency loop? The X
 repository will depend on the Y repository, but Y depends on X, because
 we want to include a Y -> X converter.

 Any idea how to solve that?

Excellent example!

 How to handle versions of datatypes? Extra repositories for stockholm 1.0
 and 1.1? If so ... the associated Python file (sniffing, splitting ...)
 should also be versioned, shouldn't it? What happens if I have two
 stockholm.py files in my system?

Potentially you might need/want to define those as two different
Galaxy datatypes?

 @Peter, can we create a stripped-down, Python-only Biopython egg? All parsers
 should be included, Bio.SeqIO should be sufficient I think.

Right now, yes in principle (and this is fine from the licence point of view),
but in practice this is a fair chunk of work. However, we are looking at
this - see https://github.com/biopython/biopython/issues/349

Peter


Re: [galaxy-dev] datatype dependencies

2014-07-18 Thread Peter Cock
On Fri, Jul 18, 2014 at 4:21 PM, Eric Rasche rasche.e...@yandex.ru wrote:
 On 07/18/2014 09:49 AM, John Chilton wrote:
 My understanding of the code is that tool shed dependencies (or local
 dependencies) will not be available to tool shed datatypes (for
 sniffing for instance). Sorry.

 I figured as much, not very surprising at all. Dependencies
 notwithstanding, the idea has some modicum of merit. There are plenty of
 people who have already written great parsers that throw up errors, why
 should datatypes re-write them?

Exactly - Trello request for the toolshed to handle both Python and
binary dependencies for datatypes?

(e.g. samtools is a binary dependency of the SAM/BAM datatypes,
used for conversion and indexing)

 If you want to hack up your local instance to resolve dependencies
 during the sniffing process that may be possible - my guess is you
 could add requirement tags to tools/data_source/upload.xml and the
 __SET_METADATA__ tool definition embedded in
 lib/galaxy/datatypes/registry.py - though I have not tried this.

 Well heck, at that point I'd just use the fact that I know I'm in
 lib/galaxy/datatypes to locate the BioPython dependency that was
 installed through greps, globs, and finds. Though I'll hold off on that
 for a better solution.

I'd manually install the Python dependencies as part of the Python
used to run Galaxy itself?

Peter


Re: [galaxy-dev] API v/s twill based testing

2014-07-18 Thread Peter Cock
On Fri, Jul 18, 2014 at 5:14 PM, Dave Bouvier d...@bx.psu.edu wrote:
 John, Peter,

 The buildbot builders are already using the api interactor for both
 functional tests and the install and test framework.

   --Dave B.

Great news.

When did that happen? Did it cause any regressions (and
can/did you flag those to the repository authors to alert them)?

Assuming that changeover went smoothly, what is the plan for
changing the default test back-end in the master branch of
galaxy-central (and thus eventually galaxy-stable) for those
running tool tests locally?

Thanks,

Peter


Re: [galaxy-dev] Multiple output tools in Workflow

2014-07-17 Thread Peter Cock
On Thu, Jul 17, 2014 at 1:58 PM, Calogero Zarbo za...@fbk.eu wrote:
 Hello Peter,

 Thanks for your answer.

 I tried your code but I am still not able to make it work the way I want.
 I mean that in the workflow design page, when I switch the parameter, it
 doesn't change the graphical list of outputs of the tool. How can I fix it?
 I want something like the Input, where it shows different outputs according
 to the selected parameter from the select.

 Here is the XML code:


 <inputs>
     <param name="input_dataset" label="Input dataset (ShoweLab or FBK Format) to split"
            type="data" format="showelab-dataset,fbk-svm-dataset"/>

     <conditional name="format_condition">
         <param name="format_options" label="Choose the type of dataset you want to split"
                type="select">
             <option value="fbk">FBK Format</option>
             <option value="showelab" selected="True">ShoweLab Format</option>
         </param>
         <when value="fbk">
             <param name="input_fbk_dataset_labels"
                    label="Input dataset labels (FBK Format) to split"
                    type="data" format="fbk-labels"/>
         </when>
         <when value="showelab" />
     </conditional>
     <param name="split_perc" label="Percentage of Training Set among complete dataset"
            type="float" min="0.05" max="0.95" value="0.75"/>
 </inputs>
 <outputs>
     <data format="showelab-dataset" name="trainingDataset"
           label="Training Dataset extracted from ${input_dataset.name}">
         <filter>format_condition["format_options"] == "showelab"</filter>
     </data>
     <data format="showelab-dataset" name="validationDataset"
           label="Validation Dataset extracted from ${input_dataset.name}">
         <filter>format_condition["format_options"] == "showelab"</filter>
     </data>
     <data format="fbk-svm-dataset" name="trainingDataset"
           label="Training Dataset extracted from ${input_dataset.name}">
         <filter>format_condition["format_options"] == "fbk"</filter>
     </data>
     <data format="fbk-labels" name="trainingLabels"
           label="Training Dataset Labels extracted from ${input_fbk_dataset_labels.name}">
         <filter>format_condition["format_options"] == "fbk"</filter>
     </data>
     <data format="fbk-svm-dataset" name="validationDataset"
           label="Validation Dataset extracted from ${input_dataset.name}">
         <filter>format_condition["format_options"] == "fbk"</filter>
     </data>
     <data format="fbk-labels" name="validationLabels"
           label="Validation Dataset Labels extracted from ${input_fbk_dataset_labels.name}">
         <filter>format_condition["format_options"] == "fbk"</filter>
     </data>
 </outputs>

You appear to have multiple outputs defined with the same name (two
versions each of trainingDataset and validationDataset), which may be
the problem, as the name is meant to be unique.

I think you should have ONE data tag for trainingDataset but
set this up to switch output formats accordingly.
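
A quick way to spot this kind of clash before loading the tool is to
scan the wrapper for repeated output names. This is only a sketch: the
function name duplicate_output_names and the cut-down example wrapper
are made up for illustration, and Galaxy itself does not provide this
check as far as I know.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_output_names(tool_xml):
    """Return the names used by more than one <data> tag under <outputs>."""
    root = ET.fromstring(tool_xml)
    names = [data.get("name") for data in root.findall("./outputs/data")]
    return sorted(name for name, count in Counter(names).items() if count > 1)


# Cut-down, hypothetical wrapper with the same naming clash as above.
EXAMPLE = """
<tool id="split_example" name="Split example">
  <outputs>
    <data format="showelab-dataset" name="trainingDataset"/>
    <data format="fbk-svm-dataset" name="trainingDataset"/>
    <data format="fbk-labels" name="trainingLabels"/>
  </outputs>
</tool>
"""

if __name__ == "__main__":
    print(duplicate_output_names(EXAMPLE))
```

An empty list means every output name is unique.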

Peter


Re: [galaxy-dev] Multiple output tools in Workflow

2014-07-17 Thread Peter Cock
On Thu, Jul 17, 2014 at 2:43 PM, Calogero Zarbo za...@fbk.eu wrote:
 Ok, thanks for the tip.

 I changed the XML to this one:


 <outputs>
     <data format="showelab-dataset" name="trainingDataset"
           label="Training Dataset extracted from ${input_dataset.name}">
         <change_format>
             <when input="format_options" value="fbk" format="fbk-svm-dataset"/>
         </change_format>
     </data>
     <data format="showelab-dataset" name="validationDataset"
           label="Validation Dataset extracted from ${input_dataset.name}">
         <change_format>
             <when input="format_options" value="fbk" format="fbk-svm-dataset"/>
         </change_format>
     </data>

     <data format="fbk-labels" name="trainingLabels"
           label="Training Dataset Labels extracted from ${input_fbk_dataset_labels.name}">
         <filter>format_condition['format_options'] == "fbk"</filter>
     </data>

     <data format="fbk-labels" name="validationLabels"
           label="Validation Dataset Labels extracted from ${input_fbk_dataset_labels.name}">
         <filter>format_condition['format_options'] == "fbk"</filter>
     </data>
 </outputs>

 It is still not working. Maybe the problem is the
 <when input="format_options" value="fbk" format="fbk-svm-dataset"/> element,
 since the format_options parameter is inside a conditional tag?

 Thanks a lot for your time.

I'm not sure off hand - is your complete wrapper in a public repository
somewhere we can look at? However, my general advice would be:

First of all, get it working in the normal tool usage mode (tested by hand).

Then I would get it working with functional tests.

Finally I would test it by hand in the workflow editor, at which point any
problem is probably Galaxy's fault ;)

Peter


Re: [galaxy-dev] writing datatypes

2014-07-17 Thread Peter Cock
On Thu, Jul 17, 2014 at 4:31 PM, Eric Rasche rasche.e...@yandex.ru wrote:

 For those reading this thread from the future, there's a secret to
 adding completely new datatypes locally (and not through a toolshed).

 You have to manually edit lib/galaxy/datatypes/registry.py and import
 the module you've written at the top of the file.

 For instance, if you add a new gbk.py datatype, you'll need to add
 "import gbk" to the top of registry.py. This will cause your errors to
 go away and your datatype to be loaded on startup.

 Thanks to John Chilton for answering this on IRC.

 Cheers,
 Eric

Indeed - sorry I hadn't spotted that complication.

The README files for these datatype extensions may help:

https://github.com/peterjc/galaxy_blast/tree/master/datatypes/blast_datatypes
https://github.com/peterjc/pico_galaxy/tree/master/datatypes/mira_datatypes

I have to do this manually with some sed magic in my TravisCI
automated test setup, see:

http://blastedbio.blogspot.co.uk/2013/09/using-travis-ci-for-testing-galaxy-tools.html
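
For those who would rather patch registry.py from Python than with sed,
here is a rough equivalent. It is not the sed command from the blog
post. The function name add_datatype_import is made up, and placing the
new line just before the first existing import is my assumption about
what "the top of the file" should mean.

```python
def add_datatype_import(source, module_name):
    """Return source with 'import <module_name>' added before the first import.

    Hypothetical helper; the text is returned unchanged if already patched.
    """
    new_line = "import %s" % module_name
    lines = source.splitlines()
    if new_line in (line.strip() for line in lines):
        return source  # already patched, so do nothing
    for index, line in enumerate(lines):
        if line.startswith(("import ", "from ")):
            lines.insert(index, new_line)
            break
    else:
        # No import statements found, so put the new one on the first line.
        lines.insert(0, new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = '"""Registry."""\nimport os\nimport sys\n'
    print(add_datatype_import(example, "gbk"))
```

To use it, read lib/galaxy/datatypes/registry.py, pass the text through
the function, and write the result back. Running it twice is harmless.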

Peter


Re: [galaxy-dev] writing datatypes

2014-07-17 Thread Peter Cock
On Thu, Jul 17, 2014 at 5:45 PM, Björn Grüning
bjoern.gruen...@gmail.com wrote:
 Hi,

 I think you are right John. Datatypes have many issues in that regard, as far
 as I can tell from a few bug reports. Imho datatypes should be handled like
 Tool dependency definitions. There should be only one installable
 revision.

 But that aside, emboss datatypes are already broken. For example asn1 was
 added into Galaxy but it still exists in emboss_datatypes.

 Moreover, how do we add a proper genbank datatype with sniffer, split and merge
 functions? Ideally, every datatype should have its own repository, but that
 is an overhead I would like to omit ... any other ideas?

 I would love to discuss that issue further, maybe a hangout with Greg and
 Peter?

 Thanks John for your input,
 Bjoern

This could be high level, e.g. other sequence file formats repository
covering GenBank, EMBL, SwissProt plain text, UniProt XML, etc;
one for multiple sequence alignments; one for EMBOSS' own output...

But it wouldn't be that much more work to do one ToolShed repo
per additional file format, would it?

One reason I have been meaning to do some of these is familiarity with
many of these formats from looking after/writing parsers in Biopython.

Having this done sooner rather than later ought to head off too many
incompatible datatype names which worries me. Is it too late to adopt
something like the EDAM ontology for the datatypes within Galaxy?

Peter


Re: [galaxy-dev] writing datatypes

2014-07-17 Thread Peter Cock
On Thu, Jul 17, 2014 at 5:55 PM, Eric Rasche rasche.e...@yandex.ru wrote:

 Not a problem Peter, it's a somewhat subtle bug to have, and there isn't
 a lot of documentation on the wiki about writing new datatypes (though I
 plan to fix that soon).

 That particular error message could stand to be a bit more explicit.
 (e.g., "Did you forget to add import mylib to registry.py?").

 Also, thanks for sharing the blog post. Since we develop all of our
 tools internally, I may adapt and publish your post with similar
 instructions for jenkins, if that's all right by you.

 Cheers,
 Eric

Please do :)

Peter

P.S. I know Saket is using this approach too now:
https://github.com/saketkc/galaxy_tools


Re: [galaxy-dev] writing datatypes

2014-07-17 Thread Peter Cock
On Thu, Jul 17, 2014 at 6:10 PM, Björn Grüning
bjoern.gruen...@gmail.com wrote:

 ... but the problem will stay the same ... one [datatype definition]
 repository can have multiple versions ...


I like your idea that, like tool dependency definitions, this should be a
special repository type on the ToolShed:

Earlier, Björn Grüning bjoern.gruen...@gmail.com wrote:

 Imho datatypes should be handled like Tool dependency definitions.
 There should be only one installable revision.


This is something Greg will have to comment on - there may be
ramifications I'm not seeing.

Peter


Re: [galaxy-dev] writing datatypes

2014-07-17 Thread Peter Cock
On Thu, Jul 17, 2014 at 6:28 PM, Eric Rasche rasche.e...@yandex.ru wrote:
 Am 17.07.2014 18:51, schrieb Peter Cock:

 One reason I have been meaning to do some of these is familiarity with
 many of these formats from looking after/writing parsers in Biopython.

 Peter, similar case here with BioPerl. All of my tools can output the
 full range of Bio::SeqIO output formats, so having datatypes would be
 great. Happy to contribute there.

Sounds good. The EMBOSS, BioPerl and Biopython projects have tried
to adopt consistent file format names (pre-dating the EDAM ontology),
but unfortunately the names adopted in Galaxy sometimes diverge :(

Peter


Re: [galaxy-dev] writing datatypes

2014-07-17 Thread Peter Cock
Good point Greg.

Let's refine this slightly then, a new special ToolShed repository type for
a *single* datatype definition. That avoids this problem :)

(This does not help with suites of very closely related datatypes -
like different kinds of BLAST database.)

Peter

On Thu, Jul 17, 2014 at 6:35 PM, Greg Von Kuster g...@bx.psu.edu wrote:
 This would be easy to implement, but could adversely affect reproducibility.
 If a repository containing datatypes always had only a single installable
 revision (i.e., the changelog tip), then any datatypes defined in an early
 changeset revision that are removed in a later changeset revision would
 no longer be available.

 Greg

 On Jul 17, 2014, at 1:30 PM, Peter Cock p.j.a.c...@googlemail.com wrote:

 On Thu, Jul 17, 2014 at 6:10 PM, Björn Grüning
 bjoern.gruen...@gmail.com wrote:

 ... but the problem will stay the same ... one [datatype definition]
 repository can have multiple versions ...


 I like your idea that, like tool dependency definitions, this should be a
 special repository type on the ToolShed:

 Earlier, Björn Grüning bjoern.gruen...@gmail.com wrote:

 Imho datatypes should be handled like Tool dependency definitions.
 There should be only one installable revision.


 This is something Greg will have to comment on - there may be
 ramifications I'm not seeing.

 Peter


Re: [galaxy-dev] Wiki datatypes tutorial

2014-07-17 Thread Peter Cock
On Thu, Jul 17, 2014 at 7:19 PM, Eric Rasche rasche.e...@yandex.ru wrote:

 Thanks to everyone for their assistance in my adventure of custom local
 datatypes. In response to this, I've added a new wiki page with a basic
 MWE/tutorial on adding datatypes. A complete example is at the end,
 because most people like copy+paste code to get them started.

 https://wiki.galaxyproject.org/Admin/Datatypes/AddingCompleteDatatypes
 Please feel free to add to it/fix things I completely misunderstood.

 I'm not sure what 80% of the functions that get called in datatypes do,
 nor where they're called from, so I can't offer much more detail in this
 wiki page than I already have. (E.g., when is split called? If I write a
 split method, how can I test it? What other methods should I implement?)

 Cheers,
 Eric

Hi Eric,

Good work :)

Split and merge are used when a tool has a <parallelism .../> tag
and this is enabled in your universe_wsgi.ini file. As an example,
see the BLAST wrappers, e.g.

https://github.com/peterjc/galaxy_blast/blob/master/tools/ncbi_blast_plus/ncbi_blastn_wrapper.xml

This will split on the query FASTA file, and merge on the output
file (which could be text, html, tabular, blastxml) using the output
datatype's merge method.
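
To make the idea concrete, here is a toy sketch of what split and merge
amount to for a FASTA query and a line-based output. This is not
Galaxy's implementation: the function names, the records_per_chunk
parameter and the plain concatenation merge are all assumptions made
for illustration.

```python
def split_fasta(text, records_per_chunk):
    """Split FASTA text into chunks of at most records_per_chunk records."""
    records = []
    for line in text.splitlines(True):  # True keeps the line endings
        if line.startswith(">") or not records:
            records.append("")  # start a new record at each header line
        records[-1] += line
    return ["".join(records[i:i + records_per_chunk])
            for i in range(0, len(records), records_per_chunk)]


def merge_text(parts):
    """Merge per-chunk outputs by simple concatenation."""
    return "".join(parts)


if __name__ == "__main__":
    fasta = ">a\nAC\n>b\nGT\n>c\nTT\n"
    for chunk in split_fasta(fasta, 2):
        print(repr(chunk))
```

Concatenation is only reasonable for line-based outputs such as tabular.
A format like blastxml needs a smarter merge, which is why the merge
lives on the output datatype.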

I had to work out a lot of this from reading the code and queries
on the mailing list.

Peter


Re: [galaxy-dev] datatype dependencies

2014-07-17 Thread Peter Cock
You could do something like that, and we already have
Biopython packages in the ToolShed which can be listed
as dependencies :)

However, some things like GenBank are tricky - in order
to tolerate NCBI dumps the Biopython parser will ignore
any free text before the first LOCUS line. A confusing
side effect is most text files are then treated as a
GenBank file with zero records. But if it came back
with some records it is probably OK :)

Basically Biopython also does not care to offer file
format detection simply because it is a can of worms.

Zen of Python - explicit is better than implicit.

We want you to tell us which format you want to try
parsing it as.

Sorry,

Peter
(Speaking as the Bio.SeqIO maintainer for Biopython)


On Thu, Jul 17, 2014 at 7:45 PM, Eric Rasche rasche.e...@yandex.ru wrote:

 Let's pretend for a second that I'm rather lazy (oh...wait), and I have
 ZERO interest in writing datatype parsers to sniff and validate whether
 or not a specific file is a specific datatype. I'm a sysadmin and
 bioinformatician, and I've worked with dozens of libraries that exist to
 parse file formats, and they all die in flames when I feed them bad data.

 Would it be possible to somehow define requirements for datatypes?

 I don't want to take on the burden of code I write saying "yes, I've
 sniffed+validated this and it is absolutely a genbank file". That's a
 lot of responsibility, especially if people have malformed genbank files
 and their tools fail as a result.

 I would like to do this with BioPython and turf the validation to
 another library that exists to parse genbank files, and that will raise an
 exception if they're invalid.

 def sniff(self, filename):
     from Bio import SeqIO
     try:
         # Parse the whole file; any parser error means "not GenBank"
         self.records = list(SeqIO.parse(filename, "genbank"))
         return True
     except Exception:
         self.records = None
         return False

 def validate(self, dataset):
     from Bio import SeqIO
     errors = list()
     try:
         self.records = list(SeqIO.parse(dataset.file_name, "genbank"))
     except Exception as e:
         errors.append(e)
     return errors

 def set_meta(self, dataset, **kwd):
     # Only record the count if an earlier parse succeeded
     if self.records is not None:
         dataset.metadata.number_of_sequences = len(self.records)

 so much easier! And I can shift the burden of validation and sniffing to
 upstream, rather than any failures being my fault and requiring
 maintenance of a complex sniffer.

 Cheers,
 Eric

 - --
 Eric Rasche
 Programmer II
 Center for Phage Technology
 Texas A&M University
 College Station, TX 77843
 404-692-2048
 e...@tamu.edu
 rasche.e...@yandex.ru


Re: [galaxy-dev] datatype dependencies

2014-07-17 Thread Peter Cock
On Thu, Jul 17, 2014 at 8:20 PM, Eric Rasche rasche.e...@yandex.ru wrote:
 On 07/17/2014 02:11 PM, Peter Cock wrote:
 You could do something like that, and we already have
 Biopython packages in the ToolShed which can be listed
 as dependencies :)


 If my module depends on the biopython from the toolshed, will that be
 accessible within a datatype? Would it be as simple as "from Bio import
 X"? Most of what I've seen of dependencies (and please forgive my lack
 of knowledge about them) consists of env.sh being sourced with paths to
 binaries, prior to tool run.

I don't know - this may well be a gap in the ToolShed
framework, since thus far most of the datatypes defined
have been self contained.

I have asked something similar before (in the context
of defining automatic file format conversion like the way
Galaxy can turn FASTA into tabular in input parameters
expecting tabular), where there could be a binary
dependency.

 However, some things like GenBank are tricky - in order
 to tolerate NCBI dumps the Biopython parser will ignore
 any free text before the first LOCUS line. A confusing
 side effect is most text files are then treated as a
 GenBank file with zero records. But if it came back
 with some records it is probably OK :)

 Interesting, very good to know.


 Basically Biopython also does not care to offer file
 format detection simply because it is a can of worms.

 Zen of Python - explicit is better than implicit.

 We want you to tell us which format you want to try
 parsing it as.

 Yes! Exactly! Which is why it's perfectly fine here:

 SeqIO.parse( dataset.file_name, "genbank" )

 All I want to know is whether or not this parses as a genbank file (and
 has 1 or more records). BioPython may not do automatic format detection
 (yuck, agreed), but since I already know I'm looking for a genbank file,
 simply being able to parse it or not is good enough.

With those provisos, you should be OK :)
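
In code, those provisos might look like the sketch below. The parser is
passed in so the example runs without Biopython installed. The function
name parses_with_records is made up, and the file name in the usage
note is only a placeholder.

```python
def parses_with_records(parse, filename, fmt):
    """Return True only if parsing succeeds AND yields at least one record."""
    try:
        for _record in parse(filename, fmt):
            return True  # the first record is enough for a sniffer
    except Exception:
        return False  # the parser rejected the file
    return False  # parsed cleanly but found zero records


if __name__ == "__main__":
    def no_records(filename, fmt):
        # Stand-in parser that finds nothing, like plain text read as GenBank.
        return iter([])

    print(parses_with_records(no_records, "example.txt", "genbank"))
```

With Biopython available this would be called as
parses_with_records(SeqIO.parse, "example.gbk", "genbank") after
"from Bio import SeqIO". Returning on the first record also avoids
loading a large file into memory just to sniff it.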

Peter


Re: [galaxy-dev] writing datatypes

2014-07-16 Thread Peter Cock
Indeed - ideally (once working) we can upload under the IUC ToolShed as a
community maintained resource rather than under a personal account which
becomes a single point of failure (the bus factor).

We (the IUC) have previously discussed doing this so that the EMBOSS
datatypes could become more of a meta-entry depending on other smaller
specific datatype defining ToolShed repositories. But it hasn't reached the
top of my personal TODO list yet ;)

Peter

On Wed, Jul 16, 2014 at 1:47 PM, Björn Grüning
bjoern.gruen...@gmail.com wrote:
 Hi Eric,

 please have a look at:

 https://github.com/bgruening/galaxytools/blob/master/datatypes/msa_datatypes/datatypes_conf.xml

 You need something like:
 <datatype extension="genbank" type="galaxy.datatypes.data:Text"
 subclass="True" />

 Let's try to split the EMBOSS datatypes a little bit into small chunks. E.g.
 sequences_datatypes, msa_datatypes ... and so on ...

 Cheers,
 Bjoern


 On 14.07.2014 20:31, Eric Rasche wrote:


 I'm trying to add a new datatype to my galaxy instance for genbank
 files, however I'm running into various issues. I've followed the
 tutorial
 (https://wiki.galaxyproject.org/Admin/Datatypes/Adding%20Datatypes)

 however that example subclasses tabular, and I'd like to subclass Text
 as they're plain text files, and I'd like to be able to define a sniffer
 for them (not possible if your type="galaxy.datatypes.data:Text")

 I figured the call ought to be something like

 <datatype extension="gb" type="galaxy.datatypes.data:Genbank"
 subclass="True" />

 however, everything I try fails with

 Error importing datatype module galaxy.datatypes.data: 'module' object
 has no attribute 'Genbank'


 To avoid this particular issue, I tried writing a separate datatype just
 for genbank files (type="galaxy.datatypes.genbank:Genbank"), however
 that fails with the same error:

 galaxy.datatypes.registry ERROR 2014-07-14 13:23:23,100 Error importing
 datatype module galaxy.datatypes.genbank: 'module' object has no attribute
 'genbank'
 Traceback (most recent call last):
   File "/home/hxr/work/galaxy-central/lib/galaxy/datatypes/registry.py", line 206, in load_datatypes
     module = getattr( module, mod )
 AttributeError: 'module' object has no attribute 'genbank'


 Here's what my lib/galaxy/datatypes/genbank.py looks like:

 import pkg_resources
 pkg_resources.require( "bx-python" )
 import logging
 from galaxy.datatypes import data
 log = logging.getLogger(__name__)

 class Genbank( data.Text ):
     file_ext = "gb"

     def sniff( self, filename ):
         # A GenBank record starts with the LOCUS keyword
         with open(filename) as handle:
             header = handle.read(5)
         return header == 'LOCUS'


 To debug this, I've tried copying the tabular data type completely,
 removed all the classes other than Tabular, and renamed it Genbank,
 however this fails too with the same error.

 Can anyone offer some insight?

 Cheers,
 Eric
 ___
 Please keep all replies on the list by using reply all
 in your mail client.  To manage your subscriptions to this
 and other Galaxy lists, please use the interface at:
http://lists.bx.psu.edu/

 To search Galaxy mailing lists use the unified search at:
http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] Multiple output tools in Workflow

2014-07-16 Thread Peter Cock
On Wed, Jul 16, 2014 at 5:28 PM, Calogero Zarbo za...@fbk.eu wrote:
 Hello to everybody

 I'm developing my own tool that needs to change the number of output files
 according to a parameter selected by the user from a list in the inputs
 tag.
 How can I do this?

 Here is the XML code:
 <inputs>
     <param name="input_dataset" label="Input dataset" type="data"
            format="showelab-dataset,fbk-svm-dataset"/>
     <conditional name="format_condition">
         <param name="format_options" label="Choose the type of dataset you want to split" type="select">
             <option value="fbk">.fbk-svm-dataset Format</option>
             <option value="showelab" selected="True">.showelab-dataset Format</option>
         </param>
         <when value="fbk">
             <param name="input_fbk_dataset_labels" label="Input dataset labels" type="data" format="fbk-labels"/>
         </when>
     </conditional>
 </inputs>
 <outputs>
     <data format="showelab-dataset" name="trainingDataset" label="Training Dataset extracted from ${input_dataset.name}">
         <filter>format_options == "showelab"</filter>
     </data>
     <data format="showelab-dataset" name="validationDataset" label="Validation Dataset extracted from ${input_dataset.name}">
         <filter>format_options == "showelab"</filter>
     </data>
     <data format="fbk-svm-dataset" name="trainingDataset" label="Training Dataset extracted from ${input_dataset.name}">
         <filter>format_options == "fbk"</filter>
     </data>
     <data format="fbk-labels" name="trainingLabels" label="Training Dataset Labels extracted from ${input_fbk_dataset_labels.name}">
         <filter>format_options == "fbk"</filter>
     </data>
     <data format="fbk-svm-dataset" name="validationDataset" label="Validation Dataset extracted from ${input_dataset.name}">
         <filter>format_options == "fbk"</filter>
     </data>
     <data format="fbk-labels" name="validationLabels" label="Validation Dataset Labels extracted from ${input_fbk_dataset_labels.name}">
         <filter>format_options == "fbk"</filter>
     </data>
 </outputs>


 Basically, I would like the outputs displayed in the Workflow Canvas
 to change according to the format_options select parameter.

 Thanks in advance.

Hi Calogero,

I think this tool of mine would serve as a working example:
https://github.com/peterjc/pico_galaxy/tree/master/tools/seq_filter_by_id

Peter
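
For readers following along, here is a simplified model of how an output filter might be evaluated. It assumes (this is not stated in the thread) that each filter is a Python expression evaluated against the tool's parameters, and that a parameter defined inside a conditional is reached through the conditional's name. output_is_kept and the params dictionary are hypothetical, and this is not Galaxy's own code.

```python
def output_is_kept(filter_expression, params):
    """Evaluate a filter expression with the tool parameters as local names."""
    # eval is used only because these are trusted, hard-coded demo strings.
    return bool(eval(filter_expression, {"__builtins__": {}}, params))


# Parameters as they might look when the select sits inside a conditional.
params = {
    "input_dataset": "dataset_1.dat",
    "format_condition": {"format_options": "fbk"},
}

nested = 'format_condition["format_options"] == "fbk"'
flat = 'format_options == "fbk"'

print(output_is_kept(nested, params))

try:
    output_is_kept(flat, params)
except NameError as err:
    # In this model the bare name is not visible at the top level.
    print("flat expression failed:", err)
```

If the assumption holds, filters in the quoted XML would need to go through format_condition rather than use format_options directly.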


Re: [galaxy-dev] API v/s twill based testing

2014-07-16 Thread Peter Cock
On Wed, Jul 16, 2014 at 7:44 PM, Saket Choudhary sake...@gmail.com wrote:
 Thanks Peter, I guess I should then rely on API based tests.


If it is just the order, make sure the order of the output files in the test
is consistent with that in the outputs and it may be OK with Twill...
I wonder if I filed a Trello card on this, or just an email?

Peter
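
To make the ordering point concrete, the sketch below contrasts pairing expected files with outputs by position and by name. It is purely illustrative and is not Twill or Galaxy test code: the output names genes_output and variants_output and both helper functions are hypothetical, and only the two file names come from the thread.

```python
def pair_by_position(declared_outputs, expectations):
    """Pair the i-th declared output with the i-th expected file."""
    return {
        declared: expected_file
        for declared, (_, expected_file) in zip(declared_outputs, expectations)
    }


def pair_by_name(declared_outputs, expectations):
    """Pair each declared output with the expected file of the same name."""
    by_name = dict(expectations)
    return {declared: by_name[declared] for declared in declared_outputs}


# Order in the tool's outputs section.
declared_outputs = ["genes_output", "variants_output"]

# Order in the test section, deliberately different.
expectations = [
    ("variants_output", "chasm_output_variants.tabular"),
    ("genes_output", "chasm_output_genes.tabular"),
]

print(pair_by_position(declared_outputs, expectations))
print(pair_by_name(declared_outputs, expectations))
```

With positional pairing the genes output ends up compared against the variants file, which matches the symptom reported in this thread; pairing by name is unaffected by the order.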


Re: [galaxy-dev] writing datatypes

2014-07-15 Thread Peter Cock
Hi Eric

There is already a genbank format in the EMBOSS datatypes
(although there is talk of defining this and others in a set of
smaller repositories defined as its dependencies for more
modularity). Note it uses genbank not gb as the name!

https://toolshed.g2.bx.psu.edu/view/devteam/emboss_datatypes

However that doesn't answer your question :(

Peter

On Mon, Jul 14, 2014 at 7:31 PM, Eric Rasche rasche.e...@yandex.ru wrote:

 I'm trying to add a new datatype to my galaxy instance for genbank
 files, however I'm running into various issues. I've followed the
 tutorial (https://wiki.galaxyproject.org/Admin/Datatypes/Adding%20Datatypes)

 however that example subclasses tabular, and I'd like to subclass Text
 as they're plain text files, and I'd like to be able to define a sniffer
 for them (not possible if your type=galaxy.datatypes.data:Text)

 I figured the call ought to be something like

 <datatype extension="gb" type="galaxy.datatypes.data:Genbank"
 subclass="True" />

 however, everything I try fails with

 Error importing datatype module galaxy.datatypes.data: 'module' object has 
 no attribute 'Genbank'

 To avoid this particular issue, I tried writing a separate datatype just
 for genbank files (type=galaxy.datatypes.genbank:Genbank), however
 that fails with the same error:

 galaxy.datatypes.registry ERROR 2014-07-14 13:23:23,100 Error importing 
 datatype module galaxy.datatypes.genbank: 'module' object has no attribute 
 'genbank'
 Traceback (most recent call last):
   File "/home/hxr/work/galaxy-central/lib/galaxy/datatypes/registry.py", line 206, in load_datatypes
     module = getattr( module, mod )
 AttributeError: 'module' object has no attribute 'genbank'

 Here's what my lib/galaxy/datatypes/genbank.py looks like:

 import pkg_resources
 pkg_resources.require( "bx-python" )
 import logging
 from galaxy.datatypes import data
 log = logging.getLogger(__name__)

 class Genbank( data.Text ):
     file_ext = "gb"

     def sniff( self, filename ):
         # GenBank flat files start with the keyword LOCUS
         header = open(filename).read(5)
         return header == 'LOCUS'

 To debug this, I've tried copying the tabular data type completely,
 removed all the classes other than Tabular, and renamed it Genbank,
 however this fails too with the same error.

 Can anyone offer some insight?

 Cheers,
 Eric
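
One possible reading of the traceback quoted above (not confirmed anywhere in this message) is that the loader imports only the top-level package and then walks the remaining dotted names with getattr, which works only for submodules that something has already imported. The sketch below reproduces that behaviour with a throwaway package. demo_datatypes_pkg, resolve_by_getattr and resolve_by_import are hypothetical names, and this is not Galaxy's registry code.

```python
import importlib
import os
import sys
import tempfile


def resolve_by_getattr(dotted):
    """Import the top-level package only, then walk attributes (fragile)."""
    parts = dotted.split(".")
    module = __import__(parts[0])
    for name in parts[1:]:
        # Raises AttributeError if the submodule has not been imported yet.
        module = getattr(module, name)
    return module


def resolve_by_import(dotted):
    """Import the full dotted path, which loads the submodule itself."""
    return importlib.import_module(dotted)


# Build a tiny package on disk: demo_datatypes_pkg/genbank.py
root = tempfile.mkdtemp()
package_dir = os.path.join(root, "demo_datatypes_pkg")
os.mkdir(package_dir)
open(os.path.join(package_dir, "__init__.py"), "w").close()
with open(os.path.join(package_dir, "genbank.py"), "w") as handle:
    handle.write("class Genbank(object):\n    file_ext = 'gb'\n")
sys.path.insert(0, root)
importlib.invalidate_caches()

try:
    resolve_by_getattr("demo_datatypes_pkg.genbank")
    getattr_worked = True
except AttributeError as err:
    getattr_worked = False
    print("getattr walk failed:", err)

module = resolve_by_import("demo_datatypes_pkg.genbank")
print(module.Genbank.file_ext)
```

If this is what is happening, making sure the new module is imported before the datatype is looked up would be one way to test the theory.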


Re: [galaxy-dev] API v/s twill based testing

2014-07-15 Thread Peter Cock
Hi Saket,

From memory, the Twill tests are sensitive to the order of the output files in the XML.

John was discussing switching the default from the Twill to API backend,
not sure when that is happening though...

Peter

On Tue, Jul 15, 2014 at 9:31 AM, Saket Choudhary sake...@gmail.com wrote:
 I recently updated tests for one of my wrappers and came across this strange
 behaviour:

 The twill based testing reports a failure:
 https://travis-ci.org/saketkc/galaxy_tools/jobs/29956682#L1463

 whereas, the API based testing shows success:
 https://travis-ci.org/saketkc/galaxy_tools/jobs/29956683

 Unfortunately I cannot run these tests locally since I am behind a system
 proxy [Refer:
 http://dev.list.galaxyproject.org/Functional-Tests-and-ftype-td4664233.html]
 and have to rely on travis..

 The place where the Twill test fails shows that it is trying to compare the
 diff between 'chasm_output_genes.tabular' and
 'chasm_output_variants.tabular' instead of 'chasm_output_genes.tabular'.
 [See: https://travis-ci.org/saketkc/galaxy_tools/jobs/29956682#L1469]

 I tried running my tools locally and I did not come across any case where
 the 'variants' output gets replaced by the 'genes' output, thus possibly
 ruling out unexpected behavior from the tool's server end.

 Is this a possible bug or am I missing something?

 Saket



Re: [galaxy-dev] Escaped text from input fields

2014-07-05 Thread Peter Cock
Hi Renato,

Yes, Galaxy maps potentially problematic/dangerous characters
to escaped versions. You can control this, see sanitizer on:
https://wiki.galaxyproject.org/Admin/Tools/ToolConfigSyntax

Peter

On Sat, Jul 5, 2014 at 9:04 AM, Renato Alves rjal...@igc.gulbenkian.pt wrote:
 Hi everyone,

 I'm writing a wrapper that includes one text field.

 I'm then passing the contents of this field to the underlying tool with
 a simple:

   command $textfield

 However when a user inputs something like Test#1 the command ends up as:

   command Test__pd__1

 I did a quick search on the web and it seems to be the name of some
 escaping function in galaxy.

 Is there any way I can get the text field content across untouched?

 Thanks,
 Renato
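
The escaping described in this exchange can be pictured as a simple character-to-token mapping. The sketch below is an illustration only: the single entry mapping # to __pd__ is taken from the example in the question, the real table presumably contains more entries that are not shown here, and escape_text and unescape_text are hypothetical names rather than Galaxy functions.

```python
# Only the '#' entry is taken from the thread; other entries are not known here.
ESCAPES = {"#": "__pd__"}


def escape_text(value, mapping=ESCAPES):
    """Replace each mapped character with its escape token."""
    return "".join(mapping.get(char, char) for char in value)


def unescape_text(value, mapping=ESCAPES):
    """Reverse the mapping by replacing each token with its character."""
    for char, token in mapping.items():
        value = value.replace(token, char)
    return value


print(escape_text("Test#1"))  # Test__pd__1
print(unescape_text("Test__pd__1"))  # Test#1
```

Configuring the sanitizer in the tool XML, as suggested in the reply, is likely a cleaner fix than reversing the mapping inside a wrapper script.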




Re: [galaxy-dev] Per-tool configuration

2014-06-27 Thread Peter Cock
On Fri, Jun 27, 2014 at 3:13 PM, John Chilton jmchil...@gmail.com wrote:
 On Fri, Jun 27, 2014 at 5:16 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Wed, Jun 18, 2014 at 12:14 PM, Peter Cock p.j.a.c...@googlemail.com 
 wrote:

 John - that Trello issue you logged, https://trello.com/c/0XQXVhRz
 Generic infrastructure to let deployers specify limits for tools based
 on input metadata (number of sequences, file size, etc...)

 Would it be fair to say this is not likely to be implemented in the near
 future? i.e. Should we consider implementing the BLAST query limit
 approach as a short term hack?

 It would be good functionality - but I don't foresee myself or anyone
 on the core team getting to it in the next six months say.

 ...

 I am now angry with myself though because I realized that dynamic job
 destinations are a better way to implement this in the meantime (that
 environment stuff was very fresh when I responded so I think I just
 jumped there). You can build a flexible infrastructure locally that is
 largely decoupled from the tools and that may (?) work around the task
 splitting problem Peter brought up.

 Outline of the idea:
 snip

Hi John,

So the idea is to define a dynamic job mapper which checks the
query input size, and if too big raises an error, and otherwise
passes the job to the configured job handler (e.g. SGE cluster).

See https://wiki.galaxyproject.org/Admin/Config/Jobs

It sounds like this ought to be possible right now, but you are
suggesting since this seems quite a general use case, the
code to help build a dynamic mapper using things like file
size (in bytes or number of sequences) could be added to
Galaxy?

This approach would need the Galaxy Admin to set up a custom
job mapper for BLAST (which knows to look at the query file),
but it taps into an existing Galaxy framework. By providing a
reference implementation this ought to be fairly easy to set up,
and can be extended to be more clever about the limits.

e.g. For BLAST, we should consider both the number (and
length) of the queries, plus the size of the database.

Regards,

Peter
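
A minimal sketch of the mapper idea outlined above, written as plain Python so it runs without Galaxy. Everything named here is an assumption: QueryLimitExceeded stands in for whatever exception the job mapper treats as a failure, choose_destination is a hypothetical rule function, sge_cluster is a made-up destination id, and the default of 20,000 queries is just an example value.

```python
import os
import tempfile


class QueryLimitExceeded(Exception):
    """Hypothetical stand-in for the error a job rule raises to reject a job."""


def count_fasta_records(path):
    """Count FASTA records by counting header lines that start with '>'."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def choose_destination(query_path, max_queries=20000, destination_id="sge_cluster"):
    """Return the destination id, or raise if the query file is too large."""
    n_queries = count_fasta_records(query_path)
    if n_queries > max_queries:
        raise QueryLimitExceeded(
            "Query has %d sequences; the limit is %d" % (n_queries, max_queries)
        )
    return destination_id


# Demo with a three-record FASTA file.
fd, demo_fasta = tempfile.mkstemp(suffix=".fasta")
with os.fdopen(fd, "w") as handle:
    handle.write(">q1\nACGT\n>q2\nGGCC\n>q3\nTTAA\n")

print(choose_destination(demo_fasta, max_queries=5))
```

A fuller version could also weigh total query length and database size, as suggested at the end of the message.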


Re: [galaxy-dev] Control on versioning in toolshed tools

2014-06-24 Thread Peter Cock
Hi Eric,

Despite the fact that internal hg repositories are used, the idea
is NOT to use them as development repositories - but ONLY push
releases to the ToolShed.

In the interests of reproducibility (other people might use your
ToolShed entry in a workflow, or as a dependency), you should
not be able to ever rewrite history or delete commits - something
you can do with a git or hg repository but should generally avoid.

i.e. Being allowed to clean up and start again is blocked by the
Galaxy goal of reproducibility.

I personally prefer git to hg, and therefore use that for development
tracking of my own ToolShed releases - but if you like hg then I
would suggest using a BitBucket.org hosted hg repository for
developing your tool.

You can see examples here - many of these tools do have
explicit dependencies on other tools/packages in the ToolShed
(either my own, or from 3rd parties):

https://github.com/peterjc/galaxy_blast
https://github.com/peterjc/pico_galaxy

Regards,

Peter

On Tue, Jun 24, 2014 at 1:15 PM, Eric Kuyt eric.ku...@wur.nl wrote:
 Hi All,

 I am playing around with putting a tool in testtoolshed. Now when changes to
 dependency versions are detected, the toolshed detects a new version and a
 dropdown is created.

 However, sometimes I do not want this behavior, for example when the first
 version was erroneous. I tried hg strip on the repository and pushing it
 back to the testtoolshed, but sadly it didn't result in a clean repository
 but a multi-headed mess.

 Is there a way to clean up the remote repository and start over? And what
 would be a cleaner way to develop tools on a toolshed still making use of
 repository dependencies?

 Thanks,

 Eric


Re: [galaxy-dev] Fwd: specifying default file by name in workflows

2014-06-19 Thread Peter Cock
Hi Evan,

Assuming you are talking about an input file from:

<param type="data" .../>

I don't think you can set a default - the possible files will
depend on the current history, so could be zero, one or
many files. Also, how would you specify a specific file?
They would have Galaxy assigned *.dat filenames on disk,
while the names could have been set to anything by the user.

My guess is you may be better off defining a new datatype
(a subclass of txt perhaps?).

Regards,

Peter


On Wed, Jun 18, 2014 at 10:28 PM, Evan Bollig boll0...@umn.edu wrote:
 To clarify, I want to specify the default selected file name for an
 Input Dataset block in the workflow, but I'd like to keep the option
 open to select other input names with the same type.

 When I specify a file type for a tool's input, it is not enough. The
 input dataset can end up finding dozens of files that are txt for
 example. I want to know all possible tool_state annotations for Input
 Dataset (tool_id: null). Where would I find this?

 Cheers,

 -Evan Bollig
 Research Associate | Application Developer | User Support Consultant
 Minnesota Supercomputing Institute
 599 Walter Library
 612 624 1447
 e...@msi.umn.edu
 boll0...@umn.edu



 -- Forwarded message --
 From: Evan Bollig boll0...@umn.edu
 Date: Wed, Jun 18, 2014 at 11:19 AM
 Subject: specifying default file by name in workflows
 To: galaxy-...@bx.psu.edu galaxy-...@bx.psu.edu


 I don't know all the subtleties of the galaxy workflow syntax. My goal
 is to specify the default input file names for a number of tools in a
 workflow. Is this possible, or am I limited to only providing the file
 type or extension?

 If possible, can you provide an example?

 Thanks,

 -Evan Bollig
 Research Associate | Application Developer | User Support Consultant
 Minnesota Supercomputing Institute
 599 Walter Library
 612 624 1447
 e...@msi.umn.edu
 boll0...@umn.edu


Re: [galaxy-dev] Per-tool configuration

2014-06-18 Thread Peter Cock
On Wed, Jun 18, 2014 at 12:04 PM, Jan Kanis jan.c...@jankanis.nl wrote:
 I am not using job splitting, because I am implementing this for a client
 with a small (one machine) galaxy setup.

Ah - this also explains why a job size limit is important for you.

 Implementing a query limit feature in galaxy core would probably be the best
 idea, but that would also probably require an admin screen to edit those
 limits, and I don't think I can sell the required time to my boss under the
 contract we have with the client.

The wrapper script idea I outlined to you earlier would be the least
invasive (although might cause trouble if BLAST is run at the command
line outside Galaxy), while your idea of inserting the check script into
the Galaxy Tool XML just before running BLAST itself should also
work well.

 I gave a quick try before on making the blast2html tool run in both python
 2.6 and 3, but I gave up due to too many encoding issues. The client's
 machine has python 2.6. Maybe I should have another look.

 Jan

It gets easier with practice - a mixture of little syntax things, and
the big pain about bytes versus unicode (and thus encodings,
and raw versus text mode for file handles).

Peter
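
As a small illustration of the bytes versus unicode point, the sketch below reads and writes text with an explicit encoding through io.open, which should behave the same way under Python 2.6, 2.7 and 3. The helper names and the demo string are hypothetical and have nothing to do with blast2html itself.

```python
import io
import os
import tempfile


def write_text(path, text, encoding="utf-8"):
    """Write unicode text, encoding it explicitly on the way out."""
    with io.open(path, "w", encoding=encoding) as handle:
        handle.write(text)


def read_text(path, encoding="utf-8"):
    """Read a file as unicode text, decoding it explicitly on the way in."""
    with io.open(path, "r", encoding=encoding) as handle:
        return handle.read()


def read_bytes(path):
    """Read the raw bytes, with no decoding at all."""
    with io.open(path, "rb") as handle:
        return handle.read()


demo_path = os.path.join(tempfile.mkdtemp(), "hit.txt")
demo_text = u"caf\u00e9 hit"  # the u prefix is accepted by Python 2 and 3.3+
write_text(demo_path, demo_text)

print(read_text(demo_path) == demo_text)
print(len(read_bytes(demo_path)), len(demo_text))
```

Keeping all decoding and encoding at the file boundary like this tends to avoid most of the encoding surprises when one code base targets both Python versions.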


Re: [galaxy-dev] Gzipped input to functional tests with multiple=true

2014-06-18 Thread Peter Cock
I've filed this bug in the Twill test framework on Trello:
https://trello.com/c/XG3KemZE/1732-gzipped-input-to-twill-functional-tests-fails-with-multiple-true

Peter

On Fri, Jun 13, 2014 at 10:12 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 Hi all,

 I think I've found a bug in the Galaxy test framework :(

 With most file inputs, a gzipped input file works fine (Galaxy's upload code
 handles unzipping it). However, with multiple=true this seems to break
 (with the Twill backend, the API test framework is OK), e.g.

 <param name="filenames" type="data" format="fastq,mira"
 multiple="true" required="true" label="Read file(s)"
   help="Multiple files allowed, for example paired
 reads can be given as two files (MIRA looks at read names to identify
 pairs)." />


 Fails:

 <param name="filenames" value="SRR639755_mito_pairs_sample.fastq.gz"
 ftype="fastqsanger" />

 e.g.  https://travis-ci.org/peterjc/pico_galaxy/builds/27426318

 Excerpt from log:

 ==
 FAIL: test_tool_00 (functional.test_toolbox.TestForTool_mira_4_0_de_novo)
 MIRA v4.0 de novo assember ( mira_4_0_de_novo )  Test-1
 --
 Traceback (most recent call last):
   File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 108, in test_tool
     self.do_it( td )
   File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 32, in do_it
     data_list = galaxy_interactor.run_tool( testdef, test_history )
   File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/interactor.py", line 449, in run_tool
     self.twill_test_case.run_tool( testdef.tool.id, repeat_name=repeat_name, **page_inputs )
   File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/twilltestcase.py", line 1789, in run_tool
     self.submit_form( **kwd )
   File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/twilltestcase.py", line 1999, in submit_form
     raise AssertionError( errmsg )
 AssertionError: Attempting to set field 'read_group_0|filenames' to
 value '['SRR639755_mito_pairs_sample.fastq.gz']' in form 'tool_form'
 threw exception: id=None name=None
 label='SRR639755_mito_pairs_sample.fastq.gz'
 control: SelectControl(read_group_0|filenames=[80])
 If the above control is a DataToolparameter whose data type class does
 not include a sniff() method,
 make sure to include a proper 'ftype' attribute to the tag for the
 control within the test tag set.


 This works,

 <param name="filenames" value="SRR639755_mito_pairs_sample.fastq"
 ftype="fastqsanger" />

 e.g. https://travis-ci.org/peterjc/pico_galaxy/builds/27426336

 See: 
 https://github.com/peterjc/pico_galaxy/commit/e6967767535ca29debcdc19d7f0502d73276b6a0

 Regards,

 Peter


Re: [galaxy-dev] Per-tool configuration

2014-06-17 Thread Peter Cock
On Tue, Jun 17, 2014 at 4:57 PM, Jan Kanis jan.c...@jankanis.nl wrote:
 Too bad there aren't any really good options. I will use the environment
 variable approach for the query size limit.

Are you using the optional job splitting (parallelism) feature in Galaxy?
That seems to me to be a good place to insert a Galaxy level
job size limit. e.g. BLAST+ jobs are split into 1000 query chunks,
so you might wish to impose a 25 chunk limit?

Long term being able to set limits on the input file parameters
of each tool would be nicer - e.g. Limit BLASTN to at most
20,000 queries, limit MIRA to at most 50GB FASTQ files, etc.

 For the gene bank links I guess modifying the .loc file is the least
 bad way. Maybe it can be merged into galaxy_blast, that would at
 least solve the interoperability problems.

It would have to be sufficiently general, and backward compatible.

FYI other people have also looked at extending the blast *.loc
files (e.g. adding a category column for helping filter down a
very large BLAST database list).

 @Peter: One potential problem in merging my blast2html tool
 could be that I have written it in python3, and the current tool
 wrapper therefore installs python3 and a host of its dependencies,
 making for a quite large download.

Without seeing your code, it is hard to say, but actually writing
Python code which works unmodified under Python 2.7 and
Python 3 is quite doable (and under Python 2.6 with a few
more provisos). Both NumPy and Biopython do this if you
wanted some reassurance.

On the other hand, Galaxy itself will need to move to Python 3
at some point, and certainly individual tools will too. This will
probably mean (as with Linux Python packages) having double
entries on the ToolShed (one for Python 2, one for Python 3),

e.g. ToolShed package for NumPy under Python 2 (done)
and under Python 3 (needed).

Peter
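
The chunk arithmetic mentioned above is easy to sketch. The function names below are hypothetical, and the chunk size of 1000 and the limit of 25 chunks are simply the example figures from this message, not Galaxy settings.

```python
def chunks_needed(n_queries, chunk_size=1000):
    """Number of split jobs needed, rounding up (integer ceiling division)."""
    return (n_queries + chunk_size - 1) // chunk_size


def within_chunk_limit(n_queries, chunk_size=1000, max_chunks=25):
    """True if the job would split into at most max_chunks pieces."""
    return chunks_needed(n_queries, chunk_size) <= max_chunks


print(chunks_needed(20000))       # 20
print(within_chunk_limit(25000))  # True, exactly 25 chunks
print(within_chunk_limit(25001))  # False, would need 26 chunks
```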


Re: [galaxy-dev] Per-tool configuration

2014-06-16 Thread Peter Cock
On Mon, Jun 16, 2014 at 4:18 AM, John Chilton jmchil...@gmail.com wrote:
 Hello Jan,

 Thanks for the clarification. Not quite what I was expecting so I am
 glad I asked - I don't have great answers for either case so hopefully
 other people will have some ideas.

 For the first use case - I would just specify some default input to
 supply to the input wrapper - let's call this N - add a parameter to
 the tool wrapper --limit-size=N - test that and then allow it to be
 overridden via an environment variable - so in your command block use
 --limit-size=\${BLAST_QUERY_LIMIT:-N}. This will use N if no limit
 is set, but deployers can set limits. There are a number of ways to
 set such variables - DRM specific environment files, login rc files,
 etc. Just this last release I added the ability to define
 environment variables right in job_conf.xml
 (https://bitbucket.org/galaxy/galaxy-central/pull-request/378/allow-specification-of-environment/diff).
 I thought the tool shed might have a way to collect such definitions
 as well and insert them into package files - but Google failed to find
 this for me.

Hmm. Jan emailed me off list earlier about this. We could insert
a pre-BLAST script to check the size of the query FASTA file,
and abort if it is too large (e.g. number of queries, total sequence
length, perhaps scaled according to the database size if we want
to get clever?).

I was hoping there was a more general mechanism in Galaxy -
after all, BLAST is by no means the only computationally
expensive tool ;)

We have had query files of 20,000 and more genes against NR
(both BLASTP and BLASTX), but our Galaxy has task-splitting
enabled so this becomes 20 (or more) individual cluster jobs
of 1000 queries each. This works fine apart from the occasional
glitch with the network drive when the data is merged afterwards.
(We know this failed once shortly after the underlying storage
had been expanded, and would have been under heavy load
rebalancing the data across the new disks.)

 Not sure about how to proceed with the second use case - extending the
 .loc file should work locally - I am not sure it is feasible within
 the context of the existing tool shed tools, data manager, etc. You
 could certainly duplicate this stuff with your modifications - this
 has downsides in terms of interoperability though.

Currently the BLAST wrappers use the *.loc files directly, but
this is likely to switch to the newer Data Manager approach.
That may or may not complicate local modifications like adding
extra columns...

 Sorry I don't have great answers for either question,
 -John

Thanks John,

Peter
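
A sketch of what such a pre-BLAST check might look like, with the limit taken from an environment variable and a built-in default when the variable is unset or empty (the same fallback behaviour as the shell form ${VAR:-default}). The variable name BLAST_QUERY_LIMIT comes from the quoted suggestion; get_query_limit, fasta_stats and the demo values are hypothetical.

```python
import os


def get_query_limit(default, environ=os.environ, name="BLAST_QUERY_LIMIT"):
    """Return the configured limit, or the default if unset or empty."""
    value = environ.get(name, "")
    return int(value) if value else default


def fasta_stats(lines):
    """Return (number of records, total sequence length) for FASTA lines."""
    records = 0
    total_length = 0
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            records += 1
        else:
            total_length += len(line)
    return records, total_length


demo_lines = [">q1", "ACGT", ">q2", "AC"]
n_records, n_letters = fasta_stats(demo_lines)
limit = get_query_limit(1000, environ={"BLAST_QUERY_LIMIT": "1"})

if n_records > limit:
    print("Too many queries: %d (limit %d)" % (n_records, limit))
```

A real check script would read the query file from disk and exit with a non-zero status so that Galaxy reports the job as failed.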


Re: [galaxy-dev] What is the correct place under Galaxy for a database that's created by a tool?

2014-06-16 Thread Peter Cock
Hi Melissa,

Galaxy expects history datasets to be read only, so the best
option (in terms of this data model) might be a (read only)
SQLite database (since it is just a single file on disk). They
could have multiple such databases in their history or
histories.

If you want the user to have just one database and update
it, then things are rather different... I'll let one of the Galaxy
team comment.

Peter

On Tue, Jun 17, 2014 at 12:07 AM, Melissa Cline cl...@soe.ucsc.edu wrote:
 Hi folks,

 Hopefully this is a quick question.  I'm working on a set of tools that will
 fire off a VM from within Galaxy and will then communicate with the VM.  The
 VM will create a local database.  The vision is that this won't be a shared
 database; in a shared Galaxy instance, each user will have his or her own
 database.  What is the best place to create this database under the Galaxy
 file system?

 Thanks!

 Melissa
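
To illustrate the read-only SQLite suggestion, the sketch below creates a small database file once and then reopens it read-only through an SQLite URI. It needs Python 3.4 or later for the uri argument of sqlite3.connect; the table, column names and values are made up for the demo.

```python
import os
import sqlite3
import tempfile
from pathlib import Path

db_path = os.path.join(tempfile.mkdtemp(), "results.sqlite")

# Build the database once, as the tool that creates the dataset would.
writer = sqlite3.connect(db_path)
writer.execute("CREATE TABLE hits (query TEXT, score REAL)")
writer.execute("INSERT INTO hits VALUES (?, ?)", ("q1", 42.0))
writer.commit()
writer.close()

# Later readers open the same file read-only via a file: URI.
read_only_uri = Path(db_path).as_uri() + "?mode=ro"
reader = sqlite3.connect(read_only_uri, uri=True)
rows = reader.execute("SELECT query, score FROM hits").fetchall()

try:
    reader.execute("INSERT INTO hits VALUES ('q2', 1.0)")
    write_blocked = False
except sqlite3.OperationalError:
    # SQLite refuses writes on a connection opened with mode=ro.
    write_blocked = True
reader.close()

print(rows, write_blocked)
```

Each run of the creating tool would then produce a new database file as a new history dataset, rather than updating an existing one.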




[galaxy-dev] Gzipped input to functional tests with multiple=true

2014-06-13 Thread Peter Cock
Hi all,

I think I've found a bug in the Galaxy test framework :(

With most file inputs, a gzipped input file works fine (Galaxy's upload code
handles unzipping it). However, with multiple=true this seems to break
(with the Twill backend, the API test framework is OK), e.g.

<param name="filenames" type="data" format="fastq,mira"
multiple="true" required="true" label="Read file(s)"
  help="Multiple files allowed, for example paired
reads can be given as two files (MIRA looks at read names to identify
pairs)." />


Fails:

<param name="filenames" value="SRR639755_mito_pairs_sample.fastq.gz"
ftype="fastqsanger" />

e.g.  https://travis-ci.org/peterjc/pico_galaxy/builds/27426318

Excerpt from log:

==
FAIL: test_tool_00 (functional.test_toolbox.TestForTool_mira_4_0_de_novo)
MIRA v4.0 de novo assember ( mira_4_0_de_novo )  Test-1
--
Traceback (most recent call last):
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 108, in test_tool
    self.do_it( td )
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/functional/test_toolbox.py", line 32, in do_it
    data_list = galaxy_interactor.run_tool( testdef, test_history )
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/interactor.py", line 449, in run_tool
    self.twill_test_case.run_tool( testdef.tool.id, repeat_name=repeat_name, **page_inputs )
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/twilltestcase.py", line 1789, in run_tool
    self.submit_form( **kwd )
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-central-master/test/base/twilltestcase.py", line 1999, in submit_form
    raise AssertionError( errmsg )
AssertionError: Attempting to set field 'read_group_0|filenames' to
value '['SRR639755_mito_pairs_sample.fastq.gz']' in form 'tool_form'
threw exception: id=None name=None
label='SRR639755_mito_pairs_sample.fastq.gz'
control: SelectControl(read_group_0|filenames=[80])
If the above control is a DataToolparameter whose data type class does
not include a sniff() method,
make sure to include a proper 'ftype' attribute to the tag for the
control within the test tag set.


This works,

<param name="filenames" value="SRR639755_mito_pairs_sample.fastq"
ftype="fastqsanger" />

e.g. https://travis-ci.org/peterjc/pico_galaxy/builds/27426336

See: 
https://github.com/peterjc/pico_galaxy/commit/e6967767535ca29debcdc19d7f0502d73276b6a0

Regards,

Peter
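
For context, transparent handling of a gzipped input can be sketched as below: check the two gzip magic bytes and decompress only when they are present. This is an illustration of the general idea, not Galaxy's upload code; is_gzipped, read_maybe_gzipped and the demo FASTQ record are hypothetical.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def read_maybe_gzipped(path):
    """Return the file contents as bytes, decompressing if gzipped."""
    opener = gzip.open if is_gzipped(path) else open
    with opener(path, "rb") as handle:
        return handle.read()


# Write the same made-up FASTQ record once plain and once gzipped.
record = b"@read1\nACGT\n+\nIIII\n"
tmp_dir = tempfile.mkdtemp()
plain_path = os.path.join(tmp_dir, "sample.fastq")
gz_path = os.path.join(tmp_dir, "sample.fastq.gz")
with open(plain_path, "wb") as handle:
    handle.write(record)
with gzip.open(gz_path, "wb") as handle:
    handle.write(record)

print(read_maybe_gzipped(plain_path) == read_maybe_gzipped(gz_path))
```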

