[galaxy-dev] GATK 3.1

2014-03-18 Thread Berner, Thomas
Hi,

I've seen there is a new version (3.1-1) of GATK available at 
http://www.broadinstitute.org/gatk/download .
Are there any plans of getting this version into Galaxy in the near future?

Greetings, Thomas

---

Thomas Berner

Julius Kühn-Institut (JKI)
- Federal Research Centre for Cultivated Plants - Erwin Baur-Straße 27
06484 Quedlinburg
- Germany -

Phone: ++49  ( 0 ) 3946  47  562
EMail: thomas.ber...@jki.bund.de

___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/

Re: [galaxy-dev] GATK 3.1

2014-03-18 Thread Nicola Soranzo

On 2014-03-18 14:41, Berner, Thomas wrote:

Hi,

I've seen there is a new version (3.1-1) of GATK available at
http://www.broadinstitute.org/gatk/download .

Are there any plans of getting this version into Galaxy in the near
future?


Hi Thomas,
as you probably know there is a wrapper for GATK v.2 on the Tool Shed:

http://toolshed.g2.bx.psu.edu/view/iuc/gatk2

which is developed and maintained by Jim Johnson, Björn Grüning and me.
GATK v.3 has some new tools and also some changes to logging which
prevent the use of the gatk2 wrapper with v.3, so we are planning to create
a new gatk3 Tool Shed repository sooner or later; no time frame has been
decided yet.
You can follow development, and contribute if you want, at this git 
repository:


https://github.com/bgruening/galaxytools

Best,
Nicola

--
Nicola Soranzo, Ph.D.
Bioinformatics Program, CRS4
Loc. Piscina Manna, 09010 Pula (CA), Italy
http://www.bioinformatica.crs4.it/

Re: [galaxy-dev] [CONTENT] Re: Unable to remove old datasets

2014-03-18 Thread Carl Eberhard
Thanks, Ravi & Peter

I've added a card to get the allow_user_dataset_purge options into the
client and to better show the viable options to the user:
https://trello.com/c/RCPZ9zMF


On Fri, Mar 14, 2014 at 11:10 AM, Peter Cock p.j.a.c...@googlemail.com wrote:

 On Fri, Mar 14, 2014 at 11:24 AM, Peter Cock p.j.a.c...@googlemail.com
 wrote:
  On Thu, Mar 13, 2014 at 6:40 PM, Sanka, Ravi rsa...@jcvi.org wrote:
  I do not think so. Several individual datasets have been deleted
 (clicked
  the upper-right X on the history item box) but no History has been
  permanently deleted.
 
  Is there any indication in the database if the target dataset or datasets
  were marked for permanent deletion? In the dataset table, I see fields
  'deleted', 'purged', and 'purgable', but nothing that says permanently
  deleted.
 
  I would welcome clarification from the Galaxy Team, here and
  on the wiki page which might benefit from a flow diagram?
 
 
 https://wiki.galaxyproject.org/Admin/Config/Performance/Purge%20Histories%20and%20Datasets
 
  My assumption is that using permanently delete in the user interface
  marks an entry as purgable, and that it will then be moved to purged
  (and the associated file on disk deleted) by the cleanup scripts -
  but I'm a bit hazy on this, and on why it takes a while for a user's
  usage figures to change.

 Hmm. Right now I'm unable (via the web interface) to permanently
 delete a history - it stays stuck as deleted, and thus (presumably)
 won't get purged by the clean up scripts.

 I've tried:

 1. Load the problem history
 2. Rename the history "DIE DIE" to avoid confusion
 3. Top right menu, "Delete permanently"
 4. Prompted "Really delete the current history permanently? This
 cannot be undone", OK
 5. Told "History deleted, a new history is active"
 6. Top right menu, "Saved Histories"
 7. Click "Advanced Search", status all
 8. Observe the "DIE DIE" history is only deleted (while other older
 histories are deleted permanently) (BAD)
 9. Run the cleanup scripts,
 9. Run the cleanup scripts,

 $ sh scripts/cleanup_datasets/delete_userless_histories.sh
 $ sh scripts/cleanup_datasets/purge_histories.sh
 $ sh scripts/cleanup_datasets/purge_libraries.sh
 $ sh scripts/cleanup_datasets/purge_folders.sh
 $ sh scripts/cleanup_datasets/purge_datasets.sh

 10. Reload the saved history list, no change.
 11. Using the drop down menu, select "Delete Permanently"
 12. Prompted "History contents will be removed from disk, this cannot
 be undone. Continue", OK
 13. No change to history status (BAD)
 14. Tick the check-box, and use the "Delete Permanently" button at the
 bottom of the page
 15. Prompted "History contents will be removed from disk, this cannot
 be undone. Continue", OK
 16. No change to history status (BAD)
 17. Run the cleanup scripts, no change.

 Note that in my universe_wsgi.ini I have not (yet) set:
 allow_user_dataset_purge = True

 If this setting is important, then the interface seems confused -
 and if quotas are enforced, very frustrating :(

 Peter


Re: [galaxy-dev] [CONTENT] Re: Unable to remove old datasets

2014-03-18 Thread Carl Eberhard
I believe it's a (BAD) silent failure mode in the server code.

If I understand correctly, the purge request isn't coughing an error when
it gets to the 'allow_user_dataset_purge' check and instead is silently
marking (or re-marking) the datasets as deleted.

I would rather it fail with a 403 error if purge is explicitly requested.

That said, it would of course be better to remove the purge operation based
on the configuration than to show an error after we've found you can't do
the operation. The same holds true for the 'permanently remove this
dataset' link in deleted datasets.

I'll see if I can find out the answer to your question on the cleanup
scripts.


On Tue, Mar 18, 2014 at 10:49 AM, Peter Cock p.j.a.c...@googlemail.com wrote:

 On Tue, Mar 18, 2014 at 2:14 PM, Carl Eberhard carlfeberh...@gmail.com
 wrote:
  Thanks, Ravi & Peter
 
  I've added a card to get the allow_user_dataset_purge options into the
  client and to better show the viable options to the user:
  https://trello.com/c/RCPZ9zMF

 Thanks Carl - so this was a user interface bug, showing the user
 non-functional permanent delete (purge) options. That's clearer now.

 In this situation can the user just 'delete', and wait N days for
 the cleanup scripts to actually purge the files and free the space?
 (It seems N=10 in scripts/cleanup/purge_*.sh at least; elsewhere,
 e.g. in the underlying Python script, the default looks like N=60.)

 Regards,

 Peter


Re: [galaxy-dev] BugFix for dynamic_options ParamValueFilter to run IUC SnpEff in a workflow

2014-03-18 Thread John Chilton
Hey JJ,

Thanks for the bug report. I can confirm the issue. I think the
problem probably is that the other_values thing you are printing out
there is very different when rendering tools (it is the value of
things at that depth of the tool state tree) versus workflows (in
which it is the global state of the tree from the top). The tool
variant is probably the right approach, since tools work, and it has the
nice advantage of avoiding ambiguities that would otherwise arise.

https://bitbucket.org/galaxy/galaxy-central/pull-request/349/bring-workflow-parameter-context/diff

I have opened a pull request with an attempt to bring workflows in
line with tools - it seems to fix snpEff for me locally - can you
confirm? Any chance this also solves your other problem with the XY
plotting tool (Pull Request #336)?

-John

On Fri, Feb 28, 2014 at 12:12 PM, Jim Johnson johns...@umn.edu wrote:

 The current code in:   lib/galaxy/tools/parameters/dynamic_options.py
 only searches the top layer of the dict to find the dependency value.

 A fix is provided in pull request:
 #343: Need to traverse the other_value dict to find dependencies for
 ParamValueFilter in


 SnpEff tool_config:

 <inputs>
     <param format="vcf,tabular,pileup,bed" name="input" type="data"
            label="Sequence changes (SNPs, MNPs, InDels)"/>
     ...
     <conditional name="snpDb">
         <param name="genomeSrc" type="select" label="Genome source">
             <option value="cached">Locally installed reference genome</option>
             <option value="history">Reference genome from your history</option>
             <option value="named">Named on demand</option>
         </param>
         <when value="cached">
             <param name="genomeVersion" type="select" label="Genome">
                 <!--GENOMEDESCRIPTION-->
                 <options from_data_table="snpeff_genomedb">
                     <filter type="unique_value" column="0"/>
                 </options>
             </param>
             <param name="extra_annotations" type="select"
                    display="checkboxes" multiple="true" label="Additional Annotations">
                 <help>These are available for only a few genomes</help>
                 <options from_data_table="snpeff_annotations">
                     <filter type="param_value" ref="genomeVersion" key="genome" column="0"/>
                     <filter type="unique_value" column="1"/>
                 </options>
             </param>



 When running workflow: input.vcf -> SnpEff

 The values in ParamValueFilter filter_options function:

 self.ref_name
 'genomeVersion'

 other_values
 {u'spliceSiteSize': '1', u'filterHomHet': 'no_filter', u'outputFormat':
 'vcf', u'filterOut': None, u'inputFormat': 'vcf', u'filterIn': 'no_filter',
 u'udLength': '5000', u'generate_stats': True, u'noLog': True, u'chr':
 'None', u'intervals': None, u'snpDb': {'extra_annotations': None,
 'regulation': None, 'genomeVersion': 'GRCh37.71', 'genomeSrc': 'cached',
 '__current_case__': 0}, u'offset': '', u'input':
 <galaxy.tools.parameters.basic.DummyDataset object at 0x11451b8d0>,
 u'transcripts': None, u'annotations': ['-canon', '-lof', '-onlyReg']}

 Since  'genomeVersion' isn't in the keys of other_values, but rather in
 other_values['snpDb']
 this failed the assertion:
 assert self.ref_name in other_values, "Required dependency '%s' not
 found in incoming values" % self.ref_name

 Pull request 343:

 $ hg diff lib/galaxy/tools/parameters/dynamic_options.py
 diff -r 95517f976cca lib/galaxy/tools/parameters/dynamic_options.py
 --- a/lib/galaxy/tools/parameters/dynamic_options.py    Thu Feb 27 16:56:25 2014 -0500
 +++ b/lib/galaxy/tools/parameters/dynamic_options.py    Fri Feb 28 11:37:04 2014 -0600
 @@ -177,8 +177,27 @@
          return self.ref_name
      def filter_options( self, options, trans, other_values ):
          if trans is not None and trans.workflow_building_mode: return []
 -        assert self.ref_name in other_values, "Required dependency '%s' not found in incoming values" % self.ref_name
 -        ref = other_values.get( self.ref_name, None )
 +        ## Depth first traversal to find the value for a dependency
 +        def get_dep_value(param_name, dep_name, cur_val, layer):
 +            dep_val = cur_val
 +            if isinstance(layer, dict):
 +                if dep_name in layer:
 +                    dep_val = layer[dep_name]
 +                if param_name in layer:
 +                    return dep_val
 +                else:
 +                    for l in layer.itervalues():
 +                        dep_val = get_dep_value(param_name, dep_name, dep_val, l)
 +                        if dep_val:
 +                            break
 +            elif isinstance(layer, list):
 +                for l in layer:
 +                    dep_val = get_dep_value(param_name, dep_name, dep_val, l)
 +                    if dep_val:
 +                        break
 +            return None
 +        ref = 
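
The (truncated) patch above can be illustrated with a standalone sketch of the same depth-first lookup; the function name and the trimmed sample dict follow the thread, but this is a simplified illustration, not the actual pull request code:

```python
def get_dep_value(dep_name, layer):
    """Depth-first search of a nested tool-state dict/list for dep_name."""
    if isinstance(layer, dict):
        if dep_name in layer:
            return layer[dep_name]
        children = layer.values()
    elif isinstance(layer, list):
        children = layer
    else:
        return None  # scalar leaf: nothing to descend into
    for child in children:
        found = get_dep_value(dep_name, child)
        if found is not None:
            return found
    return None

# Trimmed version of the other_values shown above: 'genomeVersion' is
# nested under 'snpDb', so a flat top-level lookup fails.
other_values = {
    "inputFormat": "vcf",
    "snpDb": {"genomeVersion": "GRCh37.71", "genomeSrc": "cached"},
}
assert other_values.get("genomeVersion") is None  # the old flat lookup misses it
assert get_dep_value("genomeVersion", other_values) == "GRCh37.71"
```

This is the behaviour the assertion failure reported above calls for: the dependency value is found wherever it sits in the state tree, not only at the top layer.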

Re: [galaxy-dev] Verifying test output datatypes, was: Problem with change_format and conditional inputs?

2014-03-18 Thread John Chilton
Merged. Thanks again for the input!

I will look into the unicode issue and respond on the other thread.

-John

On Sat, Mar 15, 2014 at 7:50 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Fri, Mar 14, 2014 at 9:04 PM, John Chilton jmchil...@gmail.com wrote:
 On Fri, Mar 14, 2014 at 10:24 AM, Peter Cock p.j.a.c...@googlemail.com 
 wrote:
 Thanks John,

 I suggest making this test framework perform this check by default
 (the twill and API based frameworks) and seeing what - if anything -
 breaks as a result on the Test Tool Shed.

 Hey Peter,

 I hope it is okay, but I do not want to make this change to the Twill
 driven variant of tool tests, I consider that code at end of life -
 new development would be a waste I think.

 Running all tools against a modified environment that switched all
 tests to target the APIs would be nice, but it sounds like there is
 not really the infrastructure in place for doing that right now.

 Upon further consideration I am not sure there are really any backward
 compatibility concerns anyway - or at least no more so than anything
 else when switching over to the API driven tests. I'll let the pull
 request sit open for a few days and then merge it as is.


 Note that one area of fuzziness is subclasses, e.g. if the tool output
 was labelled fastqsanger, but the test just said fastq, I would
 say the test was broken. On the other hand, if the test used a
 specific datatype like fastqsanger but the tool produced a dataset
 tagged with a more generic datatype like fastq I think that is a
 again a real failure.

 100% agreed on both points. I believe the implementation proposed in
 pull request #347 reflects this resolution of the fuzziness.

 Thanks and have a great weekend,
 -John


 That sounds like a plan :)

 Hmm. I wonder if it would be trivial to tweak our TravisCI setup
 to run the functional tests twice, once with the old twill framework
 and once with the new API based framework? Seems doable
 (but would up the run time quite a bit).

 Thanks,

 Peter


Re: [galaxy-dev] BugFix for dynamic_options ParamValueFilter to run IUC SnpEff in a workflow

2014-03-18 Thread Jim Johnson

Nice work, John.
This fixed issues for running workflows for both SnpEff and XYPlot.
Please reject my pull requests #336 and #343 in favor of #349.

Thanks,

JJ


On 3/18/14, 11:44 AM, John Chilton wrote:

Hey JJ,

Thanks for the bug report. I can confirm the issue. I think the
problem probably is that the other_values thing you are printing out
there is very different when rendering tools (it is the value of
things at that depth of the tool state tree) versus workflows (in
which it is the global state of the tree from the top). The tool
variant is probably the right approach, since tools work, and it has the
nice advantage of avoiding ambiguities that would otherwise arise.

https://bitbucket.org/galaxy/galaxy-central/pull-request/349/bring-workflow-parameter-context/diff

I have opened a pull request with an attempt to bring workflows in
line with tools - it seems to fix snpEff for me locally - can you
confirm? Any chance this also solves your other problem with the XY
plotting tool (Pull Request #336)?

-John

On Fri, Feb 28, 2014 at 12:12 PM, Jim Johnson johns...@umn.edu wrote:

The current code in:   lib/galaxy/tools/parameters/dynamic_options.py
only searches the top layer of the dict to find the dependency value.

A fix is provided in pull request:
#343: Need to traverse the other_value dict to find dependencies for
ParamValueFilter in


SnpEff tool_config:

<inputs>
    <param format="vcf,tabular,pileup,bed" name="input" type="data"
           label="Sequence changes (SNPs, MNPs, InDels)"/>
    ...
    <conditional name="snpDb">
        <param name="genomeSrc" type="select" label="Genome source">
            <option value="cached">Locally installed reference genome</option>
            <option value="history">Reference genome from your history</option>
            <option value="named">Named on demand</option>
        </param>
        <when value="cached">
            <param name="genomeVersion" type="select" label="Genome">
                <!--GENOMEDESCRIPTION-->
                <options from_data_table="snpeff_genomedb">
                    <filter type="unique_value" column="0"/>
                </options>
            </param>
            <param name="extra_annotations" type="select"
                   display="checkboxes" multiple="true" label="Additional Annotations">
                <help>These are available for only a few genomes</help>
                <options from_data_table="snpeff_annotations">
                    <filter type="param_value" ref="genomeVersion" key="genome" column="0"/>
                    <filter type="unique_value" column="1"/>
                </options>
            </param>



When running workflow: input.vcf -> SnpEff

The values in ParamValueFilter filter_options function:


self.ref_name

'genomeVersion'


other_values

{u'spliceSiteSize': '1', u'filterHomHet': 'no_filter', u'outputFormat':
'vcf', u'filterOut': None, u'inputFormat': 'vcf', u'filterIn': 'no_filter',
u'udLength': '5000', u'generate_stats': True, u'noLog': True, u'chr':
'None', u'intervals': None, u'snpDb': {'extra_annotations': None,
'regulation': None, 'genomeVersion': 'GRCh37.71', 'genomeSrc': 'cached',
'__current_case__': 0}, u'offset': '', u'input':
<galaxy.tools.parameters.basic.DummyDataset object at 0x11451b8d0>,
u'transcripts': None, u'annotations': ['-canon', '-lof', '-onlyReg']}

Since  'genomeVersion' isn't in the keys of other_values, but rather in
other_values['snpDb']
this failed the assertion:
 assert self.ref_name in other_values, "Required dependency '%s' not
found in incoming values" % self.ref_name

Pull request 343:

$ hg diff lib/galaxy/tools/parameters/dynamic_options.py
diff -r 95517f976cca lib/galaxy/tools/parameters/dynamic_options.py
--- a/lib/galaxy/tools/parameters/dynamic_options.py    Thu Feb 27 16:56:25 2014 -0500
+++ b/lib/galaxy/tools/parameters/dynamic_options.py    Fri Feb 28 11:37:04 2014 -0600
@@ -177,8 +177,27 @@
        return self.ref_name
    def filter_options( self, options, trans, other_values ):
        if trans is not None and trans.workflow_building_mode: return []
-        assert self.ref_name in other_values, "Required dependency '%s' not found in incoming values" % self.ref_name
-        ref = other_values.get( self.ref_name, None )
+        ## Depth first traversal to find the value for a dependency
+        def get_dep_value(param_name, dep_name, cur_val, layer):
+            dep_val = cur_val
+            if isinstance(layer, dict):
+                if dep_name in layer:
+                    dep_val = layer[dep_name]
+                if param_name in layer:
+                    return dep_val
+                else:
+                    for l in layer.itervalues():
+                        dep_val = get_dep_value(param_name, dep_name, dep_val, l)
+                        if dep_val:
+                            break
+            elif isinstance(layer, list):
+                for l in layer:
+                    dep_val = get_dep_value(param_name, dep_name, 

Re: [galaxy-dev] Persistent jobs in cluster queue even after canceling job in galaxy

2014-03-18 Thread John Chilton
Erg... I am pretty ignorant about mercurial so I should probably not
respond to this but I will try. It is pretty common practice for the
Galaxy team to push bug fixes for the last release to the stable branch
of galaxy-central - which is very different from the default branch of
galaxy-central, which contains active development. These don't go out
to galaxy-dist automatically, to avoid the need to strip truly
egregious stuff out of the stable branch that the galaxy-dev news says
to target.

A quirk of this, however, is that the stable branch of galaxy-central is
actually a good deal more stable than the stable branch of galaxy-dist. It
is what usegalaxy.org targets and at least a few other high profile
Galaxy maintainers have caught on to this trick as well.

I think you can update (or merge) the latest stable branch by doing
something like the following:

hg pull https://bitbucket.org/galaxy/galaxy-central#stable
hg update stable

We should probably do a better job of keeping the stable branch of
galaxy-dist up-to-date - but right now we just push out updates at
releases and for major security issues as far as I know.

-John

On Fri, Mar 14, 2014 at 2:23 PM, Brian Claywell bclay...@fhcrc.org wrote:
 On Fri, Mar 14, 2014 at 9:01 AM, John Chilton jmchil...@gmail.com wrote:
 I believe this problem was fixed by Nate after the latest dist release
 and pushed to the stable branch of galaxy-central.

 https://bitbucket.org/galaxy/galaxy-central/commits/1298d3f6aca59825d0eb3d32afd5686c4b1b9294

 If you are eager for this bug fix, you can track the latest stable
 branch of galaxy-central instead of the galaxy-dist tag mentioned in
 the dev news. Right now it has some other good bug fixes not in the
 latest release.

 Ah, got it, thanks! Is it unfeasible to push bug fixes like those back
 to galaxy-dist/stable so those of us that would prefer stable to
 bleeding-edge don't have to cherry-pick commits?


 --
 Brian Claywell, Systems Analyst/Programmer
 Fred Hutchinson Cancer Research Center
 bclay...@fhcrc.org


Re: [galaxy-dev] [CONTENT] Re: Unable to remove old datasets

2014-03-18 Thread Carl Eberhard
The cleanup scripts enforce a sort of lifetime for the datasets.

The first time they're run, they may mark a dataset as deleted and also
reset the update time, and you'll have to wait N days for the next stage of
the lifetime.

The next time they're run, or if a dataset has already been marked as
deleted, the actual file removal happens and purged is set to true (if it
wasn't already).

You can manually pass in '-d 0' to force removal of datasets recently
marked as deleted.

The purge scripts do not check 'allow_user_dataset_purge', of course.
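
The two-stage lifetime described above can be modelled with a toy sketch; this is a hypothetical simplification for illustration, not Galaxy's actual cleanup code, and the field names and '-d' cutoff semantics are assumptions based on this thread:

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

@dataclass
class Dataset:
    deleted: bool = False
    purged: bool = False
    update_time: datetime = field(default_factory=datetime.now)

def cleanup_pass(ds: Dataset, days: int, now: datetime) -> None:
    """One run of the cleanup scripts with a '-d days' cutoff."""
    cutoff = now - timedelta(days=days)
    if not ds.deleted:
        ds.deleted = True        # stage 1: mark as deleted...
        ds.update_time = now     # ...which also resets the clock
    elif ds.update_time <= cutoff:
        ds.purged = True         # stage 2: file removal, purged flag set

now = datetime.now()
ds = Dataset()
cleanup_pass(ds, days=10, now=now)   # first pass: only marks deleted
cleanup_pass(ds, days=10, now=now)   # second pass, too soon: not purged yet
assert ds.deleted and not ds.purged
cleanup_pass(ds, days=0, now=now)    # '-d 0' forces the purge immediately
assert ds.purged
```

The point of the model is the reset of the update time on the first pass: that is why a freshly deleted dataset must wait the full N days before a later run will purge it.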


On Tue, Mar 18, 2014 at 11:50 AM, Carl Eberhard carlfeberh...@gmail.com wrote:

 I believe it's a (BAD) silent failure mode in the server code.

 If I understand correctly, the purge request isn't coughing an error when
 it gets to the 'allow_user_dataset_purge' check and instead is silently
 marking (or re-marking) the datasets as deleted.

 I would rather it fail with a 403 error if purge is explicitly requested.

  That said, it would of course be better to remove the purge operation
  based on the configuration than to show an error after we've found you
 can't do the operation. The same holds true for the 'permanently remove
 this dataset' link in deleted datasets.

 I'll see if I can find out the answer to your question on the cleanup
 scripts.


 On Tue, Mar 18, 2014 at 10:49 AM, Peter Cock p.j.a.c...@googlemail.com wrote:

 On Tue, Mar 18, 2014 at 2:14 PM, Carl Eberhard carlfeberh...@gmail.com
 wrote:
  Thanks, Ravi & Peter
 
  I've added a card to get the allow_user_dataset_purge options into the
  client and to better show the viable options to the user:
  https://trello.com/c/RCPZ9zMF

 Thanks Carl - so this was a user interface bug, showing the user
 non-functional permanent delete (purge) options. That's clearer now.

 In this situation can the user just 'delete', and wait N days for
 the cleanup scripts to actually purge the files and free the space?
  (It seems N=10 in scripts/cleanup/purge_*.sh at least; elsewhere,
  e.g. in the underlying Python script, the default looks like N=60.)

 Regards,

 Peter




Re: [galaxy-dev] GATK 3.1

2014-03-18 Thread Björn Grüning

Hi Thomas,

we have some rough plans to do that, as Nicola already mentioned.
But first we want to release one final GATK 2.8 wrapper.


Help is very much appreciated.
Cheers,
Bjoern

On 18.03.2014 14:41, Berner, Thomas wrote:

Hi,

I've seen there is a new version (3.1-1) of GATK available at 
http://www.broadinstitute.org/gatk/download .
Are there any plans of getting this version into Galaxy in the near future?

Greetings, Thomas

---

Thomas Berner

Julius Kühn-Institut (JKI)
- Federal Research Centre for Cultivated Plants - Erwin Baur-Straße 27
06484 Quedlinburg
- Germany -

Phone: ++49  ( 0 ) 3946  47  562
EMail: thomas.ber...@jki.bund.de








Re: [galaxy-dev] Tool Testing Enhancements

2014-03-18 Thread John Chilton
Hi Peter,

Thanks for the bug report. It looks like if requests is available to
the framework these unicode errors go away (it doesn't have to fall
back on my poor attempt to provide a similar interface).

https://github.com/jmchilton/pico_galaxy/commit/e7d37f31951d8e729cfdbda0ca9085feb7f2da73

I'll create a Trello card and try to add an egg for this - other
developers on the team have likewise said they would like a requests
dependency available in the past so I doubt there will be objections.

-John

P.S.

It looks like the ftype changeset has already caught some errors (your
sample_seqs tool only produces fasta outputs but some outputs are
labelled as sff):

https://travis-ci.org/jmchilton/pico_galaxy/jobs/21036295

On Tue, Mar 18, 2014 at 6:46 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Fri, Mar 14, 2014 at 8:36 PM, John Chilton jmchil...@gmail.com wrote:
 Hello Tool Developers,

 Haven't known when to send this out, but I figure since you haven't
 received any e-mail from me today it might be a good time.

 tl;dr - Tool functional tests experienced a significant overhaul over
 the last release and will continue to change over the next couple
 releases, but in a backward compatible way, so hopefully you will not
 need to care unless you want to.

 ...

 Thank you John for this detailed report - I knew bits and pieces
 from prior discussions, but still found this recap very useful.

 I'm now taking advantage of the new environment variable to test
 with both GALAXY_TEST_DEFAULT_INTERACTOR=api (the new
 framework) and GALAXY_TEST_DEFAULT_INTERACTOR=twill
 (the old framework) under TravisCI, see:

 https://github.com/peterjc/galaxy_blast/commit/b9db5c9edc57314c5ab4122bce0b00fa2f9cfb94
 https://github.com/peterjc/pico_galaxy/commit/ceed9e0698989b7a617d75d6c483fa28ae61b333
 http://blastedbio.blogspot.co.uk/2013/09/using-travis-ci-for-testing-galaxy-tools.html
 http://lists.bx.psu.edu/pipermail/galaxy-dev/2014-March/018677.html

 The BLAST+ tests are fine both ways, but there appears to be a
 unicode issue with the API based testing of pico_bio (twill is fine):
 https://travis-ci.org/peterjc/pico_galaxy/builds/21008616

 UnicodeDecodeError: 'ascii' codec can't decode byte 0xbe in position
 847: ordinal not in range(128)

 Thanks,

 Peter


Re: [galaxy-dev] Tool Shed install best practise : Precompiled binaries vs local compile

2014-03-18 Thread Björn Grüning

Hi Greg and Peter,

On 06.03.2014 18:46, Peter Cock wrote:

Hi Greg,

I've retitled the thread, previously about a ToolShed nightly test
failure.

A brief recap, we're talking about the Galaxy ToolShed XML
installation recipes for the NCBI BLAST+ packages and my
MIRA4 wrapper in their tool_dependencies.xml files:

http://toolshed.g2.bx.psu.edu/view/iuc/package_blast_plus_2_2_29
http://testtoolshed.g2.bx.psu.edu/view/iuc/package_blast_plus_2_2_29
http://testtoolshed.g2.bx.psu.edu/view/peterjc/mira4_assembler

These use the pattern of having os/arch specific action tags
(which download and install the tool author's precompiled
binaries) and a fall-back default action, which is to report
an error naming the os/arch combination and stating that there
are no ready-made binaries available.

Greg is instead advocating the fall back action be to download
the source code, and do a local compile.

My reply is below...

On Thu, Mar 6, 2014 at 5:24 PM, Peter Cock p.j.a.c...@googlemail.com wrote:

On Thu, Mar 6, 2014 at 4:53 PM, Greg Von Kuster g...@bx.psu.edu wrote:


As we briefly discussed earlier, your mira4 recipe is not currently
following best practices.  Although you uncovered a problem in
the framework which has now been corrected, your recipe's fall
back actions tag set should be the recipe for installing mira4
from source ( http://sourceforge.net/projects/mira-assembler/ )
since there is no licensing issues for doing so.  This would be a
more ideal approach than echoing the error messages.

Thanks very much for helping us discover this problem though!

Greg Von Kuster


Hi Greg,

No problem - I'm good at discovering problems ;)

If the download approach failed, it is most likely due to a
transient error (e.g. network issues with the download). Here I
would much prefer Galaxy aborted and reported this as an
error (and does not attempt the default action). Is that what
you just fixed?

As to best practice for the fall back action, I think that needs
a new thread.

Regards,

Peter


As to best practice, I do not agree that, in cases like this
(MIRA4, NCBI BLAST+) where binaries are provided
for the major platforms, the fall back should be compiling
from source.

The NCBI provide BLAST+ binaries for 32 bit and 64 bit
Linux and Mac OS X (which I believe covers all the
mainstream platforms Galaxy runs on).

Similarly, MIRA4 provides binaries for 64 bit Linux and
Mac OS X. Note that 32 bit binaries are not provided,
but would be very restricted in terms of the datasets
they could be used on anyway - and I doubt many of
the systems Galaxy runs on these days are 32 bits.


I also think that supporting 32 bit is not really needed, and in the
case of a few libs it is really troublesome.



If the os/arch combination is exotic enough that precompiled
binaries are not available, then it is likely compilation will be
tricky anyway - or not supported for that tool, or Galaxy itself.

Essentially I am arguing that where the precompiled tool
binaries cover any mainstream system Galaxy might
be used on, a local compile fall back is not needed.


Imho, that statement is too general. There might be some binaries that
are done properly, but many of them still have some strange runtime
dependencies. In those cases we need to have a compile-time fallback.



Also, these are both complex tools which are relatively slow
to compile, and have quite a large compile time dependency
set (e.g. MIRA4 requires at least a quite recent GCC, BOOST,
flex, expat, and strongly recommends TCmalloc).
Here at least some of the dependencies have been packaged for
the ToolShed (probably by Bjoern?) but in the case of
MIRA4 and BLAST+ this is still a lot of effort for no
practical gain.


I don't think compile time really matters; you only need to compile them
once, and I think most of us can wait one hour.



I also feel there is an argument that the Galaxy goal of
reproducibility should favour using precompiled binaries if
available: A locally compiled binary will generally mean a
different compiler version, perhaps with different optimisation
flags, and different library versions. It will not necessarily
give the same results as the tool author's provided
precompiled binary.


Yes, that's a good point. On the other hand, we should not forget that
binaries are not necessarily usable over many years. As a really bad
example, take a look at the UCSC tools. You can't run the latest UCSC
tools on an old Scientific Linux, because its libc is too old. So you are
totally lost. I'm not sure how good the MIRA binaries are, but I would
like to point out that there are huge differences in how you can produce
these binaries.


I'm in favour of having both options available wherever we can and letting
the administrator choose the best way to install. Maybe with a default
universe_wsgi.xml setting (preferred_toolshed_install = binary). I
would not call it a 'fallback'; it's really a different installation
strategy, with different priorities. (There was/is a Trello card for

[galaxy-dev] REFERENCE genome in Bowtie2

2014-03-18 Thread Emir Islamovic
Dear Galaxy Representative,

I'm trying to use Bowtie2 in Galaxy and I need to select reference genome.
It says "If your genome of interest is not listed, contact the Galaxy team"
and that is why I am contacting you.
My genome of interest is Maize (corn, Zea mays) and information about it
can be found at http://www.maizegdb.org/

Thank you.
Emir

-- 
Emir Islamovic
Postdoctoral Research Assistant
Plant and Microbial Biology Department
311 Koshland Hall
University of California, Berkeley
Berkeley, CA 94720-3102
Phone: 510-642-8058
Fax: 510-642-4995
Email: emirislamo...@berkeley.edu
  emirislamo...@yahoo.com