Hi Greg and Peter,

On 06.03.2014 18:46, Peter Cock wrote:
Hi Greg,

I've retitled the thread, previously about a ToolShed nightly test
failure.

A brief recap: we're talking about the Galaxy ToolShed XML
installation recipes for the NCBI BLAST+ packages and my
MIRA4 wrapper, in their tool_dependencies.xml files:

http://toolshed.g2.bx.psu.edu/view/iuc/package_blast_plus_2_2_29
http://testtoolshed.g2.bx.psu.edu/view/iuc/package_blast_plus_2_2_29
http://testtoolshed.g2.bx.psu.edu/view/peterjc/mira4_assembler

These use the pattern of os/arch specific <actions> tags
(which download and install the tool authors' precompiled
binaries) plus a fall back default <actions> tag, which
reports an error naming the os/arch combination and saying
that no ready made binaries are available.
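
For anyone skimming, the recipes look roughly like this (a
simplified sketch only - see the links above for the real
thing; the download URL here is illustrative):

<tool_dependency>
    <package name="blast+" version="2.2.29">
        <install version="1.0">
            <actions_group>
                <!-- one <actions> block per supported os/arch,
                     each installing the precompiled binaries -->
                <actions os="linux" architecture="x86_64">
                    <action type="download_by_url">ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.2.29/ncbi-blast-2.2.29+-x64-linux.tar.gz</action>
                    <action type="move_directory_files">
                        <source_directory>.</source_directory>
                        <destination_directory>$INSTALL_DIR</destination_directory>
                    </action>
                    <action type="set_environment">
                        <environment_variable name="PATH" action="prepend_to">$INSTALL_DIR/bin</environment_variable>
                    </action>
                </actions>
                <!-- ...more os/arch specific <actions> blocks... -->
                <!-- default fall back: just report the problem -->
                <actions>
                    <action type="shell_command">echo "ERROR: No precompiled binaries available for your os/arch combination."</action>
                </actions>
            </actions_group>
        </install>
    </package>
</tool_dependency>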

Greg is instead advocating that the fall back action be to
download the source code and do a local compile.
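
Concretely, that would mean replacing the default <actions>
block above with something along these lines (again only a
sketch - the URL is illustrative, and MIRA's real build is
rather more involved than a plain configure/make):

<actions>
    <!-- fall back: fetch the source and compile locally -->
    <action type="download_by_url">http://downloads.sourceforge.net/mira-assembler/mira-4.0.tar.bz2</action>
    <action type="shell_command">./configure --prefix=$INSTALL_DIR</action>
    <action type="shell_command">make</action>
    <action type="shell_command">make install</action>
    <action type="set_environment">
        <environment_variable name="PATH" action="prepend_to">$INSTALL_DIR/bin</environment_variable>
    </action>
</actions>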

My reply is below...

On Thu, Mar 6, 2014 at 5:24 PM, Peter Cock <p.j.a.c...@googlemail.com> wrote:
On Thu, Mar 6, 2014 at 4:53 PM, Greg Von Kuster <g...@bx.psu.edu> wrote:

As we briefly discussed earlier, your mira4 recipe is not currently
following best practices.  Although you uncovered a problem in
the framework which has now been corrected, your recipe's fall
back <actions> tag set should be the recipe for installing mira4
from source ( http://sourceforge.net/projects/mira-assembler/ ),
since there are no licensing issues preventing this.  That would
be a better approach than just echoing error messages.

Thanks very much for helping us discover this problem though!

Greg Von Kuster

Hi Greg,

No problem - I'm good at discovering problems ;)

If the download approach failed, it is most likely due to a
transient error (e.g. network issues during the download).
Here I would much prefer that Galaxy abort and report this
as an error (and not attempt the default action). Is that
what you just fixed?

As to best practice for the fall back action, I think that needs
a new thread.

Regards,

Peter

As to best practice, I do not agree that in cases like these
(MIRA4, NCBI BLAST+), where binaries are provided for the
major platforms, the fall back should be compiling from
source.

The NCBI provide BLAST+ binaries for 32 bit and 64 bit
Linux and Mac OS X (which I believe covers all the
mainstream platforms Galaxy runs on).

Similarly, MIRA4 provides binaries for 64 bit Linux and
Mac OS X. Note that 32 bit binaries are not provided,
but they would be very restricted in terms of the datasets
they could be used on anyway - and I doubt many of
the systems Galaxy runs on these days are 32 bit.

I also think that supporting 32 bit is not really needed, and in the case of a few libraries it is really troublesome.

If the os/arch combination is exotic enough that precompiled
binaries are not available, then it is likely that compilation
will be tricky anyway - or simply not supported, either for
that tool or for Galaxy itself.

Essentially I am arguing that where the precompiled tool
binaries cover any mainstream system Galaxy might
be used on, a local compile fall back is not needed.

Imho, that statement is too general. There may be some binaries that are built properly, but many of them still have strange runtime dependencies. In those cases we need to have a compile time fallback.

Also, these are both complex tools which are relatively slow
to compile, and which have quite a large compile time dependency
set (e.g. MIRA4 requires at least quite a recent GCC, plus
BOOST, flex and expat, and strongly recommends TCMalloc).
Here at least some of the dependencies have been packaged for
the ToolShed (probably by Bjoern?), but in the case of
MIRA4 and BLAST+ this is still a lot of effort for no
practical gain.

I don't think compile time really matters; you only need to compile them once, and I think most of us can wait an hour.

I also feel there is an argument that the Galaxy goal of
reproducibility should favour using precompiled binaries if
available: A locally compiled binary will generally mean a
different compiler version, perhaps with different optimisation
flags, and different library versions. It will not necessarily
give the same results as the tool author's provided
precompiled binary.

Yes, that's a good point. On the other hand, we should not forget that binaries are not necessarily usable over many years. As a really bad example, take a look at the UCSC tools: you can't run the latest UCSC tools on an old Scientific Linux, because its libc is too old - so you are totally lost. I'm not sure how good the MIRA binaries are, but I would like to point out that there are huge differences in how these binaries can be produced.

I'm in favour of having both options available wherever we can, and letting the administrator choose the best way to install - maybe with a default universe_wsgi.ini setting (preferred_toolshed_install = "binary"). I would not call it a 'fallback'; it's really a different installation strategy, with different priorities. (There was/is a Trello card for this, wasn't there?)
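
Something like this (the option name is just my suggestion,
it does not exist yet):

# in universe_wsgi.ini - hypothetical, not implemented today:
# prefer the tool authors' precompiled binaries over a local
# compile where a recipe offers both
preferred_toolshed_install = binary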

That said, I totally understand that if you have binaries you do not want to go through the trouble of compiling the tool with all its dependencies, but we should highlight that this is the 'best/ideal/preferred' way to do so.

Ciao,
Bjoern

(Wow, this ended up being a long email!)

Regards,

Peter
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
   http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
   http://galaxyproject.org/search/mailinglists/
