I have implemented a cool idea Brad Chapman had at the recent BOSC
Codefest. It would save me tons of effort maintaining separate
proteomics module definitions for the Tool Shed and CloudBioLinux
for Galaxy-P.

I thought I would throw this out there and see if anyone has any comments:

From the pull request:

CloudBioLinux contains custom Fabric install procedures for dozens of
bioinformatics packages, and new ones are added all the time. It also
contains code used to automatically set up Galaxy env.sh files to
support multiple versions of a given piece of software. While it
started off as a way to configure a particular distribution of Linux,
it now supports many distributions, and these custom install
procedures in particular are not tied to any particular variant of
Linux or even to Linux itself.

This changeset adds the ability for Tool Shed tools to quickly and
easily leverage this wealth of Galaxy-module-ready software.

In particular, it adds a new action type, 'cloudbiolinux_install'. I
have created the repository 'ngscbltest' in the 'Next Gen Mappers'
section of the Test Tool Shed to demonstrate and test this
functionality.

Here is an example from 'tool_dependencies.xml' in that repository:

    <package name="tophat2" version="2.0.8b">
        <install version="1.0">
              <action type="cloudbiolinux_install"
                      tool_version="2.0.8b" />
        <readme>Tophat 2.</readme>

When installed into a Galaxy instance, this example will cause the
version '2.0.8b' to be passed to the 'install_tophat2' function in
CloudBioLinux, which is configured to install into the directory
expected by the Tool Shed client code. Additionally, CloudBioLinux
will set up an env.sh file for this installation. All of this will be
installed as a 'tophat2' package.

In the above example, a particular revision of CloudBioLinux was
specified for the sake of reproducibility. That attribute is optional,
however, and will default to 'master'. Likewise, reasonable defaults
for the attributes 'tool_name' and 'tool_version' can be inferred from
context. As a demonstration, the following package definition would
result in Tophat 1.3.3 being installed as the package 'tophat'.

    <package name="tophat" version="1.3.3">
        <install version="1.0">
              <action type="cloudbiolinux_install"/>
        <readme>Tophat 1.3.3.</readme>

The name of the install method doesn't have to match the package
name. For instance, the install_cufflinks function in CloudBioLinux
can be used to install Cufflinks 1 or 2, as shown below:

    <package name="cufflinks2" version="2.1.1">
        <install version="1.0">
              <action type="cloudbiolinux_install"
                      tool_name="cufflinks" />

This final example also demonstrates how to target a customized fork
of CloudBioLinux.
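
The defaulting rules described above can be sketched in a few lines of
Python. This is only an illustration of the behaviour as described in
this post, not the code from the pull request: the helper name
'resolve_install' and the attribute name 'revision' are hypothetical
(the post does not show what the revision attribute is called), and
the XML in the usage example has been closed off so that it parses.

```python
import xml.etree.ElementTree as ET

DEFAULT_REVISION = "master"  # default stated in the post


def resolve_install(package_xml, revision_attr="revision"):
    """Work out which CloudBioLinux install function a package maps to.

    'revision_attr' is a hypothetical attribute name used only for
    illustration; the real attribute name is not shown in the post.
    """
    package = ET.fromstring(package_xml)
    action = package.find(".//action[@type='cloudbiolinux_install']")
    if action is None:
        raise ValueError("no cloudbiolinux_install action found")
    # tool_name and tool_version fall back to the enclosing package.
    tool_name = action.get("tool_name", package.get("name"))
    tool_version = action.get("tool_version", package.get("version"))
    return {
        "package_name": package.get("name"),
        "function": "install_" + tool_name,
        "tool_version": tool_version,
        "revision": action.get(revision_attr, DEFAULT_REVISION),
    }


CUFFLINKS_XML = """
<package name="cufflinks2" version="2.1.1">
    <install version="1.0">
        <action type="cloudbiolinux_install" tool_name="cufflinks" />
    </install>
</package>
"""

resolved = resolve_install(CUFFLINKS_XML)
print(resolved["function"], resolved["tool_version"], resolved["revision"])
```

Here the package is still installed as 'cufflinks2', while the install
function is derived from 'tool_name' and the version is taken from the
enclosing package element.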

Implementation: On the Galaxy side of this equation, the
implementation is fairly straightforward. fabric_util.py has been
updated with a '__install_cloudbiolinux_tool' function which
implements this action entirely. This function clones CloudBioLinux to
a temporary directory (optionally checking out a particular revision),
and uses the CloudBioLinux deployer functionality to configure a
local, no-ssh install of the specified software.
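
The clone-and-checkout step could look roughly like the sketch below.
It is not the code from fabric_util.py: the function names are
hypothetical, the repository URL is assumed to be the upstream
CloudBioLinux repository on GitHub (a fork URL could be substituted),
and the call into the CloudBioLinux deployer is deliberately left out
because its interface is not described here.

```python
import subprocess
import tempfile

# Assumed upstream location; replace with a fork URL if needed.
CBL_REPO_URL = "https://github.com/chapmanb/cloudbiolinux.git"


def build_git_commands(dest_dir, repo_url=CBL_REPO_URL, revision=None):
    """Return the git commands needed to fetch CloudBioLinux."""
    commands = [["git", "clone", repo_url, dest_dir]]
    if revision:
        # Pin to a particular revision for reproducibility.
        commands.append(["git", "-C", dest_dir, "checkout", revision])
    return commands


def fetch_cloudbiolinux(repo_url=CBL_REPO_URL, revision=None):
    """Clone CloudBioLinux into a fresh temporary directory."""
    dest_dir = tempfile.mkdtemp(prefix="cloudbiolinux_")
    for command in build_git_commands(dest_dir, repo_url, revision):
        subprocess.check_call(command)
    # The deployer would be invoked against dest_dir at this point.
    return dest_dir
```

Keeping the command construction separate from the execution makes it
possible to inspect what would be run without touching the network.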

Requirements: The core framework for obtaining and running
CloudBioLinux requires only bash, wget, git, and
python-dev/python-devel. These are likely a subset of the packages the
IUC has already agreed will need to be present. Individual install
procedures may have additional OS-level requirements, but the same
could be true of any Makefile compiled with the existing Tool Shed
infrastructure. If particular modules are identified as being useful,
I am happy to work with the tool developers to ensure the
CloudBioLinux install procedure can work with minimal prerequisites.
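
As a rough illustration, the command-line prerequisites could be
checked up front with something like the following. The function name
is hypothetical and this check is not part of the pull request; it
only covers the executables (bash, wget, git), not the Python
development headers.

```python
import shutil

REQUIRED_COMMANDS = ("bash", "wget", "git")


def missing_commands(commands=REQUIRED_COMMANDS, which=shutil.which):
    """Return the required commands that are not found on PATH."""
    return [name for name in commands if which(name) is None]


missing = missing_commands()
if missing:
    print("Missing prerequisites: " + ", ".join(missing))
```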
