Re: [galaxy-dev] The new hg based Galaxy Tool Shed

2011-06-16 Thread Peter Cock
On Thu, Jun 16, 2011 at 3:00 AM, Ravi Madduri madd...@mcs.anl.gov wrote:
 I apologize for jumping on to this thread a bit late. I read below that
 there is a plan to pull tools into a galaxy installation automagically. I
 wonder if you plan on providing some kind of API to query the tool registry
 and discover the tools and install them into an existing galaxy
 installation.

Yes, have a look at Greg's slides from the Galaxy Community Conference
http://wiki.g2.bx.psu.edu/GCC2011

Peter
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/


Re: [galaxy-dev] The new hg based Galaxy Tool Shed

2011-06-15 Thread Ravi Madduri
I apologize for jumping on to this thread a bit late. I read below that there 
is a plan to pull tools into a galaxy installation automagically. I wonder if 
you plan on providing some kind of API to query the tool registry and discover 
the tools and install them into an existing galaxy installation.

PS: The link "How to upload, download and install tools" under Help seems to be
broken.
On Jun 1, 2011, at 3:00 PM, Nate Coraor wrote:

 Peter Cock wrote:
 
 Well, use of the dependency system isn't required, so just setting
 things up on the $PATH is always a possibility.  I was going to suggest
 that your patch could be applied if it was conditional on the local
 runner and checked after any <requirement type="package">
 dependencies were setup, ...
 
 Is that a request for me to update the patch? I've not delved into the
 job runner code before, so it might take me a bit longer than it would
 take you. Hint hint ;) I'd help with testing though.
 
 It's not a completely trivial thing, which is why I didn't do it at the
 time.  It's probably something that should be added to the DRM wrapper
 script so that a nice error message can be supplied.  I can't think of a
 way to check at tool load that wouldn't be painfully slow.
 
 ... but there's still the problem of people running jobs through
 the local runner which are actually sent to the cluster without Galaxy's
 knowledge.  Perhaps this is something we shouldn't worry too much about,
 but I know there are people doing it.
 
 You mean if Galaxy blindly calls a tool or script, and that script
 then submits the job to the cluster? I'd say checking the cluster
 dependencies there was the tool author's responsibility.
 
 Yeah, that's the idea.  Unfortunately, if the binary isn't installed on
 the Galaxy server (which is irrelevant), the tool won't load, which is
 certainly not what we want.
 
 --nate
 
 
 Peter
 

--
Ravi K Madduri
The Globus Alliance | Argonne National Laboratory | University of Chicago
http://www.mcs.anl.gov/~madduri


Re: [galaxy-dev] The new hg based Galaxy Tool Shed

2011-06-01 Thread Nate Coraor
Hi Peter,

Greg will probably reply, but I'll throw in my $0.02 as well.

Peter Cock wrote:
 Hi Greg et al,
 
 I've just been looking over your slides from last week about the new
 'Galaxy Tool Shed', which are posted online here:
 
 http://wiki.g2.bx.psu.edu/GCC2011
 
 http://wiki.g2.bx.psu.edu/GCC2011?action=AttachFile&do=get&target=GalaxyToolShed.pdf
 
 They talk about how you will be tracking individual tools in hg repositories.
 
 I can see two ways this might work:
 
 (1) Each of these tool specific repositories (or branches if you just make one
 repository for each tool owner) would be a full fork of the Galaxy code base.
 This allows in principle tools to include changes to core functionality (but
 that seems dangerous due to potential merge clashes), and any existing
 tool contributor's pre-existing hg forks on bitbucket might be reused.

The tool shed isn't really intended for framework changes - I would
suggest keeping these as bitbucket forks, although it would certainly be
good if we had a way to locate the list of such forks centrally.

 (2) Each of these tool specific repositories would ONLY track the tool specific
 files you'd add to Galaxy to install the tool. So, typically there would be an
 XML file, perhaps a wrapper script, maybe a sample loc file, and a plain
 text readme file.
 
 I'm guessing you've gone for something along the lines of idea (2), but I

Yep.

 would love to hear more about how this will all work. e.g. Where would
 the tool shed repositories be hosted, and would tool authors use hg to
 work with them, or something like the current web based tool upload?

They're hosted here, and you can check them out and work with them
locally as you do the Galaxy source itself, or use the new web-based
upload to upload individual files or tarballs.

Have a look at the test instance of the next-gen toolshed here if you'd
like to see how it works:

  http://testtoolshed.g2.bx.psu.edu/

Please feel free to use this as a sandbox and report any issues you find.

--nate

 
 Regards,
 
 Peter


Re: [galaxy-dev] The new hg based Galaxy Tool Shed

2011-06-01 Thread Peter Cock
On Wed, Jun 1, 2011 at 3:22 PM, Nate Coraor n...@bx.psu.edu wrote:
 Hi Peter,

 Greg will probably reply, but I'll throw in my $0.02 as well.

Great - but with your answers you've triggered more questions ;)

 Peter Cock wrote:
 Hi Greg et al,

 I've just been looking over your slides from last week about the new
 'Galaxy Tool Shed', which are posted online here:

 http://wiki.g2.bx.psu.edu/GCC2011

 http://wiki.g2.bx.psu.edu/GCC2011?action=AttachFile&do=get&target=GalaxyToolShed.pdf

 They talk about how you will be tracking individual tools in hg repositories.

 I can see two ways this might work:

 (1) Each of these tool specific repositories (or branches if you just make one
 repository for each tool owner) would be a full fork of the Galaxy code base.
 This allows in principle tools to include changes to core functionality (but
 that seems dangerous due to potential merge clashes), and any existing
 tool contributor's pre-existing hg forks on bitbucket might be reused.

 The tool shed isn't really intended for framework changes - I would
 suggest keeping these as bitbucket forks, although it would certainly be
 good if we had a way to locate the list of such forks centrally.

Well, as long as the repository is created by forking on bitbucket, then
the link exists in the bitbucket web interface:
https://bitbucket.org/galaxy/galaxy-central/descendants

 (2) Each of these tool specific repositories would ONLY track the tool specific
 files you'd add to Galaxy to install the tool. So, typically there would be an
 XML file, perhaps a wrapper script, maybe a sample loc file, and a plain
 text readme file.

 I'm guessing you've gone for something along the lines of idea (2), but I

 Yep.

It did seem the most likely route.

 would love to hear more about how this will all work. e.g. Where would
 the tool shed repositories be hosted, and would tool authors use hg to
 work with them, or something like the current web based tool upload?

 They're hosted here, and you can check them out and work with them
 locally as you do the Galaxy source itself, or use the new web-based
 upload to upload individual files or tarballs.

 Have a look at the test instance of the next-gen toolshed here if you'd
 like to see how it works:

  http://testtoolshed.g2.bx.psu.edu/

 Please feel free to use this as a sandbox and report any issues you find.

I see the existing usernames and passwords from the old Tool Shed were
transferred - that makes life easier. And it lists the hg information, e.g.

hg clone http://pete...@testtoolshed.g2.bx.psu.edu/repos/peterjc/venn_list
hg clone 
http://pete...@testtoolshed.g2.bx.psu.edu/repos/peterjc/tmhmm_and_signalp

What happens with branches? Would the Tool Shed just show the
default branch? That seems best for a simple UI.

I have a query regarding the way the tools are shown in tables and the
version column, which shows a changeset and revision number. According
to Greg's slides (slide #10, titled "Simpler tool versioning", which seems
ironic to me), the old numerical version is still there in the XML - and I'd
prefer to see that. How about having both shown (two columns, perhaps called
"Public version" and "hg version" or "hg revision")?

With regards to the planned installation functionality, what happens when
a tool repository (aka Tool Suite in the old model) contains several XML
wrappers - would you be able to choose which are wanted? The use case
I have here is when several tools share some common dependency (which
should be tracked in a single repository), and are therefore useful to
bundle together as a suite, but where not all the tools will be of global
interest (e.g. My TMHMM, SignalP, etc suite).

Peter



Re: [galaxy-dev] The new hg based Galaxy Tool Shed

2011-06-01 Thread Greg Von Kuster
Hello Peter - I finally got a chance to jump in - see my inline comments...


On Jun 1, 2011, at 11:00 AM, Peter Cock wrote:

 On Wed, Jun 1, 2011 at 3:22 PM, Nate Coraor n...@bx.psu.edu wrote:
 Hi Peter,
 
 Greg will probably reply, but I'll throw in my $0.02 as well.
 
 Great - but with your answers you've triggered more questions ;)
 
 Peter Cock wrote:
 Hi Greg et al,
 
 I've just been looking over your slides from last week about the new
 'Galaxy Tool Shed', which are posted online here:
 
 http://wiki.g2.bx.psu.edu/GCC2011
 
 http://wiki.g2.bx.psu.edu/GCC2011?action=AttachFile&do=get&target=GalaxyToolShed.pdf
 
 They talk about how you will be tracking individual tools in hg repositories.
 
 I can see two ways this might work:
 
 (1) Each of these tool specific repositories (or branches if you just make one
 repository for each tool owner) would be a full fork of the Galaxy code base.
 This allows in principle tools to include changes to core functionality (but
 that seems dangerous due to potential merge clashes), and any existing
 tool contributor's pre-existing hg forks on bitbucket might be reused.
 
 The tool shed isn't really intended for framework changes - I would
 suggest keeping these as bitbucket forks, although it would certainly be
 good if we had a way to locate the list of such forks centrally.
 
 Well, as long as the repository is created by forking on bitbucket, then
 the link existing in the bitbucket web interface.
 https://bitbucket.org/galaxy/galaxy-central/descendants


What's important here is that each tool or set of tools is its own separate
entity - see the future big picture highlights below for reasons.


 
 (2) Each of these tool specific repositories would ONLY track the tool specific
 files you'd add to Galaxy to install the tool. So, typically there would be an
 XML file, perhaps a wrapper script, maybe a sample loc file, and a plain
 text readme file.
 
 I'm guessing you've gone for something along the lines of idea (2), but I
 
 Yep.
 
 It did seem the most likely route.
 
 would love to hear more about how this will all work. e.g. Where would
 the tool shed repositories be hosted, and would tool authors use hg to
 work with them, or something like the current web based tool upload?
 
 They're hosted here, and you can check them out and work with them
 locally as you do the Galaxy source itself, or use the new web-based
 upload to upload individual files or tarballs.
 
 Have a look at the test instance of the next-gen toolshed here if you'd
 like to see how it works:
 
  http://testtoolshed.g2.bx.psu.edu/
 
 Please feel free to use this as a sandbox and report any issues you find.
 
 I see the existing usernames and passwords from the old Tool Shed were
 transferred - that makes life easier. And it lists the hg information, e.g.
 
 hg clone http://pete...@testtoolshed.g2.bx.psu.edu/repos/peterjc/venn_list
 hg clone 
 http://pete...@testtoolshed.g2.bx.psu.edu/repos/peterjc/tmhmm_and_signalp
 
 What happens with branches? Would the Tool Shed just show the
 default branch? That seems best for a simple UI.

Some of the branching details are yet to be worked out, but forks are easy 
because repository urls include the unique username of the Galaxy user.
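
A minimal Python sketch of why forks stay distinguishable, assuming the
/repos/<owner>/<name> URL layout seen in the hg clone lines quoted above.
The function name and the owner "otheruser" are hypothetical, used only for
illustration.

```python
def repo_clone_url(shed_host, owner, name, login=None):
    """Build an hg clone URL, ASSUMING the /repos/<owner>/<name> layout."""
    auth = f"{login}@" if login else ""
    return f"http://{auth}{shed_host}/repos/{owner}/{name}"


# Same repository name, different owners: the URLs cannot collide.
original = repo_clone_url("testtoolshed.g2.bx.psu.edu", "peterjc", "venn_list")
fork = repo_clone_url("testtoolshed.g2.bx.psu.edu", "otheruser", "venn_list")
print(original)
print(fork)
```

The owner in the path only keeps the repositories apart; it does nothing
about the tool IDs inside the XML files.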

 
 I have a query regarding the way the tools are shown in tables and the
 version column, which shows a changeset and revision number. According
 to Greg's slides (slide #10, titled "Simpler tool versioning", which seems
 ironic to me), the old numerical version is still there in the XML - and I'd
 prefer to see that. How about having both shown (two columns, perhaps called
 "Public version" and "hg version" or "hg revision")?


We can certainly do this, but what would you like to see for tool suites and 
other tool types?  The old Galaxy tool shed strictly required a 
suite_config.xml file that included the overall version of the suite.  To make 
tool development easier, we're no longer requiring the inclusion of a 
suite_config.xml file (we don't even differentiate types of tools since
everything is a repository).  The definition of a tool in the next gen tool
shed is fairly loose.  A tool could be data, it could be an exported workflow,
it could be a suite of tools, a single tool, or just a set of files.  So we'll 
need to define an easy way to provide a version of the tool if it will be 
different than the version of the repository tip.


 
 With regards to the planned installation functionality, what happens when
 a tool repository (aka Tool Suite in the old model) contains several XML
 wrappers - would you be able to choose which are wanted?

Yes - see below...

 The use case
 I have here is when several tools share some common dependency (which
 should be tracked in a single repository), and were therefore useful to
 bundle together as a suite, but where not all the tools will be of global
 interest (e.g. My TMHMM, SignalP, etc suite).



Here's the future big picture highlights.  Many of the 

Re: [galaxy-dev] The new hg based Galaxy Tool Shed

2011-06-01 Thread Peter Cock
On Wed, Jun 1, 2011 at 4:22 PM, Greg Von Kuster g...@bx.psu.edu wrote:
 Hello Peter - I finally got a chance to jump in - see my inline comments...

Hi :)

 What happens with branches? Would the Tool Shed just show the
 default branch? That seems best for a simple UI.

 Some of the branching details are yet to be worked out, but forks are easy
 because repository urls include the unique username of the Galaxy user.

Well, yes and no - as long as there are competing versions of a Galaxy tool
(e.g. from an original author and a fork by a second author), and they use
the same ID in their XML, you have a clash. This will have to be considered
in the (automated) install interface. i.e. In general, when installing or
updating any tool, there may be existing versions of some components
already present. In fact two completely unrelated tools could even have
the same XML ID by accident.

 I have a query regarding the way the tools are shown in tables and the
 version column, which shows a changeset and revision number. According
 to Greg's slides (slide #10, titled "Simpler tool versioning", which seems
 ironic to me), the old numerical version is still there in the XML - and I'd
 prefer to see that. How about having both shown (two columns, perhaps called
 "Public version" and "hg version" or "hg revision")?

 We can certainly do this, but what would you like to see for tool suites and
 other tool types?  The old Galaxy tool shed strictly required a
 suite_config.xml file that included the overall version of the suite.  To make
 tool development easier, we're no longer requiring the inclusion of a
 suite_config.xml file (we don't even differentiate types of tools since
 everything is a repository).  The definition of a tool in the next gen tool
 shed is fairly loose.  A tool could be data, it could be an exported workflow,
 it could be a suite of tools, a single tool, or just a set of files.  So we'll
 need to define an easy way to provide a version of the tool if it will be
 different than the version of the repository tip.

I see what you mean for the suite case. Maybe on the view details page
each constituent tool could be shown with its classical version number
from the XML file?
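
To make the idea concrete, here is a rough Python sketch of how the version
numbers could be collected for such a details page. It assumes each wrapper
is an XML file whose root element is <tool> carrying id, name and version
attributes; the function name and the sample tool are hypothetical.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def tool_versions(repo_dir):
    """Return (id, name, version) for every tool wrapper XML under repo_dir."""
    found = []
    for xml_path in sorted(Path(repo_dir).rglob("*.xml")):
        try:
            root = ET.parse(xml_path).getroot()
        except ET.ParseError:
            continue  # not well-formed XML, so not a usable wrapper
        if root.tag != "tool":
            continue  # some other XML file, e.g. a suite_config.xml
        found.append((root.get("id"), root.get("name"), root.get("version")))
    return found


# Demo on a throwaway directory standing in for a repository checkout.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "tmhmm2.xml").write_text(
        '<tool id="tmhmm2" name="TMHMM 2.0" version="0.0.1"/>'
    )
    Path(tmp, "readme.txt").write_text("plain text readme")
    demo = tool_versions(tmp)
print(demo)
```

A repository holding only data or a workflow would simply yield an empty
list, which fits the loose definition of a "tool" described above.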


 Here's the future big picture highlights.  Many of the details are yet to
 be defined and fleshed out...

 We're hoping that in the near future there will be many local tool sheds
 ( just like Galaxy instances ).  I'm thinking that there will be a central
 tool shed broker of sorts that is hosted by the Galaxy team.  This broker will
 provide 2 basic functions.  It will enable local tool sheds ( including the
 current tool shed hosted by the Galaxy team ) to advertise their tools,
 and it will allow local Galaxy instances to use those advertisements to
 find tools that the local Galaxy instance's users are interested in.  This
 specific point has not yet been discussed to any depth, so consider it
 fluid for now.

I'm not immediately sold on this plan. To me one of the big plus points
of having a single Official Tool Shed looked after by the Galaxy team
is the convenience factor (a one stop shop), which requires critical mass,
plus whatever QA happens as part of the current approval process. I
would regard it as a step backwards if in order to hunt for a wrapper for
a given tool, I had to resort to Google in order to find all the individual
Galaxy Tool Sheds.

 When a Galaxy instance's admin locates tools within a specific tool shed
 that they want to install, they will be able to install them via a Galaxy tool
 installation control panel.  Think of a UI that provides a check-boxed list
 of tools that have been found in some tool shed or sheds. The Galaxy
 admin will check those tools he wants to install, and the tools, along with
 all dependencies will automatically be installed in the local Galaxy instance.
 Dependencies could include 3rd party binaries, maybe some form of data,
 and other forms of dependencies.  This is another good reason to keep
 tools separated in their own repositories.

If you mean by dependencies the small task of installing the tool XML
and associated scripts and data files currently bundled in the tar balls
on the current Tool Shed, that seems fine. Anything beyond that seems
difficult and likely to impose a significant extra load on tool wrapper
authors.

 The installation will be virtually automatic, requiring little or no manual
 intervention via a package manager of sorts.  This will be done using
 a combination of fabric scripts, and other components.  All of the
 underlying mercurial stuff will be handled beneath the UI layer.

This larger aim of installing the underlying dependencies is impossible
in general - but that seems to be what you want to aim for. Consider the
obvious use case of closed source (non-redistributable) 3rd party binaries.
I can think of several examples from the current Tool Shed wrappers,
including the Roche Newbler off instrument applications, TMHMM
and SignalP.


Re: [galaxy-dev] The new hg based Galaxy Tool Shed

2011-06-01 Thread Nate Coraor
Peter Cock wrote:
 
 Well, yes and no - as long as there are competing versions of a Galaxy tool
 (e.g. from an original author and a fork by a second author), and they use
 the same ID in their XML, you have a clash. This will have to be considered
 in the (automated) install interface. i.e. In general, when installing or
 updating any tool, there may be existing versions of some components
 already present. In fact two completely unrelated tools could even have
 the same XML ID by accident.

I agree there could be a problem with tool ID uniqueness.  We've talked
about suggesting that people namespace their tool IDs to prevent this,
but nothing formal has materialized at this point.
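
As an illustration of the clash being discussed, a minimal Python sketch
that flags tool IDs provided by more than one repository. The function name,
the fork by "otheruser" and the tool IDs are all hypothetical; nothing like
this is confirmed to exist in the Tool Shed code.

```python
from collections import defaultdict


def find_id_clashes(installed):
    """installed: iterable of (repo, tool_id) pairs.

    Returns {tool_id: [repos...]} for every ID supplied by more than one repo.
    """
    sources = defaultdict(set)
    for repo, tool_id in installed:
        sources[tool_id].add(repo)
    return {tid: sorted(repos) for tid, repos in sources.items() if len(repos) > 1}


clashes = find_id_clashes([
    ("repos/peterjc/venn_list", "venn_list"),
    ("repos/otheruser/venn_list", "venn_list"),     # hypothetical fork
    ("repos/peterjc/tmhmm_and_signalp", "tmhmm2"),  # hypothetical tool ID
])
print(clashes)
```

An installer could run a check like this before adding or updating a tool and
ask the admin which copy to keep.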

 I'm not immediately sold on this plan. To me one of the big plus points
 of having a single Official Tool Shed looked after by the Galaxy team
 is the convenience factor (a one stop shop), which requires critical mass,
 plus whatever QA happens as part of the current approval process. I
 would regard it as a step backwards if in order to hunt for a wrapper for
 a given tool, I had to resort to Google in order to find all the individual
 Galaxy Tool Sheds.

It'll be possible for people to run their own Tool Sheds if they'd like,
for whatever purpose - and this may be necessary for sharing extremely
large data which we can't possibly host at the main Shed, but there
should be an aggregator somewhere which lists all of the available
public Sheds and makes it easy to add them as new sources to your Galaxy
install.  Like a slightly more organized Debian APT system.

 If you mean by dependencies the small task of installing the tool XML
 and associated scripts and data files currently bundled in the tar balls
 on the current Tool Shed, that seems fine. Anything beyond that seems
 difficult and likely to impose a significant extra load on tool wrapper
 authors.

It'll be up to the authors to decide what level of complexity they care
to handle, but we want to move away from the situation where someone
installs a tool but finds that it's unusable because the actual
underlying dependency doesn't exist and is non-trivial to install.

 This larger aim of installing the underlying dependencies is impossible
 in general - but that seems to be what you want to aim for. Consider
 obvious use case of closed source (non-redistributable) 3rd party binaries.
 I can think of several examples from the current Tool Shed wrappers,
 including the Roche Newbler off instrument applications, TMHMM
 and SignalP.

Agreed, thankfully, the current dependency system (tool_dependency_dir
in the config file (not in the sample config, sorry, I'll remedy that
shortly!)) only requires that you have an environment file that
configures whatever is necessary (generally just $PATH) to find a
dependency.  So the tools in the Tool Shed would provide the XML,
wrapper script (if necessary), and then instructions or perhaps an
interface to configure the env file.
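
A hedged Python sketch of the environment-file idea. The layout
<tool_dependency_dir>/<name>/<version>/env.sh is an ASSUMPTION made for this
example (the thread does not spell it out), and the function name, package
names and versions are hypothetical.

```python
import os
import shlex
import tempfile


def dependency_shell_prefix(tool_dependency_dir, name, version):
    """Return a shell snippet that sources the package's env file, or None.

    ASSUMED layout: <tool_dependency_dir>/<name>/<version>/env.sh
    """
    env_file = os.path.join(tool_dependency_dir, name, version, "env.sh")
    if os.path.isfile(env_file):
        return ". " + shlex.quote(env_file) + "; "
    return None  # no env file: fall back to whatever is already on $PATH


# Demo with a throwaway dependency directory.
with tempfile.TemporaryDirectory() as deps:
    os.makedirs(os.path.join(deps, "tmhmm", "2.0"))
    with open(os.path.join(deps, "tmhmm", "2.0", "env.sh"), "w") as fh:
        fh.write('PATH="/opt/tmhmm/bin:$PATH"; export PATH\n')
    found = dependency_shell_prefix(deps, "tmhmm", "2.0")
    missing = dependency_shell_prefix(deps, "signalp", "3.0")
print(found)
print(missing)
```

The returned prefix would be placed in front of the tool's command line, so a
closed source binary only has to be installed somewhere and pointed at by the
env file.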

 Even if you just hope to cover open source tool dependencies, this is
 another big problem which seems like something Galaxy shouldn't be
 taking on. Frankly the only way I expect this grand plan to have any
 practical chance of success is if you limit yourselves to a single existing
 Linux package management platform like RPM or Deb files (although
 doing that would limit Galaxy's appeal). e.g. Work hand in hand with
 Debian-Med to ensure any missing tool is covered.

Distributing binaries for the core platforms (Linux i686/x86_64) and Mac
OS X is probably not terribly difficult for us, but would be more work
for 3rd party developers - but the choice to do this is up to them.
I also haven't given too much thought to how this would work.  dpkg
and rpm have the upside of being deterministic, but the downside of
being platform-specific, requiring root, and not having much ability to
install to varying paths.

A fallback to source if binaries are not available would also be nice,
if it's possible to write some easy instructions on how to compile, but
of course this won't always be the case.

 Are you biting off more than you can chew? I hope I am misinterpreting
 your plans.

Hopefully not!  We're trying to think this through pretty thoroughly
before we get started, thanks for joining in the discussion. =)

 (And for the umpteenth time, I am frustrated I couldn't make it to
 the Galaxy conference last week in person - more for this kind of
 discussion rather than the talks themselves. Will you be at BOSC
 or ISMB 2011 in Vienna? Maybe that could be another thread...)

Agreed!  I do believe there are some people going to BOSC, Dave will
hopefully chime in with the details (when he's awake, I think he was
only flying back today).

--nate

 
 Regards,
 
 Peter
 

Re: [galaxy-dev] The new hg based Galaxy Tool Shed

2011-06-01 Thread Peter Cock
On Wed, Jun 1, 2011 at 5:25 PM, Nate Coraor n...@bx.psu.edu wrote:
 Peter Cock wrote:

 Well, yes and no - as long as there are competing versions of a Galaxy tool
 (e.g. from an original author and a fork by a second author), and they use
 the same ID in their XML, you have a clash. This will have to be considered
 in the (automated) install interface. i.e. In general, when installing or
 updating any tool, there may be existing versions of some components
  already present. In fact two completely unrelated tools could even have
 the same XML ID by accident.

 I agree there could be a problem with tool ID uniqueness.  We've talked
 about suggesting that people namespace their tool IDs to prevent this,
 but nothing formal has materialized at this point.

That sounds sensible, and the sooner the better.

 I'm not immediately sold on this plan. To me one of the big plus points
 of having a single Official Tool Shed looked after by the Galaxy team
 is the convenience factor (a one stop shop), which requires critical mass,
 plus whatever QA happens as part of the current approval process. I
 would regard it as a step backwards if in order to hunt for a wrapper for
 a given tool, I had to resort to Google in order to find all the individual
 Galaxy Tool Sheds.

 It'll be possible for people to run their own Tool Sheds if they'd like,
 for whatever purpose - and this may be necessary for sharing extremely
 large data which we can't possibly host at the main Shed, but there
 should be an aggregator somewhere which lists all of the available
 public Sheds and makes it easy to add them as new sources to your Galaxy
 install.  Like a slightly more organized Debian APT system.

If there is an official meta tool shed aggregator, that would address
my main concern about fragmenting things.

 If you mean by dependencies the small task of installing the tool XML
 and associated scripts and data files currently bundled in the tar balls
 on the current Tool Shed, that seems fine. Anything beyond that seems
 difficult and likely to impose a significant extra load on tool wrapper
 authors.

 It'll be up to the authors to decide what level of complexity they care
 to handle,

Good - that silences a lot of my worries.

 ... but we want to move away from the situation where someone
 installs a tool but finds that it's unusable because the actual
 underlying dependency doesn't exist and is non-trivial to install.

Improving the documentation shown on the tool shed could help here -
make it easier for the tool wrapper to tell the Tool Shed user what will
be required.

Currently we get a short plain text box as part of the upload (no markup),
and can include a (plain text) readme file which is easily viewable from
the tool shed. I've just filed an enhancement request on a related idea:

https://bitbucket.org/galaxy/galaxy-central/issue/565/
Show mockup of tool GUI in Galaxy Tool Shed

 This larger aim of installing the underlying dependencies is impossible
 in general - but that seems to be what you want to aim for. Consider
 obvious use case of closed source (non-redistributable) 3rd party binaries.
 I can think of several examples from the current Tool Shed wrappers,
 including the Roche Newbler off instrument applications, TMHMM
 and SignalP.

 Agreed, thankfully, the current dependency system (tool_dependency_dir
 in the config file (not in the sample config, sorry, I'll remedy that
 shortly!)) only requires that you have an environment file that
 configures whatever is necessary (generally just $PATH) to find a
 dependency.  So the tools in the Tool Shed would provide the XML,
 wrapper script (if necessary), and then instructions or perhaps an
 interface to configure the env file.

I'd hope the common case where all that is required is the tool binary
to be on the path, would not require any extra configuration files. See
also: https://bitbucket.org/galaxy/galaxy-central/issue/82
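
For the simple case, a check along these lines might be enough. This is only
a Python sketch using the standard library's shutil.which; it is not the
patch attached to the issue above, and the function names and binary names
are hypothetical.

```python
import os
import shutil
import stat
import tempfile


def missing_executables(names, path=None):
    """Return the names that cannot be found as executables.

    path=None means "search the current $PATH" (shutil.which's default).
    """
    return [name for name in names if shutil.which(name, path=path) is None]


def check_or_explain(names, path=None):
    """Raise a readable error instead of letting the job fail obscurely."""
    missing = missing_executables(names, path=path)
    if missing:
        raise RuntimeError("Not found on $PATH: " + ", ".join(missing))


# Demo with a throwaway directory standing in for $PATH (POSIX-style).
with tempfile.TemporaryDirectory() as bindir:
    fake = os.path.join(bindir, "tmhmm")
    with open(fake, "w") as fh:
        fh.write("#!/bin/sh\n")
    os.chmod(fake, os.stat(fake).st_mode | stat.S_IXUSR)
    result = missing_executables(["tmhmm", "signalp"], path=bindir)
print(result)
```

Note that this only tells you about the machine the check runs on, which may
not be the machine where the job eventually executes.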

 [cut]

 Are you biting off more than you can chew? I hope I am misinterpreting
 your plans.

 Hopefully not!  We're trying to think this through pretty thoroughly
 before we get started, thanks for joining in the discussion. =)

I've been reassured :)

Peter



Re: [galaxy-dev] The new hg based Galaxy Tool Shed

2011-06-01 Thread Chris Fields
(apologies in advance, limiting my response to the two questions below)

On Jun 1, 2011, at 11:54 AM, Peter Cock wrote:

 On Wed, Jun 1, 2011 at 5:25 PM, Nate Coraor n...@bx.psu.edu wrote:
 Peter Cock wrote:
 
 Well, yes and no - as long as there are competing versions of a Galaxy tool
 (e.g. from an original author and a fork by a second author), and they use
 the same ID in their XML, you have a clash. This will have to be considered
 in the (automated) install interface. i.e. In general, when installing or
 updating any tool, there may be existing versions of some components
 already present. In fact two completely unrelated tools could even have
 the same XML ID by accident.
 
 I agree there could be a problem with tool ID uniqueness.  We've talked
 about suggesting that people namespace their tool IDs to prevent this,
 but nothing formal has materialized at this point.
 
 That sounds sensible, and the sooner the better.

Agreed.  I think simple namespace prefixes (maybe hg account?) are the easiest
option.
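
A tiny Python sketch of what such a prefix could look like. The "/"
separator and the function name are assumptions for illustration; no
convention has been agreed on in this thread.

```python
def namespaced_tool_id(owner, tool_id, sep="/"):
    """Prefix tool_id with the owner's account name, unless already prefixed."""
    prefix = owner + sep
    return tool_id if tool_id.startswith(prefix) else prefix + tool_id


print(namespaced_tool_id("peterjc", "venn_list"))
print(namespaced_tool_id("otheruser", "venn_list"))  # hypothetical fork owner
```

Two forks of the same wrapper then end up with different IDs without either
author having to coordinate with the other.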

 I'm not immediately sold on this plan. To me one of the big plus points
 of having a single Official Tool Shed looked after by the Galaxy team
 is the convenience factor (a one stop shop), which requires critical mass,
 plus whatever QA happens as part of the current approval process. I
 would regard it as a step backwards if in order to hunt for a wrapper for
 a given tool, I had to resort to Google in order to find all the individual
 Galaxy Tool Sheds.
 
 It'll be possible for people to run their own Tool Sheds if they'd like,
 for whatever purpose - and this may be necessary for sharing extremely
 large data which we can't possibly host at the main Shed, but there
 should be an aggregator somewhere which lists all of the available
 public Sheds and makes it easy to add them as new sources to your Galaxy
 install.  Like a slightly more organized Debian APT system.
 
 If there is an official meta tool shed aggregator, that would address
 my main concern about fragmenting things.

Not sure how feasible this is, but could you use hg subrepositories for this 
purpose?  For instance, have a 'blessed' set of galaxy tool sheds (as subrepos) 
listed in a main tool shed repository.  One of the nice advantages of this is 
it could allow one to use git or svn, though I think sticking with hg-only 
repos is the simplest option for now.

chris

PS - wonderful conference, sorry that Peter couldn't make it!




Re: [galaxy-dev] The new hg based Galaxy Tool Shed

2011-06-01 Thread Nate Coraor
Peter Cock wrote:
 
 If there is an official meta tool shed aggregator, that would address
 my main concern about fragmenting things.

If nothing else, there can be a wiki page, although something
programmatic would be ideal.
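
Purely as a sketch of what "programmatic" could mean here: a Python lookup
over a list of sheds and the repositories they advertise. The registry
contents, the shed.example.org URL and the function name are hypothetical; no
such aggregator exists yet.

```python
# Hypothetical registry: shed URL -> names of repositories it advertises.
SHEDS = {
    "http://testtoolshed.g2.bx.psu.edu/": ["venn_list", "tmhmm_and_signalp"],
    "http://shed.example.org/": ["venn_list"],  # made-up local shed
}


def sheds_offering(repo_name, sheds=SHEDS):
    """Return the URLs of all known sheds that advertise repo_name."""
    return sorted(url for url, repos in sheds.items() if repo_name in repos)


print(sheds_offering("venn_list"))
print(sheds_offering("tmhmm_and_signalp"))
```

A Galaxy admin would then query one place rather than hunting for individual
sheds.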

  ... but we want to move away from the situation where someone
  installs a tool but finds that it's unusable because the actual
  underlying dependency doesn't exist and is non-trivial to install.
 
 Improving the documentation shown on the tool shed could help here -
 make it easier for the tool wrapper to tell the Tool Shed user what will
 be required.
 
 Currently we get a short plain text box as part of the upload (no markup),
 and can include a (plain text) readme file which is easily viewable from
 the tool shed. I've just filed an enhancement request on a related idea:
 
 https://bitbucket.org/galaxy/galaxy-central/issue/565/
 Show mockup of tool GUI in Galaxy Tool Shed

Yeah, eventually we'll have to parse the tool configs in the repo, so
functionality like this should show up as the Shed matures.  Not sure
about the difficulty of doing the tool form mockup, but I like the idea.

  This larger aim of installing the underlying dependencies is impossible
  in general - but that seems to be what you want to aim for. Consider
  obvious use case of closed source (non-redistributable) 3rd party binaries.
  I can think of several examples from the current Tool Shed wrappers,
  including the Roche Newbler off instrument applications, TMHMM
  and SignalP.
 
  Agreed, thankfully, the current dependency system (tool_dependency_dir
  in the config file (not in the sample config, sorry, I'll remedy that
  shortly!)) only requires that you have an environment file that
  configures whatever is necessary (generally just $PATH) to find a
  dependency.  So the tools in the Tool Shed would provide the XML,
  wrapper script (if necessary), and then instructions or perhaps an
  interface to configure the env file.
 
 I'd hope the common case where all that is required is the tool binary
 to be on the path, would not require any extra configuration files. See
 also: https://bitbucket.org/galaxy/galaxy-central/issue/82

Well, use of the dependency system isn't required, so just setting
things up on the $PATH is always a possibility.  I was going to suggest
that your patch could be applied if it was conditional on the local
runner and checked after any <requirement type="package"> dependencies
were setup, but there's still the problem of people running jobs through
the local runner which are actually sent to the cluster without Galaxy's
knowledge.  Perhaps this is something we shouldn't worry too much about,
but I know there are people doing it.

--nate

 
  [cut]
 
  Are you biting off more than you can chew? I hope I am misinterpreting
  your plans.
 
  Hopefully not!  We're trying to think this through pretty thoroughly
  before we get started, thanks for joining in the discussion. =)
 
 I've been reassured :)
 
 Peter
 