Before I went on that tangent, I should have said I of course agree
with 100% of what James said in the original e-mail on this thread.
For what it is worth, I believe the higher-level constructs he
outlined are essential to the long-term adoption of the tool shed.

On Tue, Aug 27, 2013 at 1:59 PM, Nate Coraor <> wrote:
> On Aug 26, 2013, at 11:59 AM, James Taylor wrote:
>> On Mon, Aug 26, 2013 at 11:48 AM, John Chilton <> wrote:
>>> I think it is interesting that there was pushback on providing
>>> infrastructure (tool actions) for obtaining CBL from GitHub and
>>> performing installs based on it because it was not in the tool shed
>>> and therefore less reproducible, but the team believes infrastructure
>>> should be put in place to support PyPI.
>> Well, first, I'm not sure what "the team" believes, I'm stating what I
>> believe and engaging in a discussion with "the community". At some
>> point this should evolve into what we are actually going to do and be
>> codified in a spec as a Trello card, which is even then not set in
>> stone.
>> Second, I'm not suggesting we depend on PyPI. The nice thing about the
>> second format I proposed on galaxy-dev is that we can easily parse out
>> the URL and archive that file. Then someday we could provide a
>> fallback repository where if the PyPI URL no longer works we still
>> have it stored.
> I concur here; the experience and lessons learned by long-established package 
> and dependency managers can provide some useful guidance for us going 
> forward.  APT has long relied on a model of archiving upstream source (as 
> well as distro-generated binary (dpkg) packages), cataloging changes as a set 
> of patches, and maintaining an understanding of installed files, even those 
> meant to be user-edited.  I think there is a strong advantage for us doing 
> this as well.
>>> I think we all value reproducibility here, but we make different
>>> calculations on what is reproducible. I think in terms of implementing
>>> the ideas James has laid out or similar things I have proposed, it
>>> might be beneficial to have some final answers on what external
>>> resources are allowed - both for obtaining a Galaxy IUC gold star and
>>> for the tool shed providing infrastructure to support their usage.
>> My focus is ensuring that we can archive things that pass through the
>> toolshed. Tarballs from *anywhere* are easy enough to deal with.
>> External version control repositories are a bit more challenging,
>> especially when you are pulling just a particular file out, so that's
>> where things got a little hinky for me.
>> Since we don't have the archival mechanism in place yet anyway, this
>> is more a philosophical discussion and setting the right precedent.
>> And yes, keeping an archive of all the software in the world is a
>> scary prospect, though compared to the amount of data we currently
>> keep for people it is a blip. And I'm not sure how else we can really
>> achieve the level of reproducibility we desire.
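
Agreeing with the archival point - just to make it concrete, the mechanism
I picture is pretty small: stash a copy of every tarball that passes
through the tool shed, keyed by checksum, and fall back to that copy when
the upstream URL dies. A rough sketch (the paths and names are made up,
and it is written Python 2 style to match what Galaxy currently targets):

    import hashlib
    import os
    import urllib2  # urllib.request on Python 3

    ARCHIVE_DIR = "/srv/toolshed/archive"  # hypothetical archive location

    def archive(url):
        """Fetch a dependency tarball and keep a copy keyed by its sha256."""
        data = urllib2.urlopen(url).read()
        digest = hashlib.sha256(data).hexdigest()
        path = os.path.join(ARCHIVE_DIR, digest)
        if not os.path.exists(path):
            with open(path, "wb") as out:
                out.write(data)
        return digest

    def fetch(url, digest):
        """Prefer the upstream URL; fall back to the archived copy."""
        try:
            data = urllib2.urlopen(url).read()
            if hashlib.sha256(data).hexdigest() == digest:
                return data
        except Exception:
            pass
        with open(os.path.join(ARCHIVE_DIR, digest), "rb") as archived:
            return archived.read()

That also covers the "parse out the URL" case James mentioned - whatever
format the dependency definition uses, all the archiver needs from it is
the URL.
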
> One additional step that will assist with long-term archival is generating 
> static metadata and allowing the packaging and dependency systems to work 
> outside of the Galaxy and Tool Shed applications.  A package metadata catalog 
> and package format that describe packages on a generic webserver and can be 
> installed without a running Galaxy instance are components that I believe are 
> fairly important.
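
I like the static catalog idea. For what it's worth, the client side of
that could stay tiny: a JSON catalog sitting on any plain webserver plus a
script that never imports Galaxy. Everything below (the catalog URL, the
field names, the layout) is invented just to illustrate the shape of it:

    import json
    import os
    import subprocess
    import tarfile
    import urllib2  # urllib.request on Python 3

    CATALOG_URL = "http://depot.example.org/catalog.json"  # hypothetical

    def install(name, version, prefix):
        """Install a package described by a static catalog; no Galaxy needed."""
        catalog = json.loads(urllib2.urlopen(CATALOG_URL).read())
        entry = catalog[name][version]  # e.g. {"url": ..., "sha256": ..., "build": [...]}
        if not os.path.isdir(prefix):
            os.makedirs(prefix)
        tarball = os.path.join(prefix, os.path.basename(entry["url"]))
        with open(tarball, "wb") as out:
            out.write(urllib2.urlopen(entry["url"]).read())
        tarfile.open(tarball).extractall(prefix)
        for command in entry.get("build", []):  # declarative build steps
            subprocess.check_call(command, shell=True, cwd=prefix)

The point being that the catalog carries all the knowledge and the
installer stays dumb, so it keeps working regardless of what any
particular Galaxy release is doing.
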
> As for user-edited files, files that are generated at install time and 
> then essentially left untracked afterward scare me a bit.  I think it'd 
> be useful for the packaging system to have a tighter concept of 
> environment management.
> These are just my opinions, of course, and are going to be very 
> APT/dpkg-biased simply due to my experience with and favor for Debian-based 
> distros and dependency/package management, but I think there are useful 
> concepts in this (and other systems) that we can draw from.
> Along those lines, one more idea I had thrown out a while ago was coming up 
> with a way to incorporate (or at least automatically process so that we can 
> convert to our format) the build definitions for other systems like MacPorts, 
> BSD ports/pkgsrc, dpkg, rpm, etc. so that we can leverage the existing rules 
> for building across our target platforms that have already been worked out by 
> other package maintainers with more time.  I think this aligns pretty well 
> with Brad's thinking with CloudBioLinux, the difference in implementation 
> being that we require multiple installable versions and platform independence.

The CloudBioLinux Galaxy tool stuff used by Galaxy-P and CloudMan, and
integrated into tool shed installs with pull request 207, is platform
independent (or as platform independent as the tool shed) and allows
multiple installable versions.

> I am a bit worried that as we go down the "repackage (almost) all 
> dependencies" path (which I do think is the right path), we also run the risk 
> of most of our packages being out of date.  That's almost a guaranteed 
> outcome when even the huge packaging projects (Debian, Ubuntu, etc.) are rife 
> with out-of-date packages.  So being able to incorporate upstream build 
> definitions may help us package dependencies quickly.
> --nate
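
On the last point about leveraging upstream build definitions: a lot of
that processing really is mechanical. As one example of what a converter
could start from, pulling the build dependencies out of a debian/control
file is just a bit of field parsing (the field name comes from dpkg; the
helper and everything around it is hypothetical):

    import re

    def build_depends(control_text):
        """Pull package names out of a debian/control Build-Depends field."""
        match = re.search(r"^Build-Depends:\s*(.+(?:\n\s.+)*)",
                          control_text, re.MULTILINE)
        if not match:
            return []
        deps = []
        for item in match.group(1).replace("\n", " ").split(","):
            item = item.strip()
            if not item:
                continue
            # Drop version constraints like (>= 1.2) and "|" alternatives.
            deps.append(item.split()[0].split("|")[0])
        return deps

A seed list like that would still need a human to map names onto our
package recipes, but it would at least tell us quickly which dependencies
we have not packaged yet.
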