Before I went on that tangent, I should have said I of course agree
with 100% of what James said in the original e-mail on this thread.
For what it is worth, I believe the higher-level constructs he
outlined are essential to the long-term adoption of the tool shed.

On Tue, Aug 27, 2013 at 1:59 PM, Nate Coraor <n...@bx.psu.edu> wrote:
> On Aug 26, 2013, at 11:59 AM, James Taylor wrote:
>
>> On Mon, Aug 26, 2013 at 11:48 AM, John Chilton <chil...@msi.umn.edu> wrote:
>>
>>> I think it is interesting that there was push back on providing
>>> infrastructure (tool actions) for obtaining CBL from github and
>>> performing installs based on it because it was not in the tool shed
>>> and therefore less reproducible, but the team believes infrastructure
>>> should be put in place to support pypi.
>>
>> Well, first, I'm not sure what "the team" believes, I'm stating what I
>> believe and engaging in a discussion with "the community". At some
>> point this should evolve into what we are actually going to do and be
>> codified in a spec as a Trello card, which is even then not set in
>> stone.
>>
>> Second, I'm not suggesting we depend on PyPI. The nice thing about the
>> second format I proposed on galaxy-dev is that we can easily parse out
>> the URL and archive that file. Then someday we could provide a
>> fallback repository where if the PyPI URL no longer works we still
>> have it stored.
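(To make the fallback-repository idea concrete, here is one rough sketch of
how an archive could key stored tarballs by their source URL so a dead PyPI
link can still be served locally. The function and naming scheme are
hypothetical, not existing Tool Shed code:)

```python
import hashlib
import os
import urllib.parse


def archive_key(url):
    """Derive a stable archive filename from a dependency URL so the
    original tarball can be served from a fallback repository if the
    upstream (e.g. PyPI) link ever goes dead."""
    parsed = urllib.parse.urlparse(url)
    basename = os.path.basename(parsed.path)
    # Prefix with a hash of the full URL so two different sources of
    # "foo-1.0.tar.gz" do not collide in the archive.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:12]
    return "%s-%s" % (digest, basename)
```

The same key computed at upload time and at resolve time always points at the
same archived file, so parsing the URL out of the definition is all the
metadata the fallback needs.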
>
> I concur here, the experience and lessons learned by long-established package 
> and dependency managers can provide some useful guidance for us going 
> forward.  APT has long relied on a model of archiving upstream source (as 
> well as distro-generated binary (dpkg) packages), cataloging changes as a set 
> of patches, and maintaining an understanding of installed files, even those 
> meant to be user-edited.  I think there is a strong advantage for us doing 
> this as well.
>
>>
>>> I think we all value reproducibility here, but we make different
>>> calculations on what is reproducible. I think in terms of implementing
>>> the ideas James has laid out or similar things I have proposed, it
>>> might be beneficial to have some final answers on what external
>>> resources are allowed - both for obtaining a Galaxy IUC gold star and
>>> for the tool shed providing infrastructure to support their usage.
>>
>> My focus is ensuring that we can archive things that pass through the
>> toolshed. Tarballs from *anywhere* are easy enough to deal with.
>> External version control repositories are a bit more challenging,
>> especially when you are pulling just a particular file out, so that's
>> where things got a little hinky for me.
>>
>> Since we don't have the archival mechanism in place yet anyway, this
>> is more a philosophical discussion and setting the right precedent.
>>
>> And yes, keeping an archive of all the software in the world is a
>> scary prospect, though compared to the amount of data we currently
>> keep for people it is a blip. And I'm not sure how else we can really
>> achieve the level of reproducibility we desire.
>
> One additional step that will assist with long-term archival is generating 
> static metadata and allowing the packaging and dependency systems to work 
> outside of the Galaxy and Tool Shed applications.  A package metadata catalog 
> and package format that provide descriptions of packages on a generic 
> webserver and are installable without a running Galaxy instance are components 
> that I believe are fairly important.
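(As an illustration only, a static catalog along these lines could be as
simple as a JSON file that any generic webserver can host and any client can
consume without a running Galaxy instance. The field names below are
hypothetical, not an existing Tool Shed format:)

```python
import json


def catalog_entry(name, version, url, sha256):
    """One record in a hypothetical static package catalog."""
    return {"name": name, "version": version, "url": url, "sha256": sha256}


def render_catalog(entries):
    """Serialize the catalog to plain JSON, sorted for stable diffs, so it
    can be dropped onto any static webserver and fetched by a standalone
    installer -- no Galaxy or Tool Shed application required."""
    return json.dumps(
        sorted(entries, key=lambda e: (e["name"], e["version"])), indent=2)
```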
>
> As for user-edited files: the env.sh files, which are generated at 
> install time and then essentially untracked afterward, scare me a bit.  I 
> think it'd be useful for the packaging system to have a tighter concept of 
> environment management.
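(A dpkg-style conffile check is one way to get that tighter tracking: record a
checksum of env.sh at install time, then refuse to silently clobber it on
upgrade if it has drifted. A toy sketch, not actual Galaxy code:)

```python
import hashlib


def fingerprint(text):
    """Checksum of an env.sh as generated at install time."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def is_user_edited(current_text, recorded_fingerprint):
    """dpkg-style conffile check: if the file on disk no longer matches the
    checksum recorded at install time, a user (or something else) has edited
    it, and the packaging system should prompt rather than overwrite."""
    return fingerprint(current_text) != recorded_fingerprint
```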
>
> These are just my opinions, of course, and are going to be very 
> APT/dpkg-biased simply due to my experience with and favor for Debian-based 
> distros and dependency/package management, but I think there are useful 
> concepts in this (and other systems) that we can draw from.
>
> Along those lines, one more idea I had thrown out a while ago was coming up 
> with a way to incorporate (or at least automatically process so that we can 
> convert to our format) the build definitions for other systems like MacPorts, 
> BSD ports/pkgsrc, dpkg, rpm, etc. so that we can leverage the existing rules 
> for building across our target platforms that have already been worked out by 
> other package maintainers with more time.  I think this aligns pretty well 
> with Brad's thinking with CloudBioLinux, the difference in implementation 
> being that we require multiple installable versions and platform independence.

The CloudBioLinux Galaxy tool functionality used by Galaxy-P and CloudMan,
and integrated into tool shed installs with pull request 207, is platform
independent (or as platform independent as the tool shed) and allows
multiple installable versions.

>
> I am a bit worried that as we go down the "repackage (almost) all 
> dependencies" path (which I do think is the right path), we also run the risk 
> of most of our packages being out of date.  That's almost a guaranteed 
> outcome when even the huge packaging projects (Debian, Ubuntu, etc.) are rife 
> with out-of-date packages.  So being able to incorporate upstream build 
> definitions may help us package dependencies quickly.
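(For flagging out-of-date packages against upstream build definitions, even a
naive version comparison goes a long way. A deliberately simplistic sketch --
real systems like dpkg and rpm have far richer comparison rules, e.g. for
epochs and letter suffixes:)

```python
import re


def version_tuple(v):
    """Split a dotted version string into comparable numeric parts,
    ignoring non-numeric suffixes like the "a" in "0.7.5a"."""
    return tuple(int(p) for p in re.findall(r"\d+", v))


def outdated(ours, upstream):
    """True if the version we repackaged lags the upstream definition."""
    return version_tuple(ours) < version_tuple(upstream)
```

Run periodically against imported upstream definitions, a check like this
could surface exactly the stale-package problem described above.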
>
> --nate
>
>> ___________________________________________________________
>> Please keep all replies on the list by using "reply all"
>> in your mail client.  To manage your subscriptions to this
>> and other Galaxy lists, please use the interface at:
>>  http://lists.bx.psu.edu/
>>
>> To search Galaxy mailing lists use the unified search at:
>>  http://galaxyproject.org/search/mailinglists/
>
