Hey All,

There was a long conversation about this topic in IRC yesterday (among
people who don't actually use the tool shed all that frequently). I
have posted it to the new unofficial Galaxy Google+ group if anyone
would like to read and chime in.

https://plus.google.com/111860405027053012444/posts/TkCFwA2jkDN

-John


On Tue, May 14, 2013 at 3:59 PM, Nate Coraor <n...@bx.psu.edu> wrote:
> Greg created the following card, and I'm working on a few changes to your 
> commit:
>
> https://trello.com/card/toolshed-consider-enhancing-tool-dependency-definition-framework-per-john-chilton-s-pull-request/506338ce32ae458f6d15e4b3/848
>
> Thanks,
> --nate
>
> On May 14, 2013, at 1:45 PM, Nate Coraor wrote:
>
>> On May 14, 2013, at 10:58 AM, John Chilton wrote:
>>
>>> Hey Nate,
>>>
>>> On Tue, May 14, 2013 at 8:40 AM, Nate Coraor <n...@bx.psu.edu> wrote:
>>>> Hi John,
>>>>
>>>> A few of us in the lab here at Penn State actually discussed automatic 
>>>> creation of virtualenvs for dependency installations a couple weeks ago.  
>>>> This was in the context of Bjoern's request for supporting compile-time 
>>>> dependencies.  I think it's a great idea, but there's a limitation that 
>>>> we'd need to account for.
>>>>
>>>> If you're going to have frequently used and expensive to build libraries 
>>>> (e.g. numpy, R + rpy) in dependency-only repositories and then have your 
>>>> tool(s) depend on those repositories, the activate method won't work.  
>>>> virtualenvs cannot depend on other virtualenvs or be active at the same 
>>>> time as other virtualenvs.  We could work around it by setting PYTHONPATH 
>>>> in the dependencies' env.sh like we do now.  But then, other than making 
>>>> installation a bit easier (e.g. by allowing the use of pip), we have not 
>>>> gained much.
>>>
>>> I don't know what to make of your response. It seems like a no, but
>>> the word no doesn't appear anywhere.
>>
>> Sorry about being wishy-washy.  Unless anyone has any objections or can 
>> foresee other problems, I would say yes to this.  But I believe it should 
>> not break the concept of common-dependency-only repositories.
>>
>> I'm pretty sure that as long as the process of creating a venv also adds the 
>> venv's site-packages to PYTHONPATH in that dependency's env.sh, the problem 
>> should be automatically dealt with.
>>
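
A minimal sketch of the env.sh idea described above, not the actual Galaxy
implementation. It assumes the POSIX virtualenv layout
(<venv>/lib/pythonX.Y/site-packages); the function names and the exact
form of the env.sh lines are hypothetical. It only builds the lines that
would be written, so a tool depending on several dependency-only
repositories could source each env.sh in turn without activating any venv.

```python
import posixpath
import sys


def site_packages_dir(venv_dir, version=None):
    """Return <venv>/lib/pythonX.Y/site-packages (POSIX layout assumed)."""
    if version is None:
        version = "%d.%d" % sys.version_info[:2]
    return posixpath.join(venv_dir, "lib", "python" + version, "site-packages")


def env_sh_lines(venv_dir, version=None):
    """Build shell lines that expose a venv without running 'activate'.

    PYTHONPATH is prepended to, so several dependencies can be stacked.
    """
    site_packages = site_packages_dir(venv_dir, version)
    bin_dir = posixpath.join(venv_dir, "bin")
    return [
        # ${VAR:+:$VAR} adds the separator only when VAR is already set.
        'PYTHONPATH="%s${PYTHONPATH:+:$PYTHONPATH}"; export PYTHONPATH' % site_packages,
        'PATH="%s:$PATH"; export PATH' % bin_dir,
    ]


if __name__ == "__main__":
    # Hypothetical install location for a numpy dependency repository.
    for line in env_sh_lines("/deps/numpy/1.7/venv", "2.7"):
        print(line)
```

Because each env.sh only prepends to PYTHONPATH, the "one active virtualenv
at a time" limitation does not apply; the trade-off is that the venv's own
isolation from the system site-packages is not enforced this way.
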
>>> I don't know the particulars of rpy, but numpy installs fine via this
>>> method and I see no problem with each application having its own copy
>>> of numpy. I think relying on OS-managed Python packages, for instance,
>>> is something of a bad practice; when developing and distributing
>>> software I use virtualenvs for everything. I think that stand-alone
>>> Python packages defined in the tool shed are directly analogous to
>>> OS-managed packages.
>>
>> Completely agree that we want to avoid OS-managed python packages.  I had, 
>> in the past, considered that for something like numpy, we ought to make it 
>> easy for an administrator to allow their own version of numpy to be used, 
>> since numpy can be linked against a number of optimized libraries for 
>> significant performance gains, and this generally won't happen for versions 
>> installed from the toolshed unless the system already has stuff like 
>> atlas-dev installed.  But I think we still allow admins that possibility 
>> with reasonable ease since dependency management in Galaxy is not a 
>> requirement.
>>
>> What we do want to avoid is the situation where someone clones a new copy of 
>> Galaxy, wants to install 10 different tools that all depend on numpy, and 
>> has to wait an hour while 10 versions of numpy compile.  Add that in with 
>> other tools that will have a similar process (installing R + packages + rpy) 
>> plus the hope that down the line you'll be able to automatically maintain 
>> separate builds for remote resources that are not the same (i.e. multiple 
>> clusters with differing operating systems) and this hopefully highlights why 
>> I think reducing duplication where possible will be important.
>>
>>> I also disagree we have not gained much. Setting up these repositories
>>> is an onerous, brittle process. This patch provides some high-level
>>> functionality for creating virtualenvs, which negates the need for
>>> creating separate repositories per package.
>>
>> This is a good point.  I probably also sold short the benefit of being able 
>> to install with pip, since this does indeed remove a similarly brittle and 
>> tedious step of downloading and installing modules.
>>
>> --nate
>>
>>>
>>> -John
>>>
>>>>
>>>> --nate
>>>>
>>>> On May 13, 2013, at 6:49 PM, John Chilton wrote:
>>>>
>>>>> The proliferation of individual python package install definitions has
>>>>> continued and it has spread to some MSI managed tools. I worry about
>>>>> the tedium I will have to endure in the future if that becomes an
>>>>> established best practice :) so I have implemented the python version
>>>>> of what I had described in this thread:
>>>>>
>>>>> As patch:
>>>>> https://github.com/jmchilton/galaxy-central/commit/161d3b288016077a99fb7196b6e08fe7d690f34b.patch
>>>>> Pretty version:
>>>>> https://github.com/jmchilton/galaxy-central/commit/161d3b288016077a99fb7196b6e08fe7d690f34b
>>>>>
>>>>> I understand that there are going to be differing opinions as to
>>>>> whether this is the best way forward but I thought I would give my
>>>>> position a better chance of succeeding by providing an implementation.
>>>>>
>>>>> Thanks for your consideration,
>>>>> -John
>>>>>
>>>>>
>>>>> On Wed, Apr 17, 2013 at 3:56 PM, Peter Cock <p.j.a.c...@googlemail.com> 
>>>>> wrote:
>>>>>> On Tue, Apr 16, 2013 at 2:46 PM, John Chilton <chil...@msi.umn.edu> 
>>>>>> wrote:
>>>>>>> Stepping back a little, is this the right way to address Python
>>>>>>> dependencies?
>>>>>>
>>>>>> Looks like I missed this thread, hence:
>>>>>> http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-April/014169.html
>>>>>>
>>>>>>> I was a big advocate for inter-repository dependencies,
>>>>>>> but I think taking it to the level of individual python packages might
>>>>>>> be going too far - my thought was they were needed for big 100 MB
>>>>>>> programs and stuff like that.
>>>>>>
>>>>>> It should work but it is a lot of boilerplate for something which
>>>>>> should be more automated.
>>>>>>
>>>>>>> At the Java jar/Python library/Ruby gem
>>>>>>> level I think using some of the platform-specific packaging stuff to
>>>>>>> create isolated environments for each program might be a better way
>>>>>>> to go.
>>>>>>
>>>>>> I agree, the best way forward isn't obvious here, and it may make
>>>>>> sense to have tailored solutions for Python, Perl, Java, R, Ruby,
>>>>>> etc packages rather than the current Tool Shed package solution.
>>>>>>
>>>>>> I'd like to be able to just continue to write this kind of thing in my
>>>>>> tool XML files and have it actually taken care of (rather than ignored):
>>>>>>
>>>>>> <requirements>
>>>>>>   <requirement type="python-module">numpy</requirement>
>>>>>>   <requirement type="python-module">Bio</requirement>
>>>>>> </requirements>
>>>>>>
>>>>>> Adding a version key would be sensible, handling min/max etc
>>>>>> as per Python packaging norms.
>>>>>>
>>>>>> Peter
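
A rough sketch of how the version key suggested above might be turned into
pip-style requirement strings. The min_version and max_version attribute
names are hypothetical (nothing in the thread fixes them), and treating the
maximum as an exclusive bound is just one possible convention.

```python
import xml.etree.ElementTree as ET

# Example tool XML; the min_version/max_version attributes are hypothetical.
TOOL_XML = """
<requirements>
  <requirement type="python-module" min_version="1.6" max_version="1.8">numpy</requirement>
  <requirement type="python-module">Bio</requirement>
</requirements>
"""


def pip_specifiers(xml_text):
    """Translate python-module requirements into pip requirement strings."""
    specs = []
    for req in ET.fromstring(xml_text).findall("requirement"):
        if req.get("type") != "python-module":
            continue  # leave other requirement types to other handlers
        bounds = []
        if req.get("min_version"):
            bounds.append(">=" + req.get("min_version"))
        if req.get("max_version"):
            bounds.append("<" + req.get("max_version"))
        specs.append(req.text.strip() + ",".join(bounds))
    return specs


if __name__ == "__main__":
    print(pip_specifiers(TOOL_XML))
```

One gap this leaves open: the requirement text is a module name, which is
not always the name pip installs under (the Bio module is distributed as
biopython), so a real implementation would need some mapping between the two.
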
>>>>> ___________________________________________________________
>>>>> Please keep all replies on the list by using "reply all"
>>>>> in your mail client.  To manage your subscriptions to this
>>>>> and other Galaxy lists, please use the interface at:
>>>>> http://lists.bx.psu.edu/
>>>>>
>>>>> To search Galaxy mailing lists use the unified search at:
>>>>> http://galaxyproject.org/search/mailinglists/
>>>>
>>>
>>
>>
>
