Re: [galaxy-dev] Tool Shed Workflow

2012-06-09 Thread Ross
John,

Why not separate tool shed updates from dist updates? Tool XML and
other tool code should be robust with respect to the dist version.
One thing at a time: tools get updated less often than the dist, I'd
wager, and you can subscribe to repository update emails.
After a dist update you want all the tool functional tests green as
evidence that at least the test cases are running!
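For example, a minimal sketch of that check (run_functional_tests.sh
ships with galaxy-dist; the exact flags can differ by revision, and the
tool id below is just a placeholder):

    # run the whole tool functional test suite
    sh run_functional_tests.sh

    # or exercise a single tool by its id
    sh run_functional_tests.sh -id my_tool_id
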
As always, YMMV



On Sat, Jun 9, 2012 at 2:38 PM, John Chilton chil...@msi.umn.edu wrote:
 On Fri, Jun 8, 2012 at 3:27 PM, Greg Von Kuster g...@bx.psu.edu wrote:
 Hi John,

 On Jun 8, 2012, at 1:22 PM, John Chilton wrote:

 Hello Greg,

 Thanks for the prompt and detailed response (though it did make me
 sad). I think deploying tested, static components and configurations
 to production environments, and having production environments not
 depend on outside services (like the tool shed), should be considered
 best practices.

 I'm not sure I understand this issue. What processes are you using to
 upgrade your test and production servers with new Galaxy distributions?
 If you are pulling new Galaxy distributions from our Galaxy dist
 repository in bitbucket, then pulling tools from the Galaxy tool shed
 is not much different; both are outside services. Updating your test
 environment, determining that it is functionally correct, and then
 updating your production environment the same way would generally
 follow best practice. This is what we currently do for our public test
 and main Galaxy instances at Penn State.

 We don't pull down from bitbucket directly to our production
 environment; we pull galaxy-dist changes into our testing repository,
 merge (which can be quite complicated, sometimes a multi-hour
 process), auto-deploy to a testing server, and then finally push the
 tested changes into a bare production repo. Our sys admins then pull
 changes from that bare production repo into our production
 environment. We also prebuild eggs in our testing environment, not
 live on our production system. Given the complicated merges we need
 to do and the configuration files that need to be updated with each
 dist update, it would seem that making those changes on a live
 production system would be problematic.
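
 Roughly, the mercurial side of that workflow looks like the sketch
 below (the repository paths are made up for illustration):

     # in the testing clone: pull and merge the new galaxy-dist release
     hg pull https://bitbucket.org/galaxy/galaxy-dist
     hg merge              # often a long, manual conflict-resolution step
     hg commit -m "merge new galaxy-dist release"

     # once the testing server checks out, push to the bare production repo
     hg push /path/to/bare-production-repo       # placeholder path

     # on the production host the sys admins then update from that repo
     hg pull /path/to/bare-production-repo       # placeholder path
     hg update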

 Even if one were pulling changes directly from bitbucket into a
 production codebase, I think the dependency on bitbucket would be
 very different from a dependency on N tool sheds. If our sys admin is
 going to update Galaxy and bitbucket is down, that is no problem: he
 or she can just bring Galaxy back up and update later. Now let's
 imagine they shut down our Galaxy instance, updated the code base,
 ran a database migration, and then a tool shed migration failed. In
 that case, instead of just bringing Galaxy back up, they would need
 to restore the database from backup and back out the mercurial
 changes.
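
 (For concreteness, that rollback would look something like the sketch
 below; it assumes a PostgreSQL database, a plain SQL dump taken before
 the upgrade, and a noted pre-upgrade revision, all with illustrative
 names.)

     # restore the Galaxy database from the pre-upgrade dump
     psql galaxy_prod < galaxy_prod_pre_upgrade.sql

     # back the code base out to the revision that was running before
     hg update -C -r rev_before_upgrade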

 Anyway, all of that is a digression, right? I understand that we will
 need to have deploy-time dependencies on tool sheds and make these
 tool migration script calls part of our workflow. My lingering hope
 is for a way of programmatically importing and updating new tools
 that were never part of Galaxy (Qiime, upload_local_file, etc...)
 using tool sheds. My previous e-mail was proposing a mechanism for
 doing that, but I think you read it as though I was trying to
 describe a way to script the migrations of the existing official
 Galaxy tools (I definitely get that you have done that).

 Thanks again for your time and detailed responses,
 -John




-- 
Ross Lazarus MBBS MPH;
Associate Professor, Harvard Medical School;
Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;



Re: [galaxy-dev] Tool Shed Workflow

2012-06-09 Thread Greg Von Kuster
Hi John,

I feel this is an important topic and that others in the community are 
undoubtedly benefitting from it, so I'm glad you started this discussion.

On Jun 9, 2012, at 12:36 AM, John Chilton wrote:

 We don't pull down from bitbucket directly to our production
 environment; we pull galaxy-dist changes into our testing repository,
 merge (which can be quite complicated, sometimes a multi-hour
 process), auto-deploy to a testing server, and then finally push the
 tested changes into a bare production repo. Our sys admins then pull
 changes from that bare production repo into our production
 environment. We also prebuild eggs in our testing environment, not
 live on our production system. Given the complicated merges we need
 to do and the configuration files that need to be updated with each
 dist update, it would seem that making those changes on a live
 production system would be problematic.
 
 Even if one were pulling changes directly from bitbucket into a
 production codebase, I think the dependency on bitbucket would be
 very different from a dependency on N tool sheds.

These tool migrations will only interact with a single tool shed, the main 
Galaxy tool shed.


 If our sys admin is going to update Galaxy and bitbucket is down,
 that is no problem: he or she can just bring Galaxy back up and
 update later. Now let's imagine they shut down our Galaxy instance,
 updated the code base, ran a database migration, and then a tool shed
 migration failed. In that case, instead of just bringing Galaxy back
 up, they would need to restore the database from backup and back out
 the mercurial changes.

In your scenario, if everything went well except the tool shed migration, a less 
intrusive option than reverting to the previous Galaxy release would be to bring 
your server up temporarily without the migrated tools. When the cause of the 
failure is corrected (generally, the only reason the migration would break is 
that the tool shed was down), you could run the migration at that point. So the 
worst-case scenario is that the specific migrated tools are temporarily 
unavailable from your production Galaxy instance.

A nice feature of these tool migration scripts is that they can be run at any 
time, and they do not have to be run in any specific order. So, for example, you 
could run tool migration script 0002 six months after you've run migration 
scripts 0003, 0004, etc.

These scripts do affect the Galaxy database by adding new records to certain 
tables, but if the script fails, no database corrections are necessary in order 
to prepare for running the script again.  You can just run the same script 
later, and the script will handle whatever database state exists at that time.
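
In practice that just means invoking (or later re-invoking) the script whenever 
it suits you. As a sketch, assuming the scripts/migrate_tools/ layout of a 
recent galaxy-dist checkout:

    # from the Galaxy root directory, run (or safely re-run) a migration;
    # it reconciles against whatever database state exists at the time
    sh ./scripts/migrate_tools/0002_tools.sh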


 
 Anyway, all of that is a digression, right? I understand that we will
 need to have deploy-time dependencies on tool sheds and make these
 tool migration script calls part of our workflow. My lingering hope
 is for a way of programmatically importing and updating new tools
 that were never part of Galaxy (Qiime, upload_local_file, etc...)
 using tool sheds.

This is currently possible (and very simple to do) using the Galaxy Admin UI.  
See the following sections of the tool shed wiki for details.

http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_installation_of_Galaxy_tool_shed_repository_tools_into_a_local_Galaxy_instance
http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_installation_of_Galaxy_tool_shed_repository_data_types_into_a_local_Galaxy_instance
http://wiki.g2.bx.psu.edu/Tool%20Shed#Getting_updates_for_tool_shed_repositories_installed_in_a_local_Galaxy_instance
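
For any of the automatic installation options above, the local Galaxy instance 
reads the tool sheds it may install from out of its tool sheds configuration 
file. A minimal sketch, assuming the tool_sheds_conf.xml.sample file shipped 
with galaxy-dist (check your own sample for the exact entries):

    # copy the sample and keep only the sheds you trust; the main tool shed
    # entry looks roughly like:
    #   <tool_shed name="Galaxy main tool shed" url="http://toolshed.g2.bx.psu.edu/"/>
    cp tool_sheds_conf.xml.sample tool_sheds_conf.xml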

I'm currently writing the following new section - it should be available within 
the next week or so.

http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_3rd_party_tool_dependency_installation_and_compilation_with_installed_repositories



 My previous e-mail was proposing a mechanism for doing that, but I
 think you read it as though I was trying to describe a way to script
 the migrations of the existing official Galaxy tools (I definitely
 get that you have done that).
 
 Thanks again for your time and detailed responses,
 -John
 




[galaxy-dev] RNA-seq wig file from UCSC conversion to gene expression values, using Galaxy

2012-06-09 Thread Karmella Haynes
Hello Galaxy list,

I'm helping a student with a human bioinformatics project. We would like to 
generate a single relative gene expression value for each gene in the human 
genome (for different cell lines of interest) using the RNA-seq data from the 
UCSC archive. We can see raw signal values by displaying the following file as 
a track in the Genome Browser:

Track name: BJ cell pA+ + 1
Table name: wgEncodeCshlLongRnaSeqBjCellPapMinusRawSigRep1
File name: /gbdb/hg19/bbi/wgEncodeCshlLongRnaSeqBjCellPapPlusRawSigRep1.bigWig

But where we get stuck is importing the wig file into Galaxy and generating 
expression values for each gene. Cufflinks seems like the right tool for the 
output we want, but we're not sure what its inputs are supposed to be. We'd 
appreciate any help you can provide.

Best regards,
---Karmella
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/