Re: [galaxy-dev] Tool Shed Workflow
John,

Why not separate tool shed updates from dist updates? Tool XML and other code should be robust with respect to the dist version. One thing at a time - tools get updated less often than the dist, I'd wager, and you can subscribe to repository update emails. After a dist update you want all the tool functional tests green as evidence that at least the test cases are running!

As always, YMMV.

On Sat, Jun 9, 2012 at 2:38 PM, John Chilton chil...@msi.umn.edu wrote:

> On Fri, Jun 8, 2012 at 3:27 PM, Greg Von Kuster g...@bx.psu.edu wrote:
>
>> Hi John,
>>
>> On Jun 8, 2012, at 1:22 PM, John Chilton wrote:
>>
>>> Hello Greg,
>>>
>>> Thanks for the prompt and detailed response (though it did make me sad). I think deploying tested, static components and configurations to production environments, and having production environments not depend on outside services (like the tool shed), should be considered best practices.
>>
>> I'm not sure I understand this issue. What processes are you using to upgrade your test and production servers with new Galaxy distributions? If you are pulling new Galaxy distributions from our Galaxy dist repository in bitbucket, then pulling tools from the Galaxy tool shed is not much different - both are outside services. Updating your test environment, determining it is functionally correct, and then updating your production environment using the same approach would generally follow best practice. This is the approach we are currently using for our public test and main Galaxy instances at Penn State.
>
> We don't pull down from bitbucket directly to our production environment. We pull galaxy-dist changes into our testing repository, merge (which can be quite complicated - sometimes a multi-hour process), auto-deploy to a testing server, and finally push the tested changes into a bare production repo. Our sys admins then pull changes from that bare production repo into our production environment. We also prebuild eggs in our testing environment, not live on our production system. Given the complicated merges we need to do and the configuration files that need to be updated with each dist update, it would seem that making those changes on a live production system would be problematic.
>
> Even if one were pulling changes directly from bitbucket into a production codebase, I think the dependency on bitbucket would be very different from a dependency on N tool sheds. If our sys admin is going to update Galaxy and bitbucket is down, that is no problem - he or she can just bring Galaxy back up and update later. Now let's imagine they shut down our Galaxy instance, updated the code base, did a database migration, and went to do a tool shed migration, and that failed. In that case, instead of just bringing Galaxy back up, they would need to restore the database from backup and back out the mercurial changes.
>
> Anyway, all of that is a digression, right - I understand that we will need to have the deploy-time dependencies on tool sheds and make these tool migration script calls part of our workflow. My lingering hope is for a way of programmatically importing and updating new tools that were never part of Galaxy (Qiime, upload_local_file, etc.) using tool sheds. My previous e-mail was proposing a mechanism for doing that, but I think you read it as if I were trying to describe a way to script the migrations of the existing official Galaxy tools (I definitely understand that you have done that).
>
> Thanks again for your time and detailed responses,
> -John
--
Ross Lazarus MBBS MPH
Associate Professor, Harvard Medical School
Head, Medical Bioinformatics, BakerIDI
Tel: +61 385321444
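A minimal sketch of the staged promotion workflow John describes above, driven from Python via subprocess. The repository paths below are hypothetical stand-ins, not MSI's actual layout; only the bitbucket galaxy-dist URL comes from the thread.

#!/usr/bin/env python
# Sketch of the staged deployment John describes: pull galaxy-dist into a
# testing repo, merge, test, then push to a bare production repo that the
# production host pulls from. Local paths below are hypothetical.
import subprocess

GALAXY_DIST = "https://bitbucket.org/galaxy/galaxy-dist"  # upstream dist repo
TESTING_REPO = "/repos/galaxy-testing"                    # testing clone (assumed path)
PRODUCTION_BARE = "/repos/galaxy-production"              # bare production repo (assumed path)

def hg(*args):
    """Run an hg command in the testing repo; raise if it exits non-zero."""
    subprocess.check_call(["hg"] + list(args), cwd=TESTING_REPO)

hg("pull", GALAXY_DIST)      # 1. pull the new dist changesets
hg("merge")                  # 2. merge with local changes (the multi-hour step)
hg("commit", "-m", "Merge galaxy-dist update")
# ... auto-deploy to the testing server and run functional tests here ...
hg("push", PRODUCTION_BARE)  # 3. promote; sys admins pull from this repo

The point of the bare intermediate repo is that production never talks to bitbucket directly: it only pulls changesets that have already survived the merge and the testing server.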
Re: [galaxy-dev] Tool Shed Workflow
Hi John,

I feel this is an important topic and others in the community are undoubtedly benefiting from it, so I'm glad you started this discussion.

On Jun 9, 2012, at 12:36 AM, John Chilton wrote:

> We don't pull down from bitbucket directly to our production environment. We pull galaxy-dist changes into our testing repository, merge (which can be quite complicated - sometimes a multi-hour process), auto-deploy to a testing server, and finally push the tested changes into a bare production repo. Our sys admins then pull changes from that bare production repo into our production environment. We also prebuild eggs in our testing environment, not live on our production system. Given the complicated merges we need to do and the configuration files that need to be updated with each dist update, it would seem that making those changes on a live production system would be problematic.
>
> Even if one were pulling changes directly from bitbucket into a production codebase, I think the dependency on bitbucket would be very different from a dependency on N tool sheds.

These tool migrations will only interact with a single tool shed, the main Galaxy tool shed.

> If our sys admin is going to update Galaxy and bitbucket is down, that is no problem - he or she can just bring Galaxy back up and update later. Now let's imagine they shut down our Galaxy instance, updated the code base, did a database migration, and went to do a tool shed migration, and that failed. In that case, instead of just bringing Galaxy back up, they would need to restore the database from backup and back out the mercurial changes.

In your scenario, if everything went well except the tool shed migration, an option less intrusive than reverting to the previous Galaxy release would be to bring up your server without the migrated tools for a time. When the tool shed migration process is corrected (generally, the only reason it would break is that the tool shed is down), you could run it then. So the worst-case scenario is that the specific migrated tools are temporarily unavailable from your production Galaxy instance.

A nice feature of these tool migration scripts is that they are very flexible in when they can be run, which is any time. They also do not have to be run in any specific order. So, for example, you could run tool migration script 0002 six months after you've run migration scripts 0003, 0004, etc. These scripts do affect the Galaxy database by adding new records to certain tables, but if a script fails, no database corrections are necessary before running it again. You can just run the same script later, and it will handle whatever database state exists at that time.

> Anyway, all of that is a digression, right - I understand that we will need to have the deploy-time dependencies on tool sheds and make these tool migration script calls part of our workflow. My lingering hope is for a way of programmatically importing and updating new tools that were never part of Galaxy (Qiime, upload_local_file, etc.) using tool sheds.

This is currently possible (and very simple to do) using the Galaxy Admin UI. See the following sections of the tool shed wiki for details:
http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_installation_of_Galaxy_tool_shed_repository_tools_into_a_local_Galaxy_instance
http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_installation_of_Galaxy_tool_shed_repository_data_types_into_a_local_Galaxy_instance
http://wiki.g2.bx.psu.edu/Tool%20Shed#Getting_updates_for_tool_shed_repositories_installed_in_a_local_Galaxy_instance

I'm currently writing the following new section - it should be available within the next week or so:

http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_3rd_party_tool_dependency_installation_and_compilation_with_installed_repositories

> My previous e-mail was proposing a mechanism for doing that, but I think you read it as if I were trying to describe a way to script the migrations of the existing official Galaxy tools (I definitely understand that you have done that).
>
> Thanks again for your time and detailed responses,
> -John
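Greg's description of the numbered migration scripts - runnable in any order, and safe to simply re-run after a failure with no database cleanup - suggests a deploy-time hook along these lines. This is only a sketch: the scripts/migrate_tools path and the NNNN_tools.sh naming are assumptions extrapolated from the 0002/0003/0004 numbering he mentions, and GALAXY_ROOT is a hypothetical install location.

#!/usr/bin/env python
# Sketch: invoking Galaxy's numbered tool migration scripts from a deploy
# hook. Per Greg's description they may run in any order and can simply be
# re-run after a failure; no database corrections are needed first.
# The script path and *_tools.sh naming below are assumptions.
import subprocess

GALAXY_ROOT = "/opt/galaxy"  # hypothetical install location

def run_migration(number):
    script = "./scripts/migrate_tools/%04d_tools.sh" % number  # assumed naming
    try:
        subprocess.check_call(["sh", script], cwd=GALAXY_ROOT)
    except subprocess.CalledProcessError:
        # Worst case per Greg: the migrated tools stay unavailable on the
        # production instance until the script is re-run successfully.
        print("Migration %04d failed; safe to retry once the tool shed is back." % number)

# Order does not matter - e.g. 0002 can run six months after 0003 and 0004.
for n in (3, 4, 2):
    run_migration(n)

Because a failed run leaves the database in a state the script can handle on the next attempt, the hook can treat a migration failure as non-fatal and bring Galaxy up without the migrated tools, exactly the recovery path Greg outlines.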
[galaxy-dev] RNA-seq wig file from UCSC conversion to gene expression values, using Galaxy

Hello Galaxy list,

I'm helping a student with a human bioinformatics project. We would like to generate a single relative gene expression value for each gene in the human genome (for different cell lines of interest) using the RNA-seq data from the UCSC archive. We can see raw signal values by displaying the following file as a track in the Genome Browser:

Track name: BJ cell pA+ + 1
Table name: wgEncodeCshlLongRnaSeqBjCellPapMinusRawSigRep1
File name: /gbdb/hg19/bbi/wgEncodeCshlLongRnaSeqBjCellPapPlusRawSigRep1.bigWig

But where we get stuck is trying to import the wig file into Galaxy to generate expression values for each gene. Cufflinks seems like the right tool for the output we want, but we're not sure what its inputs are supposed to be. We'd appreciate any help you can provide.

Best regards,
---Karmella
Hello Galaxy list, I'm helping a student with a human bioinformatics project. We would like to generate a single relative gene expression value for each gene in the human genome (for different cell lines of interest) using the RNA-seq data from the UCSC archive. We can see raw signal values by displaying the following file as a track in the Genome Browser: Track name: BJ cell pA+ + 1 Table name: wgEncodeCshlLongRnaSeqBjCellPapMinusRawSigRep1 File name: /gbdb/hg19/bbi/wgEncodeCshlLongRnaSeqBjCellPapPlusRawSigRep1.bigWig But where we get stuck is trying to import the wig into Galaxy to generate expression values for each gene. Cufflinks seems like the right tool for the output we want, but we're not sure what the inputs are supposed to be. We'd appreciate any help you can provide. Best regards, ---Karmella ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/