Re: [galaxy-dev] Tool Shed Workflow
We are going in circles here :)

Me: my hope is for a way of programmatically importing and updating new tools
You: This is currently possible (and very simple to do) using the Galaxy Admin UI

I would not call using the Admin UI doing something programmatically. You have done a brilliant job making it easy to install and update tools via the Admin UI. I am not sure the experience could be made any easier. I have instead been trying to ask how I might script some of the actions you have enabled via the Admin UI.

My draconian theories about production environments aside, the second use case - fully automating the creation of preconfigured Galaxy instances for cloud images - requires this functionality if I want to use tool sheds; it's not a matter of taste. So I am going to implement it - I need it, I just wanted your opinion on the best way to go about it. We should perhaps continue this conversation via pull request.

Thanks again,
-John

On Sat, Jun 9, 2012 at 6:01 AM, Greg Von Kuster g...@bx.psu.edu wrote:

Hi John,

I feel this is an important topic and that others in the community are undoubtedly benefitting from it, so I'm glad you started this discussion.

On Jun 9, 2012, at 12:36 AM, John Chilton wrote:

We don't pull down from bitbucket directly to our production environment; we pull galaxy-dist changes into our testing repository, merge (that can be quite complicated, sometimes a multihour process), auto-deploy to a testing server, and then finally we push the tested changes into a bare production repo. Our sys admins then pull in changes from that bare production repo in our production environment. We also prebuild eggs in our testing environment, not live on our production system.

Given the complicated merges we need to do and the configuration files that need to be updated each dist update, it would seem making those changes on a live production system would be problematic.
Even if one was pulling changes directly from bitbucket into a production codebase, I think the dependency on bitbucket would be very different than on N toolsheds. These tool migrations will only interact with a single tool shed, the main Galaxy tool shed. If our sys admin is going to update Galaxy and bitbucket is down, that is no problem he or she can just bring Galaxy back up and update later. Now lets imagine they shutdown our galaxy instance, updated the code base, did a database migration, and went to do a toolshed migration and that failed. In this case instead of just bringing Galaxy back up they will now need to restore the database from backup and pullout of the mercurial changes. In your scenario, if everything went well except the tool shed migration, an option that would be less intrusive than reverting back to the previous Galaxy release would be to just bring up your server without the migrated tools for a temporary time. When the tool shed migration process is corrected (generally, the only reason it would break is if the tool shed was down), you could run it at that time. So the worst case scenario is that the specific migrated tools will be temporarily unavailable from your production Galaxy instance. A nice feature of these migrated tool scripts is that they are very flexible in when they can be run, which is any time. They also do not have to be run in any specific order. So, for example, you could run tool migration script 0002 six months after you've run migration script 0003, 0004, etc. These scripts do affect the Galaxy database by adding new records to certain tables, but if the script fails, no database corrections are necessary in order to prepare for running the script again. You can just run the same script later, and the script will handle whatever database state exists at that time. 
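Greg's point that the migration scripts can be run at any time, in any order, and simply rerun after a failure suggests a deploy step that tolerates a tool shed outage instead of rolling back. The following is a minimal sketch in Python, assuming the scripts are invoked with sh as in the commands quoted elsewhere in this thread. The function names, the injectable runner, and the idea of returning the failed scripts for a later retry are illustrative assumptions, not part of Galaxy.

```python
import subprocess
from pathlib import Path
from typing import Callable, Iterable, List


def run_script(script: Path) -> bool:
    """Run one migration script with sh; True when it exits with status 0."""
    return subprocess.run(["sh", str(script)]).returncode == 0


def run_tool_migrations(
    scripts: Iterable[Path],
    runner: Callable[[Path], bool] = run_script,
) -> List[Path]:
    """Try every script and return the ones that failed.

    A failure (for example, the tool shed being down) does not stop the
    remaining scripts, because the thread states they are order-independent
    and safe to rerun later without database corrections.
    """
    failed = []
    for script in scripts:
        if not runner(script):
            failed.append(script)
    return failed
```

A deploy job could bring Galaxy back up regardless of the result and schedule another call with the returned list, for example `run_tool_migrations([Path("scripts/migrate_tools/0002_tools.sh")])`. The migrated tools would be missing from the instance until the retry succeeds, which is the worst case Greg describes.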
Anyway all of that is a digression right, I understand that we will need to have the deploy-time dependencies on tool sheds and make these tool migration script calls part of our workflow. My lingering hope is for a way of programmatically importing and updating new tools that were never part of Galaxy (Qiime, upload_local_file, etc...) using tool sheds. This is currently possible (and very simple to do) using the Galaxy Admin UI. See the following sections of the tool shed wiki for details. http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_installation_of_Galaxy_tool_shed_repository_tools_into_a_local_Galaxy_instance http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_installation_of_Galaxy_tool_shed_repository_data_types_into_a_local_Galaxy_instance http://wiki.g2.bx.psu.edu/Tool%20Shed#Getting_updates_for_tool_shed_repositories_installed_in_a_local_Galaxy_instance I'm currently writing the following new section - it should be available within the next week or so. http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_3rd_party_tool_dependency_installation_and_compilation_with_installed_repositories My previous e-mail was
Re: [galaxy-dev] Tool Shed Workflow
John, Why not separate toolshed updates from dist updates - tool xml and other code should be robust wrt dist version. One thing at a time - tools get updated less often than dist I'd wager, and you can subscribe to repository update emails. After a dist update you want all the tool functional tests green as evidence that at least the test cases are running! As always, YMMV On Sat, Jun 9, 2012 at 2:38 PM, John Chilton chil...@msi.umn.edu wrote: On Fri, Jun 8, 2012 at 3:27 PM, Greg Von Kuster g...@bx.psu.edu wrote: Hi John, On Jun 8, 2012, at 1:22 PM, John Chilton wrote: Hello Greg, Thanks for the prompt and detailed response (though it did make me sad). I think deploying tested, static components and configurations to production environments and having production environments not depending on outside services (like the tool shed) should be considered best practices. I'm not sure I understand this issue. What processes are you using to upgrade your test and production servers with new Galaxy distributions? If you are pulling new Galaxy distributions from our Galaxy dist repository in bitbucket, then pulling tools from the Galaxy tool shed is not much different - both are outside services. Updating your test environment, determining it is functionally correct, and then updating your production environment using the same approach would generally follow a best practice approach. This is the approach we are currently using for our public test and main Galaxy instances at Penn State. We don't pull down from bitbucket directly to our production environment, we pull galaxy-dist changes into our testing repository, merge (that can be quite complicated, sometimes a multihour process), auto-deploy to a testing server, and then finally we push the tested changes into a bare production repo. Our sys admins then pull in changes from that bare production repo in our production environment. We also prebuild eggs in our testing environment not live on our production system. 
Given the complicated merges we need to do and the configuration files that need to be updated each dist update it would seem making those changes on a live production system would be problematic. Even if one was pulling changes directly from bitbucket into a production codebase, I think the dependency on bitbucket would be very different than on N toolsheds. If our sys admin is going to update Galaxy and bitbucket is down, that is no problem he or she can just bring Galaxy back up and update later. Now lets imagine they shutdown our galaxy instance, updated the code base, did a database migration, and went to do a toolshed migration and that failed. In this case instead of just bringing Galaxy back up they will now need to restore the database from backup and pullout of the mercurial changes. Anyway all of that is a digression right, I understand that we will need to have the deploy-time dependencies on tool sheds and make these tool migration script calls part of our workflow. My lingering hope is for a way of programmatically importing and updating new tools that were never part of Galaxy (Qiime, upload_local_file, etc...) using tool sheds. My previous e-mail was proposing or positing a mechanism for doing that, but I think you read it like I was trying to describe a way to script the migrations of the existing official Galaxy tools (I definitely get that you have done that). Thanks again for your time and detailed responses, -John ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/ -- Ross Lazarus MBBS MPH; Associate Professor, Harvard Medical School; Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444; ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-dev] Tool Shed Workflow
Hi John, I feel this is an important topic and that others in the community are undoubtedly benefitting from it, so I'm glad you started this discussion. On Jun 9, 2012, at 12:36 AM, John Chilton wrote: We don't pull down from bitbucket directly to our production environment, we pull galaxy-dist changes into our testing repository, merge (that can be quite complicated, sometimes a multihour process), auto-deploy to a testing server, and then finally we push the tested changes into a bare production repo. Our sys admins then pull in changes from that bare production repo in our production environment. We also prebuild eggs in our testing environment not live on our production system. Given the complicated merges we need to do and the configuration files that need to be updated each dist update it would seem making those changes on a live production system would be problematic. Even if one was pulling changes directly from bitbucket into a production codebase, I think the dependency on bitbucket would be very different than on N toolsheds. These tool migrations will only interact with a single tool shed, the main Galaxy tool shed. If our sys admin is going to update Galaxy and bitbucket is down, that is no problem he or she can just bring Galaxy back up and update later. Now lets imagine they shutdown our galaxy instance, updated the code base, did a database migration, and went to do a toolshed migration and that failed. In this case instead of just bringing Galaxy back up they will now need to restore the database from backup and pullout of the mercurial changes. In your scenario, if everything went well except the tool shed migration, an option that would be less intrusive than reverting back to the previous Galaxy release would be to just bring up your server without the migrated tools for a temporary time. When the tool shed migration process is corrected (generally, the only reason it would break is if the tool shed was down), you could run it at that time. 
So the worst case scenario is that the specific migrated tools will be temporarily unavailable from your production Galaxy instance. A nice feature of these migrated tool scripts is that they are very flexible in when they can be run, which is any time. They also do not have to be run in any specific order. So, for example, you could run tool migration script 0002 six months after you've run migration script 0003, 0004, etc. These scripts do affect the Galaxy database by adding new records to certain tables, but if the script fails, no database corrections are necessary in order to prepare for running the script again. You can just run the same script later, and the script will handle whatever database state exists at that time. Anyway all of that is a digression right, I understand that we will need to have the deploy-time dependencies on tool sheds and make these tool migration script calls part of our workflow. My lingering hope is for a way of programmatically importing and updating new tools that were never part of Galaxy (Qiime, upload_local_file, etc...) using tool sheds. This is currently possible (and very simple to do) using the Galaxy Admin UI. See the following sections of the tool shed wiki for details. http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_installation_of_Galaxy_tool_shed_repository_tools_into_a_local_Galaxy_instance http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_installation_of_Galaxy_tool_shed_repository_data_types_into_a_local_Galaxy_instance http://wiki.g2.bx.psu.edu/Tool%20Shed#Getting_updates_for_tool_shed_repositories_installed_in_a_local_Galaxy_instance I'm currently writing the following new section - it should be available within the next week or so. 
http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_3rd_party_tool_dependency_installation_and_compilation_with_installed_repositories

My previous e-mail was proposing or positing a mechanism for doing that, but I think you read it like I was trying to describe a way to script the migrations of the existing official Galaxy tools (I definitely get that you have done that).

Thanks again for your time and detailed responses,
-John
Re: [galaxy-dev] Tool Shed Workflow
Hi John,

On Jun 7, 2012, at 11:55 PM, John Chilton wrote:

I have read through the documentation a couple times, but I still have a few questions about the recent tool shed enhancements. At MSI we have a testing environment and a production environment and I want to make sure the tool versions and configurations don't get out of sync. I would also like to test everything in our testing environment before it reaches production. Is there a recommended way to accomplish this rather than just manually repeating the same set of UI interactions twice? Can I just import tools through the testing UI and run the ./scripts/migrate_tools/ scripts on our testing repository and then move the resulting migrated_tools_conf.xml and integrated_tool_panel.xml files into production? I have follow up questions, but I will wait for a response on this point.

Tools that used to be in the Galaxy distribution but have been moved to the main Galaxy tool shed can be installed automatically: when you start up your Galaxy server, you are presented with the option of running the migration script to automatically install the tools that were migrated in the specific Galaxy distribution release. If you choose to install the tools, they are installed only in that specific Galaxy instance. Installation produces mercurial repositories that include the tools on disk in your Galaxy server environment. Several other things are produced as well, including database records for the installation. Because each Galaxy instance consists of its own separate set of components, this installation process must be done for each instance. The installation is fully automatic, requiring little interaction on the part of the Galaxy admin, and doesn't require much time, so performing the process for each Galaxy instance should not be too intensive. Also, the tools that are installed into each Galaxy instance's tool panel are only those tools that were originally defined in the tool panel configuration file (tool_conf.xml).
This approach ensures that the migration process does not alter Galaxy instances that have different tools defined.

Also as you are removing tools from Galaxy and placing them into our tool shed, what is the recommended course of action for deployers that have made local minor tweaks to those tool configs and scripts to adapt them to our local environments? Along the same lines, what is the recommended course of action if we need to make minor tweaks to tools pulled in through the UI to adapt them to our institution?

In both cases you should upload your proprietary tools to either a local Galaxy tool shed that you administer, or the main Galaxy tool shed if you want. You can choose to not execute any of the tool migration scripts, so the Galaxy tools that were migrated from the distribution will not be installed into your Galaxy environment. You can use the Galaxy admin UI to install your proprietary versions of the migrated tools from the tool shed in which you chose to upload and store them. New versions of the tools can be uploaded to respective tool shed repositories over time.

Thanks for your time,
-John

John Chilton
Senior Software Developer
University of Minnesota Supercomputing Institute
Office: 612-625-0917
Cell: 612-226-9223
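The rule Greg describes in this message, that only tools already defined in tool_conf.xml are loaded into the tool panel, can be sketched roughly as follows. This assumes tools are matched on the file attribute of the tool elements in tool_conf.xml; the thread does not say how InstallManager performs the match, and the function names and sample paths used here are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List, Set


def tool_files_in_panel(tool_conf_xml: str) -> Set[str]:
    """Collect the file attribute of every <tool> element in a tool_conf.xml string."""
    root = ET.fromstring(tool_conf_xml)
    return {tool.get("file") for tool in root.iter("tool") if tool.get("file")}


def tools_to_load(repository_tools: Iterable[str], tool_conf_xml: str) -> List[str]:
    """Keep only the repository tools that the local tool_conf.xml already defines.

    The whole repository would still be installed on disk; this only decides
    which of its tools appear in the tool panel (assumed matching rule).
    """
    defined = tool_files_in_panel(tool_conf_xml)
    return [path for path in repository_tools if path in defined]
```

Under that assumption, an instance whose tool_conf.xml lists one of two tools in a migrated repository would see only that one tool in its panel after the migration, while an instance that lists neither would be left unchanged.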
Re: [galaxy-dev] Tool Shed Workflow
Hello Greg, Thanks for the prompt and detailed response (though it did make me sad). I think deploying tested, static components and configurations to production environments and having production environments not depending on outside services (like the tool shed) should be considered best practices. Oh well, I guess. Would there be a way to at least automate the pulling of tools in. For instance, would it make sense to tweak InstallManager to parse a new kind of migration file that is a lot like the official migration files, but with the sections defined in the file. For this new kind of migration, the InstallManager would then import everything in the file and not just the tools that are also in a tool_conf? Does that make sense? If yes, I imagine it could be modified to handle updates the same way? Rephrased, I guess the idea would be to have the sequence of official galaxy migrations that check tool_conf, and then have a sequence of migration defined by the deployer that could be used to install new tools or update existing ones. My concern isn't just with the dev to production transition, it is also the ability to sort of programmatically define Galaxy installations the way I am doing with the galaxy-vm-launcher (https://bitbucket.org/jmchilton/galaxy-vm-launcher) or the way mi-deployment works. Thanks again for your time and patience in explaining these things to me, -John On Fri, Jun 8, 2012 at 4:19 AM, Greg Von Kuster g...@bx.psu.edu wrote: Hi John, On Jun 7, 2012, at 11:55 PM, John Chilton wrote: I have read through the documentation a couple times, but I still have a few questions about the recent tool shed enhancements. At MSI we have a testing environment and a production environment and I want to make sure the tool versions and configurations don't get out of sync, I would also like to test everything in our testing environment before it reaches production. 
Is there a recommended way to accomplish this rather than just manually repeating the same set of UI interactions twice? Can I just import tools through the testing UI and run the ./scripts/migrate_tools/ scripts on our testing repository and then move the resulting migrated_tools_conf.xml and integrated_tool_panel.xml files into production? I have follow up questions, but I will wait for a response on this point. Tools that used to be in the Galaxy distribution but have been moved to the main Galaxy tool shed are automatically installed when you start up your Galaxy server and presented with the option of running the migration script to automatically install the tools that were migrated in the specific Galaxy distribution release. If you choose to install the tools, they are installed only in that specific Galaxy instance. Installation produces mercurial repositories that include the tools on disk in your Galaxy server environment. Several other things are produced as well, including database records for the installation. Each Galaxy instance consists of it's own separate set of components, this installation process must be done for each instance. The installation is fully automatic, requiring little interaction on the part of the Galaxy admin, and doesn't require much time, so performing the process for each Galaxy instance should not be too intensive. Also, the tools that are installed into each Galaxy instance's tool panel are only those tools that were originally defined in the tool panel configuration file (tool_conf.xml). This approach provides for the case where each Galaxy instance having different tools defined will not be altered by the migration process. Also as you are removing tools from Galaxy and placing them into our tool shed, what is the recommended course of actions for deployers that have made local minor tweaks to those tool configs and scripts and adapt them to our local environments? 
Along the same lines, what is the recommended course of action if we need to make minor tweaks to tools pulled in through the UI to adapt them to our institution?

In both cases you should upload your proprietary tools to either a local Galaxy tool shed that you administer, or the main Galaxy tool shed if you want. You can choose to not execute any of the tool migration scripts, so the Galaxy tools that were migrated from the distribution will not be installed into your Galaxy environment. You can use the Galaxy admin UI to install your proprietary versions of the migrated tools from the tool shed in which you chose to upload and store them. New versions of the tools can be uploaded to respective tool shed repositories over time.

Thanks for your time,
-John

John Chilton
Senior Software Developer
University of Minnesota Supercomputing Institute
Office: 612-625-0917
Cell: 612-226-9223
Re: [galaxy-dev] Tool Shed Workflow
Hi John,

On Jun 8, 2012, at 1:22 PM, John Chilton wrote:

Hello Greg,

Thanks for the prompt and detailed response (though it did make me sad). I think deploying tested, static components and configurations to production environments and having production environments not depending on outside services (like the tool shed) should be considered best practices.

I'm not sure I understand this issue. What processes are you using to upgrade your test and production servers with new Galaxy distributions? If you are pulling new Galaxy distributions from our Galaxy dist repository in bitbucket, then pulling tools from the Galaxy tool shed is not much different - both are outside services. Updating your test environment, determining it is functionally correct, and then updating your production environment using the same approach would generally follow a best practice approach. This is the approach we are currently using for our public test and main Galaxy instances at Penn State.

Oh well, I guess. Would there be a way to at least automate the pulling of tools in?

The process is completely automated - all you need to do is execute a script, something like this:

    sh ./scripts/migrate_tools/0002_tools.sh

This is the same process used when the Galaxy database schema migrates as part of a new Galaxy release, except in that case you would run a script like this:

    sh manage_db.sh upgrade

For instance, would it make sense to tweak InstallManager to parse a new kind of migration file that is a lot like the official migration files, but with the sections defined in the file. For this new kind of migration, the InstallManager would then import everything in the file and not just the tools that are also in a tool_conf? Does that make sense? If yes, I imagine it could be modified to handle updates the same way?

If I understand this correctly, this is how the InstallManager works.
The entire tool shed repository is installed into your local Galaxy environment, but only the tools that are currently defined in your tool_conf.xml file are loaded into your tool panel.

Rephrased, I guess the idea would be to have the sequence of official galaxy migrations that check tool_conf, and then have a sequence of migrations defined by the deployer that could be used to install new tools or update existing ones. My concern isn't just with the dev to production transition, it is also the ability to sort of programmatically define Galaxy installations the way I am doing with the galaxy-vm-launcher (https://bitbucket.org/jmchilton/galaxy-vm-launcher) or the way mi-deployment works.

You have complete control with these migrations. You can choose to not install any tool shed repositories, and just start your Galaxy server. If you choose to install the defined repositories, you have control over which specific tools included in the repository are loaded into your tool panel by having them defined in your tool_conf.xml prior to the installation. This whole process is associated only with tools that have moved from the Galaxy distribution to the tool shed.

Thanks again for your time and patience in explaining these things to me,
-John

On Fri, Jun 8, 2012 at 4:19 AM, Greg Von Kuster g...@bx.psu.edu wrote:

Hi John,

On Jun 7, 2012, at 11:55 PM, John Chilton wrote:

I have read through the documentation a couple times, but I still have a few questions about the recent tool shed enhancements. At MSI we have a testing environment and a production environment and I want to make sure the tool versions and configurations don't get out of sync. I would also like to test everything in our testing environment before it reaches production. Is there a recommended way to accomplish this rather than just manually repeating the same set of UI interactions twice?
Can I just import tools through the testing UI and run the ./scripts/migrate_tools/ scripts on our testing repository and then move the resulting migrated_tools_conf.xml and integrated_tool_panel.xml files into production? I have follow up questions, but I will wait for a response on this point. Tools that used to be in the Galaxy distribution but have been moved to the main Galaxy tool shed are automatically installed when you start up your Galaxy server and presented with the option of running the migration script to automatically install the tools that were migrated in the specific Galaxy distribution release. If you choose to install the tools, they are installed only in that specific Galaxy instance. Installation produces mercurial repositories that include the tools on disk in your Galaxy server environment. Several other things are produced as well, including database records for the installation. Each Galaxy instance consists of it's own separate set of components, this installation process must be done for each instance. The installation is fully automatic, requiring little
Re: [galaxy-dev] Tool Shed Workflow
On Fri, Jun 8, 2012 at 3:27 PM, Greg Von Kuster g...@bx.psu.edu wrote: Hi John, On Jun 8, 2012, at 1:22 PM, John Chilton wrote: Hello Greg, Thanks for the prompt and detailed response (though it did make me sad). I think deploying tested, static components and configurations to production environments and having production environments not depending on outside services (like the tool shed) should be considered best practices. I'm not sure I understand this issue. What processes are you using to upgrade your test and production servers with new Galaxy distributions? If you are pulling new Galaxy distributions from our Galaxy dist repository in bitbucket, then pulling tools from the Galaxy tool shed is not much different - both are outside services. Updating your test environment, determining it is functionally correct, and then updating your production environment using the same approach would generally follow a best practice approach. This is the approach we are currently using for our public test and main Galaxy instances at Penn State. We don't pull down from bitbucket directly to our production environment, we pull galaxy-dist changes into our testing repository, merge (that can be quite complicated, sometimes a multihour process), auto-deploy to a testing server, and then finally we push the tested changes into a bare production repo. Our sys admins then pull in changes from that bare production repo in our production environment. We also prebuild eggs in our testing environment not live on our production system. Given the complicated merges we need to do and the configuration files that need to be updated each dist update it would seem making those changes on a live production system would be problematic. Even if one was pulling changes directly from bitbucket into a production codebase, I think the dependency on bitbucket would be very different than on N toolsheds. 
If our sys admin is going to update Galaxy and bitbucket is down, that is no problem he or she can just bring Galaxy back up and update later. Now lets imagine they shutdown our galaxy instance, updated the code base, did a database migration, and went to do a toolshed migration and that failed. In this case instead of just bringing Galaxy back up they will now need to restore the database from backup and pullout of the mercurial changes. Anyway all of that is a digression right, I understand that we will need to have the deploy-time dependencies on tool sheds and make these tool migration script calls part of our workflow. My lingering hope is for a way of programmatically importing and updating new tools that were never part of Galaxy (Qiime, upload_local_file, etc...) using tool sheds. My previous e-mail was proposing or positing a mechanism for doing that, but I think you read it like I was trying to describe a way to script the migrations of the existing official Galaxy tools (I definitely get that you have done that). Thanks again for your time and detailed responses, -John ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
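The mechanism John is asking for, a deployer-defined list of tool shed repositories that can be installed or updated without the Admin UI, could start from a declarative description of the desired state. The sketch below only computes which repositories would need installing or updating. RepoSpec, plan_actions, and every field and sample value are hypothetical, and no Galaxy or tool shed API is called.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class RepoSpec:
    """One desired tool shed repository (all fields are illustrative)."""
    name: str
    owner: str
    revision: str
    panel_section: str


def plan_actions(
    desired: List[RepoSpec],
    installed: Dict[Tuple[str, str], str],
) -> List[Tuple[str, RepoSpec]]:
    """Compare the desired list with installed revisions keyed by (owner, name).

    Returns ("install", spec) for missing repositories and ("update", spec)
    for repositories whose installed revision differs; up-to-date ones are skipped.
    """
    actions = []
    for spec in desired:
        current = installed.get((spec.owner, spec.name))
        if current is None:
            actions.append(("install", spec))
        elif current != spec.revision:
            actions.append(("update", spec))
    return actions
```

Keeping the plan separate from whatever performs the installation means the same list could be checked into a deployment repository and replayed against a testing instance, a production instance, or a freshly launched cloud image, which are the two use cases raised in this thread.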