Re: [galaxy-dev] [galaxy-user] Using Galaxy Cloudman for a workshop
Hi Clare,

The share string is generated when you share a cluster. The string is accessible on the shared cluster: click the green 'Share a cluster' icon next to the cluster name and then the top link, 'Shared instances'. You will get a list of the point-in-time shares of the cluster you have created. The share string will look something like this: cm-cd53Bfg6f1223f966914df347687f6uf32/shared/2011-10-19--03-14. You simply paste that string into the new cluster box you mentioned.

Enis

On Thu, Dec 1, 2011 at 6:31 AM, Clare Sloggett s...@unimelb.edu.au wrote: Hi Enis, Jeremy, and all, Thanks so much for all your help. I have another question, which I suspect is just me missing something obvious. I'm guessing that when you cloned the cluster for your workshop, you used CloudMan's 'share-an-instance' functionality? When I launch a new cluster which I want to be a copy of an existing cluster, and select the share-an-instance option, it asks for the cluster share string. How can I find this string for my existing cluster? Or have I got completely the wrong idea - did you actually clone the instance using AWS functionality? Thanks, Clare

On Mon, Nov 21, 2011 at 5:37 PM, Enis Afgan eaf...@emory.edu wrote: Hi Clare, I don't recall what instance type we used earlier, but I think an Extra Large Instance is going to be fine. Do note that the master node is also used to run jobs; however, if it is loaded by just the web server, SGE will typically not schedule jobs to it. As far as the core/thread/slot concern goes, SGE sees each core as a slot. Each job in Galaxy simply requires 1 slot, even if it uses multiple threads (i.e., cores). What this means is that nodes will probably get overloaded if only the same type of job is being run (e.g., BWA), but if analyses are being run that use multiple tools, jobs will get spread over the cluster, balancing the overall load a bit better than by simply looking at the number of slots.
Enis

On Mon, Nov 21, 2011 at 4:34 AM, Clare Sloggett s...@unimelb.edu.au wrote: Hi Jeremy, Also, if you do remember what kind of Amazon node you used, particularly for the cluster's master node (e.g. an 'xlarge' 4-core 15GB node, or perhaps one of the 'high-memory' nodes?), that would be a reassuring sanity check for me! Cheers, Clare

On Mon, Nov 21, 2011 at 10:37 AM, Clare Sloggett s...@unimelb.edu.au wrote: Hi Jeremy, Enis, That makes sense. I know I can configure how many threads BWA uses in its wrapper, with bwa -t. But is there somewhere that I need to tell Galaxy the corresponding information, i.e. that this command-line task will make use of up to 4 cores? Or does this imply that there is always exactly one job per node? So if I have (for instance) a cluster made of 4-core nodes, and a single-threaded task (e.g. samtools), are the other 3 cores just going to waste, or will the scheduler allocate multiple single-threaded jobs to one node? I've cc'd galaxy-dev instead of galaxy-user as I think the conversation has gone that way! Thanks again, Clare

On Fri, Nov 18, 2011 at 2:36 PM, Jeremy Goecks jeremy.goe...@emory.edu wrote: On Fri, Nov 18, 2011 at 12:56 AM, Jeremy Goecks jeremy.goe...@emory.edu wrote: Scalability issues are more likely to arise on the back end than the front end, so you'll want to ensure that you have enough compute nodes. BWA uses four nodes by default--Enis, does the cloud config change this parameter?--so you'll want 4x50, or 200 total nodes, if you want everyone to be able to run a BWA job simultaneously. Actually, one other question - this paragraph makes me realise that I don't really understand how Galaxy is distributing jobs. I had thought that each job would only use one node, and in some cases take advantage of multiple cores within that node. I'm taking a node to be a set of cores with their own shared memory, so in this case a VM instance - is this right?
If some types of jobs can be distributed over multiple nodes, can I configure, in Galaxy, how many nodes they should use?

You're right -- my word choices were poor. Replace 'node' with 'core' in my paragraph to get an accurate suggestion for resources. Galaxy uses a job scheduler--SGE on the cloud--to distribute jobs to different cluster nodes. Jobs that require multiple cores typically run on a single node. Enis can chime in on whether CloudMan supports job submission over multiple nodes; this would require setup of an appropriate parallel environment and a tool that can make use of this environment. Good luck, J.

-- E: s...@unimelb.edu.au P: 03 903 53357 M: 0414 854 759
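[Editor's note] The slot model Enis describes (one slot per core, one slot per job regardless of threads) can be made concrete with a small toy model. This is an illustrative sketch only, not CloudMan or SGE code; the node sizes and job mix are made up.

```python
# Toy model of SGE-style slot scheduling: each core is one slot, and every
# job occupies exactly one slot regardless of how many threads it spawns.
# Shows why a node full of 4-thread BWA jobs oversubscribes its cores,
# while a mixed workload balances better. All numbers are hypothetical.

def schedule(jobs, nodes, slots_per_node=4):
    """Assign each (job_id, threads) pair to the node with the most free slots."""
    free = {n: slots_per_node for n in nodes}
    placement = {}
    for job_id, threads in jobs:
        node = max(free, key=free.get)        # least-loaded node, by slots
        if free[node] == 0:
            break                             # all slots in the cluster taken
        free[node] -= 1                       # a job takes ONE slot...
        placement.setdefault(node, []).append((job_id, threads))
    return placement

def thread_load(placement):
    """...but its actual thread count can exceed the node's core count."""
    return {n: sum(t for _, t in js) for n, js in placement.items()}

# Four 4-thread BWA jobs all fit on one 4-core node slot-wise,
# yet together they want 16 threads.
bwa_jobs = [("bwa%d" % i, 4) for i in range(4)]
print(thread_load(schedule(bwa_jobs, ["node1"])))   # {'node1': 16}
```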
Re: [galaxy-dev] Removing nodes from a CloudMan instance
Unfortunately not. When removing nodes, CloudMan chooses from the nodes that are not currently being used, but among those the choice is random. You can manually terminate a particular worker instance from the AWS console, and thus remove it from your cluster by force; CloudMan will then reconfigure the cluster. Although this has been implemented and tested, it is not really the recommended approach, especially not repeatedly. I'll look into how to implement this option.

Enis

On Thu, Dec 1, 2011 at 6:20 AM, Clare Sloggett s...@unimelb.edu.au wrote: Hi galaxy-devs, Quick question: when using the cloud console on CloudMan, it's possible to add different types of nodes (large, micro, etc.) to the virtual cluster using the 'Add Nodes' option at the top. I can also remove a given number of nodes using the 'Remove Nodes' option at the top. However, is there any way to control exactly which node (or, more importantly, which type of node) gets removed? Thanks for any help! Clare

-- E: s...@unimelb.edu.au P: 03 903 53357 M: 0414 854 759

___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
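[Editor's note] Since CloudMan's 'Remove nodes' picks randomly among unused workers, the manual route Enis describes amounts to choosing a specific instance yourself and terminating it via the AWS console. The selection step might look like the sketch below; the worker records and field names are hypothetical, and the actual termination would still happen through the AWS console or API.

```python
# Sketch of picking a termination candidate by instance type, since
# CloudMan's 'Remove nodes' does not let you choose. The node records and
# field names here are hypothetical; in practice you would read this
# information from the AWS console/API and terminate the chosen instance
# ID there, after which CloudMan reconfigures the cluster.

def pick_removal_candidate(workers, instance_type):
    """Return the ID of an idle worker of the given type, or None."""
    for node in workers:
        if node["type"] == instance_type and node["busy_slots"] == 0:
            return node["id"]
    return None

workers = [
    {"id": "i-aaa111", "type": "m1.large",  "busy_slots": 2},  # running jobs
    {"id": "i-bbb222", "type": "t1.micro",  "busy_slots": 0},  # idle
    {"id": "i-ccc333", "type": "m1.xlarge", "busy_slots": 0},  # idle
]
print(pick_removal_candidate(workers, "t1.micro"))   # i-bbb222
```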
Re: [galaxy-dev] GalaxyCloudman + CADDSuite
Hi Marcel,

However, when I create an AMI, terminate the cluster and create a new cluster using the new AMI, both /mnt/galaxyData and /mnt/galaxyTools do not exist anymore, i.e. /dev/sdg3 and /dev/sdg4 are not mounted automatically. If I mount those two devices manually, everything runs smoothly again. So, is there anything that I might have forgotten to do while creating the AMI? Is there a way to make sure that those devices will be mounted automatically?

It is not necessary to create a new AMI when wanting to customize your cluster. Instead, on the admin interface - after you have modified the file systems - there is an option to persist static file systems (galaxyTools and galaxyIndices). Once the process is completed and you restart the cluster, just continue to use the same AMI; CloudMan will use the new, customized data snapshots at runtime. Let us know how it goes, Enis

Regards, Marcel

On 11/30/11 3:57 PM, Enis Afgan wrote: Hi Marcel, It would be best to use the 'galaxy' user to add any tools. To do so, after you've logged in as the 'ubuntu' user, simply execute: sudo su galaxy and you will become the galaxy user. You can then make the desired modifications. Good luck, Enis

On Wed, Nov 30, 2011 at 3:42 PM, Greg Von Kuster g...@bx.psu.edu wrote: Hello Marcel, In the future, please send all questions like this to the galaxy-dev mailing list, as doing so will streamline the process of getting a timely answer. I believe Enis is best able to answer your questions. Thanks! On Nov 30, 2011, at 9:29 AM, Marcel Schumann wrote: Hi Greg, I'm currently trying to create a GalaxyCloudman version that includes CADDSuite. Thus, I launched GalaxyCloudman as described in your wiki and tried to modify it afterwards. Well, starting CloudMan worked without any problems... so far, so good :-) As described on http://wiki.g2.bx.psu.edu/Admin/Cloud/Customize%20Galaxy%20Cloud I could then log in via ssh as user 'ubuntu' (not as user 'galaxy'). However, all files of the Galaxy installation belong to user and group 'galaxy'. Thus my question: how should users be able to customize CloudMan? Is there some trick by which I can log in as 'galaxy', or do you have any other idea how to make this work? ;-) Sorry, Greg, if you are not the correct contact in this case, but I found no specific contact or mailing list for CloudMan. Perhaps you could just forward this mail in that case... Cheers, Marcel

-- Marcel Schumann University of Tuebingen Wilhelm Schickard Institute for Computer Science Division for Applied Bioinformatics Room C313, Sand 14, D-72076 Tuebingen phone: +49 (0)7071-29 70437 fax: +49 (0)7071-29 5152 email: schum...@informatik.uni-tuebingen.de

Greg Von Kuster Galaxy Development Team g...@bx.psu.edu
[galaxy-dev] synchronous data depositing
Greetings,

I've been attempting to return data to Galaxy via the synchronous data depositing protocol. Using BioMart, the UCSC Table Browser, etc. as examples in the data_source tools directory, I've been able to get the initial GET request to my site just fine. However, when I POST back to Galaxy I immediately get a redirect to the welcome page, and Galaxy never resubmits back to my site. I was wondering if there is more to the protocol than is covered here: http://wiki.g2.bx.psu.edu/Admin/Internals/Data%20Sources or perhaps configuration I need to perform on my local Galaxy installation to correctly handle the POSTs back to tool_runner? Also, are there any code examples I should be looking at? Thanks for your help!

-James

-- J Ireland www.5amsolutions.com | Software for Life(TM) m: 415 484-DATA (3282)
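[Editor's note] The POST-back half of the protocol described above can be sketched as follows. The parameter names mirror those visible in the BioMart data-source example later in this thread (URL, URL_method, data_type, name); treat them as assumptions drawn from that example, not a complete specification of the protocol.

```python
# Sketch of the form a remote data source might POST back to Galaxy's
# tool_runner, based on the parameters visible in the BioMart example.
# The exact parameter set is tool-specific; these names are assumptions.
from urllib.parse import urlencode, parse_qs

galaxy_url = "http://localhost:8080/tool_runner/test_me"  # from GALAXY_URL
params = {
    "URL": "http://example.org/export?session=abc123",  # where Galaxy re-requests the data
    "URL_method": "get",        # Galaxy fetches URL with GET (or 'post')
    "data_type": "tabular",     # becomes the resulting dataset's datatype
    "name": "My query results", # dataset name shown in the history
}
body = urlencode(params)
# A real remote site would send `body` as the body of an HTTP POST
# to galaxy_url; Galaxy then turns around and requests params["URL"].
print(parse_qs(body)["data_type"])   # ['tabular']
```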
Re: [galaxy-dev] [galaxy-user] Using Galaxy Cloudman for a workshop
Right! I did think to look for a 'share this cluster' command, I just failed to find it. It all makes sense now, thanks.

On Thu, Dec 1, 2011 at 7:34 PM, Enis Afgan eaf...@emory.edu wrote: Hi Clare, The share string is generated when you share a cluster. The string is accessible on the shared cluster, when you click the green 'Share a cluster' icon next to the cluster name and then the top link, 'Shared instances'. You will get a list of the point-in-time shares of the cluster you have created. The share string will look something like this: cm-cd53Bfg6f1223f966914df347687f6uf32/shared/2011-10-19--03-14. You simply paste that string into the new cluster box you mentioned. Enis
Re: [galaxy-dev] GalaxyCloudman + CADDSuite
Hi Marcel,

On Thu, Dec 1, 2011 at 11:28 AM, Marcel Schumann schum...@informatik.uni-tuebingen.de wrote: Hi Enis, well, I know I do not have to create a new AMI if I want to reuse an instance myself. However, I would like to share the modified GalaxyCloudman version with other people, and therefore I do have to create an AMI.

Unless you modify the system packages (i.e., your customizations are not self-contained), you still don't have to create a new AMI to share a cluster. There is the share-a-cluster option (icon next to the cluster name). Just wanted to make sure you were aware of the functionality...

Ok, I will try to make this work somehow... but I guess there are no immediate clues as to what could have gone wrong? Or do you have any ideas what I should try?

CloudMan sets up the system at runtime, so it performs changes that then get persisted when you create the AMI. It is therefore necessary to reverse those changes before creating the AMI, so that the next time a cluster is started, the startup procedure proceeds as before. Did you see what's in the CloudMan log (/mnt/cm/paster.log) on your customized AMI? That's probably the easiest place to start, and we can work from there.

Enis
Re: [galaxy-dev] Quota will not decrease with permanent delete
You aren't the only one having problems with the disk usage reported in the upper right. I've also had problems with the total disk usage in our local instance of Galaxy. I've deleted all but a handful of files, and we have a cron job that purges the files from disk after a certain number of days. I've deleted everything but one history, and for the last few weeks the size of that history has been reported as about 200GB. I don't have any other histories, or any deleted-but-not-purged datasets. However, the total disk usage shown by Galaxy is about 1.5TB.

On Nov 30, 2011, at 2:24 PM, Mary Anne Alliegro wrote: Hi Galaxy Users, I have permanently deleted numerous files. My usage % has decreased, but this is NOT reflected in my GB report (upper right) - it remains the same. Am I missing some phantom trash bin? If not, will Galaxy recalculate my GB usage so that I may proceed with my project? Thank you, Mary Anne

-- Glen L. Beane Senior Software Engineer The Jackson Laboratory (207) 288-6153
Re: [galaxy-dev] synchronous data depositing
Hi James,

Can you let us know which revision of Galaxy you are using (hg head) and any log output that appears when accessing or running the tool? Also, the list of parameters and values that are being POSTed to Galaxy and a copy of your tool .xml file would be useful. Thanks for using Galaxy,

Dan

On Dec 1, 2011, at 5:08 AM, James Ireland wrote: Greetings, I've been attempting to return data to Galaxy via the synchronous data depositing protocol.
Re: [galaxy-dev] GalaxyCloudman + CADDSuite
Hi Enis, well, I know I do not have to create a new AMI if I want to reuse an instance myself. However, I would like to share the modified GalaxyCloudman version with other people, and therefore I do have to create an AMI. Ok, I will try to make this work somehow... but I guess there are no immediate clues as to what could have gone wrong? Or do you have any ideas what I should try?

Cheers, Marcel

On 12/1/11 10:45 AM, Enis Afgan wrote: Hi Marcel, It is not necessary to create a new AMI when wanting to customize your cluster. Instead, on the admin interface - after you have modified the file systems - there is an option to persist static file systems (galaxyTools and galaxyIndices). Once the process is completed and you restart the cluster, just continue to use the same AMI. CloudMan will use the new, customized data snapshots at runtime. Let us know how it goes, Enis

-- Marcel Schumann University of Tuebingen Wilhelm Schickard Institute for Computer Science Division for Applied Bioinformatics Room C313, Sand 14, D-72076 Tuebingen phone: +49 (0)7071-29 70437 fax: +49 (0)7071-29 5152 email: schum...@informatik.uni-tuebingen.de
Re: [galaxy-dev] GATK / R local install configuration
Hi,

Any chance someone could give me a hint on this issue? Thanks in advance, Carlos

On Tue, Nov 29, 2011 at 2:02 PM, Carlos Borroto carlos.borr...@gmail.com wrote: Hi, I'm testing the GATK pipeline and I ran into a problem with the Variant Recalibrator tool. It seems I don't have R and GATK correctly configured on my instance, as this tool is failing with this error: mv: /Volumes/Data/Users/cjavier/galaxy_central/database/files/000/dataset_393.dat.pdf: No such file or directory I see this PDF is built with R, and I also see this in the log file: INFO 17:11:44,260 VariantRecalibrator - Executing: Rscript /Volumes/Data/Users/cjavier/galaxy_central/database/files/000/dataset_393.dat WARN 17:11:48,407 RScriptExecutor - RScript exited with 1. Run with -l DEBUG for more info. On the test server this tool does work: http://test.g2.bx.psu.edu/u/cjav/h/variant-recalibrator---tutorial - at least after making sure the input VCF file has an annotations string matching the one you want to select in Galaxy. Any help on what needs to be done to get this configuration right would be appreciated. BTW, I already have R and Rpy correctly configured, and I can run tools like Statistics/Summary Statistics. Thanks! Carlos
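[Editor's note] The log above shows Rscript exiting with status 1 before the PDF is written, which is why the subsequent mv fails. A generic first diagnostic is to confirm which Rscript the Galaxy process would find on its PATH and whether it can run at all; the sketch below is such a preflight check, not part of Galaxy or GATK.

```python
# Quick preflight for the GATK/R problem: confirm which Rscript the
# Galaxy process would find on its PATH, and whether it runs at all.
# Generic diagnostic sketch only; it does not reproduce GATK's R script.
import shutil
import subprocess

def check_rscript():
    path = shutil.which("Rscript")
    if path is None:
        return "Rscript not found on PATH"
    # Run a trivial expression to see whether Rscript itself works.
    result = subprocess.run([path, "-e", "cat('ok')"],
                            capture_output=True, text=True)
    if result.returncode != 0:
        return "Rscript failed: " + result.stderr.strip()
    return "Rscript OK at " + path

print(check_rscript())
```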
Re: [galaxy-dev] Local installation of the tool shed
Hello Louise-Amelie,

On Dec 1, 2011, at 4:08 AM, Louise-Amélie Schmitt wrote: Yes, this fixes the problem, it should work fine now :) Thanks a lot! Hehe, next issues:

1) When I have my repo created and I click on 'Upload files to repository', I get a Not Found error in the browser: Not Found - The requested URL /toolshed/upload/upload was not found on this server.

This is probably due to your Apache rewrite rule: RewriteRule ^/toolshed/upload/(.*) /home/galaxy/galaxy-dev/static/automated_upload/$1 [L] Is this something proprietary you have set up on your local Galaxy instance? Try removing it from the rewrite rules for your local tool shed.

2) When we try to access our local tool shed from our Galaxy instance, it appears in the Accessible tool sheds list along with your two public repos, but when we click on it, the Valid repositories list is empty. Is this a bug, or does a repo have to actually contain files so it can appear in this list?

The list of valid repositories will only include repositories that have content that is valid for a Galaxy instance. These repositories are defined as either valid or, in some cases, downloadable (I'm working to replace the latter with the former). See the following sections of the tool shed wiki for details: http://wiki.g2.bx.psu.edu/Tool%20Shed#Repository_revisions:_downloadable_tool_versions http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_installation_of_Galaxy_tool_shed_repository_tools_into_a_local_Galaxy_instance

3) Where should the hgweb.config be?

This file should be left in the Galaxy root install directory.

Actually, we have absolute paths, since in the community_wsgi.ini we set the file_path option to an absolute path.

I advise against this - I'm fairly certain it will pose problems at some point. Tool shed paths should be relative.
Here are some example entries in one of my local tool sheds:

[paths]
repos/test/filter = database/community_files/000/repo_1
repos/test/workflow_with_tools = database/community_files/000/repo_2
repos/test/heteroplasmy_workflow = database/community_files/000/repo_3

Thanks for your patience! L-A

Greg Von Kuster Galaxy Development Team g...@bx.psu.edu
Re: [galaxy-dev] synchronous data depositing
Hi Dan,

Thanks for the quick response! I have a feeling this is something silly that I'm missing. As a reality check, I simply created a copy of the biomart.xml tool file and called it test_me.xml. I changed the tool id and GALAXY_URL in the xml file, as you'll see, and added a link in my tool_config. Selecting this tool from the Get Data folder brings me to BioMart as expected. When I hit go, I have the same issue of just being redirected to the welcome page. The server output is shown below. The unchanged BioMart tool continues to work fine.

Galaxy server output: 127.0.0.1 - - [01/Dec/2011:08:08:27 -0700] "POST /tool_runner/test_me?type=text&name=Homo%20sapiens%20genes%20(GRCh37.p5)&URL=http://www.biomart.org/biomart/martview/534024a0e03f1befd52ca87ba741b6c9?do_export=1&resultsButton=1 HTTP/1.1" 302 - "http://www.biomart.org/biomart/martview/534024a0e03f1befd52ca87ba741b6c9" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:8.0.1) Gecko/20100101 Firefox/8.0.1"

Here's my Galaxy revision info: changeset: 6056:338ead4737ba tag: tip user: Nate Coraor n...@bx.psu.edu date: Thu Sep 29 16:45:19 2011 -0400

Thanks again, -James P.S. It's my pleasure to use Galaxy! ;)

On Thu, Dec 1, 2011 at 7:06 AM, Daniel Blankenberg d...@bx.psu.edu wrote: Hi James, Can you let us know which revision of Galaxy you are using (hg head) and any log output that appears when accessing or running the tool? Also the list of parameters and values that are being POSTed to Galaxy and a copy of your tool.xml file would be useful. Thanks for using Galaxy, Dan

On Dec 1, 2011, at 5:08 AM, James Ireland wrote: Greetings, I've been attempting to return data to Galaxy via the synchronous data depositing protocol. Using BioMart, the UCSC Table Browser, etc. as examples in the data_source tools directory, I've been able to get the initial GET request to my site just fine. However, when I POST back to Galaxy I immediately get a redirect to the welcome page and Galaxy never resubmits back to my site.
I was wondering if there is more to the protocol than is covered here: http://wiki.g2.bx.psu.edu/Admin/Internals/Data%20Sources or perhaps configuration I need to perform on my local Galaxy installation to correctly handle the POSTs back to tool_runner? Also, are there any code examples I should be looking at? Thanks for your help! -James -- J Ireland www.5amsolutions.com | Software for Life(TM) m: 415 484-DATA (3282) ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/ -- J Ireland www.5amsolutions.com | Software for Life(TM) m: 415 484-DATA (3282)

<?xml version="1.0"?>
<!--
    If the value of 'URL_method' is 'get', the request will consist of the value of 'URL' coming back in the initial response. If the value of 'URL_method' is 'post', any additional params coming back in the initial response (in addition to 'URL') will be encoded and appended to URL and a post will be performed.
    TODO: Hack to get biomart to work - the 'add_to_URL' param can be eliminated when the Biomart team encodes URL prior to sending; meanwhile everything including and beyond the first '&' is truncated from URL. They said they'll let us know when this is fixed at their end.
-->
<tool name="BioMart" id="test_me" tool_type="data_source" version="1.0.1">
    <description>Central server</description>
    <command interpreter="python">data_source.py $output $__app__.config.output_size_limit</command>
    <inputs action="http://www.biomart.org/biomart/martview" check_values="false" method="get" target="_top">
        <display>go to BioMart Central $GALAXY_URL</display>
        <param name="GALAXY_URL" type="baseurl" value="/tool_runner/test_me" />
    </inputs>
    <request_param_translation>
        <request_param galaxy_name="URL" remote_name="URL" missing="">
            <append_param separator="&amp;" first_separator="?" join="=">
                <value name="_export" missing="1" />
                <value name="GALAXY_URL" missing="0" />
            </append_param>
        </request_param>
        <request_param galaxy_name="data_type" remote_name="exportView_outputformat" missing="tabular">
            <value_translation>
                <value galaxy_value="tabular" remote_value="TSV" />
            </value_translation>
        </request_param>
        <request_param galaxy_name="URL_method" remote_name="URL_method" missing="get" />
        <request_param galaxy_name="dbkey" remote_name="dbkey" missing="?" />
        <request_param galaxy_name="organism" remote_name="organism" missing="" />
        <request_param galaxy_name="table" remote_name="table" missing="" />
        <request_param galaxy_name="description" remote_name="description" missing="" />
        <request_param galaxy_name="name" remote_name="name" missing="Biomart query" />
        <request_param galaxy_name="info
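For readers following the append_param translation in the tool XML above, its effect can be sketched in a few lines of Python. The function name and simplified logic are mine, not Galaxy's actual implementation:

```python
def append_params(url, params, first_separator="?", separator="&", join="="):
    # Mimics the <append_param> translation in the tool XML above: extra
    # key=value pairs are tacked onto the remote URL, using '?' as the
    # first separator only when the URL has no query string yet.
    sep = first_separator if "?" not in url else separator
    pairs = separator.join(name + join + str(value) for name, value in params)
    return url + sep + pairs
```

For example, appending the `_export` and `GALAXY_URL` defaults to the bare martview URL yields `...martview?_export=1&GALAXY_URL=0`.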
Re: [galaxy-dev] synchronous data depositing
Hi James, For legacy reasons, Biomart uses a special-cased GALAXY_URL parameter, which is likely the source of the problem. If you remove the '<param name="GALAXY_URL" ... />' input parameter, restart Galaxy, and reload the Galaxy interface, does it then work correctly? Thanks for using Galaxy, Dan On Dec 1, 2011, at 11:18 AM, James Ireland wrote: Hi Dan, Thanks for the quick response! I have a feeling this is something silly that I'm missing. As a reality check I simply created a copy of the biomart.xml tool.xml file and called it test_me.xml. I changed the tool_id and GALAXY_URL in the xml file as you'll see and added a link in my tool_config. Selecting this tool from the Get Data folder brings me to Biomart as expected. When I hit go, I have the same issue of just being redirected to the welcome page. The server output is shown below. The unchanged biomart tool continues to work fine. Galaxy server output: 127.0.0.1 - - [01/Dec/2011:08:08:27 -0700] "POST /tool_runner/test_me?type=text&name=Homo%20sapiens%20genes%20(GRCh37.p5)&URL=http://www.biomart.org/biomart/martview/534024a0e03f1befd52ca87ba741b6c9?do_export=1&resultsButton=1 HTTP/1.1" 302 - "http://www.biomart.org/biomart/martview/534024a0e03f1befd52ca87ba741b6c9" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:8.0.1) Gecko/20100101 Firefox/8.0.1" Here's my Galaxy revision info: changeset: 6056:338ead4737ba tag: tip user: Nate Coraor n...@bx.psu.edu date: Thu Sep 29 16:45:19 2011 -0400 Thanks again, -James P.S. It's my pleasure to use Galaxy! ;) On Thu, Dec 1, 2011 at 7:06 AM, Daniel Blankenberg d...@bx.psu.edu wrote: Hi James, Can you let us know which revision of Galaxy you are using (hg head) and any log output that appears when accessing or running the tool? Also, the list of parameters and values that are being POSTed to Galaxy and a copy of your tool.xml file would be useful. 
Thanks for using Galaxy, Dan On Dec 1, 2011, at 5:08 AM, James Ireland wrote: Greetings, I've been attempting to return data to Galaxy via the synchronous data depositing protocol. Using the Biomart, UCSC Table Browser, etc. as examples in the data_source tools directory, I've been able to get the initial GET request to my site just fine. However, when I POST back to Galaxy I immediately get a redirect to the welcome page and Galaxy never resubmits back to my site. I was wondering if there is more to the protocol than is covered here: http://wiki.g2.bx.psu.edu/Admin/Internals/Data%20Sources or perhaps configuration I need to perform on my local Galaxy installation to correctly handle the POSTs back to tool_runner? Also, are there any code examples I should be looking at? Thanks for your help! -James -- J Ireland www.5amsolutions.com | Software for Life(TM) m: 415 484-DATA (3282) -- J Ireland www.5amsolutions.com | Software for Life(TM) m: 415 484-DATA (3282) test_me.xml
Re: [galaxy-dev] synchronous data depositing
Hi Dan, BTW - is there a particular tool/tool.xml I should focus on for a demonstration of current best practices? Thanks, -James On Thu, Dec 1, 2011 at 8:40 AM, James Ireland jirel...@5amsolutions.com wrote: Hi Dan, Ahhh... ok. I had seen two forms of the url and was wondering what was up. So, simply removing the param creates a poorly formed url: http://127.0.0.1:8080/tool_runner?tool_id=test_me?type=text&name=Homo%20sapiens%20genes%20%28GRCh37.p5%29&URL=http://www.biomart.org/biomart/martview/1a5fc0bddc9f81dba837a1b1a5063691?do_export=1&resultsButton=1 Which causes much wailing and gnashing of teeth... Tool 'test_me?type=text' does not exist, kwd={'hsapiens_gene_ensembl__feature_page__attribute.ensembl_gene_id': u'on', 'hsapiens_gene_ensembl__filter.go_parent_name': u'', 'hsapiens_gene_ensembl__filter.encode_region': u'5:131256415:132256414', 'hsapiens_gene_ensembl__filtergroup.gene__visibility': u'show', 'hsapiens_gene_ensembl__filter.start': u'1', 'hsapiens_gene_ensembl__filter.marker_end': u'', 'hsapiens_gene_ensembl__filter.id_list_limit_filters': u'ensembl_gene_id', 'hsapiens_gene_ensembl__filter.type': u'manual_picks', 'export_subset': u'10', 'hsapiens_gene_ensembl__filter.somatic_variation_source': u'COSMIC', 'defaulthsapiens_gene_ensembl__homologs__attributelist': [u'hsapiens_gene_ensembl__homologs__attribute.ensembl_gene_id', u'hsapiens_gene_ensembl__homologs__attribute.ensembl_transcript_id'], 'hsapiens_gene_ensembl__transcript_event__attribute.ensembl_gene_id': u'on', 'hsapiens_gene_ensembl__snp__attribute.ensembl_gene_id': u'on', 'hsapiens_gene_ensembl__filtergroup.protein__visibility': u'hide', 'exportView_outputformat': u'TSV', 'export_saveto': u'text', 'export_dataset': u'0', 'URL': u'http://www.biomart.org/biomart/martview/1a5fc0bddc9f81dba837a1b1a5063691?do_export=1', 'defaulthsapiens_gene_ensembl__transcript_event__attributelist': [u'hsapiens_gene_ensembl__transcript_event__attribute.ensembl_gene_id', 
u'hsapiens_gene_ensembl__transcript_event__attribute.ensembl_transcript_id'], 'hsapiens_gene_ensembl__filter.transcript_status': u'KNOWN', 'hsapiens_gene_ensembl__filter.with_transmembrane_domain': u'only', 'menuNumber': u'0', 'hsapiens_gene_ensembl__filter.protein_fam_id_b... etc, etc If I correct the URL by hand (change the second ? to &) and resubmit, that seems to work! I'll try using this url form on my own page now. I'm surprised that the legacy url works until I change the tool name! If you hear no more from me, you can assume it worked. Thanks! -James On Thu, Dec 1, 2011 at 8:25 AM, Daniel Blankenberg d...@bx.psu.edu wrote: Hi James, For legacy reasons, Biomart uses a special-cased GALAXY_URL parameter, which is likely the source of the problem. If you remove the '<param name="GALAXY_URL" ... />' input parameter, restart Galaxy and reload the Galaxy interface, does it then work correctly? Thanks for using Galaxy, Dan On Dec 1, 2011, at 11:18 AM, James Ireland wrote: Hi Dan, Thanks for the quick response! I have a feeling this is something silly that I'm missing. As a reality check I simply created a copy of the biomart.xml tool.xml file and called it test_me.xml. I changed the tool_id and GALAXY_URL in the xml file as you'll see and added a link in my tool_config. Selecting this tool from the Get Data folder brings me to Biomart as expected. When I hit go, I have the same issue of just being redirected to the welcome page. The server output is shown below. The unchanged biomart tool continues to work fine. 
Galaxy server output: 127.0.0.1 - - [01/Dec/2011:08:08:27 -0700] "POST /tool_runner/test_me?type=text&name=Homo%20sapiens%20genes%20(GRCh37.p5)&URL=http://www.biomart.org/biomart/martview/534024a0e03f1befd52ca87ba741b6c9?do_export=1&resultsButton=1 HTTP/1.1" 302 - "http://www.biomart.org/biomart/martview/534024a0e03f1befd52ca87ba741b6c9" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:8.0.1) Gecko/20100101 Firefox/8.0.1" Here's my Galaxy revision info: changeset: 6056:338ead4737ba tag: tip user: Nate Coraor n...@bx.psu.edu date: Thu Sep 29 16:45:19 2011 -0400 Thanks again, -James P.S. It's my pleasure to use Galaxy! ;) On Thu, Dec 1, 2011 at 7:06 AM, Daniel Blankenberg d...@bx.psu.edu wrote: Hi James, Can you let us know which revision of Galaxy you are using (hg head) and any log output that appears when accessing or running the tool? Also, the list of parameters and values that are being POSTed to Galaxy and a copy of your tool.xml file would be useful. Thanks for using Galaxy, Dan On Dec 1, 2011, at 5:08 AM, James Ireland wrote: Greetings, I've been attempting to return data to Galaxy via the synchronous data depositing protocol. Using the Biomart, UCSC Table Browser, etc. as examples in the data_source tools directory, I've been able to get the initial GET request to my site just fine. However, when I POST back
Re: [galaxy-dev] synchronous data depositing
Hi James, tools/data_source/yeastmine.xml is a good example of a relatively simple configuration. Ideally, the <request_param_translation/> tagset would not be required if a more specific data_type parameter value was being provided by the external site. Thanks for using Galaxy, Dan On Dec 1, 2011, at 12:05 PM, James Ireland wrote: Hi Dan, BTW - is there a particular tool/tool.xml I should focus on for a demonstration of current best practices? Thanks, -James On Thu, Dec 1, 2011 at 8:40 AM, James Ireland jirel...@5amsolutions.com wrote: Hi Dan, Ahhh... ok. I had seen two forms of the url and was wondering what was up. So, simply removing the param creates a poorly formed url: http://127.0.0.1:8080/tool_runner?tool_id=test_me?type=text&name=Homo%20sapiens%20genes%20%28GRCh37.p5%29&URL=http://www.biomart.org/biomart/martview/1a5fc0bddc9f81dba837a1b1a5063691?do_export=1&resultsButton=1 Which causes much wailing and gnashing of teeth... Tool 'test_me?type=text' does not exist, kwd={'hsapiens_gene_ensembl__feature_page__attribute.ensembl_gene_id': u'on', 'hsapiens_gene_ensembl__filter.go_parent_name': u'', 'hsapiens_gene_ensembl__filter.encode_region': u'5:131256415:132256414', 'hsapiens_gene_ensembl__filtergroup.gene__visibility': u'show', 'hsapiens_gene_ensembl__filter.start': u'1', 'hsapiens_gene_ensembl__filter.marker_end': u'', 'hsapiens_gene_ensembl__filter.id_list_limit_filters': u'ensembl_gene_id', 'hsapiens_gene_ensembl__filter.type': u'manual_picks', 'export_subset': u'10', 'hsapiens_gene_ensembl__filter.somatic_variation_source': u'COSMIC', 'defaulthsapiens_gene_ensembl__homologs__attributelist': [u'hsapiens_gene_ensembl__homologs__attribute.ensembl_gene_id', u'hsapiens_gene_ensembl__homologs__attribute.ensembl_transcript_id'], 'hsapiens_gene_ensembl__transcript_event__attribute.ensembl_gene_id': u'on', 'hsapiens_gene_ensembl__snp__attribute.ensembl_gene_id': u'on', 'hsapiens_gene_ensembl__filtergroup.protein__visibility': u'hide', 'exportView_outputformat': 
u'TSV', 'export_saveto': u'text', 'export_dataset': u'0', 'URL': u'http://www.biomart.org/biomart/martview/1a5fc0bddc9f81dba837a1b1a5063691?do_export=1', 'defaulthsapiens_gene_ensembl__transcript_event__attributelist': [u'hsapiens_gene_ensembl__transcript_event__attribute.ensembl_gene_id', u'hsapiens_gene_ensembl__transcript_event__attribute.ensembl_transcript_id'], 'hsapiens_gene_ensembl__filter.transcript_status': u'KNOWN', 'hsapiens_gene_ensembl__filter.with_transmembrane_domain': u'only', 'menuNumber': u'0', 'hsapiens_gene_ensembl__filter.protein_fam_id_b... etc, etc If I correct the URL by hand (change the second ? to &) and resubmit, that seems to work! I'll try using this url form on my own page now. I'm surprised that the legacy url works until I change the tool name! If you hear no more from me, you can assume it worked. Thanks! -James On Thu, Dec 1, 2011 at 8:25 AM, Daniel Blankenberg d...@bx.psu.edu wrote: Hi James, For legacy reasons, Biomart uses a special-cased GALAXY_URL parameter, which is likely the source of the problem. If you remove the '<param name="GALAXY_URL" ... />' input parameter, restart Galaxy and reload the Galaxy interface, does it then work correctly? Thanks for using Galaxy, Dan On Dec 1, 2011, at 11:18 AM, James Ireland wrote: Hi Dan, Thanks for the quick response! I have a feeling this is something silly that I'm missing. As a reality check I simply created a copy of the biomart.xml tool.xml file and called it test_me.xml. I changed the tool_id and GALAXY_URL in the xml file as you'll see and added a link in my tool_config. Selecting this tool from the Get Data folder brings me to Biomart as expected. When I hit go, I have the same issue of just being redirected to the welcome page. The server output is shown below. The unchanged biomart tool continues to work fine. 
Galaxy server output: 127.0.0.1 - - [01/Dec/2011:08:08:27 -0700] "POST /tool_runner/test_me?type=text&name=Homo%20sapiens%20genes%20(GRCh37.p5)&URL=http://www.biomart.org/biomart/martview/534024a0e03f1befd52ca87ba741b6c9?do_export=1&resultsButton=1 HTTP/1.1" 302 - "http://www.biomart.org/biomart/martview/534024a0e03f1befd52ca87ba741b6c9" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:8.0.1) Gecko/20100101 Firefox/8.0.1" Here's my Galaxy revision info: changeset: 6056:338ead4737ba tag: tip user: Nate Coraor n...@bx.psu.edu date: Thu Sep 29 16:45:19 2011 -0400 Thanks again, -James P.S. It's my pleasure to use Galaxy! ;) On Thu, Dec 1, 2011 at 7:06 AM, Daniel Blankenberg d...@bx.psu.edu wrote: Hi James, Can you let us know which revision of Galaxy you are using (hg head) and any log output that appears when accessing or running the tool? Also, the list of parameters and values that are being POSTed to Galaxy and a copy of
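The manual fix James describes in this thread, turning the second '?' in the malformed URL into '&', can be sketched in a few lines of Python. The helper name is mine, not anything from the Galaxy codebase:

```python
def fix_second_question_mark(url):
    # The malformed URL contains two '?' characters: the real query
    # separator and a second one that should have been '&'. Replace only
    # the second; any later '?' (e.g. inside the embedded URL parameter's
    # value) is left untouched.
    first = url.find("?")
    second = url.find("?", first + 1)
    if first == -1 or second == -1:
        return url
    return url[:second] + "&" + url[second + 1:]
```

Applied to a URL like `http://host/tool_runner?tool_id=test_me?type=text`, this yields `http://host/tool_runner?tool_id=test_me&type=text` while leaving the `?do_export=1` inside the URL parameter's value alone.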
Re: [galaxy-dev] Job output not returned from cluster
On Nov 29, 2011, at 9:22 PM, Fields, Christopher J wrote: On Nov 29, 2011, at 3:13 AM, Peter Cock wrote: On Monday, November 28, 2011, Joseph Hargitai joseph.hargi...@einstein.yu.edu wrote: Ed, we had the classic goof on our cluster with this. 4 nodes could not see the /home/galaxy folder due to a missing entry in /etc/fstab. When the jobs hit those nodes (which explains the randomness) we got the error message. Bothersome was the lack of good logs to go on. The error message was too generic - however, I discovered that Galaxy was depositing the error and out messages in the /pbs folder and you could briefly read them before they got deleted. There the message was the classic SGE input/output message - /home/galaxy file not found. Hence my follow-up question - how can I have Galaxy NOT delete these SGE error and out files? best, joe Better yet, Galaxy should read the SGE .o and .e files and record their contents as it would for a directly executed tool's stdout and stderr. Peter ...or at least have the option to do so, maybe a level of verbosity. I have been bitten by lack of stderr output myself, where having it might have saved some manual debugging. Unless I'm misunderstanding, this is what Galaxy already does. stdout/stderr up to 32K are read from .o and .e and stored in job.stdout/job.stderr. We do need to just store them as files and make them accessible for each tool run; this will hopefully happen sometime soonish. --nate chris
Re: [galaxy-dev] proxy settings?
On Nov 29, 2011, at 9:35 PM, Smithies, Russell wrote: Found the cure – just required adding urllib2.ProxyHandler in the data_source tools. Why doesn’t Galaxy pick up the system http_proxy variables? Hi Russell, Thanks for tracking down the problem. Could you send a patch for this? --nate --Russell Smithies From: galaxy-dev-boun...@lists.bx.psu.edu [mailto:galaxy-dev-boun...@lists.bx.psu.edu] On Behalf Of Smithies, Russell Sent: Wednesday, 30 November 2011 9:09 a.m. To: galaxy-dev@lists.bx.psu.edu Subject: [galaxy-dev] proxy settings? I’m new to Galaxy so I’m not sure if this is a Galaxy or linux/apache question. When I try to “Get Data” from UCSC or any other external site, I get a 407 error from our proxy as I need to authenticate. Is the request going out as the ‘galaxy’ user or ‘apache’ or the user that’s logged in? I already have http_proxy and ftp_proxy configured in /etc/profile (we’re running Centos 6) but I assume there is a correct place to configure this for Galaxy? The error message I’m seeing is: An error occurred running this job: The remote data source application may be off line, please try again later. Error: ('http error', 407, 'Proxy Access Denied', httplib.HTTPMessage instance at 0x35d2998) Any ideas? Thanx, Russell Smithies Attention: The information contained in this message and/or attachments from AgResearch Limited is intended only for the persons or entities to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipients is prohibited by AgResearch Limited. If you have received this message in error, please notify the sender immediately.
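For anyone hitting the same 407, the fix Russell describes, adding a ProxyHandler in the data_source tools, looks roughly like this. The Galaxy of this era ran on Python 2 and used urllib2; the sketch below uses the Python 3 equivalent, urllib.request, and the function name is mine:

```python
import os
import urllib.request  # 'urllib2' in the Python 2 Galaxy of this era

def build_proxy_opener():
    # Collect the system proxy settings that Galaxy was not picking up.
    proxies = {scheme: os.environ[var]
               for scheme, var in (("http", "http_proxy"), ("ftp", "ftp_proxy"))
               if os.environ.get(var)}
    # With an explicit mapping the handler uses it; ProxyHandler() with no
    # argument would read the *_proxy environment variables itself.
    handler = urllib.request.ProxyHandler(proxies)
    return urllib.request.build_opener(handler)
```

Requests made through the returned opener are routed via the configured proxy, which is the point where proxy authentication can then be supplied.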
[galaxy-dev] dataset (file) visualization
Hi, I'm new to Galaxy, but I've made several installations in different distributions. At this moment I'm using RHEL 6.1 and I'm not able to visualize certain files. After uploading a SAM file I click on the file title and a preview of a few lines appears with the rest of the available options. When I click on the eye button nothing happens. If I try to download it, I receive an empty file. This behaviour occurs with other file extensions (BAM, HTML), but not with others (FASTA, FASTQ). Using wget: wget "http://10.0.0.2:8090/galaxy/datasets/5969b1f7201f12ae/display/?preview=True" --2011-12-01 18:18:56-- http://10.0.0.2:8090/galaxy/datasets/5969b1f7201f12ae/display/?preview=True Connecting to 10.0.0.2:8090... connected. HTTP request sent, awaiting response... 10.0.0.2 - - [01/Dec/2011:18:18:56 +] GET /galaxy/datasets/5969b1f7201f12ae/display/?preview=True HTTP/1.0 200 - - Wget/1.12 (linux-gnu) 200 OK Length: unspecified [text/plain] Saving to: `index.html?preview=True.3' Finally index.html?preview=True.3 is an empty file, and obviously, no error is displayed with any of the debugging options. I've done the same tests in other distributions and apparently everything seems right. Can any one of you shed some light on this unexplainable black hole? -- = Alfonso Núñez Salgado Unidad de Bioinformática Centro de Biologia Molecular Severo Ochoa C/Nicolás Cabrera 1 Universidad Autónoma de Madrid Cantoblanco, 28049 Madrid (Spain) Phone: (34) 91-196-4633 Fax: (34) 91-196-4420 web: http://ub.cbm.uam.es/ =
Re: [galaxy-dev] synchronous data depositing
Sorry - this truly is the last question. Besides my question above on the expected response type from my page, how do I get Galaxy to skip the intermediate Execute button before the resubmission? Sorry for all the questions and thanks for your help. -James On Thu, Dec 1, 2011 at 10:11 AM, James Ireland jirel...@5amsolutions.com wrote: Excellent! Thanks, Dan. I'm now getting the post resubmits from Galaxy. Works great! Last question - can you provide any info on the expected http response type from my page back to Galaxy after the resubmit? Returning a standard html response or text/csv doesn't seem to be cutting it. On Thu, Dec 1, 2011 at 9:13 AM, Daniel Blankenberg d...@bx.psu.edu wrote: Hi James, tools/data_source/yeastmine.xml is a good example of a relatively simple configuration. Ideally, the <request_param_translation/> tagset would not be required if a more specific data_type parameter value was being provided by the external site. Thanks for using Galaxy, Dan On Dec 1, 2011, at 12:05 PM, James Ireland wrote: Hi Dan, BTW - is there a particular tool/tool.xml I should focus on for a demonstration of current best practices? Thanks, -James On Thu, Dec 1, 2011 at 8:40 AM, James Ireland jirel...@5amsolutions.com wrote: Hi Dan, Ahhh... ok. I had seen two forms of the url and was wondering what was up. So, simply removing the param creates a poorly formed url: http://127.0.0.1:8080/tool_runner?tool_id=test_me?type=text&name=Homo%20sapiens%20genes%20%28GRCh37.p5%29&URL=http://www.biomart.org/biomart/martview/1a5fc0bddc9f81dba837a1b1a5063691?do_export=1&resultsButton=1 Which causes much wailing and gnashing of teeth... 
Tool 'test_me?type=text' does not exist, kwd={'hsapiens_gene_ensembl__feature_page__attribute.ensembl_gene_id': u'on', 'hsapiens_gene_ensembl__filter.go_parent_name': u'', 'hsapiens_gene_ensembl__filter.encode_region': u'5:131256415:132256414', 'hsapiens_gene_ensembl__filtergroup.gene__visibility': u'show', 'hsapiens_gene_ensembl__filter.start': u'1', 'hsapiens_gene_ensembl__filter.marker_end': u'', 'hsapiens_gene_ensembl__filter.id_list_limit_filters': u'ensembl_gene_id', 'hsapiens_gene_ensembl__filter.type': u'manual_picks', 'export_subset': u'10', 'hsapiens_gene_ensembl__filter.somatic_variation_source': u'COSMIC', 'defaulthsapiens_gene_ensembl__homologs__attributelist': [u'hsapiens_gene_ensembl__homologs__attribute.ensembl_gene_id', u'hsapiens_gene_ensembl__homologs__attribute.ensembl_transcript_id'], 'hsapiens_gene_ensembl__transcript_event__attribute.ensembl_gene_id': u'on', 'hsapiens_gene_ensembl__snp__attribute.ensembl_gene_id': u'on', 'hsapiens_gene_ensembl__filtergroup.protein__visibility': u'hide', 'exportView_outputformat': u'TSV', 'export_saveto': u'text', 'export_dataset': u'0', 'URL': u'http://www.biomart.org/biomart/martview/1a5fc0bddc9f81dba837a1b1a5063691?do_export=1', 'defaulthsapiens_gene_ensembl__transcript_event__attributelist': [u'hsapiens_gene_ensembl__transcript_event__attribute.ensembl_gene_id', u'hsapiens_gene_ensembl__transcript_event__attribute.ensembl_transcript_id'], 'hsapiens_gene_ensembl__filter.transcript_status': u'KNOWN', 'hsapiens_gene_ensembl__filter.with_transmembrane_domain': u'only', 'menuNumber': u'0', 'hsapiens_gene_ensembl__filter.protein_fam_id_b... etc, etc If I correct the URL by hand (change the second ? to &) and resubmit, that seems to work! I'll try using this url form on my own page now. I'm surprised that the legacy url works until I change the tool name! If you hear no more from me, you can assume it worked. Thanks! 
-James On Thu, Dec 1, 2011 at 8:25 AM, Daniel Blankenberg d...@bx.psu.edu wrote: Hi James, For legacy reasons, Biomart uses a special-cased GALAXY_URL parameter, which is likely the source of the problem. If you remove the '<param name="GALAXY_URL" ... />' input parameter, restart Galaxy and reload the Galaxy interface, does it then work correctly? Thanks for using Galaxy, Dan On Dec 1, 2011, at 11:18 AM, James Ireland wrote: Hi Dan, Thanks for the quick response! I have a feeling this is something silly that I'm missing. As a reality check I simply created a copy of the biomart.xml tool.xml file and called it test_me.xml. I changed the tool_id and GALAXY_URL in the xml file as you'll see and added a link in my tool_config. Selecting this tool from the Get Data folder brings me to Biomart as expected. When I hit go, I have the same issue of just being redirected to the welcome page. The server output is shown below. The unchanged biomart tool continues to work fine. Galaxy server output: 127.0.0.1 - - [01/Dec/2011:08:08:27 -0700] "POST /tool_runner/test_me?type=text&name=Homo%20sapiens%20genes%20(GRCh37.p5)&URL=http://www.biomart.org/biomart/martview/534024a0e03f1befd52ca87ba741b6c9?do_export=1&resultsButton=1 HTTP/1.1" 302 -
Re: [galaxy-dev] GalaxyCloudman + CADDSuite
Marcel; well, I know I do not have to create a new AMI if I want to reuse an instance myself. However, I would like to share the modified GalaxyCloudman version with other people and therefore I do have to create an AMI. What Enis was suggesting is using the share-a-cluster functionality built into CloudMan. This bundles your data volumes as snapshots and prepares a sharable cluster that anyone can initiate. In the CloudMan interface, there is a little green button next to the cluster name that enables this. That is definitely the easiest way to share and distribute modified CloudMan versions. Ok, I will try to make this work somehow ... but I guess there are no immediate clues as to what could have gone wrong? Or do you have any ideas what I should try? After CloudMan boots once, you need to clean up some files before preparing an AMI. This is the automated code we use to clean up for prepping CloudMan compatible AMIs: https://github.com/chapmanb/cloudbiolinux/blob/master/cloudbio/cloudman.py#L87 Be careful if you run that directly. It runs immediately before bundling and removes the ssh keys (so you don't have a backdoor to the AMI you are distributing), so you want to do it as the last thing. It also assumes you have unmounted all of the associated Galaxy data libraries. Hope this helps, Brad
[galaxy-dev] lims integration
I am newish to Galaxy and trying to learn how I might integrate it with our workflows and LIMS for automated data handling. I am aware of the API and have looked up all the documentation that I could find. However, there are many things I cannot make sense of, and have not been able to find information to help me out. I think a good place to start is to ask how to run workflow_execute.py: what each of the parameters is and where to get the information for them. Arguments: *API key - got this and understand *url - got this and understand *workflow_id - I have created workflows and have been able to find what looks to be a workflow_id by clicking on the workflow name and selecting Download or Export. It seems this may be correct, is it? *history - a named history to use? Should this already exist? I have no idea here. *step=src=dataset_id - ??? I have no idea ??? I have seen how to create data libraries manually at the command line; does this factor in? If anyone has information they can help me out with, it would be much appreciated. Thanks Craig Blackhart Computer Scientist Applied Engineering Technologies Los Alamos National Laboratory 505-665-6588 This message contains no information that requires ADC review
Re: [galaxy-dev] lims integration
Hi Chris, unfortunately none of us here have played around with the API yet. I would recommend inquiring on the galaxy-central's mailing list (galaxy-dev@lists.bx.psu.edu). - workflows, histories, libraries, and datasets have IDs in the database but they may be obscured in the URLs used in galaxy; in the db they're just integer primary keys - histories must exist to do any work, but you can create a new history On Thu, Dec 1, 2011 at 4:32 PM, Craig Blackhart blackh...@lanl.gov wrote: I am newish to Galaxy and trying to learn how I might integrate it with our workflows and LIMS for automated data handling. I am aware of the API and have looked up all the documentation that I could find. However, there are many things I cannot make sense of, and have not been able to find information to help me out. I think a good place to start asking questions is with how to run workflow_execute.py and ask what each of the parameters are and where to get the information from them Arguments *API key – got this and understand *url – got this and understand *workflow_id – I have created workflows and have been able to find what looks to be a workflow_id by clicking on the workflow name and selecting “Download or Export”. It seems this may be correct, is it? *history – a named history to use? Should this already exist? I have no idea here. *step=src=dataset_id - ??? I have no idea ??? I have seen how to create data libraries manually at the command line; does this factor in? If anyone has information they can help me out with, it would be much appreciated. Thanks Craig Blackhart Computer Scientist Applied Engineering Technologies Los Alamos National Laboratory 505-665-6588 This message contains no information that requires ADC review
Re: [galaxy-dev] lims integration
Hi Craig, Thanks for your interest in the galaxy API. For the parameters you're uncertain about: The 'workflow_id' is indeed the encoded workflow id. You can get this by encoding it yourself, or doing a GET on /api/workflows for a list of *all* workflows and their encoded id's (see example below). The 'history' is either a name for a new history to be created, or a string of the format hist_id=<encoded history_id> to use, if you have an existing history that you'd like the results to appear in. And the last parameter is a three-part string. The first part is the step that the input should be mapped to, the second part is the *type* of input it is, and the third part is the actual encoded id. The type is going to be either ldda for a library dataset, or hda for a history dataset. All of these encoded id's are discoverable through the API itself. Try using scripts/api/display.py to view /api/workflows and /api/workflows/<encoded_workflow_id> to get a feel for what's available to you. Lastly, an example of how the run_workflow component of the api could be used as a part of an external pipeline can be found in scripts/api/example_watch_folder.py. This script monitors a particular folder for files, uploads them to galaxy, and executes a workflow on them. Hope this helps, and definitely let me know if I can answer any more questions or if you have feedback about the API. -Dannon On Dec 1, 2011, at 7:32 PM, Craig Blackhart wrote: I am newish to Galaxy and trying to learn how I might integrate it with our workflows and LIMS for automated data handling. I am aware of the API and have looked up all the documentation that I could find. However, there are many things I cannot make sense of, and have not been able to find information to help me out. 
I think a good place to start asking questions is with how to run workflow_execute.py and ask what each of the parameters are and where to get the information from them Arguments *API key – got this and understand *url – got this and understand *workflow_id – I have created workflows and have been able to find what looks to be a workflow_id by clicking on the workflow name and selecting “Download or Export”. It seems this may be correct, is it? *history – a named history to use? Should this already exist? I have no idea here. *step=src=dataset_id - ??? I have no idea ??? I have seen how to create data libraries manually at the command line; does this factor in? If anyone has information they can help me out with, it would be much appreciated. Thanks Craig Blackhart Computer Scientist Applied Engineering Technologies Los Alamos National Laboratory 505-665-6588 This message contains no information that requires ADC review
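Putting Dannon's answer together, the request that workflow_execute.py sends to /api/workflows carries a payload shaped roughly like the one built below. The helper function and the example ids are made up for illustration; the key names follow the conventions of the script's era, so check your own Galaxy revision before relying on them:

```python
def build_workflow_payload(workflow_id, history, step_inputs):
    # step_inputs maps a workflow step id to (src, dataset_id), where src
    # is 'ldda' for a library dataset or 'hda' for a history dataset --
    # the three-part step=src=dataset_id argument from the thread above.
    return {
        "workflow_id": workflow_id,   # the *encoded* workflow id
        "history": history,           # a new name, or 'hist_id=<encoded id>'
        "ds_map": {str(step): {"src": src, "id": dataset_id}
                   for step, (src, dataset_id) in step_inputs.items()},
    }
```

For example, `build_workflow_payload("f2db...", "hist_id=df7a...", {21: ("ldda", "33b4...")})` maps workflow step 21 to a library dataset; the encoded ids would come from GETs on /api/workflows and /api/libraries.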
Re: [galaxy-dev] GalaxyCloudman + CADDSuite
Hi Marcel, On Thu, Dec 1, 2011 at 10:35 PM, Marcel Schumann schum...@informatik.uni-tuebingen.de wrote:

Hi Brad, hi Enis, OK, so one possible way should thus be to use the _cleanup_ec2() function before creating an AMI via the Amazon console, and the second option is perhaps this share-a-cluster functionality. About the latter: sorry, I tried to find out what it really does, but did not find any documentation for it. So, how does this share-a-cluster functionality work in principle? If it does not create an AMI, does it just create snapshots of the individual disks that users have to mount (manually), or ...? My point here is: I need to make a modified version of GalaxyCloudman available (i.e., one that contains the CADDSuite tools). I do not just want to share an _instance_ (a running cluster) but a CloudMan image (as an AMI or by some other means) that includes all my tools, so that other users can easily start their own server.

The share-an-instance functionality creates a snapshot of the data volume (/mnt/galaxyData), so any tool that is installed on that file system will get bundled into the shared instance. Also, any modifications that you make to your Galaxy configuration (e.g., galaxy_tools.xml) become part of the shared instance. When a user then starts a new cluster and provides the cluster share string (available on the shared cluster after you share it), they will get access to an exact replica of your cluster. Like I mentioned earlier, unless you are modifying system libraries there is absolutely no reason to create a new AMI. Simply contain your customizations to the file system and use the cluster sharing functionality. Try it and see if it does what you need it to do.

If you do indeed have a description of this somewhere that I just did not find, then I am sorry but would be grateful for a link ;-)

Sorry, but there's no documentation about this feature on the wiki.
However, there is a pretty detailed description of the functionality and the process on the CloudMan console after you click the green share-a-cluster button. I'll work on preparing some additional documentation on this topic. Enis

Cheers, Marcel

On 12/1/11 10:04 PM, Brad Chapman wrote: Marcel;

well, I know I do not have to create a new AMI if I want to reuse an instance myself. However, I would like to share the modified GalaxyCloudman version with other people and therefore I do have to create an AMI.

What Enis was suggesting is using the share-a-cluster functionality built into CloudMan. This bundles your data volumes as snapshots and prepares a sharable cluster that anyone can initiate. In the CloudMan interface, there is a little green button next to the cluster name that enables this. That is definitely the easiest way to share and distribute modified CloudMan versions.

Ok, I will try to make this work somehow ... but I guess there are no immediate clues as to what could have gone wrong? Or do you have any ideas what I should try?

After CloudMan boots once, you need to clean up some files before preparing an AMI. This is the automated code we use to clean up for prepping CloudMan-compatible AMIs: https://github.com/chapmanb/cloudbiolinux/blob/master/cloudbio/cloudman.py#L87

Be careful if you run that directly. It runs immediately before bundling and removes the ssh keys (so you don't have a backdoor to the AMI you are distributing), so you want to do it as the last thing. It also assumes you have unmounted all of the associated Galaxy data libraries.
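Brad's warning about the cleanup step can be made concrete with a small dry-run helper. The path list below is an assumption based on his description (remove ssh keys and other instance-specific state before bundling), not a copy of the cloudbiolinux code; listing first and deleting only on request mirrors his advice to treat key removal as the very last step.

```python
import os

# Hypothetical sketch of pre-bundling AMI cleanup. The paths are assumed
# examples of instance-specific state, not taken from cloudbiolinux.
CLEANUP_PATHS = [
    "root/.ssh/authorized_keys",          # remove LAST: ssh access is gone after this
    "home/ubuntu/.ssh/authorized_keys",
    "root/.bash_history",
    "var/log/cloud-init.log",
]

def cleanup_for_ami(root="/", dry_run=True):
    """Return the cleanup paths that exist under *root*; delete them unless dry_run."""
    found = []
    for rel in CLEANUP_PATHS:
        path = os.path.join(root, rel)
        if os.path.exists(path):
            found.append(path)
            if not dry_run:
                os.remove(path)   # irreversible; only sensible right before bundling
    return found
```

Running with dry_run=True first shows what would be removed; flipping to dry_run=False only makes sense after the Galaxy data volumes are unmounted and immediately before creating the image.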
Hope this helps, Brad

--
Marcel Schumann
University of Tuebingen
Wilhelm Schickard Institute for Computer Science
Division for Applied Bioinformatics
Room C313, Sand 14, D-72076 Tuebingen
phone: +49 (0)7071-29 70437
fax: +49 (0)7071-29 5152
email: schum...@informatik.uni-tuebingen.de