Re: [galaxy-dev] upload from local filesystem
You can allow users to do this by using the Galaxy FTP upload solution. That way users can link or provide files in their per-email subdirectories. You will have to sort out how to set the users/rights properly on your Linux box, though. But see the Galaxy FTP upload documentation pages; they might be of help.

Alex

From: galaxy-dev-boun...@lists.bx.psu.edu [galaxy-dev-boun...@lists.bx.psu.edu] on behalf of J. Greenbaum [jgb...@liai.org]
Sent: Wednesday, 23 May 2012 3:01
To: galaxy-dev@lists.bx.psu.edu
Subject: [galaxy-dev] upload from local filesystem

It seems that this topic has been covered at least several times, and I've found this thread to be helpful: http://user.list.galaxyproject.org/Speed-up-uploading-into-local-Galaxy-terribly-slow-tp4429327p4436053.html However, as far as I understand, this will only allow the admin to do the upload from a local disk. Has anyone had success allowing a regular user to upload from a local disk? This seems like a common issue, particularly with large sequencing files. It would be great to be able to add this as the first step in a workflow. Any info would be appreciated.

Thanks,
Jason

--
Jason Greenbaum, Ph.D.
Manager, Bioinformatics Core | jgb...@liai.org
La Jolla Institute for Allergy and Immunology

___
Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
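For reference, the FTP upload path Alex mentions is configured in universe_wsgi.ini. A minimal sketch, assuming 2012-era option names (adjust the directory and hostname for your site):

```ini
# universe_wsgi.ini -- enable "Upload via FTP" in the upload tool.
# Directory containing one subdirectory per user, named after the
# user's email address; Galaxy picks up files dropped in there.
ftp_upload_dir = /data/galaxy/ftp
# Hostname shown to users in the upload form
ftp_upload_site = ftp.example.org
```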
[galaxy-dev] LateValidationError in Extract features
Hi there

I've got a workflow that uses BioPerl's bp_genbank2gff3 to convert GenBank to GFF3, then hands the GFF3 to Extract features to filter to only genes, before moving on. The workflow JSON is at http://pastebin.com/zHWsC6YT. Step 4, the Genbank2GFF step, runs fine, and if I view the output in the history I can see the resulting GFF, but step 8, the Extract features step, fails with a LateValidationError, as follows:

Traceback (most recent call last):
  File "/net/datasrv3.sanbi.ac.za/datastore/cip0/software/galaxy-dev/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py", line 144, in queue_job
    job_wrapper.prepare()
  File "/net/datasrv3.sanbi.ac.za/datastore/cip0/software/galaxy-dev/galaxy-dist/lib/galaxy/jobs/__init__.py", line 130, in prepare
    self.tool.handle_unvalidated_param_values( incoming, self.app )
  File "/net/datasrv3.sanbi.ac.za/datastore/cip0/software/galaxy-dev/galaxy-dist/lib/galaxy/tools/__init__.py", line 1872, in handle_unvalidated_param_values
    self.handle_unvalidated_param_values_helper( self.inputs, input_values, app )
  File "/net/datasrv3.sanbi.ac.za/datastore/cip0/software/galaxy-dev/galaxy-dist/lib/galaxy/tools/__init__.py", line 1890, in handle_unvalidated_param_values_helper
    self.handle_unvalidated_param_values_helper( input.cases[current].inputs, values, app, context, prefix )
  File "/net/datasrv3.sanbi.ac.za/datastore/cip0/software/galaxy-dev/galaxy-dist/lib/galaxy/tools/__init__.py", line 1912, in handle_unvalidated_param_values_helper
    raise LateValidationError( message )
LateValidationError

This Galaxy uses Postgres and SGE, with galaxy-dist as of last night. Another strange thing is that while I selected that the output of step 4 should be visible, it is, in fact, hidden in the new history. How do I debug this? How can I see which input is not validating?

Thanks,
Peter
[galaxy-dev] Avoid to download output files
Hi,

I don't want to download the output files which Galaxy generates, because I don't want to store them twice on my disk. I saw that the output files are in file_path, and I found them there. But they have a .dat extension! In your mailing-list archive I haven't found a solution to transform these extensions. Is it possible to change this with a simple change in a configuration file or in a script file? Or must I write my own script, and if yes, how can I configure Galaxy so that it launches it automatically after an application like MACS, for example, which has a tool configuration file (macs.xml) in the tools directory? I've tried to add a line like this in that file:

<command interpreter="perl">mon_script.pl</command>

I placed it after the existing <command> line which launches the MACS script, but my script does not seem to run.

Thank you.
julie
Re: [galaxy-dev] Pushing changes upstream to Galaxy
On May 19, 2012, at 10:09 AM, Peter van Heusden wrote:

Hi there

As I adapt Galaxy to the needs of the particular workflows I'm implementing, I invariably end up tweaking the existing code, especially the tool definitions and the datatypes config file. These tweaks are often minor, e.g. adding an extra parameter to Ross Lazarus's ClustalW wrapper, or changing the way a binary is called (e.g. clustalw2 on Ubuntu is installed as clustalw; BLAST+ is installed single-threaded, so the -num_threads option in ncbi_blastn_wrapper.xml doesn't work). But: a) sometimes they make sense to share with the rest of the Galaxy community; and b) local changes make it trickier to keep in sync with upstream changes to Galaxy. What's the best procedure for proposing changes to upstream Galaxy? Posting patches to galaxy-dev? Is there some Mercurial magic that can help?

Thanks,
Peter

Hi Peter,

The best way is to fork Galaxy and use a pull request in Bitbucket.

Thanks!
--nate
Re: [galaxy-dev] cannot access conditional parameter in command
Hi,

I think I found the solution to my problem by googling it instead of just searching through the dev mailing list. I found this post very useful for debugging tips on Cheetah templates: http://lists.bx.psu.edu/pipermail/galaxy-dev/2011-March/004817.html

Then I read how to use Data Tables on the Galaxy wiki, http://wiki.g2.bx.psu.edu/Admin/Tools/Data%20Tables, and found this: "The script or binary will need the actual path in the command, and since the ID is the value, some work will need to be done to extract the path. The following line would do the trick:

--ref=${ filter( lambda x: str( x[0] ) == str( $index ), $__app__.tool_data_tables[ 'bowtie_indexes' ].get_fields() )[0][-1] }

In this line, str( x[0] ) refers to the unique ID in column 0 and the -1 in [0][-1] gets the path in the last column."

I changed the command to print all available parameters:

<command interpreter="sh">
  #silent sys.stderr.write( "Cheetah Template Variables\n" )
  #for k, v in $searchList[2].items()
    #silent sys.stderr.write( "%s = %s\n" % ( str(k), str(v) ) )
  #end for
  #silent sys.stderr.write( "CRI datasets: %s\n" % str( $__app__.tool_data_tables[ 'cri_datasets' ].get_fields() ) )
  #silent sys.stderr.write( "Dataset selected: %s\n" % str( $dataset.name ) )
  #silent dataset_path = filter( lambda x: str( x[0] ) == str( $dataset.name ), $__app__.tool_data_tables[ 'cri_datasets' ].get_fields() )[0][-1]
  #silent sys.stderr.write( "Dataset path: %s\n" % str( dataset_path ) )
  #silent sys.stderr.write( "end-of-list\n" )
  ...

and finally got it to work! To summarize, to get the path from the cri_datasets table:

dataset_path = filter( lambda x: str( x[0] ) == str( $dataset.name ), $__app__.tool_data_tables[ 'cri_datasets' ].get_fields() )[0][-1]

Hope it is useful to someone else.

Anne.

On 22 May 2012, at 19:52, Anne Pajon wrote:

Hi,

I'm writing a Galaxy tool XML file for an in-house R script. I have a conditional parameter set for selecting the dataset to work with, but I would like to access the path of the file from the data table I've created:

<table name="cri_datasets" comment_char="#">
  <columns>value, name, path</columns>
  <file path="tool-data/cri/datasets.loc" />
</table>

Here is the beginning of the XML file:

<command interpreter="R --vanilla --slave -f">
  rp.R --args $dataset.name.fields.name $dataset.name.fields.path $input $column $output_txt $output_html $output_html.files_path $dataset.outcome $dataset.receptor $dataset.treatment $dataset.er $dataset.age $dataset.grade $dataset.signature $dataset.margins $dataset.groups $dataset.gleason $dataset.tstage 2> stderr || cat stderr 1>&2
</command>
<inputs>
  <param name="input" type="data" format="tabular,txt" label="Source file" help="File containing a list of gene symbols"/>
  <param name="column" type="data_column" data_ref="input" numerical="False" label="Column" help="Column containing gene symbols (only available for tabular input file)" />
  <conditional name="dataset">
    <param name="name" type="select" label="Dataset">
      <options from_data_table="cri_datasets" />
    </param>
    ...

but when running the tool, here is the error message I get:

Traceback (most recent call last):
  File "/opt/local/home/webapp/galaxy/lib/galaxy/jobs/runners/local.py", line 59, in run_job
    job_wrapper.prepare()
  File "/opt/local/home/webapp/galaxy/lib/galaxy/jobs/__init__.py", line 429, in prepare
    self.command_line = self.tool.build_command_line( param_dict )
  File "/opt/local/home/webapp/galaxy/lib/galaxy/tools/__init__.py", line 1971, in build_command_line
    command_line = fill_template( self.command, context=param_dict )
  File "/opt/local/home/webapp/galaxy/lib/galaxy/util/template.py", line 9, in fill_template
    return str( Template( source=template_text, searchList=[context] ) )
  File "/opt/local/home/webapp/galaxy/eggs/Cheetah-2.2.2-py2.7-linux-x86_64-ucs2.egg/Cheetah/Template.py", line 1004, in __str__
    return getattr(self, mainMethName)()
  File "DynamicallyCompiledCheetahTemplate.py", line 83, in respond
NotFound: cannot find 'fields' while searching for 'dataset.name.fields.name'

Is it possible to access the other columns of a loc file? It seems possible based on what I read on the wiki (http://wiki.g2.bx.psu.edu/Admin/Tools/Tool%20Config%20Syntax#A.3Coptions.3E_tag_set) using from_data_table and the ${param.fields.path} syntax, but I could not manage to make it work. Any help would be very much appreciated.

Thanks! Kind regards,
Anne.

--
Anne Pajon, Ph.D.
Cancer Research UK - Cambridge Research Institute
Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE
anne.pa...@cancer.org.uk | +44 (0)1223 404 334
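Stripped of the Cheetah syntax, the lookup Anne arrives at is plain Python: filter the data table's rows of (value, name, path) tuples on the first column and take the last field of the match. A minimal sketch (the table contents here are invented):

```python
def lookup_path(rows, wanted_id):
    """Return the last column (the path) of the row whose first column
    (the unique ID / value) matches wanted_id."""
    matches = [row for row in rows if str(row[0]) == str(wanted_id)]
    if not matches:
        raise KeyError("no row with id %r" % wanted_id)
    return matches[0][-1]

# Hypothetical cri_datasets rows, as get_fields() would return them:
rows = [
    ("breast", "Breast cancer set", "/data/cri/breast.rda"),
    ("prostate", "Prostate cancer set", "/data/cri/prostate.rda"),
]
```

The explicit KeyError is the one refinement over the one-liner: the `[0]` in `filter(...)[0][-1]` raises an opaque IndexError when the selected value is missing from the .loc file.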
[galaxy-dev] Create Trackster visualizations via API
Is it possible to create visualizations via the API yet? I can't seem to find an example.

Ayton
[galaxy-dev] specifying nodes/processors with PBS
Hi,

I've successfully configured a local Galaxy instance to submit jobs to our local cluster, but have a question regarding setting PBS options such as the number of nodes and processors to use. I've followed the guide here: http://wiki.g2.bx.psu.edu/Admin/Config/Performance/Cluster?highlight=%28pbs%29 Is there any way to do this in a tool-specific manner? For example, for running TopHat to do a mapping run I would want it to use 24 CPUs, but for simply converting a file type I would want only 1 CPU requested. Is there a way to specify this somewhere?

Thanks,
Jason

--
Jason Greenbaum, Ph.D.
Manager, Bioinformatics Core | jgb...@liai.org
La Jolla Institute for Allergy and Immunology
[galaxy-dev] Visualise Galaxy workflows using graphviz
Hi there

Because my Galaxy workflows tend to sprawl outside the viewable space in my browser, I've written a script that, when given a workflow JSON file as input, writes out a graphviz dot format graph of the workflow. By default the graph treats datasets as nodes and analyses as edges (unlike the Galaxy workflow editor, but quite like the way the history is presented), but you can choose to have analyses as nodes (and thus datasets as edges) too. In case anyone else finds this useful, the code can be downloaded from my bitbucket: https://bitbucket.org/pvanheus/galaxy/src/f29453f3d9d8/contrib/workflow_to_dot.py Usage is something like:

./workflow_to_dot.py Galaxy-Workflow.ga | dot -Tsvg > workflow.svg

and then you can view workflow.svg with eog or another viewer.

Peter
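The core of such a conversion is small. A rough sketch of the idea (this is not Peter's actual script; it assumes the .ga JSON of that era exposes a `steps` dict whose entries carry `name` and `input_connections`, and it draws analyses as nodes):

```python
import json

def workflow_to_dot(ga_text):
    """Render a Galaxy workflow (.ga JSON) as a graphviz digraph,
    with analysis steps as nodes and data connections as edges."""
    wf = json.loads(ga_text)
    lines = ["digraph workflow {"]
    for step_id, step in wf["steps"].items():
        label = step.get("name", "step %s" % step_id)
        lines.append('  s%s [label="%s"];' % (step_id, label))
        # Each input connection names the upstream step feeding this one.
        for conn in step.get("input_connections", {}).values():
            lines.append("  s%s -> s%s;" % (conn["id"], step_id))
    lines.append("}")
    return "\n".join(lines)
```

Piping the returned text through `dot -Tsvg` then yields the picture, exactly as in the usage line above.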
Re: [galaxy-dev] upload from local filesystem
Thanks for the reply. Ideally, I would have a cron job run that would monitor a directory and automatically add the new files to a particular library. I do not want the end users to have to deal with this. Does anyone have a working solution or ideas for implementation? This is the main sticking point that is preventing us from using Galaxy for our in-house next-gen sequencing pipeline.

Thanks,
Jason

--
Jason Greenbaum, Ph.D.
Manager, Bioinformatics Core | jgb...@liai.org
La Jolla Institute for Allergy and Immunology

----- Original Message -----
From: Alex Bossers alex.boss...@wur.nl
To: J. Greenbaum jgb...@liai.org, galaxy-dev@lists.bx.psu.edu
Sent: Wednesday, May 23, 2012 1:24:13 AM
Subject: RE: [galaxy-dev] upload from local filesystem
Re: [galaxy-dev] specifying nodes/processors with PBS
Yes, you can set these things for each tool independently. Check the file universe_wsgi.ini for the section [galaxy:tool_runners]. There you can set the -pe options for qsub in the same way you can on the command line. Here's an example from my setup:

gatk_variant_select = drmaa://-V -pe smp 1
gatk_validate_variants = drmaa://-V -pe smp 1
far = drmaa://-V -pe smp 1
velveth = drmaa://-V -pe mpi 4
velvetg = drmaa://-V -pe mpi 8

I set the -pe mpi option to 8 for VELVETH/G and to 1 for GATK, for instance.

Thon

On May 23, 2012, at 12:47 PM, "J. Greenbaum" jgb...@liai.org wrote:
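Thon's example is SGE (via drmaa). Since Jason asked about PBS, the same per-tool section should take PBS native options in the runner URL instead. A hypothetical sketch only; the tool ids and the exact URL syntax are assumptions, so check the cluster wiki page linked above against your Galaxy version:

```ini
# universe_wsgi.ini, [galaxy:tool_runners] section, PBS flavor:
# 24 processors for TopHat mapping runs, 1 for a format converter.
tophat = pbs:///-l nodes=1:ppn=24/
fastq_to_fasta = pbs:///-l nodes=1:ppn=1/
```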
Re: [galaxy-dev] specifying nodes/processors with PBS
Excellent! Thanks for the reply.

-J

--
Jason Greenbaum, Ph.D.
Manager, Bioinformatics Core | jgb...@liai.org
La Jolla Institute for Allergy and Immunology

----- Original Message -----
From: Anthonius deBoer thondeb...@me.com
To: J. Greenbaum jgb...@liai.org
Cc: galaxy-dev@lists.bx.psu.edu
Sent: Wednesday, May 23, 2012 2:03:18 PM
Subject: Re: [galaxy-dev] specifying nodes/processors with PBS
Re: [galaxy-dev] Visualise Galaxy workflows using graphviz
Thanks for sharing, Peter. FWIW, there is already code in Galaxy to generate SVG from a workflow; see the gen_image function in web/controllers/workflow.py. I think this code is used when workflows are included in the Tool Shed as well. If your code is better in some way(s), please consider a pull request. It should be fairly straightforward to add an option to a workflow's menu to generate SVG for it.

Best,
J.

On May 23, 2012, at 4:44 PM, Peter van Heusden wrote:
Re: [galaxy-dev] Using workflows as components in workflows?
Peter, I'm not ignoring you. However, there are others on the Galaxy team who are more familiar with the API and can provide better answers. I expect they'll chime in soon to address your questions.

Best,
J.

On May 20, 2012, at 5:39 PM, Peter van Heusden wrote:

Hi Jeremy

I need this for something I'm implementing at the moment, and the way I'm thinking about it is making a tool that uses the API to call a workflow. There are a few problems, though; correct me if I'm wrong:

1) In order to make an input history item available to the called workflow, the tool needs to somehow know about history items, but the tool XML passes in parameters as data files. This could probably be remedied by providing a type="history_item" kind of param that would provide the id associated with the history item. In the interim, just to test things, I'm passing in parameters as a "history:history_item" string (yeah, I know, ugly!).

2) My particular tool needs to take a history item, split it into partitions, and call a workflow with each of those partitions. For this to work, each partition needs to be uploaded as a new history item, but that is currently not possible. The other possibility is to create a tool that does the split, put it in a single-tool workflow (because workflows can be called from the API in such a way that their output goes to a new history, whereas I don't see that in the tool interface), and then iterate through the history that contains the split data, calling the analysis workflow on each item.

Peter

P.S. For my particular problem (call a bunch of tools, once for each row in a file of tabular data) it would be WAY easier to just write everything in a Python script, but I'm trying to see what is do-able within Galaxy.

On 20/05/2012 16:55, Jeremy Goecks wrote:

Is there any way we can speed up the implementation of this issue?

Community contributions are always encouraged and welcomed. Partial solutions are fine, and self-contained contributions are likely to be included more quickly because they are easier to review.

Thanks,
J.
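For anyone trying Peter's approach, invoking a workflow over the API of that era amounts to POSTing JSON to /api/workflows (see scripts/api/workflow_execute.py in galaxy-dist for the canonical client). A hedged sketch of assembling the request; the payload shape follows that script, while the URL, key, and ids below are invented:

```python
import json

def build_workflow_payload(workflow_id, dataset_id, step_id, history_name):
    """Assemble the JSON payload for POST /api/workflows.

    ds_map wires a history dataset ("hda") into a workflow input step;
    naming a new history sends the workflow's outputs there.
    """
    return {
        "workflow_id": workflow_id,
        "history": history_name,  # a new history name for the outputs
        "ds_map": {str(step_id): {"src": "hda", "id": dataset_id}},
    }

def run_workflow(base_url, api_key, payload):
    """POST the payload. Requires a running Galaxy, so shown for
    completeness only; not exercised here."""
    import urllib.request
    req = urllib.request.Request(
        "%s/api/workflows?key=%s" % (base_url, api_key),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    return json.load(urllib.request.urlopen(req))
```

Peter's split-then-iterate idea then becomes a loop: one build_workflow_payload/run_workflow call per partition, each targeting its own new history.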
Re: [galaxy-dev] About Galaxy Configuration
Hi, Ciara,

AFAIK new tools always require a server restart at present, but an already-loaded tool can have updates to its tool XML re-read using the 'Reload a tool's configuration' option from the administrative menu, if your login is listed as an administrator in the config file. Restarting Galaxy seems relatively benign if jobs are running on a cluster; SGE jobs (e.g.) will be correctly handled and managed. Busy users may see a 'service unavailable' message from Apache for the time it takes to restart, but in my experience they can generally cope with the occasional unscheduled server bounce to import a new tool. YMMV.

On Thu, May 24, 2012 at 1:10 PM, Ciara Ledero lede...@gmail.com wrote:

Hi there,

I understand that every time I create a new tool, I have to restart Galaxy to see the changes that I've made. But if somebody else has already started Galaxy on our server, I won't be able to see my newly developed tool. I would be able to access the site, yes, but I won't see my tool. Is there a way around this?

Cheers,
CL
Re: [galaxy-dev] About Galaxy Configuration
I didn't know that. Thanks for the reply!

Now I have a new set of questions. First, how do I list myself as an admin? Should I just literally add my login details in the config file? Second, my colleague would like to be able to edit and view the changes that he has made in his tools, even if I am doing some editing myself. Yesterday, he was running Galaxy and I was editing some tools. Since he had already started the instance, I got a socket error telling me that the address was already in use. Because of that, I was not able to see my revisions. What can we do about this? We're currently searching for ways in which Galaxy can handle multiple users.

Thanks again!

On Thu, May 24, 2012 at 11:19 AM, Ross ross.laza...@gmail.com wrote:
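Both of Ciara's questions come down to edits in universe_wsgi.ini. A sketch (the email address and port number are placeholders): admin_users is a comma-separated list of login emails, and giving each developer's instance its own port avoids the "address already in use" socket error when two instances run on one machine:

```ini
# universe_wsgi.ini
[server:main]
# each developer's private instance listens on a different port
port = 8081

[app:main]
# comma-separated logins granted the Admin menu (enables tool reload)
admin_users = ciara@example.org
```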