Re: [galaxy-dev] Getting the input name of the dataset

2012-03-27 Thread thondeboer
You can use $input.name to get the name you see in the history. Unfortunately, that is usually something silly like "Samtools on data 3".

I have changed the labels for all the major files that are produced to provide more sensible names, usually based on the input...

The example below shows how I reuse the original name of the BAM file when I convert it into a SAM file, by just taking the prefix of input1 and adding ".sam" at the end. I am using the special template language that Galaxy uses to split the extension from the file name with the os.path.splitext function...

Until Galaxy comes up with a better way to maintain the original file names somehow, this is your best shot...

Thon

    <inputs>
      <param name="input1" type="data" format="bam" label="BAM File to Convert" />
      <param name="header" type="boolean" truevalue="--header" falsevalue="" checked="False" label="Include header in output" />
    </inputs>
    <outputs>
      <data format="sam" name="output1" label="#echo os.path.splitext ( str ( $input1.name ) ) [ 0 ] #.sam" />
    </outputs>

On Mar 19, 2012, at 08:00 PM, diana michelle magbanua dmdrmagba...@gmail.com wrote:

Hi there,

I am new to Galaxy and I've just recently learned how to integrate a Perl script into it. Now, my code uses the input file's name as a header for a column in the output. When I ran it in Galaxy, I did get the filename, but it's the one ending in .dat (actually, I got the entire path of the file). I was wondering if it's possible to retain the original name of the file (or retrieve the name of the input dataset) and use it in the output file. I can't think of a Perl script for this yet, for my scripting's a bit rusty (I just started learning Perl last month). I've already checked the FAQs page, the wiki and the mailing list, but I did not get any useful hints.

I hope my writing made sense. Thank you for your time!

- Diana
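For readers who want to see what the label expression in the message above evaluates to, here is a minimal plain-Python sketch of the same os.path.splitext call. The function name sam_label and the sample file names are made up for illustration; inside a tool definition the value would come from $input1.name instead.

```python
import os.path


def sam_label(input_name):
    """Swap the extension of the input name for ".sam".

    os.path.splitext splits at the last dot only, so a name without
    an extension is returned unchanged as the prefix.
    """
    prefix, _extension = os.path.splitext(str(input_name))
    return prefix + ".sam"


# Hypothetical dataset names, not taken from a real history.
print(sam_label("sample_01.bam"))
print(sam_label("reads.sorted.bam"))
print(sam_label("Samtools on data 3"))
```

Note that a history name without an extension, such as "Samtools on data 3", simply gets ".sam" appended, so the label is only as informative as the input name was.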
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/

Re: [galaxy-dev] Possible bug: Tags in a workflow gets duplicated hundreds of times

2012-03-27 Thread thondeboer
Thanks for addressing this... The workflows load MUCH faster now...

Thon

On Mar 20, 2012, at 07:54 AM, Dannon Baker dannonba...@me.com wrote:

Hi Thon,

Thanks for reporting this. I see what the problem is here, at least for the clone duplication, and I've committed a fix in 6833:e8e361707865 that will affect all workflows going forward.

Unfortunately, there isn't a complete solution for fixing the extra tags. The problem was that any time you cloned a workflow, the mapping that retrieved the set of tags that should have been copied was actually returning *all* of the user's tags. That means that even if the source workflow had no tags, all workflow tags owned by the user would get copied to the newly created workflow. Because of this, just removing duplicate tags (while an obvious step in the right direction) won't be enough and you'll still have tags where they shouldn't be.

I'll add the duplicate removal code as a migration, but in the short term, if there are a few individual workflows that you need to fix (by simply removing all tags), the easiest thing to do on your end would be to manually remove the stored_workflow_tag_associations for those particular workflows, using something like this sql statement:

    delete from stored_workflow_tag_associations where stored_workflow_id = 'enter stored_workflow id to fix'

-Dannon

On Mar 19, 2012, at 1:47 PM, Anthonius deBoer wrote:

Hi,

I have run into an issue with a tag being duplicated hundreds of times each time I open, edit or clone a workflow. I have tried to manually remove them, but each time the tag gets inserted again somehow and duplicated each time I edit the workflow. By now there are hundreds of copies of the tags and it makes loading and running the workflow very slow.

Is there a way to remove a tag from the system somehow? I am willing to dig into the database with some SQL statement but not sure where to start...

Thanks

Thon
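As a rough illustration of the delete statement quoted above, the sketch below runs it with a bound parameter against an in-memory SQLite table. The table name is copied from the message; the two-column schema, the tag value and the helper name remove_workflow_tags are assumptions made for this toy example and do not describe the actual Galaxy database.

```python
import sqlite3


def remove_workflow_tags(conn, stored_workflow_id):
    """Delete all tag associations of one stored workflow; return the row count."""
    cursor = conn.execute(
        "DELETE FROM stored_workflow_tag_associations "
        "WHERE stored_workflow_id = ?",
        (stored_workflow_id,),
    )
    conn.commit()
    return cursor.rowcount


# Toy stand-in for the real database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stored_workflow_tag_associations "
    "(stored_workflow_id INTEGER, user_tname TEXT)"
)
# Workflow 1 carries 300 copies of the same tag, workflow 2 carries one.
conn.executemany(
    "INSERT INTO stored_workflow_tag_associations VALUES (?, ?)",
    [(1, "exome")] * 300 + [(2, "exome")],
)

removed = remove_workflow_tags(conn, 1)
remaining = conn.execute(
    "SELECT COUNT(*) FROM stored_workflow_tag_associations"
).fetchone()[0]
print(removed, remaining)
```

Passing the id as a bound parameter rather than pasting it into the statement keeps the delete limited to the one workflow you intend to clean up. Back up the database before trying anything like this on a real instance.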

[galaxy-dev] Using complete workflow in another workflow

2012-03-01 Thread thondeboer
Hi,

How far along are we in thinking about being able to re-use a complete workflow as a workflow step in another workflow? This would really allow us to modularize certain aspects of the analyses and would allow us to re-use a workflow in another.

Barring that, a simple copy/paste from one workflow into another would also really help...

Any plans in that direction?

Thanks

Thon

[galaxy-dev] Weird VDF format header

2012-02-27 Thread thondeboer
Hi,

While technically not a Galaxy (Dev) question, I am running into a non-standard VCF header in the GATK UnifiedGenotyper output.

I see

    ##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">
    ##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes, for each ALT allele, in the same order as listed">
    ##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency, for each ALT allele, in the same order as listed">

where Number=G and Number=A, which does not pass the VCF Validator filter...

Anyone know what is going on here? I had to change it to actual numbers (as the format describes..)

Where does this header come from in the first place? Is it some standard header?

Thanks

Thon
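For context, version 4.1 of the VCF specification allows symbolic values in the Number field: A means one value per ALT allele, G means one value per possible genotype, and "." means the count is unknown or varies. The sketch below pulls the Number value out of a header line and works out how many values it implies. The function names and the diploid default are assumptions for illustration; this is not how GATK or the validator mentioned above is implemented.

```python
import re
from math import comb


def number_field(meta_line):
    """Return the raw Number value of an ##INFO or ##FORMAT header line."""
    if not meta_line.startswith(("##INFO=", "##FORMAT=")):
        raise ValueError("not an INFO/FORMAT header line: %r" % meta_line)
    # Number is unquoted and precedes Description in the lines shown,
    # so a simple search is enough for this sketch.
    found = re.search(r"[,<=]Number=([^,>]+)", meta_line)
    if found is None:
        raise ValueError("no Number field in: %r" % meta_line)
    return found.group(1)


def expected_values(number, n_alt, ploidy=2):
    """How many values a field should carry for a record with n_alt ALT alleles.

    Returns None when the count is unknown (Number=.).
    """
    if number == "A":
        return n_alt                          # one value per ALT allele
    if number == "G":
        return comb(n_alt + ploidy, ploidy)   # one value per possible genotype
    if number == ".":
        return None
    return int(number)                        # plain integer count


line = ('##FORMAT=<ID=PL,Number=G,Type=Integer,'
        'Description="Normalized, Phred-scaled likelihoods">')
number = number_field(line)
print(number, expected_values(number, n_alt=1), expected_values(number, n_alt=2))
```

For a diploid sample, Number=G therefore means 3 values at a site with one ALT allele and 6 values at a site with two, which is why the count cannot be written as a single fixed integer.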

Re: [galaxy-dev] Extracting the metrics files from a composite file such as Picard

2012-02-21 Thread thondeboer
Yeah... I was thinking something like that...

I think it is possible to produce a variable number of data files, if I recall correctly...

http://wiki.g2.bx.psu.edu/Admin/Tools/Multiple%20Output%20Files

Thon

On Feb 20, 2012, at 07:11 PM, Ross ross.laza...@gmail.com wrote:

Thon, I just had an idea - write a new tool that takes an Html dataset from the user's history and a file specification (eg "foo.xls") and 'promotes' the file(s) in the extra_files_path that match the file specification into new history items?

That way you can automate the process - of course, including the outputs in workflows when their number is not known at execution may be tricky, but at least this is a generic approach and won't require any changes to any of the tools that generate Html outputs?

On Tue, Feb 21, 2012 at 8:45 AM, Anthonius deBoer thondeb...@me.com wrote:

Hi Ross,

Thanks for the reply... It is indeed as I feared... I'll probably refactor the code so the user is able to get the actual output files.

Thon

On Feb 20, 2012, at 01:29 PM, Ross ross.laza...@gmail.com wrote:

Hi, Thon,

On Tue, Feb 21, 2012 at 6:47 AM, Thon Deboer thondeb...@me.com wrote:

Hi,

Is there a way to extract the individual files from a composite file, such as the HTML files created by the picard tools?

The decision to hide multiple outputs in a single history html object has this as a downside to the benefits of less cluttered histories.

The results you want can be manually extracted in the usual ways - eg pasting the relevant html page url into an upload box - but that doesn't solve your challenge of automating the process for a very large number of datasets.

One possible solution is to add some complexity to each of the relevant tool forms to allow the user to specifically nominate outputs to be returned as individual new history datasets.

It is not a huge task, but it's not, as far as I'm aware, high on the list of priorities for the team.

If you are motivated sufficiently to fix this so it can do what you need, contributions of code to improve Galaxy tools are always very welcome.

I would like to take the metrics files and use them further down in some workflow, but I only get an HTML file... While these HTML files are nice for quickly looking at the results of one or two files, it becomes a problem if you have 88 samples like I do...

I hate to have to re-write all the picard tools to produce actual files, but maybe there is something I am missing about composite datatype files? The wiki only explains how to CREATE them, not how to use those files downstream.

Regards,

Thon de Boer, Ph.D.
Bioinformatics Guru
+1-650-799-6839
thondeb...@me.com

--
Ross Lazarus MBBS MPH;
Associate Professor, Harvard Medical School;
Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;
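The file-matching half of the "promote" idea discussed in this thread could look roughly like the sketch below: given a directory standing in for a dataset's extra_files_path and a file specification, copy the matching files somewhere else. The function name promote_matching_files, the directory layout and the file names are all invented for the demo, and turning the copies into real history items is deliberately left out because that part depends on Galaxy internals not covered here.

```python
import fnmatch
import os
import shutil
import tempfile


def promote_matching_files(extra_files_path, file_spec, target_dir):
    """Copy files whose names match file_spec (e.g. "*.xls") into target_dir.

    Returns the paths of the copies, sorted by file name.
    """
    os.makedirs(target_dir, exist_ok=True)
    promoted = []
    for name in sorted(os.listdir(extra_files_path)):
        source = os.path.join(extra_files_path, name)
        if os.path.isfile(source) and fnmatch.fnmatch(name, file_spec):
            promoted.append(shutil.copy(source, target_dir))
    return promoted


# Demo on a throwaway directory that mimics a composite Html dataset.
with tempfile.TemporaryDirectory() as workdir:
    extra = os.path.join(workdir, "dataset_1_files")
    os.makedirs(extra)
    for name in ("index.html", "foo.xls", "metrics.txt"):
        with open(os.path.join(extra, name), "w") as handle:
            handle.write("demo\n")

    copies = promote_matching_files(
        extra, "*.xls", os.path.join(workdir, "promoted")
    )
    promoted_names = [os.path.basename(path) for path in copies]
    print(promoted_names)
```

Because the number of matches is only known after the tool has run, a wrapper built on this would still face the workflow limitation Ross mentions.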

[galaxy-dev] Error 500 when trying to execute a workflow with the API

2012-02-21 Thread thondeboer
Hi,

I tried to run a workflow with the API, but get an Error 500 when I try to run the WF... The paster.log shows the following error...

    $ workflow_execute.py 92cc01ed93dc0f0fc91e3ded35497c0a http://srp106:8080/api/workflows ebfb8f50c6abde6d 'TEST the API' '1=ldda=7c5ebce002fc9d5c'

Paster.log

    galaxy.web.framework ERROR 2012-02-21 14:36:33,067 Uncaught exception in exposed API method:
    Traceback (most recent call last):
      File "/home/tdeboer/code/galaxy-central/lib/galaxy/web/framework/__init__.py", line 145, in decorator
        return simplejson.dumps( func( self, trans, *args, **kwargs ), indent=4, sort_keys=True )
      File "/home/tdeboer/code/galaxy-central/lib/galaxy/web/api/workflows.py", line 123, in create
        hda = ldda.to_history_dataset_association(history, add_to_history=add_to_history)
    AttributeError: 'NoneType' object has no attribute 'to_history_dataset_association'
    172.16.108.6 - - [21/Feb/2012:14:36:32 -0700] "POST /api/workflows?key=92cc01ed93dc0f0fc91e3ded35497c0a HTTP/1.1" 500 - "-" "Python-urllib/2.6"

Any ideas?

Also... I had a hard time finding out what I should use for the dataset source parameter "src" in "step=src="" and just tried ldda, but would hope there is a little info on what this hda and ldda is?

Thanks,

Thon

Re: [galaxy-dev] Error 500 when trying to execute a workflow with the API

2012-02-21 Thread thondeboer
Hi Dannon,

Thanks for the info... Indeed I had tried hda (I sorta figured H was for History, LD was for Library Dataset) but I got an error complaining about step 91 (I only had one step). I figured out that the API probably uses a different numbering of the steps, so this worked.

    ./display.py 92cc01ed93dc0f0fc91e3ded35497c0a http://srp106:8080/api/workflows/ebfb8f50c6abde6d
    Member Information
    --
    url: /api/workflows/ebfb8f50c6abde6d
    inputs: {'91': {'value': '', 'label': 'Input Dataset'}}
    id: ebfb8f50c6abde6d
    name: Trivially fast workflow

I was now able to run the workflow... Thanks!

Thon

On Feb 21, 2012, at 03:46 PM, Dannon Baker dannonba...@me.com wrote:

Hi Thon,

You have the right idea about what's going wrong here. Galaxy is trying to pull up a library dataset with the ID you specify, but it doesn't exist.

In this context, src refers to the 'type' of dataset input id, more specifically whether it's from a history or library. 'hda' indicates that the dataset is from a history, and 'ldda' is what you'll most likely use in the case of a library dataset. If you browse the history and library API functionality, you'll see other methods for explicitly grabbing these ids.

My hunch is that you have an id from a history and that you should swap ldda to hda and give it another shot. Definitely let me know if you run into more issues, though, and I'll help figure out what's going on.

-Dannon

On Feb 21, 2012, at 6:22 PM, thondeb...@me.com wrote:

Hi,

I tried to run a workflow with the API, but get an Error 500 when I try to run the WF... The paster.log shows the following error...

    $ workflow_execute.py 92cc01ed93dc0f0fc91e3ded35497c0a http://srp106:8080/api/workflows ebfb8f50c6abde6d 'TEST the API' '1=ldda=7c5ebce002fc9d5c'

Paster.log

    galaxy.web.framework ERROR 2012-02-21 14:36:33,067 Uncaught exception in exposed API method:
    Traceback (most recent call last):
      File "/home/tdeboer/code/galaxy-central/lib/galaxy/web/framework/__init__.py", line 145, in decorator
        return simplejson.dumps( func( self, trans, *args, **kwargs ), indent=4, sort_keys=True )
      File "/home/tdeboer/code/galaxy-central/lib/galaxy/web/api/workflows.py", line 123, in create
        hda = ldda.to_history_dataset_association(history, add_to_history=add_to_history)
    AttributeError: 'NoneType' object has no attribute 'to_history_dataset_association'
    172.16.108.6 - - [21/Feb/2012:14:36:32 -0700] "POST /api/workflows?key=92cc01ed93dc0f0fc91e3ded35497c0a HTTP/1.1" 500 - "-" "Python-urllib/2.6"

Any ideas?

Also... I had a hard time finding out what I should use for the dataset source parameter "src" in "step=src="" and just tried ldda, but would hope there is a little info on what this hda and ldda is?

Thanks,

Thon
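To make the step/src/id argument from this thread concrete, here is a small sketch that splits such an argument and checks the src part against the two values named above (hda for a history dataset, ldda for a library dataset). The payload layout with workflow_id, history and ds_map keys is an assumption about what a client might send, not a confirmed description of the API, and the function names are made up. The step key 91 is the one reported by display.py in the message above.

```python
import json

# The two dataset sources named in this thread; other values may exist.
KNOWN_SOURCES = ("hda", "ldda")


def parse_step_argument(argument):
    """Split "step=src=dataset_id" into (step, {"src": ..., "id": ...})."""
    parts = argument.split("=")
    if len(parts) != 3:
        raise ValueError("expected step=src=dataset_id, got %r" % argument)
    step, src, dataset_id = parts
    if src not in KNOWN_SOURCES:
        raise ValueError("unknown dataset source %r" % src)
    return step, {"src": src, "id": dataset_id}


def build_payload(workflow_id, history_name, step_arguments):
    """Assemble a request body (assumed layout) for running a workflow."""
    ds_map = dict(parse_step_argument(arg) for arg in step_arguments)
    return {"workflow_id": workflow_id, "history": history_name, "ds_map": ds_map}


payload = build_payload(
    "ebfb8f50c6abde6d", "TEST the API", ["91=hda=7c5ebce002fc9d5c"]
)
print(json.dumps(payload, indent=4, sort_keys=True))
```

Checking the src value and the step key on the client side would not have prevented the missing-dataset lookup, but it does turn a malformed argument into a readable error before any request is sent.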