Re: [galaxy-user] question about Filtering Cufflink files
Jagat,

First, a couple of housekeeping issues: (a) the questions you're asking are better suited to the galaxy-user list (questions about using Galaxy and performing analyses) than to galaxy-dev (questions about installing Galaxy locally and tool development), so I've moved this thread to galaxy-user; (b) please start new threads when appropriate rather than replying to older threads, as this keeps threads shorter and more focused.

On to your questions:

> I have another question about filtering a gene list. In the filtered list there
> are multiple rows per gene. Should I have one gene per row? I have attached
> a snapshot of the output, but I am not sure whether the Galaxy server will take it.
> I did see the discussion on another forum:
> http://seqanswers.com/forums/showthread.php?t=8830

GTF files have multiple lines per feature, so your output is reasonable.

> which suggests possible complications in getting one gene per row. My
> next question is: in that scenario, what is the best way of representing
> one FPKM value per gene? Should we take the average FPKM per gene? I think in
> the gene it is still giving the transcript FPKM value, but these values are
> different from the previous file filtered with transcript id.

As Vasu noted, this is an ongoing area of research. For some experiments, it may be reasonable to group alternatively-spliced isoforms of the same gene and jointly estimate FPKM, and for others it may not. Fortunately, if you do want to group transcripts to get gene FPKM values, Cuffdiff does this for you: see its gene FPKM expression file.

Best,
J.

___ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client.
For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list: http://lists.bx.psu.edu/listinfo/galaxy-dev To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
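For anyone who wants to see what grouping isoforms by hand looks like, here is a minimal sketch. It assumes that summing isoform FPKM values per gene (rather than averaging them) is an acceptable convention for the experiment, and it does not reproduce Cuffdiff's joint estimation. The tuple layout, the identifiers, and the function name gene_fpkm_from_isoforms are made up for illustration.

```python
from collections import defaultdict


def gene_fpkm_from_isoforms(rows):
    """Sum isoform FPKM values per gene.

    rows is an iterable of (gene_id, transcript_id, fpkm) tuples.
    This layout is a hypothetical one, not a Cuffdiff file format.
    """
    totals = defaultdict(float)
    for gene_id, _transcript_id, fpkm in rows:
        totals[gene_id] += fpkm
    return dict(totals)


# Toy data: geneA has two isoforms, geneB has one.
isoforms = [
    ("geneA", "tx1", 2.5),
    ("geneA", "tx2", 1.5),
    ("geneB", "tx3", 4.0),
]
gene_totals = gene_fpkm_from_isoforms(isoforms)
```

In practice the gene FPKM expression file from Cuffdiff is the better source, since it also carries confidence bounds that a plain sum cannot provide.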
[galaxy-user] Composite Datatypes Q.
I have a program I'm trying to "galaxify" that emits a variable number of result files. I would like the output of my Galaxy tool to show up in Galaxy as an HTML file with links to the result files, so that when you click on the eye icon, the HTML file shows up in the middle pane. Sorry if I'm not describing this in an elegant way.

Creating a composite datatype is the way to go in this situation, correct? I'm creating a class that inherits from Html. How do I get the result returned from my custom "generate_primary_file" function to show up as a tool's output, if that's the right way to go about this?

Thanks!
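Independent of how the datatype is wired into Galaxy, the HTML page itself is easy to generate. The sketch below is plain Python and makes no Galaxy API calls: it lists the files in a result directory and returns an HTML page with one link per file. The function name build_index_html and the demo file names are hypothetical, and whether this string is best returned from generate_primary_file or written by the tool itself is left open here.

```python
import html
import os
import tempfile
from urllib.parse import quote


def build_index_html(result_dir, title="Results"):
    """Return an HTML page listing every regular file in result_dir as a link."""
    names = sorted(
        name for name in os.listdir(result_dir)
        if os.path.isfile(os.path.join(result_dir, name))
    )
    # URL-quote the link target and HTML-escape the visible label.
    items = "\n".join(
        '<li><a href="{0}">{1}</a></li>'.format(quote(name), html.escape(name))
        for name in names
    )
    return (
        "<html><head><title>{0}</title></head><body>\n"
        "<ul>\n{1}\n</ul>\n"
        "</body></html>\n"
    ).format(html.escape(title), items)


# Demo with a throwaway directory standing in for the tool's result files.
with tempfile.TemporaryDirectory() as result_dir:
    for name in ("summary.txt", "hits 1.tab"):
        with open(os.path.join(result_dir, name), "w") as handle:
            handle.write("placeholder\n")
    page = build_index_html(result_dir)
```

The links are relative, so they only resolve if the page is served from the same directory as the result files.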
[galaxy-user] Filter Tool
(Starting new thread on galaxy-user.)

Jagat,

It depends on which filter tool you're using and which dataset you're filtering. There is a generic filter tool that can be used to filter Cuffdiff tabular files on either FPKM values or differential expression tests. There is also a tool for filtering GTF files based on a Cuffdiff expr dataset. It sounds like you may be confusing either the tools or the inputs.

If after double-checking you're still having problems with filtering, please put together a short list of your analysis steps and share your history with me, and I can take a look.

Thanks,
J.

> Further to my question, it appears that there is some problem with the filter
> option:
> When I use the isoform/gene exp file as such it works fine, but when I filter
> these files on either parameter, such as status (whether the test was successful) or
> p value, it returns an empty file. The way I am saving the file is: expr file,
> filter, save as txt file, and upload back into Galaxy.
> Any suggestions?
>
> Jagat
>
>
> On Tue, May 3, 2011 at 3:08 AM, shamsher jagat wrote:
> Jeremy,
>
> I have been trying to follow the steps for filtering Cufflinks output files
> that you described in one of the previous messages
> (http://gmod.827538.n3.nabble.com/Re-downstream-analysis-of-cuffdiff-out-put-td2836457.html):
>
> I have shared my history with you, but in summary:
>
> File 35: Filter GTF data by attribute values list on data 11 (combined
> GTF) and data 33 (which is the gene expr file). Shouldn't this have one
> gene per row? It does not.
>
> File 39: Filter GTF data by attribute values list on data 11 and data 38
> (Cuffdiff splicing expr) failed. I would assume that it should filter on
> the basis of TSSid. The error message is
>
> Traceback (most recent call last):
>   File "/var/opt/galaxy/g2test/galaxy_test/tools/filters/gff/gtf_filter_by_attribute_values_list.py", line 67, in <module>
>     filter( gff_file, attribute_name, ids_file, output_file )
>   File "/var/opt/galaxy/g2test/galaxy_test/tools/filters/gff/gtf_filter_by_attribute_values_list.py", line 57, in filter
>     if attributes[ attribute_name ] in ids_dict:
> KeyError: 'tss_id'
>
> File 40: Filter GTF data by attribute values list on data 11 and data 34 (tss group exp)
> failed, and the error message is:
>
> Traceback (most recent call last):
>   File "/var/opt/galaxy/g2test/galaxy_test/tools/filters/gff/gtf_filter_by_attribute_values_list.py", line 67, in <module>
>     filter( gff_file, attribute_name, ids_file, output_file )
>   File "/var/opt/galaxy/g2test/galaxy_test/tools/filters/gff/gtf_filter_by_attribute_values_list.py", line 57, in filter
>     if attributes[ attribute_name ] in ids_dict:
> KeyError: 'tss_id'
>
> I would think that if one gene has different IDs then there is splicing.
>
> In contrast, however, the isoform file with transcript ID is working fine (File 20).
>
> On a different note, can I convert a GTF file to a tab-delimited txt file? I
> tried to convert file 11 to txt (following Edit attributes), but the file is
> not properly formatted, especially col-pid and TSS id. Am I doing something
> wrong?
>
> Thanks.
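The KeyError in the quoted tracebacks says that, for at least one GTF line, the parsed attributes had no tss_id key at the point where the script indexed them. The sketch below is not the Galaxy tool's code; it is a small stand-alone illustration of filtering GTF lines on an attribute while skipping lines that lack it. The function names and the toy GTF lines are hypothetical, and the attribute parser is deliberately simple (it would mis-split a value containing a semicolon).

```python
def parse_gtf_attributes(field):
    """Parse the ninth GTF column ('key "value"; key "value";') into a dict."""
    attributes = {}
    for part in field.strip().split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition(" ")
        attributes[key] = value.strip().strip('"')
    return attributes


def filter_gtf_lines(lines, attribute_name, wanted_ids):
    """Yield GTF lines whose attribute value is in wanted_ids.

    Lines without the attribute are skipped instead of raising KeyError.
    """
    wanted = set(wanted_ids)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        attributes = parse_gtf_attributes(fields[8])
        # dict.get returns None for a missing key, which is never in `wanted`.
        if attributes.get(attribute_name) in wanted:
            yield line


# Toy input: the second line has no tss_id attribute.
gtf = [
    'chr1\tCufflinks\texon\t100\t200\t.\t+\t.\tgene_id "XLOC_1"; transcript_id "TCONS_1"; tss_id "TSS1";\n',
    'chr1\tCufflinks\texon\t300\t400\t.\t+\t.\tgene_id "XLOC_2"; transcript_id "TCONS_2";\n',
]
kept = list(filter_gtf_lines(gtf, "tss_id", ["TSS1"]))
```

Silently skipping lines is a design choice; counting and reporting the skipped lines would make it easier to notice when the chosen attribute is missing from most of the file.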
[galaxy-user] SNP annotation
Hi,

I am new to Galaxy and am wondering what tools are available for annotating SNPs. I know that snpEff is implemented in Galaxy and that it enables annotation such as location and predicted effect. However, I am wondering whether there is any automated means of annotating a polymorphism as known or novel, or of determining its frequency in the population. Does Galaxy offer anything like this, or would it be necessary to create scripts that would, for example, compare against dbSNP for uniqueness and the 1000 Genomes Project for frequency?

Best,
Gavin
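As a rough idea of what such a comparison script could look like, here is a minimal sketch that labels variants as known or novel by looking them up in a set built from a reference VCF. It assumes both inputs are VCF-style tab-separated lines (CHROM, POS, ID, REF, ALT in the first five columns), that chromosome names match between the two files, and that no indel normalization is needed. The function names and the toy records are hypothetical, and nothing here is a Galaxy tool.

```python
def variant_key(chrom, pos, ref, alt):
    """Build a hashable key identifying one allele at one position."""
    return (chrom, int(pos), ref.upper(), alt.upper())


def iter_variant_keys(vcf_lines):
    """Yield one key per ALT allele from VCF-style lines, skipping headers."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
        for alt in alts.split(","):
            yield variant_key(chrom, pos, ref, alt)


def label_variants(sample_lines, known_lines):
    """Return (key, 'known' or 'novel') for every allele in sample_lines."""
    known = set(iter_variant_keys(known_lines))
    return [
        (key, "known" if key in known else "novel")
        for key in iter_variant_keys(sample_lines)
    ]


# Toy records standing in for a dbSNP extract and a sample call set.
known_vcf = [
    "#CHROM\tPOS\tID\tREF\tALT\n",
    "1\t1000\trs_example\tA\tG,T\n",
]
sample_vcf = [
    "1\t1000\t.\tA\tT\n",
    "1\t2000\t.\tC\tG\n",
]
labels = label_variants(sample_vcf, known_vcf)
```

For a full dbSNP release the in-memory set would be large, so an indexed lookup per region would be a more realistic design than loading everything at once.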
[galaxy-user] galaxy on the cloud - cannot select fastq files when trying to run workflow
Hi Galaxy,

I am a newbie to Amazon EC2 and have been carefully following the steps in the screencast. I am able to upload two fastq files from the S3 bucket:

http://s3.amazonaws.com/heteroplasmy/F4-bM4-1.fastq
http://s3.amazonaws.com/heteroplasmy/F4-bM4-2.fastq

I am also able to import the published workflow:

http://s3.amazonaws.com/heteroplasmy/Galaxy-Workflow-mt_analysis_0.01_strand-specific_(fastq_double).ga

But when it comes to running it, I cannot select the fastq files from the drop-down menu. I am able to view them on my GC instance, since I imported them successfully, but I am at a loss as to why I can't select them from the drop-down menu in the workflow to begin their alignment. Is it something to do with my security group settings in setting up the EC2 instance? Any assistance you can provide would be great!

Thanks,
Amit

--
Amit Indap
Re: [galaxy-user] galaxy on the cloud - cannot select fastq files when trying to run workflow
Hi Amit,

The workflow requires the input data to be in 'fastqsanger' format before it can run. The files you uploaded from S3 are already in the correct format, but this is most likely not set correctly in the metadata. So, click on the pencil icon for each of the datasets in your history and edit the data type by setting it to 'fastqsanger'. Save the changes and try running the workflow. It should work fine then.

For future reference, if you decide to upload the rest of the files from the screencast/heteroplasmy study, you can choose the fastqsanger type right on the data upload form and thus avoid this subsequent step.

Enis

On Mon, May 9, 2011 at 5:23 PM, Amit Indap wrote:
> Hi Galaxy,
>
> I am a newbie to Amazon EC2 and have been carefully following the
> steps in the screencast. I am able to upload two fastq files from the
> s3 bucket:
> http://s3.amazonaws.com/heteroplasmy/F4-bM4-1.fastq
> http://s3.amazonaws.com/heteroplasmy/F4-bM4-2.fastq
>
> I am also able import the published workflow
>
> http://s3.amazonaws.com/heteroplasmy/Galaxy-Workflow-mt_analysis_0.01_strand-specific_(fastq_double).ga
>
> But when it comes to running it, I cannot select the fastq file from
> the drop down menu. I am able to view them on my GC instance, since I
> imported them successfully, but am at a loss as to why
> I can't select them from the drop down menu in the workflow to begin
> their alignment. Is it something to do with my security group settings
> in setting up the EC2 instance? Any assistance you can provide would
> be great!
>
> Thanks,
> Amit
>
> --
> Amit Indap
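Before setting a dataset's type to fastqsanger by hand, it can be reassuring to check that the quality strings are consistent with the Sanger (Phred+33) encoding. The sketch below relies on one fact: quality characters with an ASCII code below 59 (';') cannot occur in the +64 encodings, so seeing one points to Phred+33. It assumes plain four-line FASTQ records with no wrapped sequences; the function name and the toy reads are hypothetical, and a False result only means the file gave no such evidence, not that it is definitely another encoding.

```python
def looks_like_sanger(fastq_lines, max_records=1000):
    """Return True if any quality character can only be Phred+33 encoded."""
    for index, line in enumerate(fastq_lines):
        if index >= max_records * 4:
            break
        # In four-line FASTQ, the quality string is the fourth line of a record.
        if index % 4 != 3:
            continue
        quality = line.rstrip("\n")
        if any(ord(char) < 59 for char in quality):
            return True
    return False


# Toy records: '!' (code 33) appears only in Phred+33 data; 'h' is uninformative.
sanger_reads = ["@read1\n", "ACGT\n", "+\n", "!''*\n"]
other_reads = ["@read1\n", "ACGT\n", "+\n", "hhhh\n"]
sanger_result = looks_like_sanger(sanger_reads)
other_result = looks_like_sanger(other_reads)
```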
[galaxy-user] Statistics: q value
I am trying to use the statistics function of Galaxy to compute q values from p values. It is giving the following error:

An error occurred running this job: Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : scan() expected 'a real', got 'pvals'
Execution halted

Any suggestions?
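The error text indicates that the reader expected a number and found the string 'pvals', which would be the case if the input file starts with a column header. The sketch below shows that idea outside Galaxy: it reads one p value per line, tolerates a single non-numeric header line, and then computes Benjamini-Hochberg adjusted p values. Note that Benjamini-Hochberg is swapped in here as a simple stand-in and is not the same estimator as a Storey-style q value, and that the function names and sample values are hypothetical.

```python
def read_pvalues(lines):
    """Parse one p value per line, allowing a single header line at the top."""
    values = []
    seen_first = False
    for line in lines:
        token = line.strip()
        if not token:
            continue
        try:
            values.append(float(token))
        except ValueError:
            # Only the first non-empty line may be non-numeric (a header).
            if seen_first:
                raise
        seen_first = True
    return values


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Toy input with the kind of header that would trip a numeric-only reader.
raw = ["pvals\n", "0.01\n", "0.04\n", "0.03\n", "0.005\n"]
pvalues = read_pvalues(raw)
adjusted = benjamini_hochberg(pvalues)
```

If the header is indeed the cause, removing that first line from the dataset before running the Galaxy tool would be the equivalent fix.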
[galaxy-user] Galaxy Community Conference Registration Closes May 17
Hello all,

We are closing registrations for the Galaxy Community Conference at the end of May 17. However, the conference is likely to fill up before that date. So, if you are thinking about attending, *now* is the time to register for the meeting:

http://galaxy.psu.edu/gcc2011/Register.html

We have also secured some additional rooms on the night of May 24, in a hotel in Lunteren:

http://galaxy.psu.edu/gcc2011/Logistics.html

Again, it is not necessary to stay in Lunteren the night of May 24, as it is easy to get there from Amsterdam on May 25, before the meeting starts.

Finally, a draft schedule is now available at:

http://galaxy.psu.edu/gcc2011/Programme.html

This is still subject to change, but we are hoping it won't change much.

Just 15 days to go!

Dave C.

--
http://galaxy.psu.edu/gcc2011/
http://getgalaxy.org
http://usegalaxy.org/