Re: [galaxy-user] Cuffdiff output
I am quite new to RNA-seq analysis, but what I have learned so far is that replicates are important. If you got this result with no replicates, then the p-values are more or less meaningless.

You can also gauge what is happening by looking at the modelled read count output. If the counts in both conditions are below about 50, you are unlikely to have a robust result for that gene/transcript.

Ian

From: galaxy-user-boun...@lists.bx.psu.edu [galaxy-user-boun...@lists.bx.psu.edu] on behalf of Malik, Shivani [shivani.ma...@ucsf.edu]
Sent: 11 March 2014 21:43
To: galaxy-user@lists.bx.psu.edu
Subject: [galaxy-user] Cuffdiff output

> Hi,
>
> I have a question about interpreting the Cuffdiff data and how to pick out significant genes. I have genes which show ~8 fold change between 2 conditions, e.g. from FPKM of 0.08 to 28, and yet they are not significant. Is there a threshold of FPKM below which Cuffdiff does not consider an FPKM to be valid, so that significance is "no"? What downstream analysis should I use to extract a meaningful list of genes from the Cuffdiff data?
>
> Also, I filtered out FPKMs which were below 5 in both conditions. Is that reasonable?
>
> Thanks
> Shivani

___
The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list: http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
[galaxy-user] Cuffdiff output
Hi,

I have a question about interpreting the Cuffdiff data and how to pick out significant genes. I have genes which show ~8 fold change between 2 conditions, e.g. from FPKM of 0.08 to 28, and yet they are not significant. Is there a threshold of FPKM below which Cuffdiff does not consider an FPKM to be valid, so that significance is "no"? What downstream analysis should I use to extract a meaningful list of genes from the Cuffdiff data?

Also, I filtered out FPKMs which were below 5 in both conditions. Is that reasonable?

Thanks
Shivani
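For readers wondering what such a filter could look like in practice, here is a minimal sketch in Python. It assumes a tab-separated table with the column names `test_id`, `status`, `value_1`, `value_2` and `q_value`, as in Cuffdiff's gene-level differential expression output; the rows, the gene IDs and both cutoffs are invented for illustration and are not recommendations from this thread.

```python
import csv
import io
import math


def filter_diff_rows(handle, min_fpkm=5.0, max_q=0.05):
    """Yield rows that were testable, expressed in at least one condition,
    and below the q-value cutoff. Both cutoffs are illustrative assumptions."""
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["status"] != "OK":
            continue  # the test was not performed or failed for this row
        fpkm_1, fpkm_2 = float(row["value_1"]), float(row["value_2"])
        if max(fpkm_1, fpkm_2) < min_fpkm:
            continue  # below the FPKM cutoff in both conditions
        if float(row["q_value"]) > max_q:
            continue  # not significant after multiple-testing correction
        yield row


# Invented example rows, not real Cuffdiff output.
example = (
    "test_id\tstatus\tvalue_1\tvalue_2\tq_value\n"
    "g1\tOK\t0.08\t28\t0.20\n"
    "g2\tOK\t3.0\t40.0\t0.01\n"
    "g3\tNOTEST\t0.5\t0.9\t1\n"
    "g4\tOK\t1.0\t2.0\t0.01\n"
)
kept = [row["test_id"] for row in filter_diff_rows(io.StringIO(example))]
print(kept)

# Side note on the numbers in the question: FPKM 0.08 -> 28 is a ratio of 350,
# i.e. a log2 fold change of about 8.45.
log2_fc = math.log2(28 / 0.08)
print(round(log2_fc, 2))
```

Filtering on "at least one condition above the cutoff" keeps genes that switch on or off between conditions, which a filter requiring both conditions to pass would discard.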
Re: [galaxy-user] Cuffdiff question
Don't use the -b parameter.

Sent from my iPhone; please excuse any brevity or typos!

On Nov 15, 2013, at 2:51 PM, clare Hardman chard...@mrc-lmb.cam.ac.uk wrote:

> Hi Noa,
>
> Yes I did use Cufflinks, so this sounds just like my problem. So how have you dealt with the problem?
>
> Best wishes,
> Clare
Re: [galaxy-user] Cuffdiff question
Hi Noa,

Yes I did use Cufflinks, so this sounds just like my problem. So how have you dealt with the problem?

Best wishes,
Clare

On 14 Nov 2013, at 18:17, Noa Sher wrote:

> Hi Clare
>
> We just ran into a similar issue about a week ago and were debugging with the authors of Cuffdiff. Apparently there are issues with the -b parameter. Were you using this in Cufflinks? If yes, this may be the cause: we switched the order of the replicates and the values changed, as did PCAs of the samples, etc. They are working on this issue for an upcoming version of Cufflinks.
>
> I am interested in knowing whether this was indeed your problem, or were you using a pipeline that does not include Cufflinks?
>
> Good luck,
> Noa
[galaxy-user] Cuffdiff question
Hello,

Could you please advise me on this probably naive question. When I compare sample A and sample B with Cuffdiff, and then separately compare sample A to sample C with Cuffdiff too, should the FPKM value be the same for A in both tests? At the moment mine does not seem to be!

Best wishes
Clare
Re: [galaxy-user] Cuffdiff question
Hi Clare

We just ran into a similar issue about a week ago and were debugging with the authors of Cuffdiff. Apparently there are issues with the -b parameter. Were you using this in Cufflinks? If yes, this may be the cause: we switched the order of the replicates and the values changed, as did PCAs of the samples, etc. They are working on this issue for an upcoming version of Cufflinks.

I am interested in knowing whether this was indeed your problem, or were you using a pipeline that does not include Cufflinks?

Good luck,
Noa

On 14/11/2013 13:13, clare Hardman wrote:

> Hello,
>
> Could you please advise me on this probably naive question. When I compare sample A and sample B with Cuffdiff, and then separately compare sample A to sample C with Cuffdiff too, should the FPKM value be the same for A in both tests? At the moment mine does not seem to be!
>
> Best wishes
> Clare
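To put a number on how much sample A shifts between the A-vs-B and A-vs-C runs discussed in this thread, something along these lines could be used. It is only a sketch: the column names (`test_id`, `sample_1`, `sample_2`, `value_1`, `value_2`) are assumed to match Cuffdiff's differential expression tables, the rows are invented, and the 1% tolerance is an arbitrary choice.

```python
def fpkm_for_sample(rows, sample):
    """Map test_id -> FPKM for `sample`, whichever side of the comparison it is on."""
    values = {}
    for row in rows:
        if row["sample_1"] == sample:
            values[row["test_id"]] = float(row["value_1"])
        elif row["sample_2"] == sample:
            values[row["test_id"]] = float(row["value_2"])
    return values


def changed_between_runs(run_x, run_y, sample, rel_tol=0.01):
    """Return {test_id: (fpkm_in_x, fpkm_in_y)} where the relative difference
    for `sample` exceeds rel_tol (an arbitrary, illustrative tolerance)."""
    x = fpkm_for_sample(run_x, sample)
    y = fpkm_for_sample(run_y, sample)
    changed = {}
    for test_id in sorted(x.keys() & y.keys()):
        a, b = x[test_id], y[test_id]
        largest = max(abs(a), abs(b))
        if largest > 0 and abs(a - b) / largest > rel_tol:
            changed[test_id] = (a, b)
    return changed


# Invented rows standing in for two separate Cuffdiff runs.
run_ab = [
    {"test_id": "g1", "sample_1": "A", "sample_2": "B", "value_1": "5.0", "value_2": "7.0"},
    {"test_id": "g2", "sample_1": "A", "sample_2": "B", "value_1": "10.0", "value_2": "3.0"},
]
run_ac = [
    {"test_id": "g1", "sample_1": "A", "sample_2": "C", "value_1": "5.0", "value_2": "1.0"},
    {"test_id": "g2", "sample_1": "A", "sample_2": "C", "value_1": "12.5", "value_2": "8.0"},
]
changed = changed_between_runs(run_ab, run_ac, "A")
print(changed)
```

Running the same comparison with and without the bias-correction option would show whether the shifts line up with the -b issue described above.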
[galaxy-user] Cuffdiff question
Hello,

The transcript name when using RefSeq as a reference annotation is the NM_ type of identifier. If you want to include a gene symbol, then the reference annotation should include the attribute gene_name. The iGenomes GTF files are an example of datasets that include this attribute.

http://wiki.galaxyproject.org/Support#Interpreting_scientific_results
See: Example > RNA-seq analysis > tools

http://cufflinks.cbcb.umd.edu/faq.html#gtfs
http://cufflinks.cbcb.umd.edu/igenomes.html
http://cufflinks.cbcb.umd.edu/manual.html (search on the page for gene_name to see where it can be used/output)

Best,
Jen
Galaxy team

On 11/12/13 10:40 AM, Irene Bassano wrote:

> Dear Jennifer,
>
> I am about to start a new analysis and, learning from my old mistakes, would like to ask just one question: I would like my Cufflinks and Cuffdiff results to give me the actual gene name or transcript name rather than NM_.. like I had last time I ran the job. At which stage and which option do I have to use to get the proper names?
>
> Thanks a lot!
> Irene

--
Jennifer Hillman-Jackson
http://galaxyproject.org
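Before starting a run, it can be worth checking whether a reference annotation actually carries the gene_name attribute mentioned above. The following Python sketch counts GTF records that have it; it assumes the usual nine tab-separated GTF columns with `key "value";` pairs in the last one, and the two example records (including their identifiers) are invented.

```python
import re

# Matches pairs such as: gene_name "ABC1"
ATTRIBUTE_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(field):
    """Turn the ninth GTF column into a dict of attribute name -> value."""
    return dict(ATTRIBUTE_RE.findall(field))


def count_gene_name(lines):
    """Return (records_with_gene_name, total_records) for GTF lines."""
    with_name = total = 0
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 9:
            continue  # not a complete GTF record
        total += 1
        if "gene_name" in parse_attributes(columns[8]):
            with_name += 1
    return with_name, total


# Invented records: one without gene_name, one with it.
example_gtf = [
    'chr1\texample\texon\t100\t200\t.\t+\t.\tgene_id "NM_000001"; transcript_id "NM_000001";\n',
    'chr1\texample\texon\t100\t200\t.\t+\t.\tgene_id "ABC1"; transcript_id "NM_000001"; gene_name "ABC1";\n',
]
result = count_gene_name(example_gtf)
print(result)
```

If the first number is zero, the annotation has no gene symbols to pass through, and the output can only show the NM_ style identifiers.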
Re: [galaxy-user] Cuffdiff version not apparent
Hi Cory,

A list of Galaxy dependencies can be found on the wiki at:
http://wiki.galaxyproject.org/Admin/Tools/Tool%20Dependencies
...although many tools allow a range of tool versions.

You can also identify the specific tool version by clicking on the View Details 'i' icon of a history item created by that tool and looking at the Tool Version field. If you're using the Galaxy public server (https://usegalaxy.org/), then clicking on the 'i' icon of a Cuffdiff output file will show:

Tool Version: cuffdiff v2.1.1 (4046M)

Hope this helps.

Cheers,
Graham

Dr. Graham Etherington
Bioinformatics Support Officer,
The Sainsbury Laboratory, Norwich Research Park, Norwich NR4 7UH. UK
Tel: +44 (0)1603 450601

On 04/11/2013 20:57, Cory Dunn cd...@ku.edu.tr wrote:

> Dear Galaxy Staff:
>
> I was wondering which version of Cuffdiff is currently running on Galaxy. The wrapper version is 0.0.6, but I did not see the actual version of the underlying software under the Tool Version field (please see attached screen grab).
>
> Thanks for your help,
> Cory Dunn
[galaxy-user] Cuffdiff version not apparent
Dear Galaxy Staff:

I was wondering which version of Cuffdiff is currently running on Galaxy. The wrapper version is 0.0.6, but I did not see the actual version of the underlying software under the Tool Version field (please see attached screen grab).

Thanks for your help,
Cory Dunn
Re: [galaxy-user] Cuffdiff changes
> Where can I see which version are being used?

You can see both the Galaxy tool version and the Cuffdiff tool version (when available) by clicking on the 'view details' icon (the 'i' at the bottom of an expanded dataset). Right now the Cuffdiff version is not displayed, but that will change when our server is updated.

> What does Cuffdiff (version 0.0.5) mean then?

That is the version of the Galaxy wrapper; the wrapper provides the interface between Cuffdiff and Galaxy.

> What version was it before?

I think the Cuffdiff version was 1.3.1 previously.

> I look forward to the update, will that mean another version of Cuffdiff again?

The wrapper will be updated but not Cuffdiff itself.

J.
Re: [galaxy-user] Cuffdiff changes
Thanks for the quick reply.

Where can I see which versions are being used? It does not say in the info view after the run (attached to my first e-mail). What does Cuffdiff (version 0.0.5) mean then? What version was it before? It is a great difference in the number of DE genes.

What I mostly meant is that now, in August, I get more DE genes for all the different types of analysis that I have previously done (consistent new results with no settings changed). I really do not see more DE genes when comparing 5 vs 6 samples than when comparing 1 vs 2 samples. But that might be due to more biological reasons.

I look forward to the update, will that mean another version of Cuffdiff again?

Kind regards,
Johanna

From: Jeremy Goecks [mailto:jeremy.goe...@emory.edu]
Sent: Thursday, August 22, 2013 8:04 PM
To: Johanna Sandgren
Cc: galaxy-user@lists.bx.psu.edu
Subject: Re: [galaxy-user] Cuffdiff changes

> > I am wondering why Cuffdiff suddenly gives many more significant DE genes?
>
> Cuffdiff was recently updated to version 2.1.0; this update likely explains the different results that you see.
>
> > I have rerun several analysis, also with more samples in each group, all give much more significant genes. Why?
>
> More samples = more power to accurately estimate expression levels = more DE genes.
>
> > Also, will replicate information soon be included in output files?
>
> Replicate data will be available when we next update our server. This should occur in about 3 weeks.
>
> Best,
> J.
[galaxy-user] Cuffdiff changes
Hi,

I am wondering why Cuffdiff suddenly gives many more significant DE genes? I have used the same input data and now get approx 5x more significant genes. The settings are the same, with the exception that you have now included library normalization and dispersion estimation. See below for parameters. I have rerun several analyses, also with more samples in each group, and all give many more significant genes. Why?

Also, will replicate information soon be included in output files?

Kind regards,
Johanna

Dataset details (first run | second run):

Tool: Cuffdiff | Cuffdiff
Name: Cuffdiff on data 225, data 236, and others: splicing differential expression testing | Cuffdiff on data 225, data 236, and others: splicing differential expression testing
Created: 4-Apr-13 | 21-Aug-13
Filesize: 10.3 MB | 10.2 MB
Dbkey: hg19 | hg19
Format: tabular | tabular
Galaxy Tool Version: 0.0.5 | 0.0.5
Tool Version: (empty) | (empty)
Tool Standard Output: stdout https://main.g2.bx.psu.edu/datasets/bbd44e69cb8906b5355e4a035e92cd84/stdout | stdout https://main.g2.bx.psu.edu/datasets/bbd44e69cb8906b524bcf7e585f495b1/stdout
Tool Standard Error: stderr https://main.g2.bx.psu.edu/datasets/bbd44e69cb8906b5355e4a035e92cd84/stderr | stderr https://main.g2.bx.psu.edu/datasets/bbd44e69cb8906b524bcf7e585f495b1/stderr
Tool Exit Code: 0 | 0
API ID: bbd44e69cb8906b5355e4a035e92cd84 | bbd44e69cb8906b524bcf7e585f495b1

Input parameters (first run | second run):

Transcripts: 261: Cuffmerge on data 258, data 135, and others: merged transcripts | 261: Cuffmerge on data 258, data 135, and others: merged transcripts
Perform replicate analysis: Yes | Yes
Group name: s202 | s202
Add file: 194: Galaxy883-[MarkDups_Dupes_Marked_882_202.bam].bam | 194: Galaxy883-[MarkDups_Dupes_Marked_882_202.bam].bam
Group name: Ctrls | Ctrls
Add file: 236: MarkDups_Dupes Marked216.bam | 236: MarkDups_Dupes Marked216.bam
Add file: 225: MarkDups_Dupes Marked206.bam | 225: MarkDups_Dupes Marked206.bam
Library normalization method: not used (parameter was added after this job was run) | geometric
Dispersion estimation method: not used (parameter was added after this job was run) | pooled
False Discovery Rate: 0.05 | 0.05
Min Alignment Count: 2 | 2
Perform quartile normalization: No | No
Use multi-read correct: Yes | Yes
Perform Bias Correction: Yes | Yes
Reference sequence data: cached | cached
Set Additional Parameters? (not recommended): Yes | Yes
Average Fragment Length: 200 | 200
Fragment Length Standard Deviation: 80 | 80

Johanna Sandgren, PhD
Department of Oncology-Pathology
CCK, Karolinska Institutet
SE-171 76 Stockholm, Sweden
+46-8-517 721 35 (office), +46-8-321047 (fax), +46-708 388476 (mobile)
Re: [galaxy-user] Cuffdiff-cummerbund with biological replicates problem
In the past, others have had success using Cummerbund with Galaxy, and there's even a Cummerbund wrapper in the tool shed:

http://toolshed.g2.bx.psu.edu/view/jjohnson/cummerbund

That said, it appears that replicate information is largely contained in the read group tracking files, which are not currently included in Galaxy's Cuffdiff outputs. I don't know if these files are required by Cummerbund to do replicate analysis. This would be a good question for the Cummerbund developers, as would what the p and q values mean when doing replicate analysis.

If you find that Galaxy is lacking something that Cummerbund needs to function correctly, that would be very useful information to share with the list.

Best,
J.

On Jul 26, 2013, at 8:50 PM, Mike Shamblott wrote:

> I'm trying to run Cuffdiff on a set of 10 human samples with biological replication, then download the results for further analyses in Cummerbund (v2.1.1). It seems like a standard workflow but I cannot get Cummerbund to acknowledge replicates.
>
> I download and rename the 11 Cuffdiff output files to the names expected by Cummerbund. Cummerbund builds a CuffSet with no warnings and most analyses work as expected. The problem comes any time I try to see the results of replication. For example, in Cummerbund, replicates() returns an empty set and any type of plot returns an error when replicates=T is included as an argument.
>
> There is no evidence of replication data in any of the 11 Cuffdiff output files. The data is presented with the group name only. From this, I conclude that the problem is with Cuffdiff, since there is no replicate data for Cummerbund to build into its db. I see that there are several read group files that are produced by Cuffdiff but cannot be downloaded in Galaxy. Is this the problem, and if so, how can Galaxy be used to generate data with (essential) replication? Are the p and q significance values reported in the output files a result of replicate analysis?
>
> I have tried to ask this question in several different forums without success. The responses I've gotten suggest it's a Galaxy issue rather than either Cuffdiff or Cummerbund. I'm hoping someone here can help answer my questions.
>
> Hopeful,
> Mike
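A quick way to see whether a downloaded set of Cuffdiff outputs contains any replicate-level files is sketched below in Python. The file names are an assumption based on what command-line Cuffdiff 2.x writes to its output directory; whether Cummerbund needs all of them is exactly the open question in this thread, so the list should be treated as a starting point rather than a requirement.

```python
import tempfile
from pathlib import Path

# Assumed names of the replicate-level files written by command-line Cuffdiff 2.x.
REPLICATE_FILES = [
    "read_groups.info",
    "genes.read_group_tracking",
    "isoforms.read_group_tracking",
    "tss_groups.read_group_tracking",
    "cds.read_group_tracking",
]


def missing_replicate_files(output_dir):
    """Return the assumed replicate-level file names not present in output_dir."""
    directory = Path(output_dir)
    return [name for name in REPLICATE_FILES if not (directory / name).is_file()]


# Demonstration on a throwaway directory holding only one of the files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "genes.read_group_tracking").write_text("")
    missing = missing_replicate_files(tmp)
print(missing)
```

An empty result would mean the replicate-level files are at least present; it says nothing about whether their contents are what Cummerbund expects.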
[galaxy-user] Cuffdiff-cummerbund with biological replicates problem
I'm trying to run Cuffdiff on a set of 10 human samples with biological replication, then download the results for further analyses in Cummerbund (v2.1.1). It seems like a standard workflow but I cannot get Cummerbund to acknowledge replicates.

I download and rename the 11 Cuffdiff output files to the names expected by Cummerbund. Cummerbund builds a CuffSet with no warnings and most analyses work as expected. The problem comes any time I try to see the results of replication. For example, in Cummerbund, replicates() returns an empty set and any type of plot returns an error when replicates=T is included as an argument.

There is no evidence of replication data in any of the 11 Cuffdiff output files. The data is presented with the group name only. From this, I conclude that the problem is with Cuffdiff, since there is no replicate data for Cummerbund to build into its db. I see that there are several read group files that are produced by Cuffdiff but cannot be downloaded in Galaxy. Is this the problem, and if so, how can Galaxy be used to generate data with (essential) replication? Are the p and q significance values reported in the output files a result of replicate analysis?

I have tried to ask this question in several different forums without success. The responses I've gotten suggest it's a Galaxy issue rather than either Cuffdiff or Cummerbund. I'm hoping someone here can help answer my questions.

Hopeful,
Mike
Re: [galaxy-user] Cuffdiff statistical calculations are inconsistent?
> The header of the Cuffdiff tool page says it is version 0.0.5

This version is the Galaxy tool wrapper version, not the tool version. (Yes, this is a usability issue.) You can find the tool version in the dataset's information panel by clicking on the 'i' icon.

> Is there a way, or setting, on Cuffdiff 2.0 to revert the parameters to be more similar to Cuffdiff 1.3?

This isn't a parameter issue. The Cuffdiff algorithm has changed substantially, and it's not clear to me if/how (or whether it's a good idea at all) to modify parameters to obtain 1.3-esque results.

Best,
J.
[galaxy-user] Cuffdiff statistical calculations are inconsistent?
Hi,

I'll preface my concern by saying that I'm a novice to Cufflinks. Back in September, I performed a Cuffdiff analysis comparing a wild-type and mutant condition. The analysis returned ~800 transcripts differentially regulated between the two with statistical significance. Recently, I've rerun the Cuffdiff analysis, using exactly the same files stored in Galaxy for all inputs and with all the same parameters, and only get a few dozen statistically significant hits. However, all of the data besides the p and q values are essentially identical between these two runs, so I am really unclear as to what is causing the difference.

Here is just one clear example:

From run 1:
YFR026C
FPKM 1 = 17.2434
FPKM 2 = 196.735
log2(fold change) = 3.51214
p = 1.64E-8
q = 7.33E-6
significant = yes

From run 2:
YFR026C
FPKM 1 = 14.4489
FPKM 2 = 144.939
log2(fold change) = 3.32641
p = 0.000170034
q = 0.0719964
significant = no

The second Cuffdiff analysis shows there is still a ~10-fold difference between conditions, but this is not statistically significant. Has the version of Cuffdiff on Galaxy been updated such that some parameters have changed, which could explain this difference? Or is there some setting I am missing that would cause very large changes to fail statistical significance testing?

Any help or input would be appreciated. I am really at a loss as to why executing what should be exactly the same task is giving vastly different results. I could just be overlooking something very fundamental that is obvious to someone with more experience with this program.

Thanks.
-Jenna Smith
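As a sanity check on the example above, the reported log2 fold changes can be recomputed directly from the two FPKM values. The short Python sketch below does only that arithmetic; the helper name is made up, and it says nothing about how Cuffdiff derives its p and q values.

```python
import math


def log2_fold_change(fpkm_1, fpkm_2):
    """log2 of the ratio condition 2 / condition 1."""
    return math.log2(fpkm_2 / fpkm_1)


# FPKM values for YFR026C as quoted in the message above.
run_1 = log2_fold_change(17.2434, 196.735)
run_2 = log2_fold_change(14.4489, 144.939)

print(round(run_1, 3), round(run_2, 3))
```

Both values agree with the figures quoted in the message (3.51214 and 3.32641), which supports the observation that the expression estimates are nearly unchanged between the two runs and that the difference lies in the significance testing.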
Re: [galaxy-user] Cuffdiff statistical calculations are inconsistent?
We are having the exact same issue, on the main server and our (recent) cloud instances. Were some of the hidden Cuffdiff parameters modified since fall 2012? Cheers, Mo Heydarian On Mar 13, 2013 11:02 AM, Jenna Smith jes...@case.edu wrote: Hi, I'll preface my concern by saying that I'm a novice to Cufflinks. Back in September, I performed a Cuffdiff analysis comparing a wild-type and mutant condition. The analysis returned ~800 transcripts differentially regulated between the two with statistical significance. Recently, I've rerun the Cuffdiff analysis - using exactly the same files stored in Galaxy for all inputs, and with all the same parameters - and only get a few dozen statistically significant hits. However, all of the data besides the p and q values are essentially identical between these two runs, so I am really unclear as to what is causing the difference. Here is just one clear example: From run 1: YFR026C FPKM 1 = 17.2434 FPKM 2 = 196.735 log2(fold change) = 3.51214 p = 1.64E-8 q = 7.33E-6 significant = yes From run 2: YFR026C FPKM 1 = 14.4489 FPKM 2 = 144.939 log2(fold change) = 3.32641 p = 0.000170034 q = 0.0719964 significant = no The second Cuffdiff analysis shows there is still a ~10-fold difference between conditions, but this is not statistically significant. Has the version of Cuffdiff on Galaxy been updated such that some parameters have changed, that could explain this difference? Or, is there some setting I am missing that would cause very large changes to fail statistical significance testing? Any help or input would be appreciated, I am really at a loss for why executing what should be exactly the same task is giving vastly different results. I could just be overlooking something very fundamental that is obvious to someone with more experience with this program. Thanks. -Jenna Smith ___ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. 
Please keep all replies on the list by using reply all in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list: http://lists.bx.psu.edu/listinfo/galaxy-dev To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-user] Cuffdiff statistical calculations are inconsistent?
This is likely due to the upgrade from Cufflinks 1.3.x to Cufflinks 2.0.x; Cufflinks 2.0 introduced a new algorithm for Cuffdiff in particular. You can read about these changes on the website: http://cufflinks.cbcb.umd.edu/ (and there's a manuscript describing the changes as well). You might consider writing to the tool authors directly for more details: tophat.cuffli...@gmail.com Of course, please consider sharing anything you learn with members of this list as well. Best, J. On Mar 13, 2013, at 12:06 PM, Mohammad Heydarian wrote: We are having the exact same issue, on the main server and our (recent) cloud instances. Were some of the hidden Cuffdiff parameters modified since fall 2012? Cheers, Mo Heydarian On Mar 13, 2013 11:02 AM, Jenna Smith jes...@case.edu wrote: Hi, I'll preface my concern by saying that I'm a novice to Cufflinks. Back in September, I performed a Cuffdiff analysis comparing a wild-type and mutant condition. The analysis returned ~800 transcripts differentially regulated between the two with statistical significance. Recently, I've rerun the Cuffdiff analysis - using exactly the same files stored in Galaxy for all inputs, and with all the same parameters - and only get a few dozen statistically significant hits. However, all of the data besides the p and q values are essentially identical between these two runs, so I am really unclear as to what is causing the difference. Here is just one clear example: From run 1: YFR026C FPKM 1 = 17.2434 FPKM 2 = 196.735 log2(fold change) = 3.51214 p = 1.64E-8 q = 7.33E-6 significant = yes From run 2: YFR026C FPKM 1 = 14.4489 FPKM 2 = 144.939 log2(fold change) = 3.32641 p = 0.000170034 q = 0.0719964 significant = no The second Cuffdiff analysis shows there is still a ~10-fold difference between conditions, but this is not statistically significant. Has the version of Cuffdiff on Galaxy been updated such that some parameters have changed, that could explain this difference?
Or, is there some setting I am missing that would cause very large changes to fail statistical significance testing? Any help or input would be appreciated, I am really at a loss for why executing what should be exactly the same task is giving vastly different results. I could just be overlooking something very fundamental that is obvious to someone with more experience with this program. Thanks. -Jenna Smith
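For anyone who wants to check the YFR026C numbers quoted in this thread, here is a minimal Python sketch. It recomputes the log2 fold change from the two FPKM values and compares the q value with an FDR threshold. The 0.05 threshold and the "q below FDR means significant" rule are assumptions made for illustration, and the function name is made up. It does not reproduce how Cuffdiff derives p or q; it only shows that the fold change barely moved between the runs while the q value ended up on the other side of the threshold.

```python
import math

# Assumed FDR threshold for calling a gene significant (illustrative only).
FDR = 0.05


def summarize(fpkm_1, fpkm_2, q_value, fdr=FDR):
    """Return (log2 fold change, significant?) for one gene.

    Hypothetical helper: 'significant' is assumed to mean q < fdr.
    """
    log2_fc = math.log2(fpkm_2 / fpkm_1)
    return log2_fc, q_value < fdr


# Values for YFR026C as quoted in the original message.
run_1 = summarize(17.2434, 196.735, 7.33e-6)
run_2 = summarize(14.4489, 144.939, 0.0719964)

for label, (log2_fc, significant) in (("run 1", run_1), ("run 2", run_2)):
    print(f"{label}: log2(fold change) = {log2_fc:.5f}, significant = {significant}")
```

Both runs give a fold change of roughly 10x, so the difference in the "significant" column comes entirely from the p and q values.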
Re: [galaxy-user] Cuffdiff tracking file does not report all genes and trancrips from reference annotation?
Hello Wei, The results do sound strange. The best advice to start with is to make sure that you are up-to-date with both Galaxy and the RNA-seq tools and using the best possible inputs. 1. Make sure that you are running the latest distribution http://wiki.galaxyproject.org/DevNewsBriefs 2. Update to use the current version of CuffDiff http://wiki.galaxyproject.org/Admin/Tools/Tool%20Dependencies 3. Use the iGenomes GTF annotation to make full use of the functionality in Cuffdiff http://cufflinks.cbcb.umd.edu/igenomes.html A good test to see if your set-up is correct would be to run the RNA-seq tutorial locally as a test case. http://main.g2.bx.psu.edu/u/jeremy/p/galaxy-rna-seq-analysis-exercise The reverse can also be done: if you have a problem locally, try running it as a small test (that still demonstrates the issue) on the Public Main server and see if the results can be duplicated. This can help determine if the problem is with data inputs/settings or with tool set-up/installation. It can also be a way to share your data with us if you need feedback. But, hopefully after updating the issue clears up! Jen Galaxy team On 11/29/12 11:51 AM, Wei Liao wrote: Hi, Galaxy user. I ran into a problem when using Cuffdiff 1.2.1 in a local Galaxy instance to check differentially expressed genes in my samples. I have 3 normal and 6 cancer samples, and I did the following: - After tophat for each sample, run Cufflinks with the refseq annotation, which has 25266 genes and 43091 transcripts - cuffmerge all Cufflinks outputs (the merged file contains 58112 lines) - run cuffdiff with the 3 normals as a triplicate and compare to each cancer sample. Surprisingly, I found out that the transcripts tracking, gene tracking and CDS tracking files only have 2000 genes and 4000 transcripts. So Cufflinks only compares 2000 genes and 4000 transcripts between samples.
The question I want to ask here is: *why are the rest of the genes and transcripts not being tested and included in the tracking files?* Do you know what causes this kind of problem? Thanks, Wei -- Wei Liao Research Scientist, Brentwood Biomedical Research Institute 16111 Plummer St. Bldg 7, Rm D-122 North Hills, CA 91343 818-891-7711 ext 7645 -- Jennifer Jackson http://galaxyproject.org
[galaxy-user] Cuffdiff tracking file does not report all genes and trancrips from reference annotation?
Hi, Galaxy user. I ran into a problem when using Cuffdiff 1.2.1 in a local Galaxy instance to check differentially expressed genes in my samples. I have 3 normal and 6 cancer samples, and I did the following: - After tophat for each sample, run Cufflinks with the refseq annotation, which has 25266 genes and 43091 transcripts - cuffmerge all Cufflinks outputs (the merged file contains 58112 lines) - run cuffdiff with the 3 normals as a triplicate and compare to each cancer sample. Surprisingly, I found out that the transcripts tracking, gene tracking and CDS tracking files only have 2000 genes and 4000 transcripts. So Cufflinks only compares 2000 genes and 4000 transcripts between samples. The question I want to ask here is: *why are the rest of the genes and transcripts not being tested and included in the tracking files?* Do you know what causes this kind of problem? Thanks, Wei -- Wei Liao Research Scientist, Brentwood Biomedical Research Institute 16111 Plummer St. Bldg 7, Rm D-122 North Hills, CA 91343 818-891-7711 ext 7645
[galaxy-user] Cuffdiff without gene annotation
Hello guys, I went through the RNAseq workflow (I didn't do Cuffmerge) and filtered some data from the Cuffdiff gene and transcript differential expression testing outputs. For example, for two samples I got about 400 genes and 900 transcripts differentially expressed with a fold change of 2. Since I am working with a fungus whose genome annotation is in a format (gff) not accepted by Cuffmerge or Cuffcompare in Galaxy (the accepted one is GTF2), the Cuffdiff output tells me only the position of the relevant genes on the scaffolds. Going to a genome browser to see which gene is in that position is fine for a few genes, but doing that for all 400 or 900 is probably impossible. Does anybody have a helpful suggestion on what I can do? It would be great if there was a program that, based on the position of the genes on the scaffold (Cuffdiff output), could get their information from the annotation file. I also have the gene annotation file in GenBank format (gbk) but I don't see a way to use it for what I need. Thanks Giuseppe Ianiri, Ph.D. Division of Cell Biology and Biophysics School of Biological Sciences 5100 Rockhill Road University of Missouri-Kansas City Kansas City, MO 64110 Email: iani...@umkc.edu
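One way to approach the lookup described above, outside Galaxy, is a small script that matches each Cuffdiff position against the gene records in the GFF file. The following Python sketch is only an illustration under several assumptions: the position is written as `scaffold:start-end`, the GFF has the usual nine tab-separated columns with `gene` in the third column, and both files use the same scaffold names. The function names are hypothetical, and possible off-by-one differences between coordinate conventions are ignored.

```python
from collections import defaultdict


def load_genes(gff_lines):
    """Index GFF 'gene' features by scaffold: {seqid: [(start, end, attributes)]}."""
    genes = defaultdict(list)
    for line in gff_lines:
        if line.startswith("#"):
            continue  # skip header and comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # skip blank or malformed lines
        seqid, _source, feature_type, start, end = fields[:5]
        if feature_type == "gene":
            genes[seqid].append((int(start), int(end), fields[8]))
    return genes


def genes_at(locus, genes):
    """Return the attribute strings of genes overlapping a 'scaffold:start-end' locus."""
    seqid, span = locus.rsplit(":", 1)
    start, end = (int(x) for x in span.split("-"))
    return [
        attributes
        for gene_start, gene_end, attributes in genes.get(seqid, [])
        if gene_start <= end and gene_end >= start  # intervals overlap
    ]


# Tiny made-up annotation, for illustration only.
example_gff = [
    "##gff-version 3",
    "scaffold_1\tsrc\tgene\t100\t500\t.\t+\t.\tID=gene1",
    "scaffold_1\tsrc\tgene\t900\t1200\t.\t-\t.\tID=gene2",
    "scaffold_2\tsrc\tgene\t100\t500\t.\t+\t.\tID=gene3",
]
index = load_genes(example_gff)
print(genes_at("scaffold_1:450-950", index))
```

With a real annotation, `load_genes(open("annotation.gff"))` would build the index once, and `genes_at` could then be called for every position in the filtered Cuffdiff table.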
[galaxy-user] cuffdiff
Hi, I got confused while trying to perform Cuffdiff for my RNA sequencing analysis. I have five different samples which were sequenced. I used tophat to create the bam files and Cufflinks to create the assembled transcripts. Then I used Cuffmerge to merge them into one file, and I wanted to do Cuffdiff with that merged file. What shall I choose for the "SAM or BAM file of aligned RNA-Seq" option? I have the 5 options from the 5 tophat actions on my 5 samples. All I want in the end is an excel table showing the number of hits from each sample (and not necessarily a comparison of them). Regards Kristis Vevis, PhD Student Cell Biology UCL Institute of Ophthalmology 11-43 Bath Street London EC1V 9EL, UK 020 7608 4067
Re: [galaxy-user] cuffdiff
Use the replicates option (yes, a bit of a misnomer) and put each Tophat run in its own group. This will produce a tabular file with FPKM for each group/run. Best, J. On Nov 12, 2012, at 10:05 AM, Vevis, Christis wrote: Hi, I got confused while trying to perform Cuffdiff for my RNA sequencing analysis. I have five different samples which were sequenced. I used tophat to create the bam files and Cufflinks to create the assembled transcripts. Then I used Cuffmerge to merge them into one file, and I wanted to do Cuffdiff with that merged file. What shall I choose for the "SAM or BAM file of aligned RNA-Seq" option? I have the 5 options from the 5 tophat actions on my 5 samples. All I want in the end is an excel table showing the number of hits from each sample (and not necessarily a comparison of them). Regards Kristis Vevis, PhD Student Cell Biology UCL Institute of Ophthalmology 11-43 Bath Street London EC1V 9EL, UK 020 7608 4067
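Once that tabular file is downloaded, the per-group FPKM columns can be pulled out into a smaller table for a spreadsheet. The Python sketch below is an illustration only: it assumes a tab-separated file with a `tracking_id` column and one `<group>_FPKM` column per group, which should be checked against the header of your own output. The function name and the sample text are made up.

```python
import csv
import io


def fpkm_table(tracking_text):
    """Return (header, rows) keeping tracking_id and every column ending in _FPKM."""
    reader = csv.DictReader(
        io.StringIO(tracking_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    fpkm_columns = [name for name in reader.fieldnames if name.endswith("_FPKM")]
    header = ["tracking_id"] + fpkm_columns
    rows = [[record[name] for name in header] for record in reader]
    return header, rows


# Made-up two-group example; real files have more columns and rows.
example = (
    "tracking_id\tlocus\tq1_FPKM\tq1_status\tq2_FPKM\tq2_status\n"
    "geneA\tchr1:1-100\t5.4\tOK\t20.9\tOK\n"
)
header, rows = fpkm_table(example)
print(header)
print(rows)
```

Writing `header` and `rows` back out with `csv.writer` gives a file that opens directly in Excel.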
Re: [galaxy-user] cuffdiff values different for same sample
Hello, Thank you for sharing your history. The difference in FPKM values can be explained by the use of the -N option (Perform quartile normalization: Yes). Set this to No to avoid the variable per-run normalization. This has also been discussed at seqanswers.com: http://seqanswers.com/forums/showthread.php?t=4606 Hopefully this helps! Jen Galaxy team On 10/12/12 9:43 AM, i b wrote: Hello forum, I have ran cuffdiff with two samples, treated vs untreated, and had certain values as expressed in the output (isoform differential expression). When I ran it again, using a different treated sample, but SAME UNTREATED SAMPLE, the values assigned to the untreated are different. Why is this if values are calculated from a unique FPKM? This happend for other jobs where I have ran the same untreated vs other treated samples... Thanks a lot, ib ___ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using reply all in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list: http://lists.bx.psu.edu/listinfo/galaxy-dev To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/ -- Jennifer Jackson http://galaxyproject.org ___ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using reply all in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list: http://lists.bx.psu.edu/listinfo/galaxy-dev To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-user] cuffdiff fpkm=0 not significant
ib, Look at the status column. I suspect that for the example you gave the status is HIDATA. Cuffdiff considers the expression to be very high, and no statistical testing will have been done for the gene. fpkm 2=0 could be misleading, as it may not actually be 0. I have encountered a data set where several genes were expected to be highly expressed in all samples, but one of the fpkm values was given as 0 for all of them. Hope this helps. Haiping -- JHMI Deep Sequencing and Microarray Core Facility at BRB - Your source for quality service on Microarray studies, NextGen Sequencing, and Data Analysis. http://www.microarray.jhmi.edu/ -- Haiping Hao Ph.D. Associate Director Johns Hopkins University Deep Sequencing and Microarray Core Edward D. Miller Research Building 733 N. Broadway, Rm 359 Baltimore, MD 21205 h...@jhmi.edu Phone: 443-287-9056 Fax: 410-502- On 10/15/2012 5:36 PM, i b wrote: Dear forum, whoever knows this please tell me! Given two fpkm, e.g. fpkm 1=2922828 and fpkm 2=0, why does cuffdiff not calculate differential expression? Those genes are listed in cuffdiff as not significant and the log2 fold change is infinite, either negative or positive... How do I consider them when I want to pull out the genes that are differentially expressed between treated and untreated samples? How do I interpret these data? thanks, ib
Re: [galaxy-user] cuffdiff values different for same sample
Hello, Would you be able to share a history containing these data? Use Options (gear icon) - Share or Publish, generate the share link, then copy and paste that into a reply email sent to me directly. Please note the dataset #'s for the Cuffdiff runs that you are comparing and make sure that all inputs are undeleted. Best, Jen Galaxy team On 10/12/12 9:43 AM, i b wrote: Hello forum, I have ran cuffdiff with two samples, treated vs untreated, and had certain values as expressed in the output (isoform differential expression). When I ran it again, using a different treated sample, but SAME UNTREATED SAMPLE, the values assigned to the untreated are different. Why is this if values are calculated from a unique FPKM? This happend for other jobs where I have ran the same untreated vs other treated samples... Thanks a lot, ib ___ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using reply all in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list: http://lists.bx.psu.edu/listinfo/galaxy-dev To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/ -- Jennifer Jackson http://galaxyproject.org ___ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using reply all in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list: http://lists.bx.psu.edu/listinfo/galaxy-dev To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
[galaxy-user] cuffdiff values different for same sample
Hello forum, I have run cuffdiff with two samples, treated vs untreated, and got certain expression values in the output (isoform differential expression). When I ran it again, using a different treated sample but the SAME UNTREATED SAMPLE, the values assigned to the untreated sample were different. Why is this, if the values are calculated from a unique FPKM? This happened for other jobs where I ran the same untreated sample vs other treated samples... Thanks a lot, ib
Re: [galaxy-user] Cuffdiff no without replicates
On Wed, Oct 3, 2012 at 9:11 PM, i b ibse...@gmail.com wrote: Dear all, how reliable is running Cuffdiff without replicates, e.g. one sample against another one? Does using replicates make any difference statistically? Seqanswers might be a better place to ask this very interesting technical question that goes way beyond Galaxy... My 2c: Statistically speaking, sequencing and biology are both noisy. Replicates provide information about non-experimental (technical and biological) variation. That variation is usually not the variation you are looking for, but if you want to remove it, you have to model it and that requires information from replicates (or really good guesswork). In some situations (e.g. extreme experimental conditions), I'm sure you'll find biologically meaningful signal without them but in my experience, they can really help to decrease non-experimental noise, particularly where the experimental condition induces only subtle changes in transcript abundance. You could always analyse a data set with replicates and compare the results with and without those replicates yourself to see what happens - it would be a nice paper I'm sure. Thanks, ib
Re: [galaxy-user] Cuffdiff no without replicates
On Wed, Oct 3, 2012 at 7:35 AM, Ross ross.laza...@gmail.com wrote: On Wed, Oct 3, 2012 at 9:11 PM, i b ibse...@gmail.com wrote: Dear all, how reliable is running Cuffdiff without replicates, e.g. one sample against another one? Does using replicates make any difference statistically? Seqanswers might be a better place to ask this very interesting technical question that goes way beyond Galaxy... My 2c: Statistically speaking, sequencing and biology are both noisy. Replicates provide information about non-experimental (technical and biological) variation. That variation is usually not the variation you are looking for, but if you want to remove it, you have to model it and that requires information from replicates (or really good guesswork). In some situations (e.g. extreme experimental conditions), I'm sure you'll find biologically meaningful signal without them but in my experience, they can really help to decrease non-experimental noise, particularly where the experimental condition induces only subtle changes in transcript abundance. You could always analyse a data set with replicates and compare the results with and without those replicates yourself to see what happens - it would be a nice paper I'm sure. A bit off-topic, but you might take a look here: http://www.ncbi.nlm.nih.gov/pubmed/21747377 In short, one needs replication in biology, regardless of the technology used. In particular, one would never suggest running a microarray experiment without replicates; one should follow approximately the same rules for sequencing (and sequence data analysis). Sean Thanks, ib
[galaxy-user] cuffdiff sample values assigned
On 8/21/12 4:33 AM, i b wrote: Thanks Jen, useful link. But I did not understand one thing. I have the following FPKM in cufflinks for two samples: s1 (untreated): 1234106 s2 (treated): 159713 cuffdiff of the two samples gives me the following values: value_1: 5.4 value_2: 20.9 and it is not significant (!). My two questions: 1. How is this not significant? 2. What is the relation between the high fpkm in cufflinks and the low values in cuffdiff? I read the manual: is this part of the statistical method adopted? E.g. are these numbers (cuffdiff values) derived from the formula adopted? thanks a lot, ib On Thu, Aug 16, 2012 at 11:26 PM, Jennifer Jackson j...@bx.psu.edu wrote: Hello, A very similar question came up a few days ago and Jeremy had some good advice for how to approach learning to interpret this data: http://lists.bx.psu.edu/pipermail/galaxy-user/2012-August/004985.html Best, Jen Galaxy team On 8/15/12 8:49 AM, i b wrote: Dear all, in cuffdiff outputs, e.g. transcript differential expression, I find for example: value_1 value_2 log2(fold_change) 7.77183 0 -1.79769e+308 or value_1 value_2 log2(fold_change) 0 14.5972 1.79769e+308 for many, many rows. If I sort my data in excel by the fold change column (big to small), all the rows with -1.79769e+308 or +1.79769e+308 are at the top. How can I be sure that the ones at the top are really the most up-regulated or down-regulated transcripts if I don't know the real value of one of the two samples (is 0 really zero?)? I was told that the zero in one of the two samples is a very small number and Cuffdiff simply writes 0, but that it is not absolutely zero, otherwise it would not be possible to have -1.79769e+308 or 1.79769e+308. Could you please tell me how I can extract the highest fold changes (up- and down-regulated)? Or is sorting by log fold change correct? Thanks, ib
Please keep all replies on the list by using reply all in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list: http://lists.bx.psu.edu/listinfo/galaxy-dev To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
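On the sorting question in this thread: 1.79769e+308 is the largest value a double-precision number can hold, which suggests it is a placeholder for an infinite log2 fold change (one of the two values is 0) and not a measured ratio. A minimal Python sketch of one way to deal with it is to rank the finite fold changes and list the "zero in one sample" rows separately. The function names are made up, and whether a reported 0 is a true zero cannot be decided from these columns alone.

```python
import math


def log2_fc(value_1, value_2):
    """log2(value_2 / value_1); infinite if one side is 0, NaN if both are 0."""
    if value_1 == 0 and value_2 == 0:
        return math.nan
    if value_1 == 0:
        return math.inf
    if value_2 == 0:
        return -math.inf
    return math.log2(value_2 / value_1)


def rank(rows):
    """Split (name, value_1, value_2) rows into ranked finite and non-finite lists."""
    finite, non_finite = [], []
    for name, value_1, value_2 in rows:
        fold_change = log2_fc(value_1, value_2)
        target = finite if math.isfinite(fold_change) else non_finite
        target.append((name, fold_change))
    # Largest absolute change first, regardless of direction.
    finite.sort(key=lambda item: abs(item[1]), reverse=True)
    return finite, non_finite


# First three rows use values quoted in the thread; the names and row "d" are made up.
rows = [("a", 7.77183, 0.0), ("b", 0.0, 14.5972), ("c", 5.4, 20.9), ("d", 10.0, 5.0)]
finite, non_finite = rank(rows)
print(finite)
print(non_finite)
```

The non-finite rows can then be judged on the non-zero value and the test status, instead of being mixed in with the ranked fold changes.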
[galaxy-user] Cuffdiff errors
Hello, I am having a problem running Cuffdiff on some RNA-seq data. I want to compare 2 samples (A and B). I did Cufflinks and Cuffmerge before running Cuffdiff. I ran Cuffdiff with the following inputs: the Cuffmerge output plus the Bowtie mappings for A and B (sorted, as required by Cufflinks, after mapping with Bowtie). But I got the following error message: An error occurred running this job: cuffdiff v1.3.0 (3022) cuffdiff --no-update-check -q -p 8 -c 10 --FDR 0.05 /galaxy/main_pool/pool4/files/004/800/dataset_4800173.dat /galaxy/main_pool/pool3/files/004/799/dataset_4799827.dat /galaxy/main_pool/pool4/files/004/799/dataset_4799831.dat What did I do wrong? Thanks very much for your help! Yan
Re: [galaxy-user] Cuffdiff errors
Hi Yan, Would you please submit this as a bug report? It helps if you leave all inputs undeleted in your history. Instructions: http://wiki.g2.bx.psu.edu/Support#Reporting_tool_errors Thanks! Jen Galaxy team On 8/16/12 6:18 AM, Yan He wrote: Hello, I am having a problem running Cuffdiff on some RNA-seq data. I want to compare 2 samples (A and B). I did Cufflinks and Cuffmerge before running Cuffdiff. I ran Cuffdiff with the following options: Cuffmerge + Bowtie A, B (sorted required by Cufflinks after mapped with Bowtie). But I got the following error message: An error occurred running this job: cuffdiff v1.3.0 (3022) cuffdiff --no-update-check -q -p 8 -c 10 --FDR 0.05 /galaxy/main_pool/pool4/files/004/800/dataset_4800173.dat /galaxy/main_pool/pool3/files/004/799/dataset_4799827.dat /galaxy/main_pool/pool4/files/004/799/dataset_4799831.dat Where did I do wrong? Thanks very much for your help! Yan -- Jennifer Jackson http://galaxyproject.org
[galaxy-user] cuffdiff with three groups
Dear all,
I ran Cuffdiff with 3 groups, A, B, and C, with 2, 5, and 1 replicates respectively. When looking at the transcript differential expression testing output, I see only samples A and B and their respective values. What happened to sample C? Thanks for any help.
ib
[galaxy-user] cuffdiff: same gene listed with different FPKM values
Dear all,
Has anything like this happened to you? I compared two samples with Cuffdiff, and when I look at the values for the differentially expressed genes, the same gene is listed multiple times with different values. E.g.:
sample1  sample2  gene
71.6837  9.76435  NM_005514
87.6456  27.3965  NM_005514
115.333  4.81687  NM_005514
38.1879  5.2753  NM_005514
69.4197  5.84387  NM_005514
112.964  3.89226  NM_005514
What does this mean? And how do I know which one represents the real expression level for that gene in the two samples?
Thanks,
ib
Re: [galaxy-user] cuffdiff results missing
Hello Irene,
This issue is similar to the original. The input GTF for this run (dataset #15) has tss_id populated, but not p_id. The p_id attribute is required for the CDS calculations. http://cufflinks.cbcb.umd.edu/manual.html#cuffdiff_input
(quote) Cuffdiff Input:
Attribute: p_id
Description: The ID of the coding sequence this transcript contains. This attribute is attached by Cuffcompare to the .combined.gtf records only when it is run with a reference annotation that include CDS records. Further, differential CDS analysis is only performed when all isoforms of a gene have p_id attributes, because neither Cufflinks nor Cuffcompare attempt to assign an open reading frame to transcripts.
Dataset #15 was created from a CuffMerge run (which runs Cuffcompare as a component). Examining the selections used (clicking on the blue arrow rerun icon) shows that the option "Use Sequence Data:" was set to No. Changing this to Yes and setting "Choose the source for the reference list:" to "Locally cached" (and double checking that all inputs are assigned to hg19) will assign p_id. Note that this will be true only for those transcripts that are associated with reference annotation transcripts containing coding regions (in your data: 'nearest_ref NM_X', not NR_X. NR_ human RefSeq transcripts are non-coding).
Galaxy's CuffMerge tool form has this option labeled:
(quote) Use Sequence Data: Use sequence data for some optional classification functions, including the addition of the p_id attribute required by Cuffdiff.
Thanks!
Jen
Galaxy team
On 7/21/12 11:04 PM, i b wrote:
Hi, I ran Cuffdiff again using the Cuffmerge output as the GTF. The following datasets were empty:
128: Cuffdiff on data 12, data 14, and others: CDS FPKM tracking
127: Cuffdiff on data 12, data 14, and others: CDS FPKM differential expression testing
126: Cuffdiff on data 12, data 14, and others: CDS overloading diffential expression testing
The others have data downloadable as Excel. Any explanation?
Thanks, ib
On Fri, Jul 20, 2012 at 12:10 AM, Jennifer Jackson j...@bx.psu.edu wrote:
Hello Irene, Yes, this can be the result if your source GTF data did not have the full complement of attributes needed by Cuffdiff to perform these calculations. The primary tool documentation covers this information here: http://cufflinks.cbcb.umd.edu/manual.html#fpkm_track
The iGenomes datasets are a popular choice for this reason. A version of UCSC RefGenes is available for certain genomes. Please see: http://cufflinks.cbcb.umd.edu/igenomes.html (scroll down on page in some browsers to find table)
Galaxy has one of these already loaded, mm9 genes.gtf, in the Shared Data - Shared Libraries section on the public Main server. More iGenomes .gtf files will likely be added here, sometime after the GCC2012 conference. For now, download locally to your own system/desktop, uncompress, and just load the GTF file to Galaxy. (Consider FTP for larger datasets: http://wiki.g2.bx.psu.edu/FTPUpload)
More resources include the author-supported help email at tophat.cuffli...@gmail.com and seqanswers.com (where the authors often post).
Hopefully this helps, Jen Galaxy team
On 7/19/12 1:38 PM, i b wrote:
Hi, I ran Cuffdiff using RefSeq genes as the GTF and two groups of BAM files. Group one has two replicates (treated) and group two only one replicate (untreated). When looking at the outputs, the following are empty (1 line):
TSS group FPKM tracking
TSS groups differential expression testing
CDS FPKM tracking
CDS FPKM differential expression testing
CDS overloading diffential expression testing
promoters differential expression testing
splicing differential expression testing
The other four outputs have data downloadable as Excel. Is this normal?
Thanks, ib
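One way to check for the situation described in this thread before running Cuffdiff is to count how many GTF records carry tss_id and p_id. The sketch below is only an illustration: the function name and the two example records are hypothetical, and it assumes a standard nine-column, tab-separated GTF whose last column holds key "value"; pairs.

```python
import re
from collections import Counter

# Matches key "value" pairs in the ninth GTF column, e.g. p_id "P1";
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')


def count_gtf_attributes(lines, wanted=("tss_id", "p_id")):
    """Return (record count, Counter of records carrying each wanted attribute)."""
    total = 0
    seen = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        total += 1
        keys = {key for key, _ in ATTR_RE.findall(fields[8])}
        for name in wanted:
            if name in keys:
                seen[name] += 1
    return total, seen


# Hypothetical records: the first has both attributes, the second lacks p_id.
example = [
    'chr1\tCufflinks\texon\t100\t200\t.\t+\t.\t'
    'gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; tss_id "TSS1"; p_id "P1";\n',
    'chr1\tCufflinks\texon\t300\t400\t.\t+\t.\t'
    'gene_id "XLOC_000002"; transcript_id "TCONS_00000002"; tss_id "TSS2";\n',
]
total, seen = count_gtf_attributes(example)
print(total, seen["tss_id"], seen["p_id"])
```

With a real file, the same function can be given an open file handle instead of the example list. If the p_id count is zero, the CDS outputs would be expected to come back empty, as in the reply above.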
Re: [galaxy-user] cuffdiff failed
Hello Irene,
There appears to be a problem with the information entered into the tool form for the labels (e.g. Group name). The command string only shows one group label value when there are two group data sets. You submitted a bug report for this same issue, so I will take a look there at the exact error (it is usually very specific about the problem) and at the exact settings entered into the tool form, and provide feedback there.
Thanks,
Jen
Galaxy team
On 7/20/12 9:19 AM, i b wrote:
Hi, I had 3 samples: 2 treated replicates (A, B) and one untreated (C). I ran Cufflinks and Cuffmerge (on all 3 Cufflinks outputs). I ran Cuffdiff with the following inputs: the Cuffmerge output plus the TopHat alignments from A, B, and C, in 2 groups (group one: A, B; group two: C). I got the following message:
0 bytes An error occurred running this job: cuffdiff v1.3.0 (3022) cuffdiff --no-update-check -q -p 8 -c 10 --FDR 0.05 -N -b /galaxy/data/hg19/sam_index/hg19.fa --labels + /galaxy/main_pool/pool4/files/004/645/dataset_4645857.dat /galaxy/main_pool/pool3/files/004/623/dataset_4623286.dat,/galaxy
What did I do wrong? What does it mean?
Cheers, ib
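The mismatch described in the reply (one label for two groups) can be caught with a small check before submitting a job. This is a minimal sketch, not part of Galaxy or Cuffdiff: the function name and file names are hypothetical, and it assumes only that --labels takes one comma-separated list of condition names and that each condition is one group of replicate files.

```python
def check_group_labels(labels, groups):
    """Return a list of problems; an empty list means the labels look consistent.

    labels: the comma-separated string intended for --labels
    groups: one list of alignment files per condition
    """
    names = [name.strip() for name in labels.split(",")]
    problems = []
    if len(names) != len(groups):
        problems.append(f"{len(names)} label(s) for {len(groups)} group(s)")
    if any(not name for name in names):
        problems.append("empty label")
    if len(set(names)) != len(names):
        problems.append("duplicate label")
    return problems


# Hypothetical file names: two treated replicates and one untreated sample.
groups = [["A.bam", "B.bam"], ["C.bam"]]
print(check_group_labels("treated", groups))
print(check_group_labels("treated,untreated", groups))
```

The first call reports the one-label, two-group mismatch; the second returns an empty list.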
[galaxy-user] cuffdiff results missing
Hi,
I ran Cuffdiff using RefSeq genes as the GTF and two groups of BAM files. Group one has two replicates (treated) and group two only one replicate (untreated). When looking at the outputs, the following are empty (1 line):
TSS group FPKM tracking
TSS groups differential expression testing
CDS FPKM tracking
CDS FPKM differential expression testing
CDS overloading diffential expression testing
promoters differential expression testing
splicing differential expression testing
The other four outputs have data downloadable as Excel. Is this normal?
Thanks,
ib
Re: [galaxy-user] cuffdiff results missing
Hello Irene,
Yes, this can be the result if your source GTF data did not have the full complement of attributes needed by Cuffdiff to perform these calculations. The primary tool documentation covers this information here: http://cufflinks.cbcb.umd.edu/manual.html#fpkm_track
The iGenomes datasets are a popular choice for this reason. A version of UCSC RefGenes is available for certain genomes. Please see: http://cufflinks.cbcb.umd.edu/igenomes.html (scroll down on page in some browsers to find table)
Galaxy has one of these already loaded, mm9 genes.gtf, in the Shared Data - Shared Libraries section on the public Main server. More iGenomes .gtf files will likely be added here, sometime after the GCC2012 conference. For now, download locally to your own system/desktop, uncompress, and just load the GTF file to Galaxy. (Consider FTP for larger datasets: http://wiki.g2.bx.psu.edu/FTPUpload)
More resources include the author-supported help email at tophat.cuffli...@gmail.com and seqanswers.com (where the authors often post).
Hopefully this helps,
Jen
Galaxy team
On 7/19/12 1:38 PM, i b wrote:
Hi, I ran Cuffdiff using RefSeq genes as the GTF and two groups of BAM files. Group one has two replicates (treated) and group two only one replicate (untreated). When looking at the outputs, the following are empty (1 line):
TSS group FPKM tracking
TSS groups differential expression testing
CDS FPKM tracking
CDS FPKM differential expression testing
CDS overloading diffential expression testing
promoters differential expression testing
splicing differential expression testing
The other four outputs have data downloadable as Excel. Is this normal?
Thanks, ib
[galaxy-user] cuffdiff
Hi,
Just a few questions about Cuffdiff, if anyone can answer:
1. How can I load more than two SAM/BAM files? Galaxy gives space for only two files.
2. What should I use as input: the Cufflinks, Cuffcompare, or Cuffmerge output?
Thanks a lot!
Re: [galaxy-user] Cuffdiff
Hi Ateequr,
This post from today has information another member found at seqanswers.com, directly from the CuffLinks/Merge/Diff tool author: http://user.list.galaxyproject.org/Re-1-cuffcompare-or-cuffmerge-td4581029.html
Best,
Jen
Galaxy team
On 4/17/12 8:00 AM, Ateequr Rehman wrote:
Dear All, I have a simple question about Cuffdiff. Should we run Cuffdiff on the merged transcript file (produced by Cuffmerge) and the concatenated data sets, or directly on the files produced by Cufflinks? In the latter case, I have two transcript files, resulting from Cufflinks on samples 1 and 2 respectively, and the results using the sample 1 transcripts are not the same as when I use the sample 2 transcripts. I am a bit confused about what the correct way is. Any help is very much welcome.
Best, Ateeq
Ateequr Rehman, House No. 2 ground floor, Blauenstr. 10, 79115 Freiburg im Breisgau
[galaxy-user] Cuffdiff
Dear All,
I have a simple question about Cuffdiff. Should we run Cuffdiff on the merged transcript file (produced by Cuffmerge) and the concatenated data sets, or directly on the files produced by Cufflinks? In the latter case, I have two transcript files, resulting from Cufflinks on samples 1 and 2 respectively, and the results using the sample 1 transcripts are not the same as when I use the sample 2 transcripts. I am a bit confused about what the correct way is. Any help is very much welcome.
Best,
Ateeq
Ateequr Rehman, House No. 2 ground floor, Blauenstr. 10, 79115 Freiburg im Breisgau
Re: [galaxy-user] Cuffdiff result P and q values
Hello Ateeq,
It looks like you are working with a bacterial genome. There has been some limited discussion on the Galaxy mailing list about using RNA-seq tools with circular genomes, but the best resources are probably the tool documentation itself (e.g. http://cufflinks.cbcb.umd.edu/manual.html#gene_exp_diff and http://cufflinks.cbcb.umd.edu/howitworks.html#hdif), the tool author's Q/A email tophat.cuffli...@gmail.com, and seqanswers.com.
From a quick check, it seems that the 'not significant' result is due to the value of Test status being NOTEST. Definition in the documentation link above: NOTEST (not enough alignments for testing)
Hopefully this helps,
Best,
Jen
Galaxy team
On 3/19/12 11:03 AM, Ateequr Rehman wrote:
Dear Galaxy users, After running Cuffdiff on my two samples (SAM files from Bowtie) I got a list with p and q values, and the last column reports significance. Judging by the p value, the comparison should be significant, but the q value is 1 and the last column says it is not significant. Does anyone have an idea how to interpret this? Should we take any comparison with a p value below 0.05 as significant or not? The table in Excel looks like this. Any help is welcome.
Best, Ateeq
test_id  gene_id  gene  locus  sample_1  sample_2  status  value_1  value_2  log2(fold_change)  test_stat  p_value  q_value  significant
CUFF.428.1  CUFF.428  -  gi|190572091|ref|NC_010943.1|:1575813-1577629  q1  q2  NOTEST  171.773  605.136  -150.518  588.996  3,86E-05  1  no
CUFF.462.1  CUFF.462  -  gi|190572091|ref|NC_010943.1|:1680283-1681214  q1  q2  NOTEST  696.628  322.149  -111.266  538.062  7,42E-03  1  no
CUFF.635.1  CUFF.635  -  gi|190572091|ref|NC_010943.1|:2343969-2346219  q1  q2  NOTEST  396.469  223.951  -0.824027  476.902  1,85E-01  1  no
CUFF.512.1  CUFF.512  -  gi|190572091|ref|NC_010943.1|:1840464-1843486  q1  q2  NOTEST  136.314  70.604  -0.949109  422.322  2,41E-01  1  no
CUFF.632.1  CUFF.632  -  gi|190572091|ref|NC_010943.1|:2346561-2347408  q1  q2  NOTEST  351.508  167.567  -106.882  415.844  3,20E-01  1  no
CUFF.941.1  CUFF.941  -  gi|190572091|ref|NC_010943.1|:3664426-3665364  q1  q2  NOTEST  282.247  133.798  -10.769  412.254  3,75E-01  1  no
CUFF.616.1  CUFF.616  -  gi|190572091|ref|NC_010943.1|:2301552-2303180  q1  q2  NOTEST  169.682  744.885  -118.774  462.107  3,82E-01  1  no
CUFF.617.1  CUFF.617  -  gi|190572091|ref|NC_010943.1|:2295763-2297758  q1  q2  NOTEST  225.933  112.178  -101.011  454.517  5,49E-01  1  no
CUFF.9.1  CUFF.9  -  gi|190572091|ref|NC_010943.1|:41597-42402  q1  q2  OK  1729.08  2797.07  0.693913  -4.461  8,16E-01  0.000179474  yes
CUFF.956.1  CUFF.956  -  gi|190572091|ref|NC_010943.1|:3665445-3669232  q1  q2  NOTEST  518.525  323.653  -0.679966  444.565  8,76E-01  1  no
CUFF.549.1  CUFF.549  -  gi|190572091|ref|NC_010943.1|:2043111-2043664  q1  q2  OK  7148.23  11816.4  0.725138  -421.443  2,50E+00  0.000275446  yes
CUFF.872.1  CUFF.872  -  gi|190572091|ref|NC_010943.1|:3489557-3490326  q1  q2  NOTEST  220.274  840.662  -13.897  416.179  3,16E+00  1  no
CUFF.636.1  CUFF.636  -  gi|190572091|ref|NC_010943.1|:2348784-2352394  q1  q2  NOTEST  114.384  601.415  -0.927447  414.807  3,35E+00  1  no
CUFF.605.1  CUFF.605  -  gi|190572091|ref|NC_010943.1|:2271979-2275960  q1  q2  NOTEST  217.007  133.837  -0.697264  409.373  4,24E+00  1  no
CUFF.568.1  CUFF.568  -  gi|190572091|ref|NC_010943.1|:2160538-2164415  q1  q2  NOTEST  74.365  377.013  -0.980011  395.097  7,78E+00  1  no
CUFF.597.1  CUFF.597  -  gi|190572091|ref|NC_010943.1|:2250029-2250918  q1  q2  NOTEST  229.389  105.246  -112.403  386.937  0.000109116  1  no
Ateequr Rehman, House No. 2 ground floor, Blauenstr. 10, 79115 Freiburg im Breisgau
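Since rows with status NOTEST were never actually tested, one practical step is to set them aside before reading the p_value, q_value, and significant columns. The following is a minimal sketch under assumptions: the function name is hypothetical, and it assumes the differential expression output is tab-separated with the column names shown in the quoted table. Values are kept as strings because the Excel export above uses decimal commas.

```python
import csv
import io


def testable_rows(handle):
    """Yield rows of a Cuffdiff differential expression table whose status is OK."""
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["status"] == "OK":
            yield row


# Two rows from the quoted table, rebuilt here as tab-separated text.
example = io.StringIO(
    "test_id\tgene_id\tgene\tlocus\tsample_1\tsample_2\tstatus\tvalue_1\tvalue_2\t"
    "log2(fold_change)\ttest_stat\tp_value\tq_value\tsignificant\n"
    "CUFF.428.1\tCUFF.428\t-\tgi|190572091|ref|NC_010943.1|:1575813-1577629\tq1\tq2\tNOTEST\t"
    "171.773\t605.136\t-150.518\t588.996\t3,86E-05\t1\tno\n"
    "CUFF.9.1\tCUFF.9\t-\tgi|190572091|ref|NC_010943.1|:41597-42402\tq1\tq2\tOK\t"
    "1729.08\t2797.07\t0.693913\t-4.461\t8,16E-01\t0.000179474\tyes\n"
)
kept = [(row["test_id"], row["q_value"], row["significant"]) for row in testable_rows(example)]
print(kept)
```

Only the CUFF.9.1 row survives the filter here; the NOTEST row is dropped regardless of how small its p value looks.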
[galaxy-user] Cuffdiff result P and q values
Dear Galaxy users,
After running Cuffdiff on my two samples (SAM files from Bowtie) I got a list with p and q values, and the last column reports significance. Judging by the p value, the comparison should be significant, but the q value is 1 and the last column says it is not significant. Does anyone have an idea how to interpret this? Should we take any comparison with a p value below 0.05 as significant or not? The table in Excel looks like this. Any help is welcome.
Best,
Ateeq
test_id  gene_id  gene  locus  sample_1  sample_2  status  value_1  value_2  log2(fold_change)  test_stat  p_value  q_value  significant
CUFF.428.1  CUFF.428  -  gi|190572091|ref|NC_010943.1|:1575813-1577629  q1  q2  NOTEST  171.773  605.136  -150.518  588.996  3,86E-05  1  no
CUFF.462.1  CUFF.462  -  gi|190572091|ref|NC_010943.1|:1680283-1681214  q1  q2  NOTEST  696.628  322.149  -111.266  538.062  7,42E-03  1  no
CUFF.635.1  CUFF.635  -  gi|190572091|ref|NC_010943.1|:2343969-2346219  q1  q2  NOTEST  396.469  223.951  -0.824027  476.902  1,85E-01  1  no
CUFF.512.1  CUFF.512  -  gi|190572091|ref|NC_010943.1|:1840464-1843486  q1  q2  NOTEST  136.314  70.604  -0.949109  422.322  2,41E-01  1  no
CUFF.632.1  CUFF.632  -  gi|190572091|ref|NC_010943.1|:2346561-2347408  q1  q2  NOTEST  351.508  167.567  -106.882  415.844  3,20E-01  1  no
CUFF.941.1  CUFF.941  -  gi|190572091|ref|NC_010943.1|:3664426-3665364  q1  q2  NOTEST  282.247  133.798  -10.769  412.254  3,75E-01  1  no
CUFF.616.1  CUFF.616  -  gi|190572091|ref|NC_010943.1|:2301552-2303180  q1  q2  NOTEST  169.682  744.885  -118.774  462.107  3,82E-01  1  no
CUFF.617.1  CUFF.617  -  gi|190572091|ref|NC_010943.1|:2295763-2297758  q1  q2  NOTEST  225.933  112.178  -101.011  454.517  5,49E-01  1  no
CUFF.9.1  CUFF.9  -  gi|190572091|ref|NC_010943.1|:41597-42402  q1  q2  OK  1729.08  2797.07  0.693913  -4.461  8,16E-01  0.000179474  yes
CUFF.956.1  CUFF.956  -  gi|190572091|ref|NC_010943.1|:3665445-3669232  q1  q2  NOTEST  518.525  323.653  -0.679966  444.565  8,76E-01  1  no
CUFF.549.1  CUFF.549  -  gi|190572091|ref|NC_010943.1|:2043111-2043664  q1  q2  OK  7148.23  11816.4  0.725138  -421.443  2,50E+00  0.000275446  yes
CUFF.872.1  CUFF.872  -  gi|190572091|ref|NC_010943.1|:3489557-3490326  q1  q2  NOTEST  220.274  840.662  -13.897  416.179  3,16E+00  1  no
CUFF.636.1  CUFF.636  -  gi|190572091|ref|NC_010943.1|:2348784-2352394  q1  q2  NOTEST  114.384  601.415  -0.927447  414.807  3,35E+00  1  no
CUFF.605.1  CUFF.605  -  gi|190572091|ref|NC_010943.1|:2271979-2275960  q1  q2  NOTEST  217.007  133.837  -0.697264  409.373  4,24E+00  1  no
CUFF.568.1  CUFF.568  -  gi|190572091|ref|NC_010943.1|:2160538-2164415  q1  q2  NOTEST  74.365  377.013  -0.980011  395.097  7,78E+00  1  no
CUFF.597.1  CUFF.597  -  gi|190572091|ref|NC_010943.1|:2250029-2250918  q1  q2  NOTEST  229.389  105.246  -112.403  386.937  0.000109116  1  no
Ateequr Rehman, House No. 2 ground floor, Blauenstr. 10, 79115 Freiburg im Breisgau
[galaxy-user] Cuffdiff errors
Hello,
I am having a problem running Cuffdiff on some RNA-seq data. I want to compare 2 of my conditions. I successfully used Cuffdiff three days ago to compare two sets of data that were processed the exact same way (aligned with TopHat, with Picard used to confirm adequate alignment). I am using Galaxy through MAIN and I tried Cuffdiff with these samples on 1/18 and 1/19. Here's the error:
An error occurred running this job: cuffdiff v1.3.0 (3022) cuffdiff --no-update-check -q -p 8 -c 1000 --FDR 0.05 -b /galaxy/data/hg19/sam_index/hg19.fa --labels DMSO,E2 /galaxy/main_pool/pool2/files/003/607/dataset_3607369.dat /galaxy/main_pool/pool2/files/003/590/dataset_3590726.dat,/g
Thanks for the help.
--
Erin Shanle
Graduate Research Assistant, Molecular and Environmental Toxicology
425 McArdle Laboratory for Cancer Research, 1400 University Ave, Madison, WI 53706
608 262 9834
sha...@wisc.edu
Re: [galaxy-user] Cuffdiff errors
Hello Erin,
This was a temporary problem due to a new filesystem we installed during this time frame, which has since been resolved. Please try again, and if the problem persists, please send a bug report from the error dataset using the green bug icon. This allows us to gain access to the inputs and job parameters to diagnose the root cause of the problem.
http://wiki.g2.bx.psu.edu/Support#Error_from_tools
http://wiki.g2.bx.psu.edu/Support#Unexpected_scientific_result
Thank you!
Jen
Galaxy team
On 1/20/12 6:51 AM, Erin Shanle wrote:
Hello, I am having a problem running Cuffdiff on some RNA-seq data. I want to compare 2 of my conditions. I successfully used Cuffdiff three days ago to compare two sets of data that were processed the exact same way (aligned with TopHat, with Picard used to confirm adequate alignment). I am using Galaxy through MAIN and I tried Cuffdiff with these samples on 1/18 and 1/19. Here's the error:
An error occurred running this job: cuffdiff v1.3.0 (3022) cuffdiff --no-update-check -q -p 8 -c 1000 --FDR 0.05 -b /galaxy/data/hg19/sam_index/hg19.fa --labels DMSO,E2 /galaxy/main_pool/pool2/files/003/607/dataset_3607369.dat /galaxy/main_pool/pool2/files/003/590/dataset_3590726.dat,/g
Thanks for the help.
--
Erin Shanle
Graduate Research Assistant, Molecular and Environmental Toxicology
425 McArdle Laboratory for Cancer Research, 1400 University Ave, Madison, WI 53706
608 262 9834
sha...@wisc.edu
[galaxy-user] Cuffdiff question about using an unspecified (?) database/build
Hello!
I have an RNA-Seq project which consists of 5 samples from the species tree shrew. When uploading these fastq files into Galaxy, I chose unspecified (?) for the database/build since the latest tree shrew version is not in the drop down list. When using TopHat and Cufflinks/Compare I have selected a reference genome from my history instead of using a built-in index, as well as a gtf annotation file for Cufflinks/Compare, and everything has been working fine.
Now, I am at the Cuffdiff step and I am running into an error when setting it up to perform replicate analysis. When I select my TopHat accepted hits bam file I see a red X and the error: "Unspecified genome build, click the pencil icon in the history item to set the genome build." Here's a screenshot of what I'm seeing: [image001.png]
Since the latest reference genome for tree shrew wasn't listed, that's why I chose unspecified (?). Should I go back and edit these accepted hits bam files to say the Database/Build from the drop down list is Tree shrew Dec. 2006 (Broad/tupBel1) (tupBel1)? I know that this is simple to change, but will this affect my results in any way? Any help/info would be greatly appreciated.
Thanks,
David
Re: [galaxy-user] Cuffdiff question about using an unspecified (?) database/build
Jen,
Thank you very much for the reply. I'm glad to know it is a known bug and not something on my side of things. So, would my analysis be affected if I did change the bam file Database/Build to the older tree shrew version found in the drop down list? What significance does this Database/Build box have in downstream analysis if you have your own fasta reference genome file and gtf annotation file that is being referenced instead of a locally cached one? I'm just trying to obtain a better understanding of the Database/Build box for analyses where I provide the fasta and gtf file.
Thanks,
David
-Original Message-
From: Jennifer Jackson [mailto:j...@bx.psu.edu]
Sent: Friday, August 19, 2011 9:20 AM
To: David K Crossman
Cc: galaxy-user (galaxy-user@lists.bx.psu.edu)
Subject: Re: [galaxy-user] Cuffdiff question about using an unspecified (?) database/build
Hello David, This is a known bug. The correction is planned to be moved out onto the public Galaxy instance at the next update (within a week). Sorry for the current inconvenience, Best, Jen Galaxy team
On 8/19/11 7:00 AM, David K Crossman wrote:
Hello! I have an RNA-Seq project which consists of 5 samples from the species tree shrew. When uploading these fastq files into Galaxy, I chose unspecified (?) for the database/build since the latest tree shrew version is not in the drop down list. When using TopHat and Cufflinks/Compare I have selected a reference genome from my history instead of using a built-in index, as well as a gtf annotation file for Cufflinks/Compare, and everything has been working fine.
Now, I am at the Cuffdiff step and I am running into an error when setting it up to perform replicate analysis. When I select my TopHat accepted hits bam file I see a red X and the error: "Unspecified genome build, click the pencil icon in the history item to set the genome build." Here's a screenshot of what I'm seeing:
Since the latest reference genome for tree shrew wasn't listed, that's why I chose unspecified (?). Should I go back and edit these accepted hits bam files to say the Database/Build from the drop down list is Tree shrew Dec. 2006 (Broad/tupBel1) (tupBel1)? I know that this is simple to change, but will this affect my results in any way? Any help/info would be greatly appreciated.
Thanks, David
Re: [galaxy-user] Cuffdiff Question
Hello Kurinji,

> I was at your USC Galaxy seminar last week, which I found very helpful - thank you!

Glad to hear that you found the workshop helpful. As a reminder, please email questions about using Galaxy and its tools to the galaxy-user mailing list (which I've cc'd). You may get quicker and different responses from community members, and everyone will benefit from the discussion.

> I used my recently generated RNAseq data in Galaxy (which was pre-aligned using tophat and already had cufflinks run on it) - I ran cuffcompare with all the gtf files and then cuffdiff for the three pairs (there is 1 control and 3 different drug treatments - no replicates). I got several output files, as expected, but decided just to look at the gene differential expression as a start. Some questions I have are -
>
> 1. (very basic question!) which is sample 1 (and corresponding value 1) and sample 2 (and corresponding value 2) in my output file. This is what my output file is called -
>
> 90: Cuffdiff on data 37, data 38, and data 60: gene differential expression testing 33,969 lines
>
> Is 37 sample one or sample two? Given the data - I would expect sample 37 to correspond to value 2 - but I could be wrong. Please let me know!

The best way to figure out which dataset corresponds with Cuffdiff's labels is to click the rerun button in the dataset: sample names correspond directly to the reads datasets (i.e. BAM files) provided as input to Cuffdiff.

> 2. How do I find the UCSC gene names corresponding with start/end sites - I did input the hg18 UCSC gtf file as a reference

You'll need to use a reference annotation (GTF file) that has the gene_name attribute as input for Cufflinks/compare/diff. Typically Ensembl annotations have this attribute; however, you'll need to prepend 'chr' to each line (really, to each chromosome name) in order to bring Ensembl notation in line with UCSC/Galaxy notation.

> Actually, I noticed that value 1 in this particular output file is all 0 - no idea why. It is not this way in the other files, making me wonder if there is an error somewhere. I am sure the bam file is okay as I viewed it on IGV and saw the patterns I would expect for some candidate genes I looked at.

It's difficult for me to comment without seeing your analysis. Some output files depend on particular attributes being set correctly in the annotation file. You may want to search through our mailing list archives and see if your question has already been answered:

http://gmod.827538.n3.nabble.com/Galaxy-Users-f815892.html

Good luck,
J.
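For anyone who wants to check the sample labels and the all-zero value 1 column outside Galaxy, here is a minimal Python sketch. It assumes the gene differential expression file has been downloaded as a tab-separated file with a header row containing sample_1, sample_2, value_1 and value_2 columns; the rows below are made up for illustration and are not taken from the analysis discussed in this thread.

```python
import csv
import io


def summarize_diff(handle):
    """Collect the sample labels and count rows where value_1 / value_2 are zero."""
    reader = csv.DictReader(handle, delimiter="\t")
    labels = set()
    rows = zero_1 = zero_2 = 0
    for row in reader:
        rows += 1
        labels.add((row["sample_1"], row["sample_2"]))
        if float(row["value_1"]) == 0.0:
            zero_1 += 1
        if float(row["value_2"]) == 0.0:
            zero_2 += 1
    return {
        "rows": rows,
        "labels": sorted(labels),
        "zero_value_1": zero_1,
        "zero_value_2": zero_2,
    }


# Made-up rows for illustration only; a real file has more columns.
example = (
    "test_id\tgene\tlocus\tsample_1\tsample_2\tstatus\tvalue_1\tvalue_2\n"
    "GENE_A\tGENE_A\tchr1:100-200\tq1\tq2\tNOTEST\t0\t12.5\n"
    "GENE_B\tGENE_B\tchr1:300-400\tq1\tq2\tOK\t0\t3.1\n"
    "GENE_C\tGENE_C\tchr2:500-600\tq1\tq2\tOK\t4.2\t0\n"
)

summary = summarize_diff(io.StringIO(example))
print(summary)
```

To run it on a real download, pass an open file handle instead of the io.StringIO object. If zero_value_1 equals rows, every gene has value 1 equal to zero, which points to a problem upstream of the test itself rather than to the individual genes.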
Re: [galaxy-user] Cuffdiff Question
> Thanks for the reply. I tried to use the script provided on a previous galaxy thread for adding the chr on to the gtf file on the mac terminal but I keep getting this error -
>
> awk: can't open file ensembl.gtf source line number 1
>
> I am very new to using the terminal so please let me know if there is something basic that I am not doing right,

Try this Galaxy workflow:

http://main.g2.bx.psu.edu/u/jeremy/w/make-ensembl-gtf-compatible-with-cufflinks

It simply prepends 'chr' to the chromosome name, which is needed if you're using an Ensembl reference annotation and want to use it with Cufflinks/compare/diff in Galaxy.

Best,
J.
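For readers who would rather do the same thing locally, the following is a minimal Python sketch of the 'chr' prefixing step. It is not the workflow linked above or the awk script from the earlier thread, just an illustration under the assumption that the GTF is tab-separated with the chromosome name in the first column; the example records are made up. As an aside, the awk message quoted above usually means that no file named ensembl.gtf exists in the directory the command was run from, so checking the file name and the current directory is a reasonable first step.

```python
def prepend_chr(lines):
    """Yield GTF lines with 'chr' added to the chromosome name (first column)."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            yield line          # header comments and blank lines pass through
        elif line.startswith("chr"):
            yield line          # already in UCSC-style notation
        else:
            yield "chr" + line  # the chromosome name is the first field


# Made-up records for illustration only.
example = [
    "#!genome-build example\n",
    '18\tsource\texon\t148586\t148700\t.\t+\t.\tgene_id "G1"; gene_name "USP14";\n',
    'chr18\tsource\texon\t2719322\t2719400\t.\t+\t.\tgene_id "G2"; gene_name "SMCHD1";\n',
]
converted = list(prepend_chr(example))
print("".join(converted), end="")
```

A plain prefix turns the Ensembl mitochondrial name MT into chrMT, whereas UCSC uses chrM, so mitochondrial records may need separate handling depending on the reference genome in use.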
Re: [galaxy-user] CuffDiff gene fpkm tracking file.
> this is an example of my CuffDiff gene fpkm tracking file.
>
> tracking_id  class_code  nearest_ref_id  gene_short_name  tss_id                                           locus                  q1_FPKM  q1_conf_lo  q1_conf_hi  q2_FPKM  q2_conf_lo  q2_conf_hi
> XLOC_01      -           -               MT-ND5           -                                                chrM:0-16571           12484.2  12260.8     12707.7     11447    11233.1     11661
> XLOC_02      -           -               USP14            TSS1,TSS2,TSS3                                   chr18:148586-236453    16.7235  9.41244     24.0346     19.437   11.7368     27.1371
> XLOC_03      -           -               SMCHD1           TSS10,TSS11,TSS12,TSS4,TSS5,TSS6,TSS7,TSS8,TSS9  chr18:2719322-2728540  28.2493  17.5093     38.9892     27.2263  16.6263     37.8262
> XLOC_04      -           -               EMILIN2          TSS13,TSS14                                      chr18:2880607-2882469  3.98118  0           7.99721     4.62875  0.278519    8.97899
>
> If this is normal, how can I find the class code of transcript listed in the CuffDiff gene expression file?

Hi Samuele,

Without seeing your history, it's difficult to say for certain what your problem is. However, I'd guess that the GTF file that you're providing to Cuffdiff does not have the p_id attribute. You can produce a GTF file with both tss_id and p_id attributes by running Cuffcompare and using sequence data.

Thanks,
J.

___
galaxy-user mailing list
galaxy-user@lists.bx.psu.edu
http://lists.bx.psu.edu/listinfo/galaxy-user
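One way to test the guess about missing attributes before rerunning Cuffdiff is to count how many records in the GTF file carry tss_id and p_id. The Python sketch below is only an illustration: it assumes a standard nine-column, tab-separated GTF whose attribute values are double-quoted, and the two example records (including the TCONS_ identifiers) are invented.

```python
import re

# Matches attribute keys that are followed by a double-quoted value.
ATTR_KEY = re.compile(r'(\w+)\s+"[^"]*"')


def attribute_coverage(lines, keys=("tss_id", "p_id")):
    """Return (number of records, {key: number of records carrying that key})."""
    total = 0
    counts = {key: 0 for key in keys}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue                      # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue                      # not a complete GTF record
        total += 1
        present = set(ATTR_KEY.findall(fields[8]))
        for key in keys:
            if key in present:
                counts[key] += 1
    return total, counts


# Invented records for illustration only.
example = [
    'chr18\tCufflinks\texon\t148586\t148700\t.\t+\t.\t'
    'gene_id "XLOC_02"; transcript_id "TCONS_01"; tss_id "TSS1";\n',
    'chr18\tCufflinks\texon\t2719322\t2719400\t.\t+\t.\t'
    'gene_id "XLOC_03"; transcript_id "TCONS_02";\n',
]
total, counts = attribute_coverage(example)
print(total, counts)
```

If the count for p_id is zero across the whole file, that would be consistent with the explanation given above; a non-zero count would suggest looking elsewhere.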