Re: [galaxy-user] Tophat alignment statistics?
Hi Wei,

Have a look at RNA-SeQC, which provides more than what you specified here (https://confluence.broadinstitute.org/display/CGATools/RNA-SeQC). It generates a detailed report with all relevant metrics on your RNA data. I think integrating this Java-based tool into Galaxy should resolve your problem.

Hope this helps.
Raj

From: galaxy-user-boun...@lists.bx.psu.edu [mailto:galaxy-user-boun...@lists.bx.psu.edu] On Behalf Of Wei Liao
Sent: Friday, December 07, 2012 3:57 AM
To: galaxy-user@lists.bx.psu.edu
Subject: [galaxy-user] Tophat alignment statistics?

Hi, Galaxy users,

How can I get Tophat alignment statistics, such as the percentage of reads aligned to exons, introns, and splice junctions? Is there a log file available? How many unique and multiple alignments are there? I used Bam index, Flagstat, and Bam alignment metrics in Galaxy, but none of them reported the information I need.

--
Wei Liao
Research Scientist, Brentwood Biomedical Research Institute
16111 Plummer St. Bldg 7, Rm D-122
North Hills, CA 91343
818-891-7711 ext 7645

This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended solely for the use of the addressee(s). If you are not the intended recipient, please notify the sender by e-mail and delete the original message. Further, you are not to copy, disclose, or distribute this e-mail or its contents to any other person and any such actions that are unlawful. This e-mail may contain viruses. Ocimum Biosolutions has taken every reasonable precaution to minimize this risk, but is not liable for any damage you may sustain as a result of any virus in this e-mail. You should carry out your own virus checks before opening the e-mail or attachment. The information contained in this email and any attachments is confidential and may be subject to copyright or other intellectual property protection.
If you are not the intended recipient, you are not authorized to use or disclose this information, and we request that you notify us by reply mail or telephone and delete the original message from your mail system. OCIMUMBIO SOLUTIONS (P) LTD___ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list: http://lists.bx.psu.edu/listinfo/galaxy-dev To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
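For the "unique vs. multiple alignments" part of the question, a rough count can be made from the alignments themselves. The sketch below is only an illustration, not a Galaxy or TopHat feature: it assumes the BAM has been converted to SAM text and that each mapped record carries the optional NH:i tag (number of reported alignments for the read). The function name classify_reads is made up for this example, and it keeps one entry per read in memory, so it suits small files only.

```python
def classify_reads(sam_lines):
    """Classify mapped reads as unique or multiple using the NH:i tag.

    Assumption: the aligner wrote NH:i:<n> on mapped records. Reads
    without the tag are reported as 'unknown' rather than guessed.
    """
    nh_by_read = {}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:
            continue  # read is unmapped
        mate = flag & 0xC0  # distinguishes first/second mate of a pair
        nh = None
        for tag in fields[11:]:
            if tag.startswith("NH:i:"):
                nh = int(tag[5:])
                break
        # One entry per read (and mate), however many records it has.
        nh_by_read[(fields[0], mate)] = nh

    counts = {"unique": 0, "multiple": 0, "unknown": 0}
    for nh in nh_by_read.values():
        if nh is None:
            counts["unknown"] += 1
        elif nh == 1:
            counts["unique"] += 1
        else:
            counts["multiple"] += 1
    return counts
```

This does not answer the exon/intron/junction breakdown, which needs an annotation file and is what RNA-SeQC reports.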
[galaxy-user] Why doesn't Bowtie in Galaxy accept colorspace reads directly?
Hi All,

I'm wondering why the Bowtie version in Galaxy (even the latest) does NOT support .csfasta/.qual input files directly, even though this is mentioned under "Map with Bowtie for SOLiD". The same is true of "BWA for SOLiD". One would expect direct support for colorspace files. Do you have any plans to implement this? It would be a great help to SOLiD users.

I look forward to your comments.

Thanks,
Raj
[galaxy-user] Indexing files every time - Performance Issue
All,

I noticed that Galaxy/GATK indexes the reference FASTA and dbSNP files every time it runs. Re-indexing takes time (~10 min), which adds up in the overall run time when the tool is used repeatedly. This could be avoided by reusing the available indexes. Here is a snapshot of the log:

INFO 11:43:57,365 HelpFormatter - The Genome Analysis Toolkit (GATK) v1.4-21-g30b937d, Compiled 2012/02/01 19:01:14
INFO 11:43:57,365 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 11:43:57,365 HelpFormatter - Please view our documentation at http://www.broadinstitute.org/gsa/wiki
INFO 11:43:57,366 HelpFormatter - For support, please view our support site at http://getsatisfaction.com/gsa
INFO 11:43:57,367 HelpFormatter - -
INFO 11:43:57,429 GenomeAnalysisEngine - Strictness is STRICT
INFO 11:43:57,432 ReferenceDataSource - Index file /tmp/tmp-gatk-6jlUfH/gatk_input.fasta.fai does not exist. Trying to create it now.
PROGRESS UPDATE: file is 15 percent complete
PROGRESS UPDATE: file is 28 percent complete
PROGRESS UPDATE: file is 91 percent complete
INFO 11:45:32,231 ReferenceDataSource - Dict file /tmp/tmp-gatk-6jlUfH/gatk_input.dict does not exist. Trying to create it now.
INFO 11:45:54,262 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 11:45:54,280 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02
INFO 11:45:54,304 RMDTrackBuilder - Creating Tribble index in memory for file /tmp/tmp-gatk-6jlUfH/input_dbsnp_0.vcf
INFO 11:48:05,910 RMDTrackBuilder - Writing Tribble index to disk for file /tmp/tmp-gatk-6jlUfH/input_dbsnp_0.vcf.idx

Is there any option or alternative in Galaxy to avoid this re-indexing in /tmp? I have already built the indexes for the reference and dbSNP.

I look forward to any suggestions.

Thanks,
Raj
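The log shows which companion files GATK looks for next to each input: <fasta>.fai, a .dict file with the FASTA extension replaced, and <vcf>.idx. As a small illustration of that naming pattern only, the sketch below reports which of those files are missing for a given reference and dbSNP file. The function names are invented for this example, and nothing here changes how the Galaxy wrapper stages files in /tmp; whether the wrapper can be pointed at pre-built indexes is not established in this thread.

```python
import os


def companion_indexes(fasta_path, vcf_path):
    """Return the index paths implied by the naming seen in the GATK log."""
    root, _ext = os.path.splitext(fasta_path)
    return {
        "fai": fasta_path + ".fai",   # e.g. gatk_input.fasta.fai
        "dict": root + ".dict",       # e.g. gatk_input.dict
        "vcf_idx": vcf_path + ".idx", # e.g. input_dbsnp_0.vcf.idx
    }


def missing_indexes(fasta_path, vcf_path):
    """List the kinds of index files that do not exist yet."""
    paths = companion_indexes(fasta_path, vcf_path)
    return sorted(kind for kind, path in paths.items()
                  if not os.path.exists(path))
```

An empty list from missing_indexes means all three files already sit beside the inputs, which is the situation in which GATK would not need to create them again.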
[galaxy-user] Multiple sample runs using workflow
Hello All,

I'm seeking help on how to run a workflow (say, BWA mapping on PE data) on multiple samples together (e.g., 10 samples of PE data). I assume the "multiple input files selection" option does not work here, as the workflow accepts two FASTQ input files (PE data). Does anyone have experience with this?

Best,
Raj
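Whatever mechanism ends up launching the workflow, the first step is deciding which two files belong to each sample. The sketch below groups FASTQ file names into (forward, reverse) pairs. It assumes a hypothetical naming convention of <sample>_R1/<sample>_R2 or <sample>_1/<sample>_2 with a .fastq or .fq extension (optionally gzipped); it does not call Galaxy, and the function name is made up for this example.

```python
import re
from collections import defaultdict

# Assumed naming convention: sample_R1.fastq / sample_R2.fastq,
# or sample_1.fq.gz / sample_2.fq.gz.
PAIR_PATTERN = re.compile(r"^(?P<sample>.+)_R?(?P<mate>[12])\.(fastq|fq)(\.gz)?$")


def pair_fastq_files(filenames):
    """Group file names into {sample: (mate1_file, mate2_file)}.

    Files that do not match the pattern, and samples with only one
    mate present, are left out.
    """
    mates = defaultdict(dict)
    for name in filenames:
        match = PAIR_PATTERN.match(name)
        if match is None:
            continue
        mates[match.group("sample")][match.group("mate")] = name

    pairs = {}
    for sample, found in sorted(mates.items()):
        if "1" in found and "2" in found:
            pairs[sample] = (found["1"], found["2"])
    return pairs
```

Each resulting pair corresponds to one run of the two-input workflow, so ten samples would give ten pairs to submit.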
Re: [galaxy-user] Error in DepthOfCoverage (GATK) tool
Hello Dan,

I am already using the latest update of Galaxy ('hg incoming' reports 'no changes'). Just to clarify one thing: which repository should I clone, https://bitbucket.org/galaxy/galaxy-dist/ (mentioned on GetGalaxy.org) or http://www.bx.psu.edu/hg/galaxy/ (mentioned on the NewsBrief website)? I use the first one to update Galaxy. I tried to pull the changeset below, but it says 'invalid revision'. Please suggest.

Best,
Raj

From: Daniel Blankenberg [mailto:d...@bx.psu.edu]
Sent: Thursday, March 01, 2012 8:45 PM
To: Praveen Raj Somarajan
Cc: galaxy-user@lists.bx.psu.edu
Subject: Re: [galaxy-user] Error in DepthOfCoverage (GATK) tool

Hi Raj,

Thanks for reporting, this issue has been resolved in changeset 6778:35be930b21be. Please let us know if you encounter further issues.

Thanks for using Galaxy,
Dan

On Mar 1, 2012, at 3:30 AM, Praveen Raj Somarajan wrote:

Hello,

I'm facing an issue with the "Depth Of Coverage" tool when it runs on refGene and a target BED file. The error message is:

File "cheetah_DynamicallyCompiledCheetahTemplate_1330588825_26_16118.py", line 402, in respond
NotFound: cannot find 'omit_interval_statistics' while searching for 'gatk_param_type.omit_interval_statistics'

I noticed that the issue occurs only when "Advanced GATK options" is enabled to set the target BED file. The command line runs perfectly with the same input files, but Galaxy fails with this error. Can anyone suggest what the issue is?

Best,
Raj
[galaxy-user] Error in DepthOfCoverage (GATK) tool
Hello,

I'm facing an issue with the "Depth Of Coverage" tool when it runs on refGene and a target BED file. The error message is:

File "cheetah_DynamicallyCompiledCheetahTemplate_1330588825_26_16118.py", line 402, in respond
NotFound: cannot find 'omit_interval_statistics' while searching for 'gatk_param_type.omit_interval_statistics'

I noticed that the issue occurs only when "Advanced GATK options" is enabled to set the target BED file. The command line runs perfectly with the same input files, but Galaxy fails with this error. Can anyone suggest what the issue is?

Best,
Raj
[galaxy-user] snpEff: html report is not displaying after update
Hi All,

I recently updated Galaxy to the latest version. Everything looks fine except the HTML view of the snpEff report. Before the update it displayed properly (all tables and summary values), but since the update the summary values are not displayed. A sample screenshot is attached for your reference. Could you please look into this issue? When I ran the same analysis on the command line, the reports were generated correctly, so I assume something (datatypes or preview) was changed by the update. Please let me know if there is a workaround.

Secondly, snpEff also generates a gene-wise annotation file along with its other results, but we cannot access this file through Galaxy. We can see the link in the HTML report, but the path seems to be broken. Let me know your suggestions.

Best,
Raj
Re: [galaxy-user] Genomic interval file for GATK
Thanks Carlos.

For Q#1, I found that GATK v1.3 does not explicitly check the format of the -L input file, hence the error. The workaround on the command line is to specify the format with -L, as shown below:

-L:bed

Any idea how to edit the wrapper code to resolve this issue in Galaxy? Has anybody experienced or resolved this before?

Thanks,
Raj.

From: Carlos Borroto [mailto:carlos.borr...@gmail.com]
Sent: Thursday, December 08, 2011 7:43 PM
To: Praveen Raj Somarajan
Cc: galaxy-user@lists.bx.psu.edu
Subject: Re: [galaxy-user] Genomic interval file for GATK

Hi Raj,

I've also been testing the GATK Beta pipeline on Galaxy. This is the workflow I have so far:
http://test.g2.bx.psu.edu/u/cjav/w/gatk

There are a few errors coming up that I haven't had the time to fix or work around yet, but I think it could be a good starting point. For example, an issue with annotations in the Variant Recalibrator tool was recently fixed:
https://bitbucket.org/galaxy/galaxy-central/issue/682/variant-recalibrator-error-with

I haven't yet used the new manual method to enter annotations in the workflow.

Regarding your questions, I don't have an answer for 1), and I would love to hear about a solution. In my case I'm working with RNA-seq data, so I think everything would speed up if I used a good interval file, but it is not clear to me at the moment how or when to use it.

For 2), every time a tool outputs a BAM file in Galaxy, it is sorted and indexed automatically. In fact, even if the downstream tool can use a SAM file, I still convert it to BAM just to make sure it is sorted and indexed.

Regards,
Carlos

On Thu, Dec 8, 2011 at 1:11 AM, Praveen Raj Somarajan <pravee...@ocimumbio.com> wrote:

All,

I'm using a locally installed Galaxy with GATK 1.3 beta (recently updated). I am interested in variant calling using GATK on both Illumina and SOLiD data. My questions are:

1) What format does the "Genomic Interval" option accept in the beta version? It produced an error when I provided a BED file (enrichment coordinates). DepthOfCoverage also produced an error when I used BED files. Does the beta release (v1.3) accept a BED file as input for genomic intervals?

2) SAMtools index seems to be missing in Galaxy. Is this true, or does another module (say, SAM->BAM) incorporate this functionality?

Looking forward to your comments.

Raj
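Besides tagging the file type on the command line, another route that avoids the format question is to hand GATK plain interval strings of the form chr:start-end. The sketch below converts BED lines to that form. It is an illustration under stated assumptions, not the Galaxy wrapper's behaviour: BED coordinates are 0-based and half-open, the chr:start-end form is taken as 1-based and inclusive, and the function name is invented for this example.

```python
def bed_to_intervals(bed_lines):
    """Convert BED records to 'chrom:start-end' interval strings.

    BED start is 0-based and the end is exclusive, so only the start
    shifts by one when moving to 1-based inclusive coordinates.
    """
    intervals = []
    for line in bed_lines:
        line = line.strip()
        # Skip blanks, comments, and UCSC track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        intervals.append(f"{chrom}:{int(start) + 1}-{int(end)}")
    return intervals
```

For example, the BED record "chr1 0 100" becomes "chr1:1-100". Writing one such string per line gives a simple interval list that can be checked by eye before use.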
[galaxy-user] Genomic interval file for GATK
All,

I'm using a locally installed Galaxy with GATK 1.3 beta (recently updated). I am interested in variant calling using GATK on both Illumina and SOLiD data. My questions are:

1) What format does the "Genomic Interval" option accept in the beta version? It produced an error when I provided a BED file (enrichment coordinates). DepthOfCoverage also produced an error when I used BED files. Does the beta release (v1.3) accept a BED file as input for genomic intervals?

2) SAMtools index seems to be missing in Galaxy. Is this true, or does another module (say, SAM->BAM) incorporate this functionality?

Looking forward to your comments.

Raj
[galaxy-user] FASTQ Groomer before BWA mapping ?
All,

I'm wondering why we need to convert Illumina FASTQ to Sanger format using FASTQ Groomer before mapping with BWA in Galaxy. The latest version of BWA itself added the -I option to use Illumina data directly. What's your opinion on this?

Secondly, I found that "Map with BWA for Illumina" uses the -I option on the command line during execution, even for Sanger-formatted reads. How does this impact the results?

Thanks,
Raj
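To make the encoding difference concrete, here is a minimal sketch of the quality conversion at the heart of the question. It assumes the input is Illumina 1.3+ style (Phred scores stored as ASCII code minus 64) and produces Sanger style (ASCII code minus 33). It is not the FASTQ Groomer implementation, it does not handle the older Solexa scale, and the function name is made up for this example.

```python
def illumina_to_sanger(quality):
    """Re-encode a Phred+64 quality string as Phred+33."""
    converted = []
    for char in quality:
        score = ord(char) - 64  # decode assuming an offset of 64
        if score < 0:
            raise ValueError(
                f"character {char!r} is below the Phred+64 range; "
                "the input may already be Sanger-encoded"
            )
        converted.append(chr(score + 33))  # re-encode with an offset of 33
    return "".join(converted)
```

The two offsets differ by 31. Decoding a Sanger string with an offset of 64 would therefore report every score 31 lower than intended, and most of them would come out negative, which is why it matters whether the declared encoding matches the data.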