Re: [galaxy-user] SNP calling problems (Jennifer Jackson)

2013-10-01 Thread garzetti

Hi Jen,

thank you for your answer!

I have used the Add or Replace Groups tool and it worked well, so 
I could run the FreeBayes tool with no problems!


Now I have another question: I have been pre-processing my data with the 
NGS: GATK tools according to their Best Practices and I am ready for SNP 
calling. I have read the Unified Genotyper documentation and, since I am 
working with bacterial genome sequences, I would need to set the 
-sample-ploidy argument to 1 (default 2). I cannot find this option in 
the Galaxy version of this tool, not even in the advanced options. How 
can I do that?


Thank you very much!
Debora


Message: 3
Date: Fri, 27 Sep 2013 14:02:50 -0700
From: Jennifer Jackson j...@bx.psu.edu
To: garzetti garze...@mvp.uni-muenchen.de
Cc: galaxy-u...@bx.psu.edu
Subject: Re: [galaxy-user] SNP calling problems
Message-ID: 5245f27a.7020...@bx.psu.edu
Content-Type: text/plain; charset=iso-8859-1; Format=flowed

Hi Debora,

Sorry to hear that you are having problems. We can help get you going 
again! Please see below:


On 9/26/13 7:20 AM, garzetti wrote:
  

Dear all,

I have been looking for an answer to my problem in all the Galaxy 
Support resources but with no success. I am sorry if this topic has 
already been discussed!


So, I am analyzing MiSeq data on the main Galaxy.
I have Fastq files from 4 paired-end samples. After having checked the 
quality with FastQC and groomed them, I have performed a BWA mapping, 
filtered the results and converted the SAM to BAM files (for each 
sample separately). I have then called SNPs with Freebayes and 
SAMtools, encountering problems in both cases.


1) SAMtools: if I run the Generate pileup tool, then the Filter pileup 
doesn't recognize any valid format in the files I have in my History 
and I cannot go on with the analysis. Why is that? What can I do?

Make sure that the output format is set as pileup and the tool will 
accept the input. Click on the pencil icon to make the datatype 
assignment change.

http://wiki.galaxyproject.org/Support#Tool_doesn.27t_recognize_dataset

Note that Mpileup has an option to produce .bcf format, and that is not 
the same as pileup. If you have selected that type of output, then 
either re-run the tool with options that create pileup format, or 
convert bcf to vcf and use one of the tools that work with vcf format to 
work with your data downstream from there.
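
As a rough illustration of why the datatype matters, here is a minimal 
Python sketch that guesses whether a file is pileup, VCF, or BCF from 
its first line. The function names are hypothetical and this is not how 
Galaxy itself detects datatypes. It assumes that BCF data starts with 
the bytes "BCF" once decompressed, that VCF starts with a 
"##fileformat=VCF" header line, and that pileup lines have either 6 
tab-separated columns (plain) or 10 (with consensus calls).

```python
import gzip


def classify_first_line(line: bytes) -> str:
    """Classify the first (already decompressed) line of a file."""
    if line.startswith(b"BCF"):
        return "bcf"
    if line.startswith(b"##fileformat=VCF"):
        return "vcf"
    fields = line.rstrip(b"\r\n").split(b"\t")
    # Assumed pileup layouts: coverage is column 4 of 6, or column 8 of 10.
    coverage_index = {6: 3, 10: 7}.get(len(fields))
    if coverage_index is None:
        return "unknown"
    looks_like_pileup = (
        fields[1].isdigit()                   # 1-based position
        and len(fields[2]) == 1               # single reference base
        and fields[coverage_index].isdigit()  # read depth
    )
    return "pileup" if looks_like_pileup else "unknown"


def guess_variant_format(path: str) -> str:
    """Open a plain or gzip/BGZF-compressed file and classify it."""
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzip else open
    with opener(path, "rb") as handle:
        # Cap the read so a binary file without newlines stays cheap.
        first_line = handle.readline(1 << 20)
    return classify_first_line(first_line)
```

If a check like this reports "pileup" but the tool still refuses the 
dataset, the datatype assigned in the History is the likely culprit 
rather than the file content.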
  
2) I have performed variant calling with Freebayes on single BAM files 
and on one merged BAM file from all four of my BWA mapping files. In all 
cases, the last column is unknown, while it should be the name of my 
sample. This is not a big deal for the single vcf files, but from the 
merged BAM file, I cannot discriminate from which sample the SNPs were 
detected. I think there is a problem in the BAM files which are not 
properly indexed. Also Freebayes needs an RG tag.
Is there a tool in Galaxy I can use to index BAM files, adding the RG 
tag?

The tool NGS: Picard (beta) -> Add or Replace Groups can be used to 
annotate SAM/BAM files. This tool can be a bit picky about formats, so 
just watch for that if you get an error.
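
To make concrete what read-group annotation changes in the file, here 
is a minimal Python sketch that works on SAM text lines. It is only an 
illustration, not the Picard implementation: the function name and 
arguments are hypothetical, and it writes only the ID and SM fields 
(the sample name in SM is what lets a caller label per-sample columns), 
whereas the Galaxy tool asks for further fields.

```python
def add_read_group(sam_lines, rg_id, sample):
    """Yield SAM lines with one @RG header line and an RG:Z tag per read."""
    rg_header = f"@RG\tID:{rg_id}\tSM:{sample}"
    header_written = False
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith("@"):
            if line.startswith("@RG\t"):
                continue  # drop existing read groups (the "replace" case)
            yield line
            continue
        if not header_written:
            # Emit the new @RG line after the other header lines.
            yield rg_header
            header_written = True
        fields = line.split("\t")
        # Columns 1-11 are mandatory; optional TAG:TYPE:VALUE fields follow.
        optional = [f for f in fields[11:] if not f.startswith("RG:Z:")]
        yield "\t".join(fields[:11] + optional + [f"RG:Z:{rg_id}"])
    if not header_written:
        yield rg_header  # header-only input still gets a read group
```

Running each sample's file through something like this with its own 
ID and sample name before merging is what allows the merged BAM to 
keep track of which reads came from which sample.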


Quick tip: You can click on the bug icon on failed datasets to see 
the complete error message, which will often tell you exactly what is 
wrong so that you can correct it. Clicking the icon doesn't 
automatically submit a bug report, which is good to know when you are 
in a hurry at night or on weekends or just want to troubleshoot 
yourself. You can use this on any error dataset to get more information 
if the stderr/stdout links under the dataset's (i) info button or the 
Info field in its attributes do not provide enough details. This works 
on servers that have bug reporting enabled: the public Main server 
does, and it is straightforward to configure on local/cloud instances, 
including your own, even if you use one only for small local file 
manipulations or file backup/storage (very handy; key file backups are 
always a good idea when doing analysis in general, anywhere). See the 
Admin wiki section for more.


Going forward, there is a short screencast about the Learning resources 
in Galaxy, available here in a Page. It will be uploaded to Vimeo 
sometime in the next 24 hrs, and will likely be updated to include the 
very latest as the infrastructure updates on Main settle out over the 
next few weeks, but for now here is the link. Click on the Learning 
Resources graphic to launch the quick tour:

https://main.g2.bx.psu.edu/u/galaxyproject/p/screencasts-usegalaxyorg

Galaxy team's Vimeo account: http://vimeo.com/channels/581769
We are uploading all of our vids, old & new, right now and over the next 
few days. We really like it and hope our users do too and follow along. 
The public Main server will have direct links to this content, on the 
center home page, soon, as part of the New & Improved Galaxy experience! 
I won't give an ETA, as this is in progress, but can hint that soon == 
expected very soon. (!)


Good luck and let us know if you need more

[galaxy-user] SNP calling problems

2013-09-26 Thread garzetti

Dear all,

I have been looking for an answer to my problem in all the Galaxy 
Support resources but with no success. I am sorry if this topic has 
already been discussed!


So, I am analyzing MiSeq data on the main Galaxy.
I have Fastq files from 4 paired-end samples. After having checked the 
quality with FastQC and groomed them, I have performed a BWA mapping, 
filtered the results and converted the SAM to BAM files (for each sample 
separately). I have then called SNPs with Freebayes and SAMtools, 
encountering problems in both cases.


1) SAMtools: if I run the Generate pileup tool, then the Filter pileup 
doesn't recognize any valid format in the files I have in my History and 
I cannot go on with the analysis. Why is that? What can I do?


2) I have performed variant calling with Freebayes on single BAM files 
and on one merged BAM file from all four of my BWA mapping files. In all 
cases, the last column is unknown, while it should be the name of my 
sample. This is not a big deal for the single vcf files, but from the 
merged BAM file, I cannot discriminate from which sample the SNPs were 
detected. I think there is a problem in the BAM files which are not 
properly indexed. Also Freebayes needs an RG tag.

Is there a tool in Galaxy I can use to index BAM files, adding the RG tag?

I hope someone can help me!

Thank you very much!
Debora

--
Debora Garzetti, PhD Student
AG Rakin
Max von Pettenkofer-Institute, LMU
Pettenkoferstraße 9A
80336 Munich

E-mail: garze...@mvp.uni-muenchen.de
Phone: +49 (0)89 2180 72915

___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using reply all in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

 http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

 http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:

 http://galaxyproject.org/search/mailinglists/


[galaxy-user] SNP Calling

2012-10-17 Thread Francesco Vitiello
Does anyone know of a step-by-step pipeline for SNP calling on an Illumina
dataset, from the alignment to the end?

thanks
Francesco
