Ok. Thank you guys for the clarifications.
Is there a more detailed guide on how to interpret the VCF output file? I
ran the basic command line given here:
http://samtools.sourceforge.net/mpileup.shtml
The output has multiple fields, and I would like to learn how to analyze them to
call SNPs reliably.
Best,
Thiago
On Mon, Aug 4, 2014 at 6:03 PM, Devon Ryan <dpr...@dpryan.com> wrote:
> The documentation was about MAPQ scores, but the encoding is the same as
> what I wrote (i.e., Phred or MAPQ + 33). The point of adding 33 (or a
> similar offset) to Phred and MAPQ scores is that they can then be
> represented by printable ASCII characters. If one were to subtract 33 when
> encoding, that wouldn't work (in fact, you'd often get a negative value).
> To recover the score, take the ASCII code of the character and subtract 33.
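
To make the +33 / -33 relationship concrete, here is a minimal Python sketch. The function names are hypothetical (they are not part of samtools), and the sketch assumes the standard Phred+33 encoding described above.

```python
def decode_phred33(quals: str) -> list[int]:
    """Decode a Phred+33 string: ASCII code of each character minus 33."""
    return [ord(ch) - 33 for ch in quals]


def encode_phred33(scores: list[int]) -> str:
    """Encode integer scores as printable ASCII: score plus 33."""
    return "".join(chr(score + 33) for score in scores)


if __name__ == "__main__":
    # Quality column taken from the row "supercontig_0 32 C 2 ., BH"
    print(decode_phred33("BH"))
    # Round trip back to the original characters
    print(encode_phred33(decode_phred33("BH")))
```

For the quality string BH this gives the scores 33 and 39, one per read covering the position.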
>
> Devon
>
> ____________________________________________
> Devon Ryan, Ph.D.
> Email: dpr...@dpryan.com
> Tel: +49 (0)178 298-6067
> Molecular and Cellular Cognition Lab
> German Centre for Neurodegenerative Diseases (DZNE)
> Ludwig-Erhard-Allee 2
> 53175 Bonn, Germany
>
> On Aug 4, 2014, at 10:53 PM, Thiago M. Venancio wrote:
>
> > The documentation at http://samtools.sourceforge.net/pileup.shtml says the Phred score minus 33.
> >
> > Is that plus or minus 33?
> >
> > Thanks.
> > Thiago
> >
> >
> > On Mon, Aug 4, 2014 at 5:30 PM, Devon Ryan <dpr...@dpryan.com> wrote:
> > A dot means a match on the forward strand and a comma a match on the reverse strand, so there's no difference (aka mismatch) in either read covering that position. The last column is indeed Phred score + 33.
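
As an illustration of how the read-bases column can be tallied, here is a rough Python sketch. The function name and the counter keys are hypothetical. It assumes the pileup conventions from the linked documentation: `.` and `,` are matches on the forward and reverse strand, upper- and lower-case letters are mismatches on the forward and reverse strand, `^` plus one following character marks a read start, `$` marks a read end, and `+N`/`-N` followed by N bases marks an indel.

```python
import re
from collections import Counter


def summarize_bases(bases: str) -> Counter:
    """Count matches and mismatches per strand in a pileup read-bases string."""
    counts = Counter()
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":
            # Read start: the next character encodes the mapping quality, skip both.
            i += 2
            continue
        if ch == "$":
            # Read end marker carries no base call.
            i += 1
            continue
        if ch in "+-":
            # Indel: sign, decimal length, then that many inserted/deleted bases.
            length = re.match(r"\d+", bases[i + 1:])
            if length is None:
                raise ValueError(f"malformed indel at offset {i} in {bases!r}")
            i += 1 + len(length.group()) + int(length.group())
            counts["indel"] += 1
            continue
        if ch == ".":
            counts["match_fwd"] += 1
        elif ch == ",":
            counts["match_rev"] += 1
        elif ch in "ACGTN":
            counts["mismatch_fwd"] += 1
        elif ch in "acgtn":
            counts["mismatch_rev"] += 1
        else:
            counts["other"] += 1
        i += 1
    return counts


if __name__ == "__main__":
    # Read-bases columns taken from the example rows in this thread
    for column in [".,", "^g.^g,", ".$,$.,"]:
        print(column, dict(summarize_bases(column)))
```

None of the example rows in this thread contain a mismatch, so a tally like this reports only forward and reverse matches for them.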
> >
> > FYI, if you were to input multiple BAM files, you'd find the output similar, with some of the columns repeated for each of the samples.
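
A small Python sketch of how such a multi-sample line could be split. The function name is hypothetical, and it assumes tab-separated default output in which the depth, read-bases and base-quality columns repeat once per input BAM. The second sample in the usage example is invented for illustration.

```python
def split_pileup_line(line: str):
    """Split one pileup line into (chrom, pos, ref, per-sample column groups)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 6:
        raise ValueError("expected at least 6 tab-separated columns")
    chrom, pos, ref = fields[0], int(fields[1]), fields[2]
    per_sample = fields[3:]
    if len(per_sample) % 3 != 0:
        # Assumption: exactly three columns (depth, bases, quals) per sample.
        raise ValueError("sample columns are not a multiple of three")
    samples = [
        {
            "depth": int(per_sample[i]),
            "bases": per_sample[i + 1],
            "quals": per_sample[i + 2],
        }
        for i in range(0, len(per_sample), 3)
    ]
    return chrom, pos, ref, samples


if __name__ == "__main__":
    # First sample is the row from this thread; the second sample is made up.
    line = "supercontig_0\t32\tC\t2\t.,\tBH\t1\t.\tA"
    chrom, pos, ref, samples = split_pileup_line(line)
    print(chrom, pos, ref)
    for index, sample in enumerate(samples, start=1):
        print(index, sample)
```

If extra output columns are requested from mpileup, the three-columns-per-sample assumption no longer holds and the grouping would need to change.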
> >
> > Devon
> >
> >
> > On Aug 4, 2014, at 10:21 PM, Thiago M. Venancio wrote:
> >
> > > Hi Devon and TNP,
> > >
> > > Thanks for the feedback. Please allow me one clarification to see if I understood the documentation correctly.
> > >
> > > Take the following row from my example:
> > >
> > > supercontig_0 32 C 2 ., BH
> > >
> > > This means that at supercontig_0 we have a reference C at position 32, with two mapped reads with potential SNPs. However, dot and comma mean differences between reads and each strand of the reference sequence. So, if the sequencing method is not strand specific, these SNPs should be excluded. Am I missing something, or is this correct?
> > >
> > > Finally, how should the last column (base quality) be interpreted? Should I use the ASCII code of the character minus 33?
> > >
> > > Sorry if these questions are very basic. I am just trying to make sure I understood the process.
> > >
> > > Best,
> > > Thiago
> > >
> > >
> > >
> > >
> > > On Mon, Aug 4, 2014 at 5:00 PM, Devon Ryan <dpr...@dpryan.com> wrote:
> > > Hi Thiago,
> > >
> > > The format, including the last 3 columns, is described here: http://samtools.sourceforge.net/pileup.shtml
> > >
> > > Best,
> > > Devon
> > >
> > >
> > > On Aug 4, 2014, at 9:48 PM, Thiago M. Venancio wrote:
> > >
> > > > Hi all,
> > > >
> > > > I ran samtools mpileup on a set of mapped reads (in BAM format). The output looks like this:
> > > >
> > > > supercontig_0 30 T 2 ^g.^g, BH
> > > > supercontig_0 31 G 2 ., AG
> > > > supercontig_0 32 C 2 ., BH
> > > > supercontig_0 1689 T 4 .,^g.^g, GBA5
> > > > supercontig_0 1690 A 4 .,., GBAD
> > > > supercontig_0 1691 A 4 .,., BAAF
> > > > supercontig_0 1692 C 4 .$,$., EA?C
> > > >
> > > > I went through all the documentation I could find over the past few hours and was unable to find a complete explanation of this output. I understand the first three columns, but not the other three.
> > > >
> > > > Can anyone point me to the appropriate documentation?
> > > >
> > > > Thanks in advance.
> > > > Thiago
> > > >
> > > > _______________________________________________
> > > > Samtools-help mailing list
> > > > Samtools-help@lists.sourceforge.net
> > > > https://lists.sourceforge.net/lists/listinfo/samtools-help
> > >
> > >
> > >
> > >
> > > --
> > > =================================
> > > Thiago Motta Venancio, M.Sc., PhD
> > > http://venancio.openwetware.org/
> > > =================================
>
>
--
=================================
Thiago Motta Venancio, M.Sc., PhD
http://venancio.openwetware.org/
=================================