Ok. Thank you guys for the clarifications.

Is there a more detailed guide on how to interpret the VCF output file? I
ran the basic command line available here:
http://samtools.sourceforge.net/mpileup.shtml

There are multiple fields, and I would like to learn how to analyze them to
call SNPs reliably.

Best,
Thiago




On Mon, Aug 4, 2014 at 6:03 PM, Devon Ryan <dpr...@dpryan.com> wrote:

> The documentation was about MAPQ scores, but regardless, it says the same
> thing I wrote (i.e., Phred or MAPQ + 33). The point of adding 33 (or a
> similar offset) to Phred and MAPQ scores is that they can then be
> represented by printable ASCII characters. If one were to subtract 33, that
> wouldn't work (in fact, you'd often get a negative value).
>
> Devon
>
> ____________________________________________
> Devon Ryan, Ph.D.
> Email: dpr...@dpryan.com
> Tel: +49 (0)178 298-6067
> Molecular and Cellular Cognition Lab
> German Centre for Neurodegenerative Diseases (DZNE)
> Ludwig-Erhard-Allee 2
> 53175 Bonn, Germany
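
To make the +33 offset concrete, here is a minimal Python sketch. It assumes the Phred+33 encoding discussed above; the function names are hypothetical and are not part of samtools. Writing a score adds 33 to get a printable character, and reading it back subtracts 33 from the character's ASCII code.

```python
PHRED_OFFSET = 33  # offset for the Phred+33 encoding discussed in this thread


def decode_qualities(qual_string):
    """Turn each quality character back into its numeric score (ASCII code - 33)."""
    return [ord(ch) - PHRED_OFFSET for ch in qual_string]


def encode_qualities(scores):
    """Turn numeric scores into printable characters (score + 33)."""
    return "".join(chr(score + PHRED_OFFSET) for score in scores)


if __name__ == "__main__":
    # "BH" is the quality column of the row at position 32 in the example output.
    print(decode_qualities("BH"))      # [33, 39]
    print(encode_qualities([33, 39]))  # BH
```

So the two reads covering position 32 would have base qualities 33 and 39 under this encoding.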
>
> On Aug 4, 2014, at 10:53 PM, Thiago M. Venancio wrote:
>
> > The documentation at http://samtools.sourceforge.net/pileup.shtml says
> > the Phred score minus 33.
> >
> > Is that plus or minus 33?
> >
> > Thanks.
> > Thiago
> >
> >
> > On Mon, Aug 4, 2014 at 5:30 PM, Devon Ryan <dpr...@dpryan.com> wrote:
> > A dot means a match on the forward strand and a comma a match on the
> > reverse strand, so there's no difference (aka mismatch) in either read
> > covering that position. The last column is indeed Phred score + 33.
> >
> > FYI, if you were to input multiple BAM files, you'd find the output
> > similar, with some of the columns repeated for each of the samples.
> >
> > Devon
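
As an illustration of reading the bases column, here is a rough Python sketch that tallies the symbols in one read-bases string. It is a simplification based on the pileup format description: `.` and `,` are matches on the forward and reverse strand, uppercase and lowercase letters are mismatches on the forward and reverse strand, `^` plus one mapping-quality character marks a read start, `$` marks a read end, `+N`/`-N` followed by N bases is an indel, and `*` is a deleted base. The function name and the count labels are made up for this example, and any other symbols are silently ignored.

```python
from collections import Counter


def summarize_read_bases(bases):
    """Count the kinds of symbols in a pileup read-bases string (simplified)."""
    counts = Counter()
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":
            # Read start: '^' is followed by one mapping-quality character.
            counts["read_start"] += 1
            i += 2
            continue
        if ch in "+-":
            # Indel: sign, decimal length, then that many inserted/deleted bases.
            j = i + 1
            while j < len(bases) and bases[j].isdigit():
                j += 1
            length = int(bases[i + 1:j])
            counts["insertion" if ch == "+" else "deletion"] += 1
            i = j + length
            continue
        if ch == "$":
            counts["read_end"] += 1
        elif ch == ".":
            counts["match_forward"] += 1
        elif ch == ",":
            counts["match_reverse"] += 1
        elif ch == "*":
            counts["deleted_base"] += 1
        elif ch.isupper():
            counts["mismatch_forward"] += 1
        elif ch.islower():
            counts["mismatch_reverse"] += 1
        i += 1
    return dict(counts)


if __name__ == "__main__":
    print(summarize_read_bases(".,"))
    print(summarize_read_bases(".,^g.^g,"))
```

For the row at position 32 (`.,`) this reports one forward-strand match and one reverse-strand match and no mismatches, which is why that row gives no evidence of a SNP.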
> >
> > On Aug 4, 2014, at 10:21 PM, Thiago M. Venancio wrote:
> >
> > > Hi Devon and TNP,
> > >
> > > Thanks for the feedback. Please allow me one clarification to see if I
> > > understood the documentation correctly.
> > >
> > > Take the following row from my example:
> > >
> > > supercontig_0   32      C       2       .,      BH
> > >
> > > This means that at supercontig_0 we have a reference C at position 32,
> > > with two mapped reads with potential SNPs. However, dot and comma mean
> > > differences between the reads and each strand of the reference sequence.
> > > So, if the sequencing method is not strand specific, these SNPs should be
> > > excluded. Am I missing something, or is this correct?
> > >
> > > Finally, how should the last column (base quality) be interpreted?
> > > Should I use the ASCII code of the character minus 33?
> > >
> > > Sorry if these questions are very basic. I am just trying to make sure
> > > I understood the process.
> > >
> > > Best,
> > > Thiago
> > >
> > >
> > >
> > >
> > > On Mon, Aug 4, 2014 at 5:00 PM, Devon Ryan <dpr...@dpryan.com> wrote:
> > > Hi Thiago,
> > >
> > > The format, including the last 3 columns, is described here:
> > > http://samtools.sourceforge.net/pileup.shtml
> > >
> > > Best,
> > > Devon
> > >
> > > On Aug 4, 2014, at 9:48 PM, Thiago M. Venancio wrote:
> > >
> > > > Hi all,
> > > >
> > > > I ran samtools mpileup on a set of mapped reads (in BAM format).
> > > > The output looked something like this:
> > > >
> > > > supercontig_0   30      T       2       ^g.^g,  BH
> > > > supercontig_0   31      G       2       .,      AG
> > > > supercontig_0   32      C       2       .,      BH
> > > > supercontig_0   1689    T       4       .,^g.^g,        GBA5
> > > > supercontig_0   1690    A       4       .,.,    GBAD
> > > > supercontig_0   1691    A       4       .,.,    BAAF
> > > > supercontig_0   1692    C       4       .$,$.,  EA?C
> > > >
> > > > I inspected all the documentation I could find over the past few
> > > > hours and was unable to find a complete explanation for this output. I
> > > > understand the first three columns, but not the other three.
> > > >
> > > > Can anyone point me to the appropriate documentation?
> > > >
> > > > Thanks in advance.
> > > > Thiago
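
For readers following along, here is a small Python sketch that splits one line of the output above into its six columns: sequence name, 1-based position, reference base, read depth, read bases, and base qualities. It assumes a single input BAM and whitespace-separated columns as shown in this message; `PileupRecord` and `parse_pileup_line` are hypothetical names for this example, not samtools functions.

```python
from typing import NamedTuple


class PileupRecord(NamedTuple):
    chrom: str   # reference sequence name
    pos: int     # 1-based position on the reference
    ref: str     # reference base
    depth: int   # number of reads covering the position
    bases: str   # read bases column
    quals: str   # base qualities, one character per read


def parse_pileup_line(line):
    """Split a single-sample pileup line into named fields."""
    chrom, pos, ref, depth, bases, quals = line.split()
    record = PileupRecord(chrom, int(pos), ref, int(depth), bases, quals)
    # With at least one read, there should be one quality character per read.
    if record.depth > 0 and len(record.quals) != record.depth:
        raise ValueError("quality string length does not match read depth")
    return record


if __name__ == "__main__":
    rec = parse_pileup_line("supercontig_0   1692    C       4       .$,$.,  EA?C")
    print(rec.chrom, rec.pos, rec.ref, rec.depth, rec.bases, rec.quals)
```

The length check relies on the quality column holding exactly one character per covering read, which holds for every row in the example.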
> > > >
> > > > _______________________________________________
> > > > Samtools-help mailing list
> > > > Samtools-help@lists.sourceforge.net
> > > > https://lists.sourceforge.net/lists/listinfo/samtools-help
> > >
> > >
> > >
> > >
> > > --
> > > =================================
> > > Thiago Motta Venancio, M.Sc., PhD
> > > http://venancio.openwetware.org/
> > > =================================


-- 
=================================
Thiago Motta Venancio, M.Sc., PhD
http://venancio.openwetware.org/
=================================