Nicholas  -

The letter is always given relative to the forward strand (as it would appear in the genome reference sequence). My guess is that upper versus lower case in column 3 distinguishes coding from non-coding bases, or marks bases within repetitive sequence, or something like that. (Take a look at the corresponding region in the genome reference sequence.)
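
To make the convention concrete, here is a rough Python sketch that tallies the read-base column of one pileup line by strand. It is not part of samtools; the function name count_pileup_bases is made up for illustration, and it assumes the usual pileup encoding: "." and "," are matches on the forward and reverse strand, upper-case and lower-case letters are mismatches on the forward and reverse strand, "^" is followed by one mapping-quality character, and "+N"/"-N" is followed by N indel bases. Every call is reported as a forward-strand letter, and the case of the reference base is ignored.

```python
import re
from collections import Counter


def count_pileup_bases(ref_base, bases):
    """Tally base calls from one pileup read-base column, split by strand.

    Illustrative helper (not a samtools API). All calls are reported as
    forward-strand letters; case only encodes which strand the read hit.
    """
    ref = ref_base.upper()  # the case of the reference base does not change its identity
    forward, reverse = Counter(), Counter()
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            # Start of a read: skip '^' and the mapping-quality character after it.
            i += 2
            continue
        if c in "+-":
            # Indel: skip the sign, the length digits, and that many inserted/deleted bases.
            m = re.match(r"\d+", bases[i + 1:])
            i += 1 + (len(m.group()) + int(m.group()) if m else 0)
            continue
        if c == ".":
            forward[ref] += 1          # match, read on forward strand
        elif c == ",":
            reverse[ref] += 1          # match, read on reverse strand
        elif c.upper() in "ACGTN":
            # Mismatch: upper case = forward-strand read, lower case = reverse-strand read.
            (forward if c.isupper() else reverse)[c.upper()] += 1
        # '$', '*', '<' and '>' are ignored in this sketch.
        i += 1
    return forward, reverse


forward, reverse = count_pileup_bases("A", "g,,..G,.gGGGgGGg,.Ggg")
print(dict(forward), dict(reverse))
```

For the line in your message this should give 7 G calls and 4 reference matches from forward-strand reads, and 6 G calls and 4 reference matches from reverse-strand reads. The lower-case g's are still G on the forward strand, not C, and passing "a" instead of "A" as the reference gives the same counts.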

                                        -  tom blackwell  -

On Wed, 21 Feb 2018, Nicholas Hill wrote:

Hi all,

Could someone tell me if I am thinking about this correctly? The wording in
the documentation isn't clear enough for me to be confident that I am
correct.

Say my pileup line is this:
chr3    73912   A       21      g,,..G,.gGGGgGGg,.Ggg   JJ<JJJ<sJJssJJsss7JkJ

So at chr3:73912 the reference base is an A. On the forward strand (read1),
there are 7 guanine bases that aligned to the reference sequence at
this position. Additionally, on the reverse strand (read2), there are 6
guanine bases that aligned to the reference sequence at this position
(or is it 6 cytosine bases, given that it is the reverse strand?). This is
where I am confused.

Also, what if my reference base is lowercase:
chr3    73912   a       21      g,,..G,.gGGGgGGg,.Ggg   JJ<JJJ<sJJssJJsss7JkJ

Does this mean that the reference base is actually a thymine, given that it
is from the reference genome?

It is very important that I am absolutely sure.

Help is greatly appreciated, thank you.


_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help
