Nicholas -
The letter is always reported relative to the forward strand (how it
would look in the genome reference sequence), so a lowercase g is still
a G, seen in a read aligned to the reverse strand, not a C. My guess is
that upper and lower case in column 3 distinguish coding from non-coding
bases, or bases within repetitive sequence, or something like
that. (Take a look at the corresponding region in the genome
reference sequence.)
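
If it helps to check the counts, here is a minimal Python sketch that tallies the read-bases column of a pileup line by strand. It assumes the usual pileup encoding: "." and "," are matches on the forward and reverse strand, upper and lower case letters are mismatches on the forward and reverse strand, "^" is followed by one mapping-quality character, and "+"/"-" are followed by a length and that many inserted or deleted bases. The function name count_pileup_bases is made up for illustration and is not part of samtools.

```python
import re
from collections import Counter


def count_pileup_bases(bases):
    """Tally matches and mismatches per strand from a pileup read-bases string.

    Mismatch letters are reported in upper case because pileup always
    writes the base as it reads on the forward strand; the case only
    records which strand the read aligned to.
    """
    fwd_match = 0
    rev_match = 0
    fwd_mismatch = Counter()
    rev_mismatch = Counter()

    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            # Read start marker: the next character is a mapping quality.
            i += 2
            continue
        if c in "+-":
            # Indel: sign, decimal length, then that many bases to skip.
            m = re.match(r"\d+", bases[i + 1:])
            if m:
                i += 1 + len(m.group()) + int(m.group())
            else:
                i += 1
            continue
        if c == ".":
            fwd_match += 1
        elif c == ",":
            rev_match += 1
        elif c in "ACGTN":
            fwd_mismatch[c] += 1
        elif c in "acgtn":
            rev_mismatch[c.upper()] += 1
        # Anything else ("$" read end, "*" deleted base) is ignored here.
        i += 1

    return {
        "fwd_match": fwd_match,
        "rev_match": rev_match,
        "fwd_mismatch": dict(fwd_mismatch),
        "rev_mismatch": dict(rev_mismatch),
    }


if __name__ == "__main__":
    print(count_pileup_bases("g,,..G,.gGGGgGGg,.Ggg"))
```

For the line in the question this gives 4 forward-strand matches, 4 reverse-strand matches, 7 G on forward-strand reads and 6 G on reverse-strand reads, which adds up to the depth of 21.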
- tom blackwell -
On Wed, 21 Feb 2018, Nicholas Hill wrote:
Hi all,
Could someone tell me if I am thinking about this correctly? The wording in
the documentation isn't clear enough for me to be confident that I am
correct.
Say my pileup line is this:
chr3 73912 A 21 g,,..G,.gGGGgGGg,.Ggg JJ<JJJ<sJJssJJsss7JkJ
So at chr3:73912 the reference was an A. On the forward strand (read1),
there are 7 guanine bases that aligned to the reference sequence at
this position. Additionally, on the reverse strand (read2), there are 6
guanine bases that aligned to the reference sequence at this position
(or is it 6 cytosine bases, given that it is the reverse?). This is
where I am confused.
Also, what if my reference base is lowercase:
chr3 73912 a 21 g,,..G,.gGGGgGGg,.Ggg JJ<JJJ<sJJssJJsss7JkJ
Does this mean that the reference base is actually a thymine, given that it
is from the reference genome?
It is very important that I am absolutely sure.
Help is greatly appreciated, thank you.
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help