My output files contain many insertions and deletions. The sample given at
http://samtools.sourceforge.net/cns0.shtml is in the format

seq2  156 *  +AG/+AG  71  252  99  11  +AG  *  3  8  0
seq2  157 A  A  57  0  99  10  .$.$........    97<<<<<<<<
seq2  158 A  R  18  18 99  8   GG$G.....   <;;<<<<<
seq2  159 T  T  8   0  99  7   A$A$.....   3:<<<<<


In this example, they say "The line with the 3rd column a star indicates
that the AG insertion is supported by 3 reads; 8 reads agree with the
reference according to the raw alignment; no reads support a third allele.
However, SAMtools infers a AG homozygous insertion with a high score 252
because when we realign the reads with the prior of an insertion, we found
that the 8 reads mapped without gaps are due to a tandem repeat."

In my output, however, there are more columns after the 3 8 0 that is
printed in their sample. Does anyone know what the columns after the
tab-delimited * and -G mean, in particular the extra ones at the end?
Here is a sample of my output data for reference:

seq1   3404636 *       */-G    198     198     58      49      *       -G      46      3       0       0       0
seq1   3978142 *       */+T    76      76      58      74      *       +T      73      1       0       0       0
seq1   3996202 *       */+G    51      51      56      63      *       +G      62      1       0       0       0
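
In case it helps to see how I am currently reading these lines, here is a minimal Python sketch. It assumes the first 13 columns of an indel line follow the layout described on the SAMtools page quoted above (two alleles, then the read counts for the first allele, the second allele, and any other allele). The field names and the parse_indel_line helper are my own labels, not anything from SAMtools, and the trailing columns I am asking about are simply collected under "extra" because I do not know what they mean.

```python
# Labels for the first 13 whitespace-separated columns of an indel line
# (third column is "*"). These names are my own guesses at the documented
# layout, not official SAMtools identifiers.
INDEL_FIELDS = [
    "chrom", "pos", "ref", "genotype",
    "consensus_qual", "snp_qual", "max_map_qual", "depth",
    "allele1", "allele2",
    "reads_allele1", "reads_allele2", "reads_other",
]


def parse_indel_line(line):
    """Split one indel pileup line into labelled fields (hypothetical helper)."""
    cols = line.split()
    if len(cols) < len(INDEL_FIELDS) or cols[2] != "*":
        raise ValueError("not an indel pileup line")
    record = dict(zip(INDEL_FIELDS, cols))
    # Anything past the 13 documented columns is kept as-is, unnamed.
    record["extra"] = cols[len(INDEL_FIELDS):]
    return record


rec = parse_indel_line("seq1 3404636 * */-G 198 198 58 49 * -G 46 3 0 0 0")
print(rec["allele2"], rec["reads_allele2"], rec["extra"])
```

With this reading, the first line of my data has 46 reads for the first allele (*) and 3 for -G, which adds up to the depth of 49, and the last two columns are the ones I cannot account for.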


-- 
Stanford University Class of 2019
som...@stanford.edu
(408)888-1430
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help
