Having the occasional N in a read is pretty common. The bigger mystery is how it got a Phred score of F even with BAQ disabled. It would seem unlikely for the sequencer to have actually assigned an N that score. Were these fastq files somehow massaged prior to alignment?
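If the original fastq files are still around, one quick check is to look for N calls that carry a real quality score. Below is a rough Python sketch of that check, not a tested tool. The function name `high_quality_n_calls`, the Phred+33 offset, and the cutoff of 2 are assumptions on my part (Illumina pipelines typically give no-calls a quality of about 2, i.e. '#'), and it expects plain four-line fastq records.

```python
import gzip
import sys


def high_quality_n_calls(lines, min_phred=2, offset=33):
    """Yield (read_name, position, phred) for every N with quality above min_phred.

    Assumes four-line FASTQ records and Phred+33 encoding (both are assumptions).
    Positions are 0-based offsets into the read.
    """
    lines = iter(lines)
    for header in lines:
        if not header.strip():
            continue  # skip blank lines, e.g. at the end of the file
        seq = next(lines, "").strip()
        next(lines, "")  # '+' separator line, ignored
        qual = next(lines, "").strip()
        name = header.strip()[1:].split()[0]
        for pos, (base, q) in enumerate(zip(seq, qual)):
            phred = ord(q) - offset
            if base in "Nn" and phred > min_phred:
                yield name, pos, phred


if __name__ == "__main__" and len(sys.argv) > 1:
    path = sys.argv[1]  # path to a .fastq or .fastq.gz file
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for name, pos, phred in high_quality_n_calls(handle):
            print(name, pos, phred, sep="\t")
```

If this prints nothing for the raw files but the reads in the bam do carry an N with a high quality, then some step between the sequencer and the aligner most likely changed the reads.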
Devon
____________________________________________
Devon Ryan, Ph.D.
Email: dpr...@dpryan.com
Tel: +49 (0)178 298-6067
Molecular and Cellular Cognition Lab
German Centre for Neurodegenerative Diseases (DZNE)
Ludwig-Erhard-Allee 2
53175 Bonn, Germany

On Nov 4, 2014, at 9:15 PM, Arumilli, Meharji wrote:

> Hi,
>
> I have attached two screenshots from samtools tview. One of them is normal
> where the variant is not called with reference sequence on the top and other
> with variant called as "AN". Does this help to infer further?
>
> On 04/11/14 21:26, Thomas W. Blackwell wrote:
>>
>> So what could possibly have introduced that 'N' into the sequence reads ? Is
>> it present in the original .fastq file ?
>>
>> - tom blackwell -
>>
>> On Tue, 4 Nov 2014, Arumilli, Meharji wrote:
>>
>>> Hi,
>>>
>>> These are the commands used to call variants using samtools-0.1.19:
>>>
>>> samtools mpileup -ABugf ref.fa -l bed -d 1000000 bam | bcftools view -vcg -
>>> | vcfutils.pl varFilter -D 1000000 > out.vcf
>>> vcfutils.pl varFilter -Q 40 -d 10 out.vcf | awk '$6>=40' > fin.vcf
>>>
>>>
>>> Hope this might help to some extent.
>>>
>>>
>>> On 04/11/14 20:52, Thomas W. Blackwell wrote:
>>>>
>>>> As with the earlier question, we are all puzzling what could possibly have
>>>> introduced N's into the sequence reads. No details of upstream processing
>>>> steps are given, so no one has any ideas to contribute. Simplified command
>>>> lines and software version numbers are always helpful.
>>>>
>>>> - tom blackwell -
>>>>
>>>> On Tue, 4 Nov 2014, Arumilli, Meharji wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>>
>>>>> I have performed variant calling with samtools. For, some reason some of
>>>>> the variants have N in ALT column as shown below:
>>>>>
>>>>> Chromosome  Position  SNPid  Reference  Alternate  QUAL   MQ  DP
>>>>> chr21       29989187  .      A          AN         96.50  60  46
>>>>>
>>>>> This is a homozygous mutation supported by 46 reads with MQ of 60.
>>>>>
>>>>> Checked the bam file for this position using mpileup
>>>>>
>>>>> samtools mpileup -AB -f ref.fa -r chr21:29989186-29989188 input.bam
>>>>>
>>>>> The output is
>>>>>
>>>>> chr21 29989187 A 49
>>>>> ....,,,,..,,,,+1n,,...,,,,..,,,,,...,,,...,..,,.,.,^].
>>>>> 7BF<<FFFB7FF<0FIFIIIFBFIIBFFF<IIIFFB<IIFII7BIBIBB
>>>>>
>>>>> Is this a bug in the code that it is called as "AN" insertion. How should
>>>>> i infer this mutation.
>>>>>
>>>>> Any comments from the users of this community are highly valuable.
>>>>>
>>>>>
>>>>> Br
>>>>> Mehar
>>>>>
>>>>>
>>>
>>>
>
> <normal.png><N_ALT.png>
> ------------------------------------------------------------------------------
> _______________________________________________
> Samtools-help mailing list
> Samtools-help@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/samtools-help