Dear Ryan,

The fastq files were filtered to remove bases with a score below 20 and then aligned using bwa mem. This was followed by indel realignment, fix-mate pair, and BQSR using GATK, and then variant calling.

On 04/11/14 22:30, Devon Ryan wrote:
> Having the occasional N in a read is pretty common. The bigger mystery is how
> it got a Phred score of F even with BAQ disabled. It would seem unlikely for
> the sequencer to have actually assigned an N that score. Were these fastq
> files somehow massaged prior to alignment?
>
> Devon
>
> ____________________________________________
> Devon Ryan, Ph.D.
> Email: dpr...@dpryan.com
> Tel: +49 (0)178 298-6067
> Molecular and Cellular Cognition Lab
> German Centre for Neurodegenerative Diseases (DZNE)
> Ludwig-Erhard-Allee 2
> 53175 Bonn, Germany
>
> On Nov 4, 2014, at 9:15 PM, Arumilli, Meharji wrote:
>
>> Hi,
>>
>> I have attached two screenshots from samtools tview. One of them is normal,
>> where the variant is not called, with the reference sequence on top, and the
>> other has the variant called as "AN". Does this help to infer further?
>>
>> On 04/11/14 21:26, Thomas W. Blackwell wrote:
>>> So what could possibly have introduced that 'N' into the sequence reads?
>>> Is it present in the original .fastq file?
>>>
>>> - tom blackwell -
>>>
>>> On Tue, 4 Nov 2014, Arumilli, Meharji wrote:
>>>
>>>> Hi,
>>>>
>>>> These are the commands used to call variants using samtools-0.1.19:
>>>>
>>>> samtools mpileup -ABugf ref.fa -l bed -d 1000000 bam | bcftools view -vcg
>>>> - | vcfutils.pl varFilter -D 1000000 > out.vcf
>>>> vcfutils.pl varFilter -Q 40 -d 10 out.vcf | awk '$6>=40' > fin.vcf
>>>>
>>>> Hope this might help to some extent.
>>>>
>>>> On 04/11/14 20:52, Thomas W. Blackwell wrote:
>>>>> As with the earlier question, we are all puzzling what could possibly
>>>>> have introduced N's into the sequence reads. No details of upstream
>>>>> processing steps are given, so no one has any ideas to contribute.
>>>>> Simplified command lines and software version numbers are always helpful.
>>>>>
>>>>> - tom blackwell -
>>>>>
>>>>> On Tue, 4 Nov 2014, Arumilli, Meharji wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have performed variant calling with samtools. For some reason, some of
>>>>>> the variants have N in the ALT column, as shown below:
>>>>>>
>>>>>> Chromosome  Position  SNPid  Reference  Alternate  QUAL   MQ  DP
>>>>>> chr21       29989187  .      A          AN         96.50  60  46
>>>>>>
>>>>>> This is a homozygous mutation supported by 46 reads with MQ of 60.
>>>>>>
>>>>>> I checked the bam file for this position using mpileup:
>>>>>>
>>>>>> samtools mpileup -AB -f ref.fa -r chr21:29989186-29989188 input.bam
>>>>>>
>>>>>> The output is:
>>>>>>
>>>>>> chr21 29989187 A 49
>>>>>> ....,,,,..,,,,+1n,,...,,,,..,,,,,...,,,...,..,,.,.,^].
>>>>>> 7BF<<FFFB7FF<0FIFIIIFBFIIBFFF<IIIFFB<IIFII7BIBIBB
>>>>>>
>>>>>> Is it a bug in the code that this is called as an "AN" insertion? How
>>>>>> should I interpret this mutation?
>>>>>>
>>>>>> Any comments from the users of this community are highly valuable.
>>>>>>
>>>>>> Br
>>>>>> Mehar
>>>>>>
>>>>
>> <normal.png><N_ALT.png>
>> ------------------------------------------------------------------------------
>> _______________________________________________
>> Samtools-help mailing list
>> Samtools-help@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/samtools-help
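
For anyone reading the pileup line quoted above, here is a rough sketch of how the read-base column can be tallied by hand. It is not part of samtools; the function name tally_pileup is made up for illustration, and it only handles the symbols that appear in this thread plus a few common ones ('.' and ',' for reference matches, '^' plus a mapping-quality character for read starts, '$' for read ends, '+N<seq>' and '-N<seq>' for indels, '*' for a deleted base). Newer samtools versions emit extra symbols that this sketch does not cover.

```python
import re
from collections import Counter


def tally_pileup(bases):
    """Tally one mpileup read-base string (hypothetical helper, simplified).

    Returns (counts, indels): counts has 'ref', '*', or an upper-cased
    mismatch base as keys; indels has keys such as '+N' or '-AT'.
    """
    counts = Counter()
    indels = Counter()
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            # Read start marker; the next character encodes mapping quality.
            i += 2
        elif c == "$":
            # Read end marker carries no base.
            i += 1
        elif c in "+-":
            # Indel: sign, decimal length, then that many sequence characters.
            length = re.match(r"\d+", bases[i + 1:]).group()
            start = i + 1 + len(length)
            seq = bases[start:start + int(length)]
            indels[c + seq.upper()] += 1
            i = start + int(length)
        else:
            if c in ".,":
                counts["ref"] += 1
            else:
                counts[c.upper()] += 1
            i += 1
    return counts, indels


if __name__ == "__main__":
    # Read-base string copied from the mpileup output quoted in this thread.
    column = "....,,,,..,,,,+1n,,...,,,,..,,,,,...,,,...,..,,.,.,^]."
    counts, indels = tally_pileup(column)
    print(dict(counts), dict(indels))
```

Run on the string from the thread, this counts 49 reference-matching bases, which agrees with the depth column, and a single "+N" insertion, i.e. the "+1n" reported after one read base.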