James Bonfield wrote:
> On Tue, Aug 02, 2016 at 11:57:52AM +0200, Martin MOKREJŠ wrote:
>> Could samtools calmd apply the following logic for bwa-processed input?
>> Get positions of all N's in the read.
>> Do not complain about those positions which are based on N's.
>> Do report other positions.
>
> I don't entirely understand what you're wanting.
>
> Calmd computes the MD:Z tag as described in the SAM specification,
> which is difference between the sequence and the reference. It should
> not be changed to some other algorithm, no matter what aligners happen
> to do. If aligners produce different MD tags then they are buggy.

In principle I would agree, but bwa mem wrote only the CIGAR and NM: tags
for this read.

>
> Similarly for NM:i.
>
>> OK, I know "samtools calmd" reports a total sum of the differences before
>> and after but quite likely there will be only a "0 -> 0" to be reported. So
>> I will get less warning on the screen, which is always good. ;)
>
> The warnings are there because it is correcting the aligner output.
> The proper fix is to fix the aligners to produce the correct MD in the
> first place, not to break calmd to be buggy in the same manner.

Hmm, but bwa mem did not write an MD: tag at all. See the lines from

  samtools view mysample-PB.bam | grep HWI-xxxxx:xxx:xxxxxxxxx:2:1101:1110:65038

in a previous email. So I still think "samtools calmd" could be less
verbose. In this regard, I do not even know what to ask Heng Li to change
in bwa. The CIGAR 74M is correct; only the NM:i:0 is wrong, and it should
be NM:i:1.

OK, now I get your point. ;-)

Thank you,
Martin
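
To make the NM:i:0 versus NM:i:1 point concrete, here is a minimal Python sketch of how MD and NM come out for a gap-free alignment (a CIGAR such as 74M) when the read contains an N. It is not the samtools implementation. The function name `md_nm_ungapped` and the toy sequences are made up for illustration and are not taken from the read in this thread. The sketch assumes, as I read the SAM optional-tags specification, that only A, C, G and T count as potential matches, so an N in the read is always a difference.

```python
from __future__ import annotations


def md_nm_ungapped(read: str, ref: str) -> tuple[str, int]:
    """Return (MD, NM) for an alignment without insertions or deletions.

    Hypothetical helper: handles only all-M CIGARs, where read and
    reference span the same number of bases.
    """
    if len(read) != len(ref):
        raise ValueError("gap-free alignment needs equal-length sequences")

    md_parts = []  # alternating match-run lengths and reference bases
    run = 0        # length of the current run of matching bases
    nm = 0         # number of differences seen so far

    for read_base, ref_base in zip(read.upper(), ref.upper()):
        # Assumption: only A/C/G/T can match; N or any other code differs.
        if read_base == ref_base and read_base in "ACGT":
            run += 1
        else:
            md_parts.append(str(run))   # close the current match run
            md_parts.append(ref_base)   # MD records the reference base
            run = 0
            nm += 1

    md_parts.append(str(run))           # MD always ends with a number
    return "".join(md_parts), nm


# Toy example: one N in the read gives one difference.
print(md_nm_ungapped("ACGNACGT", "ACGTACGT"))  # ('3T4', 1)
```

Under these assumptions, a read with a single N and otherwise perfect matches yields NM 1, which is the value calmd reports in place of the aligner's NM:i:0.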